The “direct” (or “forward”) problem solves the “parameter-to-output” mapping that describes the “cause-to-effect” relationship in the respective physical process. Taking Eq. (3), for example, the “forward (or direct) problem” consists in solving it subject to appropriate boundary conditions to obtain the solution V(x), which is in turn used to compute the model responses r[V(x); α(x)], where α(x) ≡ [d(x), σ(x), ω, ε(x), \( {J}_c(x) \), P(x), \( {Q}_j(x) \)] denotes the vector of spatially (x-) dependent model parameters. The necessary and sufficient conditions for the direct problem to be well-posed were formulated by Hadamard (1865–1963), and can be stated as follows: (i) For each source, \( {Q}_j \), there exists a solution V(x); (ii) The solution V(x) is unique; (iii) The dependence of V(x) upon “the data” α(x) and the boundary conditions is continuous. A problem that is not well-posed is called ill-posed. In general, two problems are called inverses of one another if the formulation of each involves all or part of the solution of the other. Several inverse problems can be formulated, as follows: (a) The classical “inverse source identification problem”: given the responses r, the known boundary conditions, and the parameters α(x), determine the sources \( {Q}_j \); (b) The “parameter identification problem”: given the responses r and the sources \( {Q}_j \), determine the parameters α(x); (c) When the domain contains inhomogeneous materials, and the responses r are given, identify the internal boundaries between the inhomogeneous materials, identify the description of the system’s structure (“structural identification”), etc.
The existence of a solution for an inverse problem is, in most cases, secured by defining the data space to be the set of solutions to the direct problem. This approach may fail if the data are incomplete, perturbed, or noisy. Problems involving differential operators, for example, are notoriously ill-posed, because the differentiation operator is not continuous with respect to any physically meaningful observation topology. If the uniqueness of a solution cannot be secured from the given data, additional data and/or a priori knowledge about the solution need to be used to further restrict the set of admissible solutions. Of the three Hadamard requirements, stability is the most difficult to ensure and verify. If an inverse problem fails to be stable, then small round-off errors or noise in the data will be amplified to a degree that renders a computed solution useless.
Designing “HeteroFoaM” materials is fundamentally an inverse problem. The difficulties that must be overcome when developing validated predictive computational methods for designing multi-phase heterogeneous functional materials (HeteroFoaM) can be illustrated by considering the “inverse source identification problem” for a one-dimensional domain extending from x = 0 to x = d (considered to be infinite in the y- and z-directions), and having perfectly known constant material properties, in which the potential, V(x), is driven by a spatially varying source, \( Q(x)\equiv d{Q}_j \). For such an idealized material, Eq. (3) takes on the simple form
$$ -A{d}^2V(x)/d{x}^2=Q(x),\kern0.36em 0<x<d,\kern0.24em with\kern0.24em A\equiv d\left(\sigma +j\omega \varepsilon {\varepsilon}_0\right),\kern0.24em Q(x)\equiv d{Q}_j. $$
(9)
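As a point of reference, the sketch below (Python/NumPy; not part of the original derivation) solves the forward problem of Eq. (9) by a standard second-order finite-difference discretization. All numerical values, the source Q(x), and the Dirichlet boundary values are illustrative assumptions; the sketch merely shows that, once Q(x) and the boundary data are prescribed, V(x) follows from a routine, well-conditioned linear solve.

```python
import numpy as np

# Illustrative (assumed) data -- not taken from the text.
d = 1.0                      # slab thickness
A = 2.0                      # A = d*(sigma + j*omega*eps*eps0), taken real here for simplicity
N = 200                      # number of interior grid points
x = np.linspace(0.0, d, N + 2)
h = x[1] - x[0]

Q = np.sin(3.0 * np.pi * x / d)        # assumed smooth source Q(x)
V0, Vd = 0.2, -0.1                     # assumed Dirichlet boundary values

# Second-order finite differences for -A * V''(x) = Q(x) at the interior nodes.
L = (A / h**2) * (2.0 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1))
rhs = Q[1:-1].copy()
rhs[0]  += A * V0 / h**2               # fold the boundary values into the right-hand side
rhs[-1] += A * Vd / h**2

V = np.empty_like(x)
V[0], V[-1] = V0, Vd
V[1:-1] = np.linalg.solve(L, rhs)      # routine solve: the direct problem is well-posed
print("V at mid-slab:", V[N // 2 + 1])
```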
The “inverse source determination problem” is to determine the source Q(x) from measurements of the potential, V(x). A measurement would be recorded as a “detector” or “instrument response” that can be represented in the form
$$ M\equiv {\displaystyle \underset{0}{\overset{d}{\int }}V(x){R}_d}(x)dx, $$
(10)
where \( {R}_d(x) \) represents the detector’s response function. For a known (measured) response value M, it is evident that Eq. (10) represents a Fredholm equation of the first kind for the determination of the spatially-dependent voltage, V(x), which cannot be solved as it stands to produce a unique solution V(x)! Moreover, Fredholm equations of the first kind are notoriously ill-posed, since the integration over the kernel [in this case, \( {R}_d(x) \)] of the Fredholm equation has a “smoothing” effect on the high-frequency components, cusps, and edges that may be present in V(x). This effect stems from the well-known Riemann-Lebesgue lemma, which for the purposes of this work can be written in the form
$$ \underset{\omega \to \infty }{ \lim}\;{\displaystyle \underset{a}{\overset{b}{\int }}f(x)}\;{e}^{j\omega x}dx\to 0,\kern0.48em with\kern0.24em f(x)= piecewise\kern0.24em continuous. $$
(11)
It can therefore be expected that the (inverse) determination of V(x) using the Fredholm Eq. (10) will amplify high frequency components (such as those stemming from measurement errors) in the measured “detector response” M.
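The decay asserted by Eq. (11) can be observed numerically. The minimal sketch below (Python/NumPy; an illustration, not part of the original text) evaluates the oscillatory integral for an assumed piecewise-continuous test function at increasing frequencies ω; the shrinking magnitudes indicate why the integral kernel in Eq. (10) suppresses high-frequency information in V(x).

```python
import numpy as np

# Assumed piecewise-continuous test function on [0, 1] (it has a jump at x = 0.5).
def f(x):
    return np.where(x < 0.5, 1.0, x**2)

x = np.linspace(0.0, 1.0, 200001)          # fine grid for trapezoidal quadrature

for omega in [1, 10, 100, 1000, 10000]:
    integral = np.trapz(f(x) * np.exp(1j * omega * x), x)
    print(f"omega = {omega:6d}   |integral| = {abs(integral):.3e}")
# The magnitudes shrink toward zero as omega grows (Riemann-Lebesgue lemma), i.e.,
# high-frequency components of V(x) are "smoothed out" by the integration in Eq. (10).
```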
In practice, the Fredholm equation (10) must be discretized in order to determine V(x). There are two main classes of methods for discretizing integral equations, namely quadrature methods and Galerkin (which include collocation, spectral, and pseudo-spectral) methods. Consider, for simplicity, that
$$ {R}_d(x)={C}_d\delta \left(x-{x}_n\right) $$
(12)
where \( {C}_d \) is an appropriate “measurement conversion function”, so that the detector provides, at any spatial location \( {x}_n \), a measurement of the form
$$ {M}_n\equiv {C}_dV\left({x}_n\right),\kern0.36em n=0,1,\dots, N. $$
(13)
On the other hand, in order for Eq. (9) to make physical sense, it is clear that the function V(x) must be square integrable, piecewise continuous, and of bounded variation (having at most a finite set of discontinuities of finite magnitudes within the slab). Therefore, the function V(x) must admit a spectral (e.g., Fourier) representation, and the choice of basis-functions can be conveniently adapted to boundary conditions and possible periodicities and/or symmetries inherent in the problem under consideration. For our illustrative inverse problem, we expect to be able to measure V(x) at least at the left and right boundaries of the slab, i.e., at x = 0 and at x = d, respectively, obtaining two values, which will be conveniently denoted as \( {M}_0\equiv {C}_dV(0) \) and \( {M}_N={C}_dV({x}_N) \), \( {x}_N\equiv d \), respectively. From the point of view of the forward problem, these measurements would mathematically provide (two) Dirichlet boundary conditions for V(x), namely
$$ V(0)={M}_0/{C}_d,\kern0.36em V\left({x}_N\right)={M}_N/{C}_d, $$
(14)
to complement the differential equation (9), thus rendering the forward problem [namely, to determine the function V(x) when the source Q(x) is known] perfectly well-posed in the sense of Hadamard. Actually, the unique and exact solution, \( {V}^{exact}(x) \), of the forward problem consisting of Eqs. (9) and (14) is
$$ {V}^{exact}(x)={\displaystyle \sum_{n=1}^{\infty }{c}_n^{exact} \sin \frac{n\pi x}{d}}, $$
(15)
with the coefficients \( {c}_n^{exact} \) given by the expression
$$ {c}_n^{exact}\equiv \frac{n\pi /d\left[{\left(-1\right)}^{n+1}\;V\left({x}_N\right)+V(0)\right]+{q}_n/A}{{\left(n\pi /d\right)}^2},\kern0.36em {q}_n\equiv {\displaystyle {\int}_0^dQ(x)}\; \sin \frac{n\pi x}{d}\, dx. $$
(16)
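Under the stated assumptions, the forward solution can be evaluated directly from its sine-series representation. The sketch below (Python/NumPy) does so for an assumed source, boundary values, and material constant, using the standard Fourier sine-series normalization (a factor 2/d on the coefficients), which may differ from the convention implicit in Eq. (16); the truncated series is cross-checked in the interior of the slab against a finite-difference solution of Eq. (9).

```python
import numpy as np

# Assumed illustrative data -- not taken from the text.
d, A = 1.0, 2.0
V0, Vd = 0.2, -0.1                                 # assumed boundary values V(0), V(d)
Q = lambda x: np.sin(3.0 * np.pi * x / d) + 0.5    # assumed source Q(x)

x = np.linspace(0.0, d, 2001)
Nterms = 100

# Truncated sine series; the factor 2/d is the standard sine-series normalization,
# which may differ from the convention implicit in Eq. (16).
V_series = np.zeros_like(x)
for n in range(1, Nterms + 1):
    k = n * np.pi / d
    q_n = np.trapz(Q(x) * np.sin(k * x), x)        # q_n = int_0^d Q(x) sin(n*pi*x/d) dx
    c_n = (2.0 / d) * (k * ((-1) ** (n + 1) * Vd + V0) + q_n / A) / k**2
    V_series += c_n * np.sin(k * x)

# Cross-check against a second-order finite-difference solution of Eq. (9).
h, M = x[1] - x[0], len(x) - 2
L = (A / h**2) * (2.0 * np.eye(M) - np.eye(M, k=1) - np.eye(M, k=-1))
rhs = Q(x[1:-1]).copy()
rhs[0], rhs[-1] = rhs[0] + A * V0 / h**2, rhs[-1] + A * Vd / h**2
V_fd = np.concatenate(([V0], np.linalg.solve(L, rhs), [Vd]))

interior = slice(100, -100)        # compare away from the faces (slow sine-series convergence there)
print("max interior deviation:", np.max(np.abs(V_series[interior] - V_fd[interior])))
```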
Returning now to the inverse problem at hand, it becomes clear that the spectral representation shown in Eq. (15) underscores the fact that the determination of \( {V}^{exact}(x) \) would require infinitely many measurements, \( {M}_n \), in order to determine all of the coefficients \( {c}_n^{exact} \). But it is practically impossible to perform infinitely many measurements. In practice, therefore, the determination of the first J coefficients \( ({c}_1,\dots,{c}_J) \) necessitates J measurements of V(x) at locations \( ({x}_1,\dots,{x}_J) \), in order to construct the following system of equations (for determining the coefficients \( {c}_1,\dots,{c}_J \)):
$$ \mathbf{S}\mathbf{C}=\mathbf{M},\kern0.48em \mathbf{S}\equiv \left(\begin{array}{ccc}\hfill \sin \frac{\pi {x}_1}{d}\hfill & \hfill \dots \hfill & \hfill \sin \frac{J\pi {x}_1}{d}\hfill \\ {}\hfill \vdots \hfill & \hfill \ddots \hfill & \hfill \vdots \hfill \\ {}\hfill \sin \frac{\pi {x}_J}{d}\hfill & \hfill \cdots \hfill & \hfill \sin \frac{J\pi {x}_J}{d}\hfill \end{array}\right),\kern0.36em \mathbf{C}\equiv \left(\begin{array}{c}\hfill {c}_1\hfill \\ {}\hfill \vdots \hfill \\ {}\hfill {c}_J\hfill \end{array}\right);\kern0.36em \mathbf{M}\equiv \left(\begin{array}{c}\hfill {M}_1\hfill \\ {}\hfill \vdots \hfill \\ {}\hfill {M}_J\hfill \end{array}\right). $$
(17)
Solving the above system yields the coefficients \( ({c}_1,\dots,{c}_J) \) as the solution of the equation
$$ \mathbf{C}={\mathbf{S}}^{-1}\mathbf{M}. $$
(18)
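A minimal sketch of this reconstruction step is given below (Python/NumPy). The measurement locations, the number of retained modes J, and the “true” coefficients are illustrative assumptions, and the conversion factor \( {C}_d \) is taken as unity; the sketch assembles the collocation matrix S of Eq. (17), fabricates noise-free measurements, and solves Eq. (18). The condition number of S already hints at how strongly the recovered coefficients respond to measurement noise.

```python
import numpy as np

# Assumed illustrative setup -- not taken from the text; C_d is taken as 1.
d, J = 1.0, 12
x_meas = np.linspace(0.05, 0.95, J) * d                  # assumed measurement locations x_1, ..., x_J
n = np.arange(1, J + 1)
S = np.sin(np.pi * np.outer(x_meas, n) / d)              # S[i, m] = sin((m+1)*pi*x_i/d), as in Eq. (17)

# Synthetic "true" coefficients used only to fabricate noise-free measurements.
c_true = np.zeros(J)
c_true[:4] = [0.8, -0.3, 0.1, 0.05]
M = S @ c_true                                           # M_i = V(x_i), i.e., Eq. (13) with C_d = 1

print("cond(S) =", np.linalg.cond(S))
C_rec = np.linalg.solve(S, M)                            # Eq. (18): C = S^{-1} M
print("recovered coefficients:", np.round(C_rec, 8))     # reproduces c_true when the data are exact
```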
It is clear from the foregoing considerations that the coefficients \( {c}_n \) cannot possibly be determined perfectly, for at least the following three reasons: (i) it is impossible to perform infinitely many measurements; (ii) the measurements \( {M}_j \) cannot be performed perfectly, so they will be afflicted by measurement errors; and (iii) inverting the matrix S will introduce additional numerical errors. Therefore, the reconstructed coefficients \( {c}_n \) will be affected by errors, which can be considered to be additive, of the form
$$ {c}_n={c}_n^{exact}+{\varepsilon}_n. $$
(19)
where \( {c}_n^{exact} \) denotes the exact, but unknown, nth coefficient, while \( {\varepsilon}_n \) denotes the corresponding error. Hence, the reconstructed potential, denoted here as \( {V}^{rec}(x) \), will have the form
$$ {V}^{rec}(x)={\displaystyle \sum_{n=1}^J{c}_n^{exact}} \sin \frac{n\pi x}{d}+{\displaystyle \sum_{n=1}^J{\varepsilon}_n} \sin \frac{n\pi x}{d}. $$
(20)
The above representation of the potential clearly indicates that its reconstruction from measurements introduces errors over the entire spatial-frequency spectrum. It is especially important to note that the highest-frequency spatial errors cannot be controlled from the “measurement side” since they arise precisely because of the truncation to finitely many terms, which actually stems from the inability to perform infinitely many measurements.
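This spreading of errors across the spectrum is easy to reproduce numerically. In the sketch below (Python/NumPy; the decay rate of the assumed exact coefficients, the noise level, and the measurement locations are illustrative assumptions), tiny perturbations of the measurements produce coefficient errors \( {\varepsilon}_n \) that do not decay with n, so the high-order reconstructed coefficients are dominated by error, exactly as Eq. (20) anticipates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative setup -- not taken from the text; C_d is taken as 1.
d, J = 1.0, 12
x_meas = np.linspace(0.05, 0.95, J) * d
n = np.arange(1, J + 1)
S = np.sin(np.pi * np.outer(x_meas, n) / d)

c_exact = 1.0 / n**3                                      # assumed, rapidly decaying exact coefficients
M_noisy = S @ c_exact + rng.normal(scale=1.0e-3, size=J)  # measurements afflicted by small errors

c_rec = np.linalg.solve(S, M_noisy)
eps = c_rec - c_exact                                     # the errors eps_n of Eq. (19)

for k in range(J):
    print(f"n = {k + 1:2d}   c_exact = {c_exact[k]:9.2e}   eps_n = {eps[k]: 9.2e}")
# The errors eps_n do not decay with n: the high-order coefficients of the
# reconstructed potential V_rec(x) in Eq. (20) are dominated by error.
```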
The effects of the errors displayed in Eq. (19) on attempting to determine the source Q(x) from Eq. (9) can now be displayed explicitly, as follows. If the exact expression, \( {V}^{exact}(x) \), given in Eq. (15) were available, if the model represented by Eq. (9) were perfect, and if the boundary values given in Eq. (14) were perfectly well known, then the exact expression for the source, \( {Q}^{exact}(x) \), could be obtained by inserting Eq. (15) into Eq. (9). The expression thus obtained for \( {Q}^{exact}(x) \) would be
$$ {Q}^{exact}(x)=A{\displaystyle \sum_{n=1}^{\infty }{c}_n^{exact}{\left(\frac{n\pi }{d}\right)}^2 \sin \frac{n\pi x}{d}}, $$
(21)
where the exact coefficients \( {c}_n^{exact} \) would decay sufficiently fast, as functions of n, to ensure the convergence of the infinite series on the right side of Eq. (21). This property can be readily verified by using Eq. (16) to compute the exact coefficients, \( {c}_n^{exact} \), that would result from various particular forms of the source \( {Q}^{exact}(x) \).
However, the exact potential, \( {V}^{exact}(x) \), is unavailable! Only the reconstructed potential, \( {V}^{rec}(x) \), given in Eq. (20) is available. Inserting this expression into Eq. (9) yields
$$ A{\displaystyle \sum_{n=1}^J{c}_n^{exact}{\left(\frac{n\pi }{d}\right)}^2 \sin \frac{n\pi x}{d}}+A{\displaystyle \sum_{n=1}^J{\varepsilon}_n{\left(\frac{n\pi }{d}\right)}^2 \sin \frac{n\pi x}{d}}={Q}^{exact}(x)+{Q}^{error}\left(x,J\right), $$
(22)
where
$$ {Q}^{error}\left(x,J\right)\equiv A{\displaystyle \sum_{n=1}^J{\varepsilon}_n{\left(\frac{n\pi }{d}\right)}^2 \sin \frac{n\pi x}{d}}-A{\displaystyle \sum_{n=J+1}^{\infty }{c}_n^{exact}{\left(\frac{n\pi }{d}\right)}^2 \sin \frac{n\pi x}{d}}. $$
(23)
Even though the exact values of the coefficients \( {\varepsilon}_n \) are unknown, they are nevertheless finite numerical constants which, unlike the coefficients \( {c}_n^{exact} \), do not decay with increasing n. It therefore follows that, in the limit of large J, the second sum in Eq. (23) will vanish, but the first sum will diverge to infinity, so that
$$ \underset{J\to \infty }{ \lim }{Q}^{error}\left(x,J\right)\to \infty . $$
(24)
The above behavior of \( {Q}^{error}(x,J) \) clearly highlights the destructive effect of high-frequency errors when attempting to determine the source from potential measurements by using the forward Eq. (3): high-frequency error components arising from the reconstruction of the potential from measurements cause a large deviation between the true source and the source that would be reconstructed from those measurements. Furthermore, this discrepancy between the true and the reconstructed source increases with the frequency of the error components in the reconstructed potential. The fundamental reason for this behavior is that the non-compact Laplace operator “amplifies” the high-frequency error components if the forward equation is used to reconstruct the source, Q(x), from measurements of the potential, V(x).
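The divergence stated in Eq. (24) can be made tangible with a short numerical experiment. The sketch below (Python/NumPy; the fixed error magnitude assigned to the coefficients \( {\varepsilon}_n \) and all other numerical values are illustrative assumptions) evaluates the first sum of Eq. (23) for increasing J and shows its unbounded growth, driven by the factor (nπ/d)².

```python
import numpy as np

# Assumed illustrative data -- not taken from the text.
d, A = 1.0, 2.0
x = np.linspace(0.0, d, 1001)
eps_size = 1.0e-3                      # assumed fixed magnitude of the coefficient errors eps_n

def q_error_max(J):
    """Max magnitude over x of the first sum in Eq. (23), with |eps_n| = eps_size."""
    Q_err = np.zeros_like(x)
    for n in range(1, J + 1):
        k = n * np.pi / d
        Q_err += A * eps_size * k**2 * np.sin(k * x)
    return np.max(np.abs(Q_err))

for J in [5, 10, 20, 40, 80]:
    print(f"J = {J:3d}   max |Q_error| ~ {q_error_max(J):.3e}")
# Each retained error term is amplified by (n*pi/d)**2, so the error contribution to the
# reconstructed source grows without bound as J increases, as stated in Eq. (24).
```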
The procedures used to obtain approximate solutions of an ill-posed problem such as that described above are called regularization procedures (methods), after the systematic works of Phillips (Phillips 1962) and Tikhonov, who obtained “optimal solutions” by solving the minimization problem for a “cost functional”, F(x), containing user-defined parameters and meant to minimize a user-defined “error”; usually, this minimization problem takes on the form
$$ \underset{x}{Min}\left[F(x)\right],\kern0.36em F(x)\equiv {\left\Vert Ax-d\right\Vert}^2+\beta {\left\Vert x-{x}_0\right\Vert}^2, $$
(25)
where the Lagrange-like multiplier β is a “free parameter” meant to accomplish a “user-defined compromise” between two requirements: (i) to satisfy the model equation Ax − d = 0, and (ii) to remain close to the a priori knowledge \( {x}_0 \). A rich literature (too numerous to cite here) of variations on the Tikhonov-Phillips regularization procedure has since emerged; the common characteristic of these variations is the fundamental dependence of the “regularized solution” on “user-tunable” parameters, like β in Eq. (25).
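For concreteness, the sketch below (Python/NumPy; the test system, the noise level, and the a priori estimate are illustrative assumptions) applies the Tikhonov-Phillips functional of Eq. (25) to the coefficient-reconstruction problem of Eqs. (17)–(18), solving the associated normal equations for several values of β; the results illustrate the dependence of the “regularized solution” on the user-tunable parameter β.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed illustrative ill-conditioned reconstruction problem -- not taken from the text.
d, J = 1.0, 12
x_meas = np.linspace(0.05, 0.95, J) * d
n = np.arange(1, J + 1)
S = np.sin(np.pi * np.outer(x_meas, n) / d)              # plays the role of "A" in Eq. (25)
c_exact = 1.0 / n**3
M = S @ c_exact + rng.normal(scale=1.0e-3, size=J)       # noisy data, the "d" of Eq. (25)
c_prior = np.zeros(J)                                    # a priori estimate x_0 (assumed zero)

def tikhonov(beta):
    # Minimizer of ||S c - M||^2 + beta * ||c - c_prior||^2, via the normal equations.
    return np.linalg.solve(S.T @ S + beta * np.eye(J), S.T @ M + beta * c_prior)

for beta in [0.0, 1.0e-6, 1.0e-3, 1.0e-1]:
    err = np.linalg.norm(tikhonov(beta) - c_exact)
    print(f"beta = {beta:7.1e}   ||c_beta - c_exact|| = {err:.3e}")
# The quality of the "regularized solution" hinges on the user-tunable parameter beta:
# too small leaves the noise amplification in place, too large biases the answer toward c_prior.
```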
In an attempt to eliminate the appearance of “user-tunable parameters”, Cacuci and co-workers (Barhen et al. 1980) combined concepts from information theory and Bayes’ theorem to calibrate (“adjust”) simultaneously system (model) responses and parameters, in order to obtain best-estimate values, with reduced uncertainties, for both the responses and the system parameters, with applications to reactor physics and design. Several years later, these methods (Barhen et al. 1980; Barhen et al. 1982) were re-discovered by workers in other fields, e.g., earth and atmospheric sciences (Barhen et al. 1982), mechanics of materials (Bui 1994), environmental sciences (Faragó et al. 2014), etc. In the design setting, we are given a functional result and asked to design the system that produces it. For large-scale complex systems, it is practically impossible to run all possible cases in the “forward” direction (even with multivariate optimization algorithms) or to solve the inverse problem. Adjoint methods, which stem from Lagrange’s method of “integration by parts” (~1755) and were set on a rigorous mathematical foundation by Hilbert and Banach, were used (already in the 1940s) for efficiently solving linear problems in nuclear and reactor physics and (a decade later) in optimal control, by avoiding the need to solve repeatedly forward or inverse problems with altered model parameter values. However, these early adjoint methods were applicable solely to linear problems, since nonlinear operators do not admit adjoints, as is well known. Cacuci and co-workers (Cacuci et al. 1980a; Cacuci et al. 1980b) initiated the application of adjoint methods for computing sensitivities of simple responses in simple nonlinear problems. In a remarkable breakthrough, Cacuci (Cacuci 1981a; Cacuci 1981b; Cacuci 1988) developed in 1981 a mathematically rigorous “adjoint sensitivity analysis” theory applicable to completely general nonlinear systems. Since the late 1980s, adjoint methods have enjoyed remarkably rapid and widespread application, from the interpretation of seismic reflection data (Yedlin & Pawlink 2011), to airfoil design [e.g., the Boeing 747 wing (Kress et al. 1991)], to numerical error control (Kress et al. 1991).
In recent years, Cacuci and co-workers (Cacuci 2003; Cacuci et al. 2005; Cacuci and Ionescu-Bujor 2010; Cacuci et al. 2014; Cacuci 2014a) have embarked on an effort to formulate a new conceptual framework that unifies the currently disparate fields of “inverse problems”, data assimilation, model calibration, and model validation; this unified framework is based on physics-driven mathematical procedures founded on the maximum entropy principle, dispensing with the need to “minimize user-defined cost functionals” (which characterizes virtually all of the methods currently in use). This fairly self-explanatory framework is depicted in Figure 5, and aims at developing validated predictive computational methods that can be used in the design of multi-phase HeteroFoaM materials. Such methods will be capable of designing not only the constituent materials and their interactions, but also the morphology of the shape, size, surfaces and interfaces that define the heterogeneity and the resulting functional response of the material system. Last but not least, this framework is envisaged to provide the foundation for developing game-changing high-order (to at least fourth order, including skewness- and kurtosis-like moments of the predicted distributions for design parameters and responses of interest) predictive direct and inverse modelling capabilities, empowered by a new high-order adjoint sensitivity analysis procedure (HO-ASAP) for computing exactly and extremely efficiently (“smoking fast”) response sensitivities of arbitrary order to any and all parameters in large-scale coupled multi-physics models. The high efficiency of the second-order adjoint sensitivity analysis procedure (SO-ASAP) has been illustrated (Cacuci 2014b) via an application to a paradigm particle diffusion problem; a series of papers documenting the HO-ASAP is currently in preparation.