A first-order adjoint and a second-order hybrid method for an energy output least-squares elastography inverse problem of identifying tumor location

Cahill, Nathan D; Jadamba, Baasansuren; Khan, Akhtar A; Sama, Miguel; Winkler, Brian C

doi:10.1186/1687-2770-2013-263

Research
Open access
Published: 02 December 2013


Boundary Value Problems volume 2013, Article number: 263 (2013)


Abstract

In this paper we investigate the elastography inverse problem of identifying cancerous tumors within the human body. From a mathematical standpoint, the elastography inverse problem consists of identifying the variable Lamé parameter μ in a system of linear elasticity where the underlying object exhibits nearly incompressible behavior. This problem is subsequently posed as an optimization problem using an energy output least-squares (EOLS) functional, but the nonlinearity that arises makes the computation of the EOLS functional’s derivatives challenging. We employ an adjoint method for the computation of the gradient, something shown to be an efficient method in recent studies, and also give a parallelizable hybrid method for the computation of the EOLS functional’s second derivative. Detailed discrete formulas and nontrivial computational examples are provided to show the feasibility of both the adjoint and hybrid approaches. Furthermore, all results are given in the framework of a general saddle point problem allowing easy adaptation to numerous other inverse problems.

MSC:35R30, 65N30.

1 Introduction

Consider the following system of partial differential equations describing the response of an isotropic elastic object to certain body forces and traction applied to its boundary:

- \nabla \cdot σ = f in Ω,

(1a)

σ = 2 μ ϵ (u) + λ div u I,

(1b)

u = g on Γ_{1},

(1c)

σ n = h on Γ_{2} .

(1d)

Here the domain Ω is a subset of $R^{2}$ or $R^{3}$ and $\partial Ω = Γ_{1} \cup Γ_{2}$ is its boundary. In (1a)-(1d), the vector-valued function $u = u (x)$ represents the displacement of the elastic object, f is the applied body force, n is the unit outward normal, and

ϵ (u) = \frac{1}{2} (\nabla u + \nabla u^{T})

is the linearized strain tensor. The resulting stress tensor σ in the stress-strain law (1b) is obtained under the assumption that the elastic object is isotropic and the displacement is small enough so that a linear relationship holds. The Lamé parameters μ and λ quantify the elastic properties of the material. (In the following, for simplicity we set $g = 0$ .)

In this work our objective is to investigate the elastography (also known as elasticity imaging) inverse problem of locating cancerous tumors within the human body. This inverse problem consists of identifying the variable parameter μ in (1a)-(1d) from a measurement of the displacement field u. Conversely, the direct problem for (1a)-(1d) is to find the displacement u when function h, the variable coefficients μ and λ, and the body force f are all known. The underlying idea is that differences in molecular makeup as well as microscopic and macroscopic structure result in significant differences in the stiffness of living soft tissue (see [1]). Moreover, changes in tissue stiffness generally correlate with changes in pathological state, with many cancers appearing as hard nodules within the surrounding softer tissue. In a clinical setting, measurements of displacement in human tissue can be obtained using ultrasound and this can then serve as data in the context of the elastography inverse problem. By solving this inverse problem and recovering μ, tumor locations can be identified using the marked differences in elastic properties between the healthy and unhealthy tissue. Additionally, we note that in the elastography inverse problem the human body is treated as a nearly incompressible object where the parameter λ is significantly large and hence only the parameter μ is sought.

Although numerous authors have contributed to using the elasticity properties of soft tissue as a tool to differentiate between normal and cancerous tissue, Raghavan and Yagle [2] were among the first to realize that this study is best carried out in an inverse problem framework using measured strains and the equations of equilibrium to recover elasticity (cf. (1a)-(1d)). Since then, many studies have been devoted to investigating various aspects of the elastography inverse problem, and the interested reader is referred to [3–8] and the references cited therein. Additionally, a detailed account of recent developments in the elastography inverse problem can be found in the survey article by Doyley [1]. See also [9–21] and the references cited therein for more details.

One of the main technical challenges in the study of this inverse problem stems from the fact that the human body is treated as a nearly incompressible object. That is, the elasticity modulus λ is significantly large (and particularly $λ ≫ μ$ ), rendering classical finite element methods ineffective due to the so-called locking effect. In the literature, several approaches have been proposed to overcome the locking effect, and in this work we employ the mixed finite elements strategy.

In the following, we provide the necessary details for the transformation of system (1a)-(1d) into a saddle point problem to which the mixed finite element approach can be applied.

We begin by recalling that the dot product of two tensors A and B can be denoted by $A \cdot B$ . That is, for $2 \times 2$ tensors A and B, we have

A \cdot B = A_{11} B_{11} + A_{12} B_{12} + A_{21} B_{21} + A_{22} B_{22} .
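As a quick numerical illustration, the following minimal NumPy sketch evaluates the tensor dot product and the linearized strain tensor at a single point. The displacement gradient `grad_u` and the helper names `tensor_dot` and `strain` are hypothetical, chosen only for this example.

```python
import numpy as np

def tensor_dot(A, B):
    """Entrywise (Frobenius) inner product A . B of two tensors."""
    return float(np.sum(A * B))

def strain(grad_u):
    """Linearized strain tensor: symmetric part of the displacement gradient."""
    return 0.5 * (grad_u + grad_u.T)

# Hypothetical displacement gradient at one point (illustrative values only).
grad_u = np.array([[0.10, 0.30],
                   [-0.10, 0.20]])
eps = strain(grad_u)

# A . A is the pointwise integrand of the squared L2-norm of a tensor field.
eps_sq = tensor_dot(eps, eps)
```

The same quantity can be written as the trace of `eps.T @ eps`, which is a convenient cross-check of the entrywise sum.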

Given a sufficiently smooth domain $Ω \subset R^{2}$ , the $L_{2}$ -norm of a tensor-valued function $A = A (x)$ is given by

{∥ A ∥}_{L^{2}}^{2} = {∥ A ∥}_{L^{2} (Ω)}^{2} = \int_{Ω} A \cdot A = \int_{Ω} (A_{11}^{2} + A_{12}^{2} + A_{21}^{2} + A_{22}^{2}) .

On the other hand, for a vector-valued function $u (x) = {(u_{1} (x), u_{2} (x))}^{T}$ , the $L_{2}$ -norm is given by

{∥ u ∥}_{L_{2}}^{2} = {∥ u ∥}_{L_{2} (Ω)}^{2} = \int_{Ω} (u_{1}^{2} + u_{2}^{2}),

whereas the $H^{1}$ -norm by

{∥ u ∥}_{H^{1}}^{2} = {∥ u ∥}_{H^{1} (Ω)}^{2} = {∥ u ∥}_{L_{2}}^{2} + {∥ \nabla u ∥}_{L_{2}}^{2} .

In the following discussion, for the sake of simplicity, in (1a)-(1d) we set $g = 0$ . For this case, the space of test functions, denoted by $\hat{V}$ , is given by

\hat{V} = {\bar{v} \in H^{1} (Ω) \times H^{1} (Ω) : \bar{v} = 0 on Γ_{1}} .

By using Green’s identity and boundary conditions (1c) and (1d), we obtain the following weak form of elasticity system (1a)-(1d): Find $\bar{u} \in \hat{V}$ such that

\int_{Ω} 2 μ ϵ (\bar{u}) \cdot ϵ (\bar{v}) + \int_{Ω} λ (div \bar{u}) (div \bar{v}) = \int_{Ω} f \bar{v} + \int_{Γ_{2}} \bar{v} h for every \bar{v} \in \hat{V} .

(2)

The mixed finite element strategy consists, in the present context, of introducing a pressure term $p \in Q = L^{2} (Ω)$ through the relation

div \bar{u} = \frac{p}{λ} .

(3)

As $λ \to \infty$ , (3) yields the incompressibility limit

div \bar{u} = 0 .

The weak formulation of (3) reads

\int_{Ω} (div \bar{u}) q - \int_{Ω} \frac{1}{λ} p q = 0 for every q \in Q .

(4)

By using relation (3), the weak form (2) can be expressed as follows: Find $\bar{u} \in \hat{V}$ such that

\int_{Ω} 2 μ ϵ (\bar{u}) \cdot ϵ (\bar{v}) + \int_{Ω} p (div \bar{v}) = \int_{Ω} f \bar{v} + \int_{Γ_{2}} \bar{v} h for every \bar{v} \in \hat{V},

(5)

where the pressure p is also an unknown.

Therefore, the problem of finding $\bar{u} \in \hat{V}$ , satisfying (2), has now been converted into the saddle point problem of finding $(\bar{u}, p) \in \hat{V} \times Q$ such that

\int_{Ω} 2 μ ϵ (\bar{u}) \cdot ϵ (\bar{v}) + \int_{Ω} p (div \bar{v}) = \int_{Ω} f \bar{v} + \int_{Γ_{2}} \bar{v} h for every \bar{v} \in \hat{V},

(6a)

\int_{Ω} (div \bar{u}) q - \int_{Ω} \frac{1}{λ} p q = 0 for every q \in Q,

(6b)

where $Q = L^{2} (Ω)$ and $\hat{V} = {\bar{v} \in H^{1} (Ω) \times H^{1} (Ω) : \bar{v} = 0 on Γ_{1}}$ .

For the saddle point formulation, the Babuška-Brezzi condition provides guidance in the choice of finite element spaces necessary for a stable numerical approximation (see [22]).

The primary objective of this work is to develop an efficient computational framework for the elastography inverse problem. For this we employ an adjoint approach for the derivative computation of a recently proposed energy output least-squares (EOLS) functional [23]. We recall that Oberai et al. [24] used the adjoint approach to compute the gradient of the output least-squares functional efficiently. Inspired by Tortorelli and Michaleris [25], we also devise a hybrid method for an efficient computation of the second-order derivative of the EOLS functional. In this direction, we would also like to draw attention to an interesting paper by Cioaca, Alexe, and Sandu [26] where a second-order adjoint method is studied. All the results and formulas given are for a general saddle point problem and hence can easily be adapted to a wide range of inverse problems for variational problems (see [27]). In the derivation of the adjoint formulas, we do not include the regularization functional while considering the EOLS functional. However, we use a smooth regularizer for the identification of a smooth parameter and a BV regularizer for the identification of discontinuous coefficients.

2 Optimization approach for inverse problems in saddle point problems

Let $\hat{V}$ and Q be real Hilbert spaces, let B be a real Banach space, and let A be a nonempty, closed, and convex subset of B. Here B is the coefficient/parameter space and A is the set of all admissible coefficients. Let $a : B \times \hat{V} \times \hat{V} \to R$ be a trilinear map which we assume to be symmetric with respect to the second and third arguments. That is, for every $ℓ \in B$ and for all $\bar{u}, \bar{v} \in \hat{V}$ , we have $a (ℓ, \bar{u}, \bar{v}) = a (ℓ, \bar{v}, \bar{u})$ . Let $b : \hat{V} \times Q \to R$ be a bilinear form, let $c : Q \times Q \to R$ be a symmetric bilinear form, and let $m : \hat{V} \to R$ be a linear and continuous map. We assume that there are positive constants $κ_{1}$ , $κ_{2}$ , $ς_{1}$ , $ς_{2}$ , and $κ_{0}$ such that the following inequalities hold:

a (ℓ, \bar{v}, \bar{v}) \geq κ_{1} {∥ \bar{v} ∥}^{2} for every \bar{v} \in \hat{V}, for every ℓ \in A,

(7a)

a (ℓ, \bar{u}, \bar{v}) \leq κ_{2} ∥ ℓ ∥ ∥ \bar{u} ∥ ∥ \bar{v} ∥ for every \bar{u}, \bar{v} \in \hat{V}, for every ℓ \in A,

(7b)

c (q, q) \geq ς_{1} {∥ q ∥}^{2} for every q \in Q,

(7c)

| c (p, q) | \leq ς_{2} ∥ p ∥ ∥ q ∥ for every p, q \in Q,

(7d)

| b (\bar{v}, q) | \leq κ_{0} ∥ \bar{v} ∥ ∥ q ∥ for every \bar{v} \in \hat{V}, for every q \in Q .

(7e)

Remark 2.1 We remark that for the subsequent development of our approach, it suffices to assume that A is a closed and convex set of admissible parameters. Most commonly, it is chosen as the set of box-constraints. In some works, the space in which A resides is required to be compactly embedded in the solution space (see [28–31]). In our discrete examples, we have used linear elements to approximate the imposed box-constraints.

We consider the following saddle point problem: Given $ℓ \in A$ , find $(\bar{u}, p) \in \hat{V} \times Q$ such that

a (ℓ, \bar{u}, \bar{v}) + b (\bar{v}, p) = m (\bar{v}) for every \bar{v} \in \hat{V},

(8a)

b (\bar{u}, q) - c (p, q) = 0 for every q \in Q .

(8b)

Given all the data, the direct problem in this setting is to find $(\bar{u}, p)$ . However, our focus is on the inverse problem of finding a parameter $ℓ \in A$ that makes (8a)-(8b) true for a measurement $(\bar{z}, \hat{z})$ of $(\bar{u}, p)$ .

Evidently, saddle point problem (6a)-(6b) connected to the elastography inverse problem of identifying a variable parameter μ in the system of incompressible linear elasticity can be deduced by setting:

a (μ, \bar{u}, \bar{v}) = \int_{Ω} 2 μ ϵ (\bar{u}) \cdot ϵ (\bar{v}),

(9a)

b (\bar{u}, q) = \int_{Ω} (div \bar{u}) q,

(9b)

c (p, q) = \int_{Ω} \frac{1}{λ} p q,

(9c)

m (\bar{v}) = \int_{Ω} f \bar{v} + \int_{Γ_{2}} \bar{v} h .

(9d)

A common approach to solve inverse problems of parameter identification in PDEs is to minimize the output least-squares functional, which, in the present context, can be defined by

J_{OLS} (ℓ) = \frac{1}{2} {∥ u (ℓ) - z ∥}_{V}^{2} = \frac{1}{2} {∥ \bar{u} (ℓ) - \bar{z} ∥}_{\hat{V}}^{2} + \frac{1}{2} {∥ p (ℓ) - \hat{z} ∥}_{Q}^{2},

(10)

where $V = \hat{V} \times Q$ , $z = (\bar{z}, \hat{z}) \in V$ is the measured data, and $u (ℓ) = (\bar{u} (ℓ), p (ℓ)) \in V$ is the solution of (8a)-(8b) corresponding to ℓ.

The output least-squares solution to the inverse problem of identifying ℓ is the one that solves the following optimization problem: Find $\bar{ℓ} \in A$ such that

J_{OLS} (\bar{ℓ}) \leq J_{OLS} (ℓ) for every ℓ \in A .

Recently, in [23], the following objective functional was proposed to solve the inverse problem of identifying the variable parameter $ℓ \in A$ in saddle point problem (8a)-(8b):

J (ℓ) = \frac{1}{2} a (ℓ, \bar{u} (ℓ) - \bar{z}, \bar{u} (ℓ) - \bar{z}) + \frac{1}{2} c (p (ℓ) - \hat{z}, p (ℓ) - \hat{z}),

(11)

where $z = (\bar{z}, \hat{z})$ is the measured data and $u (ℓ) = (\bar{u} (ℓ), p (ℓ))$ is the solution of (8a)-(8b) corresponding to ℓ.

Clearly, to solve an optimization problem with the above objective functional, we need to compute its derivative which, in turn, requires us to compute the derivative of the solution map. It is well known that one of the most challenging aspects in the study of inverse problems is in finding an efficient computation of the derivative of the solution map. We will now develop an adjoint method for the computation of the first derivative of the EOLS functional and then a new hybrid method for the computation of the functional’s second derivative.

For every $ℓ \in A$ , the map $ℓ \to S (ℓ) = (\bar{u} (ℓ), p (ℓ))$ is well defined and single-valued. The following result for the differentiability of S, which was announced in [23] without a proof, will be needed.

Theorem 2.1 For each ℓ in the interior of A, $u = u (ℓ) = (\bar{u} (ℓ), p (ℓ))$ is infinitely differentiable at ℓ.

1. Given u, the first derivative $δ u = (δ \bar{u}, δ p) = (D \bar{u} (ℓ) δ ℓ, D p (ℓ) δ ℓ)$ is the unique solution of the saddle point problem:
$a (ℓ, δ \bar{u}, \bar{v}) + b (\bar{v}, δ p) = - a (δ ℓ, \bar{u}, \bar{v}) for every \bar{v} \in \hat{V},$
(12a)
$b (δ \bar{u}, q) - c (δ p, q) = 0 for every q \in Q .$
(12b)
2. The second-order derivative
$δ^{2} u = (δ^{2} \bar{u}, δ^{2} p) = (D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}))$

is the unique solution of the saddle point problem
$\begin{matrix} a (ℓ, δ^{2} \bar{u}, \bar{v}) + b (\bar{v}, δ^{2} p) = - a (δ ℓ_{2}, D \bar{u} (ℓ) δ ℓ_{1}, \bar{v}) \\ - a (δ ℓ_{1}, D \bar{u} (ℓ) δ ℓ_{2}, \bar{v}) for every \bar{v} \in \hat{V}, \end{matrix}$
(13a)

$b (δ^{2} \bar{u}, q) - c (δ^{2} p, q) = 0 for every q \in Q .$
(13b)

Proof We define a map $G : A \times (\hat{V} \times Q) \to {\hat{V}}^{*} \times Q^{*}$ by $G (ℓ, (\bar{u}, p)) = (a (ℓ, \bar{u}) + b (p) - m, b (\bar{u}) - c (p))$ , where ${\hat{V}}^{*}$ and $Q^{*}$ are the duals of $\hat{V}$ and Q, and $a (ℓ, \bar{u})$ , $b (p)$ , and $c (p)$ are the associated dual elements given by the Riesz theorem. Then saddle point problem (8a)-(8b) is equivalent to the following implicit equation:

G (ℓ, u) = (0_{{\hat{V}}^{*}}, 0_{Q^{*}}) .

(14)

The differentiability of $u = u (ℓ)$ follows from the implicit function theorem. In fact, the map G is infinitely differentiable and the partial derivative with respect to variable $u = (\bar{u}, p)$ is given by

D_{u} G (ℓ, (\bar{u}, p)) (δ \bar{u}, δ p) \equiv (a (ℓ, δ \bar{u}) + b (δ p), b (δ \bar{u}) - c (δ p)) for every (δ \bar{u}, δ p) \in \hat{V} \times Q .

By [22, Proposition 4], the map $D_{u} G (ℓ, (\bar{u}, p)) : \hat{V} \times Q \to {\hat{V}}^{*} \times Q^{*}$ is an isomorphism. Therefore, using the implicit function theorem, the map $u = u (ℓ)$ is infinitely differentiable at any ℓ in the interior of A.

We now compute the first and second derivatives of the coefficient-to-solution map. By using equation (8a), for any $\hat{ℓ} \in A$ and for any sufficiently small $t \in R_{+}$ , we have

\begin{matrix} a (ℓ + t \hat{ℓ}, \bar{u} (ℓ + t \hat{ℓ}), \bar{v}) + b (\bar{v}, p (ℓ + t \hat{ℓ})) = m (\bar{v}), \\ a (ℓ, \bar{u} (ℓ), \bar{v}) + b (\bar{v}, p (ℓ)) = m (\bar{v}), \end{matrix}

and by manipulating the terms in these two equations, we obtain

a (ℓ + t \hat{ℓ}, \frac{\bar{u} (ℓ + t \hat{ℓ}) - \bar{u} (ℓ)}{t}, \bar{v}) + a (\hat{ℓ}, \bar{u} (ℓ), \bar{v}) + b (\bar{v}, \frac{p (ℓ + t \hat{ℓ}) - p (ℓ)}{t}) = 0,

which, on passing to the limit $t \to 0^{+}$ , yields (12a):

a (ℓ, D \bar{u} (ℓ) (\hat{ℓ}), \bar{v}) + b (\bar{v}, D p (ℓ) (\hat{ℓ})) = - a (\hat{ℓ}, \bar{u} (ℓ), \bar{v}) .

Analogously, using equation (8b), for any $\hat{ℓ} \in A$ and for every sufficiently small $t \in R_{+}$ , we have

\begin{matrix} b (\bar{u} (ℓ + t \hat{ℓ}), q) - c (p (ℓ + t \hat{ℓ}), q) = 0, \\ b (\bar{u} (ℓ), q) - c (p (ℓ), q) = 0, \end{matrix}

and by manipulating the above two equations, we obtain

b (\frac{\bar{u} (ℓ + t \hat{ℓ}) - \bar{u} (ℓ)}{t}, q) - c (\frac{p (ℓ + t \hat{ℓ}) - p (ℓ)}{t}, q) = 0,

which, on passing to the limit $t \to 0^{+}$ , gives

b (D \bar{u} (ℓ) (\hat{ℓ}), q) - c (D p (ℓ) (\hat{ℓ}), q) = 0,

which is (12b). Consequently, (12a) and (12b) characterize the first derivative.

The same arguments can be used to compute the form of the second derivative. From (12a), for any ${\hat{ℓ}}_{1}, {\hat{ℓ}}_{2} \in A$ and for any sufficiently small $t \in R_{+}$ , we have

\begin{matrix} a (ℓ + t {\hat{ℓ}}_{1}, D \bar{u} (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}), \bar{v}) + b (\bar{v}, D p (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2})) = - a ({\hat{ℓ}}_{2}, \bar{u} (ℓ + t {\hat{ℓ}}_{1}), \bar{v}), \\ a (ℓ, D \bar{u} (ℓ) ({\hat{ℓ}}_{2}), \bar{v}) + b (\bar{v}, D p (ℓ) ({\hat{ℓ}}_{2})) = - a ({\hat{ℓ}}_{2}, \bar{u} (ℓ), \bar{v}), \end{matrix}

and by rearranging the above set of equations, we obtain

\begin{matrix} a ({\hat{ℓ}}_{1}, D \bar{u} (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}), \bar{v}) + a (ℓ, \frac{D \bar{u} (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}) - D \bar{u} (ℓ) ({\hat{ℓ}}_{2})}{t}, \bar{v}) \\ + b (\bar{v}, \frac{D p (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}) - D p (ℓ) ({\hat{ℓ}}_{2})}{t}) \\ = - a ({\hat{ℓ}}_{2}, \frac{\bar{u} (ℓ + t {\hat{ℓ}}_{1}) - \bar{u} (ℓ)}{t}, \bar{v}) . \end{matrix}

Since the solution map $u = u (ℓ)$ is twice Fréchet differentiable, by passing to the limit $t \to 0^{+}$ , we get

\begin{matrix} a ({\hat{ℓ}}_{1}, D \bar{u} (ℓ) ({\hat{ℓ}}_{2}), \bar{v}) + a (ℓ, D^{2} \bar{u} ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}), \bar{v}) + b (\bar{v}, D^{2} p ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2})) \\ = - a ({\hat{ℓ}}_{2}, D \bar{u} (ℓ) ({\hat{ℓ}}_{1}), \bar{v}), \end{matrix}

(15)

which, after a rearrangement of terms, yields (13a)

a (ℓ, D^{2} \bar{u} ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}), \bar{v}) + b (\bar{v}, D^{2} p ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2})) = - a ({\hat{ℓ}}_{2}, D \bar{u} (ℓ) ({\hat{ℓ}}_{1}), \bar{v}) - a ({\hat{ℓ}}_{1}, D \bar{u} (ℓ) ({\hat{ℓ}}_{2}), \bar{v}) .

From (12b), for any ${\hat{ℓ}}_{1}, {\hat{ℓ}}_{2} \in A$ and any sufficiently small $t \in R_{+}$ , we have

\begin{matrix} b (D \bar{u} (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}), q) - c (D p (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}), q) = 0, \\ b (D \bar{u} (ℓ) ({\hat{ℓ}}_{2}), q) - c (D p (ℓ) ({\hat{ℓ}}_{2}), q) = 0, \end{matrix}

and by rearranging the above two equations, we get

b (\frac{D \bar{u} (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}) - D \bar{u} (ℓ) ({\hat{ℓ}}_{2})}{t}, q) - c (\frac{D p (ℓ + t {\hat{ℓ}}_{1}) ({\hat{ℓ}}_{2}) - D p (ℓ) ({\hat{ℓ}}_{2})}{t}, q) = 0 .

By passing to limit $t \to 0^{+}$ , we finally deduce

b (D^{2} \bar{u} ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}), q) - c (D^{2} p ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}), q) = 0,

which is (13b). In conjunction with (13a), it forms the saddle point problem whose unique solution characterizes the second derivative $D^{2} u ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}) = (D^{2} \bar{u} ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}), D^{2} p ({\hat{ℓ}}_{1}, {\hat{ℓ}}_{2}))$ . □
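The derivative characterizations of Theorem 2.1 can be checked numerically in a finite-dimensional setting. The sketch below is only an illustration under stated assumptions: the forms a, b, c, and m are replaced by small random matrices (`T`, `B`, `C`, `F` are made-up stand-ins, not finite element assemblies), and the helper names are hypothetical. It solves the linear systems that mirror (12a)-(12b) and (13a)-(13b) and compares the results with central finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, m = 5, 2, 3   # sizes of the displacement, pressure, and parameter vectors

def spd(size):
    """Random symmetric positive definite matrix (illustrative data)."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

# T[:, :, j] stands in for a(phi_j, ., .), so K_hat(L) = sum_j L_j T[:, :, j].
T = np.stack([spd(n) for _ in range(m)], axis=2)
B = rng.standard_normal((k, n))   # stand-in for b(., .)
C = spd(k)                        # stand-in for c(., .)
F = rng.standard_normal(n)        # stand-in for m(.)

def K_hat(L):
    return np.einsum("ijk,k->ij", T, L)

def K(L):
    return np.block([[K_hat(L), B.T], [B, -C]])

def solve_state(L):
    """Analogue of (8a)-(8b): returns the stacked vector (u, p)."""
    return np.linalg.solve(K(L), np.concatenate([F, np.zeros(k)]))

def first_derivative(L, dL):
    """Analogue of (12a)-(12b): same operator, right-hand side -a(dL, u, .)."""
    u = solve_state(L)[:n]
    rhs = np.concatenate([-K_hat(dL) @ u, np.zeros(k)])
    return np.linalg.solve(K(L), rhs)

def second_derivative(L, dL1, dL2):
    """Analogue of (13a)-(13b)."""
    du1 = first_derivative(L, dL1)[:n]
    du2 = first_derivative(L, dL2)[:n]
    rhs = np.concatenate([-K_hat(dL2) @ du1 - K_hat(dL1) @ du2, np.zeros(k)])
    return np.linalg.solve(K(L), rhs)

L = 1.0 + rng.random(m)   # positive parameter, so K_hat(L) is positive definite
dL1, dL2 = rng.standard_normal(m), rng.standard_normal(m)
h = 1e-5

# Central finite differences serve as an independent check.
fd1 = (solve_state(L + h * dL1) - solve_state(L - h * dL1)) / (2 * h)
fd2 = (first_derivative(L + h * dL1, dL2)
       - first_derivative(L - h * dL1, dL2)) / (2 * h)
```

Every derivative is obtained with the same system matrix as the state, so a single factorization of `K(L)` could in principle be reused for all right-hand sides.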

3 An adjoint and a hybrid method for the energy output least squares

The developed adjoint method for the EOLS functional,

J (ℓ) = \frac{1}{2} a (ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + \frac{1}{2} c (p - \hat{z}, p - \hat{z}),

is based on the key observation that the underlying saddle point problem can equivalently be posed as a variational equation of finding $u = (\bar{u}, p) \in V = \hat{V} \times Q$ such that

T (ℓ, u, v) = m (\bar{v}) for every v = (\bar{v}, q) \in V,

(16)

where

T (ℓ, u, v) = a (ℓ, \bar{u}, \bar{v}) + b (\bar{v}, p) + b (\bar{u}, q) - c (p, q) .

(17)

By a direct computation, we have

\begin{aligned} D T (ℓ, u, v) (ℓ) (δ ℓ) = & a (δ ℓ, \bar{u}, \bar{v}) + a (ℓ, δ \bar{u}, \bar{v}) \\ & + b (\bar{v}, δ p) + b (δ \bar{u}, q) - c (δ p, q) \\ = & a (δ ℓ, \bar{u}, \bar{v}) + T (ℓ, δ u, v) . \end{aligned}

(18)

We define

J (ℓ, v) = J (ℓ) + T (ℓ, u, v) - m (\bar{v}) for every v \in V,

and, by using (16), notice that

J (ℓ, v) = J (ℓ) for every v \in V .

Therefore, for any ‘test function’ $v = (\bar{v}, q) \in V$ , we have

D J (ℓ) (δ ℓ) = D_{ℓ} J (ℓ, v) (δ ℓ) for every δ ℓ \in A,

(19)

where $D_{ℓ}$ stands for the partial derivative with respect to ℓ.

The key idea behind the adjoint method is to choose a particular v to avoid the computation of δu. By a direct computation and taking into account (18), we obtain

\begin{array}{rcl} D_{ℓ} J (ℓ, v) (δ ℓ) & = & \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (ℓ, δ \bar{u}, \bar{u} - \bar{z}) + c (δ p, p - \hat{z}) \\ + a (δ ℓ, \bar{u}, \bar{v}) + T (ℓ, δ u, v) . \end{array}

(20)

Now, let $w = w (ℓ)$ be the unique solution of the saddle point problem

a (ℓ, \bar{w}, \bar{v}) + b (\bar{v}, p_{w}) = - a (ℓ, \bar{u} - \bar{z}, \bar{v}) - b (\bar{v}, p - \hat{z}) for every \bar{v} \in \hat{V},

(21a)

b (\bar{w}, q) - c (p_{w}, q) = 0 for every q \in Q,

(21b)

which exists, by standard arguments, since the above problem is just (8a)-(8b) with $m (\cdot) = - a (ℓ, \bar{u} - \bar{z}, \cdot) - b (\cdot, p - \hat{z})$ .

By setting $v = w$ in (20), we obtain

\begin{array}{rcl} D_{ℓ} J (ℓ, w) (δ ℓ) & = & \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (ℓ, δ \bar{u}, \bar{u} - \bar{z}) + c (δ p, p - \hat{z}) \\ + a (δ ℓ, \bar{u}, \bar{w}) + T (ℓ, δ u, w) \\ = & \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (ℓ, δ \bar{u}, \bar{u} - \bar{z}) + c (δ p, p - \hat{z}) \\ + a (δ ℓ, \bar{u}, \bar{w}) - a (ℓ, \bar{u} - \bar{z}, δ \bar{u}) - b (δ \bar{u}, p - \hat{z}) \\ = & \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (δ ℓ, \bar{u}, \bar{w}) \\ + c (δ p, p - \hat{z}) - b (δ \bar{u}, p - \hat{z}), \end{array}

where the second equality uses the symmetry of T in its last two arguments, so that $T (ℓ, δ u, w) = T (ℓ, w, δ u)$ , together with (21a) for $\bar{v} = δ \bar{u}$ and (21b) for $q = δ p$ , and the third equality uses the symmetry of $a (ℓ, \cdot, \cdot)$ . Since (12b) with $q = p - \hat{z}$ gives $c (δ p, p - \hat{z}) = b (δ \bar{u}, p - \hat{z})$ , we obtain

D_{ℓ} J (ℓ, w) (δ ℓ) = \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (δ ℓ, \bar{u}, \bar{w}) .

Therefore, using (19), we have

D J (ℓ) (δ ℓ) = \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (δ ℓ, \bar{u}, \bar{w}) .

(22)

Summarizing, we have the following scheme to compute the derivative $D J (ℓ)$ given $ℓ \in A$ :

1. Compute u by (16).
2. Compute w by (21a)-(21b).
3. Compute $D J (ℓ)$ by (22).
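The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not an elastography code: the matrices `T`, `B`, `C`, the load `F`, and the data `z_bar`, `z_hat` are random stand-ins chosen for the example, and the function names are hypothetical. The gradient from formula (22) is compared with central finite differences of J.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, m = 5, 2, 3

def spd(size):
    """Random symmetric positive definite matrix (illustrative data)."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

T = np.stack([spd(n) for _ in range(m)], axis=2)  # T[:, :, j] ~ a(phi_j, ., .)
B = rng.standard_normal((k, n))                   # stand-in for b(., .)
C = spd(k)                                        # stand-in for c(., .)
F = rng.standard_normal(n)                        # stand-in for m(.)
z_bar = rng.standard_normal(n)                    # made-up displacement data
z_hat = rng.standard_normal(k)                    # made-up pressure data

def K_hat(L):
    return np.einsum("ijk,k->ij", T, L)

def solve_saddle(L, top):
    """Solve the saddle point system with right-hand side (top, 0)."""
    K = np.block([[K_hat(L), B.T], [B, -C]])
    sol = np.linalg.solve(K, np.concatenate([top, np.zeros(k)]))
    return sol[:n], sol[n:]

def eols(L):
    """EOLS functional J for the stand-in problem."""
    u, p = solve_saddle(L, F)
    e, r = u - z_bar, p - z_hat
    return 0.5 * e @ K_hat(L) @ e + 0.5 * r @ C @ r

def eols_gradient(L):
    # Step 1: state (u, p) from (16).
    u, p = solve_saddle(L, F)
    e, r = u - z_bar, p - z_hat
    # Step 2: adjoint state w from (21a)-(21b).
    w, _ = solve_saddle(L, -K_hat(L) @ e - B.T @ r)
    # Step 3: formula (22), one entry per parameter basis direction.
    return (0.5 * np.einsum("i,ijk,j->k", e, T, e)
            + np.einsum("i,ijk,j->k", u, T, w))

L = 1.0 + rng.random(m)
grad = eols_gradient(L)

# Independent check: central finite differences of J in each coordinate.
h = 1e-6
fd = np.array([(eols(L + h * d) - eols(L - h * d)) / (2 * h) for d in np.eye(m)])
```

The adjoint route needs two linear solves regardless of the number of parameters m, whereas the finite-difference check needs 2m solves.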

Let us now develop the hybrid method for the computation of the second-order derivative. In the hybrid method proposed below, the derivative δu is computed directly while the computation of the second derivative $δ^{2} u$ is avoided by using an adjoint method. We will follow the same general scheme that was used above, but here we will use derivative formula (12a)-(12b).

Let $δ ℓ_{2} \in A$ be a fixed direction. Then, for any $v = (\bar{v}, q) \in V$ , we define

\begin{array}{rcl} H (ℓ, v) & = & D J (ℓ) (δ ℓ_{2}) + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{v}) + b (\bar{v}, D p (ℓ) (δ ℓ_{2})) \\ + b (D \bar{u} (ℓ) (δ ℓ_{2}), q) - c (D p (ℓ) (δ ℓ_{2}), q) + a (δ ℓ_{2}, \bar{u}, \bar{v}) \\ = & \frac{1}{2} a (δ ℓ_{2}, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + c (D p (ℓ) (δ ℓ_{2}), p - \hat{z}) + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{v}) + b (\bar{v}, D p (ℓ) (δ ℓ_{2})) \\ + b (D \bar{u} (ℓ) (δ ℓ_{2}), q) - c (D p (ℓ) (δ ℓ_{2}), q) + a (δ ℓ_{2}, \bar{u}, \bar{v}) . \end{array}

By the construction of H, for every $v \in V$ , we have

\frac{\partial H}{\partial ℓ} (ℓ, v) (δ ℓ_{1}) = D^{2} J (ℓ) (δ ℓ_{1}, δ ℓ_{2}) for every δ ℓ_{1} \in A .

(23)

By a simple calculation, we have

\begin{array}{rcl} \frac{\partial H}{\partial ℓ} (ℓ, v) (δ ℓ_{1}) & = & a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{u} - \bar{z}) + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), D \bar{u} (ℓ) (δ ℓ_{1})) \\ + c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), p - \hat{z}) + c (D p (ℓ) (δ ℓ_{2}), D p (ℓ) (δ ℓ_{1})) \\ + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{v}) + a (ℓ, D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \bar{v}) \\ + b (\bar{v}, D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2})) + b (D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), q) \\ - c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), q) + a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{v}) . \end{array}

(24)

Let $w (ℓ) = (\bar{w} (ℓ), p_{w} (ℓ))$ be the unique solution of the saddle point problem (cf. (12a)-(12b)):

a (ℓ, \bar{w}, \bar{v}) + b (\bar{v}, p_{w}) = a (ℓ, \bar{v}, \bar{z} - \bar{u}) + b (\bar{v}, \hat{z} - p) for every \bar{v} \in \hat{V},

(25a)

b (\bar{w}, q) - c (p_{w}, q) = 0 for every q \in Q .

(25b)

By setting $v = w$ in (24), we have

\begin{array}{rcl} \frac{\partial H}{\partial ℓ} (ℓ, w) (δ ℓ_{1}) & = & a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{u} - \bar{z}) + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \bar{u} - \bar{z}) + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), D \bar{u} (ℓ) (δ ℓ_{1})) \\ + c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), p - \hat{z}) + c (D p (ℓ) (δ ℓ_{2}), D p (ℓ) (δ ℓ_{1})) \\ + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{w}) + a (ℓ, D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \bar{w}) \\ + b (\bar{w}, D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2})) + b (D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), q) \\ - c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), q) + a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{w}) \\ = & a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{u} - \bar{z}) + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), D \bar{u} (ℓ) (δ ℓ_{1})) + c (D p (ℓ) (δ ℓ_{2}), D p (ℓ) (δ ℓ_{1})) \\ + a (ℓ, D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \bar{u} - \bar{z}) + c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), p - \hat{z}) \\ + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{w}) + a (ℓ, D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \bar{w}) \\ + b (\bar{w}, D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2})) + b (D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), q) \\ - c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), q) + a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{w}) \\ = & a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{u} - \bar{z}) + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), D \bar{u} (ℓ) (δ ℓ_{1})) + c (D p (ℓ) (δ ℓ_{2}), D p (ℓ) (δ ℓ_{1})) \\ + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{w}) + a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{w}) \\ + c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), p - \hat{z}) + b (D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), \hat{z} - p) . \end{array}

Recall that by derivative formula (13b) with $q = p - \hat{z}$ , we have

c (D^{2} p (ℓ) (δ ℓ_{1}, δ ℓ_{2}), p - \hat{z}) = b (D^{2} \bar{u} (ℓ) (δ ℓ_{1}, δ ℓ_{2}), p - \hat{z}),

which implies

\begin{array}{rcl} \frac{\partial H}{\partial ℓ} (ℓ, w) (δ ℓ_{1}) & = & a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{u} - \bar{z}) + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), D \bar{u} (ℓ) (δ ℓ_{1})) + c (D p (ℓ) (δ ℓ_{2}), D p (ℓ) (δ ℓ_{1})) \\ + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{w}) + a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{w}) . \end{array}

Consequently, from (23), we get

\begin{array}{rcl} D^{2} J (ℓ) (δ ℓ_{1}, δ ℓ_{2}) & = & a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{u} - \bar{z}) + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{u} - \bar{z}) \\ + a (ℓ, D \bar{u} (ℓ) (δ ℓ_{2}), D \bar{u} (ℓ) (δ ℓ_{1})) + c (D p (ℓ) (δ ℓ_{2}), D p (ℓ) (δ ℓ_{1})) \\ + a (δ ℓ_{1}, D \bar{u} (ℓ) (δ ℓ_{2}), \bar{w}) + a (δ ℓ_{2}, D \bar{u} (ℓ) (δ ℓ_{1}), \bar{w}) \end{array}

and, in particular,

D^{2} J (ℓ) (δ ℓ, δ ℓ) = 2 a (δ ℓ, δ \bar{u}, \bar{u} - \bar{z}) + a (ℓ, δ \bar{u}, δ \bar{u}) + c (δ p, δ p) + 2 a (δ ℓ, δ \bar{u}, \bar{w}) .

(26)

Summarizing, we propose the following scheme to compute the derivative $D^{2} J (ℓ) (δ ℓ, δ ℓ)$ given $ℓ \in A$ , $δ ℓ \in A$ .

1. Compute $u (ℓ) = (\bar{u} (ℓ), p (ℓ))$ by (16).
2. Compute $δ u = (δ \bar{u}, δ p)$ by (12a)-(12b).
3. Compute $w (ℓ) = (\bar{w} (ℓ), p_{w} (ℓ))$ by (25a)-(25b).
4. Compute $D^{2} J (ℓ) (δ ℓ, δ ℓ)$ by (26).
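The four steps of the hybrid scheme can likewise be sketched in NumPy. As before, this is an illustration under stated assumptions: `T`, `B`, `C`, `F`, `z_bar`, and `z_hat` are random stand-ins and all function names are hypothetical. The value from (26) is compared with a central finite difference of the first directional derivative, which is computed here by direct differentiation as an independent reference.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 5, 2, 3

def spd(size):
    """Random symmetric positive definite matrix (illustrative data)."""
    G = rng.standard_normal((size, size))
    return G @ G.T + size * np.eye(size)

T = np.stack([spd(n) for _ in range(m)], axis=2)  # T[:, :, j] ~ a(phi_j, ., .)
B = rng.standard_normal((k, n))
C = spd(k)
F = rng.standard_normal(n)
z_bar = rng.standard_normal(n)   # made-up displacement data
z_hat = rng.standard_normal(k)   # made-up pressure data

def K_hat(L):
    return np.einsum("ijk,k->ij", T, L)

def solve_saddle(L, top):
    """Solve the saddle point system with right-hand side (top, 0)."""
    K = np.block([[K_hat(L), B.T], [B, -C]])
    sol = np.linalg.solve(K, np.concatenate([top, np.zeros(k)]))
    return sol[:n], sol[n:]

def directional_derivative(L, dL):
    """DJ(L)(dL) by direct differentiation (reference value only)."""
    u, p = solve_saddle(L, F)
    du, dp = solve_saddle(L, -K_hat(dL) @ u)
    e, r = u - z_bar, p - z_hat
    return 0.5 * e @ K_hat(dL) @ e + e @ K_hat(L) @ du + r @ C @ dp

def hybrid_second_derivative(L, dL):
    u, p = solve_saddle(L, F)                        # step 1: (16)
    du, dp = solve_saddle(L, -K_hat(dL) @ u)         # step 2: (12a)-(12b)
    e, r = u - z_bar, p - z_hat
    w, _ = solve_saddle(L, -K_hat(L) @ e - B.T @ r)  # step 3: (25a)-(25b)
    Kd = K_hat(dL)
    # Step 4: formula (26).
    return (2.0 * e @ Kd @ du + du @ K_hat(L) @ du
            + dp @ C @ dp + 2.0 * w @ Kd @ du)

L = 1.0 + rng.random(m)
dL = rng.standard_normal(m)
d2J = hybrid_second_derivative(L, dL)

h = 1e-5
fd = (directional_derivative(L + h * dL, dL)
      - directional_derivative(L - h * dL, dL)) / (2 * h)
```

Only first-order sensitivities `du`, `dp` and one adjoint solve appear; the second derivative of the state is never formed.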

4 Discretization formulas for the adjoint and the hybrid method

In this section, we collect discrete formulae for saddle point problem (8a)-(8b) and the associated inverse problem. We begin with a triangulation $T_{h}$ on Ω. Let $L_{h}$ be the space of all piecewise continuous polynomials of degree $d_{ℓ}$ relative to $T_{h}$ , let $U_{h}$ be the space of all piecewise continuous polynomials of degree $d_{u}$ relative to $T_{h}$ , and let $Q_{h}$ be the space of all piecewise continuous polynomials of degree $d_{q}$ relative to $T_{h}$ .

In order to represent the discrete saddle point problem in a computable form, we proceed as follows. We represent bases for $L_{h}$ , $U_{h}$ , and $Q_{h}$ by $\{φ_{1}, φ_{2}, \dots, φ_{m}\}$ , $\{ψ_{1}, ψ_{2}, \dots, ψ_{n}\}$ , and $\{χ_{1}, χ_{2}, \dots, χ_{k}\}$ , respectively. The space $L_{h}$ is then isomorphic to $R^{m}$ and for any $ℓ \in L_{h}$ , we define $L \in R^{m}$ by $L_{i} = ℓ (x_{i})$ for $i = 1, 2, \dots, m$ , where the nodal basis $\{φ_{1}, φ_{2}, \dots, φ_{m}\}$ corresponds to the nodes $\{x_{1}, x_{2}, \dots, x_{m}\}$ . Conversely, each $L \in R^{m}$ corresponds to $ℓ \in L_{h}$ defined by $ℓ = \sum_{i = 1}^{m} L_{i} φ_{i}$ . Similarly, $\bar{u} \in U_{h}$ will correspond to $\bar{U} \in R^{n}$ , where ${\bar{U}}_{i} = \bar{u} (y_{i})$ , $i = 1, 2, \dots, n$ , and $\bar{u} = \sum_{i = 1}^{n} {\bar{U}}_{i} ψ_{i}$ , where $y_{1}, y_{2}, \dots, y_{n}$ are the nodes of the mesh defining $U_{h}$ . Finally, $q \in Q_{h}$ will correspond to $Q \in R^{k}$ , where $Q_{i} = q (z_{i})$ , $i = 1, 2, \dots, k$ , and $q = \sum_{i = 1}^{k} Q_{i} χ_{i}$ , where $z_{1}, z_{2}, \dots, z_{k}$ are the nodes of the mesh defining $Q_{h}$ . (The spaces $L_{h}$ , $U_{h}$ , and $Q_{h}$ are defined relative to the same elements, but the nodes will be different if the degrees $d_{ℓ}$ , $d_{u}$ , and $d_{q}$ differ.)

Recall that the discrete saddle point problem seeks, for each $ℓ_{h}$ , the unique $({\bar{u}}_{h}, p_{h}) \in U_{h} \times Q_{h}$ with

a (ℓ_{h}, {\bar{u}}_{h}, \bar{v}) + b (\bar{v}, p_{h}) = m (\bar{v}) for every \bar{v} \in U_{h},

(27a)

b ({\bar{u}}_{h}, q) - c (p_{h}, q) = 0 for every q \in Q_{h} .

(27b)

We define $S : R^{m} \to R^{n + k}$ to be the finite element solution operator that assigns to each coefficient $ℓ_{h} \in L_{h}$ the unique approximate solution $u_{h} = ({\bar{u}}_{h}, p_{h}) \in U_{h} \times Q_{h}$ . Then $S (L) = U$ , where U is defined by

K (L) U = F,

(28)

and where the stiffness matrix $K (L) \in R^{(n + k) \times (n + k)}$ and the load vector $F \in R^{n + k}$ are given by

K (L) = [\begin{array}{cc} \hat{K} (L) & B^{T} \\ B & - C \end{array}]

with

\begin{matrix} \hat{K} {(L)}_{i, j} = a (ℓ, ψ_{j}, ψ_{i}), i, j = 1, 2, \dots, n, \\ B_{i, j} = b (ψ_{j}, χ_{i}), i = 1, 2, \dots, k, j = 1, 2, \dots, n, \\ C_{i, j} = c (χ_{j}, χ_{i}), i, j = 1, 2, \dots, k, \\ F_{i} = m (ψ_{i}), i = 1, 2, \dots, n, \\ F_{j} = 0, j = n + 1, n + 2, \dots, n + k . \end{matrix}
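The block structure of (28) can be sketched numerically as follows. This is only an illustration under stated assumptions: the matrices `K_hat`, `B`, `C` and the vector `F_bar` are random stand-ins with the right shapes and symmetry, not matrices assembled from the bilinear forms $a$, $b$, $c$ and the functional $m$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 3                      # sizes of the displacement and pressure blocks

# Hypothetical stand-ins for the assembled finite element matrices.
M = rng.standard_normal((n, n))
K_hat = M @ M.T + n * np.eye(n)  # symmetric positive definite, like a(l, ., .)
B = rng.standard_normal((k, n))  # B[i, j] plays the role of b(psi_j, chi_i)
N = rng.standard_normal((k, k))
C = N @ N.T + np.eye(k)          # symmetric positive definite, like c(., .)
F_bar = rng.standard_normal(n)   # F_i plays the role of m(psi_i)

# Saddle point matrix K(L) and load vector F with a zero pressure block.
K = np.block([[K_hat, B.T], [B, -C]])
F = np.concatenate([F_bar, np.zeros(k)])

U = np.linalg.solve(K, F)
U_bar, P = U[:n], U[n:]          # displacement and pressure coefficients
```

The two block rows of the solved system are the discrete counterparts of (27a) and (27b).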

For future reference, it will be useful to note that

\hat{K} {(L)}_{i j} = T_{i j k} L_{k},

where the summation convention is used and T is the tensor defined by

T_{i j k} = a (φ_{k}, ψ_{i}, ψ_{j}) for every i, j = 1, \dots, n, k = 1, \dots, m .
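To make the tensor representation concrete, the following sketch assembles $T$ for a one-dimensional scalar model problem and contracts it with a coefficient vector. The model form $a (ℓ, u, v) = \int_{0}^{1} ℓ u' v'$, the use of piecewise linear elements for both the coefficient and the state, and the name `assemble_tensor` are assumptions made for illustration; the elasticity form of the paper is not reproduced here.

```python
import numpy as np

def assemble_tensor(num_elements):
    """T[i, j, k] = integral over (0, 1) of phi_k * psi_i' * psi_j' for P1 elements."""
    h = 1.0 / num_elements
    n = num_elements + 1
    T = np.zeros((n, n, n))
    grad = np.array([-1.0 / h, 1.0 / h])   # derivatives of the two local hat functions
    for e in range(num_elements):
        idx = [e, e + 1]                   # global indices of the element's nodes
        for a_loc, i in enumerate(idx):
            for b_loc, j in enumerate(idx):
                for k in idx:
                    # The integral of phi_k over one element equals h / 2.
                    T[i, j, k] += grad[a_loc] * grad[b_loc] * h / 2.0
    return T

T = assemble_tensor(4)
L = np.ones(5)                             # nodal values of the coefficient ell = 1
K_hat = np.einsum("ijk,k->ij", T, L)       # K_hat(L)_{ij} = T_{ijk} L_k
```

For the constant coefficient $ℓ = 1$ the contraction returns the familiar tridiagonal stiffness matrix of the one-dimensional Laplacian, which gives a quick sanity check on the assembly.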

Let us now compute the discrete analogue of the energy output least-squares objective functional. Using the above notation, the discrete form of

J (ℓ) = \frac{1}{2} a (ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + \frac{1}{2} c (p - \hat{z}, p - \hat{z})

is given by

J (L) = \frac{1}{2} {(\bar{U} (L) - \bar{Z})}^{T} \hat{K} (L) (\bar{U} (L) - \bar{Z}) + \frac{1}{2} {(P (L) - \hat{Z})}^{T} C (P (L) - \hat{Z}) .

In order to get an operative expression for the gradient, we need to consider the so-called adjoint stiffness matrix $A$ defined by the following condition:

\hat{K} (L) \bar{V} = A (\bar{V}) L for every L \in R^{m}, for every \bar{V} \in R^{n} .

(29)
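In terms of the tensor $T$, condition (29) is satisfied by $A {(\bar{V})}_{i k} = T_{i j k} {\bar{V}}_{j}$. The sketch below checks this identity numerically; the random tensor is a hypothetical stand-in that is symmetric in its first two indices, and the function names `K_hat` and `A` are chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 4
T = rng.standard_normal((n, n, m))
T = 0.5 * (T + T.transpose(1, 0, 2))   # symmetric in (i, j), as for a symmetric form a

def K_hat(L):
    # K_hat(L)_{ij} = T_{ijk} L_k
    return np.einsum("ijk,k->ij", T, L)

def A(V):
    # A(V)_{ik} = T_{ijk} V_j, so that A(V) L = K_hat(L) V
    return np.einsum("ijk,j->ik", T, V)

L = rng.standard_normal(m)
V = rng.standard_normal(n)
lhs = K_hat(L) @ V    # left-hand side of (29)
rhs = A(V) @ L        # right-hand side of (29)
```

Both sides equal $T_{i j k} {\bar{V}}_{j} L_{k}$; the adjoint stiffness matrix simply moves the contraction from the coefficient index to the state index.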

4.1 Computation of the gradient by using the adjoint method

Using the above notation, we have the following discrete adjoint method for the computation of the gradient of $J (\cdot)$.

1.
We compute $U = [\begin{array}{c} \bar{U} (L) \\ P (L) \end{array}]$ by solving the linear system
$[\begin{array}{cc} \hat{K} (L) & B^{T} \\ B & - C \end{array}] [\begin{array}{c} \bar{U} (L) \\ P (L) \end{array}] = [\begin{array}{c} F \\ 0 \end{array}] .$
(30)
2.
We compute $W = [\begin{array}{c} \bar{W} (L) \\ P_{W} (L) \end{array}]$ by solving the linear system
$[\begin{array}{cc} \hat{K} (L) & B^{T} \\ B & - C \end{array}] [\begin{array}{c} \bar{W} (L) \\ P_{W} (L) \end{array}] = [\begin{array}{c} - \hat{K} (L) (\bar{U} - \bar{Z}) - B^{T} (P - \hat{Z}) \\ 0 \end{array}] .$
(31)
3.
The gradient $\nabla J (L)$ can be calculated by using the adjoint stiffness matrix. From (22), we have
$D J (ℓ) (δ ℓ) = \frac{1}{2} a (δ ℓ, \bar{u} - \bar{z}, \bar{u} - \bar{z}) + a (δ ℓ, \bar{u}, \bar{w}) .$
(32)

A direct discretization gives the following:
$\begin{aligned} \nabla J (L) (δ L) & = \frac{1}{2} {(\bar{U} - \bar{Z})}^{T} \hat{K} (δ L) (\bar{U} - \bar{Z}) + {\bar{U}}^{T} \hat{K} (δ L) \bar{W} \\ & = \frac{1}{2} {(\bar{U} - \bar{Z})}^{T} A (\bar{U} - \bar{Z}) δ L + {\bar{U}}^{T} A (\bar{W}) δ L, \end{aligned}$

and therefore the gradient $\nabla J (L)$ is given by
$\nabla J (L) = \frac{1}{2} {(\bar{U} - \bar{Z})}^{T} A (\bar{U} - \bar{Z}) + {\bar{U}}^{T} A (\bar{W} (L)) .$
(33)
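The three steps above can be sketched on a small synthetic problem and compared against central finite differences. Everything in this sketch is an assumption made for illustration: the tensor `T`, the matrices `B` and `C`, the load `F`, and the data `Z_bar` and `Z_hat` are random stand-ins, and the pressure term in the right-hand side of (31) is read as $P - \hat{Z}$, in line with the definition of $J$.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k, m = 6, 3, 4

# Hypothetical small problem: each slice T[:, :, j] is SPD, so K_hat(L) is SPD for L > 0.
T = np.empty((n, n, m))
for j in range(m):
    G = rng.standard_normal((n, n))
    T[:, :, j] = G @ G.T + np.eye(n)
B = rng.standard_normal((k, n))
N = rng.standard_normal((k, k))
C = N @ N.T + np.eye(k)
F = np.concatenate([rng.standard_normal(n), np.zeros(k)])
Z_bar, Z_hat = rng.standard_normal(n), rng.standard_normal(k)   # synthetic data

K_hat = lambda L: np.einsum("ijk,k->ij", T, L)
A = lambda V: np.einsum("ijk,j->ik", T, V)           # adjoint stiffness matrix
K = lambda L: np.block([[K_hat(L), B.T], [B, -C]])

def objective(L):
    U = np.linalg.solve(K(L), F)
    e, r = U[:n] - Z_bar, U[n:] - Z_hat
    return 0.5 * e @ K_hat(L) @ e + 0.5 * r @ C @ r

def adjoint_gradient(L):
    U = np.linalg.solve(K(L), F)                     # step 1, system (30)
    e, r = U[:n] - Z_bar, U[n:] - Z_hat
    rhs = np.concatenate([-K_hat(L) @ e - B.T @ r, np.zeros(k)])
    W = np.linalg.solve(K(L), rhs)                   # step 2, system (31)
    return 0.5 * e @ A(e) + U[:n] @ A(W[:n])         # step 3, formula (33)

L0 = 1.0 + rng.random(m)
grad = adjoint_gradient(L0)

# Central finite differences serve as an independent check of the gradient.
eps = 1e-6
fd = np.array([(objective(L0 + eps * d) - objective(L0 - eps * d)) / (2 * eps)
               for d in np.eye(m)])
```

Only two linear solves are needed regardless of $m$, whereas the finite-difference check requires $2 m$ solves; this is the practical appeal of the adjoint approach.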

4.2 Computation of the Hessian by using a hybrid method

Recall that we have established the following:

D_{ℓ ℓ}^{2} J (ℓ) (δ ℓ, δ ℓ) = 2 a (δ ℓ, δ \bar{u}, \bar{u} - \bar{z}) + a (ℓ, δ \bar{u}, δ \bar{u}) + c (δ p, δ p) + 2 a (δ ℓ, δ \bar{u}, \bar{w}) .

(34)

By the standard discretization scheme, with $δ \bar{U} = \nabla \bar{U} δ L$ and $δ P = \nabla P δ L$, we have

1.
$a (δ ℓ, δ \bar{u}, \bar{u} - \bar{z}) = δ {\bar{U}}^{T} \hat{K} (δ L) (\bar{U} - \bar{Z}) = δ L^{T} \nabla {\bar{U}}^{T} A (\bar{U} - \bar{Z}) δ L$ ,
2.
$a (ℓ, δ \bar{u}, δ \bar{u}) = δ {\bar{U}}^{T} \hat{K} (L) δ \bar{U} = δ L^{T} \nabla {\bar{U}}^{T} \hat{K} (L) \nabla \bar{U} δ L$ ,
3.
$c (δ p, δ p) = δ P^{T} C δ P = δ L^{T} \nabla P^{T} C \nabla P δ L$ ,
4.
$a (δ ℓ, δ \bar{u}, \bar{w}) = δ {\bar{U}}^{T} \hat{K} (δ L) \bar{W} = δ L^{T} \nabla {\bar{U}}^{T} A (\bar{W}) δ L$ .

Consequently, we have the following explicit formula for the Hessian:

\nabla^{2} J (L) = 2 \nabla {\bar{U}}^{T} A (\bar{U} - \bar{Z}) + \nabla {\bar{U}}^{T} \hat{K} (L) \nabla \bar{U} + \nabla P^{T} C \nabla P + 2 \nabla {\bar{U}}^{T} A (\bar{W}) .

(35)

Summarizing, we have the following scheme for the computation of the second derivative of the EOLS functional:

1.
Compute $U = (\bar{U}, P)$ by solving linear system (30).
2.
Compute $W = (\bar{W}, P_{W})$ by solving linear system (31).
3.
Compute $\nabla U = (\nabla \bar{U}, \nabla P)$ by solving m linear systems.
4.
Compute $\nabla^{2} J (L)$ by using formula (35).

We note that computing the Hessian using the hybrid method requires the solution of $m + 2$ linear systems.
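The four-step scheme can be sketched on a small synthetic problem as follows. The data are random stand-ins, as in a toy problem, and two points are assumptions of this sketch rather than statements of the paper: the sensitivities are obtained from $K (L) \nabla U = - [A (\bar{U}) ; 0]$, which follows from differentiating (28), and the matrix from (35) is symmetrized, since (34) only fixes its quadratic form.

```python
import numpy as np

rng = np.random.default_rng(3)
n, k, m = 5, 2, 3
T = np.empty((n, n, m))
for j in range(m):
    G = rng.standard_normal((n, n))
    T[:, :, j] = G @ G.T + np.eye(n)          # hypothetical SPD slices
B = rng.standard_normal((k, n))
N = rng.standard_normal((k, k))
C = N @ N.T + np.eye(k)
F = np.concatenate([rng.standard_normal(n), np.zeros(k)])
Z_bar, Z_hat = rng.standard_normal(n), rng.standard_normal(k)

K_hat = lambda L: np.einsum("ijk,k->ij", T, L)
A = lambda V: np.einsum("ijk,j->ik", T, V)
K = lambda L: np.block([[K_hat(L), B.T], [B, -C]])

def solve_state_and_adjoint(L):
    U = np.linalg.solve(K(L), F)                          # system (30)
    e, r = U[:n] - Z_bar, U[n:] - Z_hat
    rhs = np.concatenate([-K_hat(L) @ e - B.T @ r, np.zeros(k)])
    W = np.linalg.solve(K(L), rhs)                        # system (31)
    return U, W, e

def gradient(L):
    U, W, e = solve_state_and_adjoint(L)
    return 0.5 * e @ A(e) + U[:n] @ A(W[:n])              # formula (33)

def hybrid_hessian(L):
    U, W, e = solve_state_and_adjoint(L)                  # steps 1 and 2
    # Step 3: m sensitivity systems, one per column of the right-hand side.
    rhs = np.vstack([-A(U[:n]), np.zeros((k, m))])
    dU = np.linalg.solve(K(L), rhs)
    dU_bar, dP = dU[:n], dU[n:]
    # Step 4: formula (35).
    H = (2 * dU_bar.T @ A(e) + dU_bar.T @ K_hat(L) @ dU_bar
         + dP.T @ C @ dP + 2 * dU_bar.T @ A(W[:n]))
    return 0.5 * (H + H.T)                                # symmetrize (assumption)

L0 = 1.0 + rng.random(m)
H = hybrid_hessian(L0)

# Finite differences of the adjoint gradient as an independent check.
eps = 1e-6
H_fd = np.array([(gradient(L0 + eps * d) - gradient(L0 - eps * d)) / (2 * eps)
                 for d in np.eye(m)])
```

The $m$ sensitivity systems share the matrix $K (L)$ and differ only in their right-hand sides, so a single factorization can be reused for all of them.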

5 Numerical experiments

We consider here two representative examples of elastography inverse problems for the recovery of a variable μ on a two-dimensional isotropic domain $Ω = (0, 1) \times (0, 1)$ with boundary $\partial Ω = Γ_{1} \cup Γ_{2}$ . In the first example, a smooth coefficient is recovered using both the adjoint and hybrid gradient calculation methods. For the second example, we examine the recovery of a discontinuous coefficient using the adjoint method.

All examples are solved on a $75 \times 75$ quadrangular mesh with 5,476 quadrangles and 16,576 total degrees of freedom. Example 1 uses a smooth Tikhonov-type regularization method, whereas the discontinuities in Example 2 necessitate the use of a BV-regularization scheme (see [23] for a more thorough discussion of regularization).

5.1 Example 1

In this example we consider the recovery of a smooth coefficient in which the left and right domain boundaries ( $Γ_{1}$ ) are fixed with static condition $g (x, y)$ and the top and bottom boundaries have Neumann condition $h (x, y)$ . The functions defining the coefficient, load, and boundary conditions are as follows:

\begin{matrix} μ (x, y) = 2.5 + \frac{1}{4} sin (2 π x), f (x, y) = [\begin{array}{c} 2.3 + \frac{1}{10} x \\ 2.3 + \frac{1}{10} y \end{array}], \\ g (x, y) = \frac{1}{100} [\begin{array}{c} x \\ y^{2} \end{array}] on Γ_{1}, h (x, y) = \frac{1}{2} [\begin{array}{c} 1 + 2 x^{2} \\ 1 + 2 y^{2} \end{array}] on Γ_{2} . \end{matrix}

For this example, the underlying optimization problem was solved using both a first-order quasi-Newton method and a second-order Newton-CG trust-region algorithm, using the adjoint gradient and hybrid Hessian calculations outlined in the preceding sections, respectively. The hybrid method converges in only 9 algorithm iterations, compared to 13 iterations for the adjoint method, when both are started from the same initial point and under the same stopping criterion ( $\| \nabla J \| < 10^{- 10}$ ). This can be seen qualitatively in Figures 1 and 2 through the comparison of the computed μ at selected intermediary algorithm steps (subfigures (a) and (b)).

5.2 Example 2

For the discontinuous example, the top of the region is taken as $Γ_{1}$ and fixed with (constant) Dirichlet condition $g (x, y)$ . The remaining edges of the region are taken as $Γ_{2}$ with Neumann condition $h (x, y)$ . The functions defining the coefficient, load, and boundary conditions are as follows:

\begin{matrix} μ (x, y) = \begin{cases} 0.3 & \text{for } (x, y) \in R_{1}, \\ 0.5 & \text{for } (x, y) \in R_{2}, \\ 0.2 & \text{otherwise}, \end{cases} \quad f (x, y) = [\begin{array}{c} 2.0 + \frac{x}{10} \\ 2.0 + \frac{y}{10} \end{array}], \\ g (x, y) = [\begin{array}{c} 0 \\ 0 \end{array}] \text{ on } Γ_{1}, \quad h (x, y) = [\begin{array}{c} 1 \\ 1 \end{array}] \text{ on } Γ_{2}, \end{matrix}

where $R_{1} = \{(x, y) : 0.6 \leq x \leq 0.8, 0.2 \leq y \leq 0.6\}$ and $R_{2} = \{(x, y) : 0.2 \leq x \leq 0.4, 0.2 \leq y \leq 0.4\}$.
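For reference, the discontinuous coefficient of this example can be evaluated as in the following sketch. The function name `mu_example2` and the choice of sampling on a uniform $75 \times 75$ grid of points are assumptions made for illustration and need not match the mesh used in the experiments.

```python
import numpy as np

def mu_example2(x, y):
    """Piecewise constant coefficient of Example 2."""
    in_r1 = (0.6 <= x) & (x <= 0.8) & (0.2 <= y) & (y <= 0.6)   # region R_1
    in_r2 = (0.2 <= x) & (x <= 0.4) & (0.2 <= y) & (y <= 0.4)   # region R_2
    return np.where(in_r1, 0.3, np.where(in_r2, 0.5, 0.2))

# Sample on a uniform grid of the unit square (grid choice is an assumption).
xs = np.linspace(0.0, 1.0, 75)
X, Y = np.meshgrid(xs, xs, indexing="ij")
mu = mu_example2(X, Y)
```

The two inclusions $R_{1}$ and $R_{2}$ are stiffer than the background value 0.2, which is the setting relevant to locating a tumor within softer tissue.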

6 Concluding remarks

In this work we have presented a detailed application of the adjoint method for efficiently computing the gradient of the energy output least-squares functional as well as a hybrid method for calculating the functional’s second derivative. We have also provided two numerical examples of elastography inverse problems to demonstrate the overall feasibility of implementation and establish the relative effectiveness of these methods when coupled with the appropriate first-order and second-order optimization algorithms. See Figure 3.

One issue not addressed in depth was the comparative performance of these methods, measured both against existing schemes and against one another. In short, we note that the hybrid method requires the solution of $m + 2$ linear systems, with m scaling along with the size of the mesh. However, the m systems remain entirely independent, allowing for the parallelization of parts of the computation and thus granting significant performance gains and potential advantages over other strategies. In future work, we look to extend our study here into just such a thorough analysis and carefully consider the performance of the adjoint and hybrid derivative computation methods.
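The independence of the m systems can be illustrated with a small sketch in which each right-hand side is handled by a separate worker. The matrix `K` and the right-hand sides `RHS` are random stand-ins, and the use of a thread pool is only one possible way to distribute the work; no performance claim is attached to this toy example.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(4)
size, m = 8, 5
M = rng.standard_normal((size, size))
K = M @ M.T + size * np.eye(size)        # hypothetical nonsingular system matrix
RHS = rng.standard_normal((size, m))     # one right-hand side per coefficient

def solve_column(j):
    # Each system uses only its own column, so the solves can run in any order.
    return np.linalg.solve(K, RHS[:, j])

with ThreadPoolExecutor(max_workers=4) as pool:
    columns = list(pool.map(solve_column, range(m)))
dU_parallel = np.column_stack(columns)

dU_batched = np.linalg.solve(K, RHS)     # same result from a single batched call
```

Since all m systems share one matrix, an alternative design is to factor the matrix once and distribute only the triangular solves.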

References

1. Doyley MM: Model-based elastography: a survey of approaches to the inverse elasticity problem. Phys. Med. Biol. 2012, 57: Article ID R35. doi:10.1088/0031-9155/57/3/R35
2. Raghavan KR, Yagle AE: Forward and inverse problems in elasticity imaging of soft tissues. IEEE Trans. Nucl. Sci. 1994, 41: 1639-1648.
3. Aguilo MA, Aquino W, Brigham JC, Fatemi M: An inverse problem approach for elasticity imaging through vibroacoustics. IEEE Trans. Med. Imaging 2010, 29: 1012-1021.
4. Ammari H, Garapon P, Jouve F: Separation of scales in elasticity imaging: a numerical study. J. Comput. Math. 2010, 28: 354-370.
5. Arnold A, Reichling S, Bruhns O, Mosler J: Efficient computation of the elastography inverse problem by combining variational mesh adaption and clustering technique. Phys. Med. Biol. 2010, 55: 2035-2056.
6. Beretta E, Bonnetier E, Francini E, Mazzucato A: Small volume asymptotics for anisotropic elastic inclusions. Inverse Probl. Imaging 2012, 6: 1-23.
7. Ji L, McLaughlin J: Recovery of Lamé parameter μ in biological tissues. Inverse Probl. 2004, 20: 1-24.
8. Kallel F, Bertrand M: Tissue elasticity reconstruction using linear perturbation method. IEEE Trans. Med. Imaging 1996, 15: 299-313.
9. Barbone PE, Bamber JC: Quantitative elasticity imaging: what can and cannot be inferred from strain images. Phys. Med. Biol. 2002, 47: 2147-2164.
10. Barbone PE, Gokhale NH: Elastic modulus imaging: on the uniqueness and nonuniqueness of the elastography inverse problem in two dimensions. Inverse Probl. 2004, 20: 283-296.
11. Braess D: Finite Elements: Theory, Fast Solvers, and Applications in Solid Mechanics. 3rd edition. Cambridge University Press, Cambridge; 2007.
12. Chan TF, Tai XC: Identification of discontinuous coefficients in elliptic problems using total variation regularization. SIAM J. Sci. Comput. 2003, 25: 881-904.
13. Gockenbach MS, Khan AA: Identification of Lamé parameters in linear elasticity: a fixed point approach. J. Ind. Manag. Optim. 2005, 1: 487-497.
14. Gockenbach MS, Jadamba B, Khan AA: Numerical estimation of discontinuous coefficients by the method of equation error. Int. J. Math. Comput. Sci. 2006, 1: 343-359.
15. Gockenbach MS, Jadamba B, Khan AA: Equation error approach for elliptic inverse problems with an application to the identification of Lamé parameters. Inverse Probl. Sci. Eng. 2008, 16: 349-367.
16. Harrigan T, Konofagou EE: Estimation of material elastic moduli in elastography: a local method, and an investigation of Poisson ratio sensitivity. J. Biomech. 2004, 37: 1215-1221.
17. Jadamba B, Khan AA, Raciti F: On the inverse problem of identifying Lamé coefficients in linear elasticity. Comput. Math. Appl. 2008, 56: 431-443.
18. Jadamba B, Khan AA, Sama M: Inverse problems of parameter identification in partial differential equations. In Mathematics in Science and Technology. World Scientific, Hackensack; 2011:228-258.
19. Konofagou E, Harrigan T, Ophir J, Krouskop T: Poroelastography: estimation and imaging of the poroelastic properties of tissues. IEEE Proceedings of the Symposium in Ultrasonics, Ferroelectrics and Frequency Control 1999, 1627-1630, Lake Tahoe, NV
20. McLaughlin J, Yoon JR: Unique identifiability of elastic parameters from time-dependent interior displacement measurement. Inverse Probl. 2004, 20: 25-45.
21. Mehrabian H, Campbell G, Samani A: A constrained reconstruction technique of hyperelasticity parameters for breast cancer assessment. Phys. Med. Biol. 2012, 53: 7489-7508.
22. Brezzi F, Fortin M: Mixed and Hybrid Finite Element Methods. Springer, New York; 1991.
23. Doyley MM, Jadamba B, Khan AA, Sama M, Winkler B: A new energy inversion for parameter identification in saddle point problems with an application to the elasticity imaging inverse problem of predicting tumor location (2013, submitted)
24. Oberai AA, Gokhale NH, Feijóo GR: Solution of inverse problems in elasticity imaging using the adjoint method. Inverse Probl. 2003, 19: 297-313.
25. Tortorelli DA, Michaleris P: Design sensitivity analysis: overview and review. Inverse Probl. Eng. 1994, 1: 71-105.
26. Cioaca A, Alexe M, Sandu A: Second-order adjoints for solving PDE-constrained optimization problems. Optim. Methods Softw. 2012, 27: 625-653.
27. Goeleven D, Motreanu D 2. In Variational and Hemivariational Inequalities - Theory, Methods and Applications. Springer, Berlin; 2003.
28. Bush N, Jadamba B, Khan AA, Raciti F: Identification of a parameter in fourth-order partial differential equations by an equation error approach (2014, to appear)
29. Crossen E, Gockenbach MS, Jadamba B, Khan AA, Winkler B: An equation error approach for the elasticity imaging inverse problem for predicting tumor location. Comput. Math. Appl. (2013, to appear)
30. Gockenbach MS, Khan AA: An abstract framework for elliptic inverse problems. Part 1: an output least-squares approach. Math. Mech. Solids 2007, 12: 259-276.
31. Gockenbach MS, Khan AA: An abstract framework for elliptic inverse problems. Part 2: an augmented Lagrangian approach. Math. Mech. Solids 2009, 14: 517-539.

Acknowledgements

The work of AA Khan is supported by RIT’s COS D-RIG Acceleration Research Funding Program 2012-2013 and a grant from the Simons Foundation (#210443 to Akhtar Khan). The work of M Sama is partially supported by Ministerio de Ciencia (Spain), project (MTM2012-30942).

Author information

Authors and Affiliations

Center for Applied and Computational Mathematics, School of Mathematical Sciences, Rochester Institute of Technology, 85 Lomb Memorial Drive, Rochester, NY, 14623, USA
Nathan D Cahill, Baasansuren Jadamba, Akhtar A Khan & Brian C Winkler
Departamento de Matemática Aplicada, Universidad Nacional de Educación a Distancia, Calle Juan del Rosal, 12, Madrid, 28040, Spain
Miguel Sama

Corresponding author

Correspondence to Akhtar A Khan.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

This research was carried out during Prof. Miguel Sama’s visit at RIT and all the work was done at that time in a collaborative manner. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License ( https://creativecommons.org/licenses/by/2.0 ), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Cahill, N.D., Jadamba, B., Khan, A.A. et al. A first-order adjoint and a second-order hybrid method for an energy output least-squares elastography inverse problem of identifying tumor location. Bound Value Probl 2013, 263 (2013). https://doi.org/10.1186/1687-2770-2013-263

Received: 14 August 2013
Accepted: 05 November 2013
Published: 02 December 2013
DOI: https://doi.org/10.1186/1687-2770-2013-263
