Convex optimization-based blind deconvolution for images taken with coherent illumination

Reinier Doelman; Michel Verhaegen

doi:10.1364/JOSAA.36.000678

1. INTRODUCTION

In application areas such as coherent diffraction imaging (CDI) [1], long-range horizontal imaging [2], imaging of layered metamaterials [3], or ptychography [4], the image formation process can be described using the expressions for imaging with coherent illumination [5]

y = {| g_{o} ⋆ h |}^{2},

where

y

denotes the (noiseless) measurement (recorded discretized image), and

{| \cdot |}^{2}

denotes the element-wise squared absolute value of the complex-valued argument, in this case the complex field in the imaging plane.

g_{o}

is the object-plane complex amplitude, and

h

denotes the amplitude impulse response.

⋆

is the (discrete) convolution operator. The amplitude impulse response is sometimes called the coherent point spread function (coherent PSF). In ptychography, for example, the quantities of interest are the Fourier transforms of

g_{o}

or

h

.

If either $g_{o}$ or $h$ is known, and the other is to be estimated based on the measurements $y$ , then this problem is called a phase retrieval problem. If both quantities are to be estimated based on $y$ , then the problem is called a blind deconvolution problem. The method proposed in this paper can be seen as an extension of a method we proposed in [6] for phase retrieval, where that algorithm is compared to other standard phase retrieval methods.

In this paper we consider the blind deconvolution problem for images taken with coherent illumination, that is,

find g_{o}, h subject to y = {| g_{o} ⋆ h |}^{2}, g_{o} \in K_{g_{o}}, h \in K_{h},

where

K

denotes a set of (convex) constraints on the variables that encode the available prior information.

This blind deconvolution problem is different from what is typically encountered in the literature, the blind deconvolution problem for images taken with incoherent illumination [5]

find f, s subject to y = f ⋆ s, f \in K_{f}, s \in K_{s},

where

f = {| g_{o} |}^{2}

is the (real and positive valued) intensity of the object in the object plane, and

s = {| h |}^{2}

is the intensity impulse response, more often called the point spread function. For the incoherent illumination case, there are several categories of blind deconvolution methods in the literature. Classic iterative projection methods [7–11] use alternating projections of the estimates and their Fourier transforms on their respective constraints in the constraint sets. A second group is that of (non-convex), gradient-based optimization approaches [12], including Bayesian estimation approaches [13–15]. A downside of gradient-based approaches is that the initial guess is often crucial for performance. Recently, a third group of algorithms is being developed based on convex optimization of a “lifted” problem [16–19]. The “lifting” of the problem hinders the use of phase diversity [20], a powerful type of prior information, which can be described as a linear constraint on

h

for different images with different phase diversities.

For the coherent illumination blind deconvolution problem, we can make the same classifications.

In the first category, there is [21,22], where the extended Ptychographical Iterative Engine (ePIE) is proposed, an iterative transform algorithm for ptychography. Other iterative Fourier transform-based techniques are [23–29].

In the second category, there are the methods proposed in [30,31] for the estimation of wavefront errors in CDI and in ptychography [32–34]. Reference [35] compares the performance of several gradient descent schemes showing superior robustness to noise for amplitude-based metrics. Refinement of a guessed object and wavefront aberration in a maximum likelihood context can be found in [36]. Related to this, gradient-descent schemes are also popular in ptychography for compensating for positioning errors [37].

A convex optimization-based approach has, to the best of our knowledge, not been applied to the coherent blind deconvolution problem in the literature. For example, in ptychography, [38] only solves the deconvolution problem, not the blind deconvolution problem, using convex optimization-based heuristic methods.

In this paper we propose a blind deconvolution method for the coherent illumination case based on a rank-constrained reformulation of Eq. (2). The reformulation is such that the use of multiple images and phase diversity is easily incorporated into the reformulation and subsequent optimization problem. To attempt to find a solution with rank constraints satisfied, we propose to use the nuclear norm as a convex heuristic for the rank constraint. An iterative extension of the subsequent convex optimization problem is proposed to possibly improve the convex heuristic approximation. This iterative extension has been shown in the validation studies to improve the results. To anticipate the problem that the convex optimization problem results in an unsatisfactory solution, we propose an iterative scheme of convex optimization problems that produces, in our experience, iteratively improved results.

The organization of this paper is as follows. In Section 2 we formulate the blind deconvolution problem as a problem to estimate a complex-valued object and the affinely parameterized pupil function of the optical system with unknown phase aberration. Section 3 explains how to reformulate the blind deconvolution into a rank-constraint problem with constraints on matrices affinely parameterized in the object and amplitude impulse response. Section 3.D describes the convex heuristic for the problem, and Section 3.F describes how to incorporate several types of prior information. In Section 4 we demonstrate the algorithm on an illustrative numerical example and compare our method to a gradient descent scheme.

A. Notation

The operation $x = vect (X)$ stacks the columns from the left to right of matrix $X$ on top of each other to obtain the vector $x$ . $\otimes$ denotes the Kronecker product. $I_{n}$ denotes an $n \times n$ identity matrix. $X = d (x)$ is the diagonal matrix with the values of the vector $x$ on its diagonal. The Hermitian transpose of $X$ is denoted by $X^{H}$ . The nuclear norm is denoted as $‖ X ‖_{*}$ , and the Frobenius norm as $‖ X ‖_{F}$ .

2. PROBLEM DESCRIPTION

The generalized pupil function (GPF) characterizing an optical system [5] is the complex-valued function

P (ρ, θ) = A (ρ, θ) \exp (j ϕ (ρ, θ)),

where

(ρ, θ)

are the radius and angle of the polar coordinates, respectively.

| ρ | \leq 1

and

θ \in [0,2 π)

.

A (ρ, θ)

is the amplitude apodization function, and

ϕ (ρ, θ) \in R

is the phase aberration function of the optical system.

To obtain more measurements of the same object $g_{o}$ with different PSFs, a phase diversity $ϕ_{d}$ may be introduced into the system by means of, for example, a deformable mirror. The GPF then becomes

P_{d} (ρ, θ) = A (ρ, θ) \exp (j ϕ (ρ, θ)) \exp (j ϕ_{d} (ρ, θ)) .

In this paper we consider the problem in modal form: we assume that the GPF can be well-approximated with a weighted sum of basis functions. We use real-valued radial basis functions and complex coefficients to approximate the GPF [39]. Switching from polar coordinates

(ρ, θ)

to Cartesian coordinates

(x, y)

, the radial basis functions and approximate GPF are given by

G_{i} = χ (x, y) \exp (- λ_{i} ({(x - x_{i})}^{2} + {(y - y_{i})}^{2})), P (x, y) \approx \tilde{P} (x, y, v) = \sum_{i = 1}^{s} v_{i} G_{i} (x, y),

with

(x_{i}, y_{i})

being the centers of the radial basis functions and

v_{i} \in C

.

χ (x, y)

is the aperture support function,

λ_{i}

is the spread of the radial basis function, and

v \in C^{s}

is the complex-valued vector of coefficients

v_{i}

. Including an introduced diversity

ϕ_{d} (x, y)

, the approximate pupil function reads

{\tilde{P}}_{d} (x, y, v) = \sum_{i = 1}^{s} v_{i} G_{i} (x, y) \exp (j ϕ_{d} (x, y)) .

The amplitude impulse response, also called the coherent point spread function (coherent PSF),

h_{d} (u, v)

is the (two-dimensional) inverse Fourier transform of the GPF

h_{d} (u, v) = \sum_{i = 1}^{s} v_{i} F^{- 1} {G_{i} (x, y) \exp (j ϕ_{d} (x, y))} = \sum_{i = 1}^{s} v_{i} B_{d, i} (u, v) .

Here the coordinates

(u, v)

are the Cartesian coordinates in the image plane of the optical system.

A complex amplitude in the object plane $g_{o}$ , imaged through this optical system, in the case of coherent illumination, gives the complex amplitude $g_{i}$ in the image plane [5] as follows:

g_{i} (u, v) = g_{o} (u, v) ⋆ h_{d} (u, v) .

In the noise-free case, the intensity of the complex field

g_{i}

is recorded to produce the measurements

y

as follows:

y (u, v) = {| g_{i} (u, v) |}^{2} .

We now drop the notation for the dependency on coordinates and assume the signals

y, g_{i}, g_{o}

, and

h_{d}

are sampled on square grids of sizes

r \times t, r \times t, m \times n

, and

p \times q

, respectively, such that we obtain matrices of the corresponding size. Throughout this paper we assume that the edges of

g_{o}

and

h_{d}

are zero padded, which for the discrete two-dimensional convolution results in the relation

r = m + p - 1, t = n + q - 1

.

With a slight abuse of notation, the blind deconvolution problem in Eq. (2) has now turned into the problem to identify $g_{o}$ and $v$ from measurements $y$ as follows:

find g_{o}, v, h, g_{i} subject to y = {| g_{i} |}^{2}, g_{i} = g_{o} ⋆ h, h = Bv, g_{o} \in K_{g_{o}}, h \in K_{h} .

Here the constraints

y = {| g_{i} |}^{2}

and

g_{i} = g_{o} ⋆ h

are non-convex constraints. The prior information in

K_{g_{o}}

and

K_{h}

is assumed to be a convex constraint on the decision variables.

3. BLIND DECONVOLUTION AS A RANK-CONSTRAINED FEASIBILITY PROBLEM

The aim of this section is to rewrite Eq. (11) into a feasibility problem with rank constraints: one rank constraint to replace $y = {| g_{i} |}^{2}$ , and one rank constraint to replace $g_{i} = g_{o} ⋆ h$ .

In the following two subsections, we use the following lemma:

Lemma 1 [40] Define the matrix

M (C, A, B, Q, X, Y, W_{1}, W_{2}) = (\begin{matrix} W_{1} & 0 \\ 0 & I \end{matrix}) (\begin{matrix} C + A Q Y + X Q B + X Q Y & (A + X) Q \\ Q (B + Y) & Q \end{matrix}) (\begin{matrix} W_{2} & 0 \\ 0 & I \end{matrix}),

where

I

is the identity matrix.

For any $X$ of the same size as $A$ , for any $Y$ of the same size as $B$ , for any invertible matrices $W_{1}, W_{2}$ of a size corresponding to the sizes of matrix $C$ , and for nonzero $Q$ , it holds that the equality

rank (M (C, A, B, Q, X, Y, W_{1}, W_{2})) = rank (Q)

is equivalent to the equality

C = A Q B .

Note that variables

A

and

B

appear in a product in Eq. (13), but they do not appear in a product in the matrix

M

in Eq. (12).

A. Convolution Constraint $g_{i} = g_{o} ⋆ h$

The two-dimensional (discrete) convolution of $g_{o}$ and $Bv$ gives the matrix $g_{i}$ . The elements of the matrix $g_{i}$ are given by the summation of products of individual elements of $g_{o}$ with individual elements of $Bv$ . Lemma 2 states how this can be cast into a bilinear matrix equality as follows:

Lemma 2 The constraint $g_{i} = g_{o} ⋆ h$ is equivalent to the bilinear equality

vect (g_{i}) = (vect {(g_{o})}^{T} \otimes I_{r t}) V vect (Bv)

for a matrix of zeros and ones

V \in B^{m n r t \times p q}

.

Proof. See Appendix A.

In Eq. (14), the general bilinear form $C = A Q B$ of Lemma 1 shows through, with

C = vect (g_{i}), A = (vect {(g_{o})}^{T} \otimes I_{r t}), Q = V, B = vect (Bv) .

We can therefore replace the constraint

g_{i} = g_{o} ⋆ h

with the rank constraint

rank (M (vect (g_{i}), vect {(g_{o})}^{T} \otimes I_{r t}, vect (Bv), V, X, Y, W_{1}, W_{2})) = rank (V) .

The matrices

X, Y, W_{1}

, and

W_{2}

are here further specified to

X = - vect {(\hat{g_{o}})}^{T} \otimes I_{r t}, Y = - vect (B \hat{v}), W_{1} = I, W_{2} = I,

where

\hat{g_{o}}

and

\hat{v}

are—for the moment—some guesses for

g_{o}

and

v

, respectively. The expression for the matrix-valued function

M

we now abbreviate for notational convenience and call this specific abbreviation

M_{c}

;

M_{c} (g_{i}, g_{o}, v, V, \hat{g_{o}}, \hat{v}) = M (vect (g_{i}), vect {(g_{o})}^{T} \otimes I_{r t}, vect (Bv), V, X, Y, W_{1}, W_{2}),

where

M_{c} \in C^{(r t + m n r t) \times (1 + p q)}

.

B. Measurement Constraint $y = {| g_{i} |}^{2}$

The treatment of the measurement constraints is similar to that in [6]. The constraint $y = {| g_{i} |}^{2}$ uses the element-wise operator $| \cdot |$ . To obtain the relation between $y$ and $g_{i}$ in matrix format, we place the values on a matrix diagonal as follows:

y = {| g_{i} |}^{2} \Leftrightarrow d (vect (y)) = d {(vect (g_{i}))}^{H} d (vect (g_{i})) .

The related rank constraint is

rank (M (d (vect (y)), d {(vect (g_{i}))}^{H}, d (vect (g_{i})), I_{r t}, X, Y, W_{1}, W_{2})) = r t .

We further specify here that

X = - d {(vect (\hat{g_{i}}))}^{H}, Y = - d (vect (\hat{g_{i}})), W_{1} = I, W_{2} = I,

where

\hat{g_{i}}

is a guess for

g_{i}

. Furthermore, we abbreviate the arguments of

M

as

M_{m} (y, g_{i}, \hat{g_{i}}) = M (d (vect (y)), d {(vect (g_{i}))}^{H}, d (vect (g_{i})), I_{r t}, X, Y, W_{1}, W_{2}),

where

M_{m} \in C^{2 r t \times 2 r t}

.

C. Rank-Constrained Blind Deconvolution Problem

Using Eqs. (16) and (20), the blind deconvolution problem in Eq. (11) can be expressed as

find g_{o}, v, g_{i},

The caveat is that rank-constrained problems are in general NP-hard, that is (informally), in general there do not exist algorithms that can compute a guaranteed feasible solution within a time that is bounded by a polynomial in the number of variables. However, we can attempt to compute a solution

{g_{o}^{*}, g_{i}^{*}, v^{*}}

and check whether the matrices

M_{m} (y, g_{i}^{*}, \hat{g_{i}})

and

M_{c} (g_{i}^{*}, g_{o}^{*}, v^{*}, V, \hat{g_{o}}, \hat{v})

have the correct rank.

D. Convex Heuristic for Blind Deconvolution

Even though the reformulated problem with its rank constraints is still non-convex, we propose to use a convex heuristic, the nuclear norm [41], to attempt to minimize the ranks of the matrices involved. The nuclear norm of a matrix is defined as the sum of the singular values of a matrix

‖ X ‖_{*} = \sum_{i} σ_{i} (X),

where

σ_{i} (X)

is the

i

th largest singular value of

X

. We can therefore use the nuclear norm as a convex heuristic for the blind deconvolution problem to attempt to find a solution, but success is not guaranteed. The convex optimization approach for Eq. (23) is

\min_{g_{o}, v, g_{i}} μ ‖ M_{m} (y, g_{i}, \hat{g_{i}}) ‖_{*} + ‖ M_{c} (g_{i}, g_{o}, v, V, \hat{g_{o}}, \hat{v}) ‖_{*}, g_{o} \in K_{g_{o}}, h = Bv \in K_{h},

where the parameter

μ > 0

is a tuning parameter that weighs the nuclear norm of matrix

M_{m}

with the nuclear norm of matrix

M_{c}

.

The optimization problem in Eq. (25) is parameterized in Eqs. (17) and (21) by $\hat{g_{o}}, \hat{v}$ , and $\hat{g_{i}}$ . The interpretation is that, given some guess ${\hat{g_{o}}, \hat{v}, \hat{g_{i}}}$ , Eq. (25) produces a new estimate ${{\hat{g_{o}}}^{+}, {\hat{v}}^{+}, {\hat{g_{i}}}^{+}}$ . Motivated by [6,40], Eq. (25) can be used in an iterative update scheme; see Algorithm 1.

Algorithm 1. Convex optimization-based blind deconvolution for images taken with coherent illumination

View Table

Such an iterative scheme gives rise to three questions. First, do the estimates converge to a fixed point? Second, are the resulting estimates correct solutions to the blind deconvolution problem? Third, if they converge, how fast do they converge? Unfortunately, all three questions are very difficult to answer, and we cannot provide a theoretical proof of convergence. We do notice, however, that correct solutions of the blind deconvolution problem are fixed points of Algorithm 1. For solutions ${g_{o}^{*}, v^{*}, g_{i}^{*}}$ of the blind deconvolution problem, we verify that by substitution

μ ‖ M_{m} (y, g_{i}^{*}, g_{i}^{*}) ‖_{*} + ‖ M_{c} (g_{i}^{*}, g_{o}^{*}, v^{*}, V, g_{o}^{*}, v^{*}) ‖_{*} = μ {‖ (\begin{matrix} 0 & 0 \\ 0 & I_{r t} \end{matrix}) ‖}_{*} + {‖ (\begin{matrix} 0 & 0 \\ 0 & V \end{matrix}) ‖}_{*} = μ r t + ‖ V ‖_{*},

which does not depend on any of the variables. So if

{\hat{g_{o}}, \hat{v}, \hat{g_{i}}} = {g_{o}^{*}, v^{*}, g_{i}^{*}}

in Eq. (25), the optimal parameters for Eq. (25) are

{g_{o}^{*}, v^{*}, g_{i}^{*}}

.

The convergence speed properties and success rate of Algorithm 1 depend on the initialization ${{\hat{g_{o}}}^{0}, {\hat{v}}^{0}, {\hat{g_{i}}}^{0}}$ and tuning of $μ$ and the matrices $W_{1}, W_{2}$ in Eqs. (17) and (21). To show the difference that tuning of $W_{1}$ and $W_{2}$ in Eqs. (17) and (21) can make, we solve a small, one-dimensional blind deconvolution problem with three different sets of tuning parameters (see https://bitbucket.org/rdoelman/blinddeconvolution). We set $W_{1} = W_{2} = m_{1} I$ in Eq. (21), $W_{1} = c_{1} I$ , $W_{2} = c_{2} I$ in Eq. (17), and $μ = 1$ . The three sets of parameters $(m_{1}, c_{1}, c_{2})$ are (1, 1, 1), (2, 1, 4), and (0.6, 1, 0.6). The different convergence speeds can be seen in Fig. 1. It can be seen that the effect of tuning on the convergence speed can be very large. Unfortunately, we cannot provide general tuning rules that optimize convergence speed.

Fig. 1. Convergence of the constraint violations $| | y - {| \hat{g_{i}} |}^{2} {| |}_{F}^{2}$ (measurement fit) and $| | \hat{g_{i}} - \hat{g_{o}} ⋆ B \hat{v} {| |}_{F}^{2}$ (convolution fit) through updates in Algorithm 1 for three different sets of tuning parameters.

Download Full Size | PDF

E. Computational Complexity of Eq. (25)

The computational complexity of the nuclear norm minimization in Eq. (25) can be estimated as follows. If we assume that minimizing the nuclear norm of a matrix with $n$ variables is of approximately $O (n^{6})$ when using a standard semidefinite programming (SDP) solver [42], then solving Eq. (25) is of complexity $O ({(r t + m n + p q)}^{6})$ , which is very unfavorable for practical applications. An alternating direction method of multipliers (ADMM, [43,44]) solution for Eq. (25) consists of the singular value decomposition of the matrices $M_{c} \in C^{(r t + m n r t) \times (1 + p q)}$ and $M_{m} \in C^{2 r t \times 2 r t}$ that are of $O (r t (1 + m n) {(1 + p q)}^{2} + {(1 + p q)}^{3})$ and $O ({(r t)}^{3})$ , respectively.

Exploiting parallelization opportunities similar to [6], this can be reduced to $r t$ singular value decompositions (SVDs) with complexity $O (\max {(m n, p q)}^{3})$ and $r t$ SVDs of matrices of size 2 with complexity $O (1)$ , which can be computed in parallel.

F. Including Prior Information and Regularization

The optimization in Eq. (25) is a convex optimization problem in the decision parameters $g_{i}, g_{o}$ , and $v$ . This makes the addition of prior information and regularization very simple, if these can be expressed as convex constraints or convex penalty functions. The convex optimization-based blind deconvolution (for incoherent illumination) techniques such as [17] are based on directly estimating ${| g_{i} |}^{2}$ and ${| h |}^{2}$ , making it difficult to apply constraints on $g_{i}$ and $h$ .

We here list some examples of prior information that can be included.

1. The imaged object has a known support (known non-zero-valued pixels). This can be expressed as the constraint $Q vect (g_{o}) = 0$ for a selection matrix $Q$ .
2. The imaged object is sparse, in the sense that many pixels of $g_{o}$ have value 0. This can be expressed through the addition of a penalty term $τ \sum_{i} | g_{o i} |$ with regularization parameter $τ$ and $i$ denoting the $i$ th pixel.
3. The extension to the use of multiple images, taken with different phase diversities, can be done by adding additional terms to the objective function corresponding to the different images and with addition of the constraints $h_{n} = B_{n} v \in K_{h_{n}}$ for the $n$ th image.
4. In ptychography, overlapping parts of an object positioned in the pupil plane are imaged with the same “probe” or amplitude transfer function. If we write the Fourier transform as a linear mapping with a matrix $F$ , $vect (F {x}) = F vect (x)$ , then a shift in the position of the illuminated object can be represented by the constraint $F_{1} vect (g_{o 1}) = F_{2} vect (g_{o 2})$ , where $F_{1}$ and $F_{2}$ are those parts of the Fourier transform matrices that correspond to the overlapping part of the object. This constraint addresses the problem that a phase aberration of the probe can be attributed to the phase of the object and the other way around.

4. NUMERICAL EXPERIMENTS

We implemented an ADMM algorithm in MATLAB to compute the updates in Algorithm 1. Although this allows for parallel computations of $r t d$ SVDs with complexity $O (\max {(m n, p q)}^{3})$ , where $d$ is the number of images taken, and $r t d$ SVDs with complexity $O (1)$ , we computed these in series. Due to the computational complexity, we tested the algorithm for two cases with small dimensions. Furthermore, the ADMM algorithm that iteratively finds the optimal solution to Eq. (25) is terminated after only 10 iterations.

For comparison, we implemented a gradient descent method comparable to [30,31,36], but adapted to our formulation with decision variables defined in the focal plane. An accelerated gradient descent scheme, ADAM [45], is used to speed up the procedure and automatically determine step size. The step size $η$ is tuned once up front to ensure convergence. The settings are: $β_{1} = 0.8, β_{2} = 0.999, ε = 1 \cdot 10^{- 8}, η = 2 \cdot 10^{- 4}$ .

The experiment models an unknown aberration consisting of eight basis functions, as in Eq. (6), that approximate a small defocus $ϕ = 0.2 Z_{2}^{0}$ , where $Z_{2}^{0}$ is the defocus Zernike polynomial. We take three images with phase diversities that are defoci with coefficients $\approx - 2,0$ and 2. Due to the computational complexity, the aperture is undersampled when the amplitude impulse response is computed, and the resulting matrix is cut to a size of $5 \times 5$ . The object $g_{o}$ is a complex-valued matrix of dimensions $8 \times 8$ , and the resulting measurements $y$ are of size $12 \times 12$ ; see Figs. 2 and 3. The value of $μ$ in Eq. (25) is tuned to 0.55.

Fig. 2. Top: the three $12 \times 12$ (noiseless) measured intensities $y = {| g_{o} ⋆ h |}^{2}$ . Bottom: the three $5 \times 5$ intensity impulse response functions (point spread functions) $s = {| h |}^{2}$ corresponding to the three different diversities that generate the different $h$ .

Download Full Size | PDF

Fig. 3. Left: the amplitude of the object. Right: the phase in radians of the object.

Download Full Size | PDF

Both Algorithm 1 and the gradient descent method are tested on a noiseless case and the same case where measurement noise has been added with a signal-to-noise ratio (SNR) of 20 dB. Both algorithms in both cases are initialized with the same initial guess, where the pixels of the initial object estimate are randomly drawn from a Gaussian distribution and the initial guesses for the coefficients are those that best approximate zero aberration. The computation time for the proposed method, implemented without taking advantage of parallel computation of SVDs, is approximately 10 h for the 15,000 iterations as shown here. The computation time consists of roughly 5 h for computation of SVDs, 40 min for solving least-squares problems, and the rest is overhead. The gradient descent method is much faster, with approximately 18 min for 1,00,000 iterations. The resulting norms of the residual between measurements and convolution of the estimated object and amplitude impulse response are plotted in Fig. 4. As can be seen in this figure, the gradient descent method gets stuck in the noiseless case, whereas the proposed method converges to a feasible solution.

Fig. 4. Measurement fit generated by the two algorithms. Black: Algorithm 1. Red: gradient descent. Solid lines show the case with noisy measurements (SNR: 20 dB); dashed lines show the noiseless case.

Download Full Size | PDF

The estimated values $\hat{v}$ and $\hat{g_{o}}$ have an ambiguity, since for a complex scalar $c$ , $c B \hat{v} ⋆ \hat{g_{o}} = B \hat{v} ⋆ c \hat{g_{o}}$ . We can remove the ambiguity from $\hat{v}$ , for example, when reporting the estimation error by computing

\min_{c \in C} ‖ c \hat{v} - v ‖_{2} .

After removal of the ambiguity of the estimated values of

g_{o}

and

v

, we plot in Fig. 5 the norms of the residuals between the actual complex amplitude of the object

g_{o}

and coefficients

v

and their estimated values. As can be seen in this figure, the proposed method converges in the noiseless case not just to a feasible solution but to the correct solution, whereas the gradient descent method stops progressing towards the solution. In the noisy case, Algorithm 1 computes the solution with the best estimated measurements (see Fig. 4) and with the best estimated object (Fig. 5, top). The coefficients

v

have an estimate that is further from the real coefficients than the estimate of the gradient descent method, but given the measurement fit and fit of

g_{o}

, the effect of this error is small. The estimates resulting from Algorithm 1 and from the gradient descent method of the object

g_{o}

are displayed in Fig. 6. From Fig. 6 it becomes clear that even though in the noisy case the proposed method does not converge to the exact solution, it converges to a solution that resembles the original object quite well. Inspecting Fig. 5 shows that the gradient descent method provides estimates of

g_{o}

that are far from it. The resulting estimates of

| h |

are shown in Fig. 7.

Fig. 5. Frobenius norm of the residual between the true variables $g_{o}$ , the complex-valued object, and $v$ , the radial basis function coefficients, and the (ambiguity removed) estimated variables $\hat{g_{o}}$ and $\hat{v}$ . Top figure: residuals for $g_{o}$ . Bottom figure: residuals for $v$ . Black: Algorithm 1. Red: gradient descent. Solid lines show the case with noisy measurements (SNR: 20 dB); dashed lines show the noiseless case.

Download Full Size | PDF

Fig. 6. Estimates and residuals of $g_{o}$ using Algorithm 1 and gradient descent with identical initialization. Top left: the estimated amplitude of $g_{o}$ and the residual for the noiseless case. Top right: the estimated angle of $g_{o}$ and the residual for the noiseless case. Bottom left: the estimated amplitude of $g_{o}$ and the residual for the noisy case. Bottom right: the estimated angle of $g_{o}$ and the residual for the noisy case. Since estimates of very small complex values can have a radically different complex angle, the angle is only plotted for the nonzero pixels in the original object of Fig. 3.

Download Full Size | PDF

Fig. 7. Estimates of $| h |$ . The maximum absolute value of $\hat{h}$ is scaled to 1.

Download Full Size | PDF

5. CONCLUSION AND FUTURE RESEARCH

We derived a convex heuristic for the blind deconvolution problem for images taken with coherent illumination that is also able to incorporate the concept of phase diversity. We suggested an update scheme and demonstrated on a numerically illustrative example that it is capable of retrieving the object and PSF from a random initialization, thereby overcoming local minima. At the moment, the method is computationally burdensome, but we expect computational improvements similar to [6] by fully exploiting parallelization opportunities and the structure in the optimization problems. Apart from the nuclear norm heuristic, there are also other methods that attempt to find low rank results, like the difference of convex programming (e.g., [46]) or application of the truncated nuclear norm (e.g., [47]), but we leave the evaluation of their performance for future research. Several questions still remain open concerning optimal tuning rules for the different parameters in the optimization, the performance of other non-convex low-rank-inducing norms, bounds on convergence speed, and the computational speed-up by exploiting parallelization opportunities.

APPENDIX A: PROOF OF LEMMA 2

The vector $z = vect (vect (g_{o}) vect {(Bv)}^{T})$ lists all possible products between elements of $g_{o}$ and elements of $Bv$ . Since the elements of $g_{i}$ , the result of a discrete convolution, are sums of specific elements of $z$ , it is possible to construct a matrix of zeros and ones $L \in B^{r t \times m n p q}$ , such that

vect (g_{i}) = L vect (vect (g_{o}) vect {(Bv)}^{T}) .

Application of the identity [48]

vect (A X B) = (B^{T} \otimes A) vect (X)

allows us to rewrite Eq. (A1) as

L (I_{p q} \otimes vect {(g_{o})}^{T}) vect (Bv) .

Let

l_{i}

be the

i

th row of

L

. Then, applying Eq. (A2) on the

i

th row of Eq. (A3) gives

l_{i} (I_{p q} \otimes vect {(g_{o})}^{T}) = vect {(g_{o})}^{T} L_{i},

where

l_{i}^{T} = vect (L_{i})

and

L_{i} \in B^{m n \times p q}

. Combining the expressions for all rows, the result is

vect (g_{i}) = (I_{r t} \otimes vect {(g_{o})}^{T}) (\begin{matrix} L_{1} \\ ⋮ \\ L_{r t} \end{matrix}) vect (Bv) .

Using a row reordering of the matrix with blocks

L_{i}

([48], Eq. 2.14) gives us

V

, the expression in Eq. (14).

Funding

FP7 Ideas: European Research Council (IDEAS-ERC) (FP7/2007-2013) (339681).

REFERENCES

1. J. Miao, P. Charalambous, J. Kirz, and D. Sayre, “Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens,” Nature 400, 342–344 (1999). [CrossRef]

2. A. MacDonald, “Blind deconvolution of anisoplanatic images collected by a partially coherent imaging system,” Ph.D. thesis (Air Force Institute of Technology, Wright-Patterson Air Force Base Ohio, 2004).

3. D. Pastor, T. Stefaniuk, P. Wróbel, C. J. Zapata-Rodríguez, and R. Kotyński, “Determination of the point spread function of layered metamaterials assisted with the blind deconvolution algorithm,” Opt. Quantum Electron. 47, 17–26 (2015). [CrossRef]

4. J. M. Rodenburg and H. M. Faulkner, “A phase retrieval algorithm for shifting illumination,” Appl. Phys. Lett. 85, 4795–4797 (2004). [CrossRef]

5. J. Goodman, Introduction to Fourier Optics (McGraw-Hill, 2008).

6. R. Doelman, N. H. Thao, and M. Verhaegen, “Solving large-scale general phase retrieval problems via a sequence of convex relaxations,” J. Opt. Soc. Am. A 35, 1410–1419 (2018). [CrossRef]

7. G. Ayers and J. C. Dainty, “Iterative blind deconvolution method and its applications,” Opt. Lett. 13, 547–549 (1988). [CrossRef]

8. M. Tofighi, O. Yorulmaz, K. Köse, D. C. Yıldırım, R. Çetin-Atalay, and A. E. Cetin, “Phase and TV based convex sets for blind deconvolution of microscopic images,” IEEE J. Sel. Top. Signal Process. 10, 81–91 (2016). [CrossRef]

9. D. Wilding, O. Soloviev, P. Pozzi, G. Vdovin, and M. Verhaegen, “Blind multi-frame deconvolution by tangential iterative projections (tip),” Opt. Express 25, 32305–32322 (2017). [CrossRef]

10. D. Fish, A. Brinicombe, E. Pike, and J. Walker, “Blind deconvolution by means of the Richardson-Lucy algorithm,” J. Opt. Soc. Am. A 12, 58–65 (1995). [CrossRef]

11. P. Sarder and A. Nehorai, “Deconvolution methods for 3-D fluorescence microscopy images,” IEEE Signal Process. Mag. 23(3), 32–45 (2006). [CrossRef]

12. G. Liu, S. Chang, and Y. Ma, “Blind image deblurring using spectral properties of convolution operators,” IEEE Trans. Image Process. 23, 5047–5056 (2014). [CrossRef]

13. T. J. Schulz, “Multiframe blind deconvolution of astronomical images,” J. Opt. Soc. Am. A 10, 1064–1073 (1993). [CrossRef]

14. T. J. Holmes, “Blind deconvolution of quantum-limited incoherent imagery: maximum-likelihood approach,” J. Opt. Soc. Am. A 9, 1052–1061 (1992). [CrossRef]

15. R. Mourya, L. Denis, J.-M. Becker, and E. Thiébaut, “A blind deblurring and image decomposition approach for astronomical image restoration,” in 23rd European Signal Processing Conference (EUSIPCO) (IEEE, 2015), pp. 1636–1640.

16. D. Stöger, P. Jung, and F. Krahmer, “Blind deconvolution and compressed sensing,” in 4th International Workshop on Compressed Sensing Theory and Its Applications to Radar, Sonar and Remote Sensing (CoSeRa) (IEEE, 2016), pp. 24–27.

17. A. Ahmed, A. Cosse, and L. Demanet, “A convex approach to blind deconvolution with diverse inputs,” in 6th International Workshop on Computational Advances in Multi-Sensor Adaptive Processing (CAMSAP) (IEEE, 2015), pp. 5–8.

18. S. Ling and T. Strohmer, “Regularized gradient descent: a non-convex recipe for fast joint blind deconvolution and demixing,” Information and Inference: A Journal of the IMA 8, 1–49 (2019) [CrossRef] .

19. P. Jung, F. Krahmer, and D. Stöger, “Blind demixing and deconvolution at near-optimal rate,” IEEE Trans. Information Theory 64, 704–727 (2018) [CrossRef] .

20. R. A. Gonsalves, “Phase retrieval and diversity in adaptive optics,” Opt. Eng. 21, 215829 (1982). [CrossRef]

21. A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy 109, 1256–1262 (2009). [CrossRef]

22. A. M. Maiden, M. J. Humphry, F. Zhang, and J. M. Rodenburg, “Superresolution imaging via ptychography,” J. Opt. Soc. Am. A 28, 604–612 (2011). [CrossRef]

23. R. Horstmeyer, X. Ou, J. Chung, G. Zheng, and C. Yang, “Overlapped Fourier coding for optical aberration removal,” Opt. Express 22, 24062–24080 (2014). [CrossRef]

24. F. Jian and L. Peng, “A general phase retrieval algorithm based on a ptychographical iterative engine for coherent diffractive imaging,” Chin. Phys. B 22, 014204 (2013). [CrossRef]

25. M. Foreman, C. Giusca, P. Török, and R. Leach, “Phase-retrieved pupil function and coherent transfer function in confocal microscopy,” J. Microsc. 251, 99–107 (2013). [CrossRef]

26. X. Ou, G. Zheng, and C. Yang, “Embedded pupil function recovery for Fourier ptychographic microscopy,” Opt. Express 22, 4960–4972 (2014). [CrossRef]

27. P. Thibault, M. Dierolf, O. Bunk, A. Menzel, and F. Pfeiffer, “Probe retrieval in ptychographic coherent diffractive imaging,” Ultramicroscopy 109, 338–343 (2009). [CrossRef]

28. R. Hesse, D. R. Luke, S. Sabach, and M. K. Tam, “Proximal heterogeneous block implicit-explicit method and application to blind ptychographic diffraction imaging,” SIAM J. Imaging Sci. 8, 426–457(2015). [CrossRef]

29. H. Chang, P. Enfedaque, and S. Marchesini, “Blind ptychographic phase retrieval via convergent alternating direction method of multipliers,” SIAM J. Imaging Sci. 12, pp. 153–185 (2019).

30. G. R. Brady, M. Guizar-Sicairos, and J. R. Fienup, “Optical wavefront measurement using phase retrieval with transverse translation diversity,” Opt. Express 17, 624–639 (2009). [CrossRef]

31. M. Guizar-Sicairos and J. R. Fienup, “Measurement of coherent x-ray focused beams by phase retrieval with transverse translation diversity,” Opt. Express 17, 2670–2685 (2009). [CrossRef]

32. Y. S. Nashed, T. Peterka, J. Deng, and C. Jacobsen, “Distributed automatic differentiation for ptychography,” Procedia Comput. Sci. 108, 404–414 (2017). [CrossRef]

33. M. Odstrčil, A. Menzel, and M. Guizar-Sicairos, “Iterative least-squares solver for generalized maximum-likelihood ptychography,” Opt. Express 26, 3108–3123 (2018). [CrossRef]

34. Y. Zhang, W. Jiang, and Q. Dai, “Nonlinear optimization approach for Fourier ptychographic microscopy,” Opt. Express 23, 33822–33835 (2015). [CrossRef]

35. L.-H. Yeh, J. Dong, J. Zhong, L. Tian, M. Chen, G. Tang, M. Soltanolkotabi, and L. Waller, “Experimental robustness of Fourier ptychography phase retrieval algorithms,” Opt. Express 23, 33214–33240 (2015). [CrossRef]

36. P. Thibault and M. Guizar-Sicairos, “Maximum-likelihood refinement for coherent diffractive imaging,” New J. Phys. 14, 063004(2012). [CrossRef]

37. A. Tripathi, I. McNulty, and O. G. Shpyrko, “Ptychographic overlap constraint errors and the limits of their numerical recovery using conjugate gradient descent methods,” Opt. Express 22, 1452–1466 (2014). [CrossRef]

38. R. Horstmeyer, R. Y. Chen, X. Ou, B. Ames, J. A. Tropp, and C. Yang, “Solving ptychography with a convex relaxation,” New J. Phys. 17, 053044 (2015). [CrossRef]

39. P. Piscaer, A. Gupta, O. Soloviev, and M. Verhaegen, “Modal-based phase retrieval using Gaussian radial basis functions,” J. Opt. Soc. Am. A 35, 1233–1242 (2018). [CrossRef]

40. R. Doelman and M. Verhaegen, “Sequential convex relaxation for convex optimization with bilinear matrix equalities,” in European Control Conference (ECC) (IEEE, 2016), pp. 1946–1951.

41. B. Recht, M. Fazel, and P. A. Parrilo, “Guaranteed minimum-rank solutions of linear matrix equations via nuclear norm minimization,” SIAM Rev. 52, 471–501 (2010). [CrossRef]

42. Z. Liu and L. Vandenberghe, “Interior-point method for nuclear norm approximation with application to system identification,” SIAM J. Matrix Anal. Appl. 31, 1235–1256 (2009). [CrossRef]

43. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn. 3, 1–122 (2011). [CrossRef]

44. J.-F. Cai, E. J. Candès, and Z. Shen, “A singular value thresholding algorithm for matrix completion,” SIAM J. Optim. 20, 1956–1982 (2010). [CrossRef]

45. D. P. Kingma and J. Ba, “Adam: a method for stochastic optimization,” arXiv:1412.6980 (2014).

46. H. Tuy, “DC optimization: theory, methods and algorithms,” in Handbook of Global Optimization (Springer, 1995), pp. 149–216.

47. Y. Hu, D. Zhang, J. Ye, X. Li, and X. He, “Fast and accurate matrix completion via truncated nuclear norm regularization,” IEEE Trans. Pattern Anal. Mach. Intell. 35, 2117–2130 (2013). [CrossRef]

48. D. A. Turkington, Generalized Vectorization, Cross-Products, and Matrix Calculus (Cambridge University, 2013).

Convex optimization-based blind deconvolution for images taken with coherent illumination

Abstract

1. INTRODUCTION

A. Notation

2. PROBLEM DESCRIPTION

3. BLIND DECONVOLUTION AS A RANK-CONSTRAINED FEASIBILITY PROBLEM

A. Convolution Constraint $g_{i} = g_{o} ⋆ h$

B. Measurement Constraint $y = {| g_{i} |}^{2}$

C. Rank-Constrained Blind Deconvolution Problem

D. Convex Heuristic for Blind Deconvolution

E. Computational Complexity of Eq. (25)

F. Including Prior Information and Regularization

4. NUMERICAL EXPERIMENTS

5. CONCLUSION AND FUTURE RESEARCH

APPENDIX A: PROOF OF LEMMA 2

Funding

REFERENCES

Cited By

Figures (7)

Tables (1)

Equations (36)

Journal of the Optical Society of America A

Abstract

1. INTRODUCTION

A. Notation

2. PROBLEM DESCRIPTION

3. BLIND DECONVOLUTION AS A RANK-CONSTRAINED FEASIBILITY PROBLEM

A. Convolution Constraint gi=go⋆h

B. Measurement Constraint y=|gi|2

C. Rank-Constrained Blind Deconvolution Problem

D. Convex Heuristic for Blind Deconvolution

E. Computational Complexity of Eq. (25)

F. Including Prior Information and Regularization

4. NUMERICAL EXPERIMENTS

5. CONCLUSION AND FUTURE RESEARCH

APPENDIX A: PROOF OF LEMMA 2

Funding

REFERENCES

Cited By

Figures (7)

Tables (1)

Equations (36)

Journal of the Optical Society of America A

A. Convolution Constraint $g_{i} = g_{o} ⋆ h$

B. Measurement Constraint $y = {| g_{i} |}^{2}$