Comparison of linear and nonlinear deconvolution algorithms for co-optimization of depth-of-field enhancing binary phase masks

Olivier Lévêque; Caroline Kulcsár; François Goudail

doi:10.1364/OSAC.415925

1. Introduction

Nowadays, nearly all imaging systems include digital processing algorithms to improve image quality. It is thus natural to jointly optimize the optical system and the processing algorithms. Optical-digital joint design is part of the field of computational imaging [1], and the mathematical and conceptual foundations of this approach for a complete imaging system were laid by Stork and Robinson [2]. This “co-optimization” approach can be formulated in a rigorous way by defining the optimization criterion as the mean square difference between an ideally sharp image and the image delivered by the system after digital deconvolution [3–5]. This approach has been used to successfully co-optimize different types of imaging systems operating in the visible and infrared ranges [6–11]. In particular, it has been applied to design phase masks aimed at extending the depth-of-field (DoF) of camera lenses [12–16], which have been implemented with success in several real-world imagers [11,17].

Most of these works have used the Wiener filter as the image deconvolution algorithm since, in this case, the optimization criterion has a closed form [4,15,18], which considerably accelerates the optimization process. Of course, it is possible to use nonlinear algorithms to restore images from systems optimized with a Wiener filter-based criterion [19]. This is often what is done in practice since it is well known that nonlinear deconvolution algorithms yield better image quality than the Wiener filter. It is also possible to take into account a nonlinear deconvolution algorithm directly in the optimization criterion. This was done, for example, in a recent work reporting joint optimization of binary DoF enhancing masks with a deconvolution algorithm based on a neural network [20]. However, this comes at the price of a more complex and time-consuming optimization process. The question is whether this price is worth paying. In other words, it is important to determine whether better imaging performance can be obtained with an optical system optimized using a nonlinear deconvolution algorithm instead of a linear one. The purpose of this article is to address this issue, which has, to the best of our knowledge, never been addressed in detail.

In addressing this issue, a major difficulty is that there exist many different types of nonlinear deconvolution algorithms, and it is impossible to be exhaustive. Consequently, we have chosen to focus on a single type of nonlinear deconvolution algorithm and to concentrate on the interpretation of the results. We chose total variation (TV)-based deconvolution since it is a well-known, well-documented, and efficient approach that is widely used for edge preservation [21]. Moreover, we restrict our attention to a precise co-optimization problem, namely, the design of binary annular phase masks for DoF extension. This type of mask is easier to manufacture than masks with continuous phase variation, such as the cubic mask, and has been shown to have similar DoF enhancing ability [22]. Within this framework, we demonstrate that the optimal masks obtained with criteria based on the Wiener filter and on the TV-based algorithm are identical. We propose a conjecture to interpret and generalize this result. This finding is important since it supports the frequent co-design practice consisting of optimizing a system with a simple criterion and deconvolving images with a more complex algorithm.

2. Depth-of-field extension with phase masks

There are several reasons for an imaging system to be defocused in practice. It may be because the objects in the observed scene are located at different distances, so that the system cannot be focused on all of them [11]. Another frequent reason is variation of temperature, which induces a variation of the focal length of the lens so that the image plane is no longer on the sensor plane [23]. To get a sharp image in these cases, the DoF of the imaging system has to be improved. It is well known that placing a binary annular phase mask in the aperture stop of the lens makes it possible to increase its DoF [14,15]. Such masks are composed of concentric rings [15,24] [see Fig. 1(a)]. Each ring is an annular region that induces a phase modulation alternately equal to 0 or $\pi$ radians at a nominal wavelength $\lambda$. Under monochromatic illumination, simple binary etching in a dielectric substrate is sufficient to guarantee a $\pi$-phase step at $\lambda$, but under polychromatic illumination, this condition is not necessarily met. However, it has been shown that phase masks optimized under the assumption of monochromatic illumination remain quite robust when used under wide-spectrum illumination [25]. Moreover, some approaches have been proposed to improve binary phase masks in the presence of wide-spectrum illumination and color imaging. It is, for example, possible to superpose two dielectric substrates to force the ring phase shift to be close to $\pi$ at three chosen RGB wavelengths [26]. An $N$-ring mask is defined by the set of radii of its $N-1$ inner rings, that is, by the parameter vector $\boldsymbol {\rho }=(\rho _1, \ldots , \rho _{N-1})$.

Fig. 1. (a) Example of annular binary phase mask; each ring is an annular region with a phase modulation alternately equal to 0 or $\pi$ at a nominal wavelength $\lambda$. (b) Original grayscale image $\boldsymbol {x}_0$ used in our simulations, denoted “Image A” in the following. (c) Two zoom-ins of image A with, respectively, high spatial frequency details (A1), and smooth spatial variations (A2).

Such masks are placed in the aperture stop of the imaging lens in order to extend its DoF. Let us assume that one observes an object at a certain distance $z_{\textrm {o}}$ from an imaging system with effective focal length $f$, with a sensor located at a fixed imaging distance $z_{\textrm {i}}$. The amplitude of the defocus is classically defined by the following defocus parameter, expressed in units of wavelength:

(1)$$\psi = \frac{(z_{\textrm{i}} \times \operatorname{\textrm{NA}})^{2}}{2} \left(\frac{1}{z_{\textrm{i}}} + \frac{1}{z_{\textrm{o}}} - \frac{1}{f}\right)$$

where $\operatorname {\textrm {NA}}$ is the image numerical aperture of the system. The value of $\psi$ is equal to 0 if the lens is perfectly focused, and the lens is considered to produce sharp images over the range $\psi \in [-\lambda /4,\lambda /4]$. Our goal is to get an imaging system that provides good image quality over a larger range $[-\psi _{\textrm {max}},\psi _{\textrm {max}}]$ of defocus values. Note that when using binary phase masks with $\pi$-phase modulation, the DoF range is always symmetrical since the PSF remains identical for $\psi$ and $-\psi$ [16]. It is thus sufficient to optimize the masks over the range $[0,\psi _{\textrm {max}}]$.
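As a quick numerical illustration of Eq. (1), the following sketch evaluates the defocus parameter for a simple geometry. The function name `defocus_parameter` and all numerical values are illustrative assumptions, not taken from the article. The formula returns a length, which is divided here by the wavelength to express $\psi$ in units of $\lambda$; the sensor distance is set with the thin-lens equation so that one object distance is exactly in focus.

```python
def defocus_parameter(z_o, z_i, f, na):
    """Defocus parameter psi of Eq. (1), in the same length unit as the inputs."""
    return (z_i * na) ** 2 / 2.0 * (1.0 / z_i + 1.0 / z_o - 1.0 / f)


# Hypothetical example values (not from the article), all lengths in metres.
f = 50e-3            # effective focal length
na = 0.1             # image-side numerical aperture
wavelength = 550e-9  # nominal wavelength
z_focus = 2.0        # object distance on which the lens is focused
z_i = 1.0 / (1.0 / f - 1.0 / z_focus)  # sensor distance from the thin-lens equation

# Defocus expressed in units of the wavelength.
psi_in_focus = defocus_parameter(z_focus, z_i, f, na) / wavelength
psi_far = defocus_parameter(10.0, z_i, f, na) / wavelength  # object moved to 10 m
```

With this sign convention, an object farther away than the focus distance gives a negative $\psi$ and a closer object a positive one.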

An image observed by a system equipped with a binary phase mask with parameter set $\boldsymbol {\rho }$ in the presence of a defocus parameter $\psi$ can be modeled as

(2)$$\boldsymbol{y} = \boldsymbol{h}_{\boldsymbol{\rho},\psi}*\boldsymbol{x}_0+\boldsymbol{n}$$

where the symbol $*$ denotes convolution, $\boldsymbol {y}\in \mathbb {R}^{N}$ is the observed $N$-pixel image, $\boldsymbol {x}_0\in \mathbb {R}^{N}$ the unknown true image, $\boldsymbol {n}\in \mathbb {R}^{N}$ a white stationary zero-mean noise, and $\boldsymbol {h}_{\boldsymbol {\rho },\psi }\in \mathbb {R}^{N\times N}$ the point spread function (PSF) of the optical system, which depends on $\boldsymbol {\rho }$ and $\psi$. The estimate $\boldsymbol {\hat {x}}$ of $\boldsymbol {x}_0$ is obtained from $\boldsymbol {y}$ by using a deconvolution algorithm.
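The image formation model of Eq. (2) can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' simulation code: the PSF is taken as the squared modulus of the Fourier transform of a generalized pupil function, the defocus phase is assumed to be $2\pi \psi r^{2}$ with $\psi$ in units of $\lambda$ and $r$ the normalized pupil radius, the convolution is circular, and the noise is Gaussian. The function names, the grid size, and the placeholder scene are hypothetical.

```python
import numpy as np


def binary_annular_psf(rho, psi, n=128, pupil_radius=32):
    """PSF of a lens with a binary annular phase mask and defocus psi (in waves).

    rho: increasing ring radii, normalized by the pupil radius.
    The PSF is returned with its centre at index (0, 0) and sums to 1.
    """
    k = np.fft.fftfreq(n) * n                      # integer grid in FFT ordering
    kx, ky = np.meshgrid(k, k, indexing="ij")
    r = np.hypot(kx, ky) / pupil_radius            # normalized pupil radius
    aperture = (r <= 1.0).astype(float)
    ring_index = np.digitize(r, np.asarray(rho))   # 0 for the central disc
    mask_phase = np.pi * (ring_index % 2)          # alternately 0 and pi
    defocus_phase = 2.0 * np.pi * psi * r ** 2     # assumed quadratic defocus term
    pupil = aperture * np.exp(1j * (mask_phase + defocus_phase))
    psf = np.abs(np.fft.fft2(pupil)) ** 2
    return psf / psf.sum()


def simulate_image(x0, psf, noise_std, rng):
    """Eq. (2): circular convolution of x0 with the PSF plus white Gaussian noise."""
    blurred = np.real(np.fft.ifft2(np.fft.fft2(psf) * np.fft.fft2(x0)))
    return blurred + noise_std * rng.standard_normal(x0.shape)


rng = np.random.default_rng(0)
x0 = rng.random((128, 128))                        # placeholder scene in [0, 1]
h = binary_annular_psf(rho=[0.55, 0.75], psi=1.0)  # 3-ring mask with example radii
y = simulate_image(x0, h, noise_std=2.4e-3, rng=rng)
```

Because the mask phase only takes the values 0 and $\pi$, changing the sign of `psi` conjugates the pupil function and leaves this PSF unchanged, which is the symmetry invoked above to restrict the optimization to $[0,\psi _{\textrm {max}}]$.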

Many different criteria have been proposed to optimize the parameters $\boldsymbol {\rho }$ of a DoF enhancing binary annular phase mask. Many of them are based on the properties of the modulation transfer function (MTF), that is, the modulus of the Fourier transform of $\boldsymbol {h}_{\boldsymbol {\rho },\psi }$. For example, in Ref. [14], the criterion consists in maximizing the effective cutoff frequency, defined as the largest frequency for which the value of the MTF is larger than a certain threshold. A Wiener filter is used for restoring the image. In the present article, we use an alternative approach that consists in taking directly the quality of the restored image as the optimization criterion [7,15]. This quality is defined as the mean square error (MSE) between the estimated image $\boldsymbol {\hat {x}}$ and the unknown true image $\boldsymbol {x}_0$:

(3)$$\textrm{MSE} (\psi,\boldsymbol{\rho}) = \frac{1}{N}\left\lVert{\boldsymbol{\hat{x}}(\psi,\boldsymbol{\rho})-\boldsymbol{x}_0}\right\rVert^{2}\,\,.$$

We note that this criterion is widely used in papers about image restoration, including the most recent approaches based on machine learning. Since it takes into account the quality of the restored image, this criterion performs a compromise between resolution and contrast of the imaging system. The result of this compromise is studied in detail in Ref. [27], through the concept of “effective MTF” after deconvolution.

The MSE criterion defined in Eq. (3) depends on the mask parameters $\boldsymbol {\rho }$ and on the value of the defocus $\psi$. We define the global image quality as the maximum value of the MSE over the defocus range:

(4)$$\textrm{MMSE} (\boldsymbol{\rho}) = \max_{\psi} \textrm{MSE} (\psi,\boldsymbol{\rho})\,\,.$$

In practice, maximization over $\psi$ in Eq. (4) is done over a discrete number $N_\psi$ of defocus values $\psi _i$ uniformly distributed in the interval $[0,\psi _{\textrm {max}}]$. Finally, the optimal mask parameters $\boldsymbol {\rho }_{\textrm {opt}}$ are found by minimizing the MMSE:

(5)$$\boldsymbol{\rho}_{\textrm{opt}} = \arg\min_{\boldsymbol{\rho}}\textrm{MMSE} (\boldsymbol{\rho})\,\,.$$

Let us first assume that the deconvolution algorithm is linear, so that the estimated image is obtained as

(6)$$\boldsymbol{\hat{x}}=\boldsymbol{d}*\boldsymbol{y}\,$$

where $\boldsymbol {d}$ is the impulse response of the deconvolution filter. In the present article, we will use the filter that minimizes the sum, and hence the average, of the MSE over the defocus range, that is, $\sum _{i=1}^{N_\psi } \textrm {MSE}(\psi _i,\boldsymbol {\rho })$. It is called the averaged Wiener filter and is expressed, in the Fourier domain, as [15]:

(7)$${\tilde{d}}(\nu)=\frac{S_{xx}(\nu) \displaystyle\frac{1}{N_\psi}\displaystyle\sum_{i=1}^{N_\psi}\tilde{h}^{*}_{\boldsymbol{\rho},\psi_i}(\nu) }{S_{xx}(\nu)\displaystyle\frac{1}{N_\psi}\displaystyle\sum_{i=1}^{N_\psi}\left|\tilde{h}_{\boldsymbol{\rho},\psi_i}(\nu)\right|^{2} +S_{nn}(\nu)}$$

where the superscript $\sim$ denotes the Fourier transform, the superscript $*$ the complex conjugate, $S_{xx}$ the power spectral density (PSD) of the true image $\boldsymbol {x}_0$, and $S_{nn}$ the PSD of the noise $\boldsymbol {n}$. In this article, we approximate $S_{xx}$ by the squared modulus of the Fourier transform of the true image, $|\tilde {x}_0|^{2}$, and $S_{nn}$ is a constant whose value determines the signal-to-noise ratio (SNR). The advantage of using such a linear deconvolution filter is that the MSE in Eq. (3) has a closed-form expression [4,15,28]:

(8)$$\textrm{MSE} (\psi,\boldsymbol{\rho}) = \int \left| {\tilde{d}}(\nu) \tilde{h}_{\boldsymbol{\rho},\psi}(\nu) - 1 \right|^{2} S_{xx}(\nu) \, d\nu + \int \left| {\tilde{d}}(\nu) \right|^{2} S_{nn}(\nu) \, d\nu\,\,.$$

This closed-form expression of the image quality criterion makes it possible to solve the optimization problem defined in Eq. (5) very efficiently. For example, this method has been used for mask optimization in Refs. [11,15,16].
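The closed-form criterion can be sketched as follows, as a discrete illustration of Eqs. (4), (7) and (8) in which the frequency integrals are replaced by means over the sampled frequencies (the overall scale therefore depends on the chosen DFT normalization). The function names, the Gaussian-shaped transfer functions, and the PSD model are assumptions made for this example only; in an actual co-design they would be replaced by the transfer functions of the masked lens and by the image PSD.

```python
import numpy as np


def averaged_wiener_filter(otfs, s_xx, s_nn):
    """Eq. (7). otfs has shape (N_psi, ...): one transfer function per defocus."""
    mean_conj = np.mean(np.conj(otfs), axis=0)
    mean_sq = np.mean(np.abs(otfs) ** 2, axis=0)
    return s_xx * mean_conj / (s_xx * mean_sq + s_nn)


def closed_form_mse(d, otf, s_xx, s_nn):
    """Discrete counterpart of Eq. (8): frequency integrals replaced by means."""
    signal_term = np.abs(d * otf - 1.0) ** 2 * s_xx
    noise_term = np.abs(d) ** 2 * s_nn
    return float(np.mean(signal_term + noise_term))


def mmse(otfs, s_xx, s_nn):
    """Eq. (4): worst-case MSE over the sampled defocus values."""
    d = averaged_wiener_filter(otfs, s_xx, s_nn)
    return max(closed_form_mse(d, otf, s_xx, s_nn) for otf in otfs)


# Toy 1-D example with Gaussian-shaped transfer functions (illustrative only).
nu = np.fft.fftfreq(256)
otfs = np.stack([np.exp(-(nu / w) ** 2) for w in (0.10, 0.15, 0.20)]).astype(complex)
s_xx = 1.0 / (1e-3 + nu ** 2)         # assumed power-law-like image PSD
s_nn = 1e-2 * np.ones_like(nu)        # white noise: constant PSD
worst_case = mmse(otfs, s_xx, s_nn)
```

No image has to be deconvolved to evaluate `mmse`, which is why this criterion is cheap enough to be evaluated on a dense grid of mask parameters.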

However, it is well known that nonlinear algorithms usually yield better deconvolution performance than the Wiener filter. Consequently, computing the estimated image $\boldsymbol {\hat {x}}$ in the MSE criterion of Eq. (3) with a nonlinear algorithm could lead to better co-optimized imaging systems. In order to test this hypothesis, we consider a nonlinear deconvolution algorithm based on a quadratic data fitting term and a TV-based regularization term:

(9)$$\boldsymbol{\hat{x}} = \arg\min_{\boldsymbol{x}} \left\{\frac{1}{2}\left\lVert{\boldsymbol{y}- \left(\frac{1}{N_\psi}\sum_{i=1}^{N_\psi}h_{\boldsymbol{\rho},\psi_i}\right) * \boldsymbol{x}}\right\rVert^{2}+\mu\sum_{i=1}^{N}\left\lVert{D_i\boldsymbol{x}}\right\rVert\right\}$$

where $D_i$ is a linear operator that approximates the spatial gradient of an image $\boldsymbol {x}$ at a position indexed by $i$. This regularization method was introduced by Rudin, Osher and Fatemi in Ref. [29] to preserve image edges better than the quadratic regularization implicitly used in the Wiener filter. Many methods exist in the literature to solve Eq. (9). We use the first-order primal-dual algorithm presented in Ref. [30] and available in the GlobalBioIm Matlab library [31]. Since this algorithm is iterative, one cannot find a closed-form expression of the MSE criterion, as was the case when using the Wiener filter in Eq. (8). This makes the optimization process much longer. Moreover, this algorithm requires tuning the regularization parameter $\mu$. We choose to estimate this parameter as:

(10)$$\hat{\mu}=\arg\min_\mu \textrm{MSE} (\psi,\boldsymbol{\rho},\mu)$$

where we have explicitly denoted the dependence on $\mu$ of the MSE in Eq. (3). Finally, the optimal mask is obtained as:

(11)$$\boldsymbol{\rho}_{\textrm{opt}} = \arg\min_{\boldsymbol{\rho}}\left\{\max_\psi \textrm{MSE} (\psi,\boldsymbol{\rho},\hat{\mu})\right\}.$$

Note that since $\hat {\mu }$ depends on the unknown values of the defocus $\psi$ and on the mask parameters $\boldsymbol {\rho }$, the estimation of the regularization parameter in Eq. (10) cannot be implemented in practice. The results obtained with this method will thus be an upper bound on the performance that can actually be achieved with the deconvolution algorithm defined in Eq. (9).
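To make the structure of Eq. (9) concrete, the following sketch minimizes the same cost with a generic first-order primal-dual scheme of the Chambolle-Pock type. It is not the GlobalBioIm implementation used in this article: the periodic boundary conditions, the step sizes, the fixed number of iterations, the toy scene, and the Gaussian transfer function are all assumptions made for illustration. The argument `otf` stands for the Fourier transform of the blur kernel, which in Eq. (9) is the PSF averaged over the defocus range.

```python
import numpy as np


def grad(x):
    """Forward differences with periodic boundaries, shape (2, H, W)."""
    return np.stack([np.roll(x, -1, axis=0) - x, np.roll(x, -1, axis=1) - x])


def grad_adjoint(p):
    """Adjoint of grad (minus the discrete divergence)."""
    return (np.roll(p[0], 1, axis=0) - p[0]) + (np.roll(p[1], 1, axis=1) - p[1])


def tv_objective(x, y, otf, mu):
    """Cost of Eq. (9) for a single (e.g. defocus-averaged) transfer function."""
    residual = y - np.real(np.fft.ifft2(otf * np.fft.fft2(x)))
    tv = np.sum(np.sqrt(np.sum(grad(x) ** 2, axis=0)))
    return 0.5 * np.sum(residual ** 2) + mu * tv


def tv_deconvolve(y, otf, mu, n_iter=200):
    """Primal-dual (Chambolle-Pock type) iterations for Eq. (9)."""
    tau = sigma = 1.0 / np.sqrt(8.0)        # ensures tau * sigma * ||grad||^2 <= 1
    y_hat = np.fft.fft2(y)
    x = y.copy()
    x_bar = x.copy()
    p = np.zeros((2,) + y.shape)
    for _ in range(n_iter):
        # Dual step: project each gradient vector onto a ball of radius mu.
        p = p + sigma * grad(x_bar)
        p /= np.maximum(1.0, np.sqrt(np.sum(p ** 2, axis=0)) / mu)
        # Primal step: proximal operator of the quadratic data term (Fourier domain).
        v_hat = np.fft.fft2(x - tau * grad_adjoint(p))
        x_new = np.real(np.fft.ifft2(
            (v_hat + tau * np.conj(otf) * y_hat) / (1.0 + tau * np.abs(otf) ** 2)))
        x_bar = 2.0 * x_new - x             # over-relaxation of the primal variable
        x = x_new
    return x


rng = np.random.default_rng(0)
x0 = np.zeros((32, 32))
x0[8:24, 8:24] = 1.0                        # piecewise-constant toy scene
fx = np.fft.fftfreq(32)
otf = np.exp(-2.0 * (np.pi * 1.5) ** 2 * (fx[:, None] ** 2 + fx[None, :] ** 2))
y = np.real(np.fft.ifft2(otf * np.fft.fft2(x0))) + 0.02 * rng.standard_normal(x0.shape)
x_hat = tv_deconvolve(y, otf, mu=0.02)
```

Each evaluation of the MSE with this estimator requires a full iterative reconstruction, and the oracle choice of Eq. (10) adds an outer search over `mu`; this is the source of the extra computational cost mentioned above.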

3. Comparison of the masks optimized with linear and nonlinear deconvolution algorithms

In this section we optimize masks with the criteria based on the averaged Wiener filter and on the TV algorithm. Then, we compare the parameters and the performance of the optimal masks obtained with these two criteria.

For this purpose, we consider three different values of targeted DoF range, namely $\psi _{\textrm {max}}=\{1\lambda ,1.5\lambda ,2\lambda \}$. It was shown in Ref. [16] that these values of the DoF range can be reached with masks having only 2 to 4 rings. Moreover, it is clear that the result of mask optimization depends on the considered true image $\boldsymbol {x}_0$. We have performed mask optimizations with several examples of image $\boldsymbol {x}_0$ with different characteristics, and have obtained in all cases the same optimal masks for the two criteria. In this section, the optimization results will thus be first presented and analyzed in the case where $\boldsymbol {x}_0$ is the Butterfly image displayed in Fig. 1(b), and will then be generalized to other examples of images. In all cases, the input SNR is set to $34$ dB, which corresponds to a noise standard deviation $\sigma =2.4\times 10^{-3}$ on the Butterfly image (the pixel values of $\boldsymbol {x}_0$ are assumed to range from 0 to 1).

In order to find the optimal parameters $\boldsymbol {\rho }_{\textrm {opt}}$ of the phase masks, we perform an exhaustive exploration of the parameter space. This deterministic approach avoids any risk of getting stuck in local minima, which could occur with an iterative optimization algorithm. Our exploration step size for each radius $\rho _n$ is $\Delta \rho _n = 5\times 10^{-2}$. We refine this step to $\Delta \rho _n = 5\times 10^{-3}$ around the position of the global minimum to estimate its position with better precision. Moreover, in the case of the TV-based criterion, the best regularization parameter $\hat {\mu }$, as defined in Eq. (10), is estimated for each value of $\boldsymbol {\rho }$ and $\psi$ with a local optimization algorithm based on the Nelder-Mead simplex algorithm [32]. Since this exhaustive optimization process is time-consuming, we used parallel computing to accelerate computations.
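The exploration strategy described above can be sketched as a coarse grid search followed by a finer search around the coarse minimum. The code below is only an outline under stated assumptions: the helper names are hypothetical, radii are assumed to be normalized to the open interval $(0,1)$ and strictly increasing, and the criterion `toy_mmse` is a stand-in with a known minimum, used in place of the actual Wiener-based or TV-based MMSE.

```python
import itertools
import numpy as np


def grid_search(criterion, axes):
    """Exhaustively evaluate criterion on the Cartesian product of the axes."""
    best_rho, best_val = None, np.inf
    for rho in itertools.product(*axes):
        if any(b <= a for a, b in zip(rho, rho[1:])):
            continue                         # radii must be strictly increasing
        val = criterion(np.array(rho))
        if val < best_val:
            best_rho, best_val = np.array(rho), val
    return best_rho, best_val


def coarse_to_fine(criterion, n_params, coarse=5e-2, fine=5e-3):
    """Coarse exhaustive search, then a finer search around the coarse minimum."""
    coarse_axis = coarse * np.arange(1, int(round(1.0 / coarse)))
    rho_c, _ = grid_search(criterion, [coarse_axis] * n_params)
    m = int(round(coarse / fine))
    fine_axes = []
    for r in rho_c:
        axis = r + fine * np.arange(-m, m + 1)
        fine_axes.append(axis[(axis > 0.0) & (axis < 1.0)])
    return grid_search(criterion, fine_axes)


def toy_mmse(rho, target=np.array([0.62])):
    """Stand-in for Eq. (4): maximum over defocus of a made-up MSE."""
    psis = np.linspace(0.0, 1.0, 5)
    return max((1.0 + psi) * np.sum((rho - target) ** 2) + 0.01 * psi for psi in psis)


rho_opt, mmse_opt = coarse_to_fine(toy_mmse, n_params=1)
```

The evaluations of the criterion at different grid points are independent of each other, so the loop in `grid_search` is a natural place to distribute the work over several processes.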

Let us first consider optimization of a 2-ring mask defined by the radius $\rho _1$ of its inner ring. Figure 2(a) represents the value of the MMSE criterion defined in Eq. (4) as a function of $\rho _1$ when the Wiener filter is used. The three curves correspond to the three considered values of the DoF range: $\psi _{\textrm {max}}=\{1\lambda ,1.5\lambda ,2\lambda \}$. Figure 2(b) represents the results obtained with the TV-based deconvolution algorithm. By comparing these two graphs, it is observed that for all the considered values of $\psi _{\textrm {max}}$, the curves obtained with the two criteria have similar shapes and reach their minima for the same values of $\rho _1$: the optimal masks obtained with the criteria based on the Wiener filter and on the TV deconvolution algorithm are thus identical.

Fig. 2. Value of MMSE$(\boldsymbol {\rho })$ as a function of ring radius for a 2-ring mask and 3 different values of $\psi _{\textrm {max}}$ obtained with Wiener algorithm (a) and TV algorithm (b). Value of MMSE$(\boldsymbol {\rho })$ as a function of the two radii for a 3-ring mask and $\psi _{\textrm {max}}=2 \lambda$, obtained with Wiener algorithm (c) and TV algorithm (d).

Let us now consider a 3-ring mask, which is defined by two parameters. Figures 2(c) and 2(d) represent the variations of the Wiener and TV-based criteria as a function of these two parameters for $\psi _{\textrm {max}}=2 \lambda$. It is seen that in this case again, the shapes of the maps are very similar for the two criteria. The best value of the criterion is obtained for the same parameter set $(\rho _1, \rho _2)=(0.55,0.75)$. Thus, the optimal masks are identical. We have also performed the optimization of a 4-ring mask, which is defined by 3 parameters. Here again, we found that the optimal masks for both criteria are identical.

These results have been obtained with the Butterfly image [see Fig. 1(b)] as the true image $\boldsymbol {x}_0$. In order to confirm this result, we performed the same optimizations on many other images of different types. We observed that in all cases, one obtains the same optimal masks for the two criteria. For example, we present in Fig. 3 and Fig. 4 the optimization results obtained with two other original images $\boldsymbol {x}_0$ with different characteristics. It is seen that the optimal 2-ring and 3-ring masks obtained with both criteria are identical.

Fig. 3. (a) Original grayscale image $\boldsymbol {x}_0$ used in this simulation. Value of MMSE$(\boldsymbol {\rho })$ as a function of ring radius for a 2-ring mask and 3 different values of $\psi _{\textrm {max}}$ obtained with Wiener algorithm (b) and TV algorithm (c). Value of MMSE$(\boldsymbol {\rho })$ as a function of the two radii for a 3-ring mask and $\psi _{\textrm {max}}=2 \lambda$, obtained with Wiener algorithm (d) and TV algorithm (e).

Fig. 4. Results of the same type of simulations as in Fig. 3, but with the ground-truth image $\boldsymbol {x}_0$ equal to Fig. 4(a).

In order to summarize these results, we have represented in Fig. 5(a) the profiles of the optimal masks (which are identical for both criteria) for different numbers of rings and different values $\psi _{\textrm {max}}$ of the targeted DoF range. In addition, Fig. 5(b) plots the values of the image quality criterion MMSE$(\boldsymbol {\rho }_{\textrm {opt}})$ of Eq. (4) obtained with these optimal masks. Each curve corresponds to one of the two considered deconvolution algorithms and one of the 3 defocus ranges. It is observed on these curves that the TV deconvolution algorithm always yields smaller MMSE values than the Wiener filter. This corroborates a well-known result: TV-based deconvolution algorithms work better than the Wiener filter on natural images. In conclusion, we have shown that the optimization criteria based on the Wiener filter and on TV deconvolution yield the same optimal DoF enhancing masks, even though, with this optimal mask, the final image quality depends on the deconvolution algorithm.

To generalize this conclusion, we have performed optimization of a 2-ring phase mask for the targeted DoF range $\psi _{\textrm {max}}=1\lambda$ and two additional noise levels. Note that according to Fig. 5(b), 2-ring masks are sufficient to reach minimal MMSE when $\psi _{\textrm {max}}=1\lambda$. The two selected noise levels, 24 dB and 9 dB, respectively correspond to noise standard deviations $\sigma =8.0\times 10^{-3}$ and $\sigma =4.3\times 10^{-2}$ on the Butterfly image. The results are shown in Table 1, where we have represented, for the three considered noise levels and three different images, the optimal mask parameter $\rho _1^{\textrm {opt}}$ and the corresponding MMSE. It is clearly seen that the optimization criteria based on the Wiener filter and on the TV-based algorithm yield quasi-identical optimal masks whatever the noise level.

Fig. 5. (a) Profiles of the optimal 2-ring, 3-ring and 4-ring masks for three different values of $\psi _{\textrm {max}}$. They are identical for Wiener and TV algorithms. (b) Evolution of the maximum mean square error (MMSE) as a function of the number of rings for different values of $\psi _{\textrm {max}}$.

Table 1. 2-ring phase mask optimization results for the targeted DoF range $\psi _{\textrm {max}}=1\lambda$, three different images and three different noise levels: 34 dB (low noise level), 24 dB (medium noise level), and 9 dB (high noise level).

This result is important in practice since optimization of the Wiener filter-based criterion is much faster, thanks to the closed-form expression of the MSE in Eq. (8). Note that in practice, it is preferable to optimize a single phase mask over a large set of images relevant to the application at hand. When using the Wiener filter, this is easy since it mainly boils down to using a prior power spectral density $S_{xx}$ for the whole set of images in the expression of the Wiener filter and of the MSE [see Eqs. (7) and (8)]. In contrast, when using the TV-regularized algorithm, this is more time-consuming since one needs to deconvolve each image in order to compute an averaged MSE. Since this has to be done at each iteration of the mask optimization algorithm, the optimization time is proportional to the size of the image set.

4. Interpretation and discussion

How is it possible to explain this equivalence of the two criteria for optimizing DoF enhancing masks? A first interpretation can be found with the help of Fig. 6, where we have represented the variation of the PSF profile with the defocus $\psi$ (note that since this PSF has circular symmetry, a cross-section along any line passing through its center is sufficient to represent its properties). The upper graph of Fig. 6 represents the profile of the PSFs obtained without mask, for $\psi \in [0,1\lambda ]$, and the lower graph represents the profile obtained with a 3-ring mask optimal for $\psi _{\textrm {max}} = 1 \lambda$. As expected, it is observed that the PSF obtained with the phase mask is much more invariant to defocus. This invariance is an essential property to ensure good image restoration. It is equally important for Wiener and TV deconvolution, since the expressions of both algorithms depend on the PSF (or the MTF) averaged over the values of $\psi$ within the targeted DoF range, namely, $\frac{1}{N_\psi }\sum _{i=1}^{N_\psi }h_{\boldsymbol {\rho },\psi _i}$ [see Eqs. (7) and (9)]. For deconvolution to be efficient, this averaged PSF has to be representative of all the PSFs within the defocus range, and thus, these PSFs have to be similar. Indeed, it has been shown in Ref. [15] that DoF enhancing mask optimization is a tradeoff between maximizing similarity of the PSFs at the different defocus values $\psi$ and minimizing the width of these PSFs (or maximizing the level of the corresponding MTF) to limit noise enhancement by deconvolution.

Fig. 6. Evolution of the PSF profile as a function of defocus: the upper graph corresponds to a standard optical system without mask and the second one to a co-designed optical system using a 3-ring binary phase mask optimized for the defocus range $[0,1\lambda ]$.

To give more insight into the behavior of the deconvolution algorithms as a function of the spatial frequency content of the image, we performed a quantitative comparison between the results obtained with the Wiener and the TV deconvolution algorithms on two regions of the Butterfly image having very different gray-level variations. These regions are represented in Fig. 1(c): the zoom-in A1 contains many details, such as highly contrasted edges, whereas the zoom-in A2 has smooth spatial variations. We assume that the lens is equipped with a 3-ring binary phase mask optimized for $\psi _{\textrm {max}}=1\lambda$ and that the image is acquired under a defocus value of $\psi =0.75\lambda$. The bar chart of Fig. 7(a) displays the root mean square error, defined as ${\rm{RMSE}} = [\textrm {MSE}(\psi ,\boldsymbol {\rho }_{\textrm {opt}})]^{1/2}$ [see Eq. (3)], obtained with the Wiener and the TV deconvolution algorithms on the whole image A and on the two selected zoom-ins A1 and A2. It is seen that the RMSE values obtained on the whole image A and on the “smooth” zoom-in A2 are smaller when using the TV algorithm. On the other hand, on the zoom-in A1, the RMSE is roughly equal for both algorithms. It is even slightly smaller with the Wiener filter. To illustrate the generality of this conclusion, we have also evaluated the image quality with two other metrics [33]: the L1 norm [see Fig. 7(b)], and the differential structural similarity index (DSSIM) [see Fig. 7(c)], defined as $\operatorname {\textrm {DSSIM}}=(1-\operatorname {\textrm {SSIM}})/2$, where SSIM is defined in Ref. [34]. It is seen that these two graphs lead to the same conclusions as the RMSE: on the whole image A and on the “smooth” zoom-in A2, the image quality is better when using the TV algorithm, whereas on the zoom-in A1, the image quality is roughly equal for both algorithms.
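For reference, the error metrics discussed above can be computed as in the short sketch below. The function names are hypothetical, the L1 metric is assumed here to be the mean absolute difference between the restored and true images (the normalization is not specified in the text), and the SSIM value itself is assumed to be supplied by an external implementation of Ref. [34].

```python
import numpy as np


def rmse(x_hat, x0):
    """Root mean square error, i.e. the square root of the MSE of Eq. (3)."""
    return float(np.sqrt(np.mean((x_hat - x0) ** 2)))


def l1_error(x_hat, x0):
    """Mean absolute error (assumed normalization of the L1 norm)."""
    return float(np.mean(np.abs(x_hat - x0)))


def dssim(ssim_value):
    """Differential structural similarity index, DSSIM = (1 - SSIM) / 2."""
    return (1.0 - ssim_value) / 2.0


# Tiny made-up example; a zoom-in is evaluated by slicing both images identically.
x0 = np.zeros((2, 2))
x_hat = np.array([[0.3, 0.0], [0.0, 0.4]])
region = (slice(0, 1), slice(0, 2))          # hypothetical region of interest
rmse_full = rmse(x_hat, x0)
rmse_region = rmse(x_hat[region], x0[region])
```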

Fig. 7. (a) RMSE, (b) L1 norm, (c) DSSIM computed on the whole image A and on the two selected zoom-ins A1 and A2 [see Fig. 1] with Wiener and TV algorithms. Simulation parameters: SNR $=34$ dB, $\psi =0.75\lambda$, $\psi _{\textrm {max}}=1\lambda$, $\mu =8.5625\times 10^{-5}$ and $\boldsymbol {\rho }=(0.75,0.9)$ (3-ring optimized binary phase mask).

These results are visually illustrated on examples of deconvolved images in Fig. 8. The first column of Fig. 8 represents the original sharp image of the whole scene A and of the zoom-ins A1 and A2. The second column represents the blurred images obtained for a defocus $\psi =0.75\lambda$, when using the optimal 3-ring mask for the $\psi _{\textrm {max}}=1 \lambda$ defocus range. The third column represents the images deconvolved with the Wiener filter, and the fourth column those deconvolved with the TV algorithm. These results show that the zoom-in A1 is restored with similar accuracy by the TV and the Wiener algorithms. The edges seem a little “sharper” in the TV image, due to the well-known “cartoon” effect of this type of algorithm. However, they are not more accurate: as shown in Fig. 7, the values of the RMSE yielded by the two algorithms are similar, the RMSE of the Wiener algorithm being even a little smaller. On the other hand, TV restoration yields much better results on the zoom-in A2, which mostly contains low spatial frequencies. This algorithm thus reaches a better compromise between recovery of sharp edges (where it equals the Wiener filter) and restoration of smooth regions, where it clearly outperforms the Wiener filter. Since smooth regions of this type represent most of the image area, the TV algorithm yields a smaller global RMSE on the whole image.

Fig. 8. First row, from left to right: original whole image A, convolved and noisy image, image deconvolved with Wiener filter, image deconvolved with TV algorithm. Second row: same images for zoom-in A1 [see Fig. 1]. Third row: same images for zoom-in A2. The RMSE is displayed on the deconvolved images. Simulation parameters: ${\rm{SNR}} =34$ dB, $\psi =0.75\lambda$, $\psi _{\textrm {max}}=1\lambda$, $\mu =8.5625\times 10^{-5}$ and $\boldsymbol {\rho }=(0.75,0.9)$ (3-ring optimized binary phase mask).

Based on these observations, we can propose the following conjecture. Whatever the deconvolution method, the same characteristics of the optical system are required to recover the details of the image. In the case of DoF extension, this characteristic is the invariance of the PSF with defocus. As a consequence, the optimal mask parameters, which govern the characteristics of the PSF, are common to both algorithms. On the other hand, since TV regularization is better suited to real-world images than the quadratic regularization used implicitly in the Wiener filter, it yields much better restoration of smooth regions.

5. Conclusion

Considering the problem of optimization of DoF enhancing binary annular phase masks, we have shown that co-design procedures based on linear and nonlinear deconvolution methods yield identical optimal masks. This has been demonstrated on masks with different numbers of rings, different targeted defocus ranges, and for different types of observed images. We have proposed a conjecture to explain this fact: the same PSF characteristics are required to recover the details of the image whatever the deconvolution method. Of course, even with the same masks, the TV-based deconvolution algorithm yields better image reconstruction performance, in particular in the smooth regions of the image, since TV regularization is better suited to real-world images. This result is important since it supports the frequent co-design practice consisting of optimizing a system with a simple closed-form criterion and deconvolving with a nonlinear algorithm to get better image quality. This work opens many perspectives. The main one is to verify whether this conjecture generalizes to other nonlinear deconvolution algorithms, other phase mask structures and other co-design problems.

Disclosures

The authors declare no conflicts of interest.

References

1. J. N. Mait, G. W. Euliss, and R. A. Athale, “Computational imaging,” Adv. Opt. Photonics 10(2), 409–483 (2018).

2. D. G. Stork and M. D. Robinson, “Theoretical foundations for joint digital-optical analysis of electro-optical imaging systems,” Appl. Opt. 47(10), B64–B75 (2008).

3. T. Mirani, M. P. Christensen, S. C. Douglas, D. Rajan, and S. L. Wood, “Optimal co-design of computational imaging system,” in Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP ’05), vol. 2 (2005), pp. ii/597–ii/600.

4. M. D. Robinson and D. G. Stork, “Joint Design of Lens Systems and Digital Image Processing,” in International Optical Design (Optical Society of America, 2006), p. WB4.

5. T. Mirani, D. Rajan, M. P. Christensen, S. C. Douglas, and S. L. Wood, “Computational imaging systems: joint design and end-to-end optimality,” Appl. Opt. 47(10), B86–B103 (2008).

6. A. R. Harvey, T. Vettenburg, M. Demenikov, B. Lucotte, G. Muyo, A. Wood, N. Bustin, A. Singh, and E. Findlay, “Digital image processing as an integral component of optical design,” in Novel Optical Systems Design and Optimization XI, vol. 7061, R. J. Koshel, G. G. Gregory, J. D. M. Jr., and D. H. Krevor, eds., International Society for Optics and Photonics (SPIE, 2008), pp. 32–42.

7. D. G. Stork and M. D. Robinson, “Theoretical foundations for joint digital-optical analysis of electro-optical imaging systems,” Appl. Opt. 47(10), B64–B75 (2008).

8. A. Ashok and M. A. Neifeld, “Point spread function engineering for iris recognition system design,” Appl. Opt. 49(10), B26–B39 (2010).

9. P. Trouve, F. Champagnat, G. Le Besnerais, G. Druart, and J. Idier, “Design of a Chromatic 3D Camera with an End-to-End Performance Model Approach,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops (2013).

10. P. Zammit, A. R. Harvey, and G. Carles, “Extended depth-of-field imaging and ranging in a snapshot,” Optica 1(4), 209–216 (2014).

11. M.-A. Burcklen, F. Diaz, F. Leprêtre, J. Rollin, A. Delboulbé, M.-S. L. Lee, B. Loiseaux, A. Koudoli, S. Denel, P. Millet, F. Duhem, F. Lemonnier, H. Sauer, and F. Goudail, “Experimental demonstration of extended depth-of-field f/1.2 visible High Definition camera with jointly optimized phase mask and real-time digital processing,” Journal of the European Optical Society: Rapid Publications 10, 15046 (2015).

12. E. R. Dowski and W. T. Cathey, “Extended depth of field through wave-front coding,” Appl. Opt. 34(11), 1859–1866 (1995).

13. W. T. Cathey and E. R. Dowski, “New paradigm for imaging systems,” Appl. Opt. 41(29), 6080–6092 (2002).

14. E. Ben-Eliezer, N. Konforti, B. Milgrom, and E. Marom, “An optimal binary amplitude-phase mask for hybrid imaging systems that exhibit high resolution and extended depth of field,” Opt. Express 16(25), 20540–20561 (2008).

15. F. Diaz, F. Goudail, B. Loiseaux, and J.-P. Huignard, “Increase in depth of field taking into account deconvolution by optimization of pupil mask,” Opt. Lett. 34(19), 2970–2972 (2009).

16. R. Falcón, F. Goudail, C. Kulcsár, and H. Sauer, “Performance limits of binary annular phase masks codesigned for depth-of-field extension,” Opt. Eng. 56(6), 065104 (2017).

17. F. Diaz, M.-S. L. Lee, X. Rejeaunier, G. Lehoucq, F. Goudail, B. Loiseaux, S. Bansropun, J. Rollin, E. Debes, and P. Mils, “Real-time increase in depth of field of an uncooled thermal camera using several phase-mask technologies,” Opt. Lett. 36(3), 418–420 (2011).

18. V. Sitzmann, S. Diamond, Y. Peng, X. Dun, S. Boyd, W. Heidrich, F. Heide, and G. Wetzstein, “End-to-End Optimization of Optics and Image Processing for Achromatic Extended Depth of Field and Super-Resolution Imaging,” ACM Trans. Graph. 37(4), 1–13 (2018).

19. J. Portilla and S. Barbero, “Hybrid digital-optical imaging design for reducing surface asphericity cost while keeping high performance,” in Computational Optics II, vol. 10694, D. G. Smith, F. Wyrowski, and A. Erdmann, eds., International Society for Optics and Photonics (SPIE, 2018), pp. 28–37.

20. S. Elmalem, R. Giryes, and E. Marom, “Learned phase coded aperture for the benefit of depth of field extension,” Opt. Express 26(12), 15316–15331 (2018).

21. D. M. Titterington, “Common Structure of Smoothing Techniques in Statistics,” International Statistical Review 53(2), 141–170 (1985).

22. R. Falcon Maimone, “Co-conception des systemes optiques avec masques de phase pour l’augmentation de la profondeur du champ : evaluation du performance et contribution de la super-résolution,” Theses, Université Paris Saclay (COmUE) (2017).

23. J. W. Perry, “Thermal effects upon the performance of lens systems,” Proc. Phys. Soc. 55(4), 257–285 (1943).

24. R. Falcón, F. Goudail, and C. Kulcsár, “How many rings for binary phase masks co-optimized for depth of field extension?” in Imaging and Applied Optics 2016 (Optical Society of America, 2016), p. CTh1D.5.

25. A. Fontbonne, H. Sauer, and F. Goudail, “Theoretical and experimental analysis of co-designed binary phase masks for enhancing the depth of field of panchromatic cameras,” (2021). Working paper or preprint.

26. S. Elmalem, N. Konforti, and E. Marom, “Polychromatic imaging with extended depth of field using phase masks exhibiting constant phase over broad wavelength band,” Appl. Opt. 52(36), 8634–8643 (2013).

27. A. Fontbonne, H. Sauer, C. Kulcsár, A.-L. Coutrot, and F. Goudail, “Experimental validation of hybrid optical–digital imaging system for extended depth-of-field based on co-optimized binary phase masks,” Opt. Eng. 58(11), 1–12 (2019).

28. T. Vettenburg, N. Bustin, and A. R. Harvey, “Fidelity optimization for aberration-tolerant hybrid imaging systems,” Opt. Express 18(9), 9220–9228 (2010).

29. L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Phys. D 60(1–4), 259–268 (1992).

30. A. Chambolle and T. Pock, “A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging,” Journal of Mathematical Imaging and Vision 40(1), 120–145 (2011).

31. M. Unser, E. Soubies, F. Soulez, M. McCann, and L. Donati, “GlobalBioIm: A Unifying Computational Framework for Solving Inverse Problems,” in Imaging and Applied Optics 2017 (3D, AIO, COSI, IS, MATH, pcAOP), (Optical Society of America, 2017), p. CTu1B.1.

32. J. C. Lagarias, J. A. Reeds, M. H. Wright, and P. E. Wright, “Convergence Properties of the Nelder–Mead Simplex Method in Low Dimensions,” SIAM Journal on Optimization 9(1), 112–147 (1998).

33. H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss Functions for Image Restoration With Neural Networks,” IEEE Transactions on Computational Imaging 3(1), 47–57 (2017).

34. Z. Wang, E. P. Simoncelli, and A. C. Bovik, “Multiscale structural similarity for image quality assessment,” in The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, vol. 2 (2003), pp. 1398–1402.

SNR	Image	Wiener filter: $ρ_{1}^{opt}$	Wiener filter: MMSE($ρ_{1}^{opt}$)	TV-based algorithm: $ρ_{1}^{opt}$	TV-based algorithm: MMSE($ρ_{1}^{opt}$)
34 dB	Butterfly	0.830	$1.12 \times 10^{-3}$	0.830	$8.34 \times 10^{-4}$
34 dB	Elephants	0.850	$4.58 \times 10^{-3}$	0.865	$5.04 \times 10^{-3}$
34 dB	Zebras	0.840	$2.61 \times 10^{-3}$	0.865	$3.14 \times 10^{-3}$
24 dB	Butterfly	0.825	$1.38 \times 10^{-3}$	0.840	$1.59 \times 10^{-3}$
24 dB	Elephants	0.855	$4.35 \times 10^{-3}$	0.880	$4.87 \times 10^{-3}$
24 dB	Zebras	0.830	$3.00 \times 10^{-3}$	0.855	$3.45 \times 10^{-3}$
9 dB	Butterfly	0.825	$2.58 \times 10^{-3}$	0.835	$2.39 \times 10^{-3}$
9 dB	Elephants	0.825	$7.76 \times 10^{-3}$	0.830	$7.71 \times 10^{-3}$
9 dB	Zebras	0.825	$4.82 \times 10^{-3}$	0.835	$4.67 \times 10^{-3}$

OSA Continuum