Optica Publishing Group

De-noising imaging through diffusers with autocorrelation

Open Access

Abstract

Recovering targets through diffusers is an important topic as well as a general problem in optical imaging. The difficulty of recovery is increased by the noise interference caused by an imperfect imaging environment. Existing approaches generally require a high-signal-to-noise-ratio (SNR) speckle pattern to recover the target, but still have limitations in de-noising or generalizability. Here, using high-SNR autocorrelation information as a physical constraint, we propose a two-stage (de-noising and reconstructing) data-driven method to improve robustness. Specifically, a two-stage convolutional neural network (CNN), called the autocorrelation reconstruction (ACR) CNN, is designed to de-noise and reconstruct targets from low-SNR speckle patterns. We experimentally demonstrate the robustness through various diffusers with different levels of noise, from simulated Gaussian noise to the detector and photon noise captured by the actual optical system. The de-noising stage improves the peak SNR from 20 to 38 dB on the system data, and the reconstructing stage, compared with the unconstrained method, successfully recovers targets hidden behind unknown diffusers under detector and photon noise. With the physical constraint guiding the learning process, our two-stage method improves generalizability and has potential in fields such as imaging under low illumination.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. INTRODUCTION

Optical imaging through diffusers is a long-standing and challenging problem in many fields, such as imaging in biological tissue and in cloudy or hazy conditions [1–3]. As a pervasive issue, it has motivated numerous methods based on optical phase conjugation [4–7], the transmission matrix [8–10], the point spread function (PSF) [11–13], speckle correlation technologies (SCTs) [14–19], and so on. In particular, SCT based on the optical memory effect has the advantages of noninvasiveness and single-shot measurement, but it relies on a phase retrieval (PR) algorithm to retrieve the structure of the original target. Existing PR algorithms cannot escape the ill-posed nature of the inverse problem and may not converge to the globally optimal solution. To reduce the influence of noise, traditional SCT algorithms rely heavily on high-resolution, high-sensitivity cameras for collecting high-quality speckle patterns [16,20,21]. Thus, a major limitation of these methods is their poor robustness to noise.

Recent studies in computational imaging based on deep learning offer the ability to solve the ill-posed inverse problem [22–35]. A machine-learning-based approach was presented by Horisaki et al. to calculate the inverse scattering process [23]. IDiffNet solved the problem of imaging through diffusers and first used the negative Pearson correlation coefficient (NPCC) as a loss function, but it is limited in de-noising [30]. A deep neural network developed by Li et al., building on their previous work, achieves generalization under different scattering conditions [35]. Yang et al. used a U-net architecture to reconstruct images through glass and a multi-mode fiber [32]. Lyu et al. retrieved the target from a very small fraction of its scattered pattern [33]. To solve the dynamic scattering imaging problem, Sun et al. proposed a classify-then-reconstruct method, which restores images hidden in an unknown scattering medium via the closest scattering condition [34]. Each of the above works has its own advantages, and each generally adopts an end-to-end model that directly maps speckle patterns to targets. Existing deep learning works on scattering imaging mostly focus on recovery capacity or rely on high-SNR speckle patterns; little attention is paid to de-noising or to reconstruction from low-SNR speckle patterns. Moreover, purely data-driven computation is prone to limited applicability, especially in complex situations such as high noise or diverse diffusers.

Inspired by traditional SCT, the autocorrelation reconstruction convolutional neural network (ACR-CNN) uses two items of physical a priori knowledge to guide the learning process and thus has natural advantages in robustness over purely data-driven computation. The first item is that the algorithm is divided into two stages: de-noising and reconstructing. The second is to de-noise the autocorrelation rather than the speckle pattern, since autocorrelations have structural characteristics while speckle patterns have statistical ones. Therefore, the de-noising stage of ACR-CNN is designed to enhance autocorrelations and output high-SNR autocorrelations, which serve as the input of the reconstructing stage. Combining the physical constraint with data-driven computation, our two-stage method offers a solution to imaging through various diffusers under noise interference. As proofs of concept, the robustness of ACR-CNN is experimentally demonstrated by recovering targets from low-SNR speckle patterns generated in simulation and captured in the actual optical system.

2. METHOD

A. Experimental Setup

The experimental setups of the simulation and actual optical system for imaging through diffusers with noise interferences are shown in Fig. 1.


Fig. 1. Experimental setup of the simulation and actual optical system. (a) Experimental setup of the simulation. $u$ denotes the distance between targets and diffusers; $v$ denotes the distance between diffusers and the camera. (b) Setup of the actual optical system. DMD, digital micromirror device; blue arrows denote ambient light coming from the LED. (c) Actual imaging system.


Simulation. As shown in Fig. 1(a), targets are illuminated by a simulated light source at a wavelength of 600 nm. The double Gaussian random matrix (DGRM) models the modulation when the incident light is scattered by the diffuser in the optical system [36]. In other words, changing the variance $\sigma _M^2$ of the DGRM simulates scattering media with different statistical characteristics. The diffusers are placed at a distance $u$ from the targets, and the camera at a distance $v$ from the diffusers ($u = 200\,\,\rm cm$, $v = 100\,\,\rm cm$). Different variances of Gaussian noise, $\sigma _G^2$, are set to generate different levels of noise during simulation. Therefore, speckle patterns formed through various diffusers are simulated with different levels of Gaussian noise.
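The DGRM formulation follows Ref. [36]; as a loose, hypothetical sketch rather than the paper's actual model, a Gaussian random phase screen of variance $\sigma_M^2$ can stand in for the diffuser modulation, with the far-field intensity recorded as the speckle pattern:

```python
import numpy as np

def dgrm_speckle(target, sigma_m_sq, seed=0):
    """Hypothetical stand-in for the DGRM: a Gaussian random phase
    screen of variance sigma_M^2 modulates the target field, and the
    far-field intensity is recorded as a normalized speckle pattern."""
    rng = np.random.default_rng(seed)
    phase = rng.normal(0.0, np.sqrt(sigma_m_sq), target.shape)
    field = np.fft.fft2(target * np.exp(1j * phase))  # simplified far-field step
    speckle = np.abs(field) ** 2
    return speckle / speckle.max()

# Simple square target, scattered by a "diffuser" of variance 150
target = np.zeros((64, 64))
target[24:40, 24:40] = 1.0
speckle = dgrm_speckle(target, sigma_m_sq=150.0)
```

Regenerating the phase screen per object (a new `seed`) mimics the paper's practice of drawing a fresh DGRM for each target under the same $\sigma _M^2$.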

System. The actual optical system is set up as shown in Figs. 1(b) and 1(c); the light source is an LED with a central wavelength of 625 nm (Thorlabs, M625L4). Three ground-glass diffusers of different grits (Thorlabs, DG100X-120, DG100X-220, and DG100X-600) are used and placed between the targets and the camera (Basler acA1920-155um). The target is generated by a digital micromirror device (DMD, resolution $1024 \times 768$ pixels, mirror element size 13.68 µm/pixel).

Changes in the illumination conditions of the ambient light coming from the LED lead to photon noise. Even without ambient-light interference, low exposure causes detector noise, called noise 0. A filter with a central wavelength of 632.8 nm (Thorlabs, FL632.8-1) is placed in front of the camera, as shown in Figs. 1(b) and 1(c). Therefore, the light signal received by the camera is disturbed mainly in intensity by the ambient light. To introduce different levels of photon noise, the ambient-light interference conditions are changed at three components (LED, pupil 1, and pupil 2), as shown in Fig. 1(b). For noise 1, the ambient light interferes at pupil 2, where ballistic and scattered light are mixed with ambient light. For noise 2, pupil 1 is exposed to ambient light, so the light path through pupil 1 suffers intensity interference. For noise 3, ambient light interferes at all three components (LED, pupil 1, and pupil 2), affecting both the optical intensity and the random walk of photon transmission.

B. Processing Theory

In the classical mathematical model, reconstructing images refers to the problem of recovering the target, $x \in \mathbb R^{m}$, from the speckle pattern, $y \in \mathbb R^{n}$, of the form

$$y = F(x),$$
where $F$ represents the forward operator. Here, $m$ and $n$ are the numbers of elements in the target and the speckle pattern, respectively. The inverse process is
$$x = {F^{- 1}}(y),$$
where ${F^{- 1}}$ represents the inverse function.

Due to the diverse statistical characteristics of different diffusers, directly modeling ${F^{- 1}}$, as existing algorithms usually do, is an arduous task. The autocorrelation of the speckle pattern, ${R_S}$, is essentially identical to the target's autocorrelation in an aberration-free, diffraction-limited optical system within a certain field of view (FOV). Thus, the autocorrelation can serve as a bridge between $x$ and $y$ in our two-stage method.

Low-SNR autocorrelation, ${R_S}$, is constructed as

$${R_S} = (y \star y) + c,$$
where $\star$ indicates computing the autocorrelation, and $c$ is measurement noise. The de-noising stage called ACR-1 (first stage of ACR) is designed to enhance autocorrelation by eliminating the influence of noise. Therefore, high-SNR autocorrelation, $R^\prime $, is obtained from the low-SNR ones, ${R_S}$, as
$$R^\prime = {\rm argmin}\{(x \star x) - {R_S}\} ,$$
where $R^\prime $ represents enhanced autocorrelation. $\rm argmin(\cdot)$ is the function for the value of $R^\prime $ when $\{(x \star x) - {R_S}\}$ is minimized. Thus, $R^\prime $ is approximately equal to the autocorrelation of $x$. The reconstructing stage called ACR-2 (second stage of ACR) is designed to reconstruct the target, $x$, from $R^\prime $ enhanced by ACR-1 as
$$x^\prime = {\rm argmin}\{x - G(R^\prime)\} ,$$
where $x^\prime $ represents the reconstructed object, and $G(\cdot)$ is the operation to reconstruct $x^\prime $ from $R^\prime $.
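The autocorrelation $y \star y$ in Eq. (3) can be computed efficiently through the Wiener–Khinchin theorem; a minimal sketch (the function name and peak normalization are our own choices):

```python
import numpy as np

def autocorrelation(img):
    """2D autocorrelation via the Wiener-Khinchin theorem:
    R = IFFT(|FFT(img)|^2), shifted so the zero-lag peak is centered."""
    img = img - img.mean()                     # remove the DC pedestal
    spectrum = np.abs(np.fft.fft2(img)) ** 2   # power spectrum
    corr = np.real(np.fft.ifft2(spectrum))
    return np.fft.fftshift(corr) / corr.max()  # normalize to peak 1

speckle = np.random.default_rng(1).random((64, 64))
corr = autocorrelation(speckle)                # zero-lag peak at (32, 32)
```

Subtracting the mean before the FFT suppresses the large constant background term that otherwise dominates the zero-lag peak.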

The raw speckle pattern recorded by the camera is a random, unorganized image that contains a lot of redundant information. As a proof of concept, three small areas (marked A1, A2, and A3) of $256 \times 256$ pixels are selected from the raw simulated image of $1920 \times 1200$ pixels, and the structural similarities of their autocorrelations are compared. Three lines (L1, L2, and L3) are selected to plot the gray level of pixels, as shown in Fig. 2. The gray levels along L1, L2, and L3 show that the autocorrelations of the three small areas have similar distributions under noise perturbation. In fact, using lines at different angles does not change the result. Thus, a small area carries the redundant information, together with random noise, and fully satisfies the information needed for calculating autocorrelations.


Fig. 2. Simulative raw speckle pattern, small speckle, and their autocorrelations. (a) Gray level of lines L1 obtained from the enlarged autocorrelation of A1, A2, and A3; (b) gray level of lines L2; (c) gray level of lines L3.


Here, similar to the simulated image, the autocorrelations of four areas of $256 \times 256$ pixels from the raw system image of $1920 \times 1200$ pixels are calculated and shown in Fig. 3. Importantly, the maximum gray level of an 8-bit image recorded by the Basler camera is 255, but the gray levels of the scattering images are below 30 because of low exposure. As a result, the intensity fluctuation of the pattern is hard to observe directly with the human eye. A piecewise linear extension is therefore employed to suit human visual perception. Figure 3(b) compares the grayscale histograms of the image before and after the transformation. The autocorrelations of the four areas under noise perturbation are averaged and shown in Fig. 3(c). The averaged image has higher quality, which confirms that averaging the autocorrelations also averages out the noise. Thus, averaging the autocorrelations of multiple areas can reduce noise to a certain extent.
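The two processing steps above, piecewise linear stretching for display and averaging the autocorrelations of several crops, can be sketched as follows (the crop positions, sizes, and parameter names are illustrative, not the paper's):

```python
import numpy as np

def piecewise_linear_stretch(img, low, high):
    """Map gray levels in [low, high] onto the full 8-bit range;
    values outside the band are clipped (a simple contrast extension)."""
    out = (img.astype(float) - low) / float(high - low)
    return np.clip(out, 0.0, 1.0) * 255.0

def averaged_autocorrelation(speckle, size=128, n=4, seed=0):
    """Average the autocorrelations of n random crops so that the
    uncorrelated noise terms partially cancel."""
    rng = np.random.default_rng(seed)
    h, w = speckle.shape
    acc = np.zeros((size, size))
    for _ in range(n):
        r = int(rng.integers(0, h - size))
        c = int(rng.integers(0, w - size))
        patch = speckle[r:r + size, c:c + size]
        patch = patch - patch.mean()
        corr = np.real(np.fft.ifft2(np.abs(np.fft.fft2(patch)) ** 2))
        acc += np.fft.fftshift(corr) / corr.max()
    return acc / n

speckle = np.random.default_rng(2).random((512, 512))
stretched = piecewise_linear_stretch(speckle * 30.0, 0.0, 30.0)  # lift dim gray levels
avg_corr = averaged_autocorrelation(speckle)
```

The stretch only aids visualization; the averaging step is what actually raises the SNR of the autocorrelation.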


Fig. 3. Extension speckle pattern and small speckle autocorrelations. (a) Extension speckle pattern; (b) grayscale histogram of extension and original; (c) enlarged autocorrelations.


The algorithm is divided into two stages, de-noising and reconstructing, and the high-SNR autocorrelation is used as the intermediate optimization component. Due to the redundancy of the large speckle pattern, a small area already contains the global information and fully satisfies the demand for calculating autocorrelations. Importantly, multiple small areas reduce random noise more effectively than a single crop does. The averaged image has a higher SNR, implying that the convolution operations of the network can be regarded as an effective de-noising computation. Based on the above analysis, a two-stage CNN, ACR-CNN, is constructed in Section 2.C to improve the robustness to noise using two items of physical knowledge about imaging through diffusers.

C. Model for ACR-CNN

The most important characteristic of deep learning is that models automatically learn features from data. Existing networks generally adopt end-to-end mapping driven purely by large datasets. The major limitation of this mapping is its higher demand on the depth and width of the network for more complex issues such as multiple noise sources and various media. More network parameters open up more possible learning directions, among which a suitable model is hard to find. A deeper and wider network requires huge amounts of data or is prone to overfitting. To escape this predicament, our two-stage method uses the high-SNR autocorrelation as a bridge connecting the two stages of ACR-CNN. Unlike an end-to-end network, the input of the two-stage method is the low-SNR autocorrelation rather than the speckle pattern. In addition, the high-SNR autocorrelation is the key physical constraint for improving robustness to noise and recovering targets.

As schematically illustrated in Fig. 4, ACR-CNN is constructed as two stages connected by the autocorrelation, based on U-net and ResNet. The original U-net was proposed for biomedical image segmentation by classifying each pixel, and this architecture is widely used in deep learning [37]. The residual model of ResNet makes each layer reference its input, and the residual functions formed during learning are easier to optimize than unreferenced mappings [38].


Fig. 4. Network architecture of ACR-CNN. (H, W, 1), height of the image, width of the image, and number of channels; (Conv $+$ Relu $+$ Maxpool), convolution, Relu activation function, and Max pooling layer; (Resblock $+$ Relu), residual block and Relu activation function; Upconvblock, up-convolution block; Sub-pixel conv, sub-pixel convolution; Argmax, arguments of the maxima function.


As mentioned in Section 2.B, the de-noising stage of ACR-CNN, ACR-1, is used to eliminate the influence of noise by enhancing the autocorrelation. The low-SNR autocorrelations of speckle patterns are set as the input, and the autocorrelations of targets are set as the ground truth for ACR-1. Thus, the de-noising stage, ACR-1, is trained to map the autocorrelation from low SNR to high SNR. Features of the autocorrelation are extracted by the encoder of ACR-1 and expanded by the decoder. As shown in Fig. 4, the encoder consists of one convolution layer, one max-pooling layer, and four residual blocks, and the decoder expands the feature maps through eight up-convolution layers. We use the L1 loss (mean absolute error) and the structural similarity index (SSIM) in the loss function ${\rm ACR\text{-}1}_{\rm loss}$, which is given by

$${\rm ACR\text{-}1}_{\rm loss} = L1 + (1 - {\rm SSIM}).$$
The $L1$ loss measures the pixel-level deviation between the output and the ground truth. Since SSIM measures the similarity of two images in terms of brightness, contrast, and structure, it constrains the overall structure. Therefore, a high-SNR autocorrelation free of noise influence is output by ACR-1.
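As an illustration of Eq. (6), a minimal NumPy sketch of the combined loss; the single-window SSIM below is a simplification of the usual sliding-window SSIM, and the stability constants are assumptions:

```python
import numpy as np

def ssim_global(x, y, c1=1e-4, c2=9e-4):
    """Single-window SSIM over the whole image (a simplification of
    the sliding-window SSIM; c1, c2 are small stability constants)."""
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    num = (2 * mx * my + c1) * (2 * cov + c2)
    den = (mx ** 2 + my ** 2 + c1) * (vx + vy + c2)
    return num / den

def acr1_loss(pred, truth):
    """L1 + (1 - SSIM), the de-noising loss of Eq. (6)."""
    return np.abs(pred - truth).mean() + (1.0 - ssim_global(pred, truth))

x = np.linspace(0.0, 1.0, 256).reshape(16, 16)
perfect = acr1_loss(x, x)        # identical images -> loss near 0
```

In practice the SSIM term would be a differentiable sliding-window implementation inside the training framework; the global version here only conveys the structure of the loss.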

As mentioned in Section 2.B, the reconstructing stage of ACR-CNN, ACR-2, is used to reconstruct the target from the output of ACR-1. Accordingly, the enhanced autocorrelation is sent to the encoder and decoder of ACR-2 for feature mapping. As shown in Fig. 4, the encoder of ACR-2 is similar in structure to that of ACR-1, but, owing to the reconstruction task, the decoder of ACR-2 uses six sub-pixel convolutions to expand features, unlike ACR-1's. Importantly, to preserve both global high-level features and local detail, five skip connections are added at different resolutions between the encoder and decoder. Since the goal of ACR-2 is a binary image, we treat the training process as a classification task, and the cross-entropy loss is used accordingly.
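Since ACR-2 treats reconstruction of the binary target as a per-pixel classification, its loss is the familiar cross-entropy; a minimal sketch in NumPy rather than a deep learning framework:

```python
import numpy as np

def binary_cross_entropy(pred, target, eps=1e-7):
    """Pixel-wise binary cross-entropy; pred holds probabilities in
    (0, 1) and target holds the binary ground-truth image."""
    p = np.clip(pred, eps, 1.0 - eps)          # guard against log(0)
    return -np.mean(target * np.log(p) + (1.0 - target) * np.log(1.0 - p))

target = np.ones((4, 4))
chance = binary_cross_entropy(np.full((4, 4), 0.5), target)  # ln 2 at chance level
```

A confident, correct prediction drives the loss toward zero, while a chance-level prediction of 0.5 everywhere yields $\ln 2 \approx 0.693$ per pixel.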

The purpose of the two-stage ACR-CNN, constructed with the high-SNR autocorrelation as the intermediate optimization component, is to solve the problem of imaging through various diffusers with different levels of noise. ACR-CNN learns the features of the autocorrelation connecting target and speckle rather than characterizing the speckle pattern directly. Correspondingly, a high-SNR autocorrelation is enhanced from a low-SNR one in the de-noising stage, which allows ACR-1 to eliminate unknown noise interference in low-SNR speckle patterns imaged through unknown diffusers. The enhanced autocorrelation is then the input signal for restoring the structural distribution of the original target in the reconstructing stage. Therefore, ACR improves generalization by reconstructing objects from enhanced autocorrelations rather than from speckle patterns.

3. ANALYSIS

A. Noise Datasets

Here, we present the noise datasets of the experiments, built from the MNIST and Chars74k databases [39,40], which are used to verify the noise robustness of the two-stage method.

Simulation. As the simulated experimental demonstration, speckle patterns are simulated through various diffusers with different levels of Gaussian noise. In these experiments, the setup of the simulated optical system is shown in Fig. 1(a).

As a first experimental demonstration, we simulate speckle patterns imaged through various diffusers. Even for the same diffuser, the modulations of the input optical signal differ while sharing the same statistical characteristics. Therefore, to simulate different modulations, each DGRM is randomly generated for each object under the same variance $\sigma _M^2$. As mentioned in Section 2.A, different variances $\sigma _M^2$ of the DGRM are set to simulate various diffusers with different statistical characteristics. Since the speckle pattern's autocorrelation is approximately equal to the target's, Fig. 5 shows that the autocorrelations of speckle patterns of $1920 \times 1200$ pixels simulated with different $\sigma _M^2$ (100, 110, 150, 200, 250, and 260) have distributions similar to the autocorrelations of the targets.


Fig. 5. Raw simulated speckle patterns, targets, and their autocorrelations in six diffusers when $\sigma _M^2$ is 100, 110, 150, 200, 250, and 260.


As a second experimental demonstration, we simulate low-SNR speckle patterns imaged through various diffusers with different levels of Gaussian noise. A small speckle pattern of $256 \times 256$ pixels at the central region is selected from the raw patterns of $1920 \times 1200$ pixels to calculate the autocorrelation. Different levels of Gaussian noise are introduced to the speckle patterns with the variance $\sigma _G^2$ ranging from 0.01 to 0.30, as shown in Fig. 6. With larger $\sigma _G^2$, more information is lost until it is almost buried in noise. For training, 13,204 digits and English letters are simulated with four kinds of diffusers ($\sigma _M^2$ of 100, 150, 200, and 250, denoted $D\_100$, $D\_150$, $D\_200$, and $D\_250$) under four levels of Gaussian noise ($\sigma _G^2$ of 0.01, 0.03, 0.05, and 0.07). Two other kinds of diffusers ($D\_110$ and $D\_260$) and eight levels of Gaussian noise (0.04, 0.08, 0.1, 0.12, 0.14, 0.2, 0.25, and 0.3) are used to test the robustness to unknown diffusers and unknown noise. Each test group has 406 digits and English letters.
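The simulated noise levels above amount to adding zero-mean Gaussian noise of variance $\sigma _G^2$ to the normalized speckle pattern; a minimal sketch (clipping back to [0, 1] is our assumption about the normalization):

```python
import numpy as np

def add_gaussian_noise(pattern, sigma_g_sq, seed=0):
    """Add zero-mean Gaussian noise of variance sigma_G^2 to a
    speckle pattern normalized to [0, 1], then clip back to range."""
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, np.sqrt(sigma_g_sq), pattern.shape)
    return np.clip(pattern + noise, 0.0, 1.0)

pattern = np.random.default_rng(3).random((64, 64))
noisy = add_gaussian_noise(pattern, 0.05)
clean = add_gaussian_noise(pattern, 0.0)   # zero variance leaves the pattern unchanged
```

Sweeping `sigma_g_sq` from 0.01 to 0.30 reproduces the range of test conditions described above.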


Fig. 6. Low-SNR simulated speckle patterns with various Gaussian noise and their autocorrelations when $\sigma_G^2$ is 0.01, 0.03, 0.05, 0.07, 0.15, 0.20, 0.25, and 0.30.


System. As the experimental demonstration, speckle patterns are collected in the actual optical system through three grit-ground-glass diffusers (Thorlabs, DG100X-120, DG100X-220, and DG100X-600) with photon-noise interference. In these experiments, the targets are illuminated by an LED light source and diffused through the ground glass [Supplement 1, Figs. 1(b) and 1(c)]. The collected speckle patterns are also disturbed by the detector noise called noise 0, which is caused by low exposure.

As a third experimental demonstration, we collect speckle patterns imaged through three grit-ground-glass diffusers (Thorlabs, DG100X-120, DG100X-220, and DG100X-600). In total, 4200 images (600 per piece) collected through seven pieces of the 220 grit-ground-glass diffuser, $G\_220$ (Thorlabs, DG100X-220), are used for training. Additionally, 1200 images (600 per piece) imaged through the two other grit-ground-glass diffusers, $G\_120$ and $G\_600$ (Thorlabs, DG100X-120 and DG100X-600), are used for testing.

As a final experimental demonstration, we collect speckle patterns imaged through the 220 grit-ground-glass diffuser, $G\_220$ (Thorlabs, DG100X-220), with three levels of photon noise. The disturbance conditions of the ambient light are changed during measurements to create three levels of photon noise, called noise 1, 2, and 3, as described in Section 2.A. Figure 7 shows the autocorrelations of small areas selected at the central region of the extended speckle patterns with noise 0, 1, 2, and 3. The results indicate that increasing the disturbance from ambient light decreases the peak SNR (PSNR) of the autocorrelation. In total, we use 2700 digits and English letters (900 images per noise level) for training and 300 images (100 per noise level) for testing with the three levels of photon noise (noise 1, 2, and 3).


Fig. 7. Extension speckle patterns of the system and ground truth with their autocorrelations. (a) Extension speckle patterns of the system with four cases of noise; (b) autocorrelations of small speckles; (c) enlarged autocorrelations; (d) ground truth.


B. Comparison with BM3D and Non-Local Means De-Noisers

Following the popularity of image de-noising algorithms, we apply two de-noising filters, block-matching and 3D filtering (BM3D) and non-local means (NLM), to reduce the Gaussian noise simulated with different variances and the photon noise collected in the actual optical system [41,42]. Some usual transformations occur in optical imaging systems, such as zoom and flip; beyond these, a special transformation, rotation, is introduced by the DMD in the actual optical system. ACR-1 can model these transformations, but BM3D and NLM cannot. To make fair comparisons among the de-noising results of the three filters, corresponding preprocessing is applied to the speckle patterns and targets. Note that this transformation preprocessing (zoom, flip, and rotation) is applied only in this section; the images in other sections are all modeled from the original data without any preprocessing.

As the experimental comparison, we show the de-noising results of BM3D, NLM, and ACR-1 in the actual optical system, where the autocorrelations are buried in three levels of photon noise. NLM and BM3D are applied to the autocorrelations of the speckles; the results, denoted NLM-A and BM3D-A, are shown in Fig. 8. We also present the autocorrelation results when NLM and BM3D are applied to the speckle patterns themselves, denoted NLM-S and BM3D-S. Compared with NLM-S and BM3D-S, the autocorrelations of NLM-A and BM3D-A are more robust to the three levels of photon noise; NLM-A and BM3D-A are therefore used in the next comparison. As the simulated comparison, the robustness of the three de-noising filters is tested under 14 levels of Gaussian noise with $\sigma _G^2$ ranging from 0.01 to 0.30. For quantitative analysis, the PSNR in simulation and in the system is shown in Figs. 9(a) and 9(b), respectively.


Fig. 8. Comparison of BM3D, NLM, and ACR-1. Speckle-Auto, autocorrelation of speckles; NLM-S, NLM applied to speckle patterns; NLM-A, NLM applied to Speckle-Auto; BM3D-S, BM3D applied to speckle patterns; BM3D-A, BM3D applied to Speckle-Auto.



Fig. 9. PSNR of BM3D, NLM, and ACR-1 in simulation and system. (a) PSNR of the three de-noising filters under various Gaussian noise in simulation; the bottom label denotes the variance of Gaussian noise $\sigma _G^2$ for simulation. (b) PSNR of the three de-noising filters under three levels of photon noise in the system; the bottom label denotes the noise cases for the system. Input denotes Speckle-Auto; NLM denotes NLM-A; BM3D denotes BM3D-A.


In general, comparing NLM-S and BM3D-S with NLM-A and BM3D-A, de-noising the autocorrelation is more effective. However, BM3D and NLM are unable to filter out the noise distributed in the background of the autocorrelation images, so both improve the PSNR by less than 1 dB. In contrast, ACR-1 recovers the autocorrelation information outstandingly and improves the PSNR by about 20 dB on the system data. Therefore, ACR-1 copes with different levels of noise far better than the NLM and BM3D algorithms.
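The PSNR figures quoted above follow the standard definition; for reference, a minimal implementation (the peak value of 255 assumes 8-bit images):

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB: 10 log10(peak^2 / MSE)."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

zero = np.zeros((8, 8))
worst = psnr(zero, np.full((8, 8), 255.0))   # MSE = peak^2      -> 0 dB
better = psnr(zero, np.full((8, 8), 25.5))   # MSE = peak^2/100  -> 20 dB
```

Note that identical images give an infinite PSNR (MSE of zero), so comparisons are only meaningful between noisy reconstructions.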

C. ACR-1 Model Test

Figure 10 presents the results of the ACR-1 model test in the imaging experiment, where the autocorrelation information is buried in noise. For the quantitative analysis of enhancements, the PSNR and SSIM are shown in Fig. 11.


Fig. 10. Test result of simulation and system for ACR-1.



Fig. 11. SSIM and PSNR for ACR-1. (a) PSNR of two types of experiments with various diffusers and noise; (b) SSIM of two types of experiments with various media and noise. The bottom label denotes the variance of Gaussian noise $\sigma _G^2$ for simulation; the top label denotes cases of noise for the system; $D\_110$ and $D\_260$ denote two simulative diffusers when $\sigma _M^2$ is 110 and 260; Sys means system data.


Simulation. As the simulated experimental demonstration, we test the robustness under eight levels of Gaussian noise with two diffusers ($D\_110$ and $D\_260$). Since the autocorrelation of the target can be approximately obtained from the speckle pattern [43,44], our two-stage method has a natural advantage in generalizing over diffusers with different statistical characteristics. In the de-noising stage, the SSIM of the output rises from 0.15 to almost 0.98 under the eight levels of Gaussian noise (0.04, 0.08, 0.1, 0.12, 0.14, 0.2, 0.25, and 0.3). The PSNR of the output remains nearly constant at 35 dB when $\sigma _G^2$ is $0.04 \sim 0.14$. Even though there is a downward trend at larger $\sigma _G^2$, the PSNR still remains nearly above 30 dB. The autocorrelation information is almost buried in the untrained Gaussian noise when $\sigma _G^2$ is $0.20 \sim 0.30$, as shown in Fig. 6, but ACR-1 consistently produces enhanced high-SNR autocorrelations of targets imaged through unknown diffusers.

System. As the experimental demonstration, we test the robustness of the actual optical system imaging through the 220 grit-ground-glass diffuser, $G\_220$, with three levels of photon noise. As shown in Fig. 11, the PSNR increases from 20 to 38 dB, and the SSIM rises from about 0.2 to almost 1.0. The result is similar to the simulated experiment and verifies that ACR-1 markedly improves the robustness of optical imaging in complex environments.

The robustness of ACR-CNN to unknown degrees of Gaussian noise is tested in simulation with imaging through unknown diffusers. Furthermore, the generalizability of ACR-CNN in the optical system has also been verified with detector and photon noise, caused respectively by low exposure and by ambient-light interference. The results show that the enhanced autocorrelations have high SSIM and PSNR. Therefore, ACR-1 can fit the model of unknown diffusers and eliminate noise. In the reconstructing stage, ACR-2, the target is then reconstructed from the enhanced autocorrelation.

D. ACR-2 Model Test

In the ACR-2 model test, we compare the reconstruction performance of two models, as shown in Fig. 12. The end-to-end model [speckle reconstruction (SR)] directly recovers targets from speckle patterns without a physical constraint, while the two-stage model, constrained by the autocorrelation as physical knowledge, recovers targets from the speckle autocorrelation (SAR) and the enhanced autocorrelation (AR).


Fig. 12. Test results of the ACR-2. SR, speckle reconstruction; SAR, speckle autocorrelation reconstruction; AR, autocorrelation reconstruction; GT, ground truth. $G\_120$, $G\_220$, and $G\_600$ respectively represent three grit-ground-glass diffusers: Thorlabs, DG100X-120, DG100X-220, and DG100X-600.


Simulation. As the simulated experimental demonstration, we reconstruct targets from low-SNR speckle patterns, from low-SNR autocorrelations of speckle patterns, and from high-SNR autocorrelations enhanced by ACR-1. Under the conditions of unknown diffusers and noise, AR consistently makes high-quality reconstructions, while SR fails entirely, as shown in Fig. 12.

System. As the experimental demonstration, we reconstruct targets, respectively, from speckle patterns, low-SNR autocorrelations of speckle patterns, and high-SNR autocorrelations in the actual optical system through three grit-ground-glass diffusers ($G\_220$, $G\_120$, and $G\_600$) with three levels of photon noise. For speckle patterns diffused by the trained ground glass ($G\_220$) with trained noise 0, 1, 2, and 3, AR still makes high-quality reconstructions, whereas SR and SAR either poorly recover only larger targets or fail to generalize to other small targets. For speckle patterns diffused by the untrained ground glasses ($G\_120$ and $G\_600$) with detector noise 0, the targets can be recovered by AR, while SR and SAR fail to reconstruct through either ground glass. The results shown in Fig. 12 demonstrate that AR has high generalization ability for the complex imaging system, especially for imaging through variable diffusers.

For unknown noise and diffusers in the simulation data, SR cannot recover the target and SAR recovers it poorly, while AR successfully reconstructs the target from the enhanced autocorrelation. For unknown diffusers, SR is still unable to recover the target from the system data, while AR succeeds. For the three levels of photon noise, SR recovers only part of the original target information from the system data, while AR successfully reconstructs the target with rich details. On the simulation and system data, the average PSNR of AR is 4 dB higher than that of SR. Figure 12 demonstrates that our two-stage model is more capable than the end-to-end model, and that high-SNR autocorrelation significantly improves target reconstruction for imaging through diffusers with noise.

E. Comparison with Erf4-Net

In this section, we compare the reconstructions of three methods on two networks in the actual optical system, imaging through one grit-ground-glass diffuser ($G\_220$) with three levels of photon noise. The first network is our ACR-2, shown in Fig. 4. The other, called Erf4-Net, is adapted from ERFNet [45]; its construction is shown in Table 1. As shown there, we add four skip connections and quadruple the number of feature maps of ERFNet so that Erf4-Net has a model size similar to the sum of ACR-1 and ACR-2. The corresponding training parameters are listed in Table 2. Thus, at a similar model size, the reconstruction results of the three methods (SR, SAR, and AR) on the two networks (Erf4-Net and ACR-CNN) are compared, as shown in Fig. 13.

Table 1. Layer Display of Erf4-Net

Table 2. Training Parameters of Erf4-Net and ACR-CNN


Fig. 13. Comparison results between ACR-2 and Erf4-Net. Erf4-SR-Input, input of Erf4-Net for SR; Erf4-SR-Out, output of Erf4-Net for SR; Erf4-SAR-Input, input of Erf4-Net for SAR; Erf4-SAR-Out, output of Erf4-Net for SAR; AR-ACR2-Input, input of ACR-2 for AR; AR-ACR2-Out, output of ACR-2 for AR; GT, ground truth.


On the same network, ACR-2, SR and SAR cannot completely reconstruct targets hidden behind unknown diffusers with unknown noise, and SAR recovers only part of the target structure even in the case of a known diffuser with known noise. The comparison between ACR-2 and Erf4-Net shows that increasing the network width improves the reconstruction quality of SR to a certain extent. The reconstruction quality of SAR remains slightly lower than that of AR, which indicates that high-SNR autocorrelation is important physical prior knowledge. On the other hand, the training of Erf4-Net takes much longer than that of ACR-CNN, as shown in Table 2. Providing high-SNR autocorrelation is therefore more helpful than increasing network width for reconstructing targets in the de-noising task of imaging through diffusers.

In general, the de-noising stage significantly improves the PSNR and SSIM even under high-level noise with unknown diffusers, and the reconstructing stage consistently recovers higher-quality images from the enhanced autocorrelation than from the speckle pattern. An unconstrained end-to-end network of similar size to the two-stage network converges with difficulty and requires longer training to model both de-noising and recovery; in addition, the increased network width may lead to overfitting. Combining physical prior knowledge with the network helps optimize learning and improves convergence speed. By using high-SNR autocorrelations as the physical constraint, the two-stage method overcomes the poor generalization of existing approaches across various diffusers with noise.

4. DISCUSSION

According to the reconstruction results in Section 3, we draw the following conclusions.

  • (i) Using physical knowledge to constrain the learning process improves robustness to noise caused by an imperfect imaging environment. The first element of the physical constraint is dividing the imaging task into two stages, de-noising and reconstructing; the two-stage method, with autocorrelation as the intermediate optimization target, has a natural advantage in generalizing across diffusers. The second element is building the de-noising model on autocorrelations rather than speckle patterns. In particular, enhancing the autocorrelation is necessary in low-SNR scenarios. The unconstrained end-to-end network of similar size does have some noise resistance, but at the cost of a longer training duration and a higher risk of overfitting.
  • (ii) Compared with NLM and BM3D, the de-noising stage, ACR-1, generalizes better and yields high-SNR autocorrelations thanks to the physical constraint and data driving. Based on U-net and ResNet, the ACR-1 network learns features and outputs autocorrelations that are fed to the reconstructing stage.
  • (iii) The reconstructing stage consistently recovers targets hidden behind unknown diffusers with unknown noise, whereas the end-to-end model fails entirely. Similar in structure to ACR-1, the ACR-2 network is built on U-net and ResNet and has five skip layers that fuse high-level information with low-level features. With Erf4-Net, whose size is similar to that of ACR-CNN, the reconstruction quality of the end-to-end model improves under the three kinds of photon noise but remains below that of the two-stage network.
  • (iv) In the de-noising stage, because the autocorrelation image has a large background and the effective information occupies only a small central area, the calculated PSNR stays above 20 dB even under high-level noise. In the reconstructing stage, by contrast, targets of different sizes distribute their effective information over varying regions of the image. Hence a visually poor reconstruction can reach a PSNR of 17 dB, while a visually high-quality one may reach only 13 dB.
  • (v) In this proof of concept, the simulation and actual-system experiments show that our two-stage method generalizes better than the end-to-end model even in low-SNR conditions, such as under the detector and photon noise of the actual optical system.
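The PSNR behavior described in point (iv) can be checked numerically: when errors are confined to a small central region, the large unchanged background keeps the PSNR high. A minimal numpy sketch, using the standard PSNR definition with illustrative (not the paper's) image sizes:

```python
import numpy as np

def psnr(ref, test, data_range=1.0):
    """PSNR in dB: 10 * log10(peak^2 / MSE)."""
    mse = np.mean((ref - test) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# A 256x256 "autocorrelation-like" image: the error (amplitude 0.5) is
# confined to a 32x32 central patch, i.e. only ~1.6% of the pixels.
ref = np.zeros((256, 256))
bad = ref.copy()
bad[112:144, 112:144] += 0.5
print(psnr(ref, bad))  # ~24.1 dB: high PSNR despite a badly wrong center
```

Because the MSE is averaged over all pixels, a severe local error is diluted by the flat background, which is why autocorrelation PSNRs remain above 20 dB under heavy noise.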

5. CONCLUSION

When the speckle pattern is seriously submerged in noise, introducing high-SNR autocorrelation as the intermediate optimization goal optimizes the network learning process, improves learning efficiency, and improves robustness to noise. The experimental results also show that high-quality autocorrelation information improves the reconstruction. In particular, for speckle autocorrelations seriously disturbed by noise, commonly used de-noising algorithms are difficult to apply, as shown by the de-noising results of BM3D and NLM. Improving the SNR of the speckle autocorrelation thus provides necessary and important physical prior knowledge for the de-noising task of imaging through diffusers.
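The speckle autocorrelation that the de-noising stage enhances is conventionally computed through the Fourier domain via the Wiener-Khinchin theorem; a minimal numpy sketch, with normalization choices that are ours rather than the paper's:

```python
import numpy as np

def speckle_autocorrelation(speckle):
    """Autocorrelation via the Wiener-Khinchin theorem: the autocorrelation
    of a signal is the inverse FFT of its power spectrum."""
    s = np.asarray(speckle, float)
    s = s - s.mean()                      # suppress the constant background term
    power = np.abs(np.fft.fft2(s)) ** 2   # |F{s}|^2
    ac = np.real(np.fft.ifft2(power))     # back to the spatial domain
    ac = np.fft.fftshift(ac)              # move the zero-lag peak to the center
    return ac / ac.max()                  # normalize the peak to 1

# The zero-lag peak of an autocorrelation always sits at the (shifted) center.
rng = np.random.default_rng(0)
ac = speckle_autocorrelation(rng.random((64, 64)))
print(np.unravel_index(np.argmax(ac), ac.shape))  # -> (32, 32)
```

Noise in the speckle pattern propagates into this estimate, which is exactly what motivates placing the learned de-noiser (ACR-1) on the autocorrelation before reconstruction.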

Combining physical constraints with data driving, we have proposed a two-stage method that significantly improves robustness. The de-noising stage enhances the low-SNR autocorrelation, increasing the average PSNR and SSIM to 35 dB and nearly 1, respectively. Importantly, the two-stage method achieves higher-quality reconstructions than the end-to-end method by exploiting physical knowledge, and the high-SNR autocorrelation improves the efficiency of reconstruction. In practice, our work places low demands on detector quality even in complex noise scenes. This is the basis for extending imaging through diffusers with different statistical characteristics to many fields, such as imaging in low illumination.

Funding

National Natural Science Foundation of China (62031018, 61971227, 61727802); Jiangsu Provincial Key Research and Development Program (BE2018126); Fundamental Research Funds for the Central Universities (30920031101).

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

REFERENCES

1. J. W. Goodman, W. H. Huntley, D. W. Jackson, and M. Lehmann, “Wavefront-reconstruction imaging through random media,” Appl. Phys. Lett. 8, 311–313 (1966). [CrossRef]  

2. V. Ntziachristos, “Going deeper than microscopy: the optical imaging frontier in biology,” Nat. Methods 7, 603–614 (2010). [CrossRef]  

3. A. Ishimaru, Wave Propagation and Scattering in Random Media (IEEE, 1997).

4. M. S. Feld, C. Yang, D. Psaltis, and Z. Yaqoob, “Optical phase conjugation for turbidity suppression in biological samples,” Nat. Photonics 2, 110–115 (2008). [CrossRef]  

5. K. Si, R. Fiolka, and M. Cui, “Fluorescence imaging beyond the ballistic regime by ultrasound-pulse-guided digital phase conjugation,” Nat. Photonics 6, 657–661 (2012). [CrossRef]  

6. T. R. Hillman, T. Yamauchi, W. Choi, R. R. Dasari, M. S. Feld, Y. K. Park, and Z. Yaqoob, “Digital optical phase conjugation for delivering two-dimensional images through turbid media,” Sci. Rep. 3, 1909 (2013). [CrossRef]  

7. L. Yan, M. Cheng, Y. Shen, J. Shi, and L. V. Wang, “Focusing light inside dynamic scattering media with millisecond digital optical phase conjugation,” Optica 4, 280–288 (2017). [CrossRef]  

8. M. Kim, W. Choi, Y. Choi, C. Yoon, and W. Choi, “Transmission matrix of a scattering medium and its applications in biophotonics,” Opt. Express 23, 12648–12668 (2015). [CrossRef]  

9. A. Drémeau, A. Liutkus, D. Martina, O. Katz, C. Schülke, F. Krzakala, S. Gigan, and L. Daudet, “Reference-less measurement of the transmission matrix of a highly scattering material using a DMD and phase retrieval techniques,” Opt. Express 23, 11898–11911 (2015). [CrossRef]  

10. H. B. De Aguiar, S. Brasselet, and S. Gigan, “Enhanced nonlinear imaging through scattering media using transmission-matrix-based wave-front shaping,” Phys. Rev. A 94, 043830 (2016). [CrossRef]  

11. X. Xu, X. Xie, H. He, H. Zhuang, J. Zhou, T. Abhilash, and A. P. Mosk, “Imaging objects through scattering layers and around corners by retrieval of the scattered point spread function,” Opt. Express 25, 32829–32840 (2017). [CrossRef]  

12. L. Long, L. Quan, S. Shuai, H. Z. Lin, W. T. Liu, and P. X. Chen, “Imaging through scattering layers exceeding memory effect range with spatial-correlation-achieved point-spread-function,” Opt. Lett. 43, 1670–1673 (2018). [CrossRef]  

13. H. He, X. Xie, Y. Liu, H. Liang, and J. Zhou, “Exploiting the point spread function for optical imaging through a scattering medium based on deconvolution method,” J. Innov. Opt. Health Sci. 12, 1930005 (2019). [CrossRef]  

14. O. Katz, E. Small, and Y. Silberberg, “Looking around corners and through thin turbid layers in real time with scattered incoherent light,” Nat. Photonics 6, 549–553 (2012). [CrossRef]  

15. J. Bertolotti, E. G. Van Putten, C. Blum, A. Lagendijk, W. L. Vos, and A. P. Mosk, “Non-invasive imaging through opaque scattering layers,” Nature 491, 232–234 (2012). [CrossRef]  

16. O. Katz, P. Heidmann, M. Fink, and S. Gigan, “Non-invasive single-shot imaging through scattering layers and around corners via speckle correlations,” Nat. Photonics 8, 784–790 (2014). [CrossRef]  

17. A. Porat, E. R. Andresen, H. Rigneault, D. Oron, S. Gigan, and O. Katz, “Widefield lensless imaging through a fiber bundle via speckle correlations,” Opt. Express 24, 16835–16855 (2016). [CrossRef]  

18. N. Stasio, C. Moser, and D. Psaltis, “Calibration-free imaging through a multicore fiber using speckle scanning microscopy,” Opt. Lett. 41, 3078–3081 (2016). [CrossRef]  

19. G. Osnabrugge, R. Horstmeyer, I. N. Papadopoulos, B. Judkewitz, and I. M. Vellekoop, “The generalized optical memory effect,” Optica 4, 886–892 (2017). [CrossRef]  

20. C. Guo, J. Liu, W. Li, T. Wu, L. Zhu, J. Wang, G. Wang, and X. Shao, “Imaging through scattering layers exceeding memory effect range by exploiting prior information,” Opt. Commun. 434, 203–208 (2018). [CrossRef]  

21. H. Chen, Y. Gao, X. Liu, and Z. Zhou, “Imaging through scattering media using speckle pattern classification based support vector regression,” Opt. Express 26, 26663–26678 (2018). [CrossRef]  

22. T. Ando, R. Horisaki, and J. Tanida, “Speckle-learning-based object recognition through scattering media,” Opt. Express 23, 33902–33910 (2015). [CrossRef]  

23. R. Horisaki, R. Takagi, and J. Tanida, “Learning-based imaging through scattering media,” Opt. Express 24, 13738–13743 (2016). [CrossRef]  

24. A. Sinha, J. Lee, S. Li, and G. Barbastathis, “Lensless computational imaging through deep learning,” Optica 4, 1117–1125 (2017). [CrossRef]  

25. K. H. Jin, M. T. Mccann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Trans. Image Process. 26, 4509–4522 (2017). [CrossRef]  

26. G. Satat, M. Tancik, O. Gupta, B. Heshmat, and R. Raskar, “Object classification through scattering media with deep learning on time resolved measurement,” Opt. Express 25, 17466–17479 (2017). [CrossRef]  

27. P. Caramazza, A. Boccolini, D. Buschek, M. Hullin, C. F. Higham, R. Henderson, R. Murray-Smith, and D. Faccio, “Neural network identification of people hidden from view with a single-pixel, single-photon detector,” Sci. Rep. 8, 11945 (2018). [CrossRef]  

28. A. Turpin, I. Vishniakou, and J. D. Seelig, “Light scattering control in transmission and reflection with neural networks,” Opt. Express 26, 30911–30929 (2018). [CrossRef]  

29. S. Jiang, J. Liao, Z. Bian, K. Guo, Y. Zhang, and G. Zheng, “Transform- and multi-domain deep learning for single-frame rapid autofocusing in whole slide imaging,” Biomed. Opt. Express 9, 1601–1612 (2018). [CrossRef]  

30. S. Li, M. Deng, J. Lee, A. Sinha, and G. Barbastathis, “Imaging through glass diffusers using densely connected convolutional networks,” Optica 5, 803–813 (2018). [CrossRef]  

31. Y. Li, Y. Xue, and L. Tian, “Deep speckle correlation: a deep learning approach toward scalable imaging through scattering media,” Optica 5, 1181–1190 (2018). [CrossRef]  

32. M. Yang, Z.-H. Liu, Z.-D. Cheng, J.-S. Xu, C.-F. Li, and G.-C. Guo, “Deep hybrid scattering image learning,” J. Phys. D 52, 115105 (2018). [CrossRef]  

33. M. Lyu, H. Wang, G. Li, S. Zheng, and G. Situ, “Learning-based lensless imaging through optically thick scattering media,” Adv. Photon. 1, 036001 (2019). [CrossRef]  

34. Y. Sun, J. Shi, L. Sun, J. Fan, and G. Zeng, “Image reconstruction through dynamic scattering media based on deep learning,” Opt. Express 27, 16032–16046 (2019). [CrossRef]  

35. Y. Li, S. Cheng, Y. Xue, and L. Tian, “Displacement-agnostic coherent imaging through scatter with an interpretable deep neural network,” Opt. Express 29, 2244–2257 (2021). [CrossRef]  

36. J. Goodman, Speckle Phenomena in Optics: Theory and Applications (Roberts and Company, 2007).

37. O. Ronneberger, P. Fischer, and T. Brox, “U-net: convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer International Publishing, 2015), pp. 234–241.

38. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (2016), pp. 770–778.

39. L. Deng, “The MNIST database of handwritten digit images for machine learning research [best of the web],” IEEE Signal Process. Mag. 29(6), 141–142 (2012). [CrossRef]  

40. T. E. De Campos, B. R. Babu, and M. Varma, “Character recognition in natural images,” in VISAPP 2009—Proceedings of the 4th International Conference on Computer Vision Theory and Applications (2009), Vol. 2, pp. 273–280.

41. K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian, “Image restoration by sparse 3D transform-domain collaborative filtering,” Proc. SPIE 6812, 62–73 (2008). [CrossRef]  

42. A. Buades, B. Coll, and J.-M. Morel, “A non-local algorithm for image denoising,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR’05) (2005), Vol. 2, pp. 60–65.

43. G. R. Ayers, M. J. Northcott, and J. C. Dainty, “Knox-Thompson and triple-correlation imaging through atmospheric turbulence,” J. Opt. Soc. Am. A 5, 963–985 (1988). [CrossRef]  

44. L. Zhu, Y. Wu, J. Liu, T. Wu, L. Liu, and X. Shao, “Color imaging through scattering media based on phase retrieval with triple correlation,” Opt. Lasers Eng. 124, 105796 (2020). [CrossRef]  

45. E. Romera, J. M. Álvarez, L. M. Bergasa, and R. Arroyo, “ERFNet: efficient residual factorized convnet for real-time semantic segmentation,” IEEE Trans. Intell. Transp. Syst. 19, 263–272 (2018). [CrossRef]  




Figures (13)

Fig. 1. Experimental setup of the simulation and actual optical system. (a) Experimental setup of the simulation. $u$ denotes the distance between targets and diffusers; $v$ denotes the distance between diffusers and the camera. (b) Setup of the actual optical system. DMD, digital micromirror device; blue arrows denote ambient light coming from the LED. (c) Actual imaging system.
Fig. 2. Simulative raw speckle pattern, small speckle, and their autocorrelations. (a) Gray level of lines L1 obtained from the enlarged autocorrelation of A1, A2, and A3; (b) gray level of lines L2; (c) gray level of lines L3.
Fig. 3. Extension speckle pattern and small speckle autocorrelations. (a) Extension speckle pattern; (b) grayscale histogram of extension and original; (c) enlarged autocorrelations.
Fig. 4. Network architecture of ACR-CNN. (H, W, 1), height of the image, width of the image, and number of channels; (Conv $+$ Relu $+$ Maxpool), convolution, Relu activation function, and Max pooling layer; (Resblock $+$ Relu), residual block and Relu activation function; Upconvblock, up-convolution block; Sub-pixel conv, sub-pixel convolution; Argmax, arguments of the maxima function.
Fig. 5. Raw simulated speckle patterns, targets, and their autocorrelations in six diffusers when $\sigma _M^2$ is 100, 110, 150, 200, 250, and 260.
Fig. 6. Low-SNR simulated speckle patterns with various Gaussian noise and their autocorrelations when $\sigma_G^2$ is 0.01, 0.03, 0.05, 0.07, 0.15, 0.20, 0.25, and 0.30.
Fig. 7. Extension speckle patterns of the system and ground truth with their autocorrelations. (a) Extension speckle patterns of the system with four cases of noise; (b) autocorrelations of small speckles; (c) enlarged autocorrelations; (d) ground truth.
Fig. 8. Comparison of BM3D, NLM, and ACR-1. Speckle-Auto, autocorrelation of speckles; NLM-S, NLM acted on speckle patterns; NLM-A, NLM acted on Speckle-Auto; BM3D-S, BM3D acted on speckle patterns; BM3D-A, BM3D acted on Speckle-Auto.
Fig. 9. PSNR of BM3D, NLM, and ACR-1 in simulation and system. (a) PSNR of three de-noisier filters under various Gaussian noise in simulation; the bottom label denotes the variance of Gaussian noise $\sigma _G^2$ for simulation. (b) PSNR of three de-noisier filters under three-photon noise in the system; the bottom label denotes cases of noise for the system. Input denotes Speckle-Auto; NLM denotes NLM-A; BM3D denotes BM3D-A.
Fig. 10. Test result of simulation and system for ACR-1.
Fig. 11. SSIM and PSNR for ACR-1. (a) PSNR of two types of experiments with various diffusers and noise; (b) SSIM of two types of experiments with various media and noise. The bottom label denotes the variance of Gaussian noise $\sigma _G^2$ for simulation; the top label denotes cases of noise for the system; $D\_110$ and $D\_260$ denote two simulative diffusers when $\sigma _M^2$ is 110 and 260; Sys means system data.
Fig. 12. Test results of the ACR-2. SR, speckle reconstruction; SAR, speckle autocorrelation reconstruction; AR, autocorrelation reconstruction; GT, ground truth. $G\_120$, $G\_220$, and $G\_600$ respectively represent three grit-ground-glass diffusers: Thorlabs DG100X-120, DG100X-220, and DG100X-600.
Fig. 13. Comparison results between ACR-2 and Erf4-Net. Erf4-SR-Input, input of Erf4-Net for SR; Erf4-SR-Out, output of Erf4-Net for SR; Erf4-SAR-Input, input of Erf4-Net for SAR; Erf4-SAR-Out, output of Erf4-Net for SAR; AR-ACR2-Input, input of ACR-2 for AR; AR-ACR2-Out, output of ACR-2 for AR; GT, ground truth.

Tables (2)

Table 1. Layer Display of Erf4-Net
Table 2. Training Parameters of Erf4-Net and ACR-CNN

Equations (6)


$$y = F(x),$$
$$x = F^{-1}(y),$$
$$R_S = (y \star y) + c,$$
$$R = \arg\min \{\, \| (x \star x) - R_S \| \,\},$$
$$\hat{x} = \arg\min \{\, \| x - G(R) \| \,\},$$
$$\mathrm{ACR1}_{\mathrm{loss}} = L_1 + (1 - \mathrm{SSIM}),$$
where $\star$ denotes the autocorrelation operation.
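The ACR-1 loss, $L_1 + (1 - \mathrm{SSIM})$, can be sketched numerically as follows. This uses a simplified single-window (global) SSIM for self-containment; the actual training presumably uses the usual windowed SSIM:

```python
import numpy as np

def global_ssim(x, y, data_range=1.0):
    """Single-window SSIM (a simplification of the standard windowed SSIM)."""
    c1 = (0.01 * data_range) ** 2          # standard SSIM stabilizing constants
    c2 = (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    dx, dy = x - mx, y - my
    vx, vy = (dx * dx).mean(), (dy * dy).mean()
    cov = (dx * dy).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx * mx + my * my + c1) * (vx + vy + c2))

def acr1_loss(pred, target):
    """L1 term plus (1 - SSIM); zero for a perfect prediction."""
    return np.abs(pred - target).mean() + (1.0 - global_ssim(pred, target))
```

The L1 term penalizes pixel-wise error, while the SSIM term rewards structural agreement; their sum is zero only when prediction and target coincide.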