Abstract
Temporal compressive coherent diffraction imaging is a lensless imaging technique with the capability to capture fast-moving small objects. However, the accuracy of image reconstruction is often hindered by the loss of frequency domain information, a critical factor limiting the quality of the reconstructed images. To improve the quality of these reconstructed images, a method termed dual-domain mean-reverting diffusion model-enhanced temporal compressive coherent diffraction imaging (DMDTC) is introduced. DMDTC leverages the mean-reverting diffusion model to acquire prior information in both the frequency and spatial domains through sample learning. The frequency domain mean-reverting diffusion model is employed to recover missing information, while the hybrid input-output algorithm is carried out to reconstruct the spatial domain image. The spatial domain mean-reverting diffusion model is utilized for denoising and image restoration. DMDTC demonstrates a significant enhancement in the quality of the reconstructed images. The results indicate that the structural similarity and peak signal-to-noise ratio of images reconstructed by DMDTC surpass those obtained through conventional methods. DMDTC enables high temporal frame rates and high spatial resolution in coherent diffraction imaging.
© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
1. Introduction
Coherent diffraction imaging (CDI) [1] is a lensless imaging technique with the potential to achieve higher spatial resolution than direct imaging by generating high-resolution images containing both intensity and phase information from far-field coherent diffraction patterns. It leverages the coherence of light for imaging, yielding high-resolution and high-sensitivity results. This technology finds widespread application in fields such as materials science, biology, and nanotechnology [2–5]. However, the inverse problem of CDI is ill-conditioned, and the quality of traditional phase retrieval algorithms [6] is limited by support constraints and by the high detection noise [7] inherent to single-frame imaging. Several methods have been proposed to mitigate the excessive noise and low resolution associated with single-frame imaging, which remain key challenges in CDI, including single-shot phase imaging with randomized light [8] and coherent modulation imaging [9]. However, the temporal resolution of CDI is limited by the frame rate of the camera, making it unable to capture fast-moving small targets. Enhancing the imaging frame rate of CDI is therefore a key issue to be addressed.
With the intention of enhancing data collection efficiency, Donoho proposed compressed sensing (CS) [10]. It enables the capture of multiple time frames within a single compressed frame, thereby reducing data processing time and cost. This approach significantly enhances the quality of microscope imaging, lensless imaging, video snapshots, and holographic imaging [11–13]. Building upon temporal CS theory, an algorithm known as temporal compressive coherent diffraction imaging (TC-CDI) was introduced by Chen et al. [14]. The TC-CDI system integrates CS technology in the temporal domain to capture multi-frame images at a rapid pace and with high spatial resolution. These images are modulated by a digital micromirror device (DMD) and compressed into a single frame in the temporal domain. The TC-CDI system enables the reconstruction of up to 8 frames from a single snapshot measurement, thereby enhancing the ability of the camera to detect and track fast-moving small targets with heightened sensitivity. While TC-CDI can achieve multi-frame dynamic target recovery, its spatial image recovery accuracy is limited, necessitating the incorporation of prior information from both the frequency domain and the spatial domain to enhance reconstruction quality.
In recent years, deep learning has experienced rapid advancement and has been successfully employed in CS, demonstrating its significant potential in the field of optical imaging and introducing innovative concepts to optical imaging research [15–17]. Recently, a diffusion model with powerful generative capabilities based on stochastic differential equations (SDE) was proposed [18], enabling prior information extraction and high-resolution sample generation. This development has notably enhanced the quality of image editing, medical imaging, and lensless imaging [19–22]. However, the traditional diffusion model necessitates multi-step iteration to achieve high-quality sample generation, resulting in substantial computational overhead and long processing times. To expedite the conventional diffusion model, Luo et al. introduced image restoration with mean-reverting stochastic differential equations (IR-SDE) [23]. It diffuses the image only to a lower-quality state rather than to pure noise, effectively creating an intermediate state that combines a low-quality (LQ) image with Gaussian noise.
To improve the image quality of CDI under high temporal frame rates with compression ratios, an algorithm termed as dual-domain mean-reverting diffusion model-enhanced temporal compressive coherent diffraction imaging (DMDTC) has been introduced. DMDTC acquires prior information of the spatial domain and frequency domain through sample learning. During the reconstruction process, it utilizes frequency prior information to recover the missing data in the frequency domain, thereby improving the accuracy of the subsequent phase retrieval. Additionally, DMDTC incorporates learned spatial prior information to further denoise and restore the image in the spatial domain, demonstrating significant potential for high-speed frame acquisition and high-quality image reconstruction.
The remainder of this paper is organized as follows. Section 2 provides an overview of the foundational principles of coherent diffraction imaging and the fundamentals of IR-SDE. Section 3 introduces the implementation of DMDTC and explains the training and reconstruction procedures in detail. Section 4 demonstrates the reconstruction performance of this approach through simulative and experimental validation. Section 5 discusses the results under different compression ratios and presents an ablation study. Section 6 concludes this work and summarizes the proposed approach.
2. Preliminary
2.1 Temporal compressive coherent diffraction imaging
The fundamental principle of TC-CDI is illustrated in Fig. 1. After the dynamic sample $O(x^{\prime},t)$ at time t is illuminated by the monochromatic surface light source ${U_0}(x^{\prime})$, the target in the field of view (FOV) is transformed into the frequency domain by the Fourier lens, where $x^{\prime}$ is the object plane coordinate. Subsequently, the light waves are modulated by the digital micromirror device (DMD), whose modulation function can be expressed as $M(x,t)$, to achieve compressed sampling of multi-frame temporal information. The intensity information $I(x)$ is captured by the camera as a single snapshot, where x is the frequency domain coordinate.
According to the principle of Fraunhofer diffraction, $I(x)$ can be modeled as:
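The measurement equation itself is not reproduced in this text; a plausible reconstruction from the surrounding definitions, with $\mathcal{F}$ denoting the Fourier transform performed by the lens, is:

$$I(x) = \int_0^{\Delta t} {\left| {M(x,t)\,\mathcal{F}\{ {U_0}(x^{\prime})O(x^{\prime},t)\} } \right|^2}\,dt$$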
where $\varDelta t$ represents the exposure time of the camera. The frequency domain light field in the DMD plane can be represented as:

2.2 Mean-reverting SDE
Brownian motion is a stochastic process that introduces continuous, random noise into a system over time. The diffusion model simulates Brownian motion, gradually adding Gaussian noise to an image until it is reduced to pure noise, and then learns to invert this process. This characteristic of random noise makes the diffusion model a valuable tool for describing various natural phenomena, such as particle movement and fluctuations in financial markets. The transformation of images from high quality to Gaussian noise can be viewed as a continuous addition of noise. This can be characterized as a stochastic process, where the high-resolution image is considered a deterministic signal and the noise is treated as a random perturbation. To replicate this process, it is modeled as the solution to an SDE:
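In standard notation, this forward SDE takes the form

$$dx = f(x,t)\,dt + g(t)\,dw$$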
where $dw$ is Gaussian noise, the drift term $f(x,t)$ describes the interaction between the image signal and the noise, determining the direction and intensity of the noise's influence on the image signal, and the dispersion term $g(t)$ describes the diffusion speed of the noise and determines the degree of its impact on the image signal. As noted above, the traditional diffusion model requires multi-step iteration to achieve high-quality sample generation, resulting in substantial computational overhead and long processing times. Luo et al. [23] proposed image restoration with mean-reverting stochastic differential equations (IR-SDE), which offers a way to accelerate the diffusion model.
IR-SDE adds a parameter $\mu $ to the original model, indicating that the diffusion process is from the high-quality image ${x_0}$ to the low-quality target $\mu $, then the Eq. (6) can be converted as:
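The IR-SDE forward process, as given in Ref. [23], reads

$$dx = {\theta _t}(\mu - x)\,dt + {\sigma _t}\,dw$$

where ${\theta _t}$ controls the speed of mean reversion and ${\sigma _t}$ the noise level.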
After a certain number of steps, the entire SDE will flow to $\mu $ and stable Gaussian noise $\varepsilon $. Let ${{\sigma _t^2} / {{\theta _t}}} = 2{\lambda ^2}$, then given any starting state $x(s)$ at time $s < t$, Eq. (7) can be solved as:
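Following Ref. [23], the transition is Gaussian in closed form (a reconstruction of the omitted equation):

$$x(t)\mid x(s) \sim \mathcal{N}\left( \mu + (x(s) - \mu ){e^{ - {{\bar \theta }_{s:t}}}},\; {\lambda ^2}\left(1 - {e^{ - 2{{\bar \theta }_{s:t}}}}\right) \right), \quad {\bar \theta _{s:t}} = \int_s^t {\theta _z}\,dz$$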
An overview is presented in Fig. 2, and the diffusion process can be inverted from $\mu $ to high-quality images by using numerical techniques:
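The reverse-time dynamics take the standard score-based form

$$dx = \left[ {\theta _t}(\mu - x) - \sigma _t^2{\nabla _x}\log {p_t}(x) \right]dt + {\sigma _t}\,d\hat w$$

where $\hat w$ is a reverse-time Wiener process.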
where the only uncertain component is the score ${\nabla _x}\log {p_t}(x)$ of the temporal data distribution, which can be obtained during the training process. As long as the network obtains the noise level ${\varepsilon _t}$ at each time node from ${x_0}$ to $\mu $, it can determine the score function of the data distribution at a specific moment. To achieve this goal, the instantaneous noise is estimated with the conditional time-correlation network ${\widetilde \varepsilon _\theta }({x_i},\mu ,t)$. However, employing neural networks directly to learn instantaneous noise is a challenging task that can result in unstable training. To solve this problem, maximum likelihood estimation is used to compute the optimal path for image restoration. This not only helps the model estimate the optimal direction for image restoration but also produces an output that closely resembles the real image. The training process can be expressed as follows:
Subsequently, the noise network ${\widetilde \varepsilon _\theta }({x_i},\mu ,t)$ is optimized so that the mean-reverting SDE is aligned with the optimal trajectory as follows:
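As a concrete illustration of how training pairs are generated from the closed-form transition of the mean-reverting SDE, the following NumPy sketch samples a noisy state between a high-quality image ${x_0}$ and its low-quality target $\mu$; the noise network is then trained to predict the returned noise. The schedules and names here are our own illustrative choices, not the authors' code:

```python
import numpy as np

# Illustrative sketch (not the authors' code): closed-form forward
# transition of a mean-reverting SDE, used to generate training pairs.
rng = np.random.default_rng(0)
T, lam = 100, 0.1                      # number of steps, stationary scale lambda
theta = np.linspace(1e-3, 1e-2, T)     # mean-reversion speed schedule (assumed)
theta_bar = np.cumsum(theta)           # cumulative \bar{theta}_{0:t}

def forward_marginal(x0, mu, t):
    """Sample x_t ~ N(m_t, v_t) given the clean image x0 and LQ target mu."""
    m = mu + (x0 - mu) * np.exp(-theta_bar[t])        # mean decays toward mu
    v = lam**2 * (1.0 - np.exp(-2.0 * theta_bar[t]))  # variance grows to lam^2
    eps = rng.standard_normal(x0.shape)
    return m + np.sqrt(v) * eps, eps   # the network learns to predict eps
```

In a full training loop, `eps` would be regressed by the conditional network ${\widetilde \varepsilon _\theta }({x_i},\mu ,t)$ at a randomly drawn step $t$.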
3. Method
To achieve high-quality reconstruction of TC-CDI at a high compression ratio, this paper proposes a method termed dual-domain mean-reverting diffusion model-enhanced temporal compressive coherent diffraction imaging (DMDTC), the main procedure of which is presented in Fig. 3.
To obtain prior information on the frequency domain and spatial domain image distributions, two score networks are trained. In the frequency domain score network, the fully sampled spectrogram $U_0^{(F)}$ and the sparse spectrogram ${\mu ^{(F)}}$ sampled by random masks are input into the network in pairs. In the neural network, Gaussian noise is gradually introduced into $U_0^{(F)}$ so that it approaches ${\mu ^{(F)}}$. This process can be expressed as:
In the spatial domain score network, the fully sampled spatial domain image $U_0^{(S)}$ and the sparse spatial image ${\mu ^{(S)}}$, under-sampled by random masks and artifact blocks, are put into the network in pairs. Then Eq. (7) can be remodeled as:
By solving Eq. (14) and Eq. (15), the instantaneous noise distribution of the process can be estimated as:
In the reconstruction stage, the time-domain compressed spectrogram is restored to $U_r^{(F)}$ using the method described in Section 2.1, where $r = 1,2,3, \cdots ,R$ is the index of frequency domain frames. These reconstructed frequency-domain frames $U_r^{(F)}$, after being masked via the Hadamard product, are individually put into the trained frequency domain score network. Using the score distribution $S_\theta ^{(F)}$ obtained during the training phase, the SDE is iteratively inverted. The process can be remodeled as:
Since the measured spectrogram contains only intensity information, the phase information of the image is recovered and the spatial image is reconstructed using the hybrid input-output algorithm:
In each iteration, Eq. (19) is used to maintain fidelity, where ${F_i}$ and ${f_i}$ respectively represent the frequency domain signal and the spatial domain image at the $i$-th iteration, and F and ${F^{ - 1}}$ represent the Fourier transform and inverse Fourier transform.
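The hybrid input-output loop described above can be sketched as follows. This is a minimal NumPy implementation of Fienup-style HIO with a support and non-negativity constraint; the parameter names and defaults are illustrative, not the authors' exact settings:

```python
import numpy as np

def hio(magnitude, support, n_iter=200, beta=0.9, seed=0):
    """Hybrid input-output phase retrieval from a Fourier magnitude.

    magnitude : measured Fourier-domain magnitude (2D array)
    support   : boolean mask of the object-domain support
    """
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0.0, 2.0 * np.pi, magnitude.shape)  # random start
    g = np.real(np.fft.ifft2(magnitude * np.exp(1j * phase)))
    for _ in range(n_iter):
        G = np.fft.fft2(g)
        # Fourier-domain fidelity: keep the phase, impose the measured magnitude
        G = magnitude * np.exp(1j * np.angle(G))
        g_prime = np.real(np.fft.ifft2(G))
        # Object-domain step: accept g' where constraints hold,
        # apply the HIO feedback g - beta*g' where they are violated
        violate = (~support) | (g_prime < 0)
        g = np.where(violate, g - beta * g_prime, g_prime)
    return g
```

In DMDTC this loop alternates with the learned frequency- and spatial-domain priors rather than running in isolation.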
To further improve the reconstruction quality, spatial prior information is incorporated for constraints. The reconstructed spatial domain image $U_r^{(S)}$ is used as the mean input for the IR-SDE network. The process can be remodeled as:
After iteratively applying reverse SDE, the high-quality spatial images $\hat{U}_r^{(S)}$ can be generated by the network. Algorithm 1 states the pseudocode of DMDTC.
4. Experiments
4.1 Data specification and parameter selection
When training DMDTC, the MNIST dataset is utilized, which contains a diverse range of handwritten numeral styles. A total of 9,000 binarized images from the MNIST dataset are randomly selected without repetition. Each 28 × 28 image in the MNIST dataset is upscaled to 40 × 40 and embedded in the center of a black background sized 512 × 512 for spatial domain training. This dataset serves as the GT input for the spatial domain training network, after which the GT is under-sampled using random artifact blocks and random masks to serve as the LQ input. For training the frequency domain network, the dataset is converted to the frequency domain by Fourier transform, and the frequency domain data is modeled to simulate the detector. This dataset serves as the GT input for the frequency domain training network, after which the GT is under-sampled using random masks to serve as the LQ input. The number of discretization steps for the mean-reverting SDE, T, is set to 100, the noise scale ${\sigma _t}$ is set to 25, the number of frames R is set to 20, and the number of DNN-HIO steps L is set to 60.
The mask values used for sampling in frequency and spatial domain training are {0, 0.7}. Random artifact blocks with edge sizes ranging from 0 to 3 and gray values uniformly distributed from 0 to 200 are utilized to under-sample images in spatial domain training. Both networks are trained using the Adam optimizer with a learning rate of 0.0001. During the image reconstruction phase, 100 iterations are performed for both the frequency and spatial domain networks. This research is implemented on an NVIDIA GeForce RTX 3090 Ti graphics processor with 24 GB memory, and all experiments were conducted in PyTorch.
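The data preparation described above can be sketched as follows. This is a NumPy stand-in for the actual pipeline; the interpolation method and the exact mask statistics are our assumptions:

```python
import numpy as np

def make_pair(digit, canvas=512, target=40, seed=0):
    """Build a GT spatial image, its spectrum, and a masked LQ spectrum."""
    rng = np.random.default_rng(seed)
    # nearest-neighbour upscale 28 -> 40 (interpolation method is assumed)
    idx = (np.arange(target) * digit.shape[0] / target).astype(int)
    up = digit[np.ix_(idx, idx)]
    # embed the digit in the centre of a black 512 x 512 background
    gt = np.zeros((canvas, canvas))
    s = (canvas - target) // 2
    gt[s:s + target, s:s + target] = up
    # frequency-domain GT: magnitude spectrum, as recorded by the detector
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gt)))
    # LQ input: random mask with values {0, 0.7}, as stated in the text
    mask = rng.choice([0.0, 0.7], size=(canvas, canvas))
    return gt, spec, spec * mask
```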
4.2 Simulative validation
Simulative validation is conducted to evaluate the effectiveness of the proposed DMDTC in coherent diffraction imaging reconstruction.
Simulative experiments are conducted using an additional 20 images selected from the MNIST dataset, distinct from the training dataset, as imaging targets. The Fourier transform is applied to these 20 images, followed by a Hadamard product with masks to under-sample them. The frequency domain data is modeled to simulate the detector. The 20 images are then compressed into a single-frame measurement for reconstruction. To demonstrate the superiority of DMDTC, an additional TC-CDI experiment is conducted for comparison. The ground truth (GT) is compared with the reconstructed images obtained through the TC-CDI and DMDTC methods, as illustrated in Fig. 4. To facilitate a clearer observation of the experimental results, the 512 × 512 images are cropped to 128 × 128 around the region of interest.
The proposed DMDTC reconstruction closely resembles the GT, displaying sharp edges and minimal noise. In contrast, the reconstructed results of TC-CDI exhibit low brightness, blurred outlines, and noticeable artifacts. As depicted in Fig. 4(d), the DMDTC plots exhibit higher similarity to the GT plots and less noise.
To quantitatively assess the experiments in this study, peak signal-to-noise ratio (PSNR) and structural similarity index measurement (SSIM) were employed as evaluation metrics.
The PSNR is a metric utilized to gauge the ratio between the maximum possible signal power and the power of the corrupting noise. The calculation formula for PSNR is as follows:
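The standard definition is $\mathrm{PSNR} = 10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})$, which can be written as a short NumPy sketch:

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB: 10*log10(peak^2 / MSE)."""
    mse = np.mean((np.asarray(ref) - np.asarray(test)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```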
The SSIM is an index that measures the structural similarity of two images.
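The standard SSIM formula compares luminance, contrast, and structure between the two images. A simplified single-window sketch is shown below; library implementations such as scikit-image instead evaluate the same expression over a sliding window and average the result:

```python
import numpy as np

def ssim_global(x, y, peak=1.0):
    """Global (single-window) SSIM with the usual stabilizing constants."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```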
Quantitative analysis results of the simulative validation are presented in Table 1, with the values obtained by averaging over the 20 reconstructed images. In comparison, the average PSNR of DMDTC is 3.12 dB higher than that of TC-CDI, and the SSIM value is as high as 0.9870, demonstrating the effectiveness of the proposed DMDTC in coherent diffraction imaging reconstruction.
4.3 Generalization validation
To validate the efficacy of the proposed DMDTC across diverse datasets and its capability in simulating the process of actual camera sampling, two sets of generalization experiments are conducted.
The first set of experiments involved more intricate images sized 28 × 28, sourced from the Quick Draw dataset. These images were directly superimposed onto a black background sized 512 × 512, transformed into frequency domain representations via the Fourier transform, and then under-sampled by a Hadamard product with masks. After this, the 20 frequency domain images were compressed into a single measurement for reconstruction. The reconstruction results are depicted in Fig. 5. To improve the visibility and interpretability of the experimental results, the original 512 × 512 images were cropped to 128 × 128 around the central region of interest.
In the case of the more intricate pattern with a size of 28 × 28, the proposed DMDTC method demonstrates superior reconstruction performance. The reconstruction outcomes of DMDTC show enhanced object contours, increased level of details, more uniform pixel distribution, and reduced noise and artifacts. Conversely, the reconstruction results of the traditional TC-CDI method exhibit unclear contours, higher levels of noise and artifacts, and uneven pixel distribution.
In terms of quantitative analysis, the PSNR and SSIM of the reconstruction results obtained by the proposed DMDTC surpass those obtained using the conventional TC-CDI method. The highest PSNR and SSIM values achieved are 32.89 dB and 0.9947, respectively. Moreover, the SSIM values of the reconstruction results exceed 0.99, indicating a notable enhancement in image quality and fidelity compared to the traditional TC-CDI approach.
The second experiment involved reconstructing a dynamic target scan to simulate real camera functionality. A binary representation of the characters “Nanchang” was created using the Times New Roman font for reconstruction, as illustrated in Fig. 6.
The red box sized 40 × 50 in Fig. 6 represents the FOV $P(x^{\prime})$; for each frame, it moved 11 pixels to the right across the target object, generating 20 frames in total. Subsequently, the Fourier transform is applied to the 20 images, and the under-sampled data is obtained through a Hadamard product with masks. These 20 under-sampled frequency domain images were then combined into a single snapshot for reconstruction. The reconstruction results and normalized residuals obtained through TC-CDI and DMDTC are depicted in Fig. 7.
As depicted in Fig. 7(b1) and (c1), the artifacts present in the reconstructed images generated by DMDTC are minimal, and the pixel intensity closely approximates that of the ground truth (GT). Conversely, the artifacts in the reconstructed images produced by TC-CDI are conspicuous, with residuals distributed extensively throughout the reconstructed results, leading to an uneven pixel distribution in the reconstructed images. This observation signifies a substantial enhancement in the quality of image reconstruction achieved by DMDTC. The comparison of residuals reveals a stark contrast between TC-CDI and GT, as depicted in the first row of Fig. 7(b2). Residuals are evident across nearly the entire target, with some areas exhibiting residuals reaching a value of 1. Conversely, as illustrated in Fig. 7(c2), the residual comparison between DMDTC and GT approaches almost 0, signifying a substantial enhancement in the reconstruction quality achieved by DMDTC.
Quantitative analysis results of the generalization validation are presented in Table 2, with the values obtained by averaging over the 20 reconstructed images. The proposed DMDTC has a positive effect on the reconstruction of dynamic frame objects across different datasets, as evidenced by the increase in average PSNR and SSIM values, demonstrating the strong generalization of the proposed DMDTC.
4.4 Experimental validation
Experimental validation is conducted to assess the performance of the proposed DMDTC in practical scenarios. This experiment utilizes real experimental data provided by Chen et al. [14]. A laser with a central wavelength of 780 nm and a spectral linewidth of 50 kHz is utilized as the light source. The measurement results encoded through the DMD (1024 × 768 pixels, pixel pitch of 13.68 µm) are captured by a camera (1024 × 1280 pixels, pixel pitch of 4.8 µm). The imaging target is the “westlake” characters etched on a steel sheet, and the measurement data is a frequency domain image compressed from 20 frames through sparse sampling. The reconstruction results are illustrated in Fig. 8. To facilitate a more precise assessment of the reconstruction quality, the 512 × 512 images are cropped to 128 × 128 around the region of interest.
As depicted in the red box within Fig. 8, it is evident that the TC-CDI reconstruction image suffers from significant loss of detail, leading to fractures in the contiguous regions and an uneven distribution of pixel intensities throughout the image. Notably, in column 1, the letter “w” experiences substantial information loss under TC-CDI reconstruction, resulting in an uneven distribution of pixel values in the reconstructed image. In contrast, images reconstructed using DMDTC demonstrate heightened sharpness, a more uniform distribution of pixel intensities, and a reduced presence of artifacts. Specifically, in the fourth column, the reconstructed image under TC-CDI exhibits conspicuous fractures. The proposed DMDTC leverages prior information from both the frequency domain and spatial domain to rectify these deficiencies and enhance the quality of reconstruction.
5. Discussion
5.1 Effect of compression ratio on reconstruction quality
To discuss the impact of the number of compressed frames R on the reconstruction results, this work examines the effectiveness and robustness of image reconstruction using DMDTC under different compression ratios C = 1/R, for C = 1/10, 1/12, 1/14, 1/16, 1/18, and 1/20. In this experiment, 20 images are randomly selected from the MNIST dataset, transformed into frequency domain images, and under-sampled by random masks. When the compression ratio is 1/R, the first R of the 20 images are compressed in order and input into the model for reconstruction. The results for the different compression ratios are presented in Fig. 9. To improve the visibility and interpretability of the experimental results, the original 512 × 512 images were cropped to 128 × 128 around the central region of interest.
For the same image, it is evident that the image reconstructed by TC-CDI exhibits good quality at the compression ratio C = 1/10. However, when the compression ratio decreases to C = 1/20, the image reconstructed by TC-CDI is markedly distorted in target shape. This disparity in quality can be ascribed to the heightened loss of pertinent data when an excessive amount of information is compressed into a single snapshot, leading to deterioration of the reconstructed image. Nonetheless, across varying compression ratios, the reconstructed image of the proposed DMDTC demonstrates minimal deviation from the GT, signifying the stability and robustness of the proposed DMDTC.
By conducting a quantitative analysis of PSNR and SSIM as depicted in Fig. 9, it becomes apparent that the SSIM value of the reconstructed image using the proposed DMDTC approach closely approximates 0.99, while the PSNR has been enhanced and consistently maintained at a high level, which prove the effectiveness of DMDTC under different compressive ratios.
5.2 Ablation experiment
Ablation experiments are conducted to discuss the effects of frequency domain prior information, spatial prior information and dual-domain prior information on the reconstruction results. The “Nanchang” target used in Section 4.3 is employed for reconstruction, and two additional sets of experiments were conducted.
The reconstruction results are depicted in Fig. 10. The first column showcases the reconstructed outputs obtained through TC-CDI. The second column demonstrates the reconstruction outcomes of denoising and image restoration achieved by integrating TC-CDI with the spatial domain mean-reverting diffusion model. The third column presents the reconstruction results of information filling on frequency domain images accomplished by integrating TC-CDI with the frequency domain mean-reverting diffusion model. Lastly, the fourth column exhibits the reconstruction results of DMDTC.
The results above indicate that utilizing the mean-reverting diffusion model to assimilate prior information in either the frequency domain or the spatial domain significantly enhances the image reconstruction quality. Solely employing the spatial domain mean-reverting diffusion model facilitates the removal of certain artifacts, but it fails to rectify losses in the frequency domain, exemplified by the redundant connection in the letter “n” in Fig. 10(c) and the absent connection in the letter “g” in Fig. 10(c). Exclusively employing the frequency domain mean-reverting diffusion model effectively mitigates image loss within the frequency domain such as improved shape of the reconstructed images, but it does not entirely alleviate artifact and intensity issues. As illustrated in Fig. 10(d), the pixel intensity of the letters “n” and “g” markedly deviates from that of the GT, and the artifact in the letter “g” is conspicuous. As depicted in Fig. 10(e), the combined utilization of the dual-domain mean-reverting diffusion model exhibits minimal artifacts and pixel intensity that closely aligns with that of the GT.
In the quantitative analysis of PSNR and SSIM presented in Fig. 10, it is evident that the exclusive utilization of either the frequency domain or spatial mean-reverting diffusion model may yield only marginal improvements in the quality of reconstructed images, and in certain instances, may even result in a deterioration. Conversely, the adoption of the proposed DMDTC markedly elevates both the PSNR and SSIM of the reconstructed images. Notably, the dual-domain mean-reverting diffusion model exhibits superior PSNR and SSIM values compared to the single-domain diffusion model.
6. Conclusion
This study proposes temporal compressive coherent diffraction imaging combined with a dual-domain mean-reverting diffusion model, termed DMDTC. By utilizing extracted prior information on the data distribution in both the frequency and spatial domains of the images, the proposed DMDTC significantly enhances the quality of image reconstruction in comparison to traditional TC-CDI. The results indicate that the images reconstructed using DMDTC exhibit fewer artifacts, and their shapes and pixel intensities are more similar to the GT. In simulative validation, both PSNR and SSIM metrics show significant improvement; the highest PSNR and SSIM are 34.41 dB and 0.9964, respectively. The proposed DMDTC also performs excellently in generalization validation, which proves that it can achieve effective recovery of targets in different datasets. The reconstruction results on actual measurement data indicate that the images reconstructed using DMDTC exhibit superior shape and uniform pixel distribution. Compared to the single-domain mean-reverting method, the dual-domain mean-reverting diffusion model exhibits good modeling performance and accurately captures the data distribution, and it is anticipated to be widely utilized in optical and medical imaging. Expanding the proposed DMDTC to other areas, such as static machined non-biological samples, can be achieved by training the model on datasets specific to those fields, allowing image reconstruction solutions to be tailored to the characteristics of the data in those domains. In future work, investigating the imaging of colored objects using temporal compressive coherent diffraction imaging would be a highly significant topic. Additionally, due to the high computational costs, utilizing diffusion models for image reconstruction still presents challenges.
It would be worthwhile to consider incorporating acceleration techniques, such as two-step distillation, in future research to reduce the sampling steps of diffusion models during the image reconstruction process.
Funding
National Natural Science Foundation of China (62105138, 62122033); Nanchang University 2023 College Students Innovation and Entrepreneurship Training (2023CX203).
Disclosures
The authors declare no conflicts of interest.
Data availability
Data underlying the results presented in this paper are available in Ref. [24].
References
1. J. Miao, P. Charalambous, J. Kirz, et al., “Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens,” Nature 400(6742), 342–344 (1999). [CrossRef]
2. M. A. Pfeifer, G. J. Williams, I. A. Vartanyants, et al., “Three-dimensional mapping of a deformation field inside a nanocrystal,” Nature 442(7098), 63–66 (2006). [CrossRef]
3. D. Shapiro, P. Thibault, T. Beetz, et al., “Biological imaging by soft x-ray diffraction microscopy,” Proc. Natl. Acad. Sci. U.S.A. 102(43), 15343–15346 (2005). [CrossRef]
4. G. Popescu, T. Ikeda, R. R. Dasari, et al., “Diffraction phase microscopy for quantifying cell structure and dynamics,” Opt. Lett. 31(6), 775–777 (2006). [CrossRef]
5. P. Marquet, B. Rappaz, P. J. Magistretti, et al., “Digital holographic microscopy: a noninvasive contrast imaging technique allowing quantitative visualization of living cells with subwavelength axial accuracy,” Opt. Lett. 30(5), 468–470 (2005). [CrossRef]
6. Ç Işıl, F. S. Oktem, and A. Koç, “Deep iterative reconstruction for phase retrieval,” Appl. Opt. 58(20), 5422–5431 (2019). [CrossRef]
7. X. Huang, J. Nelson, J. Steinbrener, et al., “Incorrect support and missing center tolerances of phasing algorithms,” Opt. Express 18(25), 26441–26449 (2010). [CrossRef]
8. R. Horisaki, R. Egami, and J. Tanida, “Single-shot phase imaging with randomized light (SPIRaL),” Opt. Express 24(4), 3765–3773 (2016). [CrossRef]
9. F. Zhang, B. Chen, G. R. Morrison, et al., “Phase retrieval by coherent modulation imaging,” Nat. Commun. 7(1), 13367 (2016). [CrossRef]
10. D. L. Donoho, “Compressed sensing,” IEEE Trans. Inf. Theory 52(4), 1289–1306 (2006). [CrossRef]
11. Y. Dou, M. Cao, X. Wang, et al., “Coded aperture temporal compressive digital holographic microscopy,” Opt. Lett. 48(20), 5427–5430 (2023). [CrossRef]
12. L. Wang, M. Cao, Y. Zhong, et al., “Spatial-temporal transformer for video snapshot compressive imaging,” IEEE Trans. Pattern Anal. Mach. Intell. 45(7), 1–18 (2022). [CrossRef]
13. Y. He, Y. Yao, D. Qi, et al., “Temporal compressive super-resolution microscopy at a frame rate of 1200 frames per second and spatial resolution of 100 nm,” Adv. Photon. 5(02), 026003 (2023). [CrossRef]
14. Z. Chen, S. Zheng, Z. Tong, et al., “Physics-driven deep learning enables temporal compressive coherent diffraction imaging,” Optica 9(6), 677–680 (2022). [CrossRef]
15. G. Montavon, W. Samek, and K. R. Müller, “Methods for interpreting and understanding deep neural networks,” Digit. Signal Process. 73, 1–15 (2018). [CrossRef]
16. G. Barbastathis, A. Ozcan, and G. Situ, “On the use of deep learning for computational imaging,” Optica 6(8), 921–943 (2019). [CrossRef]
17. X. Yuan, D. J. Brady, and A. K. Katsaggelos, “Snapshot compressive imaging: Theory, algorithms, and applications,” IEEE Signal Process. Mag. 38(2), 65–88 (2021). [CrossRef]
18. Y. Song and S. Ermon, “Improved techniques for training score-based generative models,” Advances in neural information processing systems 33, 12438–12448 (2020).
19. H. Peng, C. Jiang, J. Cheng, et al., “One-shot Generative Prior in Hankel-k-space for Parallel Imaging Reconstruction,” IEEE Trans. Med. Imaging 42(11), 3420–3435 (2023). [CrossRef]
20. M. Daniels, T. Maunu, and P. Hand, “Score-based generative neural networks for large-scale optimal transport,” Advances in neural information processing systems 34, 12955–12965 (2021).
21. W. Wan, H. Ma, Z. Mei, et al., “Multi-phase FZA lensless imaging via diffusion model,” Opt. Express 31(12), 20595–20615 (2023). [CrossRef]
22. X. Song, G. Wang, W. Zhong, et al., “Sparse-view reconstruction for photoacoustic tomography combining diffusion model with model-based iteration,” Photoacoustics 33, 100558 (2023). [CrossRef]
23. Z. Luo, F. K. Gustafsson, Z. Zhao, et al., “Image restoration with mean-reverting stochastic differential equations,” arXiv, arXiv:2301.11699 (2023). [CrossRef]
24. H. Li, “Dual-domain Mean-reverting Diffusion Model-enhanced Temporal Compressive Coherent Diffraction Imaging,” GitHub (2024) [accessed 9 Apr. 2024], https://github.com/yqx7150/DMDTC.