
Full-color optically-sectioned imaging by wide-field microscopy via deep-learning

Open Access

Abstract

Wide-field microscopy (WFM) is broadly used in experimental studies of biological specimens. However, the mixing of out-of-focus signals with the in-focus plane reduces the signal-to-noise ratio (SNR) and axial resolution of the image. Structured illumination microscopy (SIM) with white-light illumination has therefore been used to obtain full-color 3D images, capturing high-SNR optically-sectioned images with improved axial resolution and natural specimen colors. Nevertheless, this full-color SIM (FC-SIM) carries a heavy data-acquisition burden for 3D-image reconstruction because of its shortened depth-of-field, especially for thick samples such as insects and for large-scale 3D imaging using stitching techniques. In this paper, we propose a deep-learning-based method for full-color WFM, i.e., FC-WFM-Deep, which can reconstruct high-quality full-color 3D images with an extended optical-sectioning capability directly from the FC-WFM z-stack data. Case studies of different specimens with a specific imaging system are used to illustrate this method. The image quality achievable with FC-WFM-Deep is comparable to that of FC-SIM in terms of 3D information and spatial resolution, while the reconstruction data size is 21-fold smaller and the in-focus depth is doubled. This technique significantly reduces the 3D data-acquisition requirements without losing detail and improves the 3D imaging speed by extracting the optical section over the full depth-of-field. This cost-effective and convenient method offers a promising tool for observing high-precision color 3D spatial distributions of biological samples.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Wide-field (WF) illumination and detection is a basic optical imaging mode for microscopic samples, in which the entire specimen of interest is exposed to a light source and the resulting image is either observed directly or captured with a camera [1,2]. WF microscopy (WFM) is usually preferred by biologists since it is a low-cost imaging modality for biological specimens with a fast imaging speed, and it causes minimal photo-damage to living specimens in comparison with confocal microscopy [3].

However, due to a restriction in the optical imaging mode, WFM collects the light emitted or scattered by objects in the focal plane as well as the light from all illuminated layers of the sample above and below the focal plane. As a result, the acquired images suffer from degraded contrast between the foreground and background voxels due to the out-of-focus light swamping the in-focus information, resulting in a low signal-to-noise ratio (SNR) and poor axial resolution [4]. Although deconvolution techniques [5] are frequently utilized to reverse the out-of-focus blur and recover the images with improved contrast and resolution, the processed result has some limitations and is dependent on detailed parameter inputs [3].

In order to acquire a clear focal-plane image from a thick sample by suppressing the background information or activating only in-focus signals, laser scanning confocal microscopy (LSCM) has gained popularity in the scientific and industrial communities [6]. Point illumination and a spatial pinhole are used to filter out-of-focus light in specimens, which greatly increases the optical axial resolution and contrast. However, LSCM can often be time-consuming for 3D imaging due to its point-by-point scanning mode and introduces the risk of photo-damage to biological samples due to its high laser intensity on a focused spot [7].

An alternative method to extract in-focus information and remove the out-of-focus background is structured illumination microscopy (SIM), which illuminates the sample using structured light with a sinusoidal intensity distribution in one direction, rather than traditional uniform light [8]. This method offers benefits such as fast imaging speed, low photo-damage and good optical compatibility. Many applications have used SIM for time-lapse 3D imaging of living tissues and cellular structures [9].

Currently, SIM for super-resolution (SIM-SR) [8] and for optical sectioning (SIM-OS) [10] are the most popular working modes. Specifically, SIM-SR was proposed to improve the spatial resolution of microscopy, whereas SIM-OS was developed to eliminate the out-of-focus background that is encountered in WFM. Both methods share the same optical configuration but use different data-processing algorithms. In this paper, we focus only on the SIM-OS mode. The basic principle of SIM-OS is to extract the in-focus information and reject the out-of-focus background by using structured illumination, usually adopting a three-step phase-shifting method [7]. Since SIM-OS offers 3D slicing ability comparable to confocal microscopy but with much faster imaging speeds, it has become one of the most popular 3D optical microscopy methods [7].

Although many of these SIM techniques are promising, most of them ignore a key feature for sample identification, i.e., color, which is important in several areas of biological research [11,12]. Moreover, the variability of color patterns often causes confusion because individuals from the same species can exhibit different colors. For these two reasons, color (or spectrum) is of great interest to biologists. Therefore, we previously proposed a full-color SIM (FC-SIM) approach [9,13,14] based on the conventional SIM-OS method. A digital micro-mirror device (DMD)-based FC-SIM was built, which uses fringe projection with low-coherence LED illumination. Full-color 3D imaging was implemented using a color complementary metal oxide semiconductor (CMOS) camera and white-light illumination. This method has been shown to offer several performance advantages, including high resolution, large-scale imaging potential, 3D reconstruction, natural colors and quantitative analysis capabilities [9]. However, similar to other SIM-OS approaches, the FC-SIM method has a major drawback: it requires a large amount of data acquisition when the objects being imaged are thick and large-scale, because of the limited in-focus information in each optical section. Compared with FC-WFM, the in-focus depth in FC-SIM is much shorter than the depth-of-field (DOF) of the corresponding WF images.

Recently, many powerful deep-learning methods have been developed that offer promising ways of removing the out-of-focus background with satisfactory resolution, especially for WF images with lower noise levels and higher imaging depths [15,16]. For instance, convolutional neural networks (CNNs) have been trained to perform autofocusing and phase recovery simultaneously, improving both the imaging depth and the image reconstruction time [17]. CNNs have also been used to estimate the focus distance by formulating auto-focusing as a classification problem [18]. In parallel to these results, a fully connected Fourier neural network incorporating focal stacks with different focal planes and radially-averaged spectral knowledge has achieved single-shot autofocus [19]. These studies all demonstrate that neural networks can be employed to estimate depth information via deep-learning.

In this work, we propose a full-color optically-sectioned imaging method for a wide-field microscope via deep-learning, named the FC-WFM-Deep method. The final reconstruction result, which we name the OS in DOF, can be acquired directly from the WF data, thereby greatly reducing the data-acquisition load of full-color 3D reconstruction while maintaining satisfactory imaging quality, which is highly practical in imaging applications. The effectiveness of the proposed method is verified on a testing dataset and in 3D-imaging experiments on insects.

The rest of the paper is organized as follows. Section 2 describes the proposed deep-learning-based FC-WFM method, including the background suppression with limited in-focus detection, and the acquisition of the OS in DOF via deep-learning. Section 3 verifies the effectiveness of the proposed method through experimentation with different insects and provides the evaluation results and the corresponding discussion. Section 4 concludes the paper.

2. Methods

2.1 Criterion of the OS in DOF

As shown in Fig. 1(a), the original FC-WFM and FC-SIM can be implemented using a collimated, high-power white-light LED (SILIS-3C, Thorlabs Inc.) as the illumination source. The LED light enters a total internal reflection (TIR) prism before reflecting onto the DMD (V7000, ViALUX GmbH, Germany). The modulated light then passes through the optical projection system, which includes an achromatic collimating lens, a beam-splitter and an objective lens (20x objective, 0.45 numerical aperture, Nikon Inc., Japan), and projects a sinusoidal fringe pattern onto the specimens [14]. The specimens are mounted on an x-y-z motorized translation stage (Attocube GmbH, Germany). A color CMOS camera (80 fps, 2048×2048 pixels, IDS GmbH, Germany) equipped with a tube lens is used to capture the 2D wide-field images. Additionally, timing synchronization of the DMD, image collection and stage movement is performed using customized software written in C++. For each axial plane in the z-scan, three fringe-illuminated raw images with adjacent phase shifts are captured. The volume data for the different axial layers are obtained by moving the specimen to different z positions to acquire the 3D light-intensity distribution of the specimen. Once an entire sample with a series of fields-of-view has been imaged, a large-scale image can also be reconstructed using image stitching.


Fig. 1. (a) Light path diagram for the FC-WFM and FC-SIM microscope. (b) Intensity distribution with respect to the defocused range, which measures the averaged OS strength in the FC-SIM and the WFM intensity.


As only four pixels per period of the binary pattern are used in the fringe projection by the DMD, the commonly-used three-step phase-shifting at intervals of 2π/3 cannot be performed. Therefore, an alternative three-step phase-shifting at intervals of π/2 is adopted, following the same principle [7]. Consequently, for each slice, three sub-images are acquired, denoted S0, Sπ/2, and Sπ, in order.

After acquisition, the three raw images with individual phase shifts are recorded by the color CMOS camera. In contrast with most electronic display products, which form images by combining RGB (Red, Green and Blue) light with varying intensities, the data here is processed in three HSV (Hue, Saturation, and Value) channels, which are handled separately and recombined into a 3D image with full natural color [13]. HSV is based on the principle of color recognition in human vision in terms of hue, lightness and chroma [20], and the imaged data is transformed from RGB color space into HSV color space to obtain the three H, S and V components [9]. For each component, owing to the defocusing of the high-spatial-frequency illumination away from the focal plane, the merged imaging data can be simply decomposed into two parts, i.e., the in-focus component Sin and the out-of-focus component Sout. The required in-focus information Sin,p can therefore be extracted from the captured image triples as follows [7]:

$${\mathbf{S}}_{in,p} = \frac{\sqrt{\left( 2{\mathbf{S}}_{\pi/2,p} - {\mathbf{S}}_{0,p} - {\mathbf{S}}_{\pi,p} \right)^{2} + \left( {\mathbf{S}}_{0,p} - {\mathbf{S}}_{\pi,p} \right)^{2}}}{2},$$
where p = H, S, and V represent the three different components. The sectioned images for the three HSV channels are then recombined and transformed back into the RGB space in order to display the images on devices.
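For illustration, a minimal NumPy sketch of this per-channel demodulation could look as follows; the function name and the use of scikit-image's RGB-to-HSV conversion are our assumptions, not the authors' implementation.

```python
# A minimal sketch of Eq. (1) applied per HSV channel, assuming the three
# pi/2 phase-shifted raw frames are 8-bit RGB arrays of identical size.
import numpy as np
from skimage.color import rgb2hsv, hsv2rgb

def optical_section_full_color(s0_rgb, s_half_rgb, s_pi_rgb):
    """Recover the in-focus image from three pi/2 phase-shifted raw frames."""
    # Transform to HSV so hue, saturation and value are demodulated separately.
    s0, s_half, s_pi = (rgb2hsv(im.astype(np.float64) / 255.0)
                        for im in (s0_rgb, s_half_rgb, s_pi_rgb))
    # Eq. (1): S_in = sqrt((2*S_{pi/2} - S_0 - S_pi)^2 + (S_0 - S_pi)^2) / 2.
    s_in = 0.5 * np.sqrt((2.0 * s_half - s0 - s_pi) ** 2 + (s0 - s_pi) ** 2)
    # Recombine the sectioned H, S, V channels and return to RGB for display.
    return hsv2rgb(np.clip(s_in, 0.0, 1.0))
```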

Although FC-SIM has a relatively faster imaging speed than LSCM, 3D reconstruction requires a huge number of OS images because of the limited in-focus information extracted in each calculation, a property characterized by the OS capacity [7]. Specifically, the intensity distribution I(z,v) along the defocus direction on both sides of the focal plane of a microscope objective can be expressed as [10]:

$${\mathbf I}({z,v} )= 2g(v )\left|{\frac{{{J_1}[{4uv({1 - v} )} ]}}{{4uv({1 - v} )}}} \right|,$$
where u = 8z(π/λ)sin²(α/2), α is the half aperture angle of the objective lens, z is the axial distance from the focal plane and λ is the wavelength. J1(•) represents the first-order Bessel function and g(v) = 1 − 1.38v + 0.0304v² + 0.344v³. The OS capacity in SIM-OS can then be estimated from the full width at half maximum (FWHM) of the I(z,v) curve, defined as the OS strength [7], for a fixed spatial frequency v. In addition, it has been shown that, at a given NA, the optimal spatial frequency v of the illumination pattern is half of the cut-off frequency, which yields the best OS effect. For the system described above, the specific I(z,v) curve was calculated and is shown in Fig. 1(b), where the theoretical average OS strength is approximately 1.83 µm for our system (black arrows in Fig. 1(b)).
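As a hedged numerical sketch (not from the original analysis), Eq. (2) and the corresponding FWHM can be evaluated as below; the wavelength and the normalized frequency v = 0.5 are assumed values chosen to be consistent with the text.

```python
# Numerical evaluation of Eq. (2) and its FWHM (the OS strength).
import numpy as np
from scipy.special import j1

def os_strength_fwhm(wavelength_um=0.55, na=0.45, v=0.5, z_max_um=5.0, n_pts=20001):
    alpha = np.arcsin(na)                       # half aperture angle, n = 1 assumed
    z = np.linspace(-z_max_um, z_max_um, n_pts)
    u = 8.0 * z * (np.pi / wavelength_um) * np.sin(alpha / 2.0) ** 2
    g = 1.0 - 1.38 * v + 0.0304 * v ** 2 + 0.344 * v ** 3
    arg = 4.0 * u * v * (1.0 - v)
    with np.errstate(divide="ignore", invalid="ignore"):
        i_zv = 2.0 * g * np.abs(np.where(arg == 0.0, 0.5, j1(arg) / arg))
    i_zv /= i_zv.max()
    above_half = z[i_zv >= 0.5]                 # central lobe above half maximum
    return above_half.max() - above_half.min()  # FWHM in micrometers

print(f"OS strength ~ {os_strength_fwhm():.2f} um")  # on the order of the ~1.8 um quoted
```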

The FC-WFM image can also be simultaneously obtained with:

$${{\mathbf S}_{wf,p}} = \frac{1}{2}({{{\mathbf S}_{0,p}} + {{\mathbf S}_{\pi ,p}}} ).$$

This shows that the FC-SIM image and the FC-WFM image of the sample can be obtained from the same raw data, which is very convenient for theoretical analysis and comparison. As mentioned previously, the WF image includes both in-focus and defocused information. To provide a quantitative demonstration, the intensity curve of the FC-WFM can be estimated from the experimental data [21]. Ten specific points in a vertical slice were first selected; their intensities at different z-scanning positions were then averaged and fitted to a Gaussian curve, as shown in Fig. 1(b). Additionally, the DOF of FC-WFM can be determined by [22]:

$${d_{\textrm{DOF}}} = \frac{{\lambda n}}{{\textrm{N}{\textrm{A}^2}}} + \frac{n}{{\textrm{M} \cdot \textrm{NA}}}e,$$
where n is the refractive index of the medium, NA is the objective numerical aperture, e is the pixel pitch of the camera and M is the lateral magnification of the objective. The DOF is therefore approximately 3.45 µm for FC-WFM, which matches the FWHM of the blue dashed curve (approximately 3.59 µm, red arrows in Fig. 1(b)). In other words, FC-SIM can significantly remove the out-of-focus background, so that the in-focus depth is theoretically reduced from 3.45 µm to 1.86 µm. Therefore, FC-WFM-Deep, a new WFM method based on a deep-learning network, is proposed; it makes full use of the effective intensity contained within the WF in-focus depth range of around 3.5 µm in order to extract the OS in DOF and reduce the data-acquisition requirements.
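As an illustration, Eq. (4) can be evaluated as below; the wavelength and camera pixel pitch are assumed values, so the result only approximates the ~3.45 µm quoted for the actual system.

```python
# Depth-of-field estimate of Eq. (4) for the wide-field configuration.
def wfm_depth_of_field(wavelength_um=0.55, n=1.0, na=0.45, mag=20.0, pixel_um=5.5):
    return wavelength_um * n / na ** 2 + n * pixel_um / (mag * na)

print(f"d_DOF ~ {wfm_depth_of_field():.2f} um")   # ~3.3 um with these assumed values
```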

In order to provide a further explanation of the concepts for the proposed method, a schematic representation of the data acquisition is shown in Fig. 2, where the optical imaging mode of FC-WFM, FC-SIM and the proposed FC-WFM-Deep are presented. The common approach used to image a whole specimen is to take multiple images corresponding to different object planes along the z-scanning direction. For the FC-WFM, outside of the relatively large in-focus depth shown on its intensity curve in Fig. 2(a), each imaging slice in the z-stack is a complex combination of in-focus and out-of-focus components. In contrast, FC-SIM can select a focused area in order to reconstruct an image projection that is sharp everywhere, although the in-focus depth becomes much shorter than for FC-WFM, which means a larger number of slices in the z-stack would be required to reconstruct the 3D result. As shown in Fig. 2(c), after the network has been trained in the FC-WFM-Deep method, it can be used to reconstruct the OS in DOF directly from the WFM slice data, which contains the information from a series of focal stacks of FC-SIM images. This reconstructed slice will have a comparable rejection effect on out-of-focus components and a data-analysis capability similar to FC-SIM imaging, whereas the in-focus depth is equivalent to that of FC-WFM, which implies the data acquisition requirements would be greatly reduced both by the absence of phase-shifting and the restored in-focus depth.


Fig. 2. Schematics of data acquisition with corresponding in-focus depth and scanning strategies for different optical imaging modes: (a) FC-WFM imaging, (b) FC-SIM imaging and (c) the proposed FC-WFM-Deep imaging, respectively. It should be noted that the FC-WFM images and the FC-SIM images of the sample were obtained from the same raw data to conveniently provide the theoretical analysis and equal comparison.


2.2 Deep-learning-based FC-WFM

Although our FC-WFM-Deep method can predict the OS in DOF from a single WF image, focal stacks at different depths must be collected for training and validation. These focal stacks, processed by the phase-shifting algorithm [7], were collected over a distance of 3.50 µm distributed symmetrically around the true focal plane. The spacing between adjacent focal planes was 0.50 µm, several times finer than the OS strength, enabling the highly precise 3D reconstruction demonstrated in our previous studies [7,13]. Several methods for fusing such focal stacks into a single composite image with the maximum number of visible details have been investigated and proven efficient [23]. Of these, the most popular approach is multi-focus fusion of color microscopy images based on multi-scale decompositions [24]. The basic concept of this approach is to perform a multi-scale transformation on each source image, and then integrate the decomposition coefficients selected by a particular criterion into a composite representation. The composite OS is finally reconstructed by performing an inverse multi-scale transform. A discrete cosine transform (DCT)-based multi-focus fusion technique has been evaluated previously [24], demonstrating good performance in terms of quality and computational complexity, and is used here. Specifically, the following integration rule is used, which selects, at each point, the slice with the largest variance of DCT coefficients:

$${\mathbf D}({\tilde{m},\tilde{n}} )= {{\mathbf O}_j}({\tilde{m},\tilde{n};\arg {{\max }_z}|{\sigma_j^2({\tilde{m},\tilde{n};z} )} |} ),$$
where j = 1, 2, …, K indexes the j-th slice in the stack, O denotes the DCT coefficient distribution in a block with center pixel $({\tilde{m},\tilde{n}} )$, and σ² is the corresponding variance. Consequently, the block with the highest activity level is selected as the appropriate block for the fused image. Finally, with a consistency-verification step, the DCT representation of the fused image is produced using the blocks with the larger variances from slices at different focal planes z. Additionally, to avoid saturation and false colors, a vector-to-scalar conversion of the multichannel data was also performed [24].
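A minimal sketch of this block-wise DCT-variance fusion rule is given below; the 8×8 block size, the use of the mean across color channels as the scalar decision channel, and the omission of the consistency-verification step are simplifying assumptions on our part.

```python
# Block-wise multi-focus fusion following Eq. (5): for each block, keep the
# slice whose DCT coefficients have the largest variance (highest activity).
import numpy as np
from scipy.fft import dctn

def fuse_focal_stack(stack, block=8):
    """stack: (K, H, W, C) float array of registered slices -> fused (H, W, C)."""
    k, h, w, c = stack.shape
    gray = stack.mean(axis=-1)                   # scalar channel used for the decision
    fused = np.zeros((h, w, c), dtype=stack.dtype)
    for i in range(0, h - block + 1, block):
        for j in range(0, w - block + 1, block):
            # Activity of each slice = variance of its block's DCT coefficients.
            acts = [np.var(dctn(gray[s, i:i + block, j:j + block], norm="ortho"))
                    for s in range(k)]
            best = int(np.argmax(acts))          # arg max_z in Eq. (5)
            fused[i:i + block, j:j + block] = stack[best, i:i + block, j:j + block]
    return fused  # borders beyond full blocks are left empty in this sketch
```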

The operating principle of our method is shown in Fig. 3. Briefly, a CNN is designed to formulate an OS-in-DOF extension model, and this model is trained using image pairs, each consisting of a WF image and the corresponding fused reference, both of size 2048×2048 pixels. Since using the whole image as the input to the CNN model would massively increase both the size of the training data and the computational cost, the training data is split into center-to-center sub-pairs. In fact, many recent studies have demonstrated that a single image containing enough sub-regions can successfully train a CNN model [17]. Because the information at a central point is affected by the surrounding area, the side length of a sub-pair is determined from the point spread function (PSF) [17]. Taking the imaging of a tiger beetle as an example, the intensity values of the images were first normalized to the range [0, 1] for preprocessing. The size of the sub-window can then be chosen as 5-15 times the PSF distribution area, following the imaging approach with a spatially rotating PSF [25], and the training and reference images can be divided into pairs of 180×180 pixels with a step length of 60 pixels. For more precise results, data from five images of the tiger beetle with different fields-of-view (FOVs) were used jointly after practical trials. This yields several thousand sub-image pairs, which provides enough training data for the neural network. All the data were partitioned into 80% for training, 10% for validation, and 10% for testing, as shown in Table 1. Note that all testing images were used only for the evaluation of our deep-learning-based method and were not included in the training dataset.
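The training-pair preparation described above could be sketched as follows; the function names, the normalization by the image maximum, and the random 80/10/10 split are illustrative assumptions rather than the authors' code.

```python
# Tile a 2048x2048 WF image and its fused reference into 180x180 sub-pairs
# with a 60-pixel step, then split the pairs into training/validation/test sets.
import numpy as np

def make_sub_pairs(wf_img, ref_img, patch=180, step=60):
    wf = wf_img.astype(np.float32) / wf_img.max()     # normalize to [0, 1]
    ref = ref_img.astype(np.float32) / ref_img.max()
    pairs = []
    h, w = wf.shape[:2]
    for i in range(0, h - patch + 1, step):
        for j in range(0, w - patch + 1, step):
            pairs.append((wf[i:i + patch, j:j + patch],
                          ref[i:i + patch, j:j + patch]))
    return pairs

def split_dataset(pairs, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(pairs))
    n_tr, n_va = int(0.8 * len(pairs)), int(0.1 * len(pairs))
    return ([pairs[i] for i in idx[:n_tr]],                 # 80% training
            [pairs[i] for i in idx[n_tr:n_tr + n_va]],      # 10% validation
            [pairs[i] for i in idx[n_tr + n_va:]])          # 10% testing
```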


Fig. 3. Schematics of the FC-WFM-Deep imaging system. (a) Training the network using a WF image and a corresponding composite OS reference image, and (b) reconstructing the high-quality output image consisting of the OS in DOF using the trained network. The 3D imaging result can then be acquired with a reduced data acquisition requirement.



Table 1. Dataset in FC-WFM-Deep imaging

Once this has been completed, the goal of FC-WFM-Deep is to learn a model f that predicts Ŷ = f(X), where Ŷ denotes an estimate of the target fused representation Y and X represents the WF data. Rather than directly predicting the output image, a residual network is trained, which improves accuracy, as demonstrated in many previous applications [26]. When the input and output images are highly correlated, such a residual network converges much faster and performs better than a standard network [27]. Thus, the residual image R = Y − X is learned and predicted in FC-WFM-Deep. For FC-WFM-Deep with depth L, there are three types of layers [28]: the convolutional (Conv) layer, the rectified linear unit (ReLU) layer and the batch normalization (BN) layer, all shown in Fig. 3(a). Note that the convolution operations use zero padding so that the output size is the same as the input size. Finally, given T training sub-image pairs, the averaged mean squared error between the required residual images and the estimated images is adopted as the loss function to learn the trainable parameters $\Theta $ of FC-WFM-Deep:

$$\ell (\Theta )= \frac{1}{{2T}}\sum\limits_{i = 1}^T {{{||{f({{{\mathbf X}_i};\Theta } )- {{\mathbf R}_i}} ||}^2}} .$$

For implementation, adaptive moment estimation (ADAM) was employed as the optimizer [17,28]. The momentum and weight-decay parameters were set to 0.9 and 0.0001, respectively. To train the network more quickly, the learning rate was initially set to 0.1 and then decreased by a factor of 10 every 30 epochs, with training stopped after 90 epochs. Training took roughly 8 hours on a Titan V GPU. It has been observed previously that performance improves rapidly as the depth of the network increases [27]. Therefore, after many trials, a depth of 20 layers with a batch size of 128 was chosen as the best compromise between minimizing the loss function and keeping an acceptable model size.
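A hedged PyTorch sketch of such a residual Conv/BN/ReLU network and the training configuration quoted above is shown below; the 3-channel input, the 64 feature maps, and the exact optimizer call are our assumptions, since these details are not given in the paper.

```python
# DnCNN-style residual network: depth 20, trained with the MSE loss of Eq. (6)
# (up to a constant factor), ADAM, lr 0.1 decayed 10x every 30 epochs, 90 epochs.
import torch
import torch.nn as nn

class FCWFMDeep(nn.Module):
    def __init__(self, depth=20, channels=3, features=64):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1, bias=False),
                       nn.BatchNorm2d(features),
                       nn.ReLU(inplace=True)]
        layers.append(nn.Conv2d(features, channels, 3, padding=1))  # predicts residual R
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = FCWFMDeep()
criterion = nn.MSELoss()                         # averaged squared error, cf. Eq. (6)
optimizer = torch.optim.Adam(model.parameters(), lr=0.1,
                             betas=(0.9, 0.999), weight_decay=1e-4)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)

def train(loader, epochs=90):
    for _ in range(epochs):
        for wf, ref in loader:                   # batches of 180x180 sub-pairs
            residual = ref - wf                  # learning target R = Y - X
            loss = criterion(model(wf), residual)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
        scheduler.step()
```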

In summary, our FC-WFM-Deep method has two main features: (1) a residual learning formula is adopted to learn f(X) and (2) BN and multi-scale learning rates are incorporated to improve training speeds while providing accurate predictive performance [28]. Finally, as shown in Fig. 3(b), to obtain the OS in DOF, the WF image is input into a well-trained model, which subsequently predicts a corresponding high-quality image with a large in-focus composite range.
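For completeness, the inference step of Fig. 3(b) can be expressed with the model sketched above; `wf_frame` is an illustrative placeholder for one normalized wide-field frame.

```python
# Predict the OS in DOF from a single wide-field frame: Y_hat = X + f(X).
import torch

def predict_os_in_dof(model, wf_frame):
    """wf_frame: (1, 3, H, W) tensor holding one normalized WF image."""
    model.eval()
    with torch.no_grad():
        return wf_frame + model(wf_frame)        # add the predicted residual back
```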

3. Results and discussions

3.1 Evaluation with the test dataset

A testing dataset containing images of a tiger beetle's eye, which is a bowl-like object, was produced for performance evaluation of all of the imaging methods being tested. To ensure a fair comparison, the focal-plane spacing was set to 0.5 µm for both FC-WFM and FC-SIM in this dataset, as in the previous section. A typical set of results is shown in Fig. 4. Specifically, Fig. 4(a) and (b) show the imaging results of the tiger beetle's eye, captured at different imaging depths and illuminated by a white-light LED. Compared with FC-WFM, the OS-based FC-SIM efficiently suppresses the background; however, the remaining in-focus component covers only a relatively small area compared to the WF data. In contrast, the imaging result of FC-WFM-Deep provides more information within the OS in DOF and a satisfactory imaging quality comparable to that of FC-SIM. Furthermore, in terms of computational time, FC-WFM-Deep achieves very appealing efficiency, e.g., it can process an image of 2048×2048 pixels in about 0.11 s.


Fig. 4. Imaging of a compound eye at different depths of (a) 14 µm and (b) 28 µm acquired with FC-WFM method, OS with FC-SIM method and the OS in DOF with proposed FC-WFM-Deep, from left to right, respectively.


Moreover, to evaluate intuitively and quantitatively the depth information acquired by the FC-WFM-Deep method, the 3D reconstruction of the compound eye and the corresponding re-sliced images in the x-z direction are shown in Fig. 5. Specifically, two slices in the x-z direction were randomly selected from the 3D imaging data and show no obvious differences in detail. Furthermore, two profiles across the x-z slices were also randomly selected and compared in Fig. 5(d), (e), (i) and (j). Even though the intensities in the FC-SIM results are not absolutely identical to those of the FC-WFM-Deep method, the slight differences are acceptable in practice. All these results indicate that the depth information acquired by FC-WFM-Deep is sufficient for 3D reconstruction without loss of detail.


Fig. 5. Comparison of the 3D reconstructions from (a) FC-SIM and (f) FC-WFM-Deep methods, where two x-z slices are randomly re-selected from the corresponding red (b), (g) and yellow (c), (h) boxes in the 3D imaging, respectively. In addition, to better show the details in the x-z planes, two profiles along the white and blue dashed lines are shown in (d), (e), (i) and (j), respectively.


To demonstrate the data-analysis capability of FC-WFM-Deep, precise measurement data (e.g., length, curvature) for ultrastructural features (e.g., punctae, scales, setae) of a typical group in the test dataset were calculated. For instance, the maximum intensity projection (MIP) color images of a compound eye are shown and compared for the different methods in Fig. 6. Owing to the strong background, FC-WFM shows a relatively poor imaging quality. For an easy and fair comparison, the reconstructed MIP of FC-WFM used the same number of slices as the FC-SIM result, shown in Fig. 6(d). FC-SIM clearly recovers more details of the compound eye than FC-WFM, mainly because of the in-focus component extraction. Finally, the MIP result of FC-WFM-Deep with OS in DOF is practically identical to that of FC-SIM. To confirm this quantitatively, the reconstruction results were evaluated using the peak-signal-to-noise-ratio (PSNR) [29] and structural similarity (SSIM) [30] indexes.


Fig. 6. The MIP imaging results of the compound eye using the three methods: (a) original FC-WFM, (d) FC-SIM, and (g) the proposed FC-WFM-Deep. (b), (e), (h) Corresponding 3D height map in the region of interest randomly selected within the red dashed box and (c), (f), (i) the profile along the white dashed lines are also compared. It should be noted that the reconstruction MIP of FC-WFM uses the same number of slices as the FC-SIM results, i.e., 161 slices, to ensure the comparison is convenient and fair.


Specifically, the PSNR and SSIM between the proposed FC-WFM-Deep method and the original FC-SIM for the compound-eye imaging reached 37.25 dB and 0.95, respectively (the larger the PSNR and SSIM values, the smaller the distortion [29]); the results are summarized in Table 2. However, the original FC-SIM method involved 161 slices with three-step phase-shifting, i.e., 483 processed images, whereas only 23 WF images were used to reconstruct the MIP image with FC-WFM-Deep. This is a 21-fold reduction in data size and demonstrates that our proposed FC-WFM-Deep method can achieve an imaging quality comparable to FC-SIM directly from the WF data but with far lower data-acquisition requirements.
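These two metrics can be computed, for example, with scikit-image; the array names below are placeholders for the FC-SIM reference MIP and the FC-WFM-Deep MIP, assumed to be color images scaled to [0, 1].

```python
# PSNR and SSIM between the FC-SIM reference and the FC-WFM-Deep reconstruction.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def compare_reconstructions(mip_sim, mip_deep):
    psnr = peak_signal_noise_ratio(mip_sim, mip_deep, data_range=1.0)
    # channel_axis selects the color axis (recent scikit-image versions).
    ssim = structural_similarity(mip_sim, mip_deep, channel_axis=-1, data_range=1.0)
    return psnr, ssim
```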


Table 2. Quantitative Comparison between the FC-SIM and FC-WFM-Deep Methods

Additionally, since only a small distortion of the depth information of FC-WFM-Deep was observed and validated in Fig. 5, regions of interest (ROIs) were randomly selected, shown by the red dashed boxes in Fig. 6(a), (d) and (g), to quantify the spatial resolution; the height maps were estimated from the measured 3D intensity data with a height-map extraction algorithm [9], as shown in Fig. 6(b), (e) and (h), respectively. Because the out-of-focus information severely influences the reconstruction, the structure of the ommatidia can only be roughly distinguished in the 3D height map of WFM. In contrast, the details become much more distinct in the FC-SIM and FC-WFM-Deep maps, as also validated by our previous study [9]. These observations are consistent with the profiles along the white dashed lines: the WFM profile in Fig. 6(c) can merely confirm the number of ommatidia. Although the profile from the FC-WFM-Deep reconstruction in Fig. 6(i) is slightly smoother than that of the original FC-SIM in Fig. 6(f), due to the Gaussian filtering in the extraction algorithm, it still indicates that the radius of curvature of a single ommatidium is about 13 µm and that of the whole compound eye is approximately 1.75 mm, exactly matching the results of the original FC-SIM. Visualization 1 presents the “3D color images” of the compound eye after 3D reconstruction.

To provide a more detailed demonstration of the reconstruction differences between FC-SIM and FC-WFM-Deep, Fig. 7(a) and (b) show the estimated errors of the profile along the white dashed line in Fig. 6 and of the entire 3D height map, respectively. The approximations produced by the proposed method show minimal errors relative to the FC-SIM results, and the average divergences of both the selected line and the whole plane approach zero. For a further quantitative assessment of these estimation errors, the mean squared error (MSE) [31] was chosen as the error measure.
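The MSE used here reduces to a single NumPy expression; the array names are placeholders for two height maps (or line profiles) of equal shape.

```python
# Mean squared error between the FC-SIM and FC-WFM-Deep height maps.
import numpy as np

def height_map_mse(h_sim, h_deep):
    return float(np.mean((np.asarray(h_sim) - np.asarray(h_deep)) ** 2))
```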


Fig. 7. The estimated errors between the FC-SIM and the FC-WFM-Deep of (a) the profiles, i.e., the difference along the white dashed line in Fig. 6, and (b) the entire 3D height maps approximated with different imaging results.


The MSEs for the compound-eye imaging are calculated and summarized in Table 2. The average MSE of the profile is approximately 2.1×10−3 and that of the entire height map is approximately 1.9×10−3, which illustrates that the results of the proposed FC-WFM-Deep method closely approximate the reference results with high resolution.

3.2 Experiments with a real imaging example

To verify the applicability of our method to different specimens, we imaged two further samples, a shining leaf chafer (Mimela sp.) and a leaf beetle (Clitenella fulminans, Faldermann, 1835), and performed reconstructions using the different imaging methods. Considering the different imaging features of individual specimens, the networks were trained separately for each type of sample. The focal-plane spacing in FC-SIM was again 0.5 µm. For a detailed comparison, we focused only on a small region near the center of an elytron. Figure 8 shows the MIP color images obtained with the different imaging methods; the FC-WFM results are unsatisfactory, and the imaging quality is evidently greatly enhanced by FC-SIM. The FC-WFM-Deep result likewise clearly shows that the elytra in dorsal view are composed of a number of microstructures (punctae) with slight differences in size, shape and color. The presence of such microstructures exerts an indirect influence on their color, producing the overall color that we observe visually. These imaging results illustrate that the FC-WFM-Deep method retains its color imaging capability with 21 times less data than the original FC-SIM, without loss of detail.


Fig. 8. Results of MIP images of two types of beetles, a shining leaf chafer in the top row and a leaf beetle in the bottom row, with (a), (d) FC-WFM, (b), (e) FC-SIM and (c), (f) the proposed FC-WFM-Deep, respectively. FC-WFM and FC-SIM used the same number of slices, i.e., 160 slices, to reconstruct their MIP results.


Additionally, the corresponding 3D height maps in the randomly-selected region of interest (shown by the white dashed box in Fig. 8) are estimated and shown in Fig. 9. As mentioned previously, since the 3D height map of the WFM reconstruction suffers from distortion caused by the large amount of out-of-focus light, only the height maps of FC-SIM and FC-WFM-Deep are compared here, so that the conclusion is more meaningful and easier to interpret. The 3D height map estimated from the FC-WFM-Deep result is very similar to that from FC-SIM for the imaging of both types of beetle.


Fig. 9. Corresponding 3D height maps for the randomly selected ROIs (white dashed boxes in Fig. 8) obtained with (a), (d) FC-SIM and (b), (e) FC-WFM-Deep. (c), (f) The estimated errors between the FC-SIM and FC-WFM-Deep 3D height maps for the leaf chafer (top row) and the leaf beetle (bottom row), respectively.


In terms of the image-quality assessment between FC-SIM and FC-WFM-Deep, FC-WFM-Deep achieves a PSNR of approximately 40.66 dB for the leaf chafer and approximately 35.80 dB for the leaf beetle. In addition, the SSIMs between the FC-WFM-Deep and FC-SIM images reached 0.96 and 0.92 for the leaf chafer and leaf beetle, respectively, as listed in Table 2. Although the FC-WFM-Deep calculation uses only one thirtieth of the number of slices of the original SIM, the bulge and cavity details shown in the height maps validate the effectiveness of the proposed method. Visualization 2 presents the “3D color images” of the leaf chafer after 3D reconstruction.

Similarly, the estimation errors between FC-SIM and our proposed FC-WFM-Deep method are also shown in Fig. 9, which provides a more intuitive view. Except for a few singularities, there are minimal differences between the height maps approximated from the results of these two methods. To quantify this conclusion, the MSEs are also calculated and compared in Table 2. The average MSE of the height-map reconstruction for these two imaging experiments is only approximately 1.9×10−3, confirming that our proposed method has practical research value.

All these experimental results demonstrate that FC-WFM-Deep is not a purely academic WFM imaging algorithm: it can handle large-scale imaging data with natural color. A CNN model with an appropriate training dataset plays a key role in many imaging applications. However, FC-WFM-Deep does have a major limitation: it is currently restricted to the same specimen types and a specific FC-SIM system. In other words, when the imaging condition changes, e.g., the specimen or the underlying system is adjusted, the FC-WFM-Deep model requires retraining for the new imaging parameters. Obtaining a general network for various conditions is possible in theory and would be extremely valuable; nevertheless, training such a model requires a huge number of samples with different characteristics and significantly increases the training complexity and cost. In future research, FC-WFM-Deep will be extended to handle complex targets and adapted to flexible systems.

4. Conclusion

In summary, we have presented a residual-learning framework for high-quality FC-WFM-based imaging reconstruction. The proposed FC-WFM-Deep architecture fully exploits the high resolution and full-color capability of FC-SIM; it can be trained using a single FC-WF frame and then generalized to an FC-WFM experiment. The results show that high-quality, full-color images with OS in DOF can be acquired directly from the WF image. These images have imaging quality comparable to FC-SIM in terms of 3D reconstruction, data-analysis capability and spatial resolution. Moreover, the data required to reconstruct each 3D color image can be 21-fold less than with FC-SIM for specific specimens in a given system. Overall, this technique significantly improves the imaging throughput of an imaging system by extracting the OS in DOF and reducing data acquisition without loss of detail. The technique has the potential for wide application in WFM to gather large-scale spatial and temporal information in a storage- and computation-efficient manner.

Funding

National Natural Science Foundation of China (61905277, 91750106, 61705256, 81427802); China Postdoctoral Science Foundation (2019M663849); Key Research and Development Program of Shaanxi Province (2020GY-008).

Disclosures

The authors declare no conflicts of interest.

References

1. P. J. Shaw, “Comparison of widefield/deconvolution and confocal microscopy for three-dimensional imaging,” In Handbook of Biological Confocal Microscopy, 453–467. Springer (2006).

2. W. A. Carrington, K. E. Fogarty, L. Lifschitz, and F. S. Fay, “Three-dimensional Imaging on Confocal and Wide-field Microscopes,” In Handbook of Biological Confocal Microscopy, 151–161. Springer (1990).

3. S. Boorboor, Jadhav, M. Ananth, D. Talmage, Role, and A. Kaufman, “Visualization of neuronal structures in wide-field microscopy brain images,” IEEE Trans. Visual. Comput. Graphics 25(1), 1018–1028 (2019). [CrossRef]  

4. M. Wilson, “Introduction to widefield microscopy,” https://www.leica-microsystems.com/sciencelab/introduction-to-widefield-microscopy/ (2017).

5. P. Sarder and A. Nehorai, “Deconvolution methods for 3-D fluorescence microscopy images,” IEEE Signal Process. Mag. 23(3), 32–45 (2006). [CrossRef]  

6. T. Wilson, R. Juškaitis, and J. B. Tan, “Differential imaging in confocal microscopy,” J. Microsc. 175(1), 1–9 (1994). [CrossRef]  

7. D. Dan, B. Yao, and M. Lei, “Structured illumination microscopy for super-resolution and optical sectioning,” Chin. Sci. Bull. 59(12), 1291–1307 (2014). [CrossRef]  

8. M. G. L. Gustafsson, “Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy,” J. Microsc. 198(2), 82–87 (2000). [CrossRef]  

9. J. Qian, S. Dang, Z. Wang, X. Zhou, D. Dan, B. Yao, Y. Tong, H. Yang, Y. Lu, Y. Chen, X. Yang, M. Bai, and M. Lei, “Large-scale 3D imaging of insects with natural color,” Opt. Express 27(4), 4845–4857 (2019). [CrossRef]  

10. M. A. A. Neil, T. Wilson, and R. Juškaitis, “A light efficient optically sectioning microscope,” J. Microsc. 189(2), 114–117 (1998). [CrossRef]  

11. F. Liu, B. Q. Dong, X. H. Liu, Y. M. Zheng, and J. Zi, “Structural color change in longhorn beetles Tmesisternus isabellae,” Opt. Express 17(18), 16183–16191 (2009). [CrossRef]  

12. E. Shevtsova, C. Hansson, D. H. Janzen, and J. Kjærandsen, “Stable structural color patterns displayed on transparent insect wings,” Proc. Natl. Acad. Sci. U. S. A. 108(2), 668–673 (2011). [CrossRef]  

13. J. Qian, M. Lei, D. Dan, B. Yao, X. Zhou, Y. Yang, S. Yan, J. Min, and X. Yu, “Full-color structured illumination optical sectioning microscopy,” Sci. Rep. 5(1), 14531 (2015). [CrossRef]  

14. D. Dan, M. Lei, B. Yao, W. Wang, M. Winterhalder, A. Zumbusch, Y. Qi, L. Xia, S. Yan, Y. Yang, P. Gao, T. Ye, and W. Zhao, “DMD-based LED-illumination super-resolution and optical sectioning microscopy,” Sci. Rep. 3(1), 1116 (2013). [CrossRef]  

15. Y. Wu, Y. Rivenson, H. Wang, Y. Luo, E. Ben-David, L. A. Bentolila, C. Pritz, and A. Ozcan, “Three-dimensional virtual refocusing of fluorescence microscopy images using deep learning,” Nat. Methods 16(12), 1323–1331 (2019). [CrossRef]  

16. X. Zhang, Y. Chen, K. Ning, C. Zhou, Y. Han, H. Gong, and J. Yuan, “Deep learning optical-sectioning method,” Opt. Express 26(23), 30762–30772 (2018). [CrossRef]  

17. Y. Wu, Y. Rivenson, Y. Zhang, Z. Wei, H. Günaydin, X. Lin, and A. Ozcan, “Extended depth-of-field in holographic image reconstruction using deep learning based auto-focusing and phase-recovery,” Optica 5(6), 704–710 (2018). [CrossRef]  

18. Z. Ren, Z. Ren, and E. Y. M. Lam, “Autofocusing in digital holography using deep learning,” Three-dimensional and Multidimensional Microscopy: Image Acquisition and Processing XXV (SPIE, 2018).

19. H. Pinkard, Z. Phillips, A. Babakhani, D. A. Fletcher, and L. Waller, “Deep learning for single-shot autofocus microscopy,” Optica 6(6), 794–797 (2019). [CrossRef]  

20. H. D. Cheng, X. Jiang, Y. Sun, and J. Wang, “Color image segmentation: advances and prospects,” Pattern Recogn. 34(12), 2259–2281 (2001). [CrossRef]  

21. Z. Xie, Y. Tang, X. Liu, J. Liu, Y. He, and S. Hu, “Profilometry with enhanced accuracy using differential structured illumination microscopy,” IEEE Photonics Technol. Lett. 31(13), 1017–1020 (2019). [CrossRef]  

22. K. R. Spring and M. W. Davidson, “Depth of field and depth of focus,” MicroscopyU of Nikon, https://www.microscopyu.com/microscopy-basics/depth-of-field-and-depth-of-focus.

23. B. Forster, D. V. D. Ville, J. Berent, D. Sage, and M. Unser, “Complex wavelets for extended depth-of-field: a new method for the fusion of multichannel microscopy images,” Microsc. Res. Tech. 65(1-2), 33–42 (2004). [CrossRef]  

24. M. B. A. Haghighat, A. Aghagolzadeh, and H. Seyedarabi, “Multi-focus image fusion for visual sensor networks in DCT domain,” Comput. Electr. Eng. 37(5), 789–797 (2011). [CrossRef]  

25. Z. Wang, Y. Cai, Y. Liang, X. Zhou, S. Yan, D. Dan, P. R. Bianco, M. Lei, and B. Yao, “Single shot, three-dimensional fluorescence microscopy with a spatially rotating point spread function,” Biomed. Opt. Express 8(12), 5493–5506 (2017). [CrossRef]  

26. K. He, X. Zhang, S. Ren, J. Sun, and M. Research, “Deep residual learning for image recognition,” Conf. Comput. Vis. Pattern Recognit. (CVPR), (IEEE, 2016).

27. J. Kim, J. Lee, and K. Lee, “Accurate image super-resolution using very deep convolutional networks,” Conf. Comput. Vis. Pattern Recognit. (CVPR), (IEEE, 2016).

28. K. Zhang, W. Zuo, Y. Chen, D. Meng, and L. Zhang, “Beyond a Gaussian denoiser: residual learning of deep CNN for image denoising,” IEEE Trans. on Image Process. 26(7), 3142–3155 (2017). [CrossRef]  

29. U. Sara, M. Akter, and M. S. Uddin, “Image quality assessment through FSIM, SSIM, MSE and PSNR-a comparative study,” J. Comput. Commun. 07(03), 8–18 (2019). [CrossRef]  

30. Z. Wang, A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Trans. on Image Process. 13(4), 600–612 (2004). [CrossRef]  

31. O. V. Michailovich and D. Adam, “A novel approach to the 2-D blind deconvolution problem in medical ultrasound,” IEEE Trans. Med. Imaging 24(1), 86–104 (2005). [CrossRef]  

Supplementary Material (2)

Visualization 1: “3D color images” after 3D reconstruction of the compound eye.
Visualization 2: “3D color images” after 3D reconstruction of the shining leaf chafer.
