
Real-time optical 3D reconstruction based on Monte Carlo integration and recurrent CNNs denoising with the 3D light field display

Open Access

Abstract

A general integral imaging generation method based on path-traced Monte Carlo (MC) integration and recurrent convolutional neural network denoising is presented. According to the optical layer structure of the three-dimensional (3D) light field display, screen pixels are encoded to specific viewpoints, and directional rays are cast from the viewpoints through the screen pixels to perform the path integral. During integration, advanced illumination is used to generate a high-quality elemental image array (EIA). Recurrent convolutional neural networks are implemented as auxiliary post-processing of the EIA to eliminate the MC integration noise in the 3D image. The experiment is carried out at 4K (3840 × 2160) resolution with 2 samples per pixel using ray path tracing. Experimental results demonstrate that, within 10 frames, the structural similarity metric (SSIM) between the reconstructed and target 3D images exceeds 90% and the peak signal-to-noise ratio (PSNR) gain exceeds 10 dB. Moreover, the real-time frame rate is above 30 fps, demonstrating both the efficiency and the quality of the optical 3D reconstruction.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Three-dimensional (3D) images with depth cues attract increasing attention from both industry and academia. Among 3D display technologies, the 3D light field display with high resolution, wide viewing angle, full parallax, and proper spatial occlusion is the most promising way to give the audience a strong sense of presence and immersion [1–8]. Integral imaging (InIm), proposed by Lippmann in 1908, is a popular kind of 3D light field display [2,3,9], which can provide full-parallax, continuous-viewpoint 3D images without the convergence-accommodation conflict or any auxiliary devices.

To generate interactive 3D images, numerous elemental image array (EIA) rendering methods have been presented, e.g., multiple viewpoint rendering (MVR), lens-based rendering (LBR), and backward ray tracing (BRT) [10–12]. LBR and BRT are the common methods for the 3D light field display. LBR realizes real-time generation of InIm with rasterization, but its rendering efficiency drops sharply with dense lens arrangements. The BRT method we proposed earlier can generate the EIA in real time, but global illumination is missing and the 3D image quality is limited.

Realism is the primary goal in computer graphics [13]. Global illumination can be achieved with the Monte Carlo (MC) method [14,15]. However, the indirect illumination calculation in MC rendering leads to noisy images at low sampling rates [16–19]. In real-time path tracing, only a limited sampling rate can be afforded, and the difference between the sampled value and the true value appears as noise in the 3D image [20]. A high sampling rate can produce reliable noise-free results, but it requires a large number of iterations. To resolve this contradiction, Kalantari et al. (2015) used neural networks to drive a cross-bilateral filter for denoising MC sequences [19]. Chaitanya et al. (2017) introduced a recurrent autoencoder filter for interactive reconstruction of MC image sequences [21]. In their work, a series of noisy images at low sampling rates and the corresponding fully sampled target images are used to train the denoising filter. The trained filter is then used to convert noisy inputs into noise-free images.

However, previous studies on high-quality image generation focused on two-dimensional images and did not investigate realistic 3D image reconstruction in depth. To generate high-quality 3D images interactively, directional path tracing (DPT) is proposed here, based on the MC method and recurrent convolutional neural networks (CNNs) [13,19]. In DPT, the path-traced MC integration guarantees the quality of the 3D image, and the recurrent CNN filter accelerates the convergence of the integral. Together, these modules ensure fast generation of a realistic 3D image. Instead of improving EIA quality with a large number of path samples, as in the traditional MC method, the EIA is generated at acceptable quality with far fewer samples by adopting the recurrent CNN filter, which can be regarded as post-processing of the encoded EIA to obtain a noise-free 3D image. Based on the GPU parallel processing mechanism, interactive InIm at very low sampling budgets is realized, consistently delivering high quality for the real-time 3D light field display.

With the proposed DPT method, interactive optical 3D reconstruction with global illumination at a low sampling rate is achieved. The reconstruction efficiency depends only on the screen resolution, and the real-time frame rate is more than 30 fps at 4K resolution. The structural similarity metric (SSIM) and peak signal-to-noise ratio (PSNR) between the 10th frame and the fully sampled target image exceed 0.9 and 25 dB, respectively. Compared with MVR and LBR, DPT is independent of the number of viewpoints and of the optical structure. Compared with the BRT method, DPT has the advantage of enhanced realism of the 3D image. To the best of our knowledge, our method is an optimized solution for the real-time lens-based 3D light field display.

2. Principle

2.1. Monte Carlo integration

Monte Carlo integration estimates the value of an integral by random path sampling. Unlike rasterization-based methods, path tracing is an unbiased way to generate photo-realistic images by simulating the light field distribution in a more physically based manner [22]. Veach et al. (1997) showed that in MC integration, each pixel color u(x) can be estimated by computing a transport measure over each individual path, as in Eq. (1) [23,24]. Under the path integral formulation of Eq. (1), u(x) is given by the integral over all possible ray paths P. To discretize the computation, the value of pixel x is estimated from the sampled paths Px1, ..., Pxnx, and the pixel color u(x) is derived by an appropriate MC sampling procedure. If cxj denotes the color transported by the random path Pxj (cxj = f(Pxj)), the MC approximation ũ(x) can be computed with Eq. (2). As shown in Figs. 2(b) and 2(c), as the sampling rate increases, the estimate ũ(x) gets closer and closer to the physically correct value u(x).

$$u(x)=\int_{\Omega_x} f(P)\,d\mu(P) \tag{1}$$
$$\tilde{u}(x)=\frac{1}{n_x}\sum_{j=1}^{n_x} c_{x_j} \tag{2}$$
$$f_r(p,\omega_o,\omega_i)=\frac{dL_o(p,\omega_o)}{L_i(p,\omega_i)\cos\theta_i\,d\omega_i} \tag{3}$$
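To make the convergence behavior of Eq. (2) concrete, the following minimal Python sketch (not part of the original work) averages random samples of a stand-in integrand f; the estimate approaches the true integral as the number of sampled paths n_x grows, just as the pixel estimate ũ(x) approaches u(x).

```python
# A minimal sketch illustrating Eq. (2): the MC estimate converges to the
# true integral as the number of samples grows. The integrand f is a
# hypothetical stand-in for the path contribution.
import random

def f(t):
    return 3.0 * t * t          # true integral over [0, 1] is 1.0

def mc_estimate(n_samples):
    # u~(x) = (1/n_x) * sum_j c_xj, with c_xj = f(P_xj) for random paths P_xj
    return sum(f(random.random()) for _ in range(n_samples)) / n_samples

if __name__ == "__main__":
    for n in (4, 64, 1024, 16384):
        print(f"{n:6d} samples -> estimate {mc_estimate(n):.4f} (true value 1.0)")
```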

As shown in Fig. 1, each screen pixel is assigned its own thread to perform recursive ray path sampling. If the sampled ray intersects scene primitives within the maximum recursion depth, the BRDF sampling of Eq. (3) is used to perform the path integral [14,17]. As illustrated in Figs. 2(a) and 2(b), as the sampling rate increases, the rendered quality of the Cornell box illumination approaches the fully sampled target result.
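The per-pixel procedure described above can be summarized by the hedged sketch below. Only the control flow follows the text (one path-sampling routine per pixel, a maximum recursion depth, BRDF sampling at each bounce, and the MC average of Eq. (2)); the scene intersection and BRDF routines, the environment term, and the constants are placeholders, not the authors' implementation.

```python
import random

MAX_DEPTH = 5          # maximum recursion depth mentioned in the text
SPP = 2                # samples per pixel used in the experiments

def trace(origin, direction):
    """Placeholder intersection routine: returns (hit_point, normal, emission,
    brdf) or None. A real renderer would query the scene's acceleration structure."""
    return None        # no scene in this sketch: every ray escapes

def sample_brdf(normal):
    """Placeholder BRDF sampling in the spirit of Eq. (3): returns (new_direction, weight)."""
    return normal, 1.0

def radiance(origin, direction, depth=0):
    if depth >= MAX_DEPTH:
        return 0.0
    hit = trace(origin, direction)
    if hit is None:
        return 1.0                       # assumed constant environment term
    point, normal, emission, brdf = hit
    new_dir, weight = sample_brdf(normal)
    return emission + weight * radiance(point, new_dir, depth + 1)

def shade_pixel(ray_origin, ray_direction):
    # Monte Carlo average over SPP sampled paths, as in Eq. (2)
    return sum(radiance(ray_origin, ray_direction) for _ in range(SPP)) / SPP
```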

Fig. 1 Schematic diagram of ray path tracing.

Fig. 2 MC path-traced Cornell box at different sampling rate (SSIM(a,c) = 0.534713, SSIM(b,c) = 0.957433, PSNR(a,c) = 15.4247 dB, PSNR(b,c) = 38.2147 dB). (a) Sparsely sampled results at 20 sample/pixel (spp). (b) Fully sampled target image. (c) Reconstructed sparsely sampled results with recurrent CNNs denoising filter. (d) Partial rendering results.

2.2. Optical structure design for the demonstrated DPT method

The optical structure of the 3D light field display used to demonstrate the proposed DPT method is depicted in Figs. 3(a) and 3(b). The optical layer is mainly composed of a lens array, whose function is to redistribute the light emitted by the pixels into the relevant spatial directions. When the EIA is loaded onto the liquid crystal panel, the sampled light field is reconstructed in the image plane after being modulated by the optical layer. The lenses of the optical layer are arranged in a compact honeycomb pattern. Such a structure makes full use of the screen pixels to present the optimal viewing experience in the viewing zone [23,25]. As shown in Fig. 3(c), each lens of the optical layer covers a regular hexagonal elemental image (EI) that carries partial scene information for all viewpoints. Because of the optical layer, only a portion of the pixels is visible when viewers observe from the viewing zone.

Fig. 3 Overview of 3D light field display. (a) The schematic of honeycomb arrangement 3D light field display. (b) The construction of optical layers. (c) Single hexagonal EI.

2.3. Encoding algorithm of directional path tracing

DPT is implemented on the honeycomb-arranged 3D light field display. As shown in Fig. 4(e), a set of discrete virtual cameras is arranged in the virtual camera plane to construct the correspondence between viewpoints and EI pixels. The virtual cameras at the viewpoints capture different perspective information. The mapping between viewpoints and pixels is based on the pinhole imaging model of the lens. As shown in Fig. 4, the EI of the current pixel is first determined from its screen coordinates, and the viewpoint of the pixel is then obtained from the relative position of the pixel within that EI. Finally, the pixel value is calculated by launching a directional ray from the viewpoint through the current pixel for the path integral.

Fig. 4 The principle of DPT encoding algorithm. (a) Screen pixel segmentation. (b) The segmentation unit to which the current pixel belongs. (c) Diagram of the regular hexagon EI division method. (d) The correspondence between EI’s pixels and viewpoints. (e) DPT emits directional rays from pixel’s virtual camera for pixel shading integral.

As shown in Figs. 4(a) and 4(b), for an arbitrary screen pixel p(x,y), Eqs. (4) and (5) are used to divide the screen into staggered rectangles and to determine the rectangular region rec(x,y) in which p(x,y) is located. In rec(x,y), the upper-left yellow triangle belongs to the hexagonal cell at the upper left, and the lower-left green triangle belongs to the cell at the lower left. Accordingly, the EI cell center center(x,y) of pixel p(x,y) can be calculated with Eqs. (6)–(8). Here, r is the cell radius and d(x,y) is the pixel pitch. Equation (8) is used to determine the location of p(x,y) in the segmentation area of Fig. 4(b). Then, the local coordinate index element(i,j) of p(x,y) in the current hexagon is obtained from Eq. (9). Finally, Eq. (10) derives the corresponding viewpoint index viewnum(x,y), where dis(x,y) is the viewpoint interval and Pcam(0,0,z0) is the position of the central view plane.

$$\mathrm{pitch}(x,y)=\left(1.5\times r,\ \sqrt{3}\times r\right) \tag{4}$$
$$\mathrm{cell}(i,j)=\left(p-\left(0,\ \frac{\mathrm{floor}(p_x/\mathrm{pitch}_x)\,\%\,2}{2.0}\right)\right)\%\,\mathrm{pitch}-\left(0,\ \frac{\mathrm{pitch}_y}{2.0}\right) \tag{5}$$
$$\mathrm{pos}(x,y)=\mathrm{floor}\!\left(\frac{p-\left(0,\ \frac{\mathrm{floor}(p_x/\mathrm{pitch}_x)\,\%\,2}{2.0}\right)}{\mathrm{pitch}}\right)\times\mathrm{pitch}+\left(r,\ \frac{\sqrt{3}\times r}{2.0}\right) \tag{6}$$
$$\mathrm{shift}(x,y)=\begin{cases}(0,\ 0), & \text{if } \mathrm{cell}_i\ge \mathrm{pitch}_x/3.0\ \&\ |\mathrm{cell}_j|<\sqrt{3}\times \mathrm{cell}_i\\(-\mathrm{pitch}_x,\ \mathrm{pitch}_y), & \text{if } \mathrm{cell}_i<\mathrm{pitch}_x/3.0\ \&\ \mathrm{cell}_j\ge 0\\(-\mathrm{pitch}_x,\ -\mathrm{pitch}_y), & \text{if } \mathrm{cell}_i<\mathrm{pitch}_x/3.0\ \&\ \mathrm{cell}_j<0\end{cases} \tag{7}$$
$$\mathrm{center}(x,y)=\mathrm{pos}(x,y)+\mathrm{shift} \tag{8}$$
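As an illustration of the pixel-to-EI assignment of Eqs. (4)–(8), the sketch below maps a screen pixel to the center of the hexagonal EI that covers it. Because the operators in Eqs. (5)–(7) are ambiguous in the extracted text, the sketch does not reproduce the triangle tests literally; it uses the equivalent nearest-center rule for a honeycomb of radius r, and the value of R is an assumed example.

```python
import math

R = 10.0                      # hexagon (lens) radius in pixels -- assumed value
PITCH_X = 1.5 * R             # horizontal center spacing, as in Eq. (4)
PITCH_Y = math.sqrt(3) * R    # vertical center spacing, as in Eq. (4)

def hex_center(px, py):
    """Return the center of the hexagonal EI covering pixel (px, py).

    For a honeycomb lattice, the hexagon that owns a point is the one whose
    center is nearest to it, so checking the four candidate centers around
    the pixel is equivalent to the rectangle/triangle tests of Eqs. (5)-(8)."""
    col0 = math.floor(px / PITCH_X)
    best = None
    for col in (col0, col0 + 1):
        y_off = (PITCH_Y / 2.0) if col % 2 else 0.0      # staggered columns
        row0 = math.floor((py - y_off) / PITCH_Y)
        for row in (row0, row0 + 1):
            cx = col * PITCH_X
            cy = row * PITCH_Y + y_off
            d = (px - cx) ** 2 + (py - cy) ** 2
            if best is None or d < best[0]:
                best = (d, (cx, cy))
    return best[1]
```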

Once pixel encoding is completed, the correspondence between the pixel p(x,y) and the viewpoint viewnum(x,y) is established. For 3D scene rendering, the ray R0 is defined by its origin O0 and direction dir, which are derived from Eqs. (11) and (12) according to the characteristics of the 3D light field display. Here, Pw is the world coordinate of the pixel p(x,y).

$$\mathrm{element}(i,j)=p-\mathrm{center} \tag{9}$$
$$\mathrm{viewnum}(x,y)=\frac{\mathrm{element}}{d}\times \mathrm{dis} \tag{10}$$
$$O_0(x,y,z)=(\mathrm{viewnum},\ 0)+P_{cam} \tag{11}$$
$$\mathrm{dir}(x,y,z)=\overrightarrow{O_0P_w}\,\big/\,\left\|\overrightarrow{O_0P_w}\right\| \tag{12}$$

With the origin O0 and direction dir of the current pixel's initial ray R0 acquired, R0 is launched to carry out path tracing, which simulates the process by which the corresponding pixel emits light and is imaged in the 3D light field display. After path tracing is completed, the MC method of Section 2.1 is applied to integrate the tracing results into the output buffer. Since the ray paths are calculated in strict accordance with the 3D imaging process, the EIA generated by DPT fully samples and records the light field of the original scene. As shown in Fig. 5, as the MC integration progresses, the rendering results improve and a realistic 3D image is presented.
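A sketch of how Eqs. (9)–(12) can be combined into the initial ray R0 is given below. The constants d, dis, and Pcam, and the exact scaling in Eq. (10), are assumptions for illustration rather than the values used in the experiments.

```python
import math

D_PIX = 0.05                  # pixel pitch d (assumed, arbitrary units)
DIS   = 2.0                   # viewpoint interval dis (assumed)
P_CAM = (0.0, 0.0, 500.0)     # central view plane position Pcam (assumed)

def directional_ray(pixel_xy, center_xy, pixel_world):
    """Build the initial ray R0 of the DPT encoding.

    pixel_xy/center_xy are screen coordinates, pixel_world is the pixel's
    world-space position Pw. The element/d*dis scaling is one plausible
    reading of Eq. (10) and should be treated as an assumption."""
    ex = pixel_xy[0] - center_xy[0]                 # element = p - center, Eq. (9)
    ey = pixel_xy[1] - center_xy[1]
    vx = ex / D_PIX * DIS                           # viewnum, Eq. (10)
    vy = ey / D_PIX * DIS
    origin = (P_CAM[0] + vx, P_CAM[1] + vy, P_CAM[2])   # O0 = (viewnum, 0) + Pcam, Eq. (11)
    d = tuple(pw - o for pw, o in zip(pixel_world, origin))
    norm = math.sqrt(sum(c * c for c in d))
    direction = tuple(c / norm for c in d)          # normalized O0 -> Pw, Eq. (12)
    return origin, direction
```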

Fig. 5 DPT eliminates the EIA noise with continuous MC sampling. (a) The model of Buddha. (b) EIA at different sampling rates. (c) Reconstructed 3D image at different sampling rates. (see Visualization 1 and Visualization 2).

It should be noted that our method is not only suitable for honeycomb-arranged lens arrays but also for other arrangement rules (e.g., rectangular and other irregular arrangements). We only need to assign pixels to the corresponding viewpoints according to the structure of the EI covered by each lens and then perform directional path tracing; the rendered EIA can be output directly.

2.4. 3D image reconstruction with recurrent CNNs filter

To accelerate the convergence of the MC integration, a recurrent CNN is used for image filtering. As shown in Fig. 5(b), the EIA consists of EIs, a structure that results in discontinuities between adjacent EIs. While ensuring continuity across the EIA sequence, as high an image resolution as possible should be adopted for better viewing quality and interactive experience. These requirements mean that the denoising filter must have the following characteristics: (1) it preserves edges while removing noise, (2) it works well at super-resolution, and (3) it is stable across frame sequences. As shown in Fig. 6, the recurrent CNN filter has recurrent convolutional blocks in the encoder layers and hierarchical skip connections between the encoder and decoder stages, which provide sufficient temporal consistency across frames. The U-Net structure guarantees efficient noise removal [26]. Besides, the fully convolutional structure facilitates processing of super-resolution targets and removes spikes while preserving edges [27]. Therefore, the recurrent CNN filter is selected as the denoising filter to reconstruct the 3D image.
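For concreteness, the snippet below sketches one recurrent convolutional block of the kind described for the encoder, assuming PyTorch; it concatenates the current features with the block's hidden state from the previous frame, which is what supplies the temporal consistency mentioned above. It is an illustrative reduction, not the exact architecture of [21].

```python
# A minimal recurrent encoder block sketch (assumed PyTorch; not the authors' code).
import torch
import torch.nn as nn

class RecurrentConvBlock(nn.Module):
    def __init__(self, in_ch, out_ch):
        super().__init__()
        # convolve current features concatenated with last frame's hidden state
        self.conv = nn.Conv2d(in_ch + out_ch, out_ch, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x, hidden=None):
        # hidden: this block's output from the previous frame (zeros at frame 0)
        if hidden is None:
            hidden = torch.zeros(x.shape[0], self.conv.out_channels,
                                 x.shape[2], x.shape[3], device=x.device)
        y = self.act(self.conv(torch.cat([x, hidden], dim=1)))
        return y, y          # the output also becomes the next frame's hidden state
```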

Fig. 6 Neural network architecture of recurrent CNNs filter [21].

$$\hat{\theta}_i=g(x_i) \tag{13}$$
$$L=W_s\times L_s+W_g\times L_g+W_t\times L_t \tag{14}$$

As shown in Fig. 6, the architecture of the recurrent CNN can be divided into two parts, an encoder and a decoder. In the encoder, the input image is repeatedly down-sampled; important high-dimensional features are preserved while the noise information is gradually discarded. In the decoder, the filtered information from the corresponding encoder layer is reintroduced through the skip connections. The network mapping g in Eq. (13) uses the features xi = {x1, x2, ..., xN} (e.g., albedo, normal, and material) to form the approximate filter parameters θ̂i. The training data set is built by feeding a large number of pre-rendered EIA sequences and their secondary features into the recurrent CNN denoising filter. When the average training loss of Eq. (14) has fully converged, the neural network is constructed. The trained denoising filter can then be used to denoise the DPT output. Secondary features are recorded at the first frame, and the denoising process begins at the fourth frame. The rendered EIA sequence and the secondary features xi are used as the input of the denoising filter for noise elimination, and the filter automatically constructs the filter parameters θ̂i from the input and the training data. Loading the denoised EIA onto the 3D light field display, a noise-free 3D image is reconstructed.
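The following sketch shows one plausible composition of the training loss of Eq. (14), assuming Ls, Lg, and Lt are spatial, gradient-domain, and temporal L1 terms in the spirit of [21]; the weights and the exact term definitions are assumptions.

```python
# Illustrative loss composition for Eq. (14); weights and term definitions are assumed.
import torch

def training_loss(pred, target, prev_pred, prev_target,
                  w_s=0.8, w_g=0.1, w_t=0.1):
    """L = Ws*Ls + Wg*Lg + Wt*Lt with assumed L1-style terms."""
    l_s = (pred - target).abs().mean()                    # spatial term Ls

    def grad(img):
        # one-pixel finite differences along width and height
        return (img[..., :, 1:] - img[..., :, :-1],
                img[..., 1:, :] - img[..., :-1, :])

    gx_p, gy_p = grad(pred)
    gx_t, gy_t = grad(target)
    l_g = (gx_p - gx_t).abs().mean() + (gy_p - gy_t).abs().mean()   # gradient term Lg

    # temporal term Lt: penalize frame-to-frame change that differs from the target's
    l_t = ((pred - prev_pred) - (target - prev_target)).abs().mean()
    return w_s * l_s + w_g * l_g + w_t * l_t
```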

As shown in Fig. 2, the SSIM between the reconstructed result at 20 spp and the target image at 6400 spp is 0.957433, which indicates the effectiveness of the denoising filter. The noise-free EIA and the corresponding interactive 3D image of the Dragon are shown in Fig. 7. These results show that the filter is highly effective in accelerating 3D image rendering.

Fig. 7 (a) The Dragon with white porcelain BRDF material. (b) Comparison of the EIA rendering results before and after the noise-free filter is introduced. (c) Denoised 3D image at diverse perspectives. (see Visualization 3).

The flowchart of the DPT method is illustrated in Fig. 8. The origin and direction of each directional sampling ray are derived according to the optical layer structure of the 3D light field display. Then, the directional rays are launched on the GPUs to intersect the scene for MC integration. Next, the MC integration results are transferred to the denoising filter for post-processing. The filter uses the input noisy image and the scene features to determine the per-layer filter parameters through the mapping g. With continuous sampling and denoising, a noise-free 3D image with global illumination is presented.
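Putting the pieces together, the pseudocode below outlines one DPT frame as sketched in Fig. 8, reusing the placeholder helpers from the earlier sketches (hex_center, directional_ray, shade_pixel); the scene and denoiser objects are hypothetical stand-ins, not the authors' implementation.

```python
def dpt_frame(pixels, scene, denoiser, features, prev_state):
    """One illustrative DPT frame: encode pixels to viewpoints, launch the
    directional rays, accumulate the MC estimates, then denoise the EIA.

    `scene.pixel_world`, `denoiser`, and `features` are placeholders."""
    noisy_eia = {}
    for p in pixels:
        center = hex_center(*p)                        # Eqs. (4)-(8)
        origin, direction = directional_ray(p, center, scene.pixel_world(p))
        noisy_eia[p] = shade_pixel(origin, direction)  # MC average, Eq. (2)
    # hand the noisy EIA and secondary features (albedo, normal, material)
    # to the recurrent CNN filter, carrying its hidden state across frames
    clean_eia, state = denoiser(noisy_eia, features, prev_state)
    return clean_eia, state
```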

Fig. 8 Flowchart of DPT method.

3. Experimental results

3.1. Experimental configurations and results

The parameters of the 3D light field display device and the DPT method are listed in Table 1, and the rendering platform parameters are listed in Table 2. Distributed computation is implemented on multiple GPUs by assigning the corresponding parallel threads to specific GPU devices. In addition, NVIDIA NVLink is used for inter-GPU communication and memory sharing to further improve memory access speed [28]. With distributed computation on multiple GPUs, DPT can quickly render EIA sequences for the 3D display.
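The distribution across GPUs can be pictured with the toy partitioning sketch below: the EIA is cut into tiles and each tile is assigned to one device index. Actual thread scheduling and NVLink transfers are outside the scope of this illustration.

```python
def split_tiles(width, height, tile, n_gpus):
    """Assign screen tiles round-robin to GPU indices (illustrative only)."""
    tiles = [(x, y) for y in range(0, height, tile) for x in range(0, width, tile)]
    return {g: tiles[g::n_gpus] for g in range(n_gpus)}

# e.g. split a 4K EIA into 256x256 tiles across 4 hypothetical GPUs
assignment = split_tiles(3840, 2160, 256, 4)
```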

Table 1. Experimental Configurations

Table 2. Rendering platform parameters

The mesh models of the Spaceship, Dragon, and Coffee scenes are visualized on the 3D light field display, and the display results are shown in Fig. 9. Here, an orange plastic material is attached to the Dragon model to present a different illumination effect. With the introduction of the denoising filter, a noise-free 3D image is presented, which verifies that the goal of verisimilar display is further reached with the high-quality optical 3D reconstruction of DPT.

Fig. 9 Reconstruction results with different sampling rate and denoising filter. (see Visualization 4 and Visualization 5).

3.2. Performance evaluation

A performance experiment is conducted to evaluate the rendering efficiency of DPT. The time DPT takes to draw one frame at different output image sizes is plotted in Fig. 10(a). At a fixed sampling rate of 2 spp, the number of launched rays is twice the number of pixels in the EIA. As the output image size increases, the rendering frame rate gradually decreases. The overhead of one frame is less than 30 ms at 4K resolution, indicating that DPT is fully capable of interactive rendering for optical 3D reconstruction. In addition, since the rendering complexity depends only on the output image size and is independent of the number of viewpoints, the method is very efficient for the 3D light field display.
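As a quick sanity check on these figures, the arithmetic below reproduces the per-frame ray count at 4K with 2 spp and the ray throughput implied by a sub-30 ms frame time.

```python
# Back-of-envelope check of the numbers quoted above.
width, height, spp = 3840, 2160, 2
rays_per_frame = width * height * spp            # 16,588,800 primary rays per frame
frame_time_s = 0.030                             # upper bound from Fig. 10(a)
print(rays_per_frame, "rays/frame")
print(rays_per_frame / frame_time_s / 1e6, "Mrays/s at minimum")   # ~553 Mrays/s
```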

Fig. 10 Performance and noise elimination experiment. (a) The time required for DPT to finish one frame at different output image sizes. (b) The change of the SSIM convergence index before and after the denoising filter is introduced in DPT. (c) The change of PSNR before and after the denoising filter is introduced.

3.3. Image quality evaluation

The quality of the 3D image reconstructed by DPT is evaluated. The SSIM and PSNR values at the central viewpoint between 3D images at different sampling rates and the target 3D image are depicted in Figs. 10(b) and 10(c), respectively. With the introduction of the recurrent CNN denoising filter at the fourth frame, the 3D image quickly converges to the target display result. The average SSIM value at the 10th frame is higher than 0.9 and the PSNR is over 25 dB, indicating that the denoising filter effectively accelerates the optical 3D reconstruction.

The SSIM and PSNR values of the Dragon's 3D image at the 10th frame, before and after denoising and at different viewing angles, are shown in Fig. 11. The SSIM values of the reconstructed 3D image at the central 0° and ±20° viewing angles exceed 90% and 82%, respectively, whereas the values are only 75% and 53% without the denoising filter. The PSNR values at different viewing angles with and without the denoising filter are shown in Fig. 11(b); with the filter, the denoising gain exceeds 10 dB. As a result, the quality of the 3D image seen from different directions is significantly improved.
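The SSIM and PSNR comparisons reported here can be reproduced with standard tools; the sketch below uses scikit-image (assuming version 0.19 or later for the channel_axis argument), with placeholder file names for a denoised view and the fully sampled target.

```python
# Illustrative SSIM/PSNR measurement; file names are placeholders.
import numpy as np
from skimage import io
from skimage.metrics import structural_similarity, peak_signal_noise_ratio

target   = io.imread("target_6400spp_view0.png").astype(np.float64) / 255.0
denoised = io.imread("denoised_frame10_view0.png").astype(np.float64) / 255.0

ssim = structural_similarity(target, denoised, channel_axis=-1, data_range=1.0)
psnr = peak_signal_noise_ratio(target, denoised, data_range=1.0)
print(f"SSIM = {ssim:.4f}, PSNR = {psnr:.2f} dB")
```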

Fig. 11 3D image noise elimination evaluations from different viewing angles. (a) The SSIM index of the dragon at the 10th frame. (b) PSNR change diagram of the dragon.

The image quality evaluation shows that the recurrent CNN filter can achieve high-quality 3D image reconstruction at an extremely low sampling rate. Besides, as shown in Fig. 9, with global illumination implemented, our method can simulate real-world illumination characteristics (e.g., ambient, reflection, and specular components). Therefore, a highly realistic optical 3D image is reconstructed.

4. Conclusion

In summary, a real-time interactive 3D image generation method, DPT, is presented. It focuses on the photorealistic reconstruction of the 3D image. Models with BRDF materials can be visualized on the 3D light field display with global illumination at extremely high quality. Evaluation of the reconstructed 3D image quality with the PSNR and SSIM indicators shows that the proposed DPT method can efficiently generate time-coherent 3D images without noise. Details at different levels and the depth information of the 3D image are clearly reconstructed. Besides, since the generation efficiency of DPT depends only on the output EIA size for a given scene, computational resources are used efficiently. The presented DPT method provides clear optical 3D reconstruction in real time.

Funding

National Key Research and Development Program (2017YFB1002900); National Natural Science Foundation of China (NSFC) (61575025); Fund of the State Key Laboratory of Information Photonics and Optical Communications.

References

1. D. Fattal, Z. Peng, T. Tran, S. Vo, M. Fiorentino, J. Brug, and R. G. Beausoleil, “A multi-directional backlight for a wide-angle, glasses-free three-dimensional display,” Nature 495(7441), 348–351 (2013).

2. G. Lippmann, “La photographie integrale,” C. R. Acad. Sci. 146, 446–451 (1908).

3. R. Yang, X. Huang, and S. Chen, “Efficient rendering of integral images,” in Proceedings of International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH, 2005), p. 44.

4. J. Geng, “Three-dimensional display technologies,” Adv. Opt. Photonics 5(4), 456–535 (2013).

5. W. Zhang, X. Sang, X. Gao, X. Yu, B. Yan, and C. Yu, “Wavefront aberration correction for integral imaging with the pre-filtering function array,” Opt. Express 26(21), 27064–27075 (2018).

6. H. E. Ives, “Optical properties of a Lippmann lenticulated sheet,” J. Opt. Soc. Am. 21(3), 171–176 (1931).

7. X. Sang, X. Gao, X. Yu, S. Xing, Y. Li, and Y. Wu, “Interactive floating full-parallax digital three-dimensional light-field display based on wavefront recomposing,” Opt. Express 26(7), 8883–8889 (2018).

8. N. Balram and I. Tošić, “Light field imaging and display systems,” Inf. Disp. 32(4), 6–13 (2016).

9. A. Stern and B. Javidi, “Three-dimensional image sensing, visualization, and processing using integral imaging,” Proc. IEEE 94(3), 591–607 (2006).

10. M. Halle, “Multiple viewpoint rendering,” in Proceedings of the Annual Conference on Computer Graphics (ACM SIGGRAPH, 1998), pp. 243–254.

11. G. Chen, C. Ma, Z. Fan, X. Cui, and H. Liao, “Real-time lens based rendering algorithm for super-multiview integral photography without image resampling,” IEEE Trans. Vis. Comput. Graph. 24(9), 2600–2609 (2018).

12. S. Xing, X. Sang, X. Yu, C. Duo, B. Pang, X. Gao, S. Yang, Y. Guan, B. Yan, J. Yuan, and K. Wang, “High-efficient computer-generated integral imaging based on the backward ray-tracing technique and optical reconstruction,” Opt. Express 25(1), 330–338 (2017).

13. M. Pharr, G. Humphreys, and W. Jakob, Physically Based Rendering: From Theory to Implementation, 3rd ed. (Morgan Kaufmann, 2016), Chap. 9.

14. J. T. Kajiya, “The rendering equation,” in Proceedings of 13th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH, 1986), pp. 143–150.

15. K. Peters, Ray Tracing from the Ground Up (A. K. Peters, Ltd., 2007).

16. J. Bikker and J. V. Schijndel, “The brigade renderer: a path tracer for real-time games,” Int. J. Comput. Games Technol. 2013(8), 578269 (2013).

17. D. Hearn, M. P. Baker, and W. R. Carithers, Computer Graphics with OpenGL, 4th ed. (Pearson Education, 2010).

18. D. Philip, B. Philippe, and B. Kavita, Advanced Global Illumination, 2nd ed. (A. K. Peters, 2006).

19. N. K. Kalantari, S. Bako, and P. Sen, “A machine learning approach for filtering Monte Carlo noise,” ACM Trans. Graph. 34(4), 122 (2015).

20. M. Zwicker, W. Jarosz, J. Lehtinen, B. Moon, R. Ramamoorthi, F. Rousselle, P. Sen, C. Soler, and S.-E. Yoon, “Recent advances in adaptive sampling and reconstruction for Monte Carlo rendering,” Comput. Graph. Forum 34(2), 667–681 (2015).

21. C. R. A. Chaitanya, A. S. Kaplanyan, C. Schied, M. Salvi, and T. Aila, “Interactive reconstruction of Monte Carlo image sequences using a recurrent denoising autoencoder,” ACM Trans. Graph. 36(4), 1–12 (2017).

22. E. Veach, Robust Monte Carlo Methods for Light Transport Simulation (ProQuest LLC, 1998).

23. D. H. Kim, M. U. Erdenebat, K. C. Kwon, J. S. Jeong, J. W. Lee, K. A. Kim, N. Kim, and K. H. Yoo, “Real-time 3D display system based on computer-generated integral imaging technique using enhanced ISPP for hexagonal lens array,” Appl. Opt. 52(34), 8411–8418 (2013).

24. M. Delbracio, P. Musé, A. Buades, J. Chauvier, N. Phelps, and J. M. More, “Boosting Monte Carlo rendering by ray histogram fusion,” ACM Trans. Graph. 33(1), 1–15 (2014).

25. W. Wu, S. Wang, M. Piao, Y. Zhao, and J. Wei, “Performance metric and objective evaluation for displayed 3D images generated by different lenslet arrays,” Opt. Commun. 426, 635–641 (2018).

26. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in Proceedings of 18th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI, 2015), pp. 234–241.

27. H. Yan, W. Wang, and L. Wang, “Bidirectional recurrent convolutional networks for multi-frame super-resolution,” in Proceedings of 29th Annual Conference on Neural Information Processing Systems (NIPS, 2015), pp. 235–243.

28. J. Choquette, O. Giroux, and D. Foley, “Volta: performance and programmability,” IEEE Micro 38(2), 42–52 (2018).

Supplementary Material (5)

Visualization 1: This video presents the 3D image obtained by using path tracing at different sampling rates.
Visualization 2: This visualization shows that a high-quality 3D image can be generated at very low sampling rates with the recurrent CNNs denoising filter.
Visualization 3: This media presents the 3D image generated by the DPT method at different viewing perspectives.
Visualization 4: This video uses the dragon to demonstrate that DPT can generate 3D images in real time.
Visualization 5: This visualization shows the 3D images of the spaceship and coffee captured from different viewing angles.
