
Learning-based compensation of spatially varying aberrations for holographic display [Invited]

Open Access

Abstract

We propose a hologram generation technique to compensate for spatially varying aberrations of holographic displays through machine learning. The image quality of the holographic display is severely degraded when there exist optical aberrations due to misalignment of optical elements or off-axis projection. One of the main advantages of holographic display is that aberrations can be compensated for without additional optical elements. Conventionally, computer-generated holograms for compensation are synthesized through a point-wise integration method, which requires large computational loads. Here, we propose to replace the integration with a combination of fast-Fourier-transform-based convolutions and forward computation of a deep neural network. The point-wise integration method took approximately 95.14 s to generate a hologram of $1024 \times 1024\;{\rm pixels}$, while the proposed method took about 0.13 s, which corresponds to $\times 732$ computation speed improvement. Furthermore, the aberration compensation by the proposed method is verified through experiments.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. INTRODUCTION

Holographic displays are considered among the most promising near-eye display techniques for augmented and virtual reality [1]. A compact form factor, support for focus cues, and high-resolution imagery are their main advantages in near-eye display implementation. They usually consist of coherent light sources, a spatial light modulator (SLM), and several optical elements for relaying and imaging. The optical elements can be replaced by volume gratings, which enables the compact form factor of holographic near-eye displays [2–4].

The holographic display allows direct complex amplitude modulation, and therefore, per-pixel depth control is possible. For example, suppose that each object point at a different depth and lateral location is a point source emitting a spherical wave. The complex point-spread functions (PSFs) for the object points are then superimposed on the SLM to generate a hologram, and three-dimensional imagery can be optically reconstructed if a computer-generated hologram (CGH) is displayed on the SLM [4]. The advantage of supporting focus cues is that they can alleviate the so-called vergence–accommodation conflict that can cause visual discomfort [5–7]. This conflict occurs when the depth perceived from the extent of eye rotation (convergence) does not match that from the focusing state of the crystalline lens (accommodation). Although image disparities can adjust the convergence, the accommodative state cannot be changed unless three-dimensional display methodologies that support focus cues are applied [8–11].

Furthermore, the holographic display enables high-resolution reconstruction of three-dimensional imagery since it can compensate for image degradation due to diffraction effects. When the pitch of the grating in use approaches a few wavelengths, diffraction significantly affects image quality. However, three-dimensional displays that use multiple liquid crystal panels, such as light field displays and multifocal displays, rely on a ray-optics approximation and therefore have difficulty accounting for diffraction. As a result, they suffer from image quality degradation unless additional approaches are applied to reflect the diffraction effects [12]. On the other hand, since the CGH is synthesized using wave optics, the holographic display intrinsically accounts for diffraction effects.

Another significant advantage of the holographic display is that it can compensate for optical aberrations without additional optical elements. Typically, CGH synthesis for compensation consists of three steps. First, optical aberrations are measured through ray tracing [13,14], experimental calibration [4,15], or analytic wavefront computation [16]. Second, a spatially varying kernel on the SLM that produces a sharp focal spot is found or estimated from the measured aberrations for each point on the reconstruction plane. In the estimation, the kernel is normally approximated by several orthogonal polynomials such as Zernike and Lagrange polynomials. Last, the aberration-compensating CGH is synthesized through four-dimensional convolution of a target image with the spatially varying kernel.

Although the point-wise integration method is effective for a highly aberrated system, it requires substantial computation loads. For example, according to Kim et al. [13], the point-wise integration method takes about 20 min when the CGH and kernel resolutions are $3840 \times 2160$ and $601 \times 325$, even if the entire process is accelerated using GPU programming. Therefore, several works have suggested replacing the point-by-point integration method with a few convolutions based on the fast Fourier transform (FFT) for smoothly varying aberrations.

For instance, in the work of Kim et al. [13], a kernel corresponding to the center point replaces whole spatially varying kernels, and the hologram is generated by a single FFT-based convolution using the center kernel. On the other hand, Maimone et al. [4] use a gaze tracker and dynamically change the convolution kernel depending on the observer’s gaze direction. The hologram in their work is also computed by the single FFT-based convolution.

Several works have reported more sophisticated approaches to better approximate spatially varying aberrations [14–16]. Instead of assuming spatial invariance across the field of view, they divide it into several areas and assume spatial invariance within each region. The region can be square [14,16] or any user-defined shape [15]. Each area involves an FFT-based convolution, and a final hologram for compensation is created by combining the holograms generated by these convolutions.

Here, we deviate from the approach of dividing the field and performing per-region convolutions. Instead, we use a deep neural network to create the compensation hologram. In recent years, neural networks have been widely reported to be effective in various vision tasks [17–19]. Neural networks also work well for CGH rendering, and several works utilize them to reduce computation loads of CGH rendering [20,21] or to improve the image quality of holographic displays [22,23]. However, none of these works assumes highly aberrated systems. In this work, we suggest using a neural network to improve the image quality of such systems. The procedure for generating aberration-compensating holograms using the proposed method is briefly introduced in Fig. 1.

In the proposed method, target intensity $I$ in the reconstruction plane is first backpropagated to the SLM plane with transfer function ${T_{- d}}$ [24] when the distance between the reconstruction plane and the SLM plane is $d$. A two-dimensional phase delay ${\phi _G}$ parameter is incorporated into the transfer function to consider aberrations. The detailed equation of the transfer function is provided in Section 2. In Fig. 1, ${\cal F}$ and ${{\cal F}^{- 1}}$ denote Fourier transform and inverse Fourier transform, respectively. Thereafter, the resultant complex amplitude $\widetilde {{h_{\textit{sv}}}}$ is processed by a neural network ${{\cal N}_G}({\omega _G})$ with weights ${\omega _G}$ to generate a hologram $\widehat {{h_{\textit{sv}}}}$ that can compensate for spatially varying aberrations.
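For concreteness, the snippet below is a minimal Python/NumPy sketch of the generator's forward pass shown in Fig. 1. It is not the authors' implementation: `net_G` is a placeholder for the trained network, and the function names and the way the Zernike-based phase delay is supplied are our assumptions.

```python
import numpy as np

def transfer_function(dist, wavelength, pitch, shape, phase_delay):
    """Band-limited angular-spectrum transfer function with an extra Zernike phase delay."""
    ny, nx = shape
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    T = np.zeros(shape, dtype=np.complex128)
    valid = arg > 0.0                                   # band limit of the transfer function
    T[valid] = np.exp(-1j * 2 * np.pi / wavelength * dist * np.sqrt(arg[valid]))
    return T * np.exp(1j * phase_delay)                 # phase delay built from Zernike terms

def generate_hologram(I, dist, wavelength, pitch, phi_G, net_G):
    """Backpropagate sqrt(I) to the SLM plane with T_{-d}, then refine with the network N_G."""
    field = np.sqrt(I).astype(np.complex128)
    T_back = transfer_function(-dist, wavelength, pitch, I.shape, phi_G)   # T_{-d}(phi_G)
    h_tilde = np.fft.ifft2(np.fft.fft2(field) * T_back)                    # spatially invariant part
    return net_G(h_tilde)                                                   # spatially varying correction
```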

Fig. 1. Schematic of the proposed generator for hologram synthesis to compensate for spatially varying aberrations.

The proposed method is computationally more efficient than the point-wise integration method and achieves better aberration compensation than the single FFT-based convolution method. Furthermore, the proposed method does not require manually defining how to partition the display area as in Nam et al. [16], Haist et al. [14], and Kaczorowski et al. [15]. The following sections describe the proposed method and introduce the optimization strategy for its parameters in detail. We also implement a holographic display with severe optical aberrations. Finally, from the experiments, we evaluate the aberration compensation and computational speed of the proposed method.

2. PRINCIPLE

Our method aims to generate a hologram to compensate for spatially varying aberrations through a combination of an FFT-based convolution and a neural network. In this paper, we call this combination a generator, and the parameters in the generator must be optimized. Before the optimization of generator parameters, three preparatory steps are required. First, the optical aberrations of the holographic display are measured. Second, a training dataset is built, consisting of pairs of various images and corresponding aberration-compensating holograms synthesized via the point-wise integration method. Third, a propagator is designed and trained to reconstruct an image close to the target image when the aberration-compensating hologram is fed into it. Finally, the parameters of the generator are optimized using the propagator and the dataset. We elaborate on each step in the following sections.

A. Optical Aberrations Measurement

The aberrations can be evaluated based on ray tracing or measured by experimental calibration. We define aberration evaluation as finding a phase kernel that produces a sharp focal spot at each target point. Supposing that the structure of the holographic display and the optical properties of its elements are all known, the complex PSF generated by each target point can be acquired through ray tracing simulations [13]. Thousands of rays are emitted from the target point, and the phase of each ray in the SLM plane is calculated from the optical path length of the ray while passing through the display system. After that, a phase kernel can be computed via interpolation from the scattered phase points.

In this paper, we adopt the experimental approach as in Maimone et al. [4]. They describe a kernel $k$ in the SLM plane for each object point as follows:

$$k(\overrightarrow {{p_s}} ;\overrightarrow {{p_o}}) = {e^{i\left(\frac{{2\pi}}{\lambda}\left\| {\overrightarrow {{p_s}} - \overrightarrow {{p_o}}} \right\| + {\phi _o}\right)}} \circ {e^{i{\phi _z}(\overrightarrow {{p_s}} ;\overrightarrow {{p_o}})}},$$
where $\overrightarrow {{p_s}}$ denotes a point on the SLM, $\overrightarrow {{p_o}}$ is an object point, and $\| \cdot \|$ denotes the Euclidean distance. ${\phi _o}$ is an initial phase of the object point, and $\lambda$ is the wavelength. In addition to this spherical-wave term emanating from the object point, the kernel includes ${\phi _z}$, a phase delay caused by aberrations. The phase delay is represented using Zernike polynomials ${Z_j}(\overrightarrow {{p_s}})$ and corresponding coefficients ${c_j}(\overrightarrow {{p_o}})$ as follows:
$${\phi _z}(\overrightarrow {{p_s}} ;\overrightarrow {{p_o}}) = {\sum _j} {c_j}(\overrightarrow {{p_o}}){Z_j}(\overrightarrow {{p_s}}),$$
where $j$ indicates the Wyant index of a Zernike polynomial.

Based on Eq. (2), coefficients ${c_j}$ for each object point are found by manual calibration so that the phase kernel $k$ produces a sharp focal spot at the object point. However, it is not practical to find coefficients for all object points. Instead, several points are chosen, and coefficients are measured only at these points. Coefficients of the remaining points are numerically calculated via interpolation. The number of points to be measured should be adjusted depending on how rapidly the aberrations vary to ensure interpolation accuracy.
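As an illustration, the following sketch evaluates the phase delay of Eq. (2) for a single object point. Only a few Wyant-ordered Zernike terms are written out, and the coefficient values are placeholders rather than calibrated data.

```python
import numpy as np

def wyant_zernike(j, rho, theta):
    """A small subset of Zernike polynomials in Wyant ordering, defined on the unit pupil."""
    terms = {
        3: 2 * rho**2 - 1,                  # defocus
        4: rho**2 * np.cos(2 * theta),      # astigmatism at 0/90 deg
        5: rho**2 * np.sin(2 * theta),      # astigmatism at +/-45 deg
    }
    return terms[j]

def phase_delay(coeffs, rho, theta):
    """phi_z(ps; po) = sum_j c_j(po) Z_j(ps) for one object point, Eq. (2)."""
    return sum(c * wyant_zernike(j, rho, theta) for j, c in coeffs.items())

# Evaluate the aberration phase over a 601 x 601 pupil grid with placeholder coefficients.
y, x = np.mgrid[-1:1:601j, -1:1:601j]
rho, theta = np.hypot(x, y), np.arctan2(y, x)
phi = phase_delay({3: 0.8, 4: -0.2, 5: 0.1}, rho, theta)
```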

B. Generation of Aberration-Compensating Holograms Based on a Point-Wise Integration

After measuring the optical aberrations, holograms ${h_{\textit{sv}}}$ to compensate for them are synthesized for various images based on a point-wise integration. The point-wise integration method [4] is defined as follows:

$${h_{\textit{sv}}}(\overrightarrow {{p_s}}) = {\sum _{\overrightarrow {{p_o}} \in {\Omega _o}}}\sqrt {I(\overrightarrow {{p_o}})} k(\overrightarrow {{p_s}} ;\overrightarrow {{p_o}}),$$
where ${\Omega _o}$ denotes the set of object points and $I(\overrightarrow {{p_o}})$ is the intensity of the object point $\overrightarrow {{p_o}}$. Pairs of images and corresponding holograms are used as the ground-truth dataset when training a propagator, which is explained in the following sections.
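A minimal sketch of the summation in Eq. (3) is given below; `kernel` is a placeholder callable implementing Eq. (1) for one object point, and the loop is written for clarity rather than for the GPU-parallel speed reported later.

```python
import numpy as np

def pointwise_hologram(I, object_points, kernel, slm_shape):
    """h_sv(ps) = sum over object points of sqrt(I(po)) * k(ps; po), Eq. (3)."""
    h_sv = np.zeros(slm_shape, dtype=np.complex128)
    for (iy, ix), po in object_points:               # pixel index and 3D position of each point
        if I[iy, ix] > 0:                            # skip dark pixels
            h_sv += np.sqrt(I[iy, ix]) * kernel(po)  # kernel(po) returns an SLM-plane field
    return h_sv
```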

C. Parameter Optimization of the Customized Propagator

If a hologram for aberration compensation is generated through the proposed method, the reconstruction from the hologram should be evaluated in terms of image quality. For the evaluation, we design a propagator that reconstructs the target image when the aberration-compensating hologram is applied. In an ideal system without aberrations or with spatially invariant aberrations, the propagation can be numerically modeled via a single FFT-based convolution. Furthermore, the inverse of the propagation can also be represented by a convolution with a conjugate kernel. However, in a system with spatially varying aberrations, the propagation cannot be described by a single FFT-based convolution. Theoretically, it involves matrix multiplication with ${N_{\rm{input}}} \times {N_{\rm{output}}}$ elements when the input and output planes are sampled by ${N_{\rm{input}}}$ and ${N_{\rm{output}}}$ pixels, which induces large computation loads; for two $1024 \times 1024$ planes, for example, the matrix would contain roughly ${10^{12}}$ complex entries. In addition, it is rather impractical to compute the inverse of the propagation matrix due to its enormous size.

On the other hand, our customized propagator mainly consists of two parts with reduced computation: an FFT-based convolution with transfer function parameters, followed by forward computation of a deep neural network. The FFT-based convolution accounts for aberrations in a spatially invariant manner. After the convolution, the neural network is adopted to bridge the gap caused by the spatial variance. Recently, deep neural networks have been shown to be effective at modeling spatially variant and nonlinear functions [25,26]. Network-based modeling is computationally efficient compared to the huge matrix calculation. Furthermore, because the network is differentiable, the reconstruction quality from the hologram can be taken into account during optimization of the generator parameters. Figure 2 illustrates the customized propagator.

Fig. 2. Schematic of the proposed propagator designed to generate a target image from the aberration-compensating holograms. Solid and dashed lines show computations during evaluation and parameter optimization, respectively.

During an inference by the propagator, a hologram ${h_{\textit{sv}}}$ for compensation is first converted into an intermediate intensity image $\tilde I$ through a spatially invariant convolution as follows:

$$\tilde I = {{\cal F}^{- 1}}({\cal F}({h_{\textit{sv}}}){T_d}({\phi _P})),$$
where ${T_d}$ is a transfer function for the FFT-based convolution. It is defined as follows [22,24]:
$${T_d} = \left\{{\begin{array}{*{20}{c}}{{e^{- i\frac{{2\pi}}{\lambda}d\sqrt {1 - {{(\lambda {f_x})}^2} - {{(\lambda {f_y})}^2}}}}{e^{i{\phi _P}}},}&{{\rm{if}}\;\;\;\sqrt {{f_x}^2 + {f_y}^2} \lt \frac{1}{\lambda}}\\0&{{\rm{otherwise}}}\end{array}} \right.\!\!,$$
where $d$ is the distance between the reconstruction plane and the hologram plane. $(x,y)$ denote the coordinates of the hologram plane, and $({f_x},{f_y})$ are the spatial frequencies. The phase delay ${\phi _P} = \sum\nolimits_j {c_{\!j,P}}{Z_j}$ is also incorporated into the transfer function. In the phase delay, ${Z_j}$ and ${c_{\!j,P}}$ are the $j$th Wyant Zernike polynomial and corresponding coefficient, respectively. Note that we allow the phase delay to be adjusted during parameter optimization.
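To make Eqs. (4) and (5) concrete, a Python/NumPy sketch of the spatially invariant part of the propagator is shown below. Taking the squared magnitude of the propagated field as the intermediate intensity $\tilde I$ is our assumption, and in the paper this step is implemented with differentiable TensorFlow ops so that ${\phi _P}$ can be trained. The neural network that refines $\tilde I$ is described next.

```python
import numpy as np

def intermediate_intensity(h_sv, dist, wavelength, pitch, phi_P):
    """Spatially invariant part of the propagator: apply T_d(phi_P) of Eq. (5) as in Eq. (4)."""
    ny, nx = h_sv.shape
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
    T = np.where(arg > 0.0,
                 np.exp(-1j * 2 * np.pi / wavelength * dist * np.sqrt(np.maximum(arg, 0.0))),
                 0.0) * np.exp(1j * phi_P)
    field = np.fft.ifft2(np.fft.fft2(h_sv) * T)     # Eq. (4)
    return np.abs(field) ** 2                       # intermediate intensity tilde-I (assumption)
```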

The intermediate image $\tilde I$ is then converted into the final reconstructed image $\hat I$ via a neural network ${{\cal N}_P}({\omega _P})$, which is designed to mitigate the gap between the spatially invariant convolution and physical propagation. ${\omega _P}$ of the network denotes trainable weights. The weights and phase delay are optimized by solving the problem as follows:

$$\mathop {{\rm{minimize}}}\limits_{{c_{\!P}},{\omega _P}} {{\cal L}_P}(I,\hat I),$$
where ${c_{\!P}}$ denotes the set of coefficients of Zernike polynomials for the phase delay ${\phi _P}$. The operator ${{\cal L}_P}(\cdot)$ to compute losses for the propagator is composed of terms to evaluate Euclidean distances between original images and image gradients as follows:
$${{\cal L}_P}(I,\hat I) = \| {I - \hat I} \|_2^2 + {\alpha _g}({\| {{\nabla _x}(I - \hat I)} \|_1} + {\| {{\nabla _y}(I - \hat I)} \|_1}),$$
where ${\nabla _x}$ and ${\nabla _y}$ are operators that compute image gradients along each axis, and ${\| \cdot \|_2}$ and ${\| \cdot \|_1}$ denote the L2 and L1 norms, respectively. We set the relative weight ${\alpha _g}$ for the image gradient loss to 0.01. During parameter optimization for the propagator, we use the target image $I$ as the ground-truth output and the corresponding aberration-compensating hologram ${h_{\textit{sv}}}$ as the input.
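A hedged TensorFlow sketch of the propagator loss in Eq. (7) follows; images are assumed to be batched NHWC tensors, and the sums realize the squared L2 norm and L1 norms with ${\alpha _g} = 0.01$.

```python
import tensorflow as tf

def propagator_loss(I, I_hat, alpha_g=0.01):
    """L2 image term plus L1 penalties on horizontal and vertical image gradients, Eq. (7)."""
    diff = I - I_hat
    l2 = tf.reduce_sum(tf.square(diff))
    grad_x = diff[:, :, 1:, :] - diff[:, :, :-1, :]   # finite-difference gradient along width
    grad_y = diff[:, 1:, :, :] - diff[:, :-1, :, :]   # finite-difference gradient along height
    return l2 + alpha_g * (tf.reduce_sum(tf.abs(grad_x)) + tf.reduce_sum(tf.abs(grad_y)))
```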

D. Parameter Optimization of the Hologram Generator

Once the propagator $P({\phi _P},{\omega _P})$ is prepared, we design a generator $G({\phi _G},{\omega _G})$ that synthesizes a hologram compensating for spatially varying aberrations from a target image. As shown in Fig. 1, the generator consists of spatially invariant convolution and a deep neural network ${{\cal N}_G}({\omega _G})$. The convolution is similar to that of the propagator, except that the first exponent term of the transfer function is conjugated. Here, the phase delay ${\phi _G} = \sum\nolimits_j {c_{\!j,G}}{Z_j}$, and network weights ${\omega _G}$ are also optimized so that the difference between the reconstructed image $\hat I$ and the target image $I$ can be minimized. During parameter optimization, $\hat I$ is computed from the hologram $\widehat {{h_{\textit{sv}}}}$ via the propagator as follows:

$$\hat I = P(\widehat {{h_{\textit{sv}}}};{\phi _P},{\omega _P}),$$
where $\widehat {{h_{\textit{sv}}}}$ is calculated using the generator as follows:
$$\widehat {{h_{\textit{sv}}}} = G(I;{\phi _G},{\omega _G}).$$

Similar to the propagator, generator parameters are optimized by solving the following problem:

$$\mathop {{\rm{minimize}}}\limits_{{c_G},{\omega _G}} {{\cal L}_G}(I,\hat I).$$

Note that the parameters of the propagator remain fixed during generator optimization.
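The key point of this two-stage scheme is that only the generator's variables are exposed to the optimizer in the second stage. The fragment below sketches that setup with stand-in Keras models; the actual architectures and the handling of the Zernike coefficients differ from this toy example.

```python
import tensorflow as tf

# Stand-in models; the real networks follow the U-Net of Fig. 6.
generator_net = tf.keras.Sequential([tf.keras.layers.Conv2D(2, 3, padding="same")])
propagator_net = tf.keras.Sequential([tf.keras.layers.Conv2D(1, 3, padding="same")])
generator_net.build((None, 1024, 1024, 2))
propagator_net.build((None, 1024, 1024, 1))

c_G = tf.Variable(tf.zeros([12]))            # trainable Zernike coefficients of phi_G
propagator_net.trainable = False             # propagator parameters stay fixed in this stage

# Only the generator's weights and its Zernike coefficients receive gradients.
generator_variables = generator_net.trainable_variables + [c_G]
```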

For the loss function ${{\cal L}_G}$, we additionally consider a perceptual loss ${{\cal L}_{\textit{pl}}}$ [27,28] on top of the loss ${{\cal L}_P}$ used for the propagator:

$${{\cal L}_{\textit{pl}}}(I,\hat I) = {\alpha _{\textit{pl}}}\sum _{l = 1}^5\left\| {{L_l}(I) - {L_l}(\hat I)} \right\|_2^2,$$
where ${\alpha _{\textit{pl}}}$ is the relative weight for the perceptual loss, and it is set to 0.0025. The operator ${L_l}(\cdot)$ extracts features from the $l$th layer of the VGG19 network [29]. Overall, the loss function for the generator consists of the same terms as the loss function for the propagator plus the perceptual loss, and Eq. (10) is solved via a gradient-descent method. Whereas the conventional loss computes the distance between the target and the reconstruction pixel-wise, the perceptual loss measures this distance using features of a deep neural network designed for object classification, which helps compare the images based on various features.
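As a concrete example, the perceptual loss of Eq. (11) can be sketched with the pretrained VGG19 available in Keras. The specific five layers chosen below (one per convolutional block) and the grayscale-to-RGB handling are our assumptions; the paper only states that features from five layers of VGG19 are compared.

```python
import tensorflow as tf

def build_feature_extractor():
    """VGG19 truncated to five feature layers, one per convolutional block (assumed choice)."""
    vgg = tf.keras.applications.VGG19(include_top=False, weights="imagenet")
    layer_names = ["block1_conv2", "block2_conv2", "block3_conv2",
                   "block4_conv2", "block5_conv2"]
    outputs = [vgg.get_layer(name).output for name in layer_names]
    return tf.keras.Model(vgg.input, outputs)

def perceptual_loss(extractor, I, I_hat, alpha_pl=0.0025):
    """Sum of squared L2 distances between VGG19 features of target and reconstruction, Eq. (11)."""
    feats_I = extractor(tf.image.grayscale_to_rgb(I))          # VGG19 expects 3-channel inputs
    feats_I_hat = extractor(tf.image.grayscale_to_rgb(I_hat))
    return alpha_pl * tf.add_n([tf.reduce_sum(tf.square(a - b))
                                for a, b in zip(feats_I, feats_I_hat)])
```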

3. IMPLEMENTATION AND RESULTS

We built a holographic display prototype with spatially varying aberrations to check the feasibility of the proposed method. All optical elements used in the prototype are commercially available. The prototype is shown in Fig. 3. It consists of a green laser of 520 nm wavelength (Fisba RGBeam), a phase-only SLM with $3840 \times 2160\;{\rm pixels}$ and 3.6 µm pixel pitch (May IRIS-U78), a 4F filtering system, and an eyepiece. We use the 4F filtering system to block high-order noise, and we encode the complex hologram into a phase-only hologram via double phase-amplitude coding (DPAC) [4]. Lenses of 100 mm focal length (Thorlabs AC508-100-A-ML) are used for the 4F filtering system and the eyepiece. The eyepiece is slightly rotated by several degrees to add spatially varying aberrations to the prototype.

Fig. 3. Photograph of implemented prototype of holographic display with spatially varying aberrations.

The aberrations are visualized by a captured PSF grid, as shown in Fig. 4. A grid image of $1024 \times 1024$ pixels is used, and $7 \times 7$ points are sampled at regular intervals. The $7 \times 7$ points were sufficient to model the aberrations over the entire field through interpolation of the measured Zernike coefficients, as explained in Section 2.A. Here, the reconstruction plane is located 51 mm behind the conjugate plane of the SLM. In this plane, the defocus error for the point marked with a white arrow in Fig. 4 is minimal.

Fig. 4. Captured PSF grid of $7 \times 7$ points before compensating for spatially varying aberrations. The white arrow denotes the reference point where the defocus error is minimized.

For each point in the grid, the coefficients of the Zernike polynomials are found by experimental calibration. The coefficients are manually adjusted so that the kernel corresponding to each grid point reconstructs a sharp PSF. During the experiments, the kernel size is set to $601 \times 601$. The size might be adjusted depending on the propagation distance, diffraction angle, or severity of aberrations. Here, the pixel pitch is set to 7.2 µm. Exemplary kernels are shown in Fig. 5. The left, middle, and right kernels in the figure correspond to the upper left corner, upper right corner, and center points of the $7 \times 7$ grid, respectively. In addition, Zernike polynomials of Wyant index 3 to 7, which are associated with defocus, astigmatism, and coma aberrations, are used to approximate the wavefront deformation. Therefore, five coefficients per grid point are obtained through the measurement. Overall, these coefficients are interpolated to generate five coefficient maps, each with $1024 \times 1024\;{\rm pixels}$.
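To illustrate the last step, the snippet below interpolates the five measured coefficient grids into dense maps. Bicubic interpolation via SciPy and the random placeholder values are our assumptions; the paper only states that the measured coefficients are interpolated to $1024 \times 1024$ maps.

```python
import numpy as np
from scipy.ndimage import zoom

measured = np.random.rand(5, 7, 7)   # placeholder: 5 coefficients (Wyant 3-7) on the 7 x 7 grid
coeff_maps = np.stack([zoom(grid, 1024 / 7, order=3) for grid in measured])  # -> (5, 1024, 1024)
```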

Fig. 5. Exemplary kernels for aberration compensation.

With the coefficient maps, holograms ${h_{\textit{sv}}}$ for the training dataset are generated via the point-wise integration method. The method requires enormous computation loads, so the pipeline is parallelized using the pyCUDA library [30]. It took about 95.14 s to create each hologram of $1024 \times 1024\;{\rm pixels}$ with pyCUDA, averaged over 100 trials. A PC equipped with an NVIDIA Tesla P40 GPU with 24 GB of memory was used for this generation. For target images, we use the DIV2K dataset of 800 training images [31].

As shown in Fig. 2, the propagator consists of a spatially invariant convolution operation and forward computation of a deep neural network. The phase delay ${\phi _P}$ of the transfer function ${T_d}$ is represented using Zernike polynomials of Wyant index 1 to 12, and therefore 12 coefficients are treated as optimization variables. All other variables are included in the neural network ${{\cal N}_P}({\omega _P})$. Although a single spatially invariant convolution is applied in this paper, multiple convolutions may be added before the network computation depending on the degree of spatial variance of the aberrations. In this case, the number of optimization variables increases in proportion to the number of convolutions, and the resultant images are concatenated to be used as the input to the network.

Figure 6 shows the network structure in the propagator; it is designed based on U-Net [32], which is known to be effective in image-to-image translation. In the figure, the numbers in parentheses for each layer indicate the height, width, and number of channels. The intermediate image $\tilde I$ goes through several “ConvBn,” “DownConv,” and “UpConv” operations to reconstruct an image $\hat I$. The ConvBn operation comprises a two-dimensional convolution, batch normalization, and nonlinear activation by a leaky rectified linear unit (ReLU) function with a negative slope of 0.3. The kernel size, strides, and zero-padding of the convolution are denoted by K, S, and P, respectively. For the downsampling executed by the DownConv block, a ConvBn operation with kernel size 3, strides 2, and no padding is used. After the downsampling, four consecutive ConvBn operations are performed. In the UpConv operation, bilinear interpolation is utilized for upsampling. The resized layer is concatenated with the previous layer of the same resolution, and the concatenated layer undergoes four ConvBn operations. Last, the reconstructed image $\hat I$ is generated by a convolution with a kernel of size 1 followed by a hyperbolic tangent function.
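The Keras fragment below sketches the ConvBn, DownConv, and UpConv blocks and a shallow version of the encoder-decoder in Fig. 6. Channel counts and the number of scales are placeholders, and 'same' padding is used throughout for simplicity, although the DownConv block described above uses no padding.

```python
import tensorflow as tf
from tensorflow.keras import layers

def conv_bn(x, ch, k=3, s=1):
    """ConvBn: 2D convolution + batch normalization + leaky ReLU (default negative slope 0.3)."""
    x = layers.Conv2D(ch, k, strides=s, padding="same")(x)
    x = layers.BatchNormalization()(x)
    return layers.LeakyReLU()(x)

def down_conv(x, ch):
    """DownConv: stride-2 ConvBn for downsampling, followed by four ConvBn operations."""
    x = conv_bn(x, ch, s=2)
    for _ in range(4):
        x = conv_bn(x, ch)
    return x

def up_conv(x, skip, ch):
    """UpConv: bilinear upsampling, concatenation with the skip connection, then four ConvBn."""
    x = layers.UpSampling2D(interpolation="bilinear")(x)
    x = layers.Concatenate()([x, skip])
    for _ in range(4):
        x = conv_bn(x, ch)
    return x

inp = layers.Input(shape=(1024, 1024, 1))          # intermediate intensity image tilde-I
e1 = conv_bn(inp, 16)
e2 = down_conv(e1, 32)
e3 = down_conv(e2, 64)
d2 = up_conv(e3, e2, 32)
d1 = up_conv(d2, e1, 16)
out = layers.Conv2D(1, 1, activation="tanh")(d1)   # 1 x 1 convolution + hyperbolic tangent
propagator_net = tf.keras.Model(inp, out)
```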

Fig. 6. Illustration of the neural network used in the proposed propagator: (a) network architecture; (b) structures of ConvBn block, DownConv block, and UpConv block used in the network.

We used the TensorFlow library [33] to optimize the propagator variables based on the gradient-descent approach. The Adam optimizer [34] was utilized, and the optimization was performed for 100 epochs with a learning rate of 0.0004 and a batch size of 3. The optimization took about a day. As shown in Fig. 7, the optimized propagator could reconstruct the image $\hat I$ with details well preserved compared to $\tilde I$, which is generated by the spatially invariant convolution alone.
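A generic Adam training step for this stage could look like the following sketch; `forward_fn` and `loss_fn` stand for the propagator forward pass and the loss of Eq. (7), and `variables` collects the network weights together with the 12 Zernike coefficients of ${\phi _P}$. The same pattern can be reused for the generator stage described next by passing only the generator's variables.

```python
import tensorflow as tf

def make_train_step(forward_fn, loss_fn, variables, learning_rate=4e-4):
    """Build one Adam gradient step over the given variables (placeholder callables assumed)."""
    optimizer = tf.keras.optimizers.Adam(learning_rate=learning_rate)

    @tf.function
    def step(h_sv, I_target):
        with tf.GradientTape() as tape:
            I_hat = forward_fn(h_sv)            # spatially invariant convolution + network
            loss = loss_fn(I_target, I_hat)     # e.g., Eq. (7)
        grads = tape.gradient(loss, variables)
        optimizer.apply_gradients(zip(grads, variables))
        return loss

    return step
```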

Fig. 7. Examples of reconstruction image $\hat I$ and intermediate image $\tilde I$ given an aberration-compensating hologram ${h_{\textit{sv}}}$.

The generator $G({\phi _G},{\omega _G})$ is implemented similarly to the propagator. It also consists of a spatially invariant convolution with an adjustable phase delay and a neural network. As in the propagator, several additional convolutions can be allocated before the neural network, and the resultant holograms $\widetilde {{h_{\textit{sv}}}}$ are concatenated before going into the network. We adopt Zernike polynomials of Wyant index 1 to 12 to model the phase delay ${\phi _G}$. Furthermore, the network ${{\cal N}_G}$ is also designed similarly to the network shown in Fig. 6, except that the shapes of both the input and output tensors are changed to (1024, 1024, 2). Parameter optimization for the generator was conducted for 100 epochs with a learning rate of 0.0001 and a batch size of 2. The parameters of the propagator remained fixed during optimization, and it took about a day to reach convergence. The computation time to render a hologram of $1024 \times 1024\;{\rm pixels}$ via the optimized generator was about 0.13 s, averaged over 100 trials.

Hologram examples rendered using the generator are provided in Fig. 8. The intermediate hologram $\widetilde {{h_{\textit{sv}}}}$ was generated via the spatially invariant convolution with optimized ${\phi _G}$, and the trained network converted it into the aberration-compensating hologram $\widehat {{h_{\textit{sv}}}}$. With $\widehat {{h_{\textit{sv}}}}$, the optimized propagator reconstructed the image $\hat I$ in the target plane with well-preserved details.

Fig. 8. Examples of intermediate hologram $\widetilde {{h_{\textit{sv}}}}$, final hologram $\widehat {{h_{\textit{sv}}}}$ generated using the proposed generator, and reconstructed image $\hat I$ from $\widehat {{h_{\textit{sv}}}}$ using the proposed propagator given a target image $I$.

Besides numerical reconstruction, optical reconstruction must be checked to validate the proposed method. Since our method generates a complex hologram of $1024 \times 1024\;{\rm pixels}$, it should be scaled for the 4K phase-only SLM. Therefore, we first pad the original hologram to generate a 2K hologram and transform it to a 4K hologram via nearest-neighbor interpolation. For phase-only conversion, the DPAC method is used [4].
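The scaling and phase-only conversion can be sketched as follows. The checkerboard variant of DPAC and the ordering of padding, upscaling, and encoding shown here are our assumptions; the paper follows the DPAC of Maimone et al. [4], which may differ in detail.

```python
import numpy as np

def dpac(h):
    """Double phase-amplitude coding: encode a complex field as a phase-only pattern."""
    amp = np.abs(h) / (np.abs(h).max() + 1e-12)
    phase = np.angle(h)
    offset = np.arccos(np.clip(amp, 0.0, 1.0))               # per-pixel phase offset
    phi_a, phi_b = phase + offset, phase - offset             # two phase samples per complex value
    checker = (np.indices(h.shape).sum(axis=0) % 2).astype(bool)
    return np.where(checker, phi_a, phi_b)                    # interleave on a checkerboard

def to_4k_phase(h):
    """Pad 1024x1024 -> 1920x1080, nearest-neighbor upscale x2 -> 3840x2160, then apply DPAC."""
    pad_h = (1080 - h.shape[0]) // 2
    pad_w = (1920 - h.shape[1]) // 2
    h_2k = np.pad(h, ((pad_h, pad_h), (pad_w, pad_w)))        # zero padding to 2K
    h_4k = np.repeat(np.repeat(h_2k, 2, axis=0), 2, axis=1)   # nearest-neighbor interpolation
    return dpac(h_4k)
```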

The experimental results of optical reconstruction are shown in Fig. 9. For comparison, we captured an image in the target plane after compensation in a spatially invariant manner. The hologram for the spatially invariant compensation was generated according to Eq. (3), with all coefficients ${c_j}(\overrightarrow {{p_o}})$ set to the coefficients measured at the center point. As shown in the figure, details of the image after spatially invariant compensation appear degraded everywhere except at the center. In contrast, the result of the proposed method shows clearer details.

Fig. 9. Experimental results of optical reconstruction with spatially invariant compensation using the kernel for the center point and with the proposed method.

4. DISCUSSION

Although the proposed method is verified to improve image quality and preserve more image details than spatially invariant compensation, the noise in the result still cannot be ignored. One of the main reasons is the error introduced by the phase-only conversion of the complex hologram. Previous works have reported that the DPAC method lowers contrast in the reconstructed image and induces ringing artifacts near edges [22,35]. Recently, several works have proposed stochastic gradient-descent-based optimization methods to reduce the errors caused by phase-only conversion and thus improve image quality [22,35,36]. In particular, Peng et al. [22] proposed to use neural networks for real-time rendering of phase-only holograms, including a network for phase-only conversion in the rendering pipeline. Inspired by this, we plan to incorporate a separate network for conversion within the generator in future work.

5. CONCLUSION

In conclusion, we proposed a hologram generation technique to compensate for spatially varying aberrations. Our technique exploits a deep neural network to replace the point-wise integration method, which is the conventional compensation method but requires huge computation loads. The parameters of our algorithm were optimized on the DIV2K dataset to cover various images, and the optimized algorithm could compute a hologram of $1024 \times 1024\;{\rm pixels}$ in about 0.13 s, while the conventional method took about 95.14 s. Experimental results also verified the aberration-compensation performance of our technique. We believe that the proposed method can inspire future works that compensate for spatially varying aberrations while achieving superior image quality and real-time calculation.

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

REFERENCES

1. C. Chang, K. Bang, G. Wetzstein, B. Lee, and L. Gao, “Toward the next-generation VR/AR optics: a review of holographic near-eye displays from a human-centric perspective,” Optica 7, 1563–1578 (2020). [CrossRef]  

2. K. Bang, C. Jang, and B. Lee, “Compact noise-filtering volume gratings for holographic displays,” Opt. Lett. 44, 2133–2136 (2019). [CrossRef]  

3. C. Jang, K. Bang, G. Li, and B. Lee, “Holographic near-eye display with expanded eye-box,” ACM Trans. Graph. 37, 195 (2018). [CrossRef]  

4. A. Maimone, A. Georgiou, and J. S. Kollin, “Holographic near-eye displays for virtual and augmented reality,” ACM Trans. Graph. 36, 85 (2017). [CrossRef]  

5. G.-A. Koulieris, B. Bui, M. S. Banks, and G. Drettakis, “Accommodation and comfort in head-mounted displays,” ACM Trans. Graph. 36, 87 (2017). [CrossRef]  

6. M. Lambooij, M. Fortuin, I. Heynderickx, and W. IJsselsteijn, “Visual discomfort and visual fatigue of stereoscopic displays: a review,” J. Imaging Sci. Technol. 53, 030201 (2009). [CrossRef]  

7. R. T. Held, E. A. Cooper, and M. S. Banks, “Blur and disparity are complementary cues to depth,” Current Biol. 22, 426–431 (2012). [CrossRef]  

8. D. Yoo, S. Lee, Y. Jo, J. Cho, S. Choi, and B. Lee, “Volumetric head-mounted display with locally adaptive focal blocks,” IEEE Trans. Vis. Comput. Graph. 28, 1415–1427 (2020). [CrossRef]  

9. Y. Jo, S. Lee, D. Yoo, S. Choi, D. Kim, and B. Lee, “Tomographic projector: large scale volumetric display with uniform viewing experiences,” ACM Trans. Graph. 38, 215 (2019). [CrossRef]  

10. T. Zhan, Y.-H. Lee, and S.-T. Wu, “High-resolution additive light field near-eye display by switchable Pancharatnam–Berry phase lenses,” Opt. Express 26, 4863–4872 (2018). [CrossRef]  

11. Y. Jo, K. Bang, D. Yoo, B. Lee, and B. Lee, “Ultrahigh-definition volumetric light field projection,” Opt. Lett. 46, 4212–4215 (2021). [CrossRef]  

12. S. Lee, C. Jang, S. Moon, J. Cho, and B. Lee, “Additive light field displays: realization of augmented reality with holographic optical elements,” ACM Trans. Graph. 35, 60 (2016). [CrossRef]  

13. D. Kim, S.-W. Nam, K. Bang, B. Lee, S. Lee, Y. Jeong, J.-M. Seo, and B. Lee, “Vision-correcting holographic display: evaluation of aberration correcting hologram,” Biomed. Opt. Express 12, 5179–5195 (2021). [CrossRef]  

14. T. Haist, A. Peter, and W. Osten, “Holographic projection with field-dependent aberration correction,” Opt. Express 23, 5590–5595 (2015). [CrossRef]  

15. A. Kaczorowski, G. S. Gordon, and T. D. Wilkinson, “Adaptive, spatially-varying aberration correction for real-time holographic projectors,” Opt. Express 24, 15742–15756 (2016). [CrossRef]  

16. S.-W. Nam, S. Moon, B. Lee, D. Kim, S. Lee, C.-K. Lee, and B. Lee, “Aberration-corrected full-color holographic augmented reality near-eye display using a Pancharatnam-Berry phase lens,” Opt. Express 28, 30836–30850 (2020). [CrossRef]  

17. D. Yoo, J. Cho, J. Lee, M. Chae, B. Lee, and B. Lee, “FinsNet: end-to-end separation of overlapped fingerprints using deep learning,” IEEE Access 8, 209020 (2020). [CrossRef]  

18. Z. Xu, S. Zuo, E. Y. Lam, B. Lee, and N. Chen, “AutoSegNet: an automated neural network for image segmentation,” IEEE Access 8, 92452–92461 (2020). [CrossRef]  

19. E. Tseng, A. Mosleh, F. Mannan, K. St-Arnaud, A. Sharma, Y. Peng, A. Braun, D. Nowrouzezahrai, J.-F. Lalonde, and F. Heide, “Differentiable compound optics and processing pipeline optimization for end-to-end camera design,” ACM Trans. Graph. 40, 18 (2021). [CrossRef]  

20. J. Lee, J. Jeong, J. Cho, D. Yoo, B. Lee, and B. Lee, “Deep neural network for multi-depth hologram generation and its training strategy,” Opt. Express 28, 27137–27154 (2020). [CrossRef]  

21. L. Shi, B. Li, C. Kim, P. Kellnhofer, and W. Matusik, “Towards real-time photorealistic 3d holography with deep neural networks,” Nature 591, 234–239 (2021). [CrossRef]  

22. Y. Peng, S. Choi, N. Padmanaban, and G. Wetzstein, “Neural holography with camera-in-the-loop training,” ACM Trans. Graph. 39, 185 (2020). [CrossRef]  

23. P. Chakravarthula, E. Tseng, T. Srivastava, H. Fuchs, and F. Heide, “Learned hardware-in-the-loop phase retrieval for holographic near-eye displays,” ACM Trans. Graph. 39, 186 (2020). [CrossRef]  

24. K. Matsushima and T. Shimobaba, “Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields,” Opt. Express 17, 19662–19673 (2009). [CrossRef]  

25. V. Sitzmann, J. Martel, A. Bergman, D. Lindell, and G. Wetzstein, “Implicit neural representations with periodic activation functions,” in Advances in Neural Information Processing Systems (2020), Vol. 33.

26. T. Takikawa, J. Litalien, K. Yin, K. Kreis, C. Loop, D. Nowrouzezahrai, A. Jacobson, M. McGuire, and S. Fidler, “Neural geometric level of detail: real-time rendering with implicit 3D shapes,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2021), pp. 11358–11367.

27. J. Johnson, A. Alahi, and L. Fei-Fei, “Perceptual losses for real-time style transfer and super-resolution,” in European Conference on Computer Vision (Springer, 2016), pp. 694–711.

28. R. Zhang, P. Isola, A. A. Efros, E. Shechtman, and O. Wang, “The unreasonable effectiveness of deep features as a perceptual metric,” in IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2018), pp. 586–595.

29. K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv preprint arXiv:1409.1556 (2014).

30. A. Klöckner, N. Pinto, Y. Lee, B. Catanzaro, P. Ivanov, and A. Fasih, “PyCUDA and PyOpenCL: a scripting-based approach to GPU run-time code generation,” Parallel Comput. 38, 157–174 (2012). [CrossRef]  

31. E. Agustsson and R. Timofte, “NTIRE 2017 challenge on single image super-resolution: dataset and study,” in IEEE Conference on Computer Vision and Pattern Recognition Workshops (IEEE, 2017), pp. 126–135.

32. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” in Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention (Springer, 2015), pp. 234–241.

33. M. Abadi, A. Agarwal, P. Barham, et al., “TensorFlow: large-scale machine learning on heterogeneous systems,” 2015, https://tensorflow.org.

34. D. P. Kingma and J. A. Ba, “Adam: a method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

35. P. Chakravarthula, Y. Peng, J. Kollin, H. Fuchs, and F. Heide, “Wirtinger holography for near-eye displays,” ACM Trans. Graph. 38, 213 (2019). [CrossRef]  

36. D. Yoo, Y. Jo, S.-W. Nam, C. Chen, and B. Lee, “Optimization of computer-generated holograms featuring phase randomness control,” Opt. Lett. 46, 4769–4772 (2021). [CrossRef]  
