
Augmented reality three-dimensional visualization with multifocus sensing

Open Access

Abstract

In augmented reality displays, digital information can be integrated with real-world scenes. We present an augmented reality-based approach for three-dimensional optical visualization and depth map retrieval of a scene using multifocus sensing. From a sequence of images captured with different focusing distances, all-in-focus image reconstruction can be performed along with different point of view synthesis. By means of an algorithm that compares the all-in-focus image reconstruction with each image of the z-stack, the depth map of the scene can also be retrieved. Once the three-dimensional reconstructed scene for different points of view along with its depth map is obtained, it can be optically displayed in smart glasses allowing the user to visualize the real three-dimensional scene along with synthesized perspectives of it and provide information such as depth maps of the scene, which are not possible with conventional augmented reality devices. To the best of our knowledge, this is the first report on combining multifocus sensing and three-dimensional visualization and depth retrieval for applications to augmented reality.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Augmented Reality (AR) devices allow for the simultaneous visualization of a real-world three-dimensional (3D) scene and optically overlaid digital information by means of see-through architectures [1–3].

Promising applications of AR can be found in surgery, ophthalmology, cell visualization, and industry, among others [4,5]. In surgery [6], for example, the increasing use of preoperative data has created the need for visualization technology suitable for overlaying information onto the surgical environment. AR has also proved useful in ophthalmology [7], in particular as a low vision aid [8], by superimposing a pseudocolor wireframe onto real-world scenes for patients in advanced stages of retinal disease. Applications of AR can also be found in the automatic identification and visualization of cells [9], which may help healthcare workers rapidly extract relevant information for diagnosis. In industry, AR devices have also been applied to different problems in product assembly [10].

Information provided to the AR user is usually location itself or information connected to location (weather, altitude, local time), whereas 3D information about the scene the user is viewing must be provided by on-site sensing.

Since the numerical aperture of the single integrated camera of AR devices is usually low, real-world scenes are imaged with a large depth of field. As a result, objects are captured in focus as in a single-point-of-view, pinhole-like projection, and depth information is lost. Incorporating an external device with 3D sensing capability is therefore a practical option.

The addition of a 3D imaging system to commercially available see-through AR glasses is explored with an active illumination method in [11], where a scanning depth sensor based on structured light projection is presented. In [12], axially distributed sensing followed by scene reconstruction and feature extraction was applied for object recognition, and the result was optically displayed in smart glasses for visualization of 3D information superimposed on the real-world scene.

Alternatively, a passive illumination method based on multifocus sensing [13–15] relies on capturing a scene while the system focuses at different distances from the camera, with the possibility of retrieving 3D information and synthesizing novel images [16] from the acquired stack. In [15] an optical system incorporating an electrically focus-tunable lens is used to sweep a scene in depth, while Laplacian of Gaussian filtering is used as a focus measure to compute depth from focus. Other focus measures, such as those related to other high-frequency content of the scene [13], can also be applied to retrieve depth information from a multifocus stack.

In the present paper we propose the use of multifocus sensing for depth map retrieval and visualization in a smart-glasses AR device. Multifocus sensing is performed with an extra camera equipped with a doublet formed by a fixed focal length lens and an electrically focus-tunable lens. These elements are held together and attached to the smart glasses by means of 3D printed parts. The multifocus stack thus obtained is processed off-line in the Fourier domain to obtain the all-in-focus reconstruction of the scene, the retrieved depth map, and synthesized points of view, which are then passed on to the AR device so the user can visualize 3D information overlaid on the real-world scene.

The proposed method is therefore able to provide the AR user with information that is not usually available in conventional devices, and the multifocus system can potentially be integrated into smart glasses as a compact sensing module with no moving parts or misalignment problems. Moreover, by avoiding active illumination, the sensing system is more robust to environmental conditions such as object contrast or variations in illumination. The integration of multifocus sensing with AR display might also be extended to multifocus microscopy [17,18], allowing the augmented reality device to display information related to the three-dimensional structure of cell aggregates [19,20].

2. Multifocus sensing

The optical setup for multifocus sensing (Fig. 1(a)) consists of a doublet formed by an Electrically Focus-Tunable Lens (EFTL) (Optotune EL-10-30) attached directly in front of the fixed focal length lens (f/1.6, focal length $6.0$ mm) of a camera (Basler series, CMOS sensor, $1920\times 1080$ px) focusing at infinity. By means of 3D printed parts (Fig. 1(b)), the pieces of the multifocus module are held together and attached in turn to the AR device (Google Glass).


Fig. 1. (a) Optical setup with AR device (assembly direction shown by the green arrow); (1) EFTL, (2) fixed focal length lens, (3) camera. (b) 3D printed part holding the fixed focal length lens and EFTL together (complete 3D printing files are available in Code 1 [22]). (c) Points lying within the Depth of Field (DoF) corresponding to the focal distance of the system (DoF in light blue) are imaged in focus at the camera sensor (light-blue rays), while points outside the DoF (orange rays) are imaged as blur circles.


Since the device incorporates an EFTL, the focusing distance can be varied with no mechanical moving parts involved in the multifocus sensing process. Images can therefore be captured without lateral displacement between them, and misregistration in the focus stack is avoided. On the other hand, small magnification changes may appear along the stack because the chief ray for a given point of the scene may not stay exactly the same, since the lenses in the doublet may not be perfectly in contact. This can be corrected using a calibration sequence. Following the usual linear fit between magnification and current through the EFTL, center-of-mass fitting of the multifocus stack of a calibration object allows us to obtain calibration parameters that can then be applied to any image sequence for field of view correction along the z-stack.
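As an illustration of this calibration step, the following minimal Python sketch (not the implementation used for the results in this paper) assumes a grayscale calibration stack containing a single off-axis marker: the distance of the marker centroid from the image center is fitted linearly against the EFTL current, and the fitted model is then used to rescale every frame of a z-stack to the field of view of a reference current.

# Hedged sketch of the magnification calibration; function names and the
# single-marker assumption are illustrative only.
import numpy as np
from scipy.ndimage import center_of_mass, affine_transform

def fit_magnification(cal_stack, currents):
    """Fit r(j) = a*j + b, where r is the calibration marker centroid's
    distance (in pixels) from the image center of each frame."""
    h, w = cal_stack.shape[1:]
    center = np.array([(h - 1) / 2, (w - 1) / 2])
    radii = [np.linalg.norm(np.array(center_of_mass(f)) - center) for f in cal_stack]
    a, b = np.polyfit(currents, radii, 1)      # linear magnification-vs-current model
    return a, b

def rescale_about_center(frame, scale):
    """Rescale a 2D image about its center, keeping the original shape."""
    h, w = frame.shape
    center = np.array([(h - 1) / 2, (w - 1) / 2])
    matrix = np.eye(2) / scale                 # maps output coords -> input coords
    offset = center - matrix @ center
    return affine_transform(frame, matrix, offset=offset, order=1)

def correct_stack(stack, currents, a, b, j_ref):
    """Bring every frame to the field of view of the reference current j_ref."""
    r_ref = a * j_ref + b
    return np.stack([rescale_about_center(f, r_ref / (a * j + b))
                     for f, j in zip(stack, currents)])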

In each image of a given stack, part of the scene is imaged in focus while the rest is out of focus (see Fig. 1(c)): points at the in-focus plane of the EFTL are imaged as points on the camera sensor (i.e., the in-focus plane of the EFTL is conjugate to the camera sensor plane). On the other hand, points outside the in-focus plane (or, more precisely, outside the Depth of Field associated with a given focusing distance) are imaged as blur circles, contributing to the out-of-focus part of the scene.


Table 1. Summary of parameters in our experiments and simulations.

The maximum focusing range of the resulting system is approximately $4.2-17$ cm. This range can easily be extended by means of an offset negative lens attached to the EFTL. Taking into account the finite depth of field at each focusing distance, which varies from $1$ mm at $4.2$ cm to $5$ mm at $17$ cm, we can consider for a given focal range an optimal focal sweep [14,21] resulting in $N$ images without gaps or overlaps between the depths of field of consecutive focal planes.
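As a rough illustration of how such a sweep can be planned, the following sketch (illustrative only; the DoF model is a simple linear interpolation between the two endpoints quoted above) places focal planes greedily so that consecutive depth-of-field intervals tile the range without gaps or overlaps. The exact number of planes depends on the measured DoF curve.

# Hedged sketch: tile a focusing range with depth-of-field intervals.
import numpy as np

def dof(z_cm):
    """Approximate depth of field (cm) at focusing distance z (cm), interpolated
    between the quoted endpoints (1 mm at 4.2 cm, 5 mm at 17 cm)."""
    return np.interp(z_cm, [4.2, 17.0], [0.1, 0.5])

def focal_sweep(z_min, z_max):
    """Greedy tiling: each plane's DoF interval starts where the previous one ends."""
    planes, edge = [], z_min
    while edge < z_max:
        z = edge + dof(edge) / 2       # plane approximately centered on its DoF interval
        planes.append(z)
        edge = z + dof(z) / 2          # far edge of this plane's DoF
    return np.array(planes)

planes = focal_sweep(7.5, 15.0)        # focal planes for a 7.5-15 cm scene
print(len(planes), planes.round(2))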

As a proof of principle of our proposal we consider the multifocus sensing of a scene close to the AR device (Fig. 2(a)) consisting of three circular objects at 7.5, 10.5, and 12.5 cm and a background at 15 cm from the sensing device. Figures 2($b_{1-4}$) show the images captured at the above-mentioned distances (the complete stack of $N=15$ images covering the $7.5-15$ cm range is available in Visualization 1). Experimental parameters are summarized in Table 1.


Fig. 2. (a) Scene to be visualized. ($b_{1-4}$) Registered images with the system focusing at 7.5, 10.5, 12.5 and 15cm, respectively (see Visualization 1 for the complete registered stack).


3. All-in-focus reconstruction and depth map visualization

3.1 All-in-focus reconstruction

The images captured after multifocus sensing can be processed offline with a Fourier domain method. Synthesis of images with novel characteristics [21], such as the all-in-focus reconstruction, is accomplished by considering the following model.

Let $i_k$ be the intensity distribution of the $k$-th image of the stack ($k=1,\ldots,N$, for color images in $RGB$ space $i_k=\left (i_k^R,i_k^G,i_k^B\right )$) which corresponds to the system focusing at a distance $z=z_k$ from the EFTL (or equivalently, to the current $j=j_k$ through the EFTL) and can be described with the following equation:

$$i_k(x,y)=f_k(x,y)+\sum_{k'\neq k}h_{kk'}(x,y)\ast f_{k'}(x,y),$$
where $f_k$ is the in-focus region of $i_k$. Out-of-focus contributions in $i_k$ come from the 2D convolution between $f_{k'}$ (in-focus part of the $k'$-th image of the stack: $i_{k'}$) and the 2D intensity PSF $h_{kk'}(x,y)$ associated with the currents $j_k$ and $j_{k'}$:
$$h_{kk'}(x,y)=\frac{1}{\pi r_{kk'}^2}\,\mathrm{circ}\left(\frac{\sqrt{x^2+y^2}}{r_{kk'}}\right),$$
where
$$\frac{r_{kk'}}{p}=R_0\left| j_k - j_{k'} \right|$$
and
$$R_0= \frac{R\alpha}{p},$$
where $R$ is the radius of the aperture of the optical system, $\alpha$ is the proportionality constant that relates lateral magnification to the current through the EFTL, and $p$ is the pixel pitch of the camera sensor. It is possible to obtain the effective parameter $R_0$ directly from $R$, $\alpha$ and $p$. However, it can also be easily estimated from Eq. (3) by measuring, in the image obtained at current $j=j_k$, the radius (in pixels) of the blur circle of a point-like source placed at the distance at which the system focuses under current $j=j_{k'}$ through the EFTL. The latter is the approach followed in the present work. For the stacks considered, $R_0\approx 0.5$ mA$^{-1}$.
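For reference, the PSF model of Eqs. (2)-(4) can be rasterized directly on the pixel grid. The following sketch (illustrative, not the implementation used in this work) builds the set of transfer functions $H_{kk'}(u,v)$ used below from the EFTL currents and the effective parameter $R_0$; for full-resolution sensors this stack is large, so in practice the transfer functions may be computed on the fly rather than stored.

# Hedged sketch: pillbox PSFs of Eq. (2) with radii given by Eq. (3),
# transformed to transfer functions on a discrete pixel grid.
import numpy as np

def pillbox_otf(radius_px, shape):
    """Transfer function (FFT of a unit-energy disk PSF) for a blur radius in pixels."""
    h, w = shape
    yy, xx = np.meshgrid(np.arange(h) - h // 2, np.arange(w) - w // 2, indexing="ij")
    if radius_px < 0.5:                        # in-focus case: delta-like PSF
        psf = np.zeros(shape)
        psf[h // 2, w // 2] = 1.0
    else:
        psf = (xx**2 + yy**2 <= radius_px**2).astype(float)
        psf /= psf.sum()
    return np.fft.fft2(np.fft.ifftshift(psf))  # centered PSF -> transfer function

def build_H(currents, R0, shape):
    """H[k, k'] is the transfer function associated with currents j_k, j_k' (Eq. 3)."""
    N = len(currents)
    H = np.empty((N, N) + shape, dtype=complex)
    for k in range(N):
        for kp in range(N):
            r = R0 * abs(currents[k] - currents[kp])   # blur radius in pixels
            H[k, kp] = pillbox_otf(r, shape)
    return H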

All-in-focus reconstruction (virtual pinhole camera view) $s(x,y)$ of the scene corresponds to the sum of in-focus contributions from the images of the stack:

$$s(x,y)=\sum_{k=1}^{N}f_k(x,y).$$

We will now consider the Fourier Transform ($\mathcal {F}$) of the set of $N$ coupled equations in Eq. (1) which reads in vector form as:

$$\vec{I}(u,v)=H(u,v)\vec{F}(u,v),$$
where $(u,v)$ are spatial frequencies and $N$-element column vectors $\vec {I}$ , $\vec {F}$ and $N\times N$ symmetric matrix $H$ are given by [23]:
$$\begin{aligned}\vec{I}(u,v) &= \left( \begin{array}{c} I_1(u,v)\\ I_2(u,v)\\ \vdots\\ I_N(u,v) \end{array} \right),\quad \vec{F}(u,v) = \left( \begin{array}{c} F_1(u,v)\\ F_2(u,v)\\ \vdots\\ F_N(u,v) \end{array} \right),\\ H(u,v) &= \left( \begin{array}{cccc} 1 & H_{12}(u,v) & \ldots & H_{1N}(u,v)\\ H_{12}(u,v) & 1 & & \vdots\\ \vdots & & \ddots & H_{N-1 N}(u,v)\\ H_{1N}(u,v) & \ldots & H_{N-1 N}(u,v) & 1 \end{array} \right)\end{aligned}$$
where each element $H_{kk'}$ of the symmetric matrix $H$ is an Optical Transfer Function: $H_{kk'}=\mathcal {F}\{h_{kk'}\}$, while $I_k=\mathcal {F}\{i_{k}\}$ and $F_k=\mathcal {F}\{f_{k}\}$. If $H(u,v)$ is invertible, then the solution to the linear system given by Eq. (6) is $\vec {F}(u,v)=H^{-1}(u,v)\vec {I}(u,v)$. On the other hand, when $H(u,v)$ is not invertible (in particular for the D.C. component, as discussed below), a solution to the system may be found through the Moore-Penrose pseudo-inverse $H^{\dagger}$ [24], which selects, among the vectors that minimize the Euclidean norm $\left \| H(u,v)\vec {F}(u,v)- \vec {I}(u,v)\right \|$ in the least-squares sense, the one of minimal norm. Thus, the minimal norm vector estimated by the Moore-Penrose pseudo-inverse is given by
$$\vec{F}^{MP}(u,v)=H^\dagger(u,v)\vec{I}(u,v),$$
and the all-in-focus reconstruction can be retrieved in the frequency domain as
$$S(u,v)=\sum_{k=1}^{N}F_k^{MP}(u,v),$$
while the all-in-focus image is obtained by inverse Fourier transforming the previous result to the spatial domain:
$$s(x,y)=\mathcal{F}^{{-}1}\{S(u,v)\}.$$

Notice that the previous result is obtained without segmentation of the focused regions from each image of the stack, which usually introduces some inaccuracies in the reconstruction. The all-in-focus reconstruction from the multifocus stack in Fig. 2($b_{1-4}$) is depicted in Fig. 3($a_{1}$). The result is visualized in the AR device as shown in Fig. 3($a_{2}$) (image taken with a Nikon camera fitted with an 18-55 mm lens focused on the display of the Google Glass to show how the scene is viewed by the AR device user).
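A compact way to carry out the reconstruction of Eqs. (6)-(10) numerically is to apply the pseudo-inverse frequency by frequency. The sketch below (an unoptimized illustrative implementation, assuming grayscale input and the build_H helper sketched above) returns both the all-in-focus image and the frequency-domain components $F_k^{MP}$, which are reused for depth-map retrieval in the next subsection.

# Hedged sketch of the Fourier-domain all-in-focus reconstruction (Eqs. 6-10).
import numpy as np

def all_in_focus(stack, currents, R0):
    """stack: (N, H, W) grayscale multifocus images (call once per RGB channel)."""
    N, h, w = stack.shape
    I = np.fft.fft2(stack, axes=(-2, -1)).reshape(N, -1)   # I_k(u,v), flattened
    H = build_H(currents, R0, (h, w)).reshape(N, N, -1)    # H_{kk'}(u,v)
    F = np.empty_like(I)
    for p in range(I.shape[1]):                            # loop over frequencies (u,v)
        F[:, p] = np.linalg.pinv(H[:, :, p]) @ I[:, p]     # Eq. (8), Moore-Penrose
    S = F.sum(axis=0).reshape(h, w)                        # Eq. (9)
    s = np.real(np.fft.ifft2(S))                           # Eq. (10)
    return s, F.reshape(N, h, w)                           # AIF image and F_k^MP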


Fig. 3. ($a_{1,2}$) All-in-focus reconstruction and visualization in AR device, respectively. ($b_{1,2}$) Retrieved Depth-map ($n=1$ and $D_0=4$ in Eq. (11), $B=25$ in Eq. (16), depth in cm) and visualization in AR device, respectively.


3.2 Depth map retrieval

In order to estimate the depth map of the scene we propose the use of an algorithm [25] that takes advantage of the all-in-focus image reconstruction scheme developed above and of a block-based, pixel-by-pixel comparison between the in-focus region of each image of the stack and the all-in-focus image. From this comparison the algorithm assigns to each pixel a depth taken from the set of focusing distances. The precision is therefore limited by the depth of field associated with each focusing distance.

To accomplish the comparison accurately we need to take into account that, while the computation of the all-in-focus image is well-conditioned, and so is the retrieval of the high-frequency components of the in-focus regions of each image, the recovery of the low-frequency components of each in-focus region is ill-conditioned. In particular, for the D.C. component (i.e., $(u,v) = (0,0)$), $H(0,0)$ is an all-ones, singular matrix, and the only condition on the in-focus regions is $S(0,0)=\sum _{k=1}^{N}F_k(0,0)$, which is an $(N-1)$-fold degenerate problem.

The proposed method to estimate the depth map is as follows. We start by considering a low-pass Butterworth filter [26] of order $n$:

$$G(u,v)=\dfrac{1}{1+\left(\frac{D(u,v)}{D_0}\right)^{2n}},$$
where distance to frequency origin is given by
$$D(u,v)=\sqrt{u^2+v^2},$$
and $D_0$ is the radius parameter in frequency space. Let $s^h$ be the result of high-pass filtering the all-in-focus reconstruction with $1-G$:
$$s^{h}=\mathcal{F}^{{-}1}\{(1-G)S\},$$
where $S$ is given by Eq. (9).

Since the high-frequency content of the in-focus regions can be correctly retrieved under the scheme developed above, we can use the high-pass filtered in-focus regions, which we define as:

$$f_k^{h}=\mathcal{F}^{{-}1}\{(1-G)F_k^{MP}\},$$
where $F_k^{MP}$ is the $k$-th component of the minimal norm vector given in Eq. (8):
$$F_k^{MP}(u,v)=\left(H^\dagger\vec{I}\right)_k(u,v).$$

Next we make a block-based, pixel-by-pixel comparison between $s^h$ given by Eq. (13) and the high-frequency in-focus regions given by Eq. (14), from which the stack number $k$ (and in turn the focusing distance associated with this number) for each pixel $(i,j)$ can be obtained:

$$k(i,j)=\arg \min _{k^{\prime}}\left\langle\left|f_{k^{\prime}}^{h}(i, j)-s^{h}(i, j)\right|^{2}\right\rangle_{B \times B},$$
where $\left |\cdot \right |^{2}$ is the square of the distance in RGB space between the images, $\left \langle \cdot \right \rangle _{B \times B}$ is a mean of this quantity over a square block of size $B$ and $\arg \min$ represents the selection of index $k$ in the stack for which the enclosed expression achieves its minimum.

The depth map obtained through Eq. (16) for the multifocus stack in Figs. 2($b_{1-4}$) and the corresponding all-in-focus image in Fig. 3($a_1$) is depicted in Fig. 3($b_1$) (in Eq. (11) we use $n=1$ to avoid ringing artifacts in the filtering and $D_0=4$ to ensure that the comparison retains most of the high-frequency content while excluding the D.C. component; $B=25$ in Eq. (16) smooths the resulting depth map). The result is visualized in the AR device as shown in Fig. 3($b_{2}$), where the user is provided with 3D information of the scene not available in conventional AR devices.
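The steps of Eqs. (11)-(16) map directly onto array operations. The following sketch (illustrative, assuming grayscale data and $D_0$ expressed in frequency-index units) filters the reconstructed components, averages the squared differences over $B\times B$ blocks, and takes the per-pixel argmin; here F_mp denotes the frequency-domain output of the reconstruction sketch above.

# Hedged sketch of the depth-map retrieval (Eqs. 11-16).
import numpy as np
from scipy.ndimage import uniform_filter

def butterworth_highpass(shape, D0=4, n=1):
    """1 - G(u,v), with G the low-pass Butterworth filter of Eq. (11)."""
    h, w = shape
    u = np.fft.fftfreq(h) * h                  # frequency indices, origin at (0,0)
    v = np.fft.fftfreq(w) * w
    D = np.hypot(*np.meshgrid(u, v, indexing="ij"))          # Eq. (12)
    return 1.0 - 1.0 / (1.0 + (D / D0) ** (2 * n))

def depth_map(F_mp, focus_distances, D0=4, n=1, B=25):
    """F_mp: (N, H, W) frequency-domain in-focus components F_k^MP."""
    N, h, w = F_mp.shape
    hp = butterworth_highpass((h, w), D0, n)
    S = F_mp.sum(axis=0)                                     # Eq. (9)
    s_h = np.real(np.fft.ifft2(hp * S))                      # Eq. (13)
    f_h = np.real(np.fft.ifft2(hp * F_mp, axes=(-2, -1)))    # Eq. (14)
    err = uniform_filter((f_h - s_h) ** 2, size=(1, B, B))   # block mean of Eq. (16)
    k = np.argmin(err, axis=0)                               # per-pixel stack index
    return np.asarray(focus_distances)[k]                    # index -> focusing distance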

4. Point of view synthesis

Besides the all-in-focus reconstruction and depth-map information, synthesis of novel viewpoints of the scene [21] can also be performed from the multifocus stack to provide the AR user with additional 3D information.

We start by considering the synthesis of an arbitrary horizontal viewpoint of the scene, which corresponds to simulating a baseline displacement $b_x$ of a virtual pinhole camera in the horizontal direction with respect to the center of the system’s pupil (and similarly a displacement $b_y$ in the vertical direction). The horizontal disparity $d_k$ between the images of a given point of the in-focus component $f_k$ as seen by a virtual pinhole camera at the center position and one displaced to the left is given by:

$$d_k =b_x\alpha j_k$$
(aside from a constant factor independent of $k$). Then, in a piecewise approximation of the 3D scene by planes, the novel viewpoint $s_{b_x}(x,y)$ is obtained by shifting each focus slice $f_k(x,y)$ by an amount determined by the disparity associated with the current $j_k$ through the EFTL and the baseline displacement $b_x$ of the virtual pinhole camera,
$$s_{b_x}(x,y)=\sum_{k=1}^{N}f_k\left(x-b_x\alpha j_k,y\right).$$

By means of the Fourier transform shift theorem that relates translation in space domain with linear phase shift in frequency domain [27], and by using Eq. (8) for $\vec {F}(u,v)$, we obtain the Fourier transform of Eq. (18):

$$S_{b_x}(u,v)=\sum_{k=1}^{N} e^{{-}j2\pi\alpha j_k\left(b_x u\right)}\left(H^\dagger(u,v)\vec{I}(u,v)\right)_k.$$

If we now consider $b_x$ as a fraction $\beta _x$ ($|\beta _x|\leq 1$) of the radius $R$ (displacements outside of the aperture have no physical meaning since it is not possible to synthesize points of view corresponding to rays not collected by the aperture):

$$b_x=\beta_x R,$$
by means of Eq. (4), Eq. (19) can then be rewritten as
$$S_{\beta_x}(u,v)=\sum_{k=1}^{N} e^{{-}j2\pi j_k R_0 \left(\beta_xp u\right)}\left(H^\dagger(u,v)\vec{I}(u,v)\right)_k.$$

We can generalize the previous equation to consider, in addition to the horizontal translation, a vertical translation as a fraction $\beta _y$ ($|\beta _y|\leq 1$) of the pupil radius $R$:

$$S_{(\beta_x,\beta_y)}(u,v)=\sum_{k=1}^{N} e^{{-}j2\pi j_k R_0 \left(\beta_x p u+\beta_y p v\right)}\left(H^\dagger(u,v)\vec{I}(u,v)\right)_k.$$

Finally, by inverse Fourier transforming Eq. (22) we obtain a new scene perspective as seen from a virtual pinhole camera translated a fraction $\beta _x$ to the left and $\beta _y$ upwards from the center of the original circular pupil:

$$s_{(\beta_x,\beta_y)}(x,y)=\mathcal{F}^{{-}1}\{S_{(\beta_x,\beta_y)}(u,v)\}$$
(note that $s_{(0,0)}(x,y)$ recovers the all-in-focus reconstruction in Eq. (10)).
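In practice the viewpoint synthesis reduces to applying a current-dependent linear phase ramp to each recovered component before summing. The sketch below (illustrative, reusing the F_mp array from the reconstruction sketch above) implements Eqs. (22)-(23), with the per-slice shift expressed directly in pixels as $R_0 j_k \beta$.

# Hedged sketch of point-of-view synthesis (Eqs. 22-23).
import numpy as np

def synthesize_view(F_mp, currents, R0, beta_x, beta_y):
    """F_mp: (N, H, W) frequency-domain in-focus components; |beta_x|, |beta_y| <= 1."""
    N, h, w = F_mp.shape
    fy = np.fft.fftfreq(h)[:, None]            # vertical spatial frequency, cycles/pixel
    fx = np.fft.fftfreq(w)[None, :]            # horizontal spatial frequency, cycles/pixel
    S = np.zeros((h, w), dtype=complex)
    for k in range(N):
        shift = R0 * currents[k]               # per-slice disparity scale, in pixels
        phase = np.exp(-2j * np.pi * shift * (beta_x * fx + beta_y * fy))
        S += phase * F_mp[k]                   # Eq. (22)
    return np.real(np.fft.ifft2(S))            # Eq. (23)

# Example: sweep the virtual pinhole horizontally across the pupil.
# frames = [synthesize_view(F_mp, currents, R0, bx, 0.0)
#           for bx in np.arange(-0.5, 0.55, 0.05)]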

Viewpoint synthesis from the stack in Fig. 4, simulating the perspective of a virtual pinhole displaced to the left ($(\beta _x,\beta _y)=(0.5,0)$), from above ($(\beta _x,\beta _y)=(0,0.5)$), from below ($(\beta _x,\beta _y)=(0,-0.5)$), and to the right ($(\beta _x,\beta _y)=(-0.5,0)$), is shown in Figs. 5($a_{1-2,4-5}$), respectively, while the central virtual pinhole camera point of view is depicted in Fig. 5($a_{3}$) (see Visualization 3 for the full sequence). A video sequence of novel viewpoints with continuous parallax is presented to the AR user as in Visualization 4 (one frame from this visualization is shown in Fig. 5($b$)).


Fig. 4. (a) Scene to be visualized. ($b_{1-3}$) Images captured (after registration) with the system focusing at 4.5, 9.5 and 16.5 cm, respectively. See Visualization 2 for the complete stack of $N=40$ images covering the $4.5-16.5$ cm focal range.



Fig. 5. ($a_{1-5}$) Point of view synthesis for $(\beta _x,\beta _y)$=(0.5,0),(0,0.5),(0,0),(0,-0.5) and (-0.5,0), respectively (see Visualization 3 for the complete sequence covering $\beta$ in the range $[-0.5,0.5]$ in steps $\Delta \beta =0.05$ for both horizontal and vertical direction). Horizontal (dotted) and vertical (dashed) guidelines passing through a feature object in the central viewpoint ($a_{3}$) (branching point of the front plant) have been added to visualize more clearly the displacement of the object in the shifted perspectives. When the object is viewed by the virtual pinhole displaced to the left/right ($a_{1}$/$a_{5}$) from the central viewpoint, the object is seen displaced to the right/left. Analogously, when the object is viewed by the virtual pinhole displaced above/below ($a_{2}$/$a_{4}$), the object is seen displaced below/above. ($b$) Frame from visualization in AR device (see Visualization 4 for complete video).


5. Conclusion

We have shown through a series of proof-of-concept experiments that it is possible to integrate multifocus sensing into an AR device so as to provide the user with the real-world scene together with overlaid synthesized 3D visualizations from different perspectives and the depth map of the same scene in the display. In this way the user has access to 3D information about the scene that is not available in conventional AR devices.

Miniaturization of the optical device for multifocus sensing is desirable and could be achieved by means of a micro-camera and a smaller EFTL, in order to add as little weight as possible to the AR device. Moreover, by using a negative offset lens, a larger depth range can be achieved, making the proposal suitable for larger-scale scenes. Real-time processing is also desirable and might be achieved by accelerating the algorithms on a graphics processing unit.

The introduction of morphological operations, in addition to the pixel-by-pixel comparison in the depth map retrieval algorithm, could be considered in order to improve the depth map visualization.

An interesting line of work to explore in the near future is the noise robustness of the point-of-view synthesis algorithms, for example under low-light multifocus capture.

As another future line of work, we consider that the proposed scheme of multifocus sensing combined with AR has the potential to be extended to multifocus microscopy, allowing the AR device to display 3D visual information of a biological sample. In particular, custom-built fluorescence microscopy integrating an EFTL makes it possible to acquire multifocus z-stacks of thick 3D biological specimens. This acquired information can in turn be combined by post-processing algorithms for image reconstruction with novel relevant information, such as extended depth of field, and might finally be visualized by means of AR devices.

Funding

Comisión Sectorial de Investigación Científica; Air Force Office of Scientific Research (FA9550-18-1-0338, FA9550-21-1-0333).

Acknowledgments

J. R. Alonso and A. Fernández acknowledge support by Comisión Sectorial de Investigación Científica (Uruguay). B. Javidi acknowledges support by The Air Force Office of Scientific Research (FA9550-18-1-0338, FA9550-21-1-0333). All the authors acknowledge T. O’Connor for his fruitful comments on the manuscript.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are available in Code 1, Ref. [22].

References

1. H. Hua and B. Javidi, “A 3d integral imaging optical see-through head-mounted display,” Opt. Express 22(11), 13484–13491 (2014). [CrossRef]  

2. X. Shen and B. Javidi, “Large depth of focus dynamic micro integral imaging for optical see-through augmented reality display using a focus-tunable lens,” Appl. Opt. 57(7), B184–B189 (2018). [CrossRef]  

3. Z. He, X. Sui, G. Jin, and L. Cao, “Progress in virtual reality and augmented reality based on holographic display,” Appl. Opt. 58(5), A74–A81 (2019). [CrossRef]  

4. M. Billinghurst, A. Clark, and G. Lee, “A survey of augmented reality,” Foundations and Trends in Human-Computer Interaction 8(2-3), 73–272 (2015). [CrossRef]  

5. D. Van Krevelen and R. Poelman, “A survey of augmented reality technologies, applications and limitations,” Int. J. Virtual Reality 9(2), 1–20 (2010). [CrossRef]  

6. T. Sielhorst, M. Feuerstein, and N. Navab, “Advanced medical displays: A literature review of augmented reality,” J. Display Technol. 4(4), 451–467 (2008). [CrossRef]  

7. G. Aydındoğan, K. Kavaklı, A. Şahin, P. Artal, and H. Ürey, “Applications of augmented reality in ophthalmology,” Biomed. Opt. Express 12(1), 511–538 (2021). [CrossRef]  

8. A. N. Angelopoulos, H. Ameri, D. Mitra, and M. Humayun, “Enhanced depth navigation through augmented reality depth mapping in patients with low vision,” Sci. Rep. 9(1), 11230 (2019). [CrossRef]  

9. T. O’Connor, S. Rawat, A. Markman, and B. Javidi, “Automatic cell identification and visualization using digital holographic microscopy with head mounted augmented reality devices,” Appl. Opt. 57(7), B197–B204 (2018). [CrossRef]  

10. X. Wang, S. K. Ong, and A. Y. Nee, “A comprehensive survey of augmented reality assembly research,” Adv. Manuf. 4(1), 1–22 (2016). [CrossRef]  

11. H. Shahinian, A. Markos, J. Navare, and D. Zaytsev, “Scanning depth sensor for see-through AR glasses,” Proc. SPIE 11040, 110400B (2019). [CrossRef]

12. A. Markman, X. Shen, H. Hua, and B. Javidi, “Augmented reality three-dimensional object visualization and recognition with axially distributed sensing,” Opt. Lett. 41(2), 297–300 (2016). [CrossRef]  

13. M. T. Mahmood and T.-S. Choi, “Focus measure based on the energy of high-frequency components in the s transform,” Opt. Lett. 35(8), 1272–1274 (2010). [CrossRef]  

14. C. Zhou, D. Miau, and S. K. Nayar, “Focal sweep camera for space-time refocusing,” Dept. Comput. Sci., Columbia Univ., Tech. Rep. CUCS-021-12 (2012).

15. J. N. Martel, L. K. Müller, S. J. Carey, J. Müller, Y. Sandamirskaya, and P. Dudek, “Real-time depth from focus on a programmable focal plane processor,” IEEE Trans. Circuits Syst. I 65(3), 925–934 (2018). [CrossRef]  

16. W. Huang and Z. Jing, “Evaluation of focus measures in multi-focus image fusion,” Pattern Recognit. Lett. 28(4), 493–500 (2007). [CrossRef]

17. S. Abrahamsson, J. Chen, B. Hajj, S. Stallinga, A. Y. Katsov, J. Wisniewski, G. Mizuguchi, P. Soule, F. Mueller, C. D. Darzacq, X. Darzacq, C. Wu, C. I. Bargmann, D. A. Agard, M. Dahan, and M. G. L. Gustafsson, “Fast multicolor 3d imaging using aberration-corrected multifocus microscopy,” Nat. Methods 10(1), 60–63 (2013). [CrossRef]  

18. S. Liu and H. Hua, “Extended depth-of-field microscopic imaging with a variable focus microscope objective,” Opt. Express 19(1), 353–362 (2011). [CrossRef]  

19. J. R. Alonso, A. Silva, and M. Arocena, “Computational multimodal and multifocus 3d microscopy,” Proc. SPIE 11351, 1135110 (2020). [CrossRef]  

20. J. R. Alonso, A. Silva, and M. Arocena, “3d visualization in multifocus fluorescence microscopy,” Proc. SPIE 10997, 109970Q (2019). [CrossRef]  

21. J. R. Alonso, A. Fernández, and J. A. Ferrari, “Reconstruction of perspective shifts and refocusing of a three-dimensional scene from a multi-focus image stack,” Appl. Opt. 55(9), 2380–2386 (2016). [CrossRef]  

22. J. R. Alonso, “3D printing files for multifocus sensing module,” figshare (2022), https://doi.org/10.6084/m9.figshare.16728955.

23. J. R. Alonso, A. Fernández, G. A. Ayubi, and J. A. Ferrari, “All-in-focus image reconstruction under severe defocus,” Opt. Lett. 40(8), 1671–1674 (2015). [CrossRef]  

24. A. Ben-Israel and T. N. Greville, Generalized inverses: theory and applications, vol. 15 (Springer Science & Business Media, 2003).

25. J. R. Alonso and J. A. Ferrari, “From frequency domain multi-focus fusion to focus slicing,” in Frontiers in Optics, (Optical Society of America, 2015), pp. FTh3G–5.

26. R. C. Gonzalez, R. E. Woods, and S. L. Eddins, Digital image processing using MATLAB (Pearson Prentice Hall, 2009).

27. J. W. Goodman, Introduction to Fourier optics (Roberts and Company Publishers, 1996).

Supplementary Material (5)

Code 1: 3D printing files for multifocus sensing module
Visualization 1: Multifocus capture of circles scene
Visualization 2: Multifocus capture of dinosaurs scene
Visualization 3: Point of view synthesis from dinosaurs stack
Visualization 4: Point of view synthesis visualization in AR

