Abstract
There has been great interest in researching and implementing effective technologies for the capture, processing, and display of 3D images. This broad interest is evidenced by widespread international research and activities on 3D technologies. There is a large number of journal and conference papers on 3D systems, as well as research and development efforts in government, industry, and academia on this topic for broad applications including entertainment, manufacturing, security and defense, and biomedical applications. Among these technologies, integral imaging is a promising approach for its ability to work with polychromatic scenes and under incoherent or ambient light for scenarios from macroscales to microscales. Integral imaging systems and their variations, also known as plenoptics or light-field systems, are applicable in many fields, and they have been reported in many applications, such as entertainment (TV, video, movies), industrial inspection, security and defense, and biomedical imaging and displays. This tutorial is addressed to the students and researchers in different disciplines who are interested to learn about integral imaging and light-field systems and who may or may not have a strong background in optics. Our aim is to provide the readers with a tutorial that teaches fundamental principles as well as more advanced concepts to understand, analyze, and implement integral imaging and light-field-type capture and display systems. The tutorial is organized to begin with reviewing the fundamentals of imaging, and then it progresses to more advanced topics in 3D imaging and displays. More specifically, this tutorial begins by covering the fundamentals of geometrical optics and wave optics tools for understanding and analyzing optical imaging systems. 
Then, we use these tools to describe integral imaging, light-field, or plenoptic systems; the methods for implementing the 3D capture procedures and monitors; their properties, resolution, field of view, and performance; and the metrics to assess them. We illustrate the principles of integral imaging capture and display systems with simple laboratory setups and experiments, and we also discuss 3D biomedical applications, such as integral microscopy.
© 2018 Optical Society of America
1. Introduction
In the past two decades, there has been substantial progress in extending classical two-dimensional (2D) imaging and displays into their 3D counterparts. This transition must take into account the fact that 3D display technologies should be able to stimulate the physical and psychophysical mechanisms involved in the perception of the 3D nature of the world. Among the psychophysical cues, we can enumerate the perception of occlusions (occluding objects are closer and occluded objects are farther away), the conical perspective rule, shadows, and the movement parallax, that is, the fact that close objects appear to move faster than distant objects. The physical mechanisms, on the other hand, are accommodation, convergence of the visual axes, and the disparity between the retinal images of the same object. When observing distant objects, the accommodation is relaxed, the visual axes are parallel, and there is no disparity. In contrast, for the observation of close objects the eyes apply a stronger accommodation effort, a pronounced convergence of the visual axes is stimulated, and a large disparity arises. The brain uses these physical cues to perceive and acquire, or estimate, information about depth in 3D scenes.
The first approach to the challenge of capturing and displaying 3D images was based on the concept of stereoscopy. Stereoscopic systems imitate the binocular human visual system (HVS). Following this idea, a pair of pictures (or movies) of the same 3D scene is taken with a pair of cameras set up with some horizontal separation between them. Later the images are shown independently to the eyes of the observer, so that the left (or right) eye can see only the picture captured with the left (or right) camera. In this way, some binocular disparity is induced, which stimulates the convergence of the visual axes. This process provides the brain with the information that allows it to perceive and estimate the depth content of the scene. In 1838, Wheatstone reported the first stereoscope [1]. Some years later, in 1853, Rollmann proposed to make use of color vision and codify the stereoscopic pairs in complementary colors, that is, the use of anaglyphs [2]. This method was widely used throughout the 20th century, but became less popular due to poor color reproduction and cross talk between the left and right images. However, the technique is experiencing a certain rebirth, due to its easy application to the reproduction of 3D videos over the Internet. In order to overcome the color-reproduction problems, the use of polarization to codify the stereoscopic information has been proposed [3,4]. However, the main problem of stereoscopy is that 3D images are not optically displayed or optically generated. Instead, a pair of 2D images is displayed for projection onto the human observer’s retinas. It is the brain that performs the image fusion that produces the sensation of perspective and depth, that is, 3D perception. In this process, a decoupling occurs between two otherwise coupled physiological processes known as convergence and accommodation [5,6].
This is an unnatural physiological process that may give rise to visual discomfort or fatigue, such as headaches, after prolonged observation of stereo displays. Stereoscopy can also be implemented without the need for special glasses. In this case, the display systems are called auto-stereoscopic, and they may be implemented by means of lenticular sheets [7] or parallax barriers [8].
Multi-view systems are an upgraded realization of stereoscopy [9]. Still based on the use of parallax barriers or lenticular sheets, these systems provide the user with up to 16 views. Although multi-view systems can provide different stereoscopic views to different observers, they have the drawback of flipping, or image doubling, when the observer is displaced parallel to the display. Note, however, that whatever its complexity, any display system based on the concept of stereoscopy still suffers from the consequences of the convergence-accommodation conflict.
In order to avoid such conflict, the so-called real 3D displays have been proposed. In these systems the 3D images are observed without the aid of any special glasses or auto-stereoscopic device. Examples include the volumetric displays [10,11], which can display volumetric 3D images in true 3D space, and the holographic displays [12]. Conceptually, holography is considered by many as a technique that provides a better 3D experience and does not produce visual fatigue. However, the practical implementation of holographic displays still faces many technical difficulties, such as the need for refreshable or real-time updatable display materials and the need for a huge number of very small pixels. Thus, holography may remain challenging to deploy as a widespread 3D display medium in the near future.
Lippmann proposed another real 3D display technology in 1908, under the name of integral photography (IP). Specifically, Lippmann [13] proposed the use of a microlens array (MLA) in front of photographic film. With such a device it was possible to capture a collection of 2D elemental images, each with a different perspective of the 3D scene. The original idea of Lippmann was to use these images for the display of 3D scenes. However, that system showed some fundamental problems. One was the overlapping of elemental images in the case of wide scenes. Another was that the IP system using microlens arrays could be used only for capturing scenes that are close to the camera.
To overcome these problems, Coffey [14] proposed the use of a field lens of large diameter to form the image of the scene onto a MLA. This design permitted the implementation of the Lippmann concept with a conventional photographic camera, avoiding the overlapping between images. The images provided by Coffey’s camera are much smaller than the elemental images and are usually called microimages. The design made by Coffey was refined many years later by Davies et al. [15].
Due to the lack of flexibility associated with the use of photographic film, IP technology lay dormant for decades. However, thanks to the advances in the quality and speed of optoelectronic pixelated sensors and displays, and also in the speed and capacity of digital technology and software tools, the interest in integral photography was reborn by the end of the 20th century, when it was renamed integral imaging (InIm) by some authors [16,17]. Remarkable in this regard were some proposals for capturing and transmitting integral images in real time [18,19]. The use of a multi-camera system organized in array form was also noteworthy [20]. It is important, as well, that in 1991 Adelson and Bergen reported the plenoptic formalism for describing the radiance of any luminous ray in space as a function of angle and position [21]. On the basis of this formalism the first plenoptic camera, an update of the camera designed by Coffey, was built [22]. Currently, and to the best of our knowledge, two plenoptic cameras are commercially available [23–25].
In the past decade, InIm technology has experienced rapid development and, more importantly, has been applied as an efficient solution to many technological problems. Among them, some biomedical applications are remarkable, such as the use of plenoptic technology in otoscopy [26], ophthalmology [27], endoscopy [28,29], and static [30,31] or dynamic [32–34] deep-tissue inspection. InIm technology has also been proposed for wavefront sensing [35], 3D imaging using long-wave infrared light [36,37], head-mounted display technology [38], and large 3D screens [39,40]. The utility of the plenoptic concept is spreading very fast, reaching even some exotic applications, such as gigapixel photography [41], 3D polariscopy [42], the inspection of road surfaces [43], and the monitoring of red coral [44]. An application of integral photography that deserves special attention is 3D microscopy. This application, called integral microscopy (IMic) or light-field microscopy, offers microscopists the possibility of observing samples, almost in real time, reconstructed in 3D from many different perspectives [45–51]. In this application to microscopy, the development of disposable [52,53] and reconfigurable [54] micro-optics for instruments is remarkable.
The widespread research indicates that integral imaging, or the plenoptic concept, is a technology of substantial interest with many diverse applications, including entertainment, medicine, security, defense, and transportation. Thus, a tutorial that describes the fundamental principles and characteristics of integral imaging in a rigorous and comprehensive way is of interest to the community of optical scientists and engineers, physicists, and computer scientists interested in 3D imaging. The tutorial is organized as follows. In Section 2, the basic principles of geometrical optics are explained in terms of the matrix formalism. In Section 3, a brief review of the scalar wave optics theory of 2D image formation is presented. Special attention is given to the explanation of spatial resolution and depth of field (DoF) metrics. The concepts presented in the first two sections permit the reader to better understand the materials in Section 4, which presents the fundamentals of 3D InIm systems and how they can capture multiple views of 3D scenes, that is, the capture stage of the 3D imaging system. Computational refocusing is explained in this section, and the spatial resolution and depth of field of the reconstructed 3D scene are discussed. Section 5 is devoted to the explanation of the characteristics of integral-imaging display monitors. In Section 6, we overview the specific application of the Lippmann concept to 3D microscopy.
We have included Table 1 with a list of all the acronyms used in the paper.
2. Fundamentals of Geometrical Optics
If we do not take into account the wave nature of light, and consider that light propagates in homogeneous media with constant refractive index, then light propagation can be described using geometrical optics. In this case, light beams act as bundles of rays that have ideally infinitesimal width and propagate following a straight trajectory. The branch of optics dealing with light rays, especially in the study of free-space propagation and lenses, is known as geometrical optics. In this section, we present a brief overview of geometrical optics related to the imaging capacity of optical systems. In Appendix A, we describe in more detail the fundamental equations of geometrical optics and the ABCD matrices that connect different spatial-angular states of the light beams, that is, 2D vectors whose components are the spatial and the angular coordinates of the ray. We recommend that those who are not familiar with geometrical optics read Appendix A before proceeding to Subsection 2.1.
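As a minimal numerical companion to Appendix A, the ray-transfer idea can be sketched in a few lines of code. The function names and the (height, angle) vector convention below are our own illustrative choices, not taken from the text; the sketch simply shows that a ray entering a thin lens parallel to the axis crosses the axis at the back focal plane:

```python
# Minimal ABCD ray-transfer sketch (illustrative only; rays are modeled
# as (height, angle) pairs, matrices as nested lists).

def propagate(d):
    """Free-space propagation over a distance d."""
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    """Thin lens of focal length f (optical power P = 1/f)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

def trace(ray, *elements):
    """Trace a ray (height, angle) through elements applied left to right."""
    h, u = ray
    for m in elements:
        h, u = m[0][0] * h + m[0][1] * u, m[1][0] * h + m[1][1] * u
    return h, u

# A ray parallel to the axis, traced through a lens of f = 100 and then
# a propagation of 100, reaches the axis: the back focal plane.
h, u = trace((1.0, 0.0), thin_lens(100.0), propagate(100.0))
print(h)  # 0.0
```

Units are arbitrary but must be consistent (here one could read them as millimeters).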
2.1. Telecentric Optical Systems
We start by describing a type of imaging architecture that is very common in many optical imaging applications. We refer to telecentric (or afocal) systems [55], which can be implemented by the coupling of focal elements so that the effective focal length (EFL) is infinite [56] (see Subsection A.3, in Appendix A, for an exact definition of the EFL). Although a telecentric system can be composed of many lenses, we will perform our study here with a telecentric system composed of only two lenses (see Fig. 1). Naturally, this choice does not imply any loss of generality. According to Appendix A, one can calculate the ABCD matrix [57] between the front focal plane (FFP) of the first lens and the back focal plane (BFP) of the second one:
where $f_i$ is the focal length of lens $i$, and $P_i = 1/f_i$ is its optical power. We find that the two focal planes are conjugates (the matrix element $B = 0$), with lateral magnification $M = -f_2/f_1$ and angular magnification $\gamma = -f_1/f_2$. It is also confirmed that the EFL is infinite (the matrix element $C = 0$). It is interesting to find the conjugation relation for the case of an object placed at some axial distance from the FFP. To this end, we calculate the matrix:
From this matrix, and after setting the element $B = 0$, we obtain the conjugation relation for telecentric systems. Equation (3) confirms that afocal systems are interesting in the sense that, although they have no optical power ($P = 0$), they have the capacity of providing images with a constant lateral magnification, $M = -f_2/f_1$, which does not depend on the object position. An interesting consequence is that the axial magnification, $M^2$, is also independent of the object position:
Telecentric systems are typically associated with two optical instruments with very different applications. One is the Keplerian telescope and the other is the optical microscope (see Fig. 2). The telescope results from the afocal coupling between a large-focal-length objective lens and a short-focal-length ocular lens. Designed for the observation of very distant objects (which produce bundles of collimated incident beams), the telescope provides images with high angular magnification ($|\gamma| \gg 1$). Low-magnification Keplerian telescopes are widely used in optical laboratories for expanding collimated beams.
In contrast, the optical microscope is composed of a short-focal-length microscope objective (MO) and a large-focal-length tube lens. It is designed to form images with high lateral magnification ($|M| \gg 1$). Due to the property of having constant lateral and axial magnifications, telecentric microscopes are especially well adapted for providing images of 3D specimens, in which some sections are not in the object plane but are axially displaced.
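The telecentric properties discussed above can be checked numerically with the ABCD formalism. The following sketch is our own illustrative code (the helper names and the numerical focal lengths are assumptions, not from the text): it composes the matrices of a two-lens afocal relay between the FFP of the first lens and the BFP of the second, and verifies that the EFL is infinite ($C = 0$), that the planes are conjugate ($B = 0$) with $M = -f_2/f_1$, and that conjugation survives an axial displacement of the object when the image plane moves by $M^2$ times that displacement:

```python
# Two-lens telecentric (afocal) relay traced with ABCD matrices
# (illustrative sketch; f1, f2 and the helper names are our own choices).

def propagate(d):
    return [[1.0, d], [0.0, 1.0]]

def thin_lens(f):
    return [[1.0, 0.0], [-1.0 / f, 1.0]]

def compose(*ms):
    """Compose matrices so that the first argument acts on the ray first."""
    out = [[1.0, 0.0], [0.0, 1.0]]
    for m in ms:
        out = [[m[0][0] * out[0][0] + m[0][1] * out[1][0],
                m[0][0] * out[0][1] + m[0][1] * out[1][1]],
               [m[1][0] * out[0][0] + m[1][1] * out[1][0],
                m[1][0] * out[0][1] + m[1][1] * out[1][1]]]
    return out

f1, f2 = 100.0, 50.0
system = compose(propagate(f1), thin_lens(f1),
                 propagate(f1 + f2), thin_lens(f2), propagate(f2))
A, B = system[0]
C, D = system[1]
print(A, B, C, D)  # A = -f2/f1, B = C = 0 (conjugate planes, EFL infinite)

# Axial-magnification check: displacing the object plane by a distance a
# keeps the planes conjugate if the image plane moves by M^2 * a:
a = 7.0
shifted = compose(propagate(a), propagate(f1), thin_lens(f1),
                  propagate(f1 + f2), thin_lens(f2),
                  propagate(f2 - A * A * a))
print(abs(shifted[0][1]) < 1e-9)  # True: still conjugate, same M
```

The same composition, with a variable-power element inserted, is the natural way to model the liquid-lens axial scan discussed next.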
Let us show an example of the utility of the ABCD formalism for the analysis of new optical instruments. We refer to the recent proposal of inserting an electrically addressed liquid lens in an optical microscope, aiming to perform fast axial scanning of 3D specimens [58]. The scheme of this new microscope is shown in Fig. 3. In order to find the plane in the object space that is conjugated with the CCD, we calculate the matrix:
From the above equation we find that the CCD and the object plane are conjugated provided that the matrix element $B = 0$, a condition that is governed by the optical power of the liquid lens. This means that one can gradually scan the axial position of the object plane by tuning the voltage of the liquid lens, and therefore its optical power. By setting positive or negative optical powers, it is possible to scan the object space in front of and behind the focal plane. But the key point is that the axial scan is performed without any modification of the other important parameters of the microscope, such as the lateral magnification or the numerical aperture.
2.2. Aperture and Field Limitation
In the previous study, we have not taken into account the finite size of the optical elements, such as the lenses, apertures, and sensors. Thus, we have analyzed neither the limitations of light collected by the optical system nor the limits in the size of the object that is imaged. To study these important effects, we must take into account the finite size of the optical elements, and also make some additional definitions.
We start by defining the aperture stop (AS) as the element that determines the angular extension of the beam that focuses at the axial point of the image. In Fig. 4(a) we show an example of aperture limitation. In terms of energy, the AS is the element that limits the amount of light collected by the optical system from the central point of the object. Note that in this example, the aperture stop is placed at the BFP of the first lens, which is also the FFP of the second lens. As a result, its conjugates in the object space (the entrance pupil) and in the image space (the exit pupil) are at infinity. Thus, we can say that the system shown in Fig. 4 is strictly telecentric [56].
There are some well-known optical parameters in photography and in microscopy that are defined in terms of the aperture limitation. These include the f-number, $f/\#$, and the numerical aperture, NA, whose definitions here are
These parameters can also be evaluated in the image space, and in the paraxial case the two parameters are simply related to each other. Once the aperture stop is determined, it is easy to see that not all the points in the object plane are able to produce images with the same illumination. Instead, the illumination gradually decreases as the object point moves away from the optical axis. A second aperture, called the field stop (FS), is responsible for this limitation. The joint action of the AS and the FS divides the object plane into different fields. First, we have the field of uniform illumination: a circular region, centered at the optical axis, in which all the points produce images with the same illumination as the axial point [see Fig. 4(b)]. Next, we have an annular field in which the illumination gradually decreases, producing the so-called vignetting effect. A typical ring within this field is the one at which the illumination of the image is reduced by a factor of 1/2 [see Fig. 4(c)]. The outer ring of the vignetting field is called the limit-illumination ring [see Fig. 4(d)]. Any object point beyond this ring does not produce an image. An example of field limitation is shown in Fig. 5.
2.3. Lateral Resolution and Depth of Field
To complete the geometrical study of optical imaging instruments, we analyze two features of great interest that are closely connected: the lateral resolution and the DoF. In order to simplify our study, we consider the case of a telecentric imaging system. This selection simplifies the equations but does not limit the generality of our conclusions.
As stated previously, in this geometrical optics study we are not considering any diffraction (or wave optics) effects. In addition, we assume that the optical systems are free of aberrations (a reasonable assumption, since good-quality commercial optical instruments are free of aberrations within the field of uniform illumination). Under these hypotheses, any single point of an object is imaged onto a single image point, and therefore the resolution is determined by the pixelated structure of the sensor (typically CCD or CMOS). Following the Nyquist criterion [59], we assume that two object points are resolved (or distinguished) in the image when they are imaged or captured by different sensor pixels, with a single pixel between them. Therefore, the resolution limit, defined as the smallest separation between two points on an object that can still be distinguished by the camera system as separate entities, is given by
where $\delta$ is the pixel size. It is apparent that to obtain high-resolution images, sensors with very small pixels are required. It is therefore a current challenge for sensor manufacturers to build sensors with a huge number of small (even submicrometer) pixels. The DoF of an imaging system is defined as the distance from the nearest object plane that is in focus to the farthest plane that is simultaneously in focus. The DoF is usually calculated as the conjugate of the depth of focus, which is illustrated in Fig. 6. According to this scheme,
Consequently, the DoF grows with the f-number. In Fig. 7 we show two pictures of the same scene: one obtained with a low f-number (small DoF) and the other with a high f-number (large DoF).
3. Wave Theory of Image Formation
In this section, we study the image formation, but taking into account the wave nature of light. We will obtain simple formulae that describe the optical image and will find that they are consistent with the results obtained on the basis of geometrical optics. We recommend that those who are not familiar with the basic concept of wave optics propagation read Appendix B before proceeding to Subsection 3.1.
3.1. Propagation of Waves through Telecentric Optical Systems
Now we present the analysis of image formation in terms of wave optics. We perform our analysis for the particular case of a telecentric system. This selection will allow us to derive a mathematical expression for the amplitude distribution in the image plane. The selection does not limit the generality of our study, and its conclusions are applicable to any other optical imaging system. The telecentric imaging scheme is shown in Fig. 8.
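Before going through the derivation, the statement that such a system performs two cascaded Fourier transformations can be illustrated with a discrete one-dimensional analogy. This is a deliberately simplified sketch of our own, not the actual optical calculation: applying the discrete Fourier transform twice returns an inverted, rescaled copy of the input, just as the optical cascade yields an inverted image (and an aperture in the intermediate plane would act by masking the first transform before the second is applied):

```python
# Discrete 1D analogy of the two cascaded Fourier transforms: DFT applied
# twice gives N * x[(-n) mod N], i.e., a reversed (inverted) copy of the
# input, analogous to the inverted image of the telecentric system.
import cmath

def dft(x):
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n)
                for k in range(n))
            for j in range(n)]

signal = [0.0, 1.0, 2.0, 3.0, 0.0, 0.0]
twice = dft(dft(signal))
image = [round((v / len(signal)).real, 6) for v in twice]
print(image)  # [0.0, 0.0, 0.0, 3.0, 2.0, 1.0] -> input reversed about sample 0
```

Zeroing the high-frequency entries of the intermediate `dft(signal)` before the second transform would mimic the low-pass action of a finite aperture stop.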
The light propagating through this system undergoes two cascaded Fourier transformations. Consider a diffracting object of given amplitude transmittance that is placed at the FFP of the first lens and is illuminated by a monochromatic plane wave. After the first Fourier transformation, we obtain the light distribution in the BFP, just after the circular aperture:
where $p(x, y)$ is the amplitude transmittance of the aperture and $\lambda$ is the light wavelength in vacuum. Next, we calculate the amplitude at the BFP of the second lens by performing the second Fourier transform, where, as defined in the previous section, $M$ is the lateral magnification of the telecentric system. This equation provides the key result of imaging systems when analyzed in terms of wave optics. That is, the image of a diffracting object is the result of the convolution of two functions. The first function is a properly scaled replica of the amplitude distribution of the object. This function is the ideal magnified image formed or predicted according to geometrical optics. The second function is the Fourier transform of the aperture stop, and is usually known as the point spread function (PSF) of the imaging system. In other words, the image provided by a diffraction-limited optical system is the result of the convolution of the ideal magnified image predicted by geometrical optics with the PSF of the imaging system. The aperture stop of an imaging system is usually circular, and therefore the PSF takes the form of an Airy disk [59], where $\phi$ is the aperture-stop diameter and $J_1$ is the Bessel function of the first kind and order 1. The Airy disk is composed of a central lobe, which contains more than 85% of the signal energy, and an infinite series of outer rings of decreasing energy.
3.2. Spatial Resolution and Depth of Field
The structure of the Airy disk has an essential influence on the capacity of the optical system to provide sharp images of the fine details of the objects. This is illustrated in Fig. 9, where we show the image of two point sources that are very close to each other.
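The radius of the first dark ring of the Airy disk, which governs the resolution analysis that follows, can be located numerically from the Bessel function itself. The sketch below is a stdlib-only illustration of our own (the series truncation at 40 terms and the bisection bracket are implementation choices, not values from the text):

```python
# Locating the first zero of J1 numerically. The Airy pattern goes as
# (2 J1(r)/r)^2, so its first dark ring sits at the first zero of J1.
import math

def j1(x):
    """Bessel function of the first kind, order 1 (truncated power series)."""
    return sum((-1) ** m / (math.factorial(m) * math.factorial(m + 1))
               * (x / 2) ** (2 * m + 1) for m in range(40))

# Bisection for the first positive zero, known to lie between 3 and 4.5:
lo, hi = 3.0, 4.5
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if j1(lo) * j1(mid) <= 0:
        hi = mid
    else:
        lo = mid

print(round(lo, 4))             # 3.8317
print(round(lo / math.pi, 2))   # 1.22
```

The ratio 3.8317/pi is the familiar factor 1.22 that converts the zero of $J_1$ into the physical radius of the first dark ring.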
According to the Rayleigh criterion, two Airy disks are resolved if the distance between their centers is larger than the radius of the first zero ring of the disk [60]. For the case of a circular aperture stop [see Eq. (15)], this radius is
Note that it is more convenient to express this distance not in the image plane but in the object plane. To accomplish this, we need to take into account the lateral magnification of the imaging system. Thus, we obtain the resolution limit of the imaging system, i.e., the shortest distance between two object points that can be distinguished, where $M$ is the magnification of the imaging system. An example of the influence of diffraction on the resolution of images is shown in Fig. 10, where we depict three pictures of the same oil painting of Abraham Lincoln obtained with lenses of three different f-numbers. In the figure, we can see that the use of a large f-number (or, equivalently, a small NA) produces a strong decrease in the lateral resolution, which results in the blurring of the fine details of the image. When imaging systems are analyzed in terms of wave optics, we can also evaluate the DoF of the system. We need to calculate the amplitude distribution in the image plane when the diffracting screen is axially displaced by some distance, as defined in Fig. 1. In this case, the amplitude distribution in the focal plane is
The convolution with the quadratic phase function is due to the axial displacement of the object. Applying the results from Eq. (14), it is straightforward to obtain the defocused image amplitude, where we have omitted some irrelevant constant factors. From this equation, we see that the amplitude distribution of a defocused image is obtained as the convolution of the image predicted by geometrical optics (the magnified image) with a new PSF, named here the defocused PSF. This new PSF appears between the square brackets in Eq. (19) and is the result of propagating (or defocusing) the original PSF. Now, it is possible to define a DoF range as the length of the axial interval around the object plane such that the value at the center of the defocused PSF is larger than one half of the corresponding value of the original PSF. Although we do not show the details of the calculations, this range can be shown to have a simple closed-form expression [60].
To complete this section, it is necessary to state that in real imaging systems, the wave optics and the geometrical optics effects are present simultaneously. Therefore, when analyzing the lateral resolution or the depth of field, both effects must be taken into account. This classical result is summarized in the next two formulas of lateral spatial resolution and DoF. Concerning the lateral resolution, the ideal case is when the resolution is not limited by the geometrical (pixel) term. To obtain this condition, it is necessary that the pixel size be sufficiently small. In this case, the best resolution provided by wave optics is achieved, and the effect of the sensor’s finite pixel size is avoided.
4. Three-Dimensional Integral Imaging Analyzed in Terms of Geometrical Optics and Wave Optics
In the previous sections, we have used geometrical optics and wave optics to investigate the capacity of monocular optical systems for capturing images of 2D scenes. In addition, since we presented the case of objects that are out of focus, the previous study could be applied to the imaging of 3D objects, provided that they are considered as a linear superposition of 2D slices. However, monocular images capture the information of a single perspective of the 3D scene; therefore, they do not capture the important 3D information of the scene, particularly in the presence of occlusions. The solution to this problem is the capture of many monocular perspectives of the same 3D scene. Thus, this section is devoted to discussing the principles of multi-perspective imaging and the way multi-perspective images can be used for reconstruction and display of 3D scenes.
4.1. Integral Photography
We start this section by analyzing heuristically a conventional photographic camera used for obtaining pictures of 3D scenes. As shown in the example of Fig. 11, any point of a 3D object emits a cone of rays. Although the cone is composed of a continuous distribution of rays, in the figure we show only a finite number of them. Each ray carries different angular information about the 3D scene. In principle, any optical instrument that collects such angular information is able to capture the perspective information of the 3D scene. This is the case for the binocular HVS, in which each eye (the left and the right) perceives a different perspective of the 3D scene, and the two retinas record the two different perspectives.
In monocular cameras, the position of the sensor defines the object plane. However, when we have a 3D scene, there is no single object plane. To avoid this ambiguity, it is convenient to define a reference object plane (ROP) in the central region of the 3D scene, which is the conjugate of the sensor. Then, each sensor pixel collects and integrates all the rays with the same spatial information but different angular content. The problem is that after the integration process, all the angular information is lost.
To understand this process better, it is convenient to perform the analysis in terms of the plenoptic function, which represents the radiance [61] of the rays that impinge on any point of the image plane [21]. Assuming the monochromatic case, the plenoptic function is a 4D function $L(x, y, \theta, \varphi)$, where $(x, y)$ and $(\theta, \varphi)$ are, respectively, the spatial and the angular coordinates of the rays arriving at the image plane. For the sake of simplicity, in the forthcoming graphic representations we will draw an epipolar section of the plenoptic function [62–64], that is, $L(x, \theta)$. In other words, we will draw only the rays propagating in a meridian plane. Naturally, this simplification does not compromise the generality of the study.
We make a second simplification by considering only the rays impinging on the centers of the camera pixels. In this case, it is apparent that any pixel of a conventional photographic camera captures a plenoptic field that is confined to a segment of constant spatial coordinate but variable angular coordinate [see Fig. 12(a)]. From such a plenoptic function, it is possible to calculate the irradiance distribution of the image taken by the camera by simply performing an angular integration. In mathematical terms, this can be done by calculating the Abel transform [65] of the plenoptic function, as illustrated in Fig. 12(b); that is,
As can be seen from this figure, due to the angular integration, conventional photography (and, by extension, any conventional imaging system) loses the multi-perspective information and, therefore, most of the 3D information of 3D scenes.
The first approach to designing a system with the capacity of capturing the plenoptic field radiated by 3D objects was due to Lippmann, who proposed a multi-view camera system [13]. Specifically, he proposed to insert a lens array in front of a light sensor (photographic film in his experiment). This concept is illustrated in Fig. 13(a), where, for simplicity, we assume a pinhole-like array of lenses and therefore consider only rays passing through their centers. Any lens of the array captures a 2D picture of the 3D scene, but from a different perspective. We shall refer to these individual perspective images as elemental images (EIs). In order to avoid overlapping between different EIs, a set of physical barriers is required. If we analyze this system in terms of the plenoptic function, we find that any elemental image contains the angular information corresponding to the rays passing through the vertex of the corresponding lens. Note that we use the term integral image to refer to the collection of EIs of a 3D scene. The information collected by the set of lenses can be grouped in a plenoptic map, as shown in Fig. 13(b). From this diagram, it is apparent that the system proposed by Lippmann has the ability to capture a sampled version of the plenoptic field at the plane of the lenses. The sampling frequency is determined by the gap between the lenses and the sensor, the pitch of the lens array, and the pixel size of the image sensor (according to what is explained in Appendix A, the gap is measured from the principal plane). While the sampling period along the spatial direction is given directly by the pitch, the period along the angular direction is given by the ratio between the pixel size and the gap.
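The loss of parallax under angular integration, and its preservation by Lippmann's sampling, can be illustrated with a toy epipolar map. All numbers below are illustrative placeholders of our own, not measured data: one point sits at the reference plane (a vertical feature in the map) and another sits off it (a feature that drifts with angle):

```python
# Toy epipolar plenoptic map L[theta][x] (rows: angular samples, columns:
# spatial samples at the lens plane). Values are illustrative placeholders.
L = [
    [0, 1, 0, 0, 2],   # theta = -1
    [0, 0, 1, 0, 2],   # theta =  0
    [0, 0, 0, 1, 2],   # theta = +1
]

# Conventional camera: each pixel integrates all angles at fixed x, so the
# drifting feature (which encodes depth) is destroyed in the sum.
irradiance = [sum(row[k] for row in L) for k in range(len(L[0]))]
print(irradiance)  # [0, 1, 1, 1, 6]

# Lippmann capture: the angular samples behind each lens are kept apart,
# so the drift across angles (the parallax) survives.
def elemental_image(k):
    """Angular samples recorded behind lens k (one column of the map)."""
    return [row[k] for row in L]

print(elemental_image(4))  # [2, 2, 2] -> a point at the reference plane
print(elemental_image(1))  # nonzero only at one angle: an off-plane point
```

The column that reads the same at every angle corresponds to a point on the reference object plane; the feature that moves from column to column with angle is exactly the depth information the conventional sum discards.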
In the plenoptic map, the elemental images appear as the columns of the sampled field. It is also interesting to note that horizontal lines correspond to sets of rays, equidistant and parallel to each other, passing through the lenses. The pixels of any horizontal line in Fig. 13(b) can be grouped to form a subimage of the 3D scene. The subimages constitute the orthographic views of the 3D scene; orthographic means that the scale of the view does not depend on the distance from the object to the lens array. In this tutorial, the subimages will also be referred to as microimages. In addition, we will hereafter use the name plenoptic image to refer to the collection of microimages of a 3D scene.
Following the original proposal by Lippmann, other alternative but equivalent methods have been proposed for capturing the elemental images. The simplest one consists of substituting the MLA with a pinhole array. This proposal has to deal with some important constraints. If we analyze the system in terms of geometrical optics, any point of the object is imaged as a circle (the geometric shadow of a pinhole), whose diameter is proportional to the pinhole diameter. Thus, if one does not wish to decrease the resolution of the elemental images, the pinholes should have a diameter smaller than the pixel size. However, small pinholes imply very low light efficiency. An additional problem is that small pinholes could give rise to significant diffraction effects that could distort the recorded plenoptic map. As far as we know, the pinhole array has not yet been proposed as an efficient way of capturing the plenoptic information. However, we still consider that it could be very interesting to explore the limits of such a technique by searching for the optimum configuration in terms of the expected resolution and acceptable light efficiency. It is worth remarking that a recent approach based on time multiplexing has been proposed for overcoming some of the problems associated with integral imaging with pinholes [66].
Another interesting approach is based on the idea of using an array of digital cameras [67]. The advantages of this approach are that the elemental images can have very high resolution, and that the array can capture large 3D scenes with high parallax. The main problems are that the system is bulky and requires the synchronization of a large number of digital cameras. In addition, there is limited flexibility in fixing the pitch, since the minimum pitch value is determined by the size of the digital cameras. Another possibility is to use a single digital camera on a moving platform [68]. This method, named synthetic-aperture integral imaging, allows the capture of an array of multiple EIs in which the pitch and the parallax can be fixed at will. It also permits more exotic geometries in which the camera positions do not follow a regular or rectangular grid [69]. The main disadvantages of this technique are the bulkiness of the system and the long acquisition times, which make it useful only for the capture of static scenes, or when the speed of the moving platform is much higher than the scene dynamics.
As an example to illustrate integral imaging and the type of images captured by this technique, we prepared a scene composed of five miniature clay jugs and implemented a synthetic-aperture integral imaging capture arrangement, in which a digital camera (Canon 450D) was focused on the central jug, placed at a distance of 560 mm from the image sensor. The camera parameters were fixed to and . The large f-number was chosen in order to have a large DoF and obtain sharp pictures of the entire 3D scene. With this setup, we captured a set of elemental images with pitch . Since the pitch is smaller than the size of the image sensor (; ; ), we cropped every elemental image to pixels to remove the outer parts of the images and reduce their size. In addition, we resized the elemental images so that each was composed of pixels with an effective pixel size of 46.8 μm. In Fig. 14(a), we show the central elemental images, containing up to 49 different perspectives of the scene, that is, the columns of the plenoptic map. In Fig. 14(b), we show the central EI. From the EIs, and by a simple pixel-mapping procedure, we calculated the plenoptic image, which is composed of microimages, as shown in Fig. 14(c).
4.2. Depth Refocusing of 3D Objects
Although integral photography was originally proposed by Lippmann as a technique for displaying 3D scenes, it has other interesting applications. One application is to compose a multi-perspective movie with the elemental images (see Visualization 1) or, equivalently, to display on a flat 2D monitor different elemental images following the mouse movements (see Visualization 2).
Another application makes use of the ABCD algebra to calculate the plenoptic map at different depths, including the ROP and other planes within the 3D object. This is done by using the free-space propagation matrix:
where are the spatial-angular coordinates at a given depth in the object space, are the coordinates at the lens array, and is the refocusing distance measured from the object plane to the lenses. From this equation, it is apparent that the plenoptic map in the object area is the same as the one captured by the lens array, but properly sheared: . Naturally, to obtain the irradiance distribution at the propagated distance , it is only necessary to perform the Abel transform, , as illustrated in Fig. 15.

Next, in Fig. 16 we show an example of the application of the refocusing algorithm. In this example, the algorithm is applied to the elemental images shown in Fig. 14(a). The figure shows the refocused irradiance distribution of the image at three different distances. In the movie (Visualization 3), we show the refocusing along the entire 3D scene.
The refocusing procedure can be understood more easily if we visualize it as the result of shifting and summing the pixels of the elemental images [70]. This process is illustrated in Fig. 17. When all the elemental images are stacked with no relative shift between them, the irradiance distribution at infinity is rendered. In the general case, there is a nonlinear relation between the number of pixels of the relative shift, , and the depth position, , of the rendered plane [71]. The relation is
where is the gap distance in Fig. 15(a), is the number of pixels per elemental image, and . Note that is measured from the refocused plane to the array.

To implement the algorithm, we first define the function , which stands for the value of the pixel within the elemental image. We assume that both the number of elemental images, (or ), and the number of pixels per elemental image, , are odd numbers. Then the refocused image corresponding to a given value of is calculated as
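The shift-and-sum procedure above can be sketched in a few lines. This is a simplified 1D version under stated assumptions: the elemental images are rows of a 2D array, their number is odd, and a circular shift (`np.roll`) is used instead of explicit zero padding:

```python
import numpy as np

def shift_and_sum(eis, s):
    """1D sketch of shift-and-sum refocusing.

    eis : (n, npx) array of 1D elemental images (n assumed odd).
    s   : integer relative shift in pixels between consecutive EIs;
          each refocused depth corresponds to one value of s.
    """
    n = eis.shape[0]
    half = n // 2                     # index of the central elemental image
    out = np.zeros(eis.shape[1])
    for m in range(n):
        # EI m is shifted proportionally to its distance from the center;
        # np.roll is used as a simplification (circular instead of padded shift).
        out += np.roll(eis[m], (m - half) * s)
    return out / n

# Three identical EIs with a point at pixel 3: with s = 0 the point stays
# sharp; with s = 1 its energy spreads over neighbouring pixels (defocus).
eis = np.zeros((3, 7))
eis[:, 3] = 1.0
sharp = shift_and_sum(eis, 0)
blurred = shift_and_sum(eis, 1)
```

With `s = 0` the EIs stack with no relative displacement, which renders the plane at infinity, exactly as described in the text.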
A characteristic of this method is that the number of refocused planes is limited by . This can be a serious limitation, since the number of refocused planes in the region of interest is usually low. An easy solution to this problem is to resize the elemental images by an integer factor, . The problem is that in such a case the computation time is increased by a factor of and, therefore, overflow errors can occur. A minor problem is that the number of pixels in the refocused image depends on the depth of the plane; proper cropping of the images solves this problem.

To allow a good density of refocused planes while avoiding overflow problems, a back-projection algorithm was reported, in which the number of pixels and the depth positions of the refocused images can be selected at will [72]. As shown in Fig. 18, in this algorithm the lens array is substituted by an array of pinholes. As a first step, the depth position, the refocused image size, and the number of pixels are fixed. Then, the irradiance value at any pixel of the refocused image is obtained by summing the values of the real pixels of the elemental images that are intersected by the straight lines traced from the computed pixel through the pinholes. The main drawback of this algorithm is its lower efficiency in terms of computation time.
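The back-projection idea can also be sketched in 1D. The geometry here is an assumption for illustration: pinholes with spacing `pitch` at the origin plane, the sensor at distance `gap` behind them, and the refocused plane at distance `z` in front; a ray from a refocused-plane point through pinhole m hits the sensor of EI m at a local coordinate proportional to `gap / z`:

```python
import numpy as np

def backproject(eis, pitch, gap, z, xs):
    """1D sketch of back-projection refocusing through a pinhole array.

    eis   : (n, npx) elemental images; EI m sits behind pinhole m.
    pitch : pinhole spacing; gap : pinhole-to-sensor distance.
    z     : depth of the refocused plane, measured from the pinhole array.
    xs    : positions at which the refocused irradiance is computed,
            chosen freely (this is the advantage over shift-and-sum).
    """
    n, npx = eis.shape
    half = n // 2
    px = pitch / npx                   # assumed sensor pixel size behind each pinhole
    out = np.zeros(len(xs))
    for i, X in enumerate(xs):
        for m in range(n):
            xm = (m - half) * pitch    # pinhole position
            u = (xm - X) * gap / z     # ray hit point on the sensor, local to pinhole m
            j = int(round(u / px)) + npx // 2
            if 0 <= j < npx:           # ignore rays that miss the sensor
                out[i] += eis[m, j]
    return out / n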
Another way of implementing the algorithm shown in Fig. 15, but with higher computational efficiency, is to take advantage of the Fourier slice theorem [73,74]. This algorithm works as follows (see the illustration in Fig. 19). First, we obtain the 2D Fourier transform of the captured plenoptic function:
Note that in this Fourier transformation we are treating as a Cartesian coordinate. In the second step, we rotate the Fourier axes by an angle , as defined in Fig. 19, and particularize the spectrum for . Then we obtain . As the last step, we calculate the inverse 1D Fourier transform of : . Now we take into account the following two properties of the Dirac delta function: and . Then we obtain , which is similar to Eq. (25), provided that .

The main advantage of the Fourier slice algorithm comes from its computational efficiency [75]. This efficiency is due to the fact that the most time-consuming operation, the 2D Fourier transform, is performed only once for any given plenoptic image.
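A minimal numerical sketch of the Fourier slice idea follows. It assumes a small `(x, theta)` plenoptic map, uses nearest-neighbour sampling of the rotated slice (real implementations interpolate), and parameterizes the slice direction by a slope `alpha`; `alpha = 0` reduces to plain angular integration, i.e., refocusing at the capture plane:

```python
import numpy as np

def fourier_slice_refocus(plenoptic, alpha):
    """Sketch of Fourier-slice refocusing on an (x, theta) plenoptic map.

    Takes the 2D FFT, extracts the central slice along direction (1, alpha)
    by nearest-neighbour sampling, and inverse-transforms the slice.
    """
    F = np.fft.fftshift(np.fft.fft2(plenoptic))
    nx, nt = plenoptic.shape
    cx, ct = nx // 2, nt // 2
    slice_ = np.empty(nx, dtype=complex)
    for k in range(nx):
        fx = k - cx                        # spatial frequency (shifted order)
        ft = int(round(alpha * fx))        # nearest sample on the rotated slice
        ft = int(np.clip(ft, -ct, nt - 1 - ct))
        slice_[k] = F[k, ct + ft]
    # One inverse 1D FFT yields the refocused irradiance profile.
    return np.abs(np.fft.ifft(np.fft.ifftshift(slice_)))

# Point source at x = 3 radiating into all 4 sampled angles:
L = np.zeros((8, 4))
L[3, :] = 1.0
profile = fourier_slice_refocus(L, 0.0)    # peak of height 4 at x = 3
```

Only one 2D FFT is needed per plenoptic image; each additional refocused depth costs just a slice extraction and a cheap 1D inverse FFT, which is where the efficiency claim in the text comes from.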
4.3. Resolution and Depth of Field in Captured Views
As explained above, in the capture stage of an InIm system, a collection of images of a 3D scene is captured with an array of lenses set in front of a single image sensor (CCD or CMOS), or with an array of digital cameras, each with its own sensor. Whatever the capture modality, the resolution and the DoF of the captured elemental images are determined by the classical equations of 2D imaging systems.
Recalling the optical imaging fundamentals discussed in Sections 2 and 3, the DoF and the spatial resolution limits of directly captured EIs are given by the competition between the geometric and diffractive factors; that is,
and . In the experiment shown in Fig. 14, the images were captured with the following experimental parameters: , , and . Assuming , we find that and are comparable, ensuring that the entire scene was captured sharply. The values for the lateral resolution are and .

4.4. Resolution and Depth of Field in Refocused Images
The algorithms used for the calculation of refocused images are mainly based on the sum of multiple, shifted images of the perspectives of a single 3D scene. Thus, the resolution of refocused images is determined by the resolution of the directly captured EIs. Specifically, in the planes corresponding to an integer value of , the resolution of refocused images is the same as that of the captured EIs. In planes corresponding to fractional values of , there is some gain in resolution due to the entanglement between elemental images. However, as demonstrated in [76], where a study in terms of Monte Carlo statistics is performed, this gain is never larger than a factor of 2. In conclusion, the spatial lateral resolution of refocused images lies within the interval
In the evaluation of the depth of field of refocused images, one needs to take into account two different concepts: the DoF of the refocusing process () and the DoF of the refocused images (). The is defined as the length of the axial interval in which it is possible to calculate refocused images that are sharp. By sharp, we mean reconstructed images having a spatial resolution similar to that of the EIs at the ROP. It is important to point out that the refocusing algorithm does not have the ability to sharpen images at depths where the captured elemental images are blurry. Thus, sharp images can be refocused only at depths where the captured EIs are already sharp. In other words, .

To quantify the second concept, that is, the , we must take into account that in imaging systems the DoF is defined as the axial interval in which the irradiance of the image of a single point source falls by less than a factor of 1/2. We can quantify the as the length of the axial interval corresponding to , where is the number of elemental images along the horizontal direction. In Fig. 20, we show a graphical example to illustrate this property. Note, however, that if we take into account the 2D structure of real EIs, this relation changes to .
Applying this concept to Eq. (26), we obtain
Therefore, , where , and we have approximated the product by .

As an example, we can apply these equations to the rendering process shown in Fig. 18, and obtain
and . Summarizing, in the refocusing process, what is blurry in the elemental images cannot be sharpened in the refocused images. Instead, the refocusing algorithm adds increasing blur to the parts of the 3D scene that are far from the refocused plane. This effect, also known as the bokeh effect, is similar to the one obtained in conventional photography when one decreases the f-number (or increases the aperture diameter) but keeps the same object plane in focus.

4.5. Plenoptic Camera
As explained previously, there are basically two ways of capturing the plenoptic field based on the Lippmann photography architecture. One is by inserting a lens array in front of a single image sensor (camera); the other is with an array of synchronized digital cameras. The first method has the advantage that it works with a single image sensor and therefore does not need any synchronization. Its main drawbacks are that the elemental images are captured with small parallax and that the lateral magnification is too small. The second method has the advantage of allowing capture with high parallax. Its drawbacks are that the system may become bulky, with a large information bandwidth, and that it requires synchronizing and handling the huge amount of data provided by many digital cameras [77].
An alternative method, which is very useful when small parallax is acceptable, is the plenoptic camera. This instrument is obtained by performing simple modifications to a conventional photographic camera [23]. Specifically, the plenoptic camera is the result of inserting an array of microlenses at the image plane, and then shifting the sensor axially. In Fig. 21 we present a scheme in which we have drawn the photographic objective as a single thin lens. This is far from the real case, in which objectives may be composed of a number of coupled converging and diverging lenses, built with different glasses, plus a hard aperture stop. However, all these optical elements can be substituted in the analysis (at least in the paraxial case) by the cardinal elements, that is, the principal planes and focal points, and by the entrance and exit pupils. In our illustration, we go further and use a thin lens in which the principal planes and the aperture stop are at the plane of the lens. This approximation may appear drastic, but it helps to simplify the schemes and does not limit the generality of the conclusions that we will present.
In the plenoptic camera setup, the conjugation relations are of great importance. First, one must select a ROP within the 3D scene and define the image plane as the plane that is conjugate with the ROP through the objective lens. Second, the pixelated sensor must be shifted axially to the plane that is conjugate with the aperture stop through the microlenses. This second constraint is very important because it ensures that a circular microimage is formed behind every microlens. The array of microimages captured by the sensor after a single shot will hereafter be called the plenoptic frame (in the case of a video camera) or the plenoptic picture (for a single capture).
There are some features that distinguish the plenoptic picture captured with a plenoptic camera from the integral image captured with an InIm setup. The first difference is that an integral imaging system does not capture the plenoptic field as emitted by the object, but a propagated one. In contrast, the plenoptic camera captures the plenoptic field as emitted by the ROP, but scaled and sheared. Making use again of the ABCD algebra, we find that the relation between the field emitted by the ROP and the captured one is
where and are measured from the focal points, and .

The second difference between the captured plenoptic picture and the integral image is that while the EIs in InIm are sharp perspectives of the 3D scene (assuming a sufficiently long ), the microimages are sampled sections of the plenoptic map. A common error is to look for sharp images of regions of the scene within the microimages; this is not possible, because the microimages are conjugate with the aperture stop, which optically is far from the 3D scene. Another difference is that typically the integral image may be composed of few EIs with many pixels each, while the plenoptic picture is composed of many microimages with few pixels each.
In spite of these differences, however, there are major similarities between the spatial-angular information captured with an InIm system and with a plenoptic camera. To understand the similarities, it is convenient to start by representing the plenoptic map captured with a plenoptic camera, as shown in Fig. 22(a). In this map, any column represents a microimage. The central microimage is just behind the central microlens, but the other microimages are displaced outwards. This explains the vertically sheared structure of the captured radiance map.
If we apply the ABCD algebra, we can calculate the plenoptic map in the plane of the objective lens, but just before refraction takes effect on the field:
For a simple understanding of this transformation, it is better to consider the case , in which the MLA is placed just at the BFP of the objective lens. Now the transformation corresponds to a rotation by and a horizontal shearing. The transformed plenoptic map is shown in Fig. 22(b). We appreciate the similarity between this plenoptic map and the one shown in Fig. 13(b), corresponding to integral photography. We can then state that the map captured with the plenoptic camera is equivalent to the map that could be captured with an array of lenses (or digital cameras) placed at the plane of the photographic objective. Consequently, by applying a simple pixel mapping to the plenoptic picture captured with a plenoptic camera, we can extract a collection of subimages that are similar to the EIs captured with an integral photography system placed at the objective plane. Conversely, from the EIs captured with InIm, it is possible to extract a collection of subimages that are similar to the collection of microimages captured with a plenoptic camera. In other words, the microimages are the subimages of the EIs, and vice versa.

To illustrate these concepts, next we show the plenoptic picture captured with a prototype of a plenoptic camera composed of an objective of focal length and diameter . The MLA was composed of lenslets of focal length and pitch (APO-Q-P222-F0.93 from AMUS). Note that the f-number matching requirement is fulfilled, since in both cases . As the image sensor we used a CMOS (EO-5012c 1/2″) with pixel size . The plenoptic picture is shown in Fig. 23(a). From the microimages, and by a simple pixel-mapping procedure, we calculated the associated integral image, which is composed of EIs and is shown in Fig. 23(b).
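The pixel mapping between microimages and EIs is, in its simplest form, just an index transposition. In this sketch (array sizes are hypothetical), the capture is stored as a 4D array where the first two indices select an image and the last two select a pixel within it; swapping the index pairs converts one representation into the other:

```python
import numpy as np

def transpose_plenoptic(stack):
    """Plenoptic-map transposition (pixel-mapping procedure).

    stack : 4D array of shape (Ny, Nx, ny, nx), where (Ny, Nx) selects an
    image (microimage or EI) and (ny, nx) selects a pixel within it.
    Swapping the image indices with the pixel indices converts a set of
    microimages into the corresponding subimages/EIs, and vice versa.
    """
    return stack.transpose(2, 3, 0, 1)

# A set of 11 x 11 microimages with 5 x 5 pixels each becomes
# a set of 5 x 5 elemental images with 11 x 11 pixels each.
micro = np.random.rand(11, 11, 5, 5)
eis = transpose_plenoptic(micro)
```

Applying the mapping twice recovers the original stack, which reflects the "and vice versa" symmetry stated in the text.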
As in the previous section, once we have the collection of EIs, we can compose a multi-perspective movie with the EIs as the frames (see Visualization 4), or calculate the refocused images of the 3D scene (see Visualization 5). Note, however, that now the parallax, which is determined by the diameter of the aperture stop, is much smaller than when the capture is made with an array of digital cameras. The quality of the refocused images is worse as well, for two reasons: the smaller parallax and the lower number of pixels available when using a single image sensor.
4.6. Resolution and Depth of Field in Calculated EIs
To understand the concepts of resolution and DoF in any calculated EI, we particularize our analysis to the central EI; naturally, the equations that we deduce are applicable to the other EIs. As explained above, extracting the central pixel of each microimage, and composing them in the same order, provides the central calculated EI. Then, according to the Nyquist criterion, two points of the object are resolved in the calculated EI provided that their images fall on different microlenses, with at least one microlens of separation between them. In mathematical terms, this geometric resolution limit can be expressed as
The diffraction resolution limit is given, as in previous cases, by . It is apparent that the value of the pitch is much higher than the product , because of the small wavelength of light. Therefore, the geometric factor strongly dominates over the diffractive one. Thus, we can conclude that , where cEI stands for the computed EI. This result confirms the strong loss of resolution inherent to plenoptic cameras, which is, however, the price paid for the capture of dense angular information.

To calculate the , we simply need to adapt Eq. (22), taking into account that the lenslet-array pitch has the same effect as the pixel size; that is,
where . Again, the geometrical term is much larger than the diffractive one, which is negligible. As an example, we calculate the resolution and DoF corresponding to the experiment shown in Fig. 23. In this case, , (for the central jug), (mid-spectrum wavelength), and . Then we obtain , , , , and .
In general, there are trade-offs between spatial resolution, field of view, and depth of field, which are challenges facing 3D integral imaging and light-field systems. Some approaches to remedy these issues include resolution-priority and depth-priority integral imaging systems [78], and dynamic integral imaging systems [79–84]. Some other approaches have proposed increased ray sampling or post-processing [85,86].
Generally speaking, InIm displays should have fewer problems with eye fatigue, since the 3D object is reconstructed optically, as opposed to stereoscopic display systems, where there is a convergence-accommodation conflict. However, poor angular resolution may affect the accommodation response [87].
5. Display of Plenoptic Images: the Integral Monitor
The original idea of Lippmann was to use the elemental images for the display of 3D scenes. Specifically, he proposed inserting the elemental images in front of a MLA similar to the one used in the capture stage. The light emitted by any point source generates a light cone after passing through the corresponding microlens. The real intersection of the light cones in front of the MLA (or the virtual one behind the MLA) produces a local concentration of light that reproduces the irradiance distribution of the original 3D scene. The observer perceives the irradiance reconstruction as 3D. In Fig. 24, we show a scheme illustrating the integral photography process.
It is important to point out that there is an essential difference between the integral photography concept and stereoscopic (or auto-stereoscopic) systems. Stereoscopy is based on the production of two images from two different perspectives: the left and the right perspective images. These images are projected, by different means, onto the left and the right retinas of the viewer. The two retinal images have a distance-dependent disparity, which stimulates a change in the convergence of the binocular visual axes to allow fusion of the right and left images. As a result, the scene is perceived as 3D by the human visual system. The main issue here is that stereoscopic systems do not produce real 3D scenes, but stereo pairs that are fused by the brain to generate a perception of depth. Such fusion is obtained at the cost of decoupling the physiological processes of binocular convergence and eye accommodation. This decoupling is a non-natural process which, when maintained for some time, may produce visual discomfort and adverse effects, such as headache, dizziness, and nausea. An integral monitor, by contrast, does not produce stereo images. Instead, it produces real concentrations of light that optically form 3D images, which are observed without decoupling convergence and accommodation. Thus, the detrimental effects of the convergence-accommodation conflict are avoided.
In order to implement Lippmann's ideas with modern opto-electronic devices, one should first take into account that the function of the microlenses here is not to produce individual images of the microimages, but to produce light-swords that intersect to create the expected local concentrations of light. Thus, in order to make the light-swords as narrow as possible, and also to avoid the facet-braiding effect [72], the MLA should be set such that its focal plane coincides with the position of the panel's pixels.
The second important issue is resolution. As explained above, one of the major limitations of plenoptic technology comes from the trade-off between angular and spatial resolution. In the case of integral displays, the observer can see only a single pixel through any microlens; thus, the display resolution unit is just the pitch of the MLA. Taking into account that the angular resolution limit of the human eye is about 0.3 mrad, one can calculate the optimum pitch as a function of the observation distance. For example, in the case of a tablet device observed from about 0.4 m, the optimum value for the pitch would be about 0.12 mm. In the case of a TV observed from about 3.0 m, the optimum pitch would be about 0.9 mm.
Regarding the angular resolution, it is remarkable that there are no significant studies on optimum values for integral displays. However, with auto-stereoscopic display systems, which are based on stereovision, there is more extensive experience, and there are even commercial auto-stereoscopic monitors [88]. In this case, it is widely accepted that 8–12 pixels per microlens can provide a continuously smooth angular experience. This conclusion can be extrapolated to the case of integral displays. Thus, the required pixel size would be about 0.015 mm (1700 dpi) for tablet devices and 0.12 mm (225 dpi) for TVs. Note that currently there are commercial tablets with 359 ppi and commercial TVs with 102 ppi.
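The two back-of-envelope rules above (optimum pitch ≈ viewing distance × the eye's 0.3 mrad limit; required pixel size ≈ pitch divided by the number of views per microlens) can be collected in a small helper. The default of 8 pixels per microlens is taken from the lower end of the 8–12 range quoted in the text:

```python
def integral_display_specs(viewing_distance_m, eye_limit_rad=0.3e-3, px_per_lens=8):
    """Back-of-envelope integral-display parameters.

    Returns (optimum MLA pitch in mm, required pixel size in mm, dots per inch).
    """
    pitch_mm = viewing_distance_m * eye_limit_rad * 1e3   # pitch = distance * angle
    pixel_mm = pitch_mm / px_per_lens                     # one view per display pixel
    dpi = 25.4 / pixel_mm                                 # 25.4 mm per inch
    return pitch_mm, pixel_mm, dpi

tablet = integral_display_specs(0.4)   # pitch ~0.12 mm, pixel ~0.015 mm (~1700 dpi)
tv = integral_display_specs(3.0)       # pitch ~0.9 mm, pixel ~0.11 mm (~225 dpi)
```

These numbers reproduce the tablet and TV figures quoted in the text.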
Another issue of interest is that the quality of the reconstructed images worsens as the reconstruction plane moves away from the plane of the screen. To minimize this problem, and also to provide a good 3D experience, the integral monitor should be designed in such a way that it displays the 3D image in the neighborhood of the MLA, with some parts floating behind the panel and others floating in front of it.
As an example of the implementation of an integral display, we used the microimages recorded with a plenoptic camera. Note, however, that to avoid a pseudoscopic effect, it is necessary to rotate every microimage by an angle . For the integral monitor, we used a Samsung tablet SM-T700 (359 ppi) and a MLA consisting of lenslets of focal length and pitch (Model 630 from Fresnel Technology). In our experiment, we displayed on the tablet the microimages shown in Fig. 23(a), but rotated by an angle and upsized so that each was composed of pixels. After fixing and aligning the MLA with the tablet, we implemented the integral monitor shown in Fig. 25.
To demonstrate the full parallax of the displayed images, we recorded pictures of the monitor from many vertical and horizontal perspectives. From these pictures, we composed Visualization 6. Note from the video that although plenoptic frames are very well adapted for the display task, they produce 3D images with poor parallax. This is due to the small size of the entrance pupil, inherently associated with photographic objectives.
The poor parallax associated with the use of microimages captured with plenoptic cameras can be overcome by using the elemental images captured with an array of digital cameras. In this case, the parallax is determined by the angle subtended by the outer cameras as seen from the center of the ROP. The only constraint is that the region of interest must be within the field of view of all the cameras of the array. From the captured EIs, and by application of the pixel-mapping algorithm, that is, the plenoptic-map transposition, the microimages are computed. Before applying the algorithm, the EIs must be resized so that their number of pixels equals the number of lenses of the MLA in the integral monitor. After applying the algorithm, the microimages should be resized so that their number of pixels equals the number of pixels behind each microlens in the integral monitor. As an example, in Fig. 26 we show the same EIs and microimages as in Fig. 14, but resized accordingly. The result of the display, as seen by an observer placed in front of the monitor, is shown in Visualization 7.
An interesting point here is that some easy manipulations of the captured elemental images are possible. For example, by cropping all the elemental images, one can narrow their field of view [89] and therefore simulate an approach to the scene (see Visualization 8).
6. Integral Microscopy
Three-dimensional live microscopy is important for the comprehension of some biomedical processes. The ability to obtain fast stacks of depth images is important for the study of the high-speed dynamics of biological functions, or of the response of biological systems and tissues to rapid external perturbations.
In current 3D techniques, such as confocal microscopy [90–94], structured-illumination microscopy [95–97], or light-sheet microscopy [98,99], the 3D image is not recorded in a single shot, but is obtained computationally after recording a stack of 2D images of different sections of the sample. The stack is captured by mechanical axial scanning of the specimen, which can slow down the acquisition or introduce distortions due to vibrations. One solution that avoids the mechanical scanning is digital holographic microscopy [100–102], which allows the digital reconstruction of the wave field in the neighborhood of the sample. The main drawback of this technique is that it operates coherently, which makes fluorescence imaging impossible. More recently, the use of an electrically tunable lens has been proposed for obtaining stacks of 2D images of 3D specimens while avoiding mechanical vibrations [58,103–105]. In this case, the challenge is to reduce the aberrations introduced by the liquid lens.
An interesting alternative for capturing 3D microscopic images in a single shot is based on plenoptic technology. Plenoptic cameras have the drawback of capturing views with very poor parallax when imaging distant scenes. However, this problem is overcome when plenoptic cameras, and/or an integral imaging system with a single camera, are used with small scenes that are very close to the objective. These are precisely the conditions of optical microscopy. In Fig. 27(a), we show a schematic layout of an optical microscope, which is arranged by coupling, in a telecentric manner, a MO and a converging tube lens. In the scheme, we have tried to make visible the optical-design complexity of the MO, which is intended to produce aberration-free images over a large field of view. The microscope is designed to provide the sensor with highly magnified images of the focal plane. The magnification of the microscope is determined by the specifications of the MO, so that
where and are the focal lengths of the tube lens and the MO, respectively. Note that we have omitted a minus sign in Eq. (48); such a sign is irrelevant here, but would account for the inversion suffered by the image at the image sensor. Note as well that we have used the subscript "hst" because we refer to the optical microscope in which the MLA is inserted as the "host microscope", which, as explained later, yields an integral microscope.

Now we can particularize Eqs. (21) and (22) to the case of the optical microscope and state
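The magnification relation in Eq. (48), |M| = f_tube / f_MO, can be sketched numerically. The specific values below are hypothetical (not taken from the experiment in the text); the 180 mm design tube length is the value quoted later in this section:

```python
def host_magnification(f_tube_mm, f_mo_mm):
    """Lateral magnification (absolute value) of the host microscope,
    |M| = f_tube / f_MO (sign dropped, as in Eq. (48))."""
    return f_tube_mm / f_mo_mm

def effective_magnification(nominal_mag, f_tube_used_mm, f_tube_design_mm=180.0):
    """Rescaled magnification when the MO is used with a tube lens whose
    focal length differs from the design value the MO was rated for."""
    return nominal_mag * f_tube_used_mm / f_tube_design_mm

# Hypothetical example: a 20x-class MO (f_MO = 9 mm, rated for a 180 mm
# tube lens) used with a 200 mm tube lens.
M = effective_magnification(20, 200.0)   # ~22.2x
```

This rescaling is the same effect described at the end of this section, where the MO rated for a 180 mm tube lens is used with a different tube lens, changing the effective magnification.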
The implementation of an IMic is illustrated in Fig. 27(b). The key point is to insert an adequate MLA at the image plane of the host microscope. The sensor is then axially displaced toward the BFP of the microlenses. The MLA consists of only two surfaces, one plane and the other molded with an array of spherical diopters. As shown in the figure, the focal plane is imaged onto the curved surface. Then, a collection of microimages is obtained at the sensor. In order to avoid overlap between the microimages, and also to make effective use of the image sensor pixels, the numerical aperture of the microlenses and that of the MO in its image space should be equal. From the captured microimages, and after applying simple ABCD algebra, it is easy to calculate the plenoptic map at the aperture-stop plane:
This transformation is similar to the one shown in Fig. 22, but simpler since there is no shearing in this case. Thus, by plenoptic-map transposition (or a pixel-mapping procedure) it is possible to calculate the orthographic views of the 3D sample. In the views, the number of pixels is equal to the number of lenslets of the MLA.

For the evaluation of the resolution limit and the DoF provided by the IMic, we must adapt Eqs. (44) and (45) to the microscopy regime. If we use again the coefficient defined in Subsection 4.6 as the quotient between the MLA pitch and the Airy disk radius, and assume that such a quotient is always higher than one, we obtain the resolution limit of the views. Similarly, it is straightforward to find the DoF of the views and the DoF of the refocused images.

It is important to recall here that the IMic is a hybrid technique in which the capture is a purely optical process. In this process the setting of the optical elements, the optical aberrations, and the diffraction effects have great influence on the quality of the captured microimages. However, the calculation of the orthographic views and the corresponding computation of the refocused images are para-geometrical computational procedures in which it is assumed that ray optics is valid. Thus, a conflict between the diffractive nature of the capture and the para-geometrical nature of the computation can potentially appear, especially in the microscopy regime. To the best of our knowledge, no study has been published exploring the limits of the IMic; however, we have found from our own experiments that for the values of this coefficient used in our experiments the technique provides acceptable results, and the measured resolution limit and DoF are similar to the ones predicted by Eqs. (53)–(55).

To illustrate the utility of the IMic, we built in our laboratory a pre-prototype composed of a MO and a tube lens. Since the MO was designed to be coupled with a tube lens of 180 mm, the effective magnification was modified accordingly.
The MLA was composed of lenslets of matched pitch and numerical aperture (APO-Q-P80-R0.79, manufactured by AMUS). The coupling between the NAs was reasonably good. As a sample object, we used regular cotton, which provides an almost hollow specimen with a large depth range, composed of long fibers with thin structure. The fiber diameter varies from 11 to 22 µm. The sample was stained with fluorescent ink using a marking pen and illuminated with laser light. A chromatic filter was used to reject the non-fluorescent light. The captured microimages are shown in Fig. 28(a). From these microimages it is possible to calculate the corresponding orthographic views, and from them the refocused images. The experimental results are shown in Fig. 28 with associated Visualizations.
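The orthographic-view calculation by plenoptic-map transposition (pixel mapping) reduces, for ideal sampling, to an axis permutation of the 4D data. A minimal sketch in Python; the array sizes are illustrative, not those of the actual experiment:

```python
import numpy as np

# Synthetic stack of microimages: one per lenslet of a 16 x 16 MLA,
# each microimage sampled with 8 x 8 pixels (illustrative sizes).
n_lens, n_pix = 16, 8
rng = np.random.default_rng(0)
microimages = rng.random((n_lens, n_lens, n_pix, n_pix))

# Transposition: the k-th pixel of every microimage, collected over all
# lenslets, forms the k-th orthographic view of the 3D sample.
views = microimages.transpose(2, 3, 0, 1)

# Each view has as many pixels as the MLA has lenslets.
assert views.shape == (n_pix, n_pix, n_lens, n_lens)
assert views[3, 5, 10, 2] == microimages[10, 2, 3, 5]
```

Note that the transposition is purely a reindexing; in practice the microimages must first be extracted and registered from the raw sensor frame.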
These results illustrate the potential of the IMic, which from a single capture can provide multiple perspectives of microscopic samples.
Naturally, this is an incipient technology, and much work remains to be done to provide the IMic with competitive resolution and DoF, and also to implement the real-time display of microscopic views. In this sense, the recent proposal of capturing the light field emitted by a microscopic sample, but placing the MLA not at the image plane but at the Fourier plane of the microscope, is of great interest. This system has been named the Fourier integral microscope (FIMic) [50,106], and a scheme of it is shown in Fig. 29.
An advantage of the FIMic is that the orthographic views (named here elemental images) are recorded directly. Since in commercial MOs the aperture stop (AS) may not be accessible, a telecentric relay system may be necessary. The CCD is set at the BFP of the MLA. The field stop (FS) is chosen such that the EIs are tangent at the CCD; in other words,
The efficiency of the FIMic is determined mainly by a dimensionless parameter that accounts for the number of microlenses fitted within the aperture stop, and which also represents the number of EIs provided by the FIMic. This shrinkage gives rise to a reduction of the effective numerical aperture, which implies a reduction of the spatial resolution and an increase of the DoF. The spatial resolution of the EIs produced by the FIMic is determined by the competition between wave optics and sensor pixelation. According to wave optics, two points are distinguished in the EIs provided that the distance between them fulfills the condition
On the other hand, and according to Nyquist, the distance between the two points should be large enough to be recorded by different pixels, leaving at least one empty pixel in between. The combination of these two factors determines the resolution limit; if the pixel size is selected in such a way that the two terms have the same value, the optical and pixel limits match. Concerning the DoF, we adapt the classical formula to the effective numerical aperture [60]. If we compare these formulae with the ones obtained in the case of the IMic [Eqs. (52) and (53)], it follows that, given an IMic, it is possible to design a FIMic with the same resolution but much better DoF, or with the same DoF but much better resolution. This advantage of the FIMic is achieved, however, at the cost of producing a smaller number of views.

To illustrate the utility of the Fourier concept, we implemented a FIMic composed of the following elements: a MO, a relay system composed of two achromatic doublets, and a lens array composed of microlenses in a hexagonal arrangement (APH-Q-P1000-R2.95 from AMUS). The sensor was a CMOS camera (EO-5012c ½″). This sensor allows the capture of up to five EIs in the horizontal direction and up to four in the vertical direction. Each EI is circular, with a diameter of 454 pixels.
Assuming a standard value for this coefficient, this optical setup is able to produce EIs whose optical resolution, according to Nyquist, is well matched to the pixel resolution. Taking into account all these experimental parameters, the expected resolution and the expected DoF can be computed. As a sample we used again regular cotton.
We implemented a setup able to function in two different modes: bright-field and fluorescence. In the bright-field experiment the sample was illuminated with white light coming from a fiber bundle. In the fluorescence case, the sample was stained with fluorescent ink and illuminated with laser light. A chromatic filter was used to reject the laser light. In Fig. 30, we show the central elemental images obtained in the two experiments.
Any EI provided by the FIMic is directly an orthographic view of the 3D sample, and therefore a composition of them provides a multi-perspective movie. The movies are shown in Visualization 11 (bright-field) and Visualization 12 (fluorescence).
In order to illustrate the capacity of the FIMic for providing refocused images with good and homogeneous resolution along a large depth range, we calculated the refocused images from the EIs by direct application of the shift-and-sum algorithm. Here the refocusing depth is determined by the number of pixels by which the EIs are shifted.
Naturally, the precision of the depth calculation is limited by the quantization of the pixel shift. In Fig. 31 we show the refocused irradiances.
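The shift-and-sum computation can be sketched as follows. In this illustrative Python sketch, `np.roll` stands in for the integer shift, and the border treatment is a simplification (a real implementation would crop or zero-pad the shifted EIs):

```python
import numpy as np

def refocus(eis, shift):
    """Shift each elemental image in proportion to its index and average."""
    n_u, n_v, h, w = eis.shape
    out = np.zeros((h, w))
    for u in range(n_u):
        for v in range(n_v):
            # Shift proportional to the EI position relative to the center;
            # the shift magnitude selects the refocused depth plane.
            out += np.roll(eis[u, v],
                           (shift * (u - n_u // 2), shift * (v - n_v // 2)),
                           axis=(0, 1))
    return out / (n_u * n_v)

# 5 x 4 elemental images, matching the EI count of the FIMic sensor above.
eis = np.ones((5, 4, 64, 64))
plane = refocus(eis, shift=2)
assert plane.shape == (64, 64)
assert np.allclose(plane, 1.0)   # a uniform input stays uniform at any depth
```

Sweeping `shift` over a range of integers produces the stack of refocused planes; out-of-focus structures blur out while in-focus ones reinforce.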
It is remarkable that in the past some interesting research was aimed at designing new digital processing algorithms (see, for example, [107]) for improving the resolution of refocused images calculated from low-resolution view images. Naturally, this kind of computational tool could be applied as well to the high-resolution EIs obtained with the FIMic.
7. Conclusions
In the past decade, there have been substantially increasing interest and R&D activity in researching and implementing efficient technologies for the capture, processing, and display of 3D images. This interest is evidenced by the broad research and development efforts in government, industry, and academia on this topic. Among the 3D technologies, integral imaging is a promising approach for its ability to work without laser sources and under incoherent or ambient light. The image capture stage is well suited for outdoor scenes and for short- or long-range objects. Integral imaging systems have been applied in many fields, such as entertainment, industrial inspection, security and defense, and biomedical imaging and display, among others. This tutorial is intended for engineers, scientists, and researchers who are interested in learning about this 3D imaging technique, and it presents the fundamental principles needed to understand, analyze, and experimentally implement plenoptic, light-field, and integral-imaging-type capture and display systems.
The tutorial is prepared both for readers who are familiar with the fundamentals of optics and for those who may not have a strong optics background. We have reviewed the fundamentals of optical imaging, such as the geometrical optics and wave optics tools for the analysis of optical imaging systems. In addition, we have presented more advanced topics in 3D imaging and displays, such as the image capture stage, the manipulation of captured elemental images, the methods for implementing 3D integral imaging monitors, 3D reconstruction algorithms, performance metrics (such as lateral and longitudinal magnifications and field of view), and integral imaging applied to microscopy. We have presented and discussed simple laboratory setups and optical experiments to illustrate 3D integral imaging, light-field, and plenoptic principles.
While we have done our best to provide a tutorial on the fundamentals of integral imaging, light-field, and plenoptic systems, it is not possible to present an exhaustive coverage of the field in a single paper. Therefore, we apologize in advance if we have inadvertently overlooked some relevant work by other authors. A number of references [1–121], including overview papers, are provided to aid the reader with additional resources and a better understanding of this technology.
Appendix A: Fundamental Equations of Geometrical Optics and ABCD Formalism
This appendix presents a brief summary of the fundamentals of geometrical optics and of the ABCD formalism used to describe the spatial-angular state of optical rays. A convenient way of expressing the state of a ray at a given moment of its trajectory is by means of a 2D vector whose components are the spatial and the angular coordinates of the ray. The matrices that transform these spatial-angular states have dimensions 2 × 2 and are called ABCD matrices. In what follows, we show the advantages of the ABCD formalism and deduce the fundamental equations of geometrical optics [57].
A.1. ABCD Matrices for Ray Propagation and Refraction
We start by considering a ray of light that propagates in free space. Although no axis of symmetry is defined in this case, we can use a Cartesian reference system in which we define the optical axis along the z direction. The optical axis and the ray define a plane, named here the meridian plane, to which the trajectory of the ray is confined. This confinement happens in the case of free-space propagation and also in the case of refraction. Then, the state of a ray at a given plane perpendicular to the optical axis can be described with only two coordinates, one spatial and one angular. In Fig. 32, we show the trajectory of a single ray in free space and define the spatial-angular coordinates at two different planes separated by a distance t.
From Fig. 32, it is apparent that, using the small-angle approximation, x2 = x1 + t σ1 and σ2 = σ1, where (x, σ) denote the spatial and angular coordinates of the ray. These relations can be grouped in a single matrix equation,

(x2, σ2)ᵀ = T_t (x1, σ1)ᵀ, with T_t = [[1, t], [0, 1]],

where we have made an implicit definition of the ABCD matrix, T_t, corresponding to free-space propagation through a distance t. In what follows, we will call the two planes connected by an ABCD matrix the input plane and the output plane. It is also remarkable that in the above calculation, and also in all the forthcoming ABCD formalism, we assume the small-angle (or paraxial) approximation, so that sin σ ≈ tan σ ≈ σ. The next step is to calculate the ABCD matrix corresponding to the refraction at a diopter. Note that we name as diopter the surface that separates two media with different refractive indices, n1 and n2. In paraxial optics, diopters are typically plane or spherical.
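As a quick numerical check of the translation matrix, assuming the ray state is the column vector (x, σ) used above:

```python
import numpy as np

def propagation(t):
    """ABCD matrix for free-space propagation through a distance t."""
    return np.array([[1.0, t], [0.0, 1.0]])

# A ray at height 1 mm travelling at 0.02 rad, propagated over 50 mm.
ray_in = np.array([1.0, 0.02])            # (x, sigma)
ray_out = propagation(50.0) @ ray_in
assert np.allclose(ray_out, [2.0, 0.02])  # x' = x + t*sigma; angle unchanged
```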
To calculate the ABCD matrix of a spherical diopter, we need to follow the trajectory of two rays. Our deduction makes use of Snell's law for refraction at a diopter, that is, n1 sin ε1 = n2 sin ε2, which in the paraxial approximation becomes n1 ε1 = n2 ε2. We first note that the spatial coordinate does not change upon refraction (x2 = x1 for any ray), and then follow the ray that refracts at the vertex of the diopter [see ray (1) in Fig. 33(b)], for which σ2 = (n1/n2) σ1. In other words, we find the value of three elements of the matrix: A = 1, B = 0, and D = n1/n2. Following ray (2), which corresponds to a ray parallel to the optical axis (σ1 = 0), and taking into account the refraction at the diopter and the geometry of the spherical surface of radius r, we find C = −(n2 − n1)/(n2 r).
In this equation, we recognize the fact that all the rays that impinge on the diopter parallel to the optical axis focus at the same point, called the focal point, or simply the focus, F′. We also define the focal length of the diopter, f′ = n2 r/(n2 − n1), as the distance from the vertex to the focus. Thanks to the sign criterion, this definition of f′ covers all the possible cases. One example is the case shown in Fig. 33(b), that is, the convex, or converging, diopter, for which r > 0 and n2 > n1, and therefore f′ > 0. Another example is the concave diopter (r < 0 and n2 > n1), which produces diverging rays with f′ < 0. We can write the ABCD matrix corresponding to a spherical diopter as

R = [[1, 0], [−(n2 − n1)/(n2 r), n1/n2]] = [[1, 0], [−1/f′, n1/n2]].
Naturally, this matrix can be written for the particular case of the plane diopter (r → ∞), for which it reduces to [[1, 0], [0, n1/n2]] [see Fig. 33(a)].

A.2. ABCD Matrices for Thick and Thin Lenses
Making use of the two canonical matrices, T and R, discussed before, we can tackle the study of any paraxial optical system. First, we calculate the matrix corresponding to a lens made of glass of refractive index n, and with thickness t (see Fig. 34).
In this case, any ray that impinges on the lens suffers three transformations in cascade: first, refraction at the front surface; then propagation over the distance t; and finally refraction at the rear surface. In the ABCD formalism, the matrix of the lens is the result of the product of the corresponding refraction and propagation matrices:
From this matrix one can define the focal length, f, of the thick lens as

1/f = (n − 1) [1/r1 − 1/r2 + (n − 1) t / (n r1 r2)],

where r1 and r2 are the radii of the front and rear surfaces. Also in this case, and depending on its geometry, a lens can be convergent (f > 0) or divergent (f < 0). An equivalent way of describing the capacity of lenses for focusing light beams is through their optical power, defined as P = 1/f, which is measured in diopters (1 D = 1 m⁻¹). In what follows, we use, at convenience, f or P to describe the focusing capacity of a lens. It is beneficial to list some general properties of ABCD matrices:
- (a) The determinant of any ABCD matrix, det = AD − BC, equals n1/n2, the quotient between the refractive indices at the input and output planes. Naturally, in the case of operating between planes immersed in media with the same refractive index, det = 1.
- (b) In the case of B = 0, the output position x2 = A x1 is independent of the ray angle σ1. Consequently, all the rays emitted by a point on the input plane cross at a single point on the output plane. This means that the output point is the image of the input one; in other words, the two planes are conjugate through the optical system. Thus, we can state as a general property that if an ABCD matrix operates between two conjugate planes, then B = 0. In any other case, B ≠ 0. In addition, the following conclusions can be made:
- (b1) If B = 0, the element A represents the lateral magnification between the conjugate planes. From now on, we will denote the lateral magnification of any imaging system by the letter M.
- (b2) If B = 0, and considering only rays proceeding from the axial point of the object plane (x1 = 0), the element D represents the angular magnification, to which we assign the letter γ.
- (c) For all the planes connected by an ABCD matrix, the element C is always determined by the focal length of the system, C = −1/f.
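The thick-lens cascade and properties (a) and (c) can be verified numerically. The sketch below assumes the conventions used above, with state vector (x, σ) and refraction matrix [[1, 0], [−(n2 − n1)/(n2 r), n1/n2]], and uses hypothetical lens data:

```python
import numpy as np

def propagation(t):        return np.array([[1.0, t], [0.0, 1.0]])
def refraction(n1, n2, r): return np.array([[1.0, 0.0],
                                            [-(n2 - n1) / (n2 * r), n1 / n2]])

# Hypothetical biconvex lens in air: n = 1.5, thickness 6 mm, radii +/-50 mm.
n, t, r1, r2 = 1.5, 6.0, 50.0, -50.0
lens = refraction(n, 1.0, r2) @ propagation(t) @ refraction(1.0, n, r1)

# Property (a): determinant equals the ratio of outer indices (here, 1).
assert np.isclose(np.linalg.det(lens), 1.0)

# Property (c): C = -1/f, with f given by the thick-lens formula above.
f_matrix = -1.0 / lens[1, 0]
f_lensmaker = 1.0 / ((n - 1) * (1/r1 - 1/r2 + (n - 1) * t / (n * r1 * r2)))
assert np.isclose(f_matrix, f_lensmaker)
print(round(f_matrix, 2))   # 51.02 (mm)
```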
Coming back to the particular case of the lens, it is remarkable that in many optical systems it is acceptable to consider that the quotient t/n (the element B of the thick-lens matrix) is vanishingly small and can be omitted in the ABCD matrix. In this case, known as the thin-lens approximation, we can write

L = [[1, 0], [−1/f, 1]],
where 1/f = (n − 1)(1/r1 − 1/r2). This approximation allows us to derive very easily some of the classical equations of geometrical optics. This is the case of the Gaussian conjugation equations, which are obtained as the result of calculating the matrix that operates between two planes that are conjugate through a thin lens (see Fig. 35). In the case of conjugate planes, the element B = 0, and therefore

1/a′ − 1/a = 1/f,

which is the well-known Gaussian lens equation, with the object and image distances a and a′ measured from the lens (negative to the left). Substituting this result into elements A and D of Eq. (A12), we obtain the lateral and the angular magnifications, M = a′/a and γ = a/a′ = 1/M.

Also of great interest is the calculation of the ABCD matrix between the FFP and the BFP of a thin lens (see Fig. 36). In this case, the matrix is

[[0, f], [−1/f, 0]].

This matrix yields a rotation by 90° (plus an anamorphic scaling) of the spatial-angular information; explicitly, x2 = f σ1 and σ2 = −x1/f. It is noticeable that, apart from their well-known capacity for forming images, optical lenses have the capacity of transposing the spatial-angular information of incident rays after propagation from the FFP to the BFP. In short, light beams with the same spatial content in the FFP have the same angular content in the BFP, and vice versa. In fact, this is the property that explains the well-known fact that if one places a point source at the FFP of a lens, a bundle of parallel rays is obtained at the BFP.

A.3. Principal Planes and the Nodal Points
We revisit the study of the thick lens and search for a special pair of conjugate planes which, in case they exist, have lateral and angular magnifications equal to one. A scheme of this situation is shown in Fig. 37.
Such planes are denoted the principal planes, named H and H′, and their positions can be easily calculated from the ABCD matrix of the thick lens:
From the above matrix it is straightforward to find that, when referred to the principal planes, the ABCD matrix of the thick lens reduces to the thin-lens form. In other words, any thick lens shows a behavior similar to that of a thin lens, provided that the origin for the axial distances is set at the principal planes, whose positions are given by Eq. (A17). The important outcome is that the conjugation equations deduced above for the case of thin lenses are also valid in the case of thick lenses. The axial points of the principal planes are named the nodal points (N and N′) of the thick lens, and are characterized by having angular magnification equal to one. Another important issue is that in thick lenses the focal length f, also named elsewhere the effective focal length (EFL), is measured from the principal plane. The distance between the rear diopter and the focus is known as the back focal length. As an example, we calculate the positions of the principal planes of the two converging lenses shown in Fig. 38, obtaining the EFL, the positions of the principal planes, and the back focal length for both the biconvex and the plano–convex lens.
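With the same conventions and the same hypothetical biconvex lens used earlier, the principal-plane positions follow from requiring unit magnification, which places H at (D − 1)/C from the front vertex and H′ at (1 − A)/C from the rear vertex:

```python
import numpy as np

def propagation(t):        return np.array([[1.0, t], [0.0, 1.0]])
def refraction(n1, n2, r): return np.array([[1.0, 0.0],
                                            [-(n2 - n1) / (n2 * r), n1 / n2]])

# Hypothetical biconvex lens in air: n = 1.5, thickness 6 mm, radii +/-50 mm.
n, t, r1, r2 = 1.5, 6.0, 50.0, -50.0
(A, B), (C, D) = refraction(n, 1.0, r2) @ propagation(t) @ refraction(1.0, n, r1)

efl = -1.0 / C            # effective focal length, measured from H'
h   = (D - 1.0) / C       # H  relative to the front vertex (positive: inside)
h_p = (1.0 - A) / C       # H' relative to the rear vertex (negative: inside)
bfl = efl + h_p           # back focal length: rear vertex to focus

print(round(efl, 2), round(h, 2), round(h_p, 2), round(bfl, 2))
# 51.02 2.04 -2.04 48.98 (all in mm)
```

As expected for a symmetric biconvex lens, the principal planes sit symmetrically inside the glass, and the back focal length is shorter than the EFL.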
Although we have not demonstrated it explicitly in this appendix, the concepts of principal planes and nodal points can be extended to any focal system [110]. Then, we can state that, regardless of its complexity, a focal system can be described by its focal length and its principal planes. Once those parameters are known, the matrix shown in Eq. (A18) can be used to calculate the position and size of an image or, in more general terms, the spatial-angular properties at any propagated distance.
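This capacity can be checked by brute force with thin-lens matrices. The sketch below uses the sign convention that distances are negative to the left of the lens, and verifies both the conjugation relations and the 90° rotation between FFP and BFP:

```python
import numpy as np

def propagation(d): return np.array([[1.0, d], [0.0, 1.0]])
def thin_lens(f):   return np.array([[1.0, 0.0], [-1.0 / f, 1.0]])

f, a = 100.0, -300.0                 # focal length; object 300 mm to the left
a_img = 1.0 / (1.0 / f + 1.0 / a)    # Gaussian equation: 1/a' - 1/a = 1/f
system = propagation(a_img) @ thin_lens(f) @ propagation(-a)

assert abs(system[0, 1]) < 1e-9              # B = 0: conjugate planes
assert np.isclose(system[0, 0], a_img / a)   # lateral magnification M = a'/a
assert np.isclose(system[1, 1], a / a_img)   # angular magnification 1/M

# FFP-to-BFP matrix: a 90-degree rotation of the spatial-angular coordinates.
ffp_bfp = propagation(f) @ thin_lens(f) @ propagation(f)
assert np.allclose(ffp_bfp, [[0.0, f], [-1.0 / f, 0.0]])
```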
One example of this capacity is the case in which we have a focal system of which we know only the position of the focal planes and the EFL. Suppose that we want to know the position and size of the image of an object that is placed at a distance z from the front focal point F (see Fig. 39). To solve this problem, we only need to calculate the matrix:
From this matrix we infer that conjugate planes are related by the equation z z′ = −f², where z and z′ are measured from the focal points F and F′, respectively. The lateral magnification between the image and the object plane is M = −f/z = −z′/f. These two equations are known as the Newtonian conjugation equations.

Appendix B: Fundamental Equations of Wave Optics Theory of Image Formation
This appendix presents a brief summary of the fundamentals of wave-optics free-space propagation and of the interaction of waves with lenses. The main outcome of this appendix is that a converging lens has the ability to transpose the spatial-frequency information carried by a light beam. In the main text of this paper we show that this essential characteristic helps to explain the image-formation capacity of optical systems.
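This transposition property can be previewed numerically: in a discrete model, a point source (a delta) at the FFP maps to a field of uniform magnitude at the BFP, the sampled analog of a plane wave. A minimal sketch, with scaling constants omitted:

```python
import numpy as np

# Discrete delta at the center of a 64 x 64 FFP grid.
field_ffp = np.zeros((64, 64), dtype=complex)
field_ffp[32, 32] = 1.0

# The converging lens performs a 2D Fourier transform between FFP and BFP
# (up to constant amplitude and phase factors, which are omitted here).
field_bfp = np.fft.fft2(field_ffp)

# Uniform magnitude everywhere: the discrete counterpart of a plane wave.
assert np.allclose(np.abs(field_bfp), 1.0)
```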
B.1. Interferences between Waves
We start by considering a monochromatic plane light wave that propagates in vacuum along the z direction with speed c. Its amplitude is given by [59]
where λ is the spatial period (or wavelength in vacuum) and T is the temporal period, which are related through λ = cT, where c is the speed of light in vacuum. In addition, k = 2π/λ is the wavenumber, and ω = 2πν is the angular frequency, with ν = 1/T the temporal frequency of the wave. In Eq. (B1) the acronym cc refers to the complex-conjugate term. As an example, we consider the monochromatic wave emitted by a He–Ne laser, λ = 632.8 nm. Taking into account that ν = c/λ, the temporal frequency of this wave is about 4.74 × 10¹⁴ Hz. Let us remark that currently there is no instrument capable of detecting such a fast waveform, and therefore the wave nature of a light beam is not directly detectable. However, as shown below, the interference phenomenon allows us to perceive and measure the wave parameters. It is common in wave optics to use the complex representation of monochromatic waves and omit the complex-conjugate term. When the spatial information is the main interest, it is also usual to omit the temporal term and concentrate on the spatial variations. Thus, the amplitude of a monochromatic plane wave is usually written as
If the plane wave propagates along a direction that forms angles θx, θy, and θz with the axes x, y, and z, respectively, the amplitude is given by

u(x, y, z) = A exp[ik(x cos θx + y cos θy + z cos θz)],

where cos²θx + cos²θy + cos²θz = 1. The complex representation of the amplitude of a spherical wave (produced by a monochromatic point source), evaluated at a point placed at a distance r from the point source, is
To avoid dealing with the functional square root and to simplify the analysis, it is usual to perform the paraxial (or small-angle, or parabolic) approximation, which assumes that the field is evaluated only in regions in which x² + y² ≪ z². In this case, the spherical wave can be written in its parabolic form. The paraxial approximation simplifies the analysis. For example, consider the classical Young experiment, in which the interference of two monochromatic spherical waves is obtained (see Fig. 40). If the pinholes are sufficiently small, we can consider that each one produces a monochromatic spherical wave. On the screen, we can observe the interference between the two spherical wavefronts. The amplitude distribution on the screen is given by the sum of the amplitudes of two mutually shifted spherical waves. What is captured by any light detector, such as the human retina or a CCD camera, is not the amplitude distribution of the light but the irradiance distribution, which is proportional to the intensity (or squared modulus of the amplitude) distribution, where we have omitted a constant proportionality factor. We find that, as the result of the Young experiment, a set of cosine interference fringes is obtained, with period p = λD/d, where d is the separation between the pinholes and D is the distance from the pinholes to the screen. As an example, for the light emitted by a He–Ne laser (λ = 632.8 nm), two pinholes, and a screen at a given distance, the period of the fringes follows directly from this formula. We infer from this experiment that (1) the interference makes perceptible the wave nature of light, and (2) the wave nature of light appears when the light passes through small obstacles.

B.2. Interferences between Multiple Waves: the Concept of Field Propagation
Next, we shall study a much more general case in which we replace the two pinholes by a diffracting screen, which can be considered as being composed of a continuous distribution of pinholes, each having a different transmittance. In this case, to obtain the amplitude distribution at a distance z, we must consider multiple interferences, which are described by the superposition of a continuous distribution of spherical waves with different amplitudes and phases. This superposition can be described by using the following 2D integral:
Equation (B8) can be considered as a convolution between two functions. In this equation, the first function is the continuous-magnitude counterpart of the discrete sum in Eq. (B6) and represents the amplitude transmittance of the screen; the second is a quadratic phase function; and the symbol ⊗ represents the convolution operator. From Eq. (B8) we find that the amplitude distribution at a distance z from the diffracting screen is given by the 2D convolution between the amplitude transmittance of the screen and this quadratic phase function, which represents the PSF associated with the free-space propagation of light waves. Note that a more rigorous, and tedious, deduction of this formula would yield a factor iλz in the denominator of Eq. (B10). This factor does not appear in our deduction, but we have included it for the sake of rigorousness.

B.3. Propagation of Light Waves through Converging Lenses
The next step toward our aim of analyzing image formation in terms of wave optics is to define the amplitude transmittance of a thin lens. To this end we use a heuristic reasoning based on the well-known capacity of lenses for focusing plane waves [see Fig. 41(a)]. Specifically, a thin lens transforms an incident plane wave into a converging spherical wave. Then we can define the transmittance of a lens as
With these analytical tools we can calculate how the wave field propagates from the FFP to the BFP of a lens [see Fig. 41(b)]. Proceeding in a way similar to the one used in the ABCD formalism, we simply have to apply in cascade a propagation by distance f, the action of the lens, and again a propagation by distance f. As the first step, we calculate
The integral in Eq. (B12) is easily recognized as the Fourier transform of the product of two functions, and therefore it can be rewritten in terms of Fourier transforms. To obtain this result, we have made use of three well-known properties. The first is that the Fourier transform of a product of two functions is equal to the convolution between their Fourier transforms (and vice versa). The second is the scaling property of the convolution operation. And the third is the well-known Fourier transform of a quadratic phase function, which is again a quadratic phase function. As the second step toward calculating the amplitude distribution at the BFP of the lens, we calculate the effect of the lens on the impinging light wave, that is,
Finally, we obtain the amplitude at the BFP of the lens after calculating the propagation by distance f. To obtain this result, we have taken into account that the convolution of a quadratic phase function with its conjugate yields the 2D Dirac delta function δ, together with the well-known sifting property of the delta function. If we omit the irrelevant constant amplitude and phase factors in Eq. (B18), we find that converging lenses have the ability to perform, in real time, the 2D Fourier transform of the amplitude distribution at the lens FFP. Although this property has been deduced here for the case of a thin lens, it is valid for a thick lens and, in general, for any focusing system. In other words, and similar to what was obtained in geometrical optics with the ABCD formalism, a lens has the capacity of transposing, from the FFP to the BFP, the spatial-frequency information of the light beam. An example of this is that a point source, represented by a delta function and placed in the FFP of a lens, is transformed into a plane wave, and vice versa, as shown in Eq. (B21).

Funding
Ministerio de Economía y Competitividad (MINECO) (DPI2015-66458-C2-1R); Generalitat Valenciana (PROMETEOII/2014/072); National Science Foundation (NSF) (NSF/IIS-1422179); Office of Naval Research (ONR) (N000141712561, N000141712405); Army (W909MY-12-D-0008).
Acknowledgment
We thank A. Dorado, A. Llavador, and G. Scrofani from the University of Valencia for their help in obtaining many of the images shown in the paper. We thank the editor in chief, Prof. Govind Agrawal, and editorial staff member Rebecca Robinson for their support of this paper. B. Javidi acknowledges support in part under NSF, ONR, and Army.
References and Notes
1. C. Wheatstone, “Contributions to the physiology of vision,” Philos. Trans. R. Soc. London 4, 76–77 (1837).
2. W. Rollmann, “Notiz zur Stereoskopie,” Ann. Phys. 165, 350–351 (1853). [CrossRef]
3. S. S. Kim, B. H. You, H. Choi, B. H. Berkeley, D. G. Kim, and N. D. Kim, “World’s first 240 Hz TFT-LCD technology for full-HD LCD-TV and its application to 3D display,” in SID International Symposium Digest of Technical Papers (2009), Vol. 40, pp. 424–427.
4. H. Kang, S. D. Roh, I. S. Baik, H. J. Jung, W. N. Jeong, J. K. Shin, and I. J. Chung, “A novel polarizer glasses-type 3D displays with a patterned retarder,” in SID International Symposium Digest of Technical Papers (2010), Vol. 41, pp. 1–4.
5. F. L. Kooi and A. Toet, "Visual comfort of binocular and 3D displays," Displays 25, 99–108 (2004). [CrossRef]
6. H. Hiura, K. Komine, J. Arai, and T. Mishina, “Measurement of static convergence and accommodation responses to images of integral photography and binocular stereoscopy,” Opt. Express 25, 3454–3468 (2017). [CrossRef]
7. T. Okoshi, “Three-dimensional displays,” Proc. IEEE 68, 548–564 (1980). [CrossRef]
8. J.-Y. Son, V. V. Saveljev, Y.-J. Choi, J.-E. Bahn, S.-K. Kim, and H. Choi, “Parameters for designing autostereoscopic imaging systems based on lenticular, parallax barrier, and integral photography plates,” Opt. Eng. 42, 3326–3333 (2003). [CrossRef]
9. K. Muller, P. Merkle, and T. Wiegand, “3-D video representation using depth maps,” Proc. IEEE 99, 643–656 (2011). [CrossRef]
10. J. Geng, “Three-dimensional display technologies,” Adv. Opt. Photon. 5, 456–535 (2013). [CrossRef]
11. D. E. Smalley, E. Nygaard, K. Squire, J. Van Wagoner, J. Rasmussen, S. Gneiting, K. Qaderi, J. Goodsell, W. Rogers, M. Lindsey, K. Costner, A. Monk, M. Pearson, B. Haymore, and J. Peatross, “A photophoretic-trap volumetric display,” Nature 553, 486–490 (2018). [CrossRef]
12. S. Tay, P. A. Blanche, R. Voorakaranam, A. V. Tunç, W. Lin, S. Rokutanda, T. Gu, D. Flores, P. Wang, G. Li, P. St Hilaire, J. Thomas, R. A. Norwood, M. Yamamoto, and N. Peyghambarian, “An updatable holographic three-dimensional display,” Nature 451, 694–698 (2008). [CrossRef]
13. G. Lippmann, “Epreuves reversibles donnant la sensation du relief,” J. Phys. 7, 821–825 (1908).
14. D. F. Coffey, “Apparatus for making a composite stereograph,” U.S. patent 2063985A (December 15, 1936).
15. N. Davies, M. McCormick, and L. Yang, “Three-dimensional imaging systems: a new development,” Appl. Opt. 27, 4520–4528 (1988). [CrossRef]
16. H. Arimoto and B. Javidi, “Integral three-dimensional imaging with computed reconstruction,” Opt. Lett. 26, 157–159 (2001). [CrossRef]
17. S. Manolache, A. Aggoun, M. McCormick, N. Davies, and S. Y. Kung, “Analytical model of a three-dimensional integral image recording system that uses circular- and hexagonal-based spherical surface microlenses,” J. Opt. Soc. Am. A 18, 1814–1821 (2001). [CrossRef]
18. F. Okano, H. Hoshino, J. Arai, and I. Yuyama, “Real-time pickup method for a three-dimensional image based on integral photography,” Appl. Opt. 36, 1598–1603 (1997). [CrossRef]
19. B. Javidi and F. Okano, Three-Dimensional Television, Video, and Display Technologies (Springer, 2002).
20. A. Isaksen, L. McMillan, and S. J. Gortler, “Dynamically reparameterized light fields,” in Proceedings of ACM SIGGRAPH (2000), pp. 297–306.
21. E. H. Adelson and J. R. Bergen, “The plenoptic function and the elements of early vision,” Comput. Models Visual Process. 1, 3–20 (1991).
22. E. H. Adelson and J. Y. A. Wang, “Single lens stereo with a plenoptic camera,” IEEE Trans. Pattern Anal. Mach. Intell. 14, 99–106 (1992). [CrossRef]
23. R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light field photography with a hand-held plenoptic camera,” Technical Report CSTR 2005-02 (2005).
25. https://raytrix.de.
26. N. Bedard, T. Shope, A. Hoberman, M. A. Haralam, N. Shaikh, J. Kovacevic, N. Balram, and I. Tosic, “Light field otoscope design for 3D in vivo imaging of the middle ear,” Biomed. Opt. Express 8, 260–272 (2017). [CrossRef]
27. H. Chen, V. Sick, M. Woodward, and D. Burke, “Human iris 3D imaging using a micro-plenoptic camera,” in Optics in the Life Sciences Congress, OSA Technical Digest (2017), paper BoW3A.6.
28. J. Liu, D. Claus, T. Xu, T. Keßner, A. Herkommer, and W. Osten, “Light field endoscopy and its parametric description,” Opt. Lett. 42, 1804–1807 (2017). [CrossRef]
29. A. Hassanfiroozi, Y. Huang, B. Javidi, and H. Shieh, “Hexagonal liquid crystal lens array for 3D endoscopy,” Opt. Express 23, 971–981 (2015). [CrossRef]
30. R. S. Decker, A. Shademan, J. D. Opfermann, S. Leonard, P. C. W. Kim, and A. Krieger, “Biocompatible near-infrared three-dimensional tracking system,” IEEE Trans. Biomed. Eng. 64, 549–556 (2017).
31. N. C. Pégard, H.-Y. Liu, N. Antipa, M. Gerlock, H. Adesnik, and L. Waller, “Compressive light-field microscopy for 3D neural activity recording,” Optica 3, 517–524 (2016). [CrossRef]
32. L. Cong, Z. Wang, Y. Chai, W. Hang, C. Shang, W. Yang, L. Bai, J. Du, K. Wang, and Q. Wen, “Rapid whole brain imaging of neural activity in freely behaving larval zebrafish (Danio rerio),” eLife 6, e28158 (2017). [CrossRef]
33. A. Klein, T. Yaron, E. Preter, H. Duadi, and M. Fridman, “Temporal depth imaging,” Optica 4, 502–506 (2017). [CrossRef]
34. T. Nöbauer, O. Skocek, A. J. Pernía-Andrade, L. Weilguny, F. Martínez Traub, M. I. Molodtsov, and A. Vaziri, “Video rate volumetric Ca2+ imaging across cortex using seeded iterative demixing (SID) microscopy,” Nat. Methods 14, 811–818 (2017). [CrossRef]
35. Y. Lv, H. Ma, Q. Sun, P. Ma, Y. Ning, and X. Xu, “Wavefront sensing based on partially occluded and extended scene target,” IEEE Photon. J. 9, 7801508 (2017).
36. S. Komatsu, A. Markman, A. Mahalanobis, K. Chen, and B. Javidi, “Three-dimensional integral imaging and object detection using long-wave infrared imaging,” Appl. Opt. 56, D120–D126 (2017). [CrossRef]
37. P. A. Coelho, J. E. Tapia, F. Pérez, S. N. Torres, and C. Saavedra, “Infrared light field imaging system free of fixed-pattern noise,” Sci. Rep. 7, 13040 (2017). [CrossRef]
38. H. Hua and B. Javidi, “A 3D integral imaging optical see-through head-mounted display,” Opt. Express 22, 13484–13491 (2014). [CrossRef]
39. A. Markman, J. Wang, and B. Javidi, “Three-dimensional integral imaging displays using a quick-response encoded elemental image array,” Optica 1, 332–335 (2014). [CrossRef]
40. http://real-eyes.eu/3d-displays/.
41. D. J. Brady, M. E. Gehm, R. A. Stack, D. L. Marks, D. S. Kittle, D. R. Golish, E. M. Vera, and S. D. Feller, “Multiscale gigapixel photography,” Nature 486, 386–389 (2012). [CrossRef]
42. H. Navarro, M. Martínez-Corral, G. Saavedra, A. Pons, and B. Javidi, “Photoelastic analysis of partially occluded objects with an integral-imaging polariscope,” J. Disp. Technol. 10, 255–262 (2014). [CrossRef]
43. L. D. Elie and A. R. Gale, “System and method for inspecting road surfaces,” U.S. patent 0096144A1 (April 6, 2017).
44. P. Drap, J. P. Royer, M. Nawaf, M. Saccone, D. Merad, A. López-Sanz, J. B. Ledoux, and J. Garrabou, “Underwater photogrammetry, coded target and plenoptic technology: a set of tools for monitoring red coral in the Mediterranean Sea in the framework of the ‘Perfect’ project,” in International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (2017), Vol. XLII-2/W3, pp. 275–282.
45. J. S. Jang and B. Javidi, “Three-dimensional integral imaging of micro-objects,” Opt. Lett. 29, 1230–1232 (2004). [CrossRef]
46. M. Levoy, R. Ng, A. Adams, M. Footer, and M. Horowitz, “Light field microscopy,” ACM Trans. Graph. 25, 924–934 (2006). [CrossRef]
47. M. Levoy, Z. Zhang, and I. McDowall, “Recording and controlling the 4D light field in a microscope using microlens arrays,” J. Microsc. 235, 144–162 (2009). [CrossRef]
48. A. Llavador, E. Sánchez-Ortiga, J. C. Barreiro, G. Saavedra, and M. Martínez-Corral, “Resolution enhancement in integral microscopy by physical interpolation,” Biomed. Opt. Express 6, 2854–2863 (2015). [CrossRef]
49. X. Lin, J. Wu, G. Zheng, and Q. Dai, “Camera array based light field microscopy,” Biomed. Opt. Express 6, 3179–3189 (2015). [CrossRef]
50. A. Llavador, J. Sola-Picabea, G. Saavedra, B. Javidi, and M. Martinez-Corral, “Resolution improvements in integral microscopy with Fourier plane recording,” Opt. Express 24, 20792–20798 (2016). [CrossRef]
51. B. Javidi, I. Moon, and S. Yeom, “Three-dimensional identification of biological microorganism using integral imaging,” Opt. Express 14, 12096–12108 (2006). [CrossRef]
52. P. Vilmi, S. Varjo, R. Sliz, J. Hannuksela, and T. Fabritius, “Disposable optics for microscopy diagnostics,” Sci. Rep. 5, 16957 (2015). [CrossRef]
53. S. Nagelberg, L. D. Zarzar, N. Nicolas, K. Subramanian, J. A. Kalow, V. Sresht, D. Blankschtein, G. Barbastathis, M. Kreysing, T. M. Swager, and M. Kolle, “Reconfigurable and responsive droplet-based compound micro-lenses,” Nat. Commun. 8, 14673 (2017). [CrossRef]
54. J. F. Algorri, N. Bennis, V. Urruchi, P. Morawiak, J. M. Sanchez-Pena, and L. R. Jaroszewicz, “Tunable liquid crystal multifocal microlens array,” Sci. Rep. 7, 17318 (2017). [CrossRef]
55. Strictly speaking, telecentricity means that both the entrance and the exit pupils are at infinity. To obtain this condition, the system must necessarily be afocal. However, the use of the word “telecentric” is often extended to systems that are simply afocal.
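The afocal condition invoked in this note can be stated compactly in the ray-transfer matrix formalism of Ref. 57; the following is a sketch under the usual paraxial (position, angle) ray-vector convention:

```latex
% Paraxial ray-transfer (ABCD) description: height x and angle \theta
% transform through the system as
\begin{pmatrix} x' \\ \theta' \end{pmatrix}
=
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} x \\ \theta \end{pmatrix}.
% The system is afocal if and only if C = 0, since then
% \theta' = D\,\theta: every incoming parallel ray bundle
% (constant \theta) exits as a parallel bundle, which is the
% condition for both pupils to lie at infinity.
```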
56. M. Born and E. Wolf, Principles of Optics (Cambridge University, 1999), Chap. 4.
57. A. Gerrard and J. M. Burch, Introduction to Matrix Methods in Optics (Wiley, 1975).
58. M. Martinez-Corral, P.-Y. Hsieh, A. Doblas, E. Sánchez-Ortiga, G. Saavedra, and Y.-P. Huang, “Fast axial-scanning widefield microscopy with constant magnification and resolution,” J. Disp. Technol. 11, 913–920 (2015). [CrossRef]
59. J. W. Goodman, Introduction to Fourier Optics (McGraw-Hill, 1996).
60. M. Pluta, Advanced Light Microscopy. Principles and Basic Properties (Elsevier, 1988).
61. The radiance is a radiometric quantity defined as the radiant flux per unit area and per unit solid angle, emitted by (or received by, or passing through) a differential surface in a given direction. The irradiance is the integral of the radiance over all angles.
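In symbols, with L the radiance and E the irradiance at a surface point, the relation described in this note reads as the standard radiometric identity below, where θ is the angle between the ray direction and the surface normal (the cos θ factor accounts for the projected area of the differential surface):

```latex
E = \int_{\Omega} L(\theta,\varphi)\,\cos\theta \,\mathrm{d}\Omega ,
\qquad
\mathrm{d}\Omega = \sin\theta \,\mathrm{d}\theta \,\mathrm{d}\varphi .
```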
62. M. Martinez-Corral, A. Dorado, J. C. Barreiro, G. Saavedra, and B. Javidi, “Recent advances in the capture and display of macroscopic and microscopic 3D scenes by integral imaging,” Proc. IEEE 105, 825–836 (2017). [CrossRef]
63. R. C. Bolles, H. H. Baker, and D. H. Marimont, “Epipolar-plane image analysis: an approach to determining structure from motion,” Int. J. Comput. Vis. 1, 7–55 (1987). [CrossRef]
64. In general, an epipolar image is a 2D slice of the plenoptic function with a zero angular value in the direction normal to this slice. However, we use a more restricted definition: an epipolar image is a 2D slice of the plenoptic function in which one spatial coordinate is fixed and the angular coordinate normal to the slice is set to zero.
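For a discretely sampled light field, extracting such a slice is a single indexing operation. The sketch below assumes a hypothetical 4D array indexed as (x, y, theta, phi), with the spatial coordinate y0 and the zero-angle index chosen for illustration:

```python
import numpy as np

# Hypothetical sampled plenoptic function L(x, y, theta, phi):
# two spatial coordinates and two angular coordinates on a regular grid.
rng = np.random.default_rng(0)
light_field = rng.random((32, 32, 9, 9))  # axes: (x, y, theta, phi)

# Restricted epipolar image: fix one spatial coordinate (y = y0) and
# set the angular coordinate normal to the slice to its zero (central)
# sample, leaving a 2D slice in the (x, theta) plane.
y0 = 16
phi_zero = light_field.shape[3] // 2  # index of the phi = 0 sample
epipolar = light_field[:, y0, :, phi_zero]

print(epipolar.shape)  # (32, 9): one row per x sample, one column per theta
```

In an epipolar image built this way, a point of the 3D scene traces a straight line whose slope encodes its depth, which is the property exploited in epipolar-plane image analysis (Ref. 63).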
65. R. Gorenflo and S. Vessella, Abel Integral Equations: Analysis and Applications, Lecture Notes in Mathematics (Springer, 1991), Vol. 1461.
66. A. Schwarz, J. Wang, A. Shemer, Z. Zalevsky, and B. Javidi, “Lensless three-dimensional integral imaging using a variable and time multiplexed pinhole array,” Opt. Lett. 40, 1814–1817 (2015). [CrossRef]
67. B. Wilburn, N. Joshi, V. Vaish, E.-V. Talvala, E. Antunez, A. Barth, A. Adams, M. Horowitz, and M. Levoy, “High performance imaging using large camera arrays,” ACM Trans. Graph. 24, 765–776 (2005). [CrossRef]
68. J. S. Jang and B. Javidi, “Three-dimensional synthetic aperture integral imaging,” Opt. Lett. 27, 1144–1146 (2002). [CrossRef]
69. X. Xiao, M. Daneshpanah, M. Cho, and B. Javidi, “3D integral imaging using sparse sensors with unknown positions,” J. Disp. Technol. 6, 614–619 (2010). [CrossRef]
70. S.-H. Hong, J.-S. Jang, and B. Javidi, “Three-dimensional volumetric object reconstruction using computational integral imaging,” Opt. Express 12, 483–491 (2004). [CrossRef]
71. M. Martinez-Corral, A. Dorado, A. Llavador, G. Saavedra, and B. Javidi, “Three-dimensional integral imaging and display,” in Multi-Dimensional Imaging, B. Javidi, E. Tajahuerce, and P. Andres, eds. (Wiley, 2014), Chap. 11.
72. H. Navarro, R. Martínez-Cuenca, A. Molina-Martín, M. Martínez-Corral, G. Saavedra, and B. Javidi, “Method to remedy image degradations due to facet braiding in 3D integral imaging monitors,” J. Disp. Technol. 6, 404–411 (2010). [CrossRef]
73. M. Levoy, “Volume rendering using the Fourier projection-slice theorem,” in Graphics Interface (1992), pp. 61–69.
74. R. Ng, “Fourier slice photography,” ACM Trans. Graph. 24, 735–744 (2005). [CrossRef]
75. J. P. Lüke, F. Rosa, J. G. Marichal-Hernández, J. C. Sanluís, C. Domínguez Conde, and J. M. Rodríguez-Ramos, “Depth from light fields analyzing 4D local structure,” J. Disp. Technol. 11, 900–907 (2015). [CrossRef]
76. H. Navarro, E. Sánchez-Ortiga, G. Saavedra, A. Llavador, A. Dorado, M. Martínez-Corral, and B. Javidi, “Non-homogeneity of lateral resolution in integral imaging,” J. Disp. Technol. 9, 37–43 (2013). [CrossRef]
77. M. Tanimoto, M. Tehrani, T. Fujii, and T. Yendo, “Free-viewpoint TV,” IEEE Signal Process. Mag. 28(1), 67–76 (2011). [CrossRef]
78. F. Jin, J. Jang, and B. Javidi, “Effects of device resolution on three-dimensional integral imaging,” Opt. Lett. 29, 1345–1347 (2004). [CrossRef]
79. J. S. Jang and B. Javidi, “Large depth-of-focus time-multiplexed three-dimensional integral imaging by use of lenslets with non-uniform focal lengths and aperture sizes,” Opt. Lett. 28, 1924–1926 (2003). [CrossRef]
80. J. S. Jang and B. Javidi, “Three-dimensional integral imaging with electronically synthesized lenslet arrays,” Opt. Lett. 27, 1767–1769 (2002). [CrossRef]
81. J. S. Jang and B. Javidi, “Improved viewing resolution of three-dimensional integral imaging with nonstationary micro-optics,” Opt. Lett. 27, 324–326 (2002). [CrossRef]
82. S. Hong and B. Javidi, “Improved resolution 3D object reconstruction using computational integral imaging with time multiplexing,” Opt. Express 12, 4579–4588 (2004). [CrossRef]
83. C.-W. Chen, M. Cho, Y.-P. Huang, and B. Javidi, “Improved viewing zones for projection type integral imaging 3D display using adaptive liquid crystal prism array,” J. Disp. Technol. 10, 198–203 (2014). [CrossRef]
84. T.-H. Jen, X. Shen, G. Yao, Y.-P. Huang, H.-P. Shieh, and B. Javidi, “Dynamic integral imaging display with electrically moving array lenslet technique using liquid crystal lens,” Opt. Express 23, 18415–18421 (2015). [CrossRef]
85. R. Martínez-Cuenca, G. Saavedra, M. Martinez-Corral, and B. Javidi, “Enhanced depth of field integral imaging with sensor resolution constraints,” Opt. Express 12, 5237–5242 (2004). [CrossRef]
86. K. Wakunami, M. Yamaguchi, and B. Javidi, “High resolution 3-D holographic display using dense ray sampling from integral imaging,” Opt. Lett. 37, 5103–5105 (2012). [CrossRef]
87. Y. Kim, J. Kim, K. Hong, H.-K. Yang, J.-H. Jung, H. Choi, S.-W. Min, J.-M. Seo, J.-M. Hwang, and B. Lee, “Accommodative response of integral imaging in near distance,” J. Disp. Technol. 8, 70–78 (2012). [CrossRef]
88. http://www.alioscopy.com/.
89. M. Martinez-Corral, A. Dorado, H. Navarro, G. Saavedra, and B. Javidi, “3D display by smart pseudoscopic-to-orthoscopic conversion with tunable focus,” Appl. Opt. 53, E19–E26 (2014). [CrossRef]
90. J. B. Pawley, Handbook of Biological Confocal Microscopy, 3rd ed. (Springer, 2006).
91. M. Martinez-Corral and G. Saavedra, “The resolution challenge in 3D optical microscopy,” Prog. Opt. 53, 1–67 (2009). [CrossRef]
92. M. Gu and C. J. R. Sheppard, “Confocal fluorescent microscopy with a finite-sized circular detector,” J. Opt. Soc. Am. A 9, 151–153 (1992). [CrossRef]
93. T. Wilson, Confocal Microscopy (Academic, 1990).
94. E. Sánchez-Ortiga, C. J. R. Sheppard, G. Saavedra, M. Martínez-Corral, A. Doblas, and A. Calatayud, “Subtractive imaging in confocal scanning microscopy using a CCD camera as a detector,” Opt. Lett. 37, 1280–1282 (2012). [CrossRef]
95. M. A. A. Neil, R. Juskaitis, and T. Wilson, “Method of obtaining optical sectioning by using structured light in a conventional microscope,” Opt. Lett. 22, 1905–1907 (1997). [CrossRef]
96. M. G. L. Gustafsson, “Surpassing the lateral resolution by a factor of two using structured illumination microscopy,” J. Microsc. 198, 82–87 (2000). [CrossRef]
97. A. G. York, S. H. Parekh, D. D. Nogare, R. S. Fischer, K. Temprine, M. Mione, A. B. Chitnis, A. Combs, and H. Shroff, “Resolution doubling in live, multicellular organisms via multifocal structured illumination microscopy,” Nat. Methods 9, 749–754 (2012). [CrossRef]
98. E. H. K. Stelzer, K. Greger, and E. G. Reynaud, Light Sheet Based Fluorescence Microscopy: Principles and Practice (Wiley-Blackwell, 2014).
99. P. A. Santi, “Light sheet fluorescence microscopy: a review,” J. Histochem. Cytochem. 59, 129–138 (2011). [CrossRef]
100. I. Moon and B. Javidi, “Three-dimensional identification of stem cells by computational holographic imaging,” J. R. Soc. Interface 4, 305–313 (2007). [CrossRef]
101. S. Ebrahimi, M. Dashtdar, E. Sanchez-Ortiga, M. Martinez-Corral, and B. Javidi, “Stable and simple quantitative phase-contrast imaging by Fresnel biprism,” Appl. Phys. Lett. 112, 113701 (2018). [CrossRef]
102. P. Picart and J. C. Li, Digital Holography (Wiley, 2012).
103. B. F. Grewe, F. F. Voigt, M. van’t Hoff, and F. Helmchen, “Fast two-layer two-photon imaging of neural cell populations using an electrically tunable lens,” Biomed. Opt. Express 2, 2035–2046 (2011). [CrossRef]
104. F. O. Fahrbach, F. F. Voigt, B. Schmid, F. Helmchen, and J. Huisken, “Rapid 3D light-sheet microscopy with a tunable lens,” Opt. Express 21, 21010–21026 (2013). [CrossRef]
105. J. M. Jabbour, B. H. Malik, C. Olsovsky, R. Cuenca, S. Cheng, J. A. Jo, Y.-S. L. Cheng, J. M. Wright, and K. C. Maitland, “Optical axial scanning in confocal microscopy using an electrically tunable lens,” Biomed. Opt. Express 5, 645–652 (2014). [CrossRef]
106. G. Scrofani, J. Sola-Pikabea, A. Llavador, E. Sanchez-Ortiga, J. C. Barreiro, G. Saavedra, J. Garcia-Sucerquia, and M. Martinez-Corral, “FIMic: design for ultimate 3D-integral microscopy of in-vivo biological samples,” Biomed. Opt. Express 9, 335–346 (2018). [CrossRef]
107. M. Broxton, L. Grosenick, S. Yang, N. Cohen, A. Andalman, K. Deisseroth, and M. Levoy, “Wave optics theory and 3-D deconvolution for the light field microscope,” Opt. Express 21, 25418–25439 (2013). [CrossRef]
108. K. Kwon, M. Erdenebat, Y. Lim, K. Joo, M. Park, H. Park, J. Jeong, H. Kim, and N. Kim, “Enhancement of the depth-of-field of integral imaging microscope by using switchable bifocal liquid-crystalline polymer micro lens array,” Opt. Express 25, 30503–30512 (2017). [CrossRef]
109. A. Dorado, M. Martinez-Corral, G. Saavedra, and S. Hong, “Computation and display of 3D movie from a single integral photography,” J. Disp. Technol. 12, 695–700 (2016). [CrossRef]
110. R. S. Longhurst, Geometrical and Physical Optics (Longman, 1973), Chap. 2.
111. J. Hong, Y. Kim, H.-J. Choi, J. Hahn, J.-H. Park, H. Kim, S.-W. Min, N. Chen, and B. Lee, “Three-dimensional display technologies of recent interest: principles, status, and issues,” Appl. Opt. 50, H87–H115 (2011). [CrossRef]
112. J.-Y. Son, H. Lee, B.-R. Lee, and K.-H. Lee, “Holographic and light-field imaging as future 3-D displays,” Proc. IEEE 105, 789–804 (2017). [CrossRef]
113. J. Arai, E. Nakasu, T. Yamashita, H. Hiura, M. Miura, T. Nakamura, and R. Funatsu, “Progress overview of capturing method for integral 3-D imaging displays,” Proc. IEEE 105, 837–849 (2017). [CrossRef]
114. B. Javidi, X. Shen, A. S. Markman, P. Latorre-Carmona, A. Martinez-Uso, J. M. Sotoca, F. Pla, M. Martinez-Corral, and G. Saavedra, “Multidimensional optical sensing and imaging system (MOSIS): from macroscales to microscales,” Proc. IEEE 105, 850–875 (2017). [CrossRef]
115. M. Yamaguchi, “Full-parallax holographic light-field 3-D displays and interactive 3-D touch,” Proc. IEEE 105, 947–959 (2017). [CrossRef]
116. M. Yamaguchi and K. Wakunami, “Ray-based and wavefront-based 3D representations for holographic displays,” in Multi-Dimensional Imaging, B. Javidi, E. Tajahuerce, and P. Andres, eds. (Wiley, 2014).
117. S. Park, J. Yeom, Y. Jeong, N. Chen, J.-Y. Hong, and B. Lee, “Recent issues on integral imaging and its applications,” J. Inf. Disp. 15, 37–46 (2014). [CrossRef]
118. M. Yamaguchi and R. Higashida, “3D touchable holographic light-field display,” Appl. Opt. 55, A178–A183 (2016). [CrossRef]
119. D. Nam, J.-H. Lee, Y.-H. Cho, Y.-J. Jeong, H. Hwang, and D.-S. Park, “Flat panel light-field 3-D display: concept, design, rendering, and calibration,” Proc. IEEE 105, 876–891 (2017). [CrossRef]
120. A. Stern, Y. Yitzhaky, and B. Javidi, “Perceivable light fields: matching the requirements between the human visual system and autostereoscopic 3-D displays,” Proc. IEEE 102, 1571–1587 (2014). [CrossRef]
121. B. Javidi and A. M. Tekalp, “Emerging 3-D imaging and display technologies,” Proc. IEEE 105, 786–788 (2017). [CrossRef]
Manuel Martinez-Corral was born in Spain in 1962. He received his Ph.D. degree in physics in 1993 from the University of Valencia, which honored him with the Ph.D. Extraordinary Award. He is currently a full professor of optics at the University of Valencia, where he co-leads the “3D Imaging and Display Laboratory.” His teaching experience includes lectures and the supervision of laboratory experiments for undergraduate students on geometrical optics, optical instrumentation, diffractive optics, and image formation. Dr. Martinez-Corral also lectures on diffractive optics for Ph.D. students. He has been a fellow of SPIE since 2010 and a fellow of the OSA since 2016. His research interests include microscopic and macroscopic 3D imaging and display technologies. He has supervised 17 Ph.D. students on these topics (three honored with the Ph.D. Extraordinary Award), published over 115 technical articles in major journals (cited more than 2500 times, h-index = 26), and delivered over 50 invited and keynote presentations at international meetings. He is co-chair of the SPIE conference “Three-Dimensional Imaging, Visualization, and Display.” He has been a topical editor of the IEEE/OSA Journal of Display Technology and is a topical editor of the OSA journal Applied Optics.
Prof. Bahram Javidi received a B.S. degree from George Washington University and a Ph.D. from Pennsylvania State University, both in electrical engineering. He is the Board of Trustees Distinguished Professor at the University of Connecticut. His interests span a broad range of transformative imaging approaches using optics and photonics, and he has made seminal contributions to passive and active multidimensional imaging from nanoscales to microscales and macroscales. His recent research activities include 3D visualization and recognition of objects in photon-starved environments; automated disease identification using biophotonics with low-cost compact sensors; information security, encryption, and authentication using quantum imaging; non-planar flexible 3D image sensing; and bio-inspired imaging. He has been named a fellow of several societies, including IEEE, The Optical Society (OSA), SPIE, EOS, and IoP. Early in his career, the National Science Foundation named him a Presidential Young Investigator. He has received the OSA Fraunhofer Award/Robert Burley Prize (2018), the Prize for Applied Aspects of Quantum Electronics and Optics of the European Physical Society (2015), the SPIE Dennis Gabor Award in Diffractive Wave Technologies (2005), and the SPIE Technology Achievement Award (2008). In 2008, he was awarded the IEEE Donald G. Fink Prize Paper Award and a John Simon Guggenheim Foundation Fellowship. In 2007, the Alexander von Humboldt Foundation (Germany) awarded him the Humboldt Prize. He is an alumnus of the Frontiers of Engineering of the National Academy of Engineering (2003–). His papers have been cited 39,000 times (h-index = 92) according to Google Scholar.