Ray-transfer functions for camera simulation of 3D scenes with hidden lens design

Open Access

Abstract

Combining image sensor simulation tools with physically based ray tracing enables the design and evaluation (soft prototyping) of novel imaging systems. These methods can also synthesize physically accurate, labeled images for machine learning applications. One practical limitation of soft prototyping has been simulating the optics precisely: lens manufacturers generally prefer to keep lens design confidential. We present a pragmatic solution to this problem using a black box lens model in Zemax; such models provide necessary optical information while preserving the lens designer’s intellectual property. First, we describe and provide software to construct a polynomial ray transfer function that characterizes how rays entering the lens at any position and angle subsequently exit the lens. We implement the ray-transfer calculation as a camera model in PBRT and confirm that the PBRT ray-transfer calculations match the Zemax lens calculations for edge spread functions and relative illumination.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

In many engineering disciplines, accurate simulation tools enable low-cost and rapid design (soft prototyping). To effectively support image systems simulation, tools must incorporate the full pipeline: the scene, camera optics, and image sensor. In addition to soft prototyping, image systems simulation is increasingly valuable for training machine learning algorithms. In the case of autonomous driving, simulation is being used to create training sets for challenging conditions [1–5].

There has been extensive work in computer graphics on three-dimensional scene simulation [6,7]. There is also a substantial history of simulating optics and sensors [8,9]. This article addresses an important challenge that has limited the ability to accurately simulate the optics as part of the prototyping: lens manufacturers rarely disclose details of the lens design. Knowledge of the lens properties is essential for accurate soft prototyping and accurate synthetic data used in machine learning applications. Although the lens design is often confidential, manufacturers are sometimes willing to provide a “black box” description of the optical system. Such black box data can be used with lens design software, such as Zemax OpticStudio, to analyze the lens properties that are essential for soft prototyping.

A common approach for simulating an unknown camera design makes use of point spread functions (PSF) at different field heights, object distances, and wavelengths [9–14]. This analysis takes into account geometrical, wave, and chromatic aberrations. An important development was learning how to interpolate the point spread correctly from a finite number of sampled PSFs. This was demonstrated by [15] and used by [16].

But important phenomena, such as ray occlusions (which change the PSF; see Fig. 12 in [17] for an example) or multiple inter-reflections, are not simulated from the PSF alone; at best, the PSF can be calculated for multiple depths. Occlusions and inter-reflections are present and visually significant in many scenes. Camera simulation of such complex three-dimensional spectral scenes is now practical due to advances in physically-based rendering and increased computing power [18–20]. Combining image sensor simulation tools (e.g., ISETCam [21]) with physically based rendering offers new possibilities for designing and evaluating imaging systems. Simulation can also be used to generate large numbers of training scenes along with accurate pixel-level labels. The ability to synthesize labeled and physically meaningful rendered scenes can significantly accelerate the training cycle for machine learning vision applications [3,22,23].

In this paper we describe how to use Zemax black box data to obtain an equivalent lens model - the ray transfer function - using polynomial approximations [24–26]. The ray transfer function, when embedded into a ray-tracer, accounts for three-dimensional effects (depth, occlusion). We evaluate the accuracy of the ray-transfer approximation using two image metrics: relative illumination and blurring. Finally, we provide a link to an open-source and free implementation of all these methods and related data.

2. Ray-transfer matrices and functions

For ray-tracing purposes, a lens can be considered a black box system that transforms an incoming ray on one side of the lens to an outgoing ray on another side of the lens. For many lenses, the paraxial ray trace can be accurately modeled using a Ray-Transfer Matrix [27]. In this framework, rays are represented by their intersection $p$ at some chosen input and output plane and their angle $\theta$ with respect to the optical axis (see Fig. 1(a)). The ray-transfer matrix then relates the angles and intersections at the input and output planes using the well-known ABCD formulation [28]

$$ \left[\begin{array}{l} p_{\text {out }} \\ \theta_{\text {out }} \end{array}\right]=\left[\begin{array}{ll} A & B \\ C & D \end{array}\right]\left[\begin{array}{l} p_{\text {in }} \\ \theta_{\text {in }} \end{array}\right]. $$
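For example, the paraxial trace through free-space propagation followed by a thin lens is the product of two such matrices. A minimal MATLAB sketch with illustrative values (not taken from any lens in this paper):

  t = 50;                          % propagation distance (mm), illustrative
  f = 100;                         % thin-lens focal length (mm), illustrative
  M_prop = [1 t; 0 1];             % free-space translation matrix
  M_lens = [1 0; -1/f 1];          % thin-lens refraction matrix
  M = M_lens * M_prop;             % matrices compose right to left
  rayIn  = [2; 0.01];              % [height p_in (mm); angle theta_in (rad)]
  rayOut = M * rayIn;              % [p_out; theta_out]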

Generalized ray-transfer matrix formulations for aberrated three-dimensional lens systems have been proposed as well [29–31].


Fig. 1. Ray-transfer functions (RTF) extend the classic ABCD ray-transfer matrices used for paraxial calculations. (a) The ray-transfer matrix has position and angle inputs with respect to two planes. (b) The RTF maps a position vector and a unit direction vector (6D) to an output position and direction vector (6D). (c) The RTF can represent the transformation using curved surfaces, which can be important for wide angle lenses (see Fig. 2(b)).


In this article, we generalize the matrix to a nonlinear formalism, i.e., Ray-Transfer Functions (RTF). The RTF maps incoming rays to outgoing rays in 3D space (Fig. 1). A ray is described by an origin vector (in boldface) $\boldsymbol {\mathbf {p}}\in \mathbb {R}^{3}$ and a direction vector $\boldsymbol {\mathbf {d}}\in \mathbb {R}^{3}$ (instead of angles). The transfer function that relates incoming and outgoing rays can be formulated as

$$\left[ \begin{array}{c} \boldsymbol{\mathbf{p}}_{\text{out}} \\ \hline \boldsymbol{\mathbf{d}}_{\text{out}} \end{array}\right] = \left[ \begin{array}{c} x\\ y\\ z\\ \hline d_x\\ d_y\\ d_z \end{array}\right]_{\text{out}} = \left[ \begin{array}{l} f_{x}(\boldsymbol{\mathbf{p}}_{\text{in}};\boldsymbol{\mathbf{d}}_{\text{in}})\\ f_{y}(\boldsymbol{\mathbf{p}}_{\text{in}};\boldsymbol{\mathbf{d}}_{\text{in}})\\ f_{z}(\boldsymbol{\mathbf{p}}_{\text{in}};\boldsymbol{\mathbf{d}}_{\text{in}})\\ \hline f_{d_x}(\boldsymbol{\mathbf{p}}_{\text{in}};\boldsymbol{\mathbf{d}}_{\text{in}})\\ f_{d_y}(\boldsymbol{\mathbf{p}}_{\text{in}};\boldsymbol{\mathbf{d}}_{\text{in}})\\ f_{d_z}(\boldsymbol{\mathbf{p}}_{\text{in}};\boldsymbol{\mathbf{d}}_{\text{in}}) \end{array}\right].$$

Extending the usual formulation to include the z-components of the ray origin and direction vectors is helpful for two reasons. First, the ray-transfer matrix method assumes that the output direction vector points away from the optics, so that $d_z=+\sqrt {1-d_x^{2}-d_y^{2}}$. This does not hold in general: for mirrors and wide-angle lenses, $d_z$ can be negative. Second, including the z component of the origin vector allows the RTF to be applied to curved output surfaces rather than only planes (Fig. 1(c)). This enables us to capture rays traveling backwards or parallel to the planes, which would otherwise be incorrectly represented (e.g., Fig. 2(b)).


Fig. 2. Illustration of the Ray Transfer Function (RTF). (a) The Double Gauss lens can be accurately simulated with rays represented on an input and output plane. (b) The wide angle lens can be accurately simulated with input rays on a plane and output rays on a spherical surface. Because of the high degree of bending near the aperture edge, certain rays would not intersect any output plane to the right of the final vertex.


3. RTF for rotationally symmetric lenses with vignetting

For most lenses, many incoming rays will not exit the optics due to f-number dependent vignetting. We implemented a method to determine whether an input ray will pass through the optics, analogous to testing against an entrance pupil. This ray-pass function effectively defines the valid input domain of the RTF. The ray-transfer function converts a valid input ray to an output ray. This architecture is summarized as a flowchart in Fig. 3.
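The flow is a guard followed by a transform; a minimal MATLAB sketch (function names are ours, for illustration; the production implementation is the PBRT camera described below):

  function [pOut, dOut, valid] = traceRTF(pIn, dIn, rayPassFcn, rtfFcn)
  % Trace one ray through the RTF camera model: test the ray against the
  % ray-pass function first; only accepted rays are transformed.
      valid = rayPassFcn(pIn, dIn);           % vignetting test (Section 4.4)
      if valid
          [pOut, dOut] = rtfFcn(pIn, dIn);    % polynomial transfer (Section 4.3)
      else
          pOut = []; dOut = [];               % blocked ray: nothing to trace
      end
  end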


Fig. 3. Architecture of the ray transfer function. See text for details.


The RTF architecture for rotationally symmetric lenses consists of three auxiliary planes: the input plane, the output plane, and the ray-pass plane (Fig. 4). Rays are initially traced from sample positions in the sensor plane to positions on the input plane. The input plane is placed at the first surface (sensor-side) of the lens, and the output plane is placed at the last surface of the lens model (scene-side). Finally, the ray-pass plane is placed at a nonzero distance from the input plane. A natural choice is to place the ray-pass plane in the paraxial exit pupil plane. The distances between these three planes are stored in the RTF JSON file that is read by PBRT.
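As an illustration, the kind of information stored in this file can be sketched in MATLAB and serialized with jsonencode (the field names here are hypothetical; the actual schema is defined by the ISETRTF tools [34]):

  rtf = struct();
  rtf.inputPlaneOffsetMm     = 0;    % input plane at the first (sensor-side) vertex
  rtf.outputPlaneOffsetMm    = 0;    % output plane at the last (scene-side) vertex
  rtf.rayPassPlaneDistanceMm = 17;   % e.g., at the paraxial exit pupil
  rtf.wavelengthNm           = 550;  % one RTF is fitted per wavelength (Section 6.3)
  jsonText = jsonencode(rtf);        % written to, e.g., a *-raytransfer.json file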

Fig. 4. Ray-transfer function implementation in PBRT. A ray is traced from the sensor to the input plane. The position on the input plane - not the sensor plane - determines which ray-pass region is used. The ray-pass region determines which rays pass, just like an entrance pupil. It can be modeled using a ray-pass function. The dashed arrows represent rays that do not make it through the lens.

A PBRT simulation file contains references to all materials, lights, and objects. The file also specifies the camera by defining a lens file and its position. The position of the sensor is determined by the “film distance” parameter, the distance from the sensor to the first surface of the lens:

Camera "realistic"

  "float filmdistance" [0.057315]

  "string lensfile" "dgauss28deg.dat"

Likewise, we encapsulate the parameters of the RTF in a JSON file. This file encodes the positions of the three planes and the ray transfer function. We implement the RTF as a subclass of the PBRT Camera class:

Camera "raytransfer"

  "float filmdistance" [0.057315]

  "string lensfile" "dgauss28deg-raytransfer.json"

For the RTF, the film distance measures the distance from the sensor to the first surface of the Zemax black box. The distance can be modified without changing the ray-pass function or the ray transfer function.

4. Estimation of ray-transfer functions

There are two methods to construct a ray-transfer function: expansion methods and fitting methods. The expansion method uses full knowledge of the lens and calculates a ray-transfer polynomial by a recursive Taylor expansion of the ray-trace equations [24]. The fitting method begins with a ray-trace data set and finds a polynomial [25,32] or neural network [26] as an interpolating function. Fitting methods can also incorporate knowledge of the lens design; for example, in [25] the final surface of a wide angle lens is used as the output surface of the polynomial. Our work must be agnostic about the lens design; hence, we use the fitting method to estimate polynomial parameters from the ray-tracing data obtained with Zemax.

The generation of the ray data set is a crucial step to obtain accurate results. In what follows, we explain how we generate a data set, exploit rotational symmetry, fit a polynomial and account for vignetting with a “ray-pass function”.

4.1 Lens reversal

For efficiency, modern ray tracing software typically follows rays from the sensor (film) plane into object space. Correspondingly, we need to train a model based on rays traced from the sensor towards the scene. In Zemax this can be accomplished by reversing the lens using the procedure documented in [33].

4.2 Data set generation using Zemax

We generated a data set for six lenses from the Zemax sample library: Double Gauss 28 deg, Inverse telephoto, Petzval, Cooke 40 deg, Tessar, and a Wide angle lens 200 deg. For most lenses, we placed an input and output plane onto the first vertex of each side (i.e., an offset of 0 mm). For the wide angle 200 deg lens, we used a spherical output surface (Fig. 2(b)). This is necessary to capture all the outgoing rays. We experimented with different input plane positions. For example, for the Petzval lens we found that placing the planes at a 5 mm offset facilitated creating the ray-pass function, which varied more smoothly and thus was more accurately interpolated given a fixed sampling density. For the wide angle lens, we placed the input plane at the scene-side focal plane, which yielded more accurate results. These empirical observations suggest additional learning opportunities for future work.

To ensure that the full exit pupil is sampled at each chosen field height, we use the pupils that Zemax calculates by default. We implemented a Zemax macro that samples the pupil using normalized pupil coordinates (Zemax terminology); the Zemax macro is included in the ISETRTF repository [34]. To obtain a more uniform sampling of the pupil in a polar coordinate system, the number of angular samples is increased linearly with radial distance (see Zemax macro). By default, the macro generates approximately 400,000 ray samples. For the wide angle lens, ray aiming in Zemax was enabled to more accurately estimate the pupil dimensions.
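The sampling scheme can be sketched as follows (a simplified MATLAB version of the macro's logic; the sample counts are illustrative):

  nRadial = 20;                          % number of radial rings, illustrative
  Px = []; Py = [];                      % normalized pupil coordinates
  for i = 1:nRadial
      r = i / nRadial;                   % normalized radius in (0, 1]
      nTheta = 8 * i;                    % angular samples grow linearly with r
      theta = 2*pi*(0:nTheta-1) / nTheta;
      Px = [Px, r*cos(theta)];           %#ok<AGROW>
      Py = [Py, r*sin(theta)];           %#ok<AGROW>
  end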

For rotationally symmetric lenses, we only have to sample radially along one axis (the y-axis in our case): any incoming ray can be rotated around the optical axis (Z) such that its x-coordinate becomes zero. The output of the Zemax macro is a text file containing the correspondences between input and output rays (Table 1).


Table 1. Organization of an RTF dataset generated by the Zemax macro.
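A MATLAB sketch of this symmetry reduction (example numbers only): rotate the incoming ray about the optical (z) axis until its x-coordinate vanishes, evaluate, then rotate back.

  pIn = [1.2; 0.5; 0];                    % ray origin on the input plane (mm)
  dIn = [0.05; 0.02; 0];
  dIn(3) = sqrt(1 - dIn(1)^2 - dIn(2)^2); % forward-pointing unit direction
  phi = atan2(pIn(1), pIn(2));            % rotation that zeroes the x-coordinate
  Rz  = [cos(phi) -sin(phi) 0; sin(phi) cos(phi) 0; 0 0 1];
  pHat = Rz * pIn;                        % pHat(1) is now (numerically) zero
  dHat = Rz * dIn;
  % Evaluate the RTF on (pHat(2), dHat(1), dHat(2)); then undo the rotation
  % on the output: pOut = Rz.' * pHatOut and dOut = Rz.' * dHatOut.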

4.3 Fitting the polynomial transfer function

In its most general form, the RTF consists of six multivariate polynomials, each a function of six variables ($\mathbb {R}^{6} \rightarrow \mathbb {R}^{6}$). However, the input space can be reduced to $\mathbb {R}^{3}$ by assuming $d_z = \sqrt {1-d_x^{2}-d_y^{2}}$ and by exploiting rotational symmetry. With these assumptions, the transfer function calculation becomes (Fig. 3):

$$\left[ \begin{array}{c} \boldsymbol{\mathbf{p}}_{\text{in}} \\ \hline \boldsymbol{\mathbf{d}}_{\text{in}} \end{array}\right] \xrightarrow{\text{Rotation}} \left[\begin{array}{c} \hat{y} \\ \hline \hat{d}_{x} \\ \hat{d}_{y} \end{array}\right]_{\text{in}} \xrightarrow[\mathbb{R}^{3}\rightarrow\mathbb{R}^{6}]{\text{RTF}} \left[ \begin{array}{c} \hat{\boldsymbol{\mathbf{p}}}_{\text{out}} \\ \hline \hat{\boldsymbol{\mathbf{d}}}_{\text{out}} \end{array}\right] \xrightarrow{\text{Undo rotation}} \left[ \begin{array}{c} \boldsymbol{\mathbf{p}}_{\text{out}} \\ \hline \boldsymbol{\mathbf{d}}_{\text{out}} \end{array}\right].$$

Not all output variables are always needed. When using output planes instead of surfaces, one can ignore the redundant output variables $z$ and $d_z$. For curved output surfaces, we used all six output variables (Fig. 2(b)). More efficient ways to encode the information exist but are not explored here.

For fitting the RTF, we used a multivariate polynomial fitting toolbox from the MATLAB Exchange [35]. In Section 5, we define metrics to quantify the quality of the polynomial fit from an imaging point of view. As we obtained satisfactory results, we did not explore other polynomial fitting algorithms as proposed in prior art [25,32]. Overfitting is not a concern because there is no noise, except for numerical rounding errors.
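A minimal fitting sketch with polyfitn [35], where synthetic placeholder data stand in for the Zemax ray dataset:

  X    = rand(1000, 3);               % placeholder for [y_in, dx_in, dy_in]
  yOut = sum(X.^3, 2);                % placeholder for the traced y_out values
  degree = 6;                         % chosen per lens (see Fig. 11)
  model  = polyfitn(X, yOut, degree); % fit one model per output variable
  yPred  = polyvaln(model, X);        % evaluate the fitted polynomial
  rmse   = sqrt(mean((yPred - yOut).^2));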

4.4 Accounting for vignetting: the ray-pass function

The fitted polynomial can be evaluated for all input ray positions and directions. Although the polynomial returns a value, some of the input rays are blocked (vignetted), depending on angle and field height. Hence, we developed a “ray-pass function” that decides which rays should be traced (Fig. 3). With this architecture, only the ray-pass function must be updated when changing the aperture stop size.

We implemented a method to represent the subset of passing rays as a function of field height as follows. First, group rays originating at the same position on the input plane. Second, calculate their intersection with the ray-pass plane. These intersections produce convex shapes which can be mathematically approximated as a function of field height (Fig. 5). Multiple choices exist to represent the pupil shapes mathematically: circle intersections (Appendix A), ellipses [36], and closest-neighbor masks [32]. In this work we implemented both the circle-intersection and ellipse approaches. Circle intersections are physically meaningful and accurate in the paraxial regime (Appendix A). A useful feature is that changing the aperture stop size simply corresponds to changing the size of the circle corresponding to the exit pupil.
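The intersection step is simple ray-plane geometry; a vectorized MATLAB sketch with placeholder directions:

  p = [0, 1.5, 0];                        % one input-plane position (mm)
  dxy = 0.3 * (rand(500, 2) - 0.5);       % placeholder direction samples
  D = [dxy, sqrt(1 - sum(dxy.^2, 2))];    % N-by-3 forward unit directions
  zPass = 17;                             % ray-pass plane distance (mm), example
  t = (zPass - p(3)) ./ D(:,3);           % parameter where each ray meets the plane
  hits = p + t .* D;                      % N-by-3 intersection points
  % The shape of the passing subset of 'hits' is what the ray-pass function
  % approximates (circles, Appendix A, or ellipses, Fig. 6).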

Fig. 5. Considering all the rays emerging from a position on the RTF input plane, a subset (blue) exits the lens and the others (gray) do not. Usually, but not always, the rays that pass through the lens define a compact region.


Fig. 6. Ellipses are used to approximate the convex shape of the ray-pass function for rays originating at different positions $\hat {y}$ on the input plane.


The main method we advocate in this work is to implement the ray-pass functions using ellipses, an approach also used by Zemax (i.e., vignetting factors). Although ellipses have no physical basis, they are often a good approximation when the exit pupil is convex (Fig. 6). In addition, the ellipses can robustly fit the data, and their parameters can be interpolated. We use an algorithm that finds the ellipse of minimal area that encompasses all ray intersections, up to a chosen tolerance [37]. In our software, we define vectors of ellipse radii and ellipse centers as a function of input plane position. Evaluation at intermediate positions is accomplished by linear interpolation to estimate the parameters between the two nearest neighbors.
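In MATLAB, the per-position lookup reduces to linear interpolation followed by a point-in-ellipse test (all numbers below are illustrative, not fitted values):

  yKnots   = [0 1 2];                      % sampled input-plane heights (mm)
  centersY = [0 -0.2 -0.5];                % ellipse centers, illustrative
  radiiX   = [2.0 1.9 1.6];                % ellipse radii, illustrative
  radiiY   = [2.0 1.7 1.2];
  yQ = 1.4;                                % query position on the input plane
  cY = interp1(yKnots, centersY, yQ);      % interpolate ellipse parameters
  rX = interp1(yKnots, radiiX, yQ);
  rY = interp1(yKnots, radiiY, yQ);
  pt = [0.5, 0.8];                         % ray intersection on the ray-pass plane
  pass = (pt(1)/rX)^2 + ((pt(2)-cY)/rY)^2 <= 1;   % ray passes if inside the ellipse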

5. Evaluation of ray-transfer functions

The obtained ray-transfer functions are fitted approximations, not exact representations. We evaluate how well the approximation performs from a camera simulation perspective in several different ways. The details of the simulation (e.g., number of rays and sensor resolution) are documented in the ISETRTF GitHub repository [34, isetrtf/paper/]. All figures can be reproduced using the following scripts: generateChessSets.m, generateChessSetsLines.m, generateRelIllumFigures.m, and generateESFFigures.m.

A visual comparison of a three-dimensional rendered image is a first test of the methodology, though it cannot be used when the lens design is actually unknown. We implemented six lens designs from the Zemax sample library as “realistic” lenses in PBRT [38]. We then approximated the lenses using the RTF method and compared the rendered results for the real lens and the RTF approximation. The rendered images are nearly indistinguishable. See Fig. 7 for a rendered scene for the Double Gauss lens. For all other lenses we plot a horizontal line through the center of the scene (Fig. 8). We used 20,000 ray samples to minimize the effect of rendering noise and show the RMSE of the difference between the curves. Color was obtained using the spectrum of the light source and the reflectance of the materials. We emphasize that producing such a visual evaluation is currently not possible using Zemax alone, which uses a point-spread convolution approach that applies only to two-dimensional scenes.


Fig. 7. A comparison of ray traced images using (a) a known lens, and (b) the RTF approximation to the lens. The images are very close to identical. Both reproduce the out-of-focus behavior accurately. The lens surfaces and refractive indices of the Double Gauss 28 deg lens are specified in the Zemax lens library. The dashed horizontal lines indicate the curves plotted in Fig. 8.



Fig. 8. The lines plot normalized photon count across a horizontal line for the rendered chess set scene (Fig. 7). The red and blue curves - which largely overlap - compare the known (blue) and RTF (red) calculations. The RMS of the difference between the curves is given. The wide angle lens (f) is the only case for which the image circle is not correctly predicted and this also dominates the RMSE. We explain the reason in Appendix B.


To further quantify the accuracy of the PBRT implementation of the RTF, we use two metrics: the relative illumination and the edge-spread function, each compared with the Zemax prediction.

5.1 Relative illumination

The relative illumination is the radial fall-off in image irradiance; it is a common metric used for image systems evaluation. Its value is measured by imaging a uniformly illuminated Lambertian surface. The relative illumination mostly depends on the area of the exit pupil at different field heights. Hence, a good fit with Zemax indicates that the ray-pass function accurately models the vignetting.
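Given a rendered image of a uniform Lambertian surface, the relative illumination is the azimuthally averaged irradiance normalized on axis; a MATLAB sketch with a synthetic fall-off standing in for the rendered data:

  [X, Y] = meshgrid(-200:200);             % pixel grid centered on the axis
  r = hypot(X, Y);                         % radial distance (pixels)
  img = cos(atan(r/300)).^4;               % synthetic fall-off, placeholder data
  edges = 0:10:200;
  ri = zeros(1, numel(edges)-1);
  for k = 1:numel(edges)-1
      mask = r >= edges(k) & r < edges(k+1);
      ri(k) = mean(img(mask));             % azimuthal average per radial bin
  end
  ri = ri / ri(1);                         % normalize to the on-axis value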

We calculate the relative illumination for six lenses in the Zemax sample library. In all cases there is a good match with the Zemax calculation (Fig. 9). For the Petzval lens we found that the ellipse approximation is less suitable and that a ray-pass function based on circle intersections performs much better (see Appendix A). For the Cooke lens, a circle approach slightly improves the relative illumination fit.

Fig. 9. Relative illumination predictions by the RTF in PBRT (red solid line) match the Zemax prediction (black dashed line). The ray-pass function was modeled using a sequence of field-height dependent ellipses, except for the Petzval lens, which used the circle intersection method (see Appendix A). The mismatch for the wide-angle lens is discussed in Appendix B.

5.2 Edge spread functions

To evaluate the fitted RTF from an imaging perspective we calculate its geometrical edge-spread function (ESF) and compare it to the prediction by Zemax. The ESF is the image of a step function and measures how much blurring occurs. Zemax has a convenient function to generate the geometrical ESF without diffraction so that it can be compared to the PBRT simulations, which also do not include diffraction. To efficiently obtain the ESF in PBRT with micron resolution we render the scene for only one row of pixels on the film/sensor. The edge-spread functions are compared in Fig. 10. To minimize rendering noise we used 20,000 rays per pixel.


Fig. 10. Comparing the edge-spread functions computed using Zemax and the RTF simulations implemented in PBRT. The comparison is calculated for two object distances (see panel legends). There is no substantial difference between the two methods. The polynomial degree used to fit the RTFs differed between the lenses, and the rationale for selecting the values is shown in Fig. 11.


Fig. 11. The root mean squared error (RMSE) between the RTF and Zemax edge-spread functions as a function of polynomial degree. The two curves in each panel show different object distances (measured from the front vertex). As the polynomial degree increases, the RMSE generally declines.

To select the polynomial degree for the RTF, we compared the root mean squared error (RMSE) between the Zemax edge-spread function and the RTF approximation for polynomial degrees ranging from 1 to 16. After selecting the polynomial degree, we computed the edge-spread functions for each of the six lenses. Two object distances were chosen so that the edge-spread functions were significantly distinct. The agreement between the two methods is very good. The Tessar lens converges slowest. The small RMSE differences are not significant for sensor prototyping or machine learning applications.
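The selection logic is a loop over candidate degrees; a MATLAB sketch with placeholder data (in practice the ESFs come from the PBRT and Zemax pipelines, not from the synthetic curves below):

  esfZemax = linspace(0, 1, 100).^2;       % placeholder reference ESF
  degrees = 1:16;
  rmse = zeros(size(degrees));
  for d = degrees
      % Placeholder ESF that converges toward the reference with degree;
      % in practice, refit the RTF at degree d and recompute its ESF.
      esfRtf = esfZemax + 2^(-d) * sin(linspace(0, 2*pi, 100));
      rmse(d) = sqrt(mean((esfRtf - esfZemax).^2));
  end
  [~, bestDegree] = min(rmse);             % or the lowest degree below a tolerance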

6. Discussion

6.1 Related work

The ray transfer function is a fundamental concept in optics. The linear paraxial approximation (ABCD) is well-known from Gauss’ work [39]. Extensions to the paraxial approximations of light wavefronts, rather than rays, were developed by Seidel [40]. Polynomial characterizations of wavefront aberrations on circular support were introduced by Zernike [41]. These concepts are part of the classical optics toolkit.

In certain cases, the dominant rendering properties of camera optics can be captured by the ray representation without reference to wavefronts. Nonlinear extensions of the ABCD model were applied to computer graphics ray tracing by Hullin et al. [24]; they also provided software to create polynomial ray transfer functions. Their work included models of lens flare. Several papers built on this work, describing efficient ways to find and implement the polynomial approximations and considering alternative formulations to expand the range of lenses that could be accurately modeled [25,26,32]. The metrics used to evaluate the polynomial models included examples of rendered images and metrics in the ray parameter space.

Our paper builds on the existing polynomial framework in several ways. First, we implement methods that can be applied even when the lens design is not known: constructing the polynomials from a black box description extends the domain of application and is particularly important for soft prototyping (e.g., [42]). Second, we evaluate the accuracy of the polynomial models using image domain metrics: the relative illumination and edge-spread functions. Evaluating the polynomial approximation in the ray representation is valuable, but in many applications the critical question is whether the soft prototype generates the correct sensor data. Third, we provide open-source software that enables others to check our work, and we integrate the methods into two commonly used tools: lens design software (Zemax) and physically based ray tracing (PBRT).

6.2 Additional advice to avoid headaches

When generating the dataset using the Zemax macro in ISETRTF, one has to be careful about several details. First, it is important to make sure the primary wavelength in Zemax equals the wavelength used for data collection since this affects some Zemax calculations, such as pupil estimation. Second, when calculating the relative illumination and edge-spread functions with Zemax, make sure to select a single wavelength. In addition, as recommended by Zemax, it was necessary to turn off vignetting factors for the Tessar lens in order to obtain a satisfactory fit. Third, turning on ray aiming is advised when expecting large distortions of the exit pupil, particularly for a wide-angle lens [43]. This avoids incomplete sampling of the exit pupil.

When computing the ray transfer functions for the six lenses, we experimented with different positions of the input, ray-pass, and output planes. The default positions were satisfactory for four of the six lenses. For the wide angle lens, we found that placing the input plane near the position of the sensor improved the convergence of the polynomial. The advantage of placing the input plane at the sensor can be significant if the image circle of the lens is much larger than the largest sensor size. Placing the input plane on the sensor plane allows the user to generate only the relevant samples from the sensor positions. Fitting only over the portion of the image circle that contains the sensor typically yields a lower degree RTF polynomial. This approach is only possible, of course, when the largest sensor size is known in advance.

6.3 Future work

We implemented dispersion by fitting an RTF for each wavelength of interest. Prior work used a different approach by incorporating wavelength as an additional variable in the RTF polynomials [25,26,32]. A comparison of these approaches is a good topic for future work.

The method illustrated here and implemented in the software applies to rotationally symmetric lenses. This covers the vast majority of lenses, but free-form lenses are becoming more common. Sampling and fitting an RTF for these lenses in the full 6D space (position and direction) is a significantly larger computational problem, as is ray tracing for free-form lenses. For such lenses, it may be useful to fit an RTF instead of performing costly ray tracing. We leave algorithmic work on these topics for the future.

It may be useful to distinguish between the rotational symmetry of the polynomial transfer function and that of the ray-pass function. For example, the lens surfaces might be intrinsically spherical but cut asymmetrically (e.g., into a square shape). Alternatively, perhaps only the diaphragm or a coded aperture lacks rotational symmetry. In such cases, the fitted polynomial can continue to exploit rotational symmetry while the ray-pass function must be generalized.

Another possible extension is to include ray-based diffraction techniques. While ray-tracing approaches fundamentally do not simulate wave-optics phenomena, useful methods for approximating diffraction with ray tracers have been proposed [17,44–47]. It would be interesting to explore how these methods could be combined with a fitted ray-transfer function.

For simulations involving diffraction, coatings, or time-of-flight imaging, knowing the optical path lengths of the rays and their change in spectral radiance might be required. One could potentially model these effects using two additional fitted functions: a “Pathlength Function” and a “Ray Amplitude Function”.

6.4 Performance improvements

The current PBRT implementation is several times slower than ray tracing through a known lens. The bulk of the calculation involves the evaluation of high-degree polynomials. We are confident that the calculations can be significantly accelerated. For example, similar work on ray-transfer functions used polynomials as a faster alternative to brute-force ray tracing by limiting the search to sparse multivariate polynomials [25,32]. Readers who are concerned about speed might implement our solution using these alternative fitting algorithms. Lower-degree polynomials might also be suitable for simulating sensors with large pixels; in this case, the edge-spread function might not be fully dominated by the optics alone, and a lower-degree polynomial fit might be satisfactory. Finally, polynomial calculations can take advantage of GPUs [24], and we are considering implementing this method.

7. Conclusions

The methods described here significantly extend the range of camera applications that can benefit from soft prototyping. Prototyping can be performed while permitting lens manufacturers to maintain the intellectual property of the lens design. By providing a black box model, customers can build a ray-transfer function that can be incorporated into ray tracing software. We integrated these functions into PBRT for the specific case of Zemax, and we show that for six very different lenses the PBRT implementation replicates the Zemax specification. In a separate paper, we use this method as part of a general simulation pipeline, comparing the simulations with data acquired from a real camera with an unknown lens design and a constructed three-dimensional scene [42].

A. Ray-pass function using circle intersections

Each lens and diaphragm in a compound lens can be considered a finite aperture that limits the size of the light beam. The most limiting aperture is often called the aperture stop. We can calculate the image of each aperture (these images are called pupils) as seen from different positions on the input plane. By construction, for a ray to pass through the system, it must pass through all of the pupils. If the diaphragms are circular, their paraxial images will be approximately circular, and the shape of the lens system’s ray-pass function can be approximately modeled by the intersection of these circles [49] (Fig. 12).


Fig. 12. Modeling vignetting with circle intersections by projecting the circular pupils to a common plane. Adapted from [50, p. 86].


In our work we project the diaphragms to the “ray-pass plane” (Fig. 4). The projected circles have radii $R_i,\,i=1,2,\ldots \,$. Their displacement from the center $d_i$ depends on the input plane position $\hat {y}$ (Fig. 12). Assuming linearity,

$$d_i = s_i \hat{y},$$
with $s_i$ being a sensitivity parameter.

For the classic Petzval lens, we require three circles to model how the ray-pass region changes with input plane position (Fig. 13). Different diaphragms (circles) become dominant at different field heights (Fig. 14). Placing the ray-pass plane, arbitrarily, 17 mm from the input plane, the three circles have radii 5.74 mm, 42.67 mm, and 9.65 mm with corresponding sensitivities of 0.72, −1.7, and 0.30.
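With these values, the ray-pass test is a conjunction of three point-in-circle tests; a MATLAB sketch (we assume the centers displace along the y-axis of the ray-pass plane, and the query point is an arbitrary example):

  R = [5.74, 42.67, 9.65];          % circle radii (mm), Petzval example above
  s = [0.72, -1.7, 0.30];           % sensitivities, d_i = s_i * yHat
  yHat = 2.0;                       % input-plane position (mm), example
  pt = [1.0, 3.5];                  % ray intersection on the ray-pass plane (mm)
  pass = true;
  for i = 1:numel(R)
      c = [0, s(i) * yHat];         % displaced circle center
      pass = pass && (norm(pt - c) <= R(i));  % must lie inside every circle
  end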

Fig. 13. The ray-pass region for the Petzval lens is well modeled by intersecting three moving circles, shown here for three positions on the input plane.

Fig. 14. For the Petzval lens, a ray-pass function based on (a) circle intersections performs better than one based on (b) ellipse approximations.

The circle radii and sensitivities are estimated in several steps. First, we select off-axis positions at which a different circle cuts the pupil (e.g., Fig. 13). Second, at each pupil edge we find the tangent circle with the minimal radius that encompasses all points of intersection. The sensitivity is then obtained by observing that $d_i = (\text {position on edge}-R_i)$ such that $s_i=d_i/\hat {y}$. This algorithm is implemented in the findCuttingCircleEdge(..) function in [34]. The script used to generate Fig. 13 is generatePetzvalCircles.m.

B. Wide angle lens relative illumination

We briefly comment on the difficulty of correctly predicting the relative illumination for the wide angle lens in Fig. 8(f). Near the edge of the image circle, the ray-pass region becomes non-convex and consists of two clusters (Fig. 15). An ellipse significantly overestimates the area, causing the image circle to appear larger. More versatile fitting functions, such as neural networks, might be suitable for this situation.


Fig. 15. The ray-pass region of the wide angle lens is not convex; characterizing the region with an ellipse overestimates the region of passing rays. Consequently, many input rays are traced that should not be included, explaining the imperfections in Figs. 8(f) and 9(f).


Acknowledgments

We thank David Cardinal and Zhenyi Liu for their contributions to the software and discussions. We thank Google and Ricardo Motta for support via the Stanford Center for Image Systems Engineering faculty-mentor-advisor program. Thomas Goossens holds a fellowship from the Belgian American Educational Foundation (B.A.E.F.).

Disclosures

The authors declare no conflicts of interest.

Data availability

All software developed for this paper is available on GitHub. The PBRT implementation of the ray-transfer function can be found on [48]. The scripts to generate the ray-transfer function based on the data from Zemax can be found on [34].

The software is intended to be used as follows. First, one uses the Zemax macro to generate the text file containing input and output rays. Second, this dataset is processed using a MATLAB script which generates a JSON file. Lastly, this JSON file is passed as input to the customized version of PBRT.

References

1. C. Kamel, A. Reid, and F. Plepp, “Validating NVIDIA DRIVE sim camera models,” https://developer.nvidia.com/blog/validating-drive-sim-camera-models/ (2021). Accessed 2021-12-23.

2. Z. Liu, T. Lian, J. Farrell, and B. A. Wandell, “Neural network generalization: The impact of camera parameters,” IEEE Access 8, 10443–10454 (2020).

3. Z. Liu, J. Farrell, and B. A. Wandell, “ISETAuto: Detecting vehicles with depth and radiance information,” IEEE Access 9, 41799–41808 (2021).

4. “Anyverse,” https://anyverse.ai/ (2021). Accessed 2021-12-23.

5. “Simulation City: Introducing Waymo’s most advanced simulation system yet for autonomous driving,” https://blog.waymo.com/2021/06/SimulationCity.html. Accessed 2021-12-23.

6. C. M. Goral, K. E. Torrance, D. P. Greenberg, and B. Battaile, “Modeling the interaction of light between diffuse surfaces,” SIGGRAPH Comput. Graph. 18(3), 213–222 (1984).

7. S. N. Pattanaik, J. A. Ferwerda, K. E. Torrance, and D. Greenberg, “Validation of global illumination simulations through CCD camera measurements,” in Color and Imaging Conference 1997 (1997), pp. 250–253.

8. J. E. Farrell, F. Xiao, P. B. Catrysse, and B. A. Wandell, “A simulation tool for evaluating digital camera image quality,” in Image Quality and System Performance, vol. 5294 (International Society for Optics and Photonics, 2003), pp. 124–131.

9. J. E. Farrell, P. B. Catrysse, and B. A. Wandell, “Digital camera simulation,” Appl. Opt. 51(4), A80 (2012).

10. P. Y. Maeda, P. B. Catrysse, and B. A. Wandell, “Integrating lens design with digital camera simulation,” in Digital Photography, vol. 5678 (SPIE, 2005), pp. 48–58.

11. R. C. Short, D. Williams, and A. E. W. Jones, “Image capture simulation using an accurate and realistic lens model,” Proc. SPIE 3650, 138–148 (1999).

12. J. Chen, K. Venkataraman, D. Bakin, B. Rodricks, R. Gravelle, P. Rao, and Y. Ni, “Digital camera imaging system simulation,” IEEE Trans. Electron Devices 56(11), 2496–2505 (2009).

13. F. Toadere, “Simulating the functionality of a digital camera pipeline,” Opt. Eng. 52(10), 102005 (2013).

14. S. Kim, D. Kim, K. Chung, and J. S. Yim, “Estimation of any fields of lens PSFs for image simulation,” Proc. IS&T Int’l. Symp. on Electronic Imaging 33(7), 72-1–72-5 (2021).

15. M. Lehmann, C. Wittpahl, H. B. Zakour, and A. Braun, “Resolution and accuracy of nonlinear regression of point spread function with artificial neural networks,” Opt. Eng. 58(04), 1 (2019).

16. E. Tseng, A. Mosleh, F. Mannan, K. St-Arnaud, A. Sharma, Y. Peng, A. Braun, D. Nowrouzezahrai, J.-F. Lalonde, and F. Heide, “Differentiable compound optics and processing pipeline optimization for end-to-end camera design,” ACM Trans. Graph. 40(2), 1–19 (2021).

17. T. Lian, K. J. MacKenzie, D. H. Brainard, N. P. Cottaris, and B. A. Wandell, “Ray tracing 3D spectral scenes through human optics models,” J. Vis. 19(12), 23 (2019).

18. M. Pharr, W. Jakob, and G. Humphreys, Physically Based Rendering: From Theory to Implementation (Morgan Kaufmann, 2016).

19. M. Nimier-David, D. Vicini, T. Zeltner, and W. Jakob, “Mitsuba 2: A retargetable forward and inverse renderer,” ACM Trans. Graph. 38(6), 1–17 (2019).

20. Z. Lyu, H. Jiang, F. Xiao, J. Rong, T. Zhang, B. Wandell, and J. Farrell, “Simulations of fluorescence imaging in the oral cavity,” Biomed. Opt. Express 12(7), 4276 (2021).

21. “ISETCam,” https://github.com/scienstanford/isetcam.

22. H. Blasinski, J. Farrell, T. Lian, Z. Liu, and B. Wandell, “Optimizing image acquisition systems for autonomous driving,” Electron. Imaging 30(5), 161-1–161-7 (2018).

23. Z. Liu, T. Lian, J. Farrell, and B. Wandell, “Soft prototyping camera designs for car detection based on a convolutional neural network,” in Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (2019).

24. M. B. Hullin, J. Hanika, and W. Heidrich, “Polynomial optics: A construction kit for efficient ray-tracing of lens systems,” Comput. Graph. Forum 31(4), 1375–1383 (2012).

25. E. Schrade, J. Hanika, and C. Dachsbacher, “Sparse high-degree polynomials for wide-angle lenses,” Comput. Graph. Forum 35(4), 89–97 (2016).

26. Q. Zheng and C. Zheng, “NeuroLens: Data-driven camera lens simulation using neural networks,” Comput. Graph. Forum 36(8), 390–401 (2017).

27. J. Goodman, Introduction to Fourier Optics, 3rd ed. (Macmillan, 2005).

28. Wikipedia contributors, “Ray transfer matrix analysis,” https://en.wikipedia.org/w/index.php?title=Ray_transfer_matrix_analysis&oldid=1064230288 (2022).

29. B. Macukow and H. H. Arsenault, “Extension of the matrix theory for nonsymmetrical optical systems,” J. Opt. 15(3), 145–151 (1984).

30. T. M. Jeong, D.-K. Ko, and J. Lee, “Generalized ray-transfer matrix for an optical element having an arbitrary wavefront aberration,” Opt. Lett. 30(22), 3009 (2005).

31. P. D. Lin and C. C. Hsueh, “6×6 matrix formalism of optical elements for modeling and analyzing 3D optical systems,” Appl. Phys. B: Lasers Opt. 97(1), 135–143 (2009).

32. Q. Zheng and C. Zheng, “Adaptive sparse polynomial regression for camera lens simulation,” Vis. Comput. 33(6-8), 715–724 (2017).

33. Zemax, “How to reverse an optical system,” https://support.zemax.com/hc/en-us/articles/1500005575602-How-to-reverse-an. Accessed Feb. 11, 2022.

34. T. Goossens, “ISETRTF: Using ray transfer functions for camera simulation in PBRT with unknown lens designs,” GitHub (2022), https://github.com/ISET/isetrtf. Accessed 16 June 2022.

35. J. D’Errico, “Polyfitn,” MATLAB Central File Exchange (2021).

36. W. B. King, “The approximation of vignetted pupil shape by an ellipse,” Appl. Opt. 7(1), 197 (1968).

37. N. Moshtagh, “Minimum Volume Enclosing Ellipsoid,” MATLAB Central File Exchange (2021).

38. C. Kolb, D. Mitchell, and P. Hanrahan, “A realistic camera model for computer graphics,” in Proceedings of the ACM SIGGRAPH Conference on Computer Graphics (1995), pp. 317–324.

39. C. F. Gauss, Dioptrische Untersuchungen.

40. P. L. Seidel, Ueber die Theorie der Fehler, mit welchen die durch optische Instrumente gesehenen Bilder behaftet sind, und über die mathematischen Bedingungen ihrer Aufhebung (Cotta, 1857).

41. F. Zernike, “Beugungstheorie des Schneidenverfahrens und seiner verbesserten Form, der Phasenkontrastmethode,” Physica 1(7-12), 689–704 (1934).

42. Z. Lyu, T. Goossens, B. Wandell, and J. Farrell, “Accurate smartphone camera simulation using 3D scenes,” arXiv:2201.07411 (2022).

43. Zemax, “How to use Ray Aiming,” https://support.zemax.com/hc/en-us/articles/1500005486881-How-to-use-Ray-Aiming (2021).

44. A. L. Lin, J. Farrell, and B. Wandell, “Spectral optics simulation for rapid image systems prototyping: Ray-tracing, diffraction and chromatic aberration,” in Imaging and Applied Optics 2014, paper JW3A.2 (2014).

45. E. R. Freniere, G. G. Gregory, and R. A. Hassler, “Edge diffraction in Monte Carlo ray tracing,” Proc. SPIE 3780, 151–157 (1999).

46. B. Andreas, G. Mana, and C. Palmisano, “Vectorial ray-based diffraction integral,” J. Opt. Soc. Am. A 32(8), 1403 (2015).

47. M. Mout, M. Wick, F. Bociort, J. Petschulat, and P. Urbach, “Simulating multiple diffraction in imaging systems using a path integration method,” Appl. Opt. 55(14), 3847 (2016).

48. “PBRT v3, modified version,” https://github.com/scienstanford/pbrt-v3-spectral.

49. N. Asada, A. Amano, and M. Baba, “Photometric calibration of zoom lens systems,” in Proceedings of the 13th International Conference on Pattern Recognition, vol. 1 (1996), pp. 186–190.

50. T. Goossens, “Tiny thin-film filters from a different angle: Correcting angular dependency for spectral imaging,” Ph.D. thesis (KU Leuven, 2020).
