
Calibration of AOTF-based 3D measurement system using multiplane model based on phase fringe and BP neural network

Open Access

Abstract

A specifically designed imaging system based on an acousto-optic tunable filter (AOTF) can integrate hyperspectral imaging and 3D reconstruction. As a result of its complicated optical structure, the AOTF imaging system deviates from the traditional pinhole model and lens distortion form, making precise camera calibration difficult to achieve. The influencing factors leading to this deviation are discussed, and a multiplane model (MPM) is proposed that uses phase fringes to produce dense mark points and a back propagation neural network to obtain a subpixel calibration. Experiments show that the MPM reduces the back projection error efficiently compared with the pinhole model. A 3D reconstruction process is conducted based on the calibration result to verify the feasibility of the proposed method.

© 2017 Optical Society of America

1. Introduction

A hyperspectral imaging system can provide high-resolution images of multiple bands, which plays an important role in spectral imaging. With the ability to detect three-dimensional (3D) structure data in real time, structured light sensors have been widely used in vision measurement applications. The integration of hyperspectral imaging data and 3D data provides distinctly better outcomes than either data set alone [1]. An integrated instrument that combines hyperspectral imaging and 3D structure measurement therefore has great potential in many domains, such as precision agriculture [2], botanical research [3], and industry inspection [4].

A specifically designed imaging system based on an acousto-optic tunable filter (AOTF) is naturally suited to integrating these two kinds of detection methods. An AOTF is an electronic device based on the interaction of acoustic and optical waves in acousto-optic crystals [5]. Owing to its light-splitting characteristic, an AOTF converts the incident polarized light into two beams: the diffracted light of a designated wavelength and the transmitted light containing the panchromatic bands. The diffracted light is used to conduct hyperspectral imaging. Unlike the traditional approach of shielding the transmitted beam as stray light, the transmitted light can be used to provide high-quality panchromatic images [6]. Thus, the transmitted channel, equipped with a line-structured light, can be used to conduct 3D reconstruction.

To obtain accurate 3D data, a precise camera calibration is necessary. Traditional calibration methods rely on the assumption that the imaging system (usually a camera) accords with the pinhole model and that the imaging process follows perspective projection [7]. Hall first proposed a technique to compute the transformation matrix that transfers 3D object points to their 2D image points [8]. Faugeras extracted the camera's physical parameters from the matrix [9]. Tsai introduced radial lens distortion and proposed a two-step calibration method [10]. Weng presented a camera model that included three different types of lens distortion [11]. Zhang proposed a calibration technique that only requires the camera to observe a planar pattern shown at a few different orientations, which simplified the calibration process significantly [12]. Some other techniques, including self-calibration methods, compute the camera parameters with various algorithms under the pinhole model and certain lens distortion assumptions [13–16].

The AOTF imaging system is a complex system consisting of several optical elements rather than a single camera. As discussed in the next section, the transmitted channel of the AOTF imaging system deviates from the pinhole model and exhibits a special form of lens distortion compared with common cameras. Thus, the accuracy of "camera calibration" under the pinhole model is limited. In previous work, some nonparametric camera models have been proposed. Martins proposed a two-plane model to constrain the imaging light and compared three algorithms describing the relationship between the mark points and image points [17]. Wei further studied the two-plane method and presented a unified camera model, reported to be applicable to any kind of lens distortion [18]. Grossberg proposed a general imaging model that introduces the concept of a "raxel" as the basic unit of the imaging process [19]. Yin used the general imaging model to describe the imaging process and proposed a dedicated calibration approach to realize quantitative 3D imaging [20].

In this study, we propose a multiplane model (MPM) to establish the mapping from an image point to its corresponding space line. Unlike previous work, the extra plane constraint allows a least-square method to be used to obtain the linear equations and the laser plane equations. A phase fringe technique is presented to obtain dense mark points, which cover the entire field of view (FOV) of the imaging system pixel by pixel. A back propagation (BP) neural network algorithm is proposed to deal with the massive data produced by the phase fringe technique and to provide a subpixel mapping between the image points and the mark points. The proposed multiplane technique improves the calibration accuracy distinctly compared with the traditional camera calibration process under the pinhole model because of the extra plane constraint, the dense mark points, and the accurate fitting ability of the neural network.

2. Theoretical background

2.1 AOTF-based 3D measurement system

As shown in Fig. 1, the AOTF-based 3D measurement system consists of two parts, namely, AOTF imaging system and laser scanning system. The AOTF imaging system is the main imaging part of the measurement system, which provides panchromatic images from the transmitted channel and hyperspectral images from the diffracted channel. The laser scanning system together with the transmitted channel of the AOTF imaging system forms a structured light sensor that can be used to conduct 3D measurement.

Fig. 1 Photo (a) and schematic (b) of AOTF-based 3D measurement system.

The objective of the calibration of the AOTF-based 3D measurement system is to obtain the geometric parameters that are used to calculate the 3D information of the object. Particularly, the calibration should provide an accurate mathematical description of the imaging light and the laser planes, such as the linear and plane equations.

2.2 Traditional imaging model

Figure 2 shows a schematic of typical imaging under the traditional pinhole model. The mapping, which follows from the geometric similarity between the image and object points, can be described as follows:

$$\begin{cases} x = f X_c / Z_c \\ y = f Y_c / Z_c, \end{cases}\tag{1}$$
where (x, y) is the coordinate of the image point in the camera coordinate system Oc-XcYcZc; (Xc, Yc, Zc) is the coordinate of the object point in the same coordinate system, and f is the focal length of the optical system.
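For concreteness, a minimal Python sketch of the projection in Eq. (1) follows; the focal length and the sample points are arbitrary illustrative values, not parameters of the system described in this paper.

```python
import numpy as np

# Minimal numeric sketch of Eq. (1): ideal pinhole projection.
# f and the sample points are assumed values for illustration only.
f = 0.05  # focal length in meters (hypothetical)

def pinhole_project(points_c, f):
    """Project Nx3 camera-frame points (Xc, Yc, Zc) onto the image plane."""
    Xc, Yc, Zc = points_c[:, 0], points_c[:, 1], points_c[:, 2]
    return np.column_stack([f * Xc / Zc, f * Yc / Zc])

points = np.array([[0.1, -0.2, 2.0],
                   [0.0,  0.0, 2.0],
                   [0.3,  0.1, 2.5]])
print(pinhole_project(points, f))
```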

Fig. 2 Schematic (a) and mathematical expression (b) of the pinhole model.

Two main characteristics of the pinhole model are summarized here for convenience in the following discussion:

1. The incident imaging rays can be treated as a beam of space lines that intersect at one single point, which can be described as Eq. (2):

$$\frac{x-x_0}{\sum_{i=1}^{n}\lambda_i X_i}=\frac{y-y_0}{\sum_{i=1}^{n}\lambda_i Y_i}=\frac{z-z_0}{\sum_{i=1}^{n}\lambda_i Z_i},\tag{2}$$

where (x0, y0, z0) is the coordinate of the intersection point; n is the total number of space lines; and for each line lk the parametric equation is

$$\frac{x-x_0}{X_k}=\frac{y-y_0}{Y_k}=\frac{z-z_0}{Z_k},\tag{3}$$

Here λ1, λ2, …, λn is any set of real numbers satisfying the constraint λ1² + λ2² + … + λn² ≠ 0. In the pinhole model, the intersection point (x0, y0, z0) is actually the principal point.

2. The imaging rays retain axial symmetry about the optical axis after passing through the optical structure of the imaging system. As shown in Fig. 2, if α = β, then α′ = β′.

The two characteristics mentioned above summarize the main principles of the ideal pinhole model. In the actual imaging process, however, Eq. (1) deviates as a result of lens distortion. Various distortion types have been introduced to correct for this influence. The most commonly used form is described as

$$\begin{aligned} \delta_x(x,y) &= \delta_{x1}+\delta_{x2}+\delta_{x3} = k_1 x r^2 + 2 p_1 x y + p_2\left(r^2+2x^2\right) + s_1 r^2\\ \delta_y(x,y) &= \delta_{y1}+\delta_{y2}+\delta_{y3} = k_2 y r^2 + 2 p_2 x y + p_1\left(r^2+2y^2\right) + s_2 r^2\\ \delta &= \sqrt{\delta_x^2(x,y)+\delta_y^2(x,y)} \end{aligned}\tag{4}$$

where (x, y) is the image point in the image coordinate system; δx(x, y) is the image distortion along the x direction; δx1, δx2, and δx3 represent the radial distortion component with coefficient k1, the decentering distortion component with coefficients p1 and p2, and the thin prism distortion component with coefficient s1, respectively; r is the distance between the image point and the coordinate origin; the symbols have analogous meanings for the distortion along the y direction; and δ is the total distortion in pixels.
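The distortion model of Eq. (4) is straightforward to evaluate. The sketch below uses arbitrary placeholder coefficients (not values from this paper) to show how the three components combine into the total distortion.

```python
import numpy as np

# Hedged sketch of Eq. (4); all coefficient values are assumed placeholders,
# not calibration results from this paper.
k1, k2 = 1e-2, 1e-2    # radial distortion coefficients
p1, p2 = 1e-3, -1e-3   # decentering distortion coefficients
s1, s2 = 5e-4, 5e-4    # thin prism distortion coefficients

def lens_distortion(x, y):
    """Return (dx, dy, total) distortion at a normalized image point (x, y)."""
    r2 = x**2 + y**2
    dx = k1 * x * r2 + 2 * p1 * x * y + p2 * (r2 + 2 * x**2) + s1 * r2
    dy = k2 * y * r2 + 2 * p2 * x * y + p1 * (r2 + 2 * y**2) + s2 * r2
    return dx, dy, np.hypot(dx, dy)  # total distortion, as in Eq. (4)

print(lens_distortion(0.4, -0.3))
```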

Figure 3 shows a simulation result of the image distortion. As can be seen from the figure, the distortion is also axisymmetric about the optical axis.

Fig. 3 Simulation result of the traditional image distortion.

As mentioned above, the common imaging model deals with imaging systems that follow the pinhole model and the axial symmetry assumption of lens distortion. For specific imaging systems that do not follow this principle, traditional calibration techniques introduce significant error into the calibration result.

2.3 Deviation of the AOTF imaging system from the traditional imaging model

The AOTF imaging system deviates from the traditional imaging model because of its complex optical structure. A sketch map of the AOTF imaging system is shown in Fig. 4. To concentrate on the 3D structure measurement, only the transmitted channel of the system is shown in the figure; the diffracted channel, which is used for hyperspectral imaging, is omitted. The system consists of several optical elements, including an off-axis mirror group to obtain a large FOV, Polarizer 1 to obtain polarized incident light, an AOTF to convert the incident light into two channels, Polarizer 2 to transmit the diffracted light and reflect the transmitted light, a doublet prism to correct chromatic aberration, a planar mirror to fold the light path, and a lens group for final imaging. Among these optical elements, the two polarizers and the planar mirror only select the polarization of the incident light and change the direction of the light rays. Thus, they are considered to have little impact on the geometric properties of the imaging device. The other elements, however, cannot be ignored. The influencing factors are discussed below.

Fig. 4 Sketch map of the AOTF imaging system.

Off-axis mirror group

To obtain a large FOV, the off-axis mirror group consists of two parabolic reflectors. Because of the off-axis use of the reflectors, the incidence angles and normal directions vary with the intersections between the incident light rays and the paraboloid surface. Thus, the axial symmetry of the reflected light is destroyed, which violates the second characteristic of the pinhole principle. As a result of this asymmetry, the reflected rays do not intersect at a single point, which deviates from the first characteristic. Figure 5 shows the influence of a parabolic reflector on converging light rays compared with a planar mirror.

Fig. 5 (a) Beam of converging light ray. (b) After the planar mirror reflection, the light ray still intersects at one point and maintains the axial symmetry. (c) After the parabolic mirror reflection, the axial symmetry is destroyed and the light rays will not intersect at one point.

AOTF and doublet prism

The AOTF converts the incident light into two channels, and the doublet prism is used to correct chromatic aberration. For the panchromatic transmitted light, the AOTF and the doublet prism are equivalent to two optical wedges of large thickness. After passing through an optical wedge, axisymmetric light rays undergo asymmetric refraction because the wedge is a non-axisymmetric element, as shown in Fig. 6. Similar to the parabolic reflector, the AOTF and the doublet prism make the AOTF imaging system deviate from the two main characteristics of the pinhole model.

Fig. 6 Asymmetric refraction in the wedge.

Machining and alignment errors

In a traditional 3D structure measurement application, the imaging device is usually a common camera. The lens group of a single camera is simple, consisting of a few lenses, and the number of optical surfaces is limited. However, the AOTF imaging system is complicated, with many refracting and reflecting surfaces. Under this circumstance, some factors buried in the traditional camera calibration process must be considered.

First, every element exhibits a machining error relative to its original design, making the optical surfaces non-ideal. Second, each optical element is a discrete device; although the system alignment is unified, misalignment that cannot be ignored still exists. Thus, the small deviations caused by the front elements are amplified as the light rays pass through the optical surfaces one by one. For a small FOV or a single optical surface, these factors have little influence on the final light symmetry and lens distortion. However, the impact cannot be ignored for systems like the AOTF imaging system, which has a large FOV and a considerable number of optical surfaces.

Special lens distortion type

The AOTF imaging system presents a particular kind of lens distortion because of the asymmetry of the light rays caused by the influencing factors mentioned above. This distortion is quite different from any distortion form reported in common camera calibration techniques: it is neither radial distortion, nor decentering distortion, nor thin prism distortion. Figure 7 shows the simulation curve of the lens distortion in ZEMAX. This kind of distortion exhibits different characteristics along the x and y directions. Even on the same axis, the distortion differs between the positive and negative directions.

Fig. 7 Simulation result of the image distortion.

2.4 Summary

In general, the complicated optical structure of the AOTF imaging system changes the geometrical properties of the incident light, making the AOTF imaging system deviate from the traditional imaging model. The reasons for the deviation can be summarized as follows:

  • Reason 1. The non-axisymmetric optical elements including the off-axis mirror group, the AOTF, and the doublet prism cause the AOTF imaging system to deviate from the two main characteristics of the traditional pinhole model mentioned above.
  • Reason 2. The machining and alignment errors amplify the influence of the non-axisymmetric optical elements.
  • Reason 3. The lens distortion type of the AOTF imaging system is totally different from the traditional lens distortion form as a result of reasons 1 and 2.

3. Multiplane model based on fringe-phase marking method and BP neural network

Any point on the image plane corresponds to a straight line in space. One of the primary objectives of camera calibration is to obtain the parameters of this space line, which is also called the reverse projection problem. The traditional imaging model solves this problem under the pinhole assumption, with the traditional lens distortion model used to compensate for errors. From the discussion above, the AOTF imaging system deviates from the traditional imaging model. This study proposes an MPM based on the fringe-phase marking method (FPMM) and a BP neural network. The advantages of this model are listed as follows:

  • MPM solves the reverse projection problem implicitly by establishing the mapping between an image point and its corresponding space line. MPM does not force the imaging system to obey the pinhole model or a particular lens distortion assumption. Thus, MPM can be used for any imaging system, including the AOTF imaging system and common cameras.
  • MPM imposes extra constraints on the space lines compared with the two-plane model reported in previous work [18] and introduces a least-square fitting method to calculate the linear equations; it is therefore precise.
  • MPM needs a large number of mark points to obtain a good calibration result. The fringe-phase technique is introduced to encode the reference planes in MPM, which can provide dense mark points at the pixel scale.
  • The BP neural network is introduced to obtain a subpixel calibration result, because the massive data provided by the fringe-phase technique are difficult for traditional interpolation methods to handle. The back projection error is very small because of the large number of mark points and the accurate fitting ability of the BP neural network.

3.1 Camera calibration

3.1.1 Multiplane model (MPM)

Fig. 8 Multiplane imaging model.

As shown in Fig. 8, we assume that a light ray L in the world coordinate system C passes through the optical system of the AOTF imaging system and is finally imaged to a point P0 on the image plane Π0. The parametric equation of L is denoted as

$$L:\begin{cases} x = mt + x_0 \\ y = nt + y_0 \\ z = pt + z_0, \end{cases}\tag{5}$$

Any object point lying along the space line L within the depth range of the AOTF imaging system is imaged to the same point P0. Several parallel reference planes, denoted Π1, Π2, …, Πn, exist in the world coordinate system C. The space line L and the reference planes intersect at points P1, P2, …, Pn. If the 3D coordinates of these points are obtained, the linear equation of L can be determined with a least-square method.

Without loss of generality, the origin of the world coordinate system C is fixed on the reference plane Π1 with the Z axis perpendicular to the plane. As the reference planes are parallel and equidistant, the z coordinates of the object points P1, P2, …, Pn are easily obtained; thus, only the x and y coordinates need to be determined. In other words, the key problem is to find which points on the reference planes are imaged to point P0 by the AOTF imaging system. For any other image point, the same method is used to find its corresponding space line. Thus, the problem of establishing the mapping between every image point on the image plane Π0 and its corresponding space line in 3D space is simplified into several mappings in two-dimensional space, namely the mappings between planes Π0 and Π1, planes Π0 and Π2, …, and planes Π0 and Πn. This procedure is described below, taking P0 as an example:

$$\Gamma_0: P_0(u_0,v_0)\to L:\begin{cases} x = mt + x_0 \\ y = nt + y_0 \\ z = pt + z_0 \end{cases}\quad\Longrightarrow\quad \Gamma:\begin{cases} \Gamma_1: P_0(u_0,v_0)\to P_1(x_1,y_1)\\ \Gamma_2: P_0(u_0,v_0)\to P_2(x_2,y_2)\\ \quad\vdots\\ \Gamma_n: P_0(u_0,v_0)\to P_n(x_n,y_n) \end{cases},\tag{6}$$
where P0 is the image point with the image coordinate (u0, v0) in pixels; L is the corresponding space line; and P1, P2, …, Pn are the object points on the reference planes Π1, Π2, …, Πn with the two-dimensional coordinates (x1, y1), (x2, y2), …, (xn, yn), respectively.

3.1.2 Fringe-phase marking method (FPMM)

To obtain the two-dimensional mapping set Γ, FPMM is used to provide uniformly distributed dense mark points at the pixel scale. Together with a phase unwrapping algorithm, the fringe-phase technique can express a space coordinate through continuous phase information. The fringe phase can therefore encode the object surface continuously, which has been widely applied in 3D measurement [21–23]. The most commonly used method to retrieve the fringe phase is the four-step phase-shifting algorithm. The intensities of the four fringe images Ii (i = 0, 1, 2, 3) are written as:

$$I_i(u,v) = I_B(u,v) + I_A(u,v)\cos\left[\varphi(u,v) + i\pi/2\right],\tag{7}$$
where (u, v) is the point coordinate on the image plane of the camera, IB (u, v) is the average intensity, and IA (u, v) is the modulation intensity. Fringe phase φ(u,v) can be determined by:

$$\varphi(u,v) = \arctan\frac{I_3(u,v)-I_1(u,v)}{I_0(u,v)-I_2(u,v)},\tag{8}$$

In this study, the fringe patterns are displayed on a liquid crystal display (LCD) of known pixel size, which covers the FOV of the AOTF imaging system. The LCD is positioned at different locations to act as the reference planes. The fringe patterns are captured by the AOTF imaging system, and the fringe phase is calculated for every image point. Eq. (8) gives phase values between 0 and 2π only. Therefore, a phase-unwrapping algorithm is required to remove the 2π discontinuities and to obtain a continuous, monotonic phase map. The multi-frequency heterodyne principle [24] is used for full-field phase unwrapping, which is realized by displaying fringe patterns with three different pitches. To obtain an unambiguous phase value over the field of view with the heterodyne principle, the fringe pitches should satisfy the inequality:

$$\frac{\lambda_1 \lambda_2 \lambda_3}{\lambda_1\lambda_2 - 2\lambda_1\lambda_3 + \lambda_2\lambda_3} \geq \max(R_x, R_y),\tag{9}$$
where λ1, λ2, and λ3 are the three fringe pitches displayed on the LCD, and Rx and Ry are the horizontal and vertical resolutions of the LCD, respectively. In this study, the resolution of the LCD is 1920 × 1080 pixels. Therefore, the fringe pitches used in our study are 16, 17, and 18 pixels to satisfy Eq. (9). Figure 9 shows the displayed fringes, the captured images, and the result of the absolute phase calculation. For every pixel (u, v) on the image plane, the corresponding x and y coordinates (in mm) on the reference plane Πn are mapped by the following equations based on the absolute unwrapped phase:
$$\begin{cases} x = k_x \cdot \phi_x(u,v) \cdot \lambda / 2\pi \\ y = k_y \cdot \phi_y(u,v) \cdot \lambda / 2\pi, \end{cases}\tag{10}$$
where λ is the pitch of the displayed fringe patterns; ϕx(u, v) and ϕy(u, v) are the horizontal and vertical absolute unwrapped phase values calculated from the fringe images; and kx and ky are scale factors related to the horizontal and vertical pixel sizes and resolutions of the LCD.
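As a hedged sketch of these FPMM building blocks, the Python fragment below evaluates the wrapped phase of Eq. (8), checks the pitch condition of Eq. (9) for the 16/17/18-pixel pitches used here, and applies the coordinate mapping of Eq. (10). The multi-frequency heterodyne unwrapping itself (Ref. [24]) is assumed to be available elsewhere and is not reproduced.

```python
import numpy as np

def wrapped_phase(I0, I1, I2, I3):
    """Four-step phase shifting, Eq. (8); arctan2 keeps the full phase quadrant."""
    return np.arctan2(I3 - I1, I0 - I2)

def equivalent_pitch(l1, l2, l3):
    """Left-hand side of Eq. (9): beat pitch of the three-frequency heterodyne."""
    return l1 * l2 * l3 / (l1 * l2 - 2 * l1 * l3 + l2 * l3)

# The pitches of this study satisfy Eq. (9) for a 1920 x 1080 LCD:
print(equivalent_pitch(16, 17, 18))  # 2448.0 >= max(1920, 1080)

def phase_to_coordinate(phi, pitch, scale):
    """Eq. (10): absolute unwrapped phase to a metric coordinate on the plane."""
    return scale * phi * pitch / (2 * np.pi)
```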

Fig. 9 (a) Displayed fringe patterns in vertical direction with the three fringe pitches of 16 pixels, 17 pixels and 18 pixels respectively; (b) Displayed fringe patterns in horizontal direction with the three fringe pitches; (c) Acquired fringe images in vertical direction by the AOTF imaging system with the three fringe pitches; (d) Acquired fringe images in horizontal direction by the AOTF imaging system with the three fringe pitches; (e) The result of the absolute phase calculation in horizontal and vertical direction.

Without loss of generality, we consider the calibration process of the mapping Γn as an example. The LCD is positioned at location n, and its display plane is denoted as plane Πn. The mapping Γn is then obtained with the following steps:

  • Step 1. Three sets of fringe patterns are displayed on the plane Πn with the designed fringe pitches in the horizontal direction. For each pattern set, the four fringe patterns are displayed with the intensities described in Eq. (7), and the fringe images are captured by the AOTF imaging system.
  • Step 2. Repeat Step 1 in the vertical direction.
  • Step 3. Calculate the absolute phase value with the four-step phase-shifting algorithm and the multi-frequency heterodyne principle.
  • Step 4. Calculate the corresponding x and y coordinates on the plane Πn for every pixel with Eq. (10).

It should be noted that the position and orientation of the planes Π1, Π2, …, Πn are strictly adjusted to keep the reference planes parallel and equidistant. In this study, the LCD is fixed on an electric displacement platform, and the different locations of the LCD are driven by the platform, which can be treated as an equidistant rigid motion. As mentioned above, the origin of the world coordinate system C is fixed on the reference plane Π1. To obtain the z coordinates of the other reference planes from the platform, the display plane of the LCD should be perpendicular to the direction of motion. This is guaranteed by the following procedure:

  • 1. A laser is placed in front of the electric displacement platform. The laser is carefully adjusted with the optical adjustment method to make the outgoing beam parallel to the motion direction of the platform.
  • 2. The laser is fixed, and the LCD is adjusted so that the laser spot reflected by the display plane of the LCD coincides with the laser exit.

3.1.3 Neural network algorithm

For every pixel on the image plane Π0, the corresponding x and y coordinates on the reference plane Πn are obtained with the help of the unwrapped phase. Thus, FPMM actually provides a lookup table between the pixel coordinates on plane Π0 and the two-dimensional world coordinates on reference plane Πn, as shown in Fig. 10.

Fig. 10 The lookup table between the pixel coordinate and the world coordinate.

The pixel coordinates in the lookup table are integers. However, image points often have subpixel coordinates in actual 3D measurement. Thus, the mapping Γn must reach a certain subpixel accuracy. An interpolation method is needed to obtain another form of Γn, a one-to-one function Fn:

$$F_n: F_n(u,v) = (x,y); \quad (u,v)\in\Pi_0,\; (x,y)\in\Pi_n,\tag{11}$$
where (u, v) is the coordinate of the image point on plane Π0; (x, y) is the world coordinate of the corresponding object point on the reference plane Πn.

To obtain the continuous function Fn from the lookup table provided by FPMM, a robust fitting method is required. Instead of traditional interpolation methods, such as polynomial interpolation and spline interpolation, this study proposes a BP neural network algorithm, which is considered to have better performance. The reasons are listed as follows:

  • 1. FPMM provides the mapping between the image points and mark points at the pixel scale, so the data volume is very large. For an image with a resolution of 1024 × 1024 pixels, the problem becomes a two-dimensional interpolation with 1,048,576 sample points, which imposes a heavy computational burden on traditional interpolation methods.
  • 2. From the discussion above, the AOTF imaging system presents a special lens distortion type, which makes it very difficult to design a global expression covering the interpolation over the whole FOV of the AOTF imaging system. Using an explicit local interpolation is an obvious solution to this problem, but it requires much more computation and makes the algorithm more complex.
  • 3. The BP neural network naturally has the advantage of processing massive data with acceptable accuracy and time consumption. Moreover, the BP neural network has broad applicability because it is not concerned with the lens distortion type and solves the interpolation problem implicitly.

We consider the mapping Γn as an example. To obtain the continuous function Fn, a BP neural network consisting of linear input neurons, sigmoid hidden neurons, and linear output neurons is designed, as shown in Fig. 11. The matrices containing the coordinates of the image points and the object points serve as the input and output of the network. The exact number of hidden neurons is gradually adjusted during the training process to reach the best condition of the network: if the number is too small, the output will under-fit, and if the number is too large, the output will over-fit. To guarantee the fitting quality of the BP neural network, 10% of the input and output data are randomly selected as testing samples, with the remaining 90% as training samples. After the training process, the BP neural network returns the function Fn described in Eq. (11). For the other mappings Γ1, Γ2, …, Γn−1, the process mentioned above is repeated n−1 times to obtain the functions F1, F2, …, Fn−1. A minimal sketch of this fitting step is given below.
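The following Python sketch illustrates fitting one mapping Fn under the stated 90/10 split. The paper's implementation uses MATLAB's toolbox with Levenberg-Marquardt training (see Section 4.5.2); scikit-learn's MLPRegressor does not provide Levenberg-Marquardt, so the 'lbfgs' solver is substituted here, and the `uv`/`xy` arrays are random placeholders standing in for the real FPMM lookup table.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

# Placeholder data: in practice, uv holds pixel coordinates on plane Pi_0
# and xy the matching world coordinates on reference plane Pi_n (from FPMM).
uv = np.random.rand(10000, 2)
xy = np.random.rand(10000, 2)

# 90% training samples, 10% testing samples, as described above.
uv_train, uv_test, xy_train, xy_test = train_test_split(
    uv, xy, test_size=0.1, random_state=0)

Fn = MLPRegressor(hidden_layer_sizes=(12,),   # 12 hidden neurons, per Sec. 4.5.2
                  activation='logistic',      # sigmoid hidden layer
                  solver='lbfgs',             # LM is unavailable; lbfgs substituted
                  max_iter=1000)              # 1000-iteration termination condition
Fn.fit(uv_train, xy_train)
print('test MSE:', np.mean((Fn.predict(uv_test) - xy_test) ** 2))
```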

Fig. 11 The structure of the BP neural network.

When all the functions are obtained, the continuous form of the mapping set Γ is acquired. This means that for every image point, the corresponding object points on the reference planes are determined, and the linear equation of the space line L is easily obtained. Figure 12 illustrates the process of obtaining the linear equation of the space line. For any point P(u, v) on the image plane, the coordinates of its corresponding object points on the reference planes are obtained from Eq. (12):

$$\begin{cases} (x_1,y_1)=F_1(u,v); & z_1 = 0\\ (x_2,y_2)=F_2(u,v); & z_2 = d\\ \quad\vdots\\ (x_n,y_n)=F_n(u,v); & z_n = (n-1)d, \end{cases}\tag{12}$$
where (u, v) is the coordinate of the image point P on the image plane, (xn, yn) is the coordinate of the corresponding object point Pn on reference plane Πn, zn is the z coordinate of Pn, and d is the distance between the reference planes, which is provided by the electric displacement platform in this study.

Fig. 12 Linear equation of the space line is acquired with the mapping set Γ.

Since all the points P1, P2, …, Pn lie on the space line L and their coordinates are known, obtaining the linear equation of the space line L becomes a linear fitting problem, and the equation is obtained with a least-square method.
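A minimal sketch of this line fit follows, assuming the common least-squares construction: the centroid of the intersection points serves as the point (x0, y0, z0) on L, and the first principal direction of the centered points gives the direction (m, n, p) of Eq. (5).

```python
import numpy as np

def fit_space_line(points):
    """Least-squares 3D line through Nx3 points P1..Pn.

    Returns (origin, direction): origin is the centroid, direction the first
    right singular vector of the centered points, i.e. Eq. (5) in parametric form.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    return centroid, vt[0]

# Illustrative intersection points on three reference planes (assumed values).
pts = np.array([[0.00, 0.00,  0.0],
                [0.11, 0.20, 10.0],
                [0.19, 0.41, 20.0]])
origin, direction = fit_space_line(pts)
print(origin, direction)
```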

3.2 Laser plane calibration

Camera calibration solves the problem of reverse projection. To conduct the 3D structure measurement, the spatial location of the laser plane in the world coordinate system should also be calculated.

As shown in Fig. 13, the laser works with an optical scanner to produce several laser planes. The number of laser planes is designed in advance according to the FOV and required resolution of the 3D measurement application. The main purpose of the laser plane calibration is to obtain the equation for every laser plane in the world coordinate system. In the actual measurement, with the laser plane scanning over the object, the plane equation together with the linear equation obtained from the reverse projection provides the 3D coordinates of the object points.

Fig. 13 Laser planes are produced by the laser and optical scanner.

The designed laser planes are denoted as Κ1, Κ2, …, Κn. The calibration is then conducted plane by plane. As shown in Fig. 14, for a certain laser plane Κm, the plane equation is denoted as Κm: amx + bmy + cmz = 1 in the same world coordinate system C as the reference planes. The intersection lines l1, l2, …, ln between Κm and the reference planes are imaged by the AOTF imaging system sequentially. For a certain line ln, the image coordinates of the laser strip on the image plane are first extracted. Then, for the object points lying on ln, the x and y coordinates are provided by the function Fn(u, v) = (x, y), and the z coordinate is the fixed value determined by the location of plane Πn. This procedure is repeated for every intersection line, yielding a set of object points. As all these points lie on the plane Κm, a plane fitting process is carried out to obtain the plane equation; a sketch is given below. With this process conducted plane by plane, the calibration is completed with the equations of all the laser planes obtained.
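A minimal sketch of this plane fit, assuming the plane form amx + bmy + cmz = 1 stated above: stacking the 3D laser-strip points of all intersection lines gives an overdetermined linear system solved by least squares. (A side note on this form: it cannot represent a plane passing through the world origin, which is acceptable here since the laser planes stand off from the origin.)

```python
import numpy as np

def fit_laser_plane(points):
    """Least-squares fit of K_m: a*x + b*y + c*z = 1 to Nx3 world points."""
    ones = np.ones(len(points))
    coeffs, *_ = np.linalg.lstsq(points, ones, rcond=None)
    return coeffs  # (a_m, b_m, c_m)

# Illustrative points on one laser plane (assumed values): the plane x = 10.
pts = np.array([[10.0,  0.0,  0.0],
                [10.0,  5.0, 10.0],
                [10.0, -3.0, 20.0],
                [10.0,  2.0, 30.0]])
print(fit_laser_plane(pts))  # approximately (0.1, 0, 0)
```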

Fig. 14 Calibration for the laser plane Km.

4. Experimental results and discussion

4.1 Experimental setup

To demonstrate the deviation of the AOTF imaging system from the traditional imaging model and to verify the performance of the proposed calibration method, a series of experiments is carried out. A common camera is selected for comparison with the AOTF imaging system. The camera is a Basler acA640-90gm with a resolution of 659 × 494 pixels. The resolution of the AOTF imaging system is 1024 × 1024 pixels. The experiments are designed as follows:

  • The AOTF imaging system and the Basler camera are calibrated under the traditional imaging model using Zhang's method. The calibration results are discussed in detail to demonstrate the deviation of the AOTF imaging system from the pinhole model.
  • The AOTF imaging system and the Basler camera are calibrated under MPM, and the back projection error is presented to verify the performance of the proposed calibration method.
  • A 3D structure measurement experiment is conducted with the AOTF imaging system using the calibration result of MPM, which demonstrates the feasibility of the proposed calibration method in actual 3D measurement.

4.2 Calibration under the pinhole model

A checkerboard with 11 × 10 effective key points is placed in front of the camera as the calibration target. Twenty images are captured by the AOTF imaging system and the Basler camera in turn. The calculation is performed with the camera calibration toolbox [25].

Figure 15 illustrates the calibration results, with the recovered target positions shown in each camera's view. During the calibration process, all the positions of the checkerboard are roughly in front of the camera. In the calibration result, however, the AOTF imaging system yields an oblique, downward-looking view, whereas the Basler camera retains the expected view angle. The calibration data also indicate this phenomenon. The principal point of the Basler camera in Eq. (14) is located near the center of the image, whereas for the AOTF imaging system in Eq. (13), the principal point is obviously off center, with a u coordinate of −234.35 pixels. This phenomenon is quite extraordinary in traditional camera calibration. The reason for the abnormal result lies in the second characteristic of the pinhole model. As mentioned above, the incident light loses its axial symmetry about the optical axis of the AOTF imaging system after passing through several non-axisymmetric devices. However, the pinhole calibration model forces the optical axis to obey the axial symmetry hypothesis, leading to a position offset of the principal point. Furthermore, the second-order radial distortion coefficient is very large, indicating the abnormal distortion type of the AOTF imaging system.

$$A_1 = \begin{bmatrix} 5987.23 & 0 & -234.35\\ 0 & 6092.35 & 625.34\\ 0 & 0 & 1 \end{bmatrix},\quad k_1 = 0.0675,\; k_2 = 158.9546,\; p_1 = 0.0374,\; p_2 = 0.0235\tag{13}$$

$$A_2 = \begin{bmatrix} 6479.26 & 0 & 401.46\\ 0 & 6501.38 & 272.44\\ 0 & 0 & 1 \end{bmatrix},\quad k_1 = 0.0259,\; k_2 = 8.7796,\; p_1 = 0.0057,\; p_2 = 0.0231\tag{14}$$
where A1 and A2 are the intrinsic matrices of the AOTF imaging system and the Basler camera, containing the image coordinates of the principal point and the scale factors along the image u and v axes; k1 and k2 are the radial distortion coefficients; and p1 and p2 are the decentering distortion coefficients.

Fig. 15 Calibration result of the AOTF imaging system (a) and Basler camera (b) under the pinhole model.

4.3 Calibration under MPM

This experiment is conducted at a distance of 2 m to match the working distance of the AOTF imaging system. As mentioned above, the dense mark points in the MPM are produced by a display with the FPMM. The display is moved by an electric displacement platform to seven locations to act as seven different reference planes. The display is strictly aligned with the optical adjustment method to guarantee that the normal direction of the display plane is consistent with the moving direction, which is the z direction in the world coordinate system. The moving step is fixed, and thus the z coordinates of the reference planes are determined. The AOTF imaging system and the Basler camera are calibrated under MPM successively.

In Fig. 16, the two beams of space lines represent selected light rays of the AOTF imaging system and the Basler camera, respectively. Compared with the Basler camera, although the light rays of the AOTF imaging system show a trend of convergence, they do not intersect at a single point, as shown in the enlarged view. This phenomenon is discussed in the preceding part of this study: the AOTF imaging system deviates from the first characteristic of the pinhole model because of a number of specific optical elements.

Fig. 16 Calibration result of the AOTF imaging system (a) and Basler camera (b) under MPM.

Table 1 shows the back projection errors of the four calibration processes mentioned above. In the pinhole model calibration, seven of the checkerboard positions are randomly selected, matching the seven reference planes in the MPM, to calculate the back projection error. For a given mark point, the back projection error is the distance between the projected point and the image point. For the calibration under the pinhole model, the back projection error is provided by the MATLAB calibration toolbox [25]. For the calibration under MPM, which solves the reverse projection problem, the back projection error is first obtained as the two-dimensional Euclidean distance between the mark points and the projected points on the reference planes; it is then translated into image pixels according to the instantaneous field of view of the AOTF imaging system. Table 1 shows the mean back projection error over all the mark points. As can be seen from the table, the AOTF imaging system presents a relatively large back projection error compared with the Basler camera under the pinhole model. MPM decreases the error efficiently for both systems because of the dense mark points provided by FPMM and the accurate fitting ability of the neural network.

Table 1. Back projection errors of the calibration

4.4 3D structure measurement experiment

An actual 3D structure measurement experiment is carried out to demonstrate the validity of the proposed calibration method. Standard samples are measured to evaluate the measurement error. Moreover, a maize seedling is measured from the top view to test the suitability for vertical vegetation observation, which is the main design objective of the AOTF imaging system. Photos of the experiment are shown in Fig. 17. The measurement distance is 2 m to match the working distance of the AOTF imaging system.

Fig. 17 3D structure measurement with (a) plane and (b) seedling.

First, a standard planar board is measured at a fixed location with the calibration results of the pinhole model and MPM, respectively. The residuals of the plane fitting reflect the plane measurement error of the AOTF imaging system, as shown in Fig. 18. As can be seen from Fig. 18(a), the plane measurement result of the pinhole model is evidently bent into an arc, with a plane-fitting standard deviation of 4.34 mm. For the same standard planar board, the measurement result of MPM is accurate, with a standard deviation of 0.48 mm, a reduction of 88.94% compared with the pinhole model.

Fig. 18 Residuals of the plane fitting using the pinhole model (a) and MPM (b).

Fig. 19 The standard stepped sample (a) and the calculation of the distance between the working surfaces (b).

Second, a standard stepped sample is measured. The distance between the two working surfaces is a precise fixed value of 9.02 mm, as shown in Fig. 19(a). The difference between the measured and actual distances reflects the measurement error for a 3D object. As shown in Fig. 19(b), the measured distance is calculated from the 3D point data with the following procedure (a minimal sketch follows the steps):

  • Step 1. The plane equation is acquired from the 3D point cloud of the working surface 1 with a plane fitting method;
  • Step 2. For every point in the 3D point cloud of the working surface 2, the distance from the point to the fitted plane in Step 1 is calculated;
  • Step 3. The mean value of the distances acquired in Step 2 represents the measured distance between the working surfaces.
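A minimal Python sketch of Steps 1–3, with `surf1` and `surf2` as hypothetical stand-ins for the measured point clouds of the two working surfaces (the synthetic step below reproduces the nominal 9.02 mm spacing):

```python
import numpy as np

def plane_fit(points):
    """Least-squares plane a*x + b*y + c*z = 1 through Nx3 points."""
    n, *_ = np.linalg.lstsq(points, np.ones(len(points)), rcond=None)
    return n

def mean_step_distance(surf1, surf2):
    n = plane_fit(surf1)                              # Step 1: fit surface 1
    d = np.abs(surf2 @ n - 1) / np.linalg.norm(n)     # Step 2: point-to-plane distances
    return d.mean()                                   # Step 3: mean distance

# Synthetic working surfaces (assumed values), roughly 100 mm from the origin.
surf1 = np.array([[0., 0., 0.01], [1., 0., 0.],
                  [0., 1., 0.], [1., 1., -0.01]]) + [0., 0., 100.]
surf2 = surf1 + [0., 0., 9.02]                        # a 9.02 mm step
print(mean_step_distance(surf1, surf2))               # approximately 9.02
```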

Figure 20 shows the 3D point clouds measured with the calibration results of the pinhole model and MPM. As mentioned above, the actual distance is 9.02 mm. For the pinhole model, the measured distance is 11.27 mm, giving a measurement error of 2.25 mm. For MPM, the measured distance is 9.91 mm, giving a measurement error of 0.89 mm, a reduction of 60.44%.

Fig. 20 3D point cloud of the two working surfaces using the pinhole model (a) and MPM (b).

Finally, a 3D structure measurement of a maize seedling is carried out, as the AOTF imaging system is designed to conduct vegetation observation at close range, as shown in Fig. 21. As can be seen from the figure, the 3D point cloud reflects the shape and orientation of the leaves. Meanwhile, some noise exists at the edges of the leaves, especially for leaves with large tilt angles, which is a common drawback when measuring step edges with structured light.

Fig. 21 Photo (a) and 3D point cloud after triangulation (b) of maize seedling.

4.5 Discussion

4.5.1 The measurement accuracy

The comparison experiment shows that MPM reduces the measurement error markedly, which verifies the feasibility of the proposed method. The plane measurement error is 0.48 mm and the 3D object error is 0.89 mm at a working distance of 2 m. However, as a general rule, a structured light system with laser scanning should reach better accuracy. The main factor limiting the measurement accuracy in this study is the very short baseline. For a common structured light system, the length of the baseline is optimized to obtain a better vertex angle, which is the angle between the optical axis and the laser plane; the ideal value is 90°. However, in the structural design of the AOTF-based 3D measurement system, the size of the instrument is strictly restricted by specific application requirements. Thus, the baseline is limited to a very small value of 160 mm compared with the long working distance of 2 m. Under this circumstance, the vertex angle is only 4.6°, which sacrifices measurement accuracy for the portability of the instrument. Meanwhile, the AOTF-based 3D measurement system is mainly designed for vegetation observation at close range, and the application requirements can be satisfied by the current measurement accuracy with MPM.
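As a rough geometric check (assuming the vertex angle is simply the angle subtended by the 160 mm baseline at the 2 m working distance):

$$\theta \approx \arctan\!\left(\frac{160\ \text{mm}}{2000\ \text{mm}}\right) \approx 4.6^{\circ}.$$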

4.5.2 The numerical performance analysis

In addition to the measurement accuracy, the numerical performance of the proposed method should also be considered. During the calibration process, the computation comes mainly from four steps: phase calculation, linear equation fitting, plane equation fitting, and the BP neural network training process. In the first three steps, the calculation can be accomplished by linear matrix operations without iteration, so the computational burden is very small; we therefore concentrate mainly on the BP neural network training process.

The training of the BP neural network is in fact an iterative optimization process, and its numerical performance is influenced by the number of sampling points, the structure of the network, and the training algorithm designated in the optimization. As mentioned above, the resolution of the AOTF imaging system is 1024 × 1024 pixels, and 90% of all the pixels are selected as sampling points, with the remaining 10% as testing points to evaluate the fitting result of the network. The number of hidden neurons is determined as 12 after several testing experiments to avoid under-fitting and over-fitting. Not every training run converges to an acceptable result, and in this study we set 1000 iterations as the termination condition. Table 2 describes the numerical performance of the BP neural network using three common training algorithms: the Levenberg-Marquardt algorithm, the Bayesian Regularization algorithm, and the Scaled Conjugate Gradient algorithm:

Table 2. Numerical performance of the BP neural network

From Table 2, the Bayesian Regularization algorithm tends to have difficulty converging, and the Scaled Conjugate Gradient algorithm converges easily, but its average mean squared error is relatively large. The Levenberg-Marquardt algorithm offers a balanced performance in convergence, time consumption, and average mean squared error, which makes it suitable for the training process of the BP neural network in this study.

5. Conclusions

The influencing factors that cause the AOTF imaging system to deviate from the traditional pinhole model are discussed in this study. A multiplane camera calibration model is proposed, which describes the imaging process implicitly and does not force the imaging system to follow the pinhole principle. Furthermore, this model does not track the actual direction of the light rays inside the optical system. Thus, it can be applied to any form of lens distortion. FPMM is proposed to produce, pixel by pixel, the dense mark points required in the calibration process. A related BP neural network algorithm is also presented to process the massive data in the calibration. Together with FPMM and the neural network algorithm, MPM establishes the mapping between the image points and their corresponding space lines. The calibration experiment shows that MPM decreases the back projection error to 0.015 pixels, compared with 0.72 pixels under the pinhole model. The actual 3D structure measurement shows that the plane measurement error is 0.48 mm and the 3D object measurement error is 0.89 mm at a working distance of 2 m, reductions of 88.94% and 60.44%, respectively, relative to the measurement errors under the pinhole model.

Funding

National Natural Science Foundation of China (NSFC) (61227806, 61475013); Program for Changjiang Scholars and Innovative Research Teams in University (IRT1203).

Acknowledgments

We would like to thank Ziye Wang and Yang Xu for useful discussions.

References and links

1. J. E. Anderson, L. C. Plourde, M. E. Martin, B. Braswell, M. Smith, R. Dubayah, M. Hofton, and J. Blair, “Integrating waveform lidar with hyperspectral imagery for inventory of a northern temperate forest,” Remote Sens. Environ. 112(4), 1856–1870 (2008). [CrossRef]  

2. S. Elsayed, H. Galal, A. Allam, and U. Schmidhalter, “Passive reflectance sensing and digital image analysis for assessing quality parameters of mango fruits,” Sci. Hortic. (Amsterdam) 212, 136–147 (2016). [CrossRef]  

3. R. A. Zahawi, J. P. Dandois, K. D. Holl, D. Nadwodny, J. L. Reid, and E. C. Ellis, “Using lightweight unmanned aerial vehicles to monitor tropical forest recovery,” Biol. Conserv. 186, 287–295 (2015). [CrossRef]  

4. T. Eckhard, E. M. Valero, J. Hernández-Andrés, and M. Schnitzlein, “Adaptive global training set selection for spectral estimation of printed inks using reflectance modeling,” Appl. Opt. 53(4), 709–719 (2014). [CrossRef]   [PubMed]  

5. H. Zhao, C. Li, and Y. Zhang, “Three-surface model for the ray tracing of an imaging acousto-optic tunable filter,” Appl. Opt. 53(32), 7684–7690 (2014). [CrossRef]   [PubMed]  

6. H. Zhao, P. Zhou, Y. Zhang, Z. Wang, and S. Shi, “Development of a dual-path system for band-to-band registration of an acousto-optic tunable filter-based imaging spectrometer,” Opt. Lett. 38(20), 4120–4123 (2013). [CrossRef]   [PubMed]  

7. P. Miraldo and H. Araujo, “Calibration of smooth camera models,” IEEE Trans. Pattern Anal. Mach. Intell. 35(9), 2091–2103 (2013). [CrossRef]   [PubMed]  

8. E. L. Hall, J. B. K. Tio, C. A. McPherson, and F. A. Sadjadi, "Measuring curved surfaces for robot vision," Computer 15(12), 42–54 (1982). [CrossRef]  

9. O. D. Faugeras and G. Toscani, “The calibration problem for stereo,” in Proceedings of the IEEE Computer Vision and Pattern Recognition (1986), pp. 15–20.

10. R. Y. Tsai, “An efficient and accurate camera calibration technique for 3D machine vision,” in Proceedings of the IEEE Computer Vision and Pattern Recognition (1986), pp. 364–374.

11. J. Weng, P. Cohen, and M. Herniou, “Camera calibration with distortion models and accuracy evaluation,” IEEE Trans. Pattern Anal. Mach. Intell. 14(10), 965–980 (1992). [CrossRef]  

12. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1330–1334 (2000). [CrossRef]  

13. S. Joaquim and A. Xavier, “A comparative review of camera calibrating methods with accuracy evaluation,” Pattern Recognit. 35(7), 1617–1635 (2002). [CrossRef]  

14. M. Alemán-Flores, L. Alvarez, L. Gomez, P. Henriquez, and L. Mazorra, “Camera calibration in sport event scenarios,” Pattern Recognit. 47(1), 89–95 (2014). [CrossRef]  

15. L. Heng, G. H. Lee, and M. Pollefeys, “Self-calibration and visual SLAM with a multi-camera system on a micro aerial vehicle,” Auton. Robots 39(3), 259–277 (2015). [CrossRef]  

16. Z. Liu, Q. Wu, X. Chen, and Y. Yin, “High-accuracy calibration of low-cost camera using image disturbance factor,” Opt. Express 24(21), 24321–24336 (2016). [CrossRef]   [PubMed]  

17. H. A. Martins, J. R. Birk, and R. B. Kelley, “Camera models based on data from two calibration planes,” Comput. Graph. Image Process. 17(2), 173–180 (1981). [CrossRef]  

18. G. Q. Wei and S. D. Ma, "A complete two-plane camera calibration method and experimental comparisons," in Proceedings of the Fourth IEEE International Conference on Computer Vision (1993), pp. 439–446. [CrossRef]  

19. R. Swaminathan, M. D. Grossberg, and S. K. Nayar, “Non-single viewpoint catadioptric cameras: geometry and analysis,” Int. J. Comput. Vis. 66(3), 211–229 (2006). [CrossRef]  

20. Y. Yin, M. Wang, B. Z. Gao, X. Liu, and X. Peng, “Fringe projection 3D microscopy with the general imaging model,” Opt. Express 23(5), 6846–6857 (2015). [CrossRef]   [PubMed]  

21. Q. Hu and P. S. Huang, “Calibration of a three-dimensional shape measurement system,” Opt. Eng. 42(2), 482–493 (2003). [CrossRef]  

22. S. Zhang and P. S. Huang, “Novel method for structured light system calibration,” Opt. Eng. 45(8), 083601 (2006). [CrossRef]  

23. J. Xue, X. Su, L. Xiang, and W. Chen, “Using concentric circles and wedge grating for camera calibration,” Appl. Opt. 51(17), 3811–3816 (2012). [CrossRef]   [PubMed]  

24. C. Zuo, L. Huang, M. Zhang, Q. Chen, and A. Asundi, “Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review,” Opt. Lasers Eng. 85, 84–103 (2016). [CrossRef]  

25. J. Y. Bouguet, "Camera Calibration Toolbox for Matlab" (2015), http://www.vision.caltech.edu/bouguetj/calib_doc/
