
High dynamic range 3D measurements based on space–time speckle correlation and color camera


Abstract

Structured light (SL) based three-dimensional (3D) measurement struggles with high dynamic range (HDR) scenes, in which parts of both high and low reflectivity exist simultaneously. This paper proposes a method based on the joint design and optimization of hardware and algorithms, in which only four frames are required to realize 3D reconstruction of HDR scenes. The height information of each sub-area in the scene under test can be encoded effectively by temporally projecting two sets of complementary speckle patterns onto the target surface. To decode the corresponding patterns captured by the cameras, we design a stereo matching strategy consisting of preliminary screening with a space–time binary feature (ST-BIF) descriptor and final retrieval with space–time zero-mean normalized cross-correlation (ST-ZNCC). The ST-BIF descriptor, based on neighborhood comparison, is designed to describe the space–time relative intensity changes of the projected speckles. Besides its HDR adaptability, the ST-BIF descriptor effectively improves the matching speed. In addition, the measurable dynamic range can be further improved by fusing the disparities of all channels into the evaluated result, benefitting from the different responses of the R, G, and B channels of a color camera to monochromatic light. Experiments are conducted to demonstrate the feasibility of the proposed method. The results indicate that our method achieves a root mean square error of 0.2516 mm (vs. 1.0668 mm with the commonly used ZNCC) and an average coverage rate of up to 94.87% (vs. 93.35% with the commonly used ZNCC). Furthermore, the experimental results show that the proposed method can achieve 3D reconstruction of HDR scenes including specular reflection regions.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Given the advantages of noncontact operation, high speed, and simple hardware configuration, optical three-dimensional (3D) measurement has been widely studied and applied in areas such as object recognition, deformation monitoring, quality control of industrial parts, reverse engineering, and visual servoing [1–6]. Nevertheless, for conventional passive stereovision methods, the reconstruction accuracy and completeness heavily depend on the texture of the targets. Structured light (SL)-based approaches are commonly used to reduce this dependence on target texture and expand the scope of applications. SL-based approaches employ an active projector to artificially add encoded features onto the target surface [4]. By capturing the targets from different perspectives and performing stereo matching, 3D reconstruction can be achieved using the triangulation principle [7,8].

However, one challenge of the SL-based approach is the saturation of high-reflection parts due to the limited dynamic range of the camera [2,4,9–21]. A model proposed by Nayar and Ikeuchi indicated that the reflected light consists of three components, i.e., the diffuse lobe, the specular lobe, and the specular spike [22]. The diffuse lobe has nearly uniform intensity in all directions. The specular spike reflects only in one direction, which obeys the law of reflection. The specular lobe is distributed in a limited range around the specular spike [13]. As shown in Fig. 1, for smooth surfaces, the specular lobe and spike have higher intensity than the diffuse lobe, easily causing the captured image pixels in the reflection direction to be overexposed, whereas pixels away from the reflection direction have a relatively low signal-to-noise ratio (SNR) due to the weak reflected light [9,12]. Objects with large surface reflectivity variation are called high dynamic range (HDR) targets. For a Lambertian surface, which has only a diffuse lobe, some pixels may also be overexposed or too dark due to excessive reflectivity variation between different areas, resulting in matching errors. In actual scenes, the above-mentioned cases may exist simultaneously, making the HDR measurement situation complicated.

Fig. 1. Three components of reflection for smooth surface.

A direct way to deal with this kind of problem is spraying diffuse powder onto the target surface to cover the highly reflective parts [14]. Nevertheless, the spray method is time-consuming, introduces the powder size as an additional error source, and changes the surface characteristics of the targets, which is undesirable in most applications.

In terms of SL-based measurement, many HDR methods have been developed to achieve accurate 3D measurement of targets with a large reflectivity variation range without changing the surface morphology [9–12,15–21]. An intuitive and effective strategy is multi-exposure, because for low-reflection areas the image with a long exposure time is clear, whereas for high-reflection areas the image with a short exposure time retains more details. The most typical multi-exposure methods capture a series of images with different exposure times corresponding to areas of different reflectivity and fuse them into a whole image to realize the 3D measurement of HDR scenes [10]. Likewise, projecting multiple intensities and subsequently fusing the results achieves the same effect [15] and can be seen as another form of the multi-exposure method [2]. In addition, adjusting the projector brightness and the camera exposure time can be combined to further increase the dynamic range of measurement [16]. These multi-exposure methods are easy to implement and can considerably improve the measurement dynamic range. However, because of the lack of prior knowledge about the scene reflectivity distribution, these methods need to try multiple exposure or projection parameters, which is time-consuming and unsuitable for dynamic objects. Many improvements have been put forward to reduce the blindness of exposure parameter adjustment and accelerate the 3D measurement of HDR scenes. Rao and Da proposed an optimized multi-exposure method, in which the modulation of the captured scene is analyzed to calculate the appropriate exposure time [17]. Similarly, some researchers analyzed a preliminary captured image and adjusted the intensity of the projected pattern pixel-wise for the final HDR measurement [18–20]. These improvements directly calculate an appropriate exposure instead of making many blind attempts and have the advantages of few frames and short measurement time. However, before the measurement, additional experiments and calculations are needed to determine the exposure parameters, which costs time and limits the application scope of these methods.

Some researchers improved the measurement dynamic range by changing the hardware structure. Feng et al. introduced a polarizer to remove the highlights; unfortunately, the introduction of a polarizer may lead to a low signal-to-noise ratio (SNR) [11]. To obtain fringes of different intensities for HDR 3D measurement, Zheng et al. captured images while the projector works fully and partially and utilized the different responses of the three channels of an RGB camera [21]. Cai et al. utilized a plenoptic camera for multidirectional depth estimation and realized HDR 3D measurement [23]. The improvement of the dynamic range always comes at the cost of increasing the frame number, adding additional devices, or losing part of the information. Thus, we are devoted to acquiring information about saturated or dark parts with few frames and a simple device. Generally, most research on HDR measurement is based on fringe-encoding systems. Speckle-encoding-based 3D reconstruction methods [24–29] still lack solutions for HDR scenes.

In this paper, we introduce a novel method to extend the measurement dynamic range by taking advantage of temporal complementary speckle projection, space–time speckle correlation, and three-channel disparity fusion based on channel crosstalk in the RGB camera. We design temporally complementary speckle projection patterns to ensure that each region is illuminated by both bright and dark spots, which avoids complete saturation. Then, the corresponding comparison-based space–time binary feature (ST-BIF) descriptor and space–time zero-mean normalized cross-correlation (ST-ZNCC) are utilized to realize accurate and efficient matching. In addition, we fuse the disparities from the different color channels to effectively increase the measurement dynamic range. Our system needs only four frames to realize HDR three-dimensional measurement. Experimental results on standard spheres under different projecting gray values show that our method obtains an average coverage rate of up to 94.87% (vs. 93.35% with the commonly used ZNCC [30]) and a root mean square error of 0.2516 mm (vs. 1.0668 mm with the commonly used ZNCC). Our method also successfully measures objects with specular reflection and dark surfaces.

Section 2 introduces the principle of the proposed method. Section 3 gives experimental results to demonstrate the performance of the method and discusses the advantages and limitations. Section 4 concludes the paper.

2. Principle

2.1 System and measurement method construction

As shown in Fig. 2, we construct an active binocular system to achieve 3D measurement. The experimental system consists of a projector and two industrial RGB cameras. The projector sequentially projects speckle patterns onto the target surface. The two cameras are synchronized with the projector, forming a binocular module to capture two groups of object-modulated speckle patterns from different perspectives.

Fig. 2. Schematic diagram of the active binocular system of the proposed method.

For HDR scenes, due to the cameras’ limited dynamic range, the low-reflectivity region always has low SNR, whereas the high-reflectivity region is prone to saturation. To extend the recording dynamic range, we propose a strategy containing temporal complementary speckle projection and three-channel disparity fusion. The details of these two operations are expressed as follows.

2.1.1 Temporal complementary speckle projection

In HDR measurements, the high-reflection region is easily saturated under a bright spot, causing a complete or partial loss of coding information; the reconstructed depth information of these saturated parts will then be wrong.

To restrain the above-mentioned phenomenon and increase the measurement dynamic range, we sequentially project four monochromatic, pairwise complementary binary speckle patterns onto the target surface. Figure 3 shows the pairwise complementary relationship: images 1 and 2 are complementary, and images 3 and 4 are complementary. This ensures that any measured area is encoded by both bright and dark spots. Therefore, although some places are saturated under bright spots, the modulation information can still be captured under the dark spots and used for subsequent matching and 3D reconstruction. The temporal complementary projection strategy reduces the possibility of total overexposure or darkness and makes the measurement robust for HDR scenes. In addition, we can obtain the region of interest (ROI) mask by comparing the differences between each complementary pair, which is detailed in Section 2.2.1.
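For illustration, such complementary pairs can be generated as in the minimal Python sketch below; the resolution, dot density, and random seed are illustrative choices and are not parameters taken from the paper, which does not specify the speckle statistics.

```python
import numpy as np

def make_complementary_speckle_pairs(height=912, width=1140, density=0.5, seed=0):
    """Generate two pairs of binary speckle patterns in which pattern 2 is the
    complement of pattern 1 and pattern 4 is the complement of pattern 3.
    The dot density and resolution are illustrative, not values from the paper."""
    rng = np.random.default_rng(seed)
    p1 = (rng.random((height, width)) < density).astype(np.uint8) * 255
    p3 = (rng.random((height, width)) < density).astype(np.uint8) * 255
    p2 = 255 - p1   # complement of pattern 1
    p4 = 255 - p3   # complement of pattern 3
    return [p1, p2, p3, p4]

patterns = make_complementary_speckle_pairs()
# Every pixel is bright in exactly one image of each complementary pair.
assert np.all(patterns[0].astype(int) + patterns[1].astype(int) == 255)
```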

Fig. 3. Temporal complementary speckle projection.

2.1.2 Three-channel disparity fusion

An RGB camera has three sets of Bayer filters with different central wavelengths, corresponding to the R, G, and B channels. Because of the non-ideal filtering characteristics of the Bayer filters, each channel still passes light whose wavelength differs from its center wavelength, only with a different transmittance. This phenomenon can be exploited for HDR imaging. During imaging, the channel with high transmittance for the projected light is sensitive and obtains a high SNR in low-reflectivity areas, but it is also easily saturated in high-reflection areas. By contrast, channels with weak transmittance have a relatively low SNR, especially in low-reflection regions, but are hardly saturated in high-reflectivity parts. Thus, the weak-transmittance channel can still obtain a high SNR in high-reflectivity parts, without saturation, because of the strong reflected energy.

Therefore, for monochromatic illumination such as blue light, the three channels are suited to different areas. The medium-transmittance channel is applied to collect the information of regions with moderate reflectivity, which constitute the vast majority. For regions with particular reflectivity, the high-transmittance channel can extract weak information in dark regions, and the low-transmittance channel can record strong light from highly reflective regions. The three channels respond to different intensity ranges; thus, their fusion can remarkably improve the measurement dynamic range. The disparity fusion algorithm is detailed in Section 2.2.4.

2.2 Computational framework for the HDR measurement

Figure 4 shows the overall computational framework we have developed for HDR measurement. Synchronized with the projector, the left and right RGB cameras capture two groups of speckle-encoded images of the object from different perspectives. Each group has four images corresponding one by one to the temporally projected speckle patterns. Each image contains three channels, which have different response intensities because of their different transmittance values for the specific illumination wavelength. After epipolar rectification of the captured images, corresponding points lie in the same row [1]. First, we calculate the ROI mask to label the regions to be matched in advance, as detailed in Section 2.2.1. We separate the three channels of each image, and the four images from the same camera and channel are arranged together in shooting order as a group. Second, we apply the proposed ST-BIF + ZNCC matching algorithm, consisting of ST-BIF descriptor preliminary screening (Section 2.2.2) and ST-ZNCC final retrieval (Section 2.2.3), to generate a disparity map for each channel. The three generated disparity maps are sensitive to different brightness ranges. Finally, by performing reliability evaluation and fusing the three-channel disparity maps together, as detailed in Section 2.2.4, we can generate an accurate and complete disparity map that is robust to reflectivity changes, thereby improving the 3D reconstruction result. We detail the proposed computational framework in this subsection.

Fig. 4. Computational framework of the proposed high dynamic range measurement method.

2.2.1 ROI mask calculation

In SL-based 3D measurement, only the area modulated by projection texture can be identified and reconstructed successfully. Otherwise, featureless pixels will become noise to reduce the reconstruction efficiency and interfere with other point matching. Therefore, we first calculate the ROI mask to mark the modulated region in left and right views. In the subsequent matching process, we only match ROI pixels to improve matching efficiency and accuracy.

Benefitting from the pairwise complementary pixels of the projection patterns designed in Section 2.1.1, we can obtain the speckle modulation degree by calculating the sum of the pixel value differences between each complementary pair, as shown in Eq. (1). Then, a proper modulation degree threshold is set to separate valid and invalid areas. With $G_\textrm{k}^\textrm{c}(u,v)$ indicating the pixel intensity at $(u,v)$ in channel c of the k-th (k = 1, 2, 3, 4) frame, we have:

$$\sqrt {\frac{{\sum\limits_{c = R,G,B} {\{ {{[G_1^c(u,v) - G_2^c(u,v)]}^2} + {{[G_3^c(u,v) - G_4^c(u,v)]}^2}\} } }}{6}} > \sigma ,$$
where $\sigma$ is the threshold for the speckle modulation degree. Pixels satisfying Eq. (1) are identified as the ROI. Others are regarded as invalid area and are not processed in the subsequent steps.
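A minimal sketch of this ROI mask computation is given below (Python/NumPy); the threshold value is a hypothetical placeholder, and the four captured frames are assumed to be H×W×3 arrays ordered as projected.

```python
import numpy as np

def roi_mask(frames_rgb, sigma=10.0):
    """Eq. (1): modulation degree from the two complementary pairs, summed over
    the R, G, B channels. `frames_rgb` is a list of four HxWx3 arrays captured
    in projection order; `sigma` is an illustrative threshold value."""
    g1, g2, g3, g4 = [f.astype(np.float64) for f in frames_rgb]
    sq = (g1 - g2) ** 2 + (g3 - g4) ** 2          # per channel, per pixel
    modulation = np.sqrt(sq.sum(axis=2) / 6.0)    # average over 2 pairs x 3 channels
    return modulation > sigma                     # boolean ROI mask
```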

2.2.2 Preliminary screening based on ST-BIF descriptor

The matching process of ST-BIF descriptor preliminary screening (Section 2.2.2) and ST-ZNCC final retrieval (Section 2.2.3) is the same for all color channels. For each pixel to be matched, we construct an ST-BIF descriptor, a binary string built from the relative order of its spatial and temporal neighboring pixel intensities, to describe the space–time light-and-shade changes under the temporal complementary speckle projection. The ST-BIF descriptor is used as the cost measure in the preliminary screening.

Figure 5 shows the process of constructing feature pixel string and ST-BIF descriptor. For each pixel $(u,v)$ to be matched, we first extract its specific spatial and temporal neighborhood and arrange them together as a feature pixel string containing 20 elements. We use ${G_k}(u,v)$ to indicate the pixel intensity at $(u,v)$ in the k-th frame and symbol ${\otimes} $ to denote the concatenation operation. For coordinate $(u,v)$, its feature pixel string ${I^{(u,v)}}$ can be defined as:

$${I^{(u,v)}} = \mathop \otimes \limits_{k = 1}^4 [{G_k}(u,v)\mathop \otimes \limits_{n ={-} i,i}^{} {G_k}(u + n,v)\mathop \otimes \limits_{m ={-} i,i}^{} {G_k}(u,v + m)]. $$

The window size is $(2i + 1) \times (2i + 1)$. Owing to the pixel interpolation associated with the Bayer filter of the color camera, every 2 × 2 block of pixels in the captured image may be identical in the R and B channels. To prevent the feature pixel strings of adjacent coordinates from being exactly the same, i in Eq. (2) should be an odd number. In our experiment and in Fig. 5, i = 3.
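A sketch of one possible implementation of the feature pixel string of Eq. (2) follows; the row/column indexing convention and the absence of border handling are simplifying assumptions made here for brevity.

```python
import numpy as np

def feature_pixel_string(frames, u, v, i=3):
    """Eq. (2): 20-element space-time feature pixel string for pixel (u, v).
    `frames` is a list of four single-channel images (the four speckle frames of
    one color channel); u is assumed to be the column index and v the row index,
    and (u, v) is assumed to lie at least i pixels away from the image border."""
    elems = []
    for g in frames:                       # temporal order k = 1..4
        elems.append(g[v, u])              # center pixel
        elems.append(g[v, u - i])          # horizontal neighbours at -i and +i
        elems.append(g[v, u + i])
        elems.append(g[v - i, u])          # vertical neighbours at -i and +i
        elems.append(g[v + i, u])
    return np.array(elems, dtype=np.float64)   # 4 frames x 5 pixels = 20 elements
```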

Fig. 5. Generation of feature pixel string and ST-BIF descriptor.

Then, we calculate the ST-BIF descriptor by three comparison-based transforms, from step 1 to step 3, as shown in Fig. 5. For pixel intensities ${G_1}$ and ${G_2}$, we define

$$\xi ({G_1},{G_2}) = \left\{ \begin{array}{l} 0,\textrm{ if }{G_1} \le {G_2}\\ 1,\textrm{ else} \end{array} \right.. $$

Step 1: We calculate $B{\textrm{1}^{(u,v)}}$, the first part of the ST-BIF descriptor. $B{\textrm{1}^{(u,v)}}$ represents whether the intensity of each pixel in ${I^{(u,v)}}$ is greater than the mean value. We use $I_n^{(u,v)}$ to indicate the n-th element of ${I^{(u,v)}}$, and ${\overline I ^{(u,v)}}$ denotes the average:

$${\overline I ^{(u,v)}}\textrm{ = }\frac{{\sum\limits_{n = 1}^{20} {I_n^{(u,v)}} }}{{20}}. $$
$B{\textrm{1}^{(u,v)}}$ is defined as:
$$B{\textrm{1}^{(u,v)}} = \textrm{ }\mathop \otimes \limits_{n = \textrm{1}}^{\textrm{20}} \xi (I_n^{(u,v)},{\overline I ^{(u,v)}}). $$

Step 2: We calculate $B{\textrm{2}^{(u,v)}}$, the second part of the ST-BIF descriptor. $B{\textrm{2}^{(u,v)}}$ represents whether the intensity of each feature pixel is greater than that of the pixel at the same position in the next frame, with the last-frame pixels compared against the first frame. $B{\textrm{2}^{(u,v)}}$ is defined as:

$$B{2^{(u,v)}} = \mathop \otimes \limits_{m = 1}^5 \textrm{[}\mathop \otimes \limits_{n = 1}^3 \xi (I_{m + 5n - 5}^{(u,v)},I_{m + 5n}^{(u,v)}) \otimes \xi (I_{m + 15}^{(u,v)},I_m^{(u,v)})]. $$
$B{2^{(u,v)}}$ is applied to describe the temporal complementary features of projected speckles.

Step 3: We apply the census transform proposed by Zabih and Woodfill [31] to calculate $B{3^{(u,v)}}$, which is the third part of the ST-BIF descriptor. $B{3^{(u,v)}}$ represents whether the neighborhood pixel intensities are less than that of the center pixel in the same frame. With ${G_k}(u,v)$ indicating the pixel intensity at $(u,v)$ in the k-th frame (k = 1, 2, 3, 4), $B{3^{(u,v)}}$ is defined as:

$$B{\textrm{3}^{(u,v)}} = \mathop \otimes \limits_{k = 1}^\textrm{4} \{ \mathop \otimes \limits_{n ={-} i,i} \xi [{G_k}(u,v),{G_k}(u + \textrm{n},v)]\mathop \otimes \limits_{m ={-} i,i} \xi [{G_k}(u,v),{G_k}(u,v + m)]\}, $$
where the value of i must be the same as that in Eq. (2). In our experiment, i = 3. $B{3^{(u,v)}}$ is applied to describe the spatial intensity changes of the speckles.

Step 4: By concatenating $B{\textrm{1}^{(u,v)}}$, $B{\textrm{2}^{(u,v)}}$, and $B{3^{(u,v)}}$ together as shown in Eq. (8), the ST-BIF descriptor used for stereo matching can be obtained.

$${B^{(u,v)}} = B{1^{(u,v)}} \otimes B{2^{(u,v)}} \otimes B{3^{(u,v)}}.$$
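Putting Eqs. (3)–(8) together, the descriptor construction could be sketched as follows; the 5-element-per-frame layout (center, left, right, up, down) matches the feature pixel string sketched above, and the resulting 56-bit length (20 + 20 + 16 bits) is our reading of the equations rather than a figure stated in the text.

```python
import numpy as np

def xi(a, b):
    """Eq. (3): 0 if a <= b, else 1."""
    return np.uint8(a > b)

def st_bif_descriptor(I):
    """Build the ST-BIF descriptor from a 20-element feature pixel string `I`
    laid out as 4 frame blocks of 5 pixels each (center, left, right, up, down).
    Returns a binary vector of 20 (B1) + 20 (B2) + 16 (B3) = 56 bits."""
    I = np.asarray(I, dtype=np.float64)
    mean = I.mean()

    # B1, Eq. (5): each element compared with the string mean.
    b1 = np.array([xi(x, mean) for x in I], dtype=np.uint8)

    # B2, Eq. (6): same position compared between consecutive frames,
    # with the last frame wrapped back to the first.
    b2 = []
    for m in range(5):                       # position within a frame block
        for n in range(3):                   # frame k vs frame k+1
            b2.append(xi(I[m + 5 * n], I[m + 5 * (n + 1)]))
        b2.append(xi(I[m + 15], I[m]))       # frame 4 vs frame 1
    b2 = np.array(b2, dtype=np.uint8)

    # B3, Eq. (7): census transform inside each frame, center vs 4 neighbours.
    b3 = []
    for k in range(4):
        center = I[5 * k]
        for nb in I[5 * k + 1: 5 * k + 5]:
            b3.append(xi(center, nb))
    b3 = np.array(b3, dtype=np.uint8)

    return np.concatenate([b1, b2, b3])      # Eq. (8): concatenation
```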

The Hamming distance is utilized as the matching cost to evaluate the similarity between two ST-BIF descriptors in stereo matching. Let ${\oplus} $ denote the XOR operation. $B_i^{{{(u,v)}^L}}$ represents the i-th element of the ST-BIF descriptor for coordinate $(u,v)$ in the left perspective, and $B_i^{{{(u^{\prime},v^{\prime})}^R}}$ represents the i-th element of the ST-BIF descriptor for $(u^{\prime},v^{\prime})$ in the right perspective. Then, for n-bit ST-BIF descriptors ${B^{{{(u,v)}^L}}}$ and ${B^{{{(u^{\prime},v^{\prime})}^R}}}$, the matching cost can be written as:

$${C_{BIF}} = \sum\limits_{i = 1}^n {\textrm{ }B_i^{{{(u,v)}^L}} \oplus } \textrm{ }B_i^{{{(u^{\prime},v^{\prime})}^R}}. $$

A smaller matching cost indicates a higher matching degree. The ST-BIF descriptor encodes spatial and temporal local structures with the relative order of surrounding pixel intensities rather than their magnitudes. Since the ST-BIF descriptor depends only on the intensity order, it tolerates every radiometric distortion that preserves this order, making it robust to the low SNR or overexposure caused by HDR scenes. The remarkably different intensities of space–time adjacent pixels produced by the temporal complementary speckle projection further enhance the stability of the comparison-based ST-BIF descriptor. In an evaluation by Hirschmüller and Scharstein, the census transform, an important component of the ST-BIF descriptor, performed the best and obtained the most robust overall results on all test sets, including under real exposure and light-source changes and Gaussian noise [32]. This result further confirms the strong robustness of the ST-BIF descriptor. Because reflectivity varies with perspective, the pixel intensity corresponding to the same area may differ between the cameras. However, the speckle texture and the spatial and temporal neighborhood gray distribution characteristics are relatively fixed across perspectives and imaging devices. Therefore, the ST-BIF descriptor is robust for HDR scene measurement. The storage and computation costs of generating and comparing ST-BIF descriptors are small, which significantly improves the efficiency of stereo matching. In addition, the data processing flow is easy to parallelize, further improving the stereo matching speed [33].
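For completeness, the Hamming matching cost of Eq. (9) reduces to a few lines; packing the bits into a machine word so that the cost becomes one XOR plus a popcount is an implementation choice consistent with the speed argument above, not a step prescribed by the paper.

```python
import numpy as np

def bif_cost(desc_left, desc_right):
    """Eq. (9): Hamming distance between two 0/1 ST-BIF descriptor vectors."""
    return int(np.count_nonzero(desc_left != desc_right))

def pack_descriptor(desc):
    """Optional speed-up: pack the 0/1 descriptor into a Python int."""
    return int("".join(str(int(b)) for b in desc), 2)

def bif_cost_packed(a_packed, b_packed):
    """Same cost on packed descriptors: one XOR plus a popcount."""
    return bin(a_packed ^ b_packed).count("1")
```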

2.2.3 ST-ZNCC for final retrieval

The ST-BIF descriptor can effectively improve the matching robustness and speed. However, because the original gray levels are completely discarded, the ST-BIF descriptor contains limited image information compared with the original gray vector generated in Eq. (2). Therefore, the proposed stereo matching process contains two steps, i.e., preliminary screening with the ST-BIF descriptor and optimal determination with ST-ZNCC.

For a point to be matched in one perspective, we first calculate its ST-BIF descriptor matching cost with all coordinates in the same row of the other perspective. Points with a matching cost lower than the predefined threshold are classified as candidate points. In the second step, for each candidate, we calculate the zero-mean normalized cross-correlation coefficient between its feature pixel string and that of the point to be matched. The feature pixel string is defined and generated through Eq. (2) in the preliminary screening step. We use $I_i^{{{(u,v)}^L}}$ and $I_i^{{{(u^{\prime},v^{\prime})}^R}}$ to represent the i-th element of the feature pixel string for ${(u,v)^L}$ in the left image and ${(u^{\prime},v^{\prime})^R}$ in the right image, respectively. Then, the ST-ZNCC coefficient of ${(u,v)^L}$ and ${(u^{\prime},v^{\prime})^R}$ can be calculated using Eq. (10).

$${\rho _{ST - ZNCC}}[{(u,v)^L},{(u^{\prime},v^{\prime})^R}] = \frac{{\sum\nolimits_{i = 1}^{20} {[{I_i^{{{(u,v)}^L}} - \overline {{I^{{{(u,v)}^L}}}} } ]} [{I_i^{{{(u^{\prime},v^{\prime})}^R}} - \overline {{I^{{{(u^{\prime},v^{\prime})}^R}}}} } ]}}{{\sqrt {{{\sum\nolimits_{i = 1}^{20} {[{I_i^{{{(u,v)}^L}} - \overline {{I^{{{(u,v)}^L}}}} } ]} }^2}} \sqrt {{{\sum\nolimits_{i = 1}^{20} {[{I_i^{{{(u^{\prime},v^{\prime})}^R}} - \overline {{I^{{{(u^{\prime},v^{\prime})}^R}}}} } ]} }^2}} }}.$$
$$\overline {{I^{(j)}}} = \frac{1}{{\textrm{20}}}\sum\limits_{i = 1}^{\textrm{20}} {I_i^{(j)}\textrm{. (}j\textrm{ is the coordinate)}}$$

The candidate point with maximum ${\rho _{ST - ZNCC}}$ is considered as the optimal matching point. For this pair of matching points, we define the point from the left camera view as ${p_L} = ({u_1},{v_1})$ and the point from the right camera view as ${p_R} = ({u_2},{v_1})$ with $D = {u_1} - {u_2}$ indicating the disparity. By searching the optimal matching point for each point in a certain perspective image, we can get a disparity map containing the profile information.
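The two-step search can be sketched as below, reusing the helper functions from the previous sketches; the ST-BIF cost threshold is an illustrative value, and no disparity-range restriction is applied here although one would normally be used in practice.

```python
import numpy as np

def st_zncc(a, b):
    """Eq. (10): zero-mean normalized cross-correlation of two feature pixel strings."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum()) * np.sqrt((b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else -1.0

def match_pixel(u, v, left_frames, right_frames, roi_right, i=3, bif_threshold=10):
    """Search the match of left pixel (u, v) along the same rectified row:
    preliminary screening with the ST-BIF Hamming cost, then final retrieval
    with ST-ZNCC. Uses feature_pixel_string, st_bif_descriptor, and bif_cost
    from the sketches above; the threshold value is illustrative."""
    I_left = feature_pixel_string(left_frames, u, v, i)
    B_left = st_bif_descriptor(I_left)
    h, w = right_frames[0].shape
    best_u, best_rho = None, -1.0
    for u_r in range(i, w - i):
        if not roi_right[v, u_r]:
            continue                                     # outside the ROI mask
        I_right = feature_pixel_string(right_frames, u_r, v, i)
        if bif_cost(B_left, st_bif_descriptor(I_right)) >= bif_threshold:
            continue                                     # rejected by screening
        rho = st_zncc(I_left, I_right)
        if rho > best_rho:
            best_rho, best_u = rho, u_r
    return best_u, best_rho                              # disparity = u - best_u
```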

2.2.4 Reliability evaluation and three-channel disparity fusion

The reliability of disparity from each channel should be evaluated and marked in advance to increase the quality of color channel fusion and 3D reconstruction. The left–right consistency check and ST-ZNCC coefficient are jointly utilized to evaluate the disparity reliability degree.

Step 1: Left–right consistency check. The left–right consistency check works because corresponding points from the left and right views should have the same disparity. Let the disparity results of points from the left and right views be ${D_{L\textrm{eft}}}$ and ${D_{R\textrm{ight}}}$, respectively. For a point $(u,v)$ in the left view, its corresponding point in the right view can be expressed as $(u - {D_{L\textrm{eft}}}(u,v),v)$. Then the left–right consistency check criterion can be written as:

$$\left\{ {\begin{array}{{c}} {(u - {D_{L\textrm{eft}}}(u,v)) > 0}\\ {\textrm{ }|{\textrm{ }{D_{L\textrm{eft}}}(u,v) - {D_{Right}}(u - {D_{L\textrm{eft}}}(u,v),v)\textrm{ }} |< \partial } \end{array}} \right., $$
where $\partial $ is the predefined threshold for the disparity difference between corresponding points. We perform this check for all left-view points. A disparity satisfying Eq. (12) is marked as a preliminarily qualified disparity.

Step 2: ST-ZNCC coefficient screening. For points whose disparities are preliminarily qualified, we further screen their ${\rho _{ST - ZNCC}}$ calculated in Section 2.2.3 with the following equation:

$${\rho _{ST - ZNCC}}[{(u,v)^L},{(\textrm{u} - {D_{L\textrm{eft}}}(u,v),v)^R}] > \textrm{ }\beta, $$
where $\beta $ is the predefined threshold of ST-ZNCC coefficient between corresponding points.

After the two rounds of screening represented by Eqs. (12) and (13), we can divide all disparities into three categories, namely, unqualified, suspicious, and reliable disparity. The disparity not satisfying Eq. (12) is marked as unqualified disparity. The disparity satisfying Eq. (12) but not satisfying Eq. (13) is marked as suspicious disparity. The disparity satisfying Eqs. (12) and (13) is marked as reliable disparity.
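A compact sketch of this two-round classification is given below; the thresholds are placeholders, and the disparities are assumed to be stored as dense left- and right-view maps.

```python
import numpy as np

def classify_disparities(D_left, D_right, rho_left, alpha=1.0, beta=0.8):
    """Eqs. (12)-(13): label each left-view disparity as 2 (reliable),
    1 (suspicious) or 0 (unqualified). `rho_left` holds the ST-ZNCC coefficient
    of each accepted match; alpha and beta are illustrative thresholds."""
    h, w = D_left.shape
    labels = np.zeros((h, w), dtype=np.uint8)
    for v in range(h):
        for u in range(w):
            u_r = int(round(u - D_left[v, u]))
            if u_r <= 0 or u_r >= w:
                continue                                   # fails Eq. (12): unqualified
            if abs(D_left[v, u] - D_right[v, u_r]) >= alpha:
                continue                                   # fails Eq. (12): unqualified
            labels[v, u] = 2 if rho_left[v, u] > beta else 1
    return labels
```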

After the reliability test, we fuse the three-color channel disparity maps into an accurate and complete disparity map. The three-color channel fusing order is based on their sensitivity to the experimental monochromatic light. The disparity of the medium-sensitivity channel has the highest priority because this channel is appropriate to most regions. The most sensitive channel obtains the second priority because of high SNR. The least sensitive channel has the lowest priority. Figure 6 shows the three-channel fusion process. Channels with high, medium, and low sensitivity are abbreviated as c1, c2, and c3, respectively.

Fig. 6. Schematic of the three-channel fusion procedure.

In the search and fusion process, the descending order of priority is as follows: c2 reliable disparity > c1 reliable disparity > c3 reliable disparity > c2 suspicious disparity > c1 suspicious disparity > c3 suspicious disparity > c2 unqualified disparity > c1 unqualified disparity > c3 unqualified disparity. As shown in Fig. 6, we search in the priority order and fill the highest priority disparity point into the fused disparity map. Given that the selected disparity comes from three channels with different dynamic ranges, the fused disparity map is accurate and robust for HDR scenes.
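This priority search can be written as a per-pixel ranking, as in the sketch below; the channel ordering shown corresponds to blue illumination (c2 = G, c1 = B, c3 = R), an assumption drawn from the transmittance discussion in Section 3.1.1, and in practice unqualified disparities would normally be masked out before fusion.

```python
import numpy as np

def fuse_three_channels(disparities, labels, order=("G", "B", "R")):
    """Priority fusion of the per-channel disparity maps. `disparities` and
    `labels` map channel name -> array; `order` lists the channels from
    medium to high to low sensitivity (c2, c1, c3). For every pixel, the
    disparity with the highest (reliability, channel-priority) rank is kept."""
    h, w = next(iter(disparities.values())).shape
    fused = np.full((h, w), np.nan)
    best_rank = np.full((h, w), -1, dtype=int)
    for pri, ch in enumerate(order):                  # pri 0 = highest channel priority
        # Reliability label dominates; channel priority breaks ties,
        # reproducing the order c2 > c1 > c3 within each reliability level.
        rank = labels[ch].astype(int) * len(order) + (len(order) - 1 - pri)
        better = rank > best_rank
        fused[better] = disparities[ch][better]
        best_rank[better] = rank[better]
    return fused
```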

With the obtained disparity map, we calculate the 3D coordinates of each pixel on the basis of the standard stereovision 3D reconstruction algorithm [7]. For each depth value, we use a $\textrm{17} \times \textrm{17}$ window to check for outliers and perform further de-noising: if fewer than 20% of the surrounding points have a depth difference from the center point within 6 mm, the center point is considered noise and deleted.
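A straightforward (unoptimized) sketch of this outlier check is shown below; whether the 20% ratio is taken over all window pixels or only over valid neighbours is not specified in the text, so the version here counts valid neighbours.

```python
import numpy as np

def remove_depth_outliers(depth, win=17, depth_tol=6.0, min_ratio=0.2):
    """Within a win x win neighbourhood, if fewer than `min_ratio` of the valid
    neighbours lie within `depth_tol` (mm) of the centre depth, the centre
    point is treated as noise and removed (set to NaN)."""
    h, w = depth.shape
    r = win // 2
    cleaned = depth.copy()
    for v in range(r, h - r):
        for u in range(r, w - r):
            d = depth[v, u]
            if np.isnan(d):
                continue
            block = depth[v - r:v + r + 1, u - r:u + r + 1]
            valid = ~np.isnan(block)
            valid[r, r] = False                        # exclude the centre itself
            if valid.sum() == 0:
                continue
            close = np.abs(block[valid] - d) <= depth_tol
            if close.mean() < min_ratio:
                cleaned[v, u] = np.nan                 # delete as noise
    return cleaned
```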

3. Experiments, results, and discussion

We construct the system prototype and verify the performance of the proposed method. As shown in Fig. 7, our system consists of two RGB industrial cameras (FLIR GS3-U3-28S5C-C, resolution: 1440 × 1920) with 12 mm lenses and a blue LED projector (Texas Instruments DLP LightCrafter 4500, resolution: 1140 × 912, wavelength: 460 nm). Four pairwise complementary speckle patterns are generated by a computer and loaded into the projector in advance. The two cameras are triggered synchronously with the projector to capture images from different perspectives. The distance between the system and the scene under test is about 300–400 mm.

Fig. 7. Experimental system of the active binocular system in the proposed method.

3.1 Quantitative evaluation

To verify the accuracy and robustness of the proposed method under reflectance variation, we first measure a pair of standard ceramic spheres under different illumination intensities. These standard spheres are shown in Fig. 8. The geometric parameters of the standard ceramic spheres have been verified with a coordinate measuring machine (CMM); the diameters of Spheres A and B are 50.7956 mm and 50.7964 mm, respectively. The maximum permissible error of the CMM is 0.7 µm, which is far smaller than the error of machine vision methods.

Fig. 8. Measured standard spheres

As shown in Fig. 9, we change the projecting gray value to 50, 60, 70, 90, 110, 130, 150, 170, 190, 210, 230, and 250 with the other settings fixed to evaluate the measurement accuracy and coverage rate (γ) under different projection and imaging conditions.

Fig. 9. Captured images under different projecting gray values.

3.1.1 Accuracy evaluation of point cloud

To evaluate the accuracy of point cloud, the ideal spheres with diameters of 50.7956mm and 50.7964mm are used as the benchmarks of Spheres A and B, respectively. 3D reconstruction accuracy is evaluated by the root mean square error (RMSE) between measured points and the benchmarks.

To analyze the performance of the proposed method, we use three methods for 3D measurement and comparison: (a) the proposed method (ST-BIF + ZNCC after three-channel fusion), (b) the proposed ST-BIF + ZNCC matching method for a single channel (R, G, and B channels), and (c) the commonly used ZNCC matching method for the R, G, and B channels, using the first of the four frames for matching. As mentioned above, because of the pixel interpolation associated with the Bayer filter of the color camera, every 2 × 2 block of pixels in the captured image may be identical in the R and B channels. To avoid these repeated intensities being used for matching, we perform interval sampling with a step size of 3. The ZNCC coefficient between ${(u,v)^L}$ and ${(u^{\prime},v^{\prime})^R}$ is calculated using Eqs. (14) and (15).

$${\rho _{ZNCC}}[{(u,v)^L},{(u^{\prime},v^{\prime})^R}] = \frac{{\sum\limits_{m ={-} i}^i {\sum\limits_{n ={-} i}^i {[{G{{(u + 3m,v + 3n)}^L} - \overline {G{{(u,v)}^L}} } ][{G{{(u^{\prime} + 3m,v^{\prime} + 3n)}^R} - \overline {G{{(u^{\prime},v^{\prime})}^R}} } ]} } }}{{\sqrt {{{\sum\limits_{m ={-} i}^\textrm{i} {\sum\limits_{n ={-} i}^i {[{G{{(u + 3m,v + 3n)}^L} - \overline {G{{(u,v)}^L}} } ]} } }^2}} \sqrt {{{\sum\limits_{m ={-} i}^i {\sum\limits_{n ={-} i}^i {[{G{{(u^{\prime} + 3m,v^{\prime} + 3n)}^R} - \overline {G{{(u^{\prime},v^{\prime})}^R}} } ]} } }^2}} }}, $$
$$\overline {G(u,v)} = \frac{1}{{{{(2i + 1)}^2}}}\sum\limits_{m ={-} i}^i {\sum\limits_{n ={-} i}^i {G(u + 3m,v + 3n)} }, $$
where $G{(u,v)^L}$ represents the intensity of point at the coordinate ${(u,v)^L}$ in the left image, and $G{(u^{\prime},v^{\prime})^R}$ represents the intensity of point at the coordinate ${(u^{\prime},v^{\prime})^R}$ in the right image. i is set to 2 in our experiment.
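A sketch of this sampled ZNCC follows; the row/column convention and the lack of border handling are again simplifying assumptions of the sketch.

```python
import numpy as np

def zncc_sampled(G_left, G_right, u, v, u2, v2, i=2, step=3):
    """Eqs. (14)-(15): single-frame ZNCC between two (2i+1)x(2i+1) windows
    sampled with a step of 3 pixels to avoid the repeated intensities
    introduced by Bayer interpolation."""
    offs = np.arange(-i, i + 1) * step
    a = G_left[np.ix_(v + offs, u + offs)].astype(np.float64)
    b = G_right[np.ix_(v2 + offs, u2 + offs)].astype(np.float64)
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum()) * np.sqrt((b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else -1.0
```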

Each channel of the camera has a different transmittance for the blue projection: the blue channel has the highest transmittance, the green channel a medium transmittance, and the red channel the lowest transmittance. Figure 10 shows the 3D reconstructions of the standard spheres with a projecting gray value of 150 using the proposed ST-BIF + ZNCC method for the R, G, and B channels and after three-channel fusion. As shown in Fig. 10(a), for the R channel with the lowest transmittance, the reconstruction result has many jump points and is sparse because of low SNR. As shown in Fig. 10(b), the reconstruction result of the G channel is smooth and dense. As shown in Fig. 10(c), for the B channel with the highest transmittance, only the outermost ring can be reconstructed because the pixels in the center of the sphere are saturated, but the reconstructed point cloud at the periphery is denser than that of the G and R channels. As shown in Fig. 10(d), after three-channel fusion, both the periphery and the center are remarkably improved compared with Figs. 10(a), (b), and (c).

Fig. 10. 3D reconstructions of standard spheres with a projecting gray value of 150 using the proposed ST-BIF + ZNCC method. (a) 3D reconstructions of R channel. (b) 3D reconstructions of G channel. (c) 3D reconstructions of B channel. (d) 3D reconstructions after three-channel fusion.

Figure 11 shows the RMSE values of spheres A and B with different projecting gray values using ZNCC matching method for single channel (R, G, and B channels), the proposed ST-BIF + ZNCC matching method for single channel (R, G, and B channels), and the proposed ST-BIF + ZNCC after three-channel fusion method. Table 1 shows the average RMSE of these different methods.

Fig. 11. RMSE with different projecting gray values using ZNCC for R channel (ZNCC R), G channel (ZNCC G), and B channel (ZNCC B); the proposed ST-BIF + ZNCC for R channel (ST-BIF + ZNCC R), G channel (ST-BIF + ZNCC G), and B channel (ST-BIF + ZNCC B); and the proposed ST-BIF + ZNCC after three-channel fusion (ST-BIF + ZNCC after fusion). (a) RMSE curves for sphere A. (b) RMSE curves for sphere B.

Table 1. Average RMSE of different methods.

From Fig. 11 and Table 1, the following findings are observed.

  • (1) The average RMSE of the proposed ST-BIF + ZNCC after three-channel fusion method is only 0.2516 mm, whereas that of the conventional ZNCC method is as high as 1.0668 mm. In addition, the RMSE curve of the proposed ST-BIF + ZNCC after three-channel fusion method shows the lowest error and the most stable overall performance, indicating the high robustness and accuracy of the proposed method in HDR measurement.
  • (2) The proposed ST-BIF + ZNCC method has a significantly lower RMSE than the conventional ZNCC matching method for each channel in most of the 12 projecting gray values, thereby providing higher accuracy and reliability. As shown by the black star line, the RMSE is further decreased after fusing the three channels, demonstrating that for every brightness level in the whole high dynamic range, more accurate information can be collected after three-channel fusion than from even the most effective single channel.

3.1.2 Coverage rate evaluation of point cloud

We realize 3D reconstruction for each scene. Their coverage rate $\gamma $, which is used for quantitative evaluation, is defined as:

$$\gamma = \frac{{{N_\textrm{m}}}}{{{N_p}}},$$
where ${N_\textrm{m}}$ is the measured 3D point number of the target, and ${N_\textrm{p}}$ represents the overall pixel number of the target.

According to Eq. (16), we calculate the coverage rates with different projecting gray values using the ZNCC matching method for a single channel (R, G, and B channels), the proposed ST-BIF + ZNCC matching method for a single channel (R, G, and B channels), and the proposed ST-BIF + ZNCC after three-channel fusion method, as shown in Fig. 12. Table 2 shows the average coverage rate of these methods.

Fig. 12. Coverage rate curves with different projecting gray values using ZNCC for R channel (ZNCC R), G channel (ZNCC G), and B channel (ZNCC B); the proposed ST-BIF + ZNCC for R channel (ST-BIF + ZNCC R), G channel (ST-BIF + ZNCC G), and B channel (ST-BIF + ZNCC B); the proposed ST-BIF + ZNCC after three-channel fusion (ST-BIF + ZNCC after fusion). (a) Coverage rate curves for sphere A. (b) Coverage rate curves for sphere B.

Table 2. Average coverage rate of different methods.

From Figs. 11, 12 and Table 2, the following findings are observed.

  • (1) The average coverage rate of the proposed ST-BIF + ZNCC after three-channel fusion method reaches 94.87%, whereas that of the conventional ZNCC method is 93.35%. In addition, the ST-BIF + ZNCC after three-channel fusion method yields the highest and most stable curve as the projecting gray value changes, further indicating the high robustness and efficiency of the proposed method for scenes with brightness changes in HDR measurement.
  • (2) The R, G, and B ST-BIF + ZNCC coverage curves show the characteristic of "as one falls, another rises" as the projecting gray value changes, indicating that the sensitive brightness ranges of the three channels are complementary. When the projecting gray value is 50, the efficiency of the R and G channels is reduced by the low signal-to-noise ratio, while the B channel is the most efficient, having the highest coverage rate of the three channels. As the projecting gray value increases, the coverage rate of the B channel decreases because of saturation. However, the coverage rate after three-channel fusion does not decrease but climbs, because the R and G channels start providing more information as their SNR rises. When the projecting gray value increases to 70, the G channel becomes the most efficient channel. With a further increase in brightness, the G channel gradually saturates, and the R channel becomes the most efficient channel for receiving information. Although each channel has only a limited sensitive brightness range, good reconstruction performance can still be maintained through three-channel fusion.
  • (3) The ST-BIF + ZNCC method has a slightly lower coverage rate than the conventional ZNCC matching method for each channel at most projecting gray values. However, after three-channel disparity fusion, both the coverage rate and the accuracy become the highest among all methods.

Furthermore, using a computer with an i5-6600K CPU and the MATLAB 2018b platform, without any parallel computing framework, we compared the processing time of the whole reconstruction process for the ST-BIF + ZNCC method and the ST-ZNCC method. The reconstruction target is the G channel of the standard spheres with a projecting gray value of 150. The reconstruction results of the two methods have similar RMSE and coverage rate. However, the running times of ST-BIF + ZNCC and ST-ZNCC are 259.9 s and 4187.7 s, respectively, reflecting the acceleration provided by the preliminary screening based on the ST-BIF descriptor.

3.2 Qualitative evaluation

We first conduct a set of comparative experiments to test the HDR measurement ability of our method. An HDR phase-shifting-based 3D measurement method with a similar system structure is used for comparison. In the comparison method, the HDR technique described in [12] is used to calculate the absolute phase maps, and the principle of binocular stereo vision is then used for 3D reconstruction. The comparison method uses a total of seven projected patterns, including four longitudinal fringe patterns with a period of 114 for four-step phase shifting and three Gray-code patterns for phase unwrapping. With the same system settings, we successively measure a ceramic vase and metal parts to evaluate the performance of the two HDR 3D reconstruction methods in specular reflection scenes.

Figure 13(a) shows the ceramic vase to be measured. Figure 13(b) shows the reconstruction result obtained with our method, which is smooth and complete. Figure 13(c) shows the reconstruction result obtained with the comparison method, which has some longitudinal ripples. The result in Fig. 13(c) also has some abnormally bright spots caused by sparse point clouds, which result from the high reflection intensity in the middle of the vase.

Fig. 13. Measurement target and reconstruction results with our method and comparison method. (a) Measurement target under uniform projection illumination. (b) Reconstruction result with our method. (c) Reconstruction result with comparison method.

Then we measure two specular metal parts, as shown in Fig. 14(a), which have a larger specular reflection area than the ceramic vase. Figure 14(b) shows the reconstruction result obtained with our method; although there is some noise, the reconstructed surface is relatively complete. Figure 14(c) shows the reconstruction result obtained with the comparison method; the right metal sheet cannot be reconstructed completely because of the large area of specular reflection.

Fig. 14. Measurement target and reconstruction results with our method and comparison method. (a) Measurement target under uniform projection. (b) Reconstruction result with our method. (c) Reconstruction result with comparison method.

Both methods can reconstruct high dynamic range objects with specular reflection. The proposed method needs only four frames and has a larger measurable dynamic range, but it computes relatively slowly. In contrast, the comparison method uses seven frames; it has a smaller measurable dynamic range but a shorter computation time.

Then we measure two objects with specular reflection and low-reflectivity parts to verify the performance of the proposed method in complex high dynamic range scenes. Figure 15(a) shows the two HDR targets. The left cat sculpture is a diffuse reflector with a range of reflectivity variations, and the right coffee artwork, with a smooth surface, causes specular reflection. Figure 15(b) shows the two HDR targets under speckle projection. The light part of the cat is overexposed because of its high reflectivity, and the dark coffee artwork is also overexposed because of specular reflection. The textures in the overexposed areas are lost, leading to reconstruction errors.

Fig. 15. Targets to be tested and experimental scene. (a) Ceramic sculptures. (b) Ceramic sculptures with blue speckle.

The experimental images and reconstruction results of the HDR targets are shown in Fig. 16. Figure 16(g) shows the targets with blue speckle captured by the left camera. Figures 16(a) and (b) show the R channel of Fig. 16(g) and the ST-BIF + ZNCC reconstruction of the R channel, respectively. The R channel gives a poor result because of low SNR; however, it collects more information in high-light areas because its low responsivity inhibits saturation. Figures 16(c) and (d) show the G channel of Fig. 16(g) and the ST-BIF + ZNCC reconstruction of the G channel, respectively. As the channel with medium transmittance, the G channel can reconstruct most areas with moderate reflectivity. However, the point clouds of the high-reflection area in the middle of the bottle and of the low-reflection area at the edge of the bottle are relatively sparse. Figures 16(e) and (f) show the B channel of Fig. 16(g) and the ST-BIF + ZNCC reconstruction of the B channel, respectively. The B channel gives a poor result because of saturation, but it is well suited to collecting information from dark places. After three-channel fusion, we obtain a dense and complete reconstruction, as shown in Fig. 16(h).

Fig. 16. Experimental images and reconstruction results of high dynamic range targets. (a) the R channel of (g). (b) the ST-BIF + ZNCC reconstruction model of R channel. (c) the G channel of (g). (d) the ST-BIF + ZNCC reconstruction model of G channel. (e) the B channel of (g). (f) the ST-BIF + ZNCC reconstruction model of B channel. (g) targets with blue speckle captured by left camera. (h) 3D reconstruction by our proposed ST-BIF + ZNCC with three channel fusion method.

3.3 Discussion

In this paper, we design a new high dynamic range 3D measurement system through hardware and software co-design to realize dense and complete reconstruction of complex high dynamic range scenes with only four frames. Our system has three advantages. (1) The temporal complementary speckle projection suppresses the complete loss of coding information in areas of too high or too low reflectivity in HDR scenes, and the corresponding matching strategy of ST-BIF descriptor preliminary screening and ST-ZNCC final retrieval improves both the robustness of high dynamic range measurement and the matching speed. (2) We evaluate the disparity reliability and then fuse the three-channel disparities together to generate a complete and accurate disparity map. Given that the three channels are effective in different brightness ranges, their fusion can remarkably improve the measurement dynamic range. Experiments show that this strategy further improves the measurement dynamic range on the basis of the single-channel ST-BIF + ZNCC measurement. (3) The system is simple and needs only four frames to reconstruct complex high dynamic range scenes including specular reflection and low-reflectance regions. The two strategies above can be used separately and easily combined with other methods.

However, this experiment uses only a blue projector to verify the feasibility of the method. Other colored light sources, such as red and green light, may give different effects. Considering the amount of computation, this paper uses only a pixel string containing 20 pixels for matching. In the future, we can take more samples in a larger window and investigate whether the performance of the method is affected by the size of the search window and the number of samples.

4. Conclusion

In this paper, we design a new high dynamic range 3D measurement system through hardware and algorithm co-design to realize dense and complete reconstruction of complex high dynamic range scenes with only four frames. In the system setting, we use temporal complementary speckle projection and three-channel disparity fusion to ensure that every region has the chance of being illuminated by both bright and dark spots and that each region can be recorded by the R, G, and B channels, which have different sensitive intensity intervals. These strategies increase the possibility that information at extreme brightness is collected. In the algorithms, we propose the corresponding ST-BIF + ZNCC matching strategy, reliability evaluation, and three-channel disparity fusion for efficient reconstruction. Quantitative experiments show that the ST-BIF + ZNCC method has higher accuracy than the common ZNCC method for a similar number of sampling points. Moreover, the three-channel disparity fusion strategy further improves the measurement dynamic range. A qualitative experiment shows dense and complete reconstruction of a complex HDR scene, including a specular reflection region.

Funding

National Natural Science Foundation of China (61735003, 61805011).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J.-S. Hyun, G. T.-C. Chiu, and S. Zhang, “High-speed and high-accuracy 3D surface measurement using a mechanical projector,” Opt. Express 26(2), 1474–1487 (2018). [CrossRef]  

2. L. Zhang, Q. Chen, C. Zuo, and S. Feng, “High-speed high dynamic range 3D shape measurement based on deep learning,” Optics and Lasers in Engineering 134, 106245 (2020). [CrossRef]  

3. B. Chen and B. Pan, “Calibration-free single camera stereo-digital image correlation for small-scale underwater deformation measurement,” Opt. Express 27(8), 10509–10523 (2019). [CrossRef]  

4. X. Liu, W. Chen, H. Madhusudanan, J. Ge, C. Ru, and Y. Sun, “Optical Measurement of Highly Reflective Surfaces From a Single Exposure,” IEEE Transactions on Industrial Informatics 17(3), 1882–1891 (2021).

5. J. Pages, J. Salvi, R. Garcia, and C. Matabosch, “Overview of coded light projection techniques for automatic 3D profiling,” in Proceedings of the IEEE International Conference on Robotics and Automation (ICRA ’03) (IEEE, 2003), Vol. 1, pp. 133–138.

6. K. Harding, “Industrial metrology: Engineering precision,” Nat. Photonics 2(11), 667–669 (2008). [CrossRef]  

7. T. Luhmann, S. Robson, S. Kyle, and J. Boehm, “Close-Range Photogrammetry and 3d Imaging, 2nd edition,” (Walter de Gruyter GmbH: Berlin, Germany, 2014).

8. Z. Li, K. Zhong, Y. F. Li, X. Zhou, and Y. Shi, “Multiview phase shifting: a full-resolution and high-speed 3D measurement framework for arbitrary shape dynamic objects,” Opt. Lett. 38(9), 1389–1391 (2013). [CrossRef]  

9. S. Feng, L. Zhang, C. Zuo, T. Tao, Q. Chen, and G. Gu, “High dynamic range 3-D measurements with fringe projection profilometry: A review,” Meas. Sci. Technol. 29(12), 122001 (2018). [CrossRef]  

10. S. Zhang and S.-T. Yau, “High dynamic range scanning technique,” Opt. Eng. 48(3), 030505 (2009). [CrossRef]  

11. S. Feng, Y. Zhang, Q. Chen, C. Zuo, R. Li, and G. Shen, “General solution for high dynamic range three-dimensional shape measurement using the fringe projection technique,” Optics and Lasers in Engineering 59, 56–71 (2014). [CrossRef]  

12. Y. Yin, Z. Cai, H. Jiang, X. Meng, J. Xi, and X. Peng, “High dynamic range imaging for fringe projection profilometry with single-shot raw data of the color camera,” Optics and Lasers in Engineering 89, 138–144 (2017). [CrossRef]  

13. S. Feng, Q. Chen, C. Zuo, and A. Asundi, “Fast three-dimensional measurements for dynamic scenes with shiny surfaces,” Opt. Commun. 382, 18–27 (2017). [CrossRef]  

14. D. Palousek, M. Omasta, D. Koutny, J. Bednar, T. Koutecky, and F. Dokoupil, “Effect of matte coating on 3D optical measurement accuracy,” Opt. Mater. 40, 1–9 (2015). [CrossRef]  

15. C. Waddington and J. Kofman, “Saturation avoidance by adaptive fringe projection in phase-shifting 3D surface-shape measurement,” 2010 International Symposium on Optomechatronic Technologies. IEEE, (2011).

16. H. Jiang, H. Zhao, and X. Li, “High dynamic range fringe acquisition: A novel 3-D scanning technique for high-reflective surfaces,” Optics and Lasers in Engineering 50(10), 1484–1493 (2012). [CrossRef]  

17. L. Rao and F. Da, “High dynamic range 3D shape determination based on automatic exposure selection,” Journal of Visual Communication and Image Representation 50, 217–226 (2018). [CrossRef]  

18. H. Lin, J. Gao, Q. Mei, Y. He, J. Liu, and X. Wang, “Adaptive digital fringe projection technique for high dynamic range three-dimensional shape measurement,” Opt. Express 24(7), 7703–7718 (2016). [CrossRef]  

19. D. Li and J. Kofman, “Adaptive fringe-pattern projection for image saturation avoidance in 3D surface-shape measurement,” Opt. Express 22(8), 9887–9901 (2014). [CrossRef]  

20. G. Babaie, M. Abolbashari, and F. Farahi, “Dynamics range enhancement in digital fringe projection technique,” Precis. Eng. 39, 243–251 (2015). [CrossRef]  

21. Y. Zheng, Y. Wang, V. Suresh, and B. Li, “Real-time high-dynamic-range fringe acquisition for 3D shape measurement with a RGB camera,” Meas. Sci. Technol. 30(7), 075202 (2019). [CrossRef]  

22. S. K. Nayar, K. Ikeuchi, and T. Kanade, “Surface reflection: physical and geometrical perspectives,” IEEE Transactions on Pattern Analysis and Machine Intelligence 13(7), 611–634 (1991).

23. Z. Cai, X. Liu, X. Peng, Y. Yin, A. Li, J. Wu, and B. Z. Gao, “Structured light field 3D imaging,” Opt. Express 24(18), 20324–20334 (2016). [CrossRef]  

24. P. Zhou, J. Zhu, and H. Jing, “Optical 3-D surface reconstruction with color binary speckle pattern encoding,” Opt. Express 26(3), 3452–3465 (2018). [CrossRef]  

25. F. Gu, Z. Song, and Z. Zhao, “Single-Shot Structured Light Sensor for 3D Dense and Dynamic Reconstruction,” Sensors 20(4), 1094 (2020). [CrossRef]  

26. A. Wiegmann, H. Wagner, and R. Kowarschik, “Human face measurement by projecting bandlimited random patterns,” Opt. Express 14(17), 7692–7698 (2006). [CrossRef]  

27. F. Zhong, R. Kumar, and C. Quan, “RGB laser speckles based 3D profilometry,” Appl. Phys. Lett. 114(20), 201104 (2019). [CrossRef]  

28. L. Yu and B. Pan, “Full-frame, high-speed 3D shape and deformation measurements using stereo-digital image correlation and a single color high-speed camera,” Optics & Lasers in Engineering 95, 17–25 (2017). [CrossRef]  

29. M. Schaffer, M. Grosse, B. Harendt, and R. Kowarschik, “Fast 3D shape measurements using laser speckle projection,” Proceedings of SPIE - The International Society for Optical Engineering 8082, 808219 (2011). [CrossRef]  

30. J. Banks and P. Corke, “Quantitative Evaluation of Matching Methods and Validity Measures for Stereo Vision,” The International Journal of Robotics Research 20(7), 512–532 (2001). [CrossRef]  

31. R. Zabih and J. Woodfill, “Non-parametric Local Transforms for Computing Visual Correspondence,” (Springer-Verlag, Berlin, Heidelberg, 1994), pp. 151–158.

32. H. Hirschmuller and D. Scharstein, “Evaluation of Stereo Matching Costs on Images with Radiometric Differences,” IEEE Transactions on Pattern Analysis and Machine Intelligence 31(9), 1582–1599 (2009).

33. M. Weber, M. Humenberger, and W. Kubinger, “A very fast census-based stereo matching implementation on a graphics processing unit,” in IEEE International Conference on Computer Vision Workshops (IEEE, 2009), pp. 786–793.
