Optica Publishing Group

Temporal-spatial binary encoding method based on dynamic threshold optimization for 3D shape measurement

Open Access

Abstract

The binary encoding method has been widely used for three-dimensional (3D) shape measurement due to the high-speed projection characteristics of digital micromirror device (DMD)-based projectors. However, traditional binary encoding methods require a larger defocus to achieve good sinusoidality, leading to a reduction in the measurement depth of field and in the signal-to-noise ratio (SNR) of captured images, which can adversely affect the accuracy of phase extraction, particularly for the high-frequency fringes used in 3D reconstruction. This paper proposes a temporal-spatial binary encoding method based on dynamic threshold optimization for 3D shape measurement. The proposed method decomposes an 8-bit sinusoidal fringe pattern into multiple (K) binary patterns in two steps: determining the dynamic threshold and then performing temporal-spatial error diffusion encoding. By using an integral imaging strategy, approximately sinusoidal patterns can be obtained under nearly focused projection, which can then be subjected to absolute phase unwrapping and 3D reconstruction. The experiments show that, compared to three comparative algorithms under the same experimental conditions, the proposed method reduces the reconstruction error of measuring a plane and an object by at least 13.66% and 12.57%, respectively, when K=2. The dynamic experiment on a palm confirms that the proposed method can reliably reconstruct the 3D shape of a moving object.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Fringe projection profilometry (FPP) is a 3D measurement technique known for its non-contact, high-precision, and whole-field advantages, and it is used in a variety of applications, such as mechanical measurement, industrial monitoring, and dental reconstruction [1,2]. However, balancing accuracy and speed in 3D measurement is still challenging for traditional FPP in many important applications [3,4]. The reasons are: (1) The intensity range of the 8-bit sinusoidal fringe patterns projected in traditional FPP is 0-255, yet the maximum switching rate of 8-bit sinusoidal fringe patterns projected by a DMD-based projector is usually limited to 120Hz. It is therefore not feasible to use an off-the-shelf digital projector to perform a high-speed task with sinusoidal fringe pattern projection. (2) If careful nonlinearity correction is not carried out in advance [5], the nonlinearity of DLP projectors may degrade the quality of the projected sinusoidal fringe patterns. For this purpose, nonlinear correction techniques [6] can be used to compensate for the phase errors caused by the nonlinearity and improve measurement accuracy.

Lately, numerous researchers have dedicated themselves to investigating binary encoding techniques to accomplish both high speed and high precision in 3D shape measurement. Because binary encoding patterns only have two grayscale values (0 and 255), the nonlinearity can be avoided, and high-speed projection can be achieved by virtue of DLP technology with its inherent high binary pattern switching rate. Approximately sinusoidal fringe patterns with reduced contrast can then be created through defocused projection, allowing 3D measurement at a rapid speed while maintaining or even improving measurement accuracy. Typical binary encoding techniques can be classified into two categories: one-dimensional (1D) modulation encoding and two-dimensional (2D) modulation encoding.

The early binary encoding methods were mostly designed with 1D modulation. Lei et al. [7] proposed squared binary modulation (SBM), but a large defocus was required to suppress high-order harmonics when the fringe period was large. To generate high-quality sinusoidal fringes under a small projector defocus, Ayubi et al. [8] borrowed the idea of sinusoidal pulse width modulation (SPWM) from power electronics and applied it to binary defocusing 3D measurement. By shifting non-fundamental components to higher frequencies, high-order harmonics can be eliminated even with a small defocus, resulting in better sinusoidal characteristics of the fringe images. Later, Wang et al. [9] proposed optimal pulse width modulation (OPWM), which could eliminate harmonics of specific orders in the square wave, making it possible to obtain high-quality sinusoidal fringes after defocused projection and effectively improving the phase quality. Zuo et al. [10] studied the sensitivity of N-step phase-shifting to different high-order harmonic components and found that, if the four-step phase-shifting method is used, phase errors are affected only by odd-order harmonics. The authors further proposed three-level SPWM [11] and applied it to the reconstruction of dynamic scenes. Cai et al. [12] introduced a 3D shape measurement method based on 1D temporal-spatial binary encoding. This method utilized one spatial encoding mode and two temporal encoding modes to obtain a 6-bit code word, which could distinguish 64 fringe periods in sinusoidal phase-shifting images and achieve absolute phase unwrapping. However, extracting 6-bit code words from the 1D mode required a complex decoding scheme.

Research has also been conducted on 2D modulation techniques, which include area modulation, ordered Bayer dithering, and error diffusion. Su et al. [13] proposed an area modulation method using micro-machined gratings to create sinusoidal fringe patterns. Wang et al. [14] used the Bayer dithering algorithm to encode the sinusoidal pattern, generating good phase quality for wider fringe periods; however, the visible repetitive texture structures it introduces lead to poor phase quality otherwise. Among these approaches, the spatial error diffusion (ED) method and its variants [15-20] have been most widely investigated and applied, as they are generally a more reliable and efficient way to improve the encoding quality of binary patterns than 1D modulation.

The spatial ED methods calculate the "quantization error" for each pixel when binarizing the grayscale pattern, and the quantization error is subsequently diffused to the adjacent pixels rather than simply discarded, creating a more accurate representation of the sinusoidal fringe pattern. Typical ED methods include Floyd-Steinberg dithering, Sierra Lite dithering, Stucki dithering, etc. These methods have different error diffusion kernels, which are designed to spread the error as naturally and smoothly as possible across the pattern, but they all adopt the same quantization threshold (0.5), a predefined value that determines whether a pixel should be replaced with black (0) or white (1). Large encoding errors therefore remain in some areas, resulting in large phase jump errors in the final result and a loss of accuracy in the 3D reconstruction. Obviously, the quantization threshold can be further optimized to improve encoding quality.
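For concreteness, spatial ED with the fixed threshold 0.5 and the standard Floyd-Steinberg weights can be sketched as follows (a minimal illustration, not the paper's exact implementation; the function name is hypothetical and intensities are assumed normalized to [0, 1]):

```python
import numpy as np

def floyd_steinberg(gray):
    """Binarize a normalized grayscale pattern (values in [0, 1]) with
    Floyd-Steinberg error diffusion and the fixed threshold 0.5."""
    img = gray.astype(float).copy()
    h, w = img.shape
    out = np.zeros_like(img)
    for v in range(h):
        for u in range(w):
            old = img[v, u]
            new = 1.0 if old >= 0.5 else 0.0      # fixed quantization threshold
            out[v, u] = new
            err = old - new                        # quantization error
            # diffuse the error to unprocessed neighbours (weights /16)
            if u + 1 < w:
                img[v, u + 1] += err * 7 / 16      # right
            if v + 1 < h:
                if u - 1 >= 0:
                    img[v + 1, u - 1] += err * 3 / 16   # down-left
                img[v + 1, u] += err * 5 / 16           # down
                if u + 1 < w:
                    img[v + 1, u + 1] += err * 1 / 16   # down-right
    return out
```

Because the diffused errors feed back into subsequent pixels, the average intensity of the binary output stays close to that of the input pattern.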

Scholars have also suggested optimizing the diffusion kernel in spatial ED methods. Zhou et al. [16] proposed a kernel optimization algorithm based on intensive search, which is time-consuming, while Zhu et al. [17] proposed using a genetic algorithm to speed up the optimization process.

The fixed threshold in spatial ED does not fully exploit the characteristics of sinusoidal fringes, so non-fixed-threshold dithering methods, such as Zheng's sigmoid-function and gradient-descent algorithm [18], have emerged instead. However, careful parameter design is necessary to avoid local optima. Cai et al. [21] introduced threshold optimization, but its effective measurement depth range is limited because of defocused projection. The quality of binary encoding patterns is further improved by optimization algorithms. Common optimization objectives include the intensity-based technique [22], the phase-based technique [23], the structural similarity error [20], the frequency-domain signal error, or their combinations, which can be applied to the entire pattern, a pattern block, or the diffusion kernel.

The projector needs to be defocused for the vast majority of spatial ED methods in order to effectively suppress the higher-order harmonics contained in the binary encoding patterns and to approach ideal sinusoidal fringes. However, defocused projection inevitably leads to a reduction in the measurement depth and a decline in the SNR of captured images, which can adversely affect the accuracy of phase extraction, particularly for high-frequency fringes [24] in the final 3D reconstruction. To address this issue, temporal-spatial binary encoding methods have been proposed to ensure that the binary fringes maintain good sinusoidal properties even under focused projection. Ayubi et al. [8] decomposed one sinusoidal fringe pattern into eight binary patterns, heavily increasing the number of projected patterns, which was also accompanied by a complex and time-consuming decoding process. Alternatively, Li et al. [25] proposed sampling a sinusoidal curve to generate a series of square binary patterns with different widths. Zhu et al. [26] proposed a temporal-spatial binary (TSB) encoding method, which evenly divides the continuous grayscale interval [0,1] into $K$ intervals during the error diffusion process and applies $K$ corresponding fixed quantization thresholds to binarize them. One sinusoidal fringe pattern is binarized into $K$ binary patterns, and after integral imaging, a corresponding approximate sinusoidal fringe is obtained.

The purpose of this study is to achieve fast and accurate 3D measurement through temporal-spatial encoding of sinusoidal fringes, exploiting the high-speed projection characteristics of DMD-based projectors. In this paper, we propose a new dynamic threshold optimization model to generate temporal-spatial binary encoding patterns, which enables the encoded patterns to achieve smaller and more uniformly distributed phase errors. We modify the fixed thresholds using the grayscale values of local regions and derive the optimal weight coefficient through simulation experiments. We calculate the binary patterns for each interval by encoding the sinusoidal fringe pattern along the temporal axis and over the spatial coordinates. Finally, approximate sinusoidal fringes are obtained through integral imaging, and a phase analysis algorithm is used to acquire the absolute phase map. By combining it with the system calibration parameters, the 3D shape of the tested object is retrieved. The experimental results indicate that the proposed temporal-spatial binary encoding method with dynamic threshold optimization allows binary fringes to be used in high-speed and high-precision measurement scenarios even when the projector is nearly in focus. This greatly expands the measurement range of binary fringes and provides a depth measurement range equivalent to that of standard sinusoidal fringes.

2. Temporal-spatial binary fringe encoding based on dynamic threshold optimization

2.1 Encoding principle

Both the idea of temporal binary encoding and spatial binary encoding are carried over into the temporal-spatial binary fringe encoding. While spatial binary encoding refers to encoding within an image and the currently processed pixel is related to its surrounding pixels, temporal binary encoding refers to encoding along the time axis [26]. Temporal encoding often performs better than spatial encoding in terms of decoding and measurement accuracy.

Using the temporal-spatial binary encoding approach we provide, one computer-generated 8-bit standard gray sinusoidal fringe pattern yields $K$ ($K \geq 2$) binary fringe patterns. One experimental sinusoidal fringe image can be produced with integral imaging (the temporal-spatial binary fringe patterns are projected onto the target in rapid succession and integrated on the camera within one exposure) by projecting the sequence in focus; the process is illustrated in Figs. 1(a)-(c). The intensity-normalized sinusoidal fringe pattern $I_{s,i}$ shown in Fig. 1(a) is expressed by:

$$I_{s,i}(u,v) = 0.5 + 0.5\cos \left(2\pi \frac{u}{P} + 2\pi \frac{i}{N}\right), i = 0,1,\ldots,N - 1$$
where $(u,v)$ represents the pixel coordinates, $N$ is the number of phase-shifting steps, and $P$ represents the fringe period. Each pattern with phase shift $i$ undergoes the encoding process shown in Fig. 1.
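The phase-shifted patterns of Eq. (1) can be generated as in the following sketch (a hypothetical helper; fringes are assumed to vary along the $u$ axis, consistent with Eq. (1)):

```python
import numpy as np

def sinusoidal_patterns(height=256, width=256, P=64, N=4):
    """Generate N normalized phase-shifted sinusoidal fringe patterns
    (intensity in [0, 1], fringes varying along the u axis), per Eq. (1)."""
    u = np.arange(width)
    patterns = []
    for i in range(N):
        row = 0.5 + 0.5 * np.cos(2 * np.pi * u / P + 2 * np.pi * i / N)
        patterns.append(np.tile(row, (height, 1)))
    return patterns
```

Note that for $N = 4$ the patterns with shifts $0$ and $\pi$ are complementary: their sum is the constant 1, a quick sanity check on the generated set.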


Fig. 1. The 2D and 1D schematic diagram of the temporal-spatial binary encoding process when $K$=4: (a) sinusoidal fringe pattern $I_s$; (b) temporal-spatial binary encoding patterns $\{B_k\}$; (c) the approximate sinusoidal fringe pattern $\tilde {I}_s$ after low-pass filtering by the system point spread function and integral imaging; (d) the detailed 1D curve of the sinusoidal fringe pattern $I_s$; (e) 1D encoding patterns $\{B_k\}$ and the integral-imaging pattern $\tilde {I}_s$.


First, $K$ ($K \geq 2$, an integer) intervals are evenly created from the normalized intensity $I_s(u,v)$ (the phase shift $i$ is omitted hereafter). For instance, the intensity intervals are [0, 0.25), [0.25, 0.5), [0.5, 0.75), and [0.75, 1] when $K=4$. As a result, each of the $K=4$ intensity intervals corresponds to one binary fringe pattern, as shown in Fig. 1(b). In general, we obtain $K$ binary fringe patterns $B_k$, $k$=1, 2,…, $K$, with corresponding intensity intervals $[(k-1)/K, k/K)$. Equation (2) takes the average of the interval maximum $k/K$ and minimum $(k-1)/K$ as the quantization threshold $\epsilon _k$ for each interval.

$${\epsilon_k} = \left[\frac{k}{K} + \frac{{(k - 1)}}{K}\right]/2= \frac{{2k - 1}}{{2K}}\left( {k = 1,2,3, \ldots ,K} \right)$$

An asymmetric error diffusion kernel $H_{kernel}$ is used as the set of coefficients controlling the proportion of the quantization error transferred to the neighborhood pixels $\Omega$, in order to distribute the quantization error linearly and evenly to the surrounding pixels and thereby avoid local accumulation:

$$H_{kernel} = \frac{1}{{16}}\left[ \begin{array}{ccc} - & \bullet & 7\\ 3 & 5 & 1\\ \end{array} \right]$$
where "$-$" denotes an already processed pixel and "$\bullet$" denotes the pixel currently being processed. The quantization error produced by encoding is the difference between the intermediate image, which includes the changes from previously processed pixels, and the quantized result $B_k(u, v)$ (see the description below Eq. (6)). The quantization error, multiplied by the relevant coefficient in Eq. (3), is then spatially diffused to the unprocessed neighboring pixels. The complete procedure can be described as follows.
$$I_s'(u,v) = {I_s}(u,v) + \sum_{m,n \in \Omega} {[H_{kernel}(m,n){E_k}(u - m,v - n)]}$$
$$B_k(u,v) = \left\{ \begin{array}{cc} 0, &k = j \& I_s'(u,v) < \epsilon_k \\ 1, &k = j \& I_s'(u,v) \geq \epsilon_k \\ 0, &k > j \\ 1, &k < j\\ \end{array} \right.$$
where $I_s'$ represents the intermediate image after quantization error diffusion of the original sinusoidal fringe, $(u,v)$ represents the pixel coordinates, and $B_k(u,v)$ represents the $k$-th binary encoding image to be generated, with $k = 1,2,\ldots,K$. A pixel $(u,v)$ in image $I_s'$ that falls into the $k$-th intensity interval is binarized to generate $B_k(u,v)$ according to the quantization threshold $\epsilon _k$ in Eq. (2), while the values of the other binary images $B_j(u,v)$ (where $j \neq k$) are determined by the interval the pixel belongs to.

Regarding the temporal encoding: when spatial encoding is applied to a specific pixel inside intensity interval $k$, the same pixel in the other intensity intervals is encoded according to the specially designed criteria in Eq. (5).

The quantization error for the current pixel is characterized by Eq. (6). As mentioned previously, the maximum quantization error is $Q_{max}=1/(2K)$.

$${E_k}(u,v) = \left\{ \begin{array}{cc} I_s'(u,v) - k/K, &if \quad I_s'(u,v) > \epsilon_k \\ I_s'(u,v) - (k - 1)/K, &if \quad I_s'(u,v) \leq \epsilon_k \\ \end{array} \right.$$

To give a clearer understanding of the temporal-spatial binary encoding method, we illustrate the 1D encoding process with $K$=4 as an example, as shown in Figs. 1(d)-(e). First, as the intensity of the current pixel $p$ =(3, 0.3087) (marked by the red dot in Fig. 1(d)) falls within the second intensity interval [0.25, 0.5), its intensity 0.3087 is compared with the quantization threshold $\epsilon _2 = 3/8$ (see Eq. (2)). Since 0.3087 is less than $\epsilon _2$, the corresponding pixel position in $B_2$ is assigned 0. Next, the intensity 0.3087 is greater than the first intensity interval and less than the third and fourth intervals. Therefore, the binarization results at the corresponding position $p$ in $B_1$, $B_3$, and $B_4$ are determined to be 1, 0, and 0, respectively.
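The per-pixel quantization rule of Eqs. (2), (5), and (6) can be sketched as follows (a simplified illustration with a hypothetical function name; the spatial error diffusion of Eq. (4) is omitted here):

```python
import numpy as np

def ts_quantize(val, K):
    """Quantize one error-corrected intensity value in [0, 1] into K binary
    values B_1..B_K using the fixed thresholds of Eq. (2); return the binary
    vector and the quantization error of Eq. (6)."""
    j = max(0, min(int(val * K), K - 1))   # 0-based index of the interval the value falls in
    eps = (2 * (j + 1) - 1) / (2 * K)      # fixed threshold for this interval, Eq. (2)
    B = np.zeros(K)
    B[:j] = 1.0                            # lower intervals are encoded as 1, per Eq. (5)
    B[j] = 1.0 if val >= eps else 0.0      # current interval: threshold comparison
    # Eq. (6): error relative to the interval bound chosen by the binarization
    err = val - (j + 1) / K if val > eps else val - j / K
    return B, err
```

For the worked example above (intensity 0.3087, $K$=4), this returns $B$ = (1, 0, 0, 0) and a quantization error of 0.3087 - 1/4 = 0.0587, below $Q_{max}$ = 1/8.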

Generally, $K$=4 means that a sinusoidal pattern can be decomposed into four binary encoding patterns, denoted as $\{ {B_1},{B_2},{B_3},{B_4}\}$. The intensity range [0,1] of the sinusoidal pattern is divided into four intervals: [0-1/8, 1/4), [1/4, 1/2), [1/2, 3/4), and [3/4, 1+1/8) (the outer intervals are extended by the maximal quantization error $Q_{max}=1/8$). In fixed-threshold temporal-spatial binary encoding methods [26], the thresholds for the four intervals are 1/8, 3/8, 5/8, and 7/8. For the quantization-error-corrected image $I_s'$, the pixel values can be classified into four intervals based on their intensities:

(1) Interval 1: For pixels $(u,v)$ with intensities within the first interval [0-1/8, 1/4): If $I_s'(u,v)$ is less than 0+1/40=1/40, ${B_1}(u,v)=0$. If $I_s'(u,v)$ is greater than 1/4-1/40=9/40, ${B_1}(u,v)=1$. If $I_s'(u,v)$ is within the range of [1/40, 9/40], the original threshold $\epsilon _k$ in Eq. (12) is set to 1/8. Subsequently, the dynamic quantization threshold $\hat {\epsilon }_k(u,v)$ can be calculated according to Sec 2.2. If $I_s'(u,v)$ is less than $\hat {\epsilon }_k(u,v)$, ${B_1}(u,v)=0$; if $I_s'(u,v)$ is greater than $\hat {\epsilon }_k(u,v)$, ${B_1}(u,v)=1$. The corresponding quantization error for pixels in this interval is:

$$E_1(u,v)=\left\{ \begin{array}{cc} I_s'(u,v)-0, & while \quad B_1(u,v) = 0 \\ I_s'(u,v)-1/4, & while \quad B_1(u,v) = 1 \\ \end{array} \right.$$

(2) Interval 2: For pixels $(u,v)$ with intensities in the second interval [1/4, 1/2): If $I_s'(u,v)$ is less than 1/4+1/40=11/40, ${B_2}(u,v)=0$. If $I_s'(u,v)$ is greater than 1/2-1/40=19/40, ${B_2}(u,v)=1$. If $I_s'(u,v)$ is within the range of [11/40, 19/40], the original threshold $\epsilon _k$ in Eq. (12) is set to 3/8. Subsequently, the dynamic quantization threshold $\hat {\epsilon }_k(u,v)$ can be calculated according to Sec 2.2. If $I_s'(u,v)$ is less than $\hat {\epsilon }_k(u,v)$, ${B_2}(u,v)=0$; if $I_s'(u,v)$ is greater than $\hat {\epsilon }_k(u,v)$, ${B_2}(u,v)=1$. The corresponding quantization error for pixels in this interval is:

$$E_2(u,v)=\left\{ \begin{array}{cc} I_s'(u,v)-1/4, & while \quad B_2(u,v) = 0 \\ I_s'(u,v)-1/2, & while \quad B_2(u,v) = 1 \\ \end{array} \right.$$

(3) Interval 3: For pixels $(u,v)$ with intensities in the third interval [1/2, 3/4): If $I_s'(u,v)$ is less than 1/2 + 1/40 = 21/40, ${B_3}(u,v)=0$. If $I_s'(u,v)$ is greater than 3/4 - 1/40 = 29/40, ${B_3}(u,v)=1$. If $I_s'(u,v)$ is within the range of [21/40, 29/40], the original threshold $\epsilon _k$ in Eq. (12) is set to 5/8. Subsequently, the dynamic quantization threshold $\hat {\epsilon }_k(u,v)$ can be calculated according to Sec 2.2. If $I_s'(u,v)$ is less than $\hat {\epsilon }_k(u,v)$, ${B_3}(u,v)=0$; if $I_s'(u,v)$ is greater than $\hat {\epsilon }_k(u,v)$, ${B_3}(u,v)=1$. The corresponding quantization error for pixels in this interval is:

$$E_3(u,v)=\left\{ \begin{array}{cc} I_s'(u,v)-1/2, & while \quad B_3(u,v) = 0 \\ I_s'(u,v)-3/4, & while \quad B_3(u,v) = 1 \\ \end{array} \right.$$

(4) Interval 4: For pixels $(u,v)$ with intensities in the fourth interval [3/4, 1+1/8): If $I_s'(u,v)$ is less than 3/4 + 1/40 = 31/40, ${B_4}(u,v)=0$. If $I_s'(u,v)$ is greater than 1 – 1/40 = 39/40, ${B_4}(u,v)=1$. If $I_s'(u,v)$ is within the range of [31/40, 39/40], the original threshold $\epsilon _k$ in Eq. (12) is set to 7/8. Subsequently, the dynamic quantization threshold $\hat {\epsilon }_k(u,v)$ can be calculated according to Sec 2.2. If $I_s'(u,v)$ is less than $\hat {\epsilon }_k(u,v)$, ${B_4}(u,v)=0$; if $I_s'(u,v)$ is greater than $\hat {\epsilon }_k(u,v)$, ${B_4}(u,v)=1$. The corresponding quantization error for pixels in this interval is:

$$E_4(u,v)=\left\{ \begin{array}{cc} I_s'(u,v)-3/4, & while \quad B_4(u,v) = 0 \\ I_s'(u,v)-1, & while \quad B_4(u,v) = 1 \\ \end{array} \right.$$

2.2 Dynamic threshold optimization

The binary encoding process has been introduced in section 2.1, and the principle of dynamic threshold optimization is described in the following equations.

The boundary values $A_{min}$ and $A_{max}$, and the intermediate value $A_{avg}$, for the $k$-th intensity interval $[(k-1)/K, k/K]$ are calculated by the following equation.

$$\left\{ \begin{array}{ll} A_{min} = (k - 1)/K + 1/(10K)\\ A_{max} = k/K - 1/(10K)\\ A_{avg} = \left(A_{min} + A_{max}\right)/2 \\ \end{array} \right.$$

The pixel intensity value $I_s'(u,v)$ is compared with the boundary values $A_{min}$ and $A_{max}$ of the $k$-th intensity interval $[(k-1)/K, k/K]$ to determine where in the interval it lies. For pixel intensities that fall in the first 1/10 or last 1/10 of the current interval, the encoding errors are very small, and binarization can be performed directly without considering the quantization threshold: if $I_s'(u,v) \in [(k - 1)/K,A_{min}]$, the corresponding value is 0; if $I_s'(u,v) \in [A_{max},k/K]$, the corresponding value is 1. For pixel intensities that fall in the middle 8/10 of the current interval, $[A_{min},A_{max}]$, a dynamic threshold is used to perform the binarization, which greatly reduces the errors generated by the encoding step; the remaining error is then diffused to the surroundings through the error diffusion kernel along the scanning path, so the overall error distribution is smaller and more uniform. The quantization threshold of the TSB method [26] is the fixed value $\epsilon _k$, but here $\hat {\epsilon }_k$ is no longer fixed and changes dynamically within the middle 8/10 subregion.

Equation (12) is used to calculate the dynamic threshold. Note that the original global threshold is the mid-point of each gray-level interval. The weight $\gamma$ of the original global threshold $\epsilon _k$ in Eq. (12) is determined by the simulation experiment.

$$\hat{\epsilon}_k = \gamma \cdot \epsilon_k + (1 - \gamma) \cdot I_{avg}$$
where $I_{avg}$ is the local intensity average within the neighborhood $\Omega (u,v)$ (a $W_x \times W_y$ window) in the updated sinusoidal image $I_s'$, as expressed by:
$$I_{avg}(u,v) = \frac{1}{W_x\cdot W_y}\sum_{(\tilde{u},\tilde{v})\in\Omega(u,v)} {I_s'(\tilde{u},\tilde{v})}$$

The weight $\gamma$ takes values of 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, and 0.8 in the simulation experiments. Different values of $\gamma$ yield different distributions of quantization thresholds, which affects the phase result of the temporal-spatial binary encoding. Using the root mean square error (RMSE) of the phase of the temporal-spatial binary encoding image as the evaluation metric, the best $\gamma$ value is selected for the dynamic threshold optimization algorithm.

In the simulation process, a $3 \times 3$ Gaussian filter with standard deviation 1, approximating the low-pass filtering of a nearly in-focus projector, is used to blur the temporal-spatial binary encoding patterns, and the phase of the standard sinusoidal fringe is taken as the reference value. For $K$=2 and $K$=4, the simulation results are shown in Fig. 2. From the error distribution curves, it can be seen that the minimum phase error is obtained for the majority of fringe periods when $\gamma$ = 0.4.
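The dynamic threshold of Eqs. (12)-(13), together with the direct-binarization bands of this section, can be sketched as follows (hypothetical helper names; $\gamma$ = 0.4 and a $3\times3$ window are taken from the simulation above):

```python
import numpy as np

def local_mean(I_corr, u, v, win=3):
    """Local intensity average I_avg of Eq. (13) over a win x win window
    centred at (u, v), clipped at the image borders."""
    r = win // 2
    patch = I_corr[max(v - r, 0):v + r + 1, max(u - r, 0):u + r + 1]
    return patch.mean()

def binarize_dynamic(val, I_avg, k, K, gamma=0.4):
    """Three-way rule of Sec. 2.2 for a value in interval k (1-based):
    direct 0/1 in the outer 1/10 bands, dynamic threshold in the middle 8/10."""
    A_min = (k - 1) / K + 1 / (10 * K)
    A_max = k / K - 1 / (10 * K)
    if val <= A_min:
        return 0                               # front 1/10 of the interval
    if val >= A_max:
        return 1                               # back 1/10 of the interval
    eps_k = (2 * k - 1) / (2 * K)              # fixed mid-interval threshold, Eq. (2)
    th = gamma * eps_k + (1 - gamma) * I_avg   # dynamic threshold, Eq. (12)
    return 1 if val >= th else 0
```

The blend in Eq. (12) pulls the threshold toward the local mean, so regions that are locally brighter or darker than the mid-interval value are binarized with a correspondingly shifted cut-off.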


Fig. 2. The selection strategy of $\gamma$: (a) $K$ =2; (b) $K$ = 4.


As shown in Fig. 3, when $K$=4, according to the dynamic threshold based temporal-spatial binary encoding process described above, a sinusoidal pattern is decomposed into four binary fringe patterns. Taking four-step phase shifting as an example, a total of 16 binary fringe patterns are generated, and these patterns are used for 3D shape measurements.


Fig. 3. When $K$=4, the phase calculation process of the four-step phase-shifting binary fringe patterns in the simulation experiment:(a)standard four-step sinusoidal fringe patterns; (b) the proposed temporal-spatial binary encoding patterns; (c) approximate sinusoidal fringe patterns by integral-imaging; (d)the wrapped phase and the absolute phase.


In the simulation, the $K$ binary fringe patterns are averaged to form a sinusoidal pattern by integral imaging [26]. Four-step phase-shifting with phase shifts 0, $\pi$/2, $\pi$, and 3$\pi$/2 is used to obtain the wrapped phase, and the final absolute phase is calculated using the temporal phase unwrapping algorithm [4]. A $3 \times 3$ Gaussian filter with standard deviation 1 is used to simulate the point spread function (PSF) of a real measurement system.
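The integral imaging and four-step phase retrieval described above can be sketched as follows (hypothetical function names; the standard four-step arctangent formula is assumed):

```python
import numpy as np

def integral_image(binary_patterns):
    """Integral imaging: average the K binary patterns belonging to one
    phase shift to recover an approximate sinusoidal fringe pattern."""
    return np.mean(binary_patterns, axis=0)

def wrapped_phase_4step(I0, I1, I2, I3):
    """Wrapped phase from four fringe images with shifts 0, pi/2, pi, 3*pi/2,
    assuming I_i = a + b*cos(phi + i*pi/2) (standard arctangent formula)."""
    return np.arctan2(I3 - I1, I0 - I2)
```

The returned phase is wrapped to $(-\pi, \pi]$; temporal phase unwrapping (not sketched here) then removes the $2\pi$ ambiguities.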

3. Simulation experiment

To verify the effectiveness of the proposed method, four methods are compared: (1) the spatial binary encoding method based on the Floyd-Steinberg error diffusion algorithm, referred to as the "FS" method; (2) the FS method with the dynamic threshold strategy introduced in Sec 2.2, referred to as the "FSDT" method; (3) the traditional temporal-spatial binary encoding method, referred to as the "TSB" method [26]; (4) the proposed temporal-spatial binary encoding based on dynamic thresholds, referred to as the "Proposed" method.

3.1 Dynamic threshold analysis and its impact on sinusoidal fringe patterns

The four binary encoding methods, all based on the Floyd-Steinberg error diffusion kernel, were used in the comparative experiment. The experiment aimed to determine the impact of the quantization error correction on the quality of the resulting phase. It is noted that the feedback mechanism of the error diffusion algorithm preserves the energy of the original pattern in the encoded pattern, which contributes to the effect of quantization error correction. To further analyze the dynamic threshold change in relation to sinusoidal fringe patterns, a pattern with a resolution of 256$\times$256 pixels and a fringe period of $P$=64 was created. Figure 4 displays the changes in the original sinusoidal fringe pattern $I_s$, the quantization-error-corrected pattern $I_s'$, and the dynamic threshold $\hat {\epsilon }_k$ for the four binary encoding methods used in the experiment.


Fig. 4. The 1D sinusoidal property of encoding patterns for different methods: (a) FS; (b) FSDT; (c) TSB ($K$=2); (d) TSB ($K$=4); (e) Proposed ($K$=2); (f) Proposed ($K$=4).


Figures 4(a), (c), and (e) represent the traditional encoding methods, while Figs. 4(b), (d), and (f) are the corresponding methods with the dynamic threshold strategy. By calculating the gray-value average at each pixel position in the original image using Eq. (13), and then applying Eq. (12), the dynamic threshold at each pixel position can be obtained from the local intensity average and weight $\gamma$. Taking $K$=1 as an example, the constant threshold of 0.5 is changed to a variable dynamic threshold, as shown by the light blue curve in Figs. 4(a)-(b). Similarly, when $K$=2, 4, the threshold curves for the TSB and Proposed methods are shown in Figs. 4(c)-(f), where constant thresholds appear as straight lines and dynamic thresholds as changing curves. It can be seen that the FS method has the worst sinusoidal characteristic, while the TSB method improves the sinusoidal fringe pattern quality significantly by dividing the intensity range into multiple intervals. The improvement in $I_s'$ becomes more evident as $K$ increases.

3.2 Comparative results of phase accuracy

The spatial binary encoding methods FS and FSDT encode each sinusoidal fringe pattern into one binary fringe pattern, while the temporal-spatial binary encoding methods TSB and Proposed decompose each sinusoidal fringe pattern into $K$ binary fringe patterns. In this case, the numbers of projected patterns generated by the four binary encoding methods (FS, FSDT, TSB, Proposed) are inconsistent. For a fair comparison, the strategy of generating binary fringe patterns by the FS and FSDT methods with the error matrix of Eq. (3) is shown in Fig. 5. The FS/FSDT-encoded binary fringe pattern has an extended period, and the required phase-shifting patterns $B_k$ are cropped from it at intervals of $P$/4 ($P$ is the fringe period), as shown in Fig. 5. The spatially encoded fringe of FS/FSDT can be decomposed into $K$ binary patterns by shifting $\delta$ pixels perpendicular to the fringe direction (right arrow in Fig. 5), and the resultant sinusoidal fringe pattern is then formed by integral imaging. Thus, for the FS, FSDT, TSB, and Proposed encoding methods alike, there are 16 binary fringe patterns when $K$=4 in the four-step phase-shifting algorithm.


Fig. 5. Strategy for generating $K$ binary patterns of two spatial encoding methods FS and FSDT.


The procedure for comparative experiments can be described as follows:

  • Step 1: The absolute phase of the computer-generated sinusoidal fringe pattern acts as the ideal phase with no error. The resolution of the simulation image is 256$\times$256 pixels.
  • Step 2: As shown in Fig. 5, with the widely adopted FS method and the improved FSDT method using the error matrix of Eq. (3), the spatially encoded sinusoidal fringe pattern is decomposed into $K$ binary patterns, and the resultant sinusoidal fringe pattern is then generated by means of Eq. (4) after filtering with a $3 \times 3$ Gaussian filter with standard deviation $\sigma$ = 1, which emulates the nearly focused state of the projector in a real measurement scene.
  • Step 3: The sinusoidal fringe pattern with the same period is encoded using the TSB and Proposed method. The resultant sinusoidal fringe pattern is formed with the same manner as in Step 2.
  • Step 4: The fringe period is varied from 12 pixels to 96 pixels, and the phase-shifting algorithm with $N$ = 4 (see Section 2.3) is used to calculate the wrapped phase. The absolute phase is obtained using the multi-frequency temporal phase unwrapping algorithm.
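Steps 2-4 rely on a $3\times3$ Gaussian low-pass filter and a phase RMSE metric; both can be sketched as follows (hypothetical helper names; a separable convolution with reflected borders is assumed for the filter):

```python
import numpy as np

def gaussian_blur3(img, sigma=1.0):
    """Separable 3x3 Gaussian low-pass filter emulating a nearly focused
    projector; borders are handled by reflection."""
    x = np.array([-1.0, 0.0, 1.0])
    g = np.exp(-x**2 / (2 * sigma**2))
    g /= g.sum()                                        # normalized 1D kernel
    pad = np.pad(img, 1, mode="reflect")
    # horizontal pass, then vertical pass
    tmp = g[0] * pad[:, :-2] + g[1] * pad[:, 1:-1] + g[2] * pad[:, 2:]
    out = g[0] * tmp[:-2, :] + g[1] * tmp[1:-1, :] + g[2] * tmp[2:, :]
    return out

def phase_rmse(phi_est, phi_ref):
    """RMSE of the phase difference, with errors wrapped to (-pi, pi]."""
    d = np.angle(np.exp(1j * (phi_est - phi_ref)))
    return np.sqrt(np.mean(d**2))
```

Wrapping the difference before averaging keeps spurious $2\pi$ jumps at the wrap boundaries from dominating the RMSE.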

The simulation results when $K$ = 2 and $K$ = 4 are shown in Fig. 6(a) and Fig. 6(b), respectively. Here, we could conclude the following:

  • (1) With the same number of patterns, the phase error of the Proposed method is clearly lower than that of the other three binary fringe encoding methods as a whole. In fact, the FS method is a classic error diffusion method using the error matrix of Eq. (3) with a constant threshold, leading to larger intensity errors at certain positions. It is therefore difficult to effectively suppress the high-order harmonics in the encoded patterns, and its phase error is significantly larger than that of its counterparts. Our previous TSB method [26] has better measurement accuracy than FS because it inherits the advantages of temporal binary encoding. In contrast, the dynamic threshold optimization strategy (see Section 2.2) is applied in the FSDT and Proposed methods, which improves the phase accuracy compared with FS and TSB.
  • (2) For the four binary encoding methods, the phase RMSE decreases as $K$ increases. Taking the proposed method as an example, the phase error for each fringe period is about 0.035$\sim$0.04 rad when $K$=2, and it is lower than 0.02 rad when $K$=4. Theoretically, the phase error becomes lower as $K$ increases. To balance measurement accuracy and measurement efficiency, $K \leq 4$ is preferred [27].
  • (3) The dynamic threshold-based temporal-spatial binary encoding method still obtains the highest phase quality and achieves high-precision reconstruction in the nearly focused state, thus effectively extending the measurement depth of field of binary encoding patterns.
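
The dynamic-threshold idea of Eqs. (2), (12), and (13) can be sketched as below. The window size `win` and the choice of `gamma` are illustrative assumptions (the paper selects $\gamma$ via the strategy of Fig. 2), and `dynamic_thresholds` is our own helper name.

```python
import numpy as np

def dynamic_thresholds(I_s, K, gamma, win=3):
    """Blend the fixed K-level thresholds eps_k = (2k-1)/(2K) (Eq. (2)) with
    the local mean intensity I_avg over a win x win window (Eq. (13)):
    eps_hat_k = gamma * eps_k + (1 - gamma) * I_avg   (Eq. (12))."""
    pad = win // 2
    padded = np.pad(np.asarray(I_s, dtype=float), pad, mode='edge')
    H, W = I_s.shape
    I_avg = np.zeros((H, W))
    for dv in range(win):          # box average = local mean of Eq. (13)
        for du in range(win):
            I_avg += padded[dv:dv + H, du:du + W]
    I_avg /= win * win
    eps = (2 * np.arange(1, K + 1) - 1) / (2 * K)            # fixed thresholds
    return gamma * eps[:, None, None] + (1 - gamma) * I_avg  # shape (K, H, W)
```

With `gamma = 1` this reduces to the fixed thresholds of Eq. (2); smaller `gamma` pulls each threshold toward the local intensity, which is what suppresses the position-dependent quantization error of constant-threshold diffusion.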

Fig. 6. Comparison results of phase errors relative to the standard sinusoidal fringe pattern in simulation cases: phase error with fringe period $P \in [12,96]$ pixels (a) when $K$=2; (b) when $K$=4.

4. Real experiments

To provide further evidence supporting the superiority of our proposed encoding method for 3D measurement, we conducted experiments using the experimental system shown in Fig. 7. The setup consists of a MER2-131-210U3M camera (resolution of 1024$\times$1280 pixels, maximum frame rate of 210 Hz), a TI LightCrafter DLP4500 projector (resolution of 1140$\times$912 pixels, maximum frame rate of 4225 Hz for binary pattern projection), a PI M-414 precision motorized linear translation stage (stroke of 300 mm, 0.1 $\mu$m minimum displacement), and a main control computer. The optimum working distance is about 750 mm from the projector. The projector projects the binary encoding fringe pattern sequences in a nearly focused state, while the camera synchronously captures the deformed fringe images using integral imaging [26], with a synchronization circuit ensuring that exposure and pattern switching are aligned.

Fig. 7. Experimental setup and the defocused state of the projector. (a) Experimental setup; (b) the projector fully focused; (c) the projector nearly focused; (d) the difference in intensity change along the orange line indicated in Figs. 7(b) and 7(c).

To retrieve the absolute phase map, the proposed method in Section 2.1 is used to encode three sets of computer-generated standard sinusoidal fringe patterns with fringe period $P=16$ pixels. Each sinusoidal fringe pattern is decomposed into $K$ (2 and 4) binary fringe patterns. To ensure a fair comparison, the FS, FSDT, TSB, and proposed methods all use the Floyd-Steinberg diffusion kernel (see Eq. (3)) for quantization error allocation. Through simulation, plane, and static object experiments, our proposed method demonstrates superior accuracy compared to the other methods. To verify the advantages of binary fringes in high-speed projection measurement, the proposed method is also applied to a dynamic object measurement.

4.1 Plane experiment

A plane experiment is conducted to verify the measurement advantages of the proposed method in the nearly focused state in a real measurement scenario. The experiment compares the phase error and reconstruction accuracy of a measured plane under projection of the different binary patterns. In our experimental system (Fig. 7), the camera is focused, while the projector is in a nearly focused state. Figure 7(b) shows that when the projector is fully focused, the black-and-white boundary lines are jagged, while in the nearly focused state there is a slight gray-level transition at the black-and-white boundary (see Fig. 7(c)). Figure 7(d) shows the difference in intensity changes at the black-and-white edges between the fully focused and nearly focused states.

Six planes at different positions are calibrated for phase-height mapping [28] in order to reconstruct the plane and object in subsequent steps. The phase RMSEs and reconstruction accuracy of the different methods, including the FS, FSDT, TSB, and proposed binary fringe encoding methods, are calculated. The wrapped phase is obtained by the four-step phase-shifting algorithm, and the multi-frequency method is used for absolute phase retrieval.
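
As an illustration of phase-height calibration from several planes, a common per-pixel polynomial fit is sketched below; this is a generic form, not necessarily the exact virtual-reference-plane model of Ref. [28], and the function names are ours.

```python
import numpy as np

def fit_phase_height(phis, heights):
    """Per-pixel least-squares fit of h = a + b*phi + c*phi^2 from calibrated
    planes. phis: (n_planes, H, W) absolute phase maps; heights: (n_planes,)."""
    n, H, W = phis.shape
    h = np.asarray(heights, dtype=float)
    coeffs = np.zeros((3, H, W))
    for v in range(H):
        for u in range(W):
            A = np.vander(phis[:, v, u], 3, increasing=True)  # [1, phi, phi^2]
            sol, *_ = np.linalg.lstsq(A, h, rcond=None)
            coeffs[:, v, u] = sol
    return coeffs

def phase_to_height(coeffs, phi):
    """Evaluate the fitted per-pixel mapping on a measured phase map."""
    a, b, c = coeffs
    return a + b * phi + c * phi**2
```

Six planes give six samples per pixel, which over-determines the three coefficients and averages out phase noise in the calibration.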

Figures 8(a)-(d) show the captured deformed fringe images of the plane, where the 600th row of the captured plane images under pattern illumination using the four encoding methods is plotted in red. The reconstruction result of 16-step phase-shifting sinusoidal fringes in Fig. 8(e) is used for the phase RMSE and reconstruction accuracy comparisons.

  • Phase RMSE comparison: The phase RMSEs of the different binary encoding methods in the plane experiment are compared in Fig. 9. When $K$=2 and $K$=4, the proposed method yields the smallest phase RMSE across all fringe periods. This indicates that the proposed method has higher reliability and precision, making it suitable for fast 3D shape measurement in nearly focused scenarios that exploit the high binary pattern switching rate. These results are consistent with the simulation results.
  • Reconstruction accuracy comparison: The fringe patterns with period $P$ = 16 pixels are selected for plane reconstruction. The reconstruction result of 16-step phase-shifting sinusoidal fringes is used as the ground truth for evaluating the reconstruction accuracy of the different methods.

Fig. 8. The fringe images captured by the camera under pattern projection of (a) FS, (b) FSDT, (c) TSB, (d) Proposed; (e) the reconstruction result of 16-step phase-shifting sinusoidal fringe patterns.

Fig. 9. Experimental comparison of phase errors of binary encoding fringe patterns in the nearly focused state for plane measurement: phase error with fringe period $P=12,\ldots,96$ pixels (a) when $K$=2; (b) when $K$=4.

The root mean square error ${\textrm{RMSE}} = \sqrt {\sum \nolimits _{j = 1}^M {{{\left ( {{h_j}\left ( {u,v} \right ) - {G_j}\left ( {u,v} \right )} \right )}^2}}/M }$ and maximum absolute error ${\textrm{MAE}} = {\textrm{max}}\left ( {\left | {{h_j}\left ( {u,v} \right ) - {G_j}\left ( {u,v} \right )} \right |} \right )$ are applied to evaluate the phase error and reconstruction accuracy, where $M$ is the number of measurement points, $j = 1,\;2,\;3, \ldots, M$ is the index of a measurement point, ${h_j}\left ( {u,v} \right )$ is the measured value, and the reconstruction result of 16-step phase-shifting sinusoidal fringes is used as the ground truth ${G_j}\left ( {u,v} \right )$.
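
In code, the two metrics amount to the following (a trivial sketch; function names are ours):

```python
import numpy as np

def rmse(measured, truth):
    """Root mean square error over all measurement points."""
    d = np.asarray(measured, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

def max_abs_error(measured, truth):
    """Maximum absolute error (the MAE reported in the comparisons)."""
    d = np.asarray(measured, dtype=float) - np.asarray(truth, dtype=float)
    return float(np.max(np.abs(d)))
```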

Figure 10 displays the plane reconstruction results and regional reconstruction error statistics (RMSEs and MAEs) of the four binary encoding methods when $K$ = 2 and $K$ = 4. Increasing the number of projected images (i.e., from $K$ = 2 to $K$ = 4) improves the reconstruction accuracy of all four methods, and regardless of whether $K$ is 2 or 4, the proposed method outperforms the others. The MAEs (when $K$=2) of the four methods (FS, FSDT, TSB, and proposed) are 0.4720 mm, 0.1999 mm, 0.2466 mm, and 0.1726 mm, respectively; compared with the first three methods, the measurement accuracy of the proposed method is improved by 63.43%, 13.66%, and 30.00%, respectively. The MAEs (when $K$=4) of the four methods are 0.4248 mm, 0.1604 mm, 0.1445 mm, and 0.1185 mm, respectively; compared with the first three methods, the measurement accuracy of the proposed method is improved by 72.22%, 26.12%, and 17.99%, respectively. This plane experiment verifies that the proposed method can significantly improve the reconstruction quality over the compared binary encoding methods even under nearly focused conditions.

Fig. 10. The results of different encoding methods for plane reconstruction when $K$=2 and $K$=4, as well as the difference maps with respect to the reconstruction result of sinusoidal fringes: (a) FS, (b) FSDT, (c) TSB, and (d) the proposed method.

Figure 11 provides a detailed reconstruction plot of the four methods when $K$ is 2 and 4, along the red lines marked in Fig. 10. As can be seen from the figure, the error of the FS method is obviously higher than that of the other three methods, while the proposed method has the smallest reconstruction error, which is consistent with the simulation results. Accordingly, the plane experiment supports the effectiveness of the proposed method in improving reconstruction accuracy in the nearly focused state, especially for the small fringe period ($P$ = 16 pixels) used in the experiment.

Fig. 11. The reconstruction errors along the marked line for the four binary encoding methods.

4.2 Static object experiment

The binary encoding fringes ($P$ = 16 pixels) generated by the FS, FSDT, TSB, and proposed methods are projected to reconstruct a complex object. The collected object images are fed into the four-step phase-shifting algorithm and the absolute phase maps are obtained. To quantitatively evaluate the accuracy of the above methods, the reconstruction result of 16-step phase-shifting sinusoidal fringes is used for comparison, as shown in Fig. 12(e); it has a smooth reconstructed surface and complete, rich details, including clearly visible eyelashes.

Fig. 12. The fringe images captured by the camera under pattern projection of (a) FS, (b) FSDT, (c) TSB, (d) Proposed; (e) the reconstruction result of 16-step phase-shifting sinusoidal fringe patterns.

Figures 12(a)-(d) show the captured deformed fringe images of the mask (one of the three groups of phase-shifting fringe images), and the corresponding 3D reconstruction results for $K$=2 and $K$=4 are displayed in Fig. 13. Judging by the smoothness and richness of detail of the reconstructed surfaces, the proposed method outperforms the other methods in reconstruction accuracy.

Fig. 13. The results of different encoding methods for static object reconstruction when $K$=2 and $K$=4, as well as the difference maps with respect to the reconstruction result of sinusoidal fringes: (a) FS, (b) FSDT, (c) TSB, and (d) the proposed method.

When $K$=2, according to the reconstruction results inside the green rectangular box on the left side of Fig. 13, the reconstruction accuracy ranks as FS < TSB < FSDT < proposed method. From the visualization of the reconstruction error, the trends of the two error metrics (MAE and RMSE) are consistent with the simulation and plane experiments above. Taking MAE as an example, the reconstruction errors of the FS, FSDT, TSB, and proposed binary encoding methods are 0.5119 mm, 0.2180 mm, 0.2776 mm, and 0.1906 mm, respectively. Compared with the first three methods, the measurement accuracy of the proposed method is improved by 62.77%, 12.57%, and 31.34%, respectively.

Similarly, the reconstruction results for $K$ = 4 in Fig. 13 demonstrate that the reconstruction accuracy of the four methods satisfies FS < FSDT < TSB < proposed. The reconstruction MAEs of the FS, FSDT, TSB, and proposed binary encoding methods are 0.5032 mm, 0.1723 mm, 0.1164 mm, and 0.1058 mm, respectively, showing a decreasing trend. Compared with the first three methods, the measurement accuracy of the proposed method is improved by 78.97%, 38.49%, and 9.10%, respectively.

When $K$=2 and $K$=4, the error distributions along the 390th column (the red line marked in Fig. 13) of the reconstructed results are shown in Figs. 14(a)-(b), respectively. The results in Fig. 13 and Fig. 14 show that the proposed method has the highest reconstruction accuracy. The 3D reconstruction results (point clouds) are rendered for better observation; notably, they have not undergone any post-processing, such as interpolation or smoothing.

Fig. 14. The reconstruction errors along the marked line for the four binary encoding methods.

The next experiment evaluates the 3D reconstruction accuracy of the proposed binary encoding patterns. A dumbbell gauge (center distance of 100 mm; sphere A and sphere B diameters of 50 mm each) is placed at the working distance ($\sim$780 mm) corresponding to a nearly focused projection. The absolute phase maps are obtained by three-frequency temporal phase unwrapping. Figure 15 shows the phase error curves, using the unwrapped phase map of the 16-step sinusoidal fringes as a reference to evaluate the phase quality of the different binary encoding patterns. Figures 15(a) and 15(b) show one of the phase-shifting fringe images and one of the phase maps. Figure 15(c) shows cross sections of the phase error distributions of the different methods; the proposed method has the lowest phase error. Figure 15(d) shows the standard deviation statistics of the phase error curves.

Fig. 15. Phase distributions obtained from the standard spheres using different binary encoding patterns. (a) One of the phase-shifted fringes; (b) one of the absolute phase maps; (c) phase error curves along the red dashed line in Fig. 15(b); (d) standard deviation statistics.

Using the system calibration parameters, the phase maps of the two spheres are further converted into 3D surface shapes. Figure 16 shows the 3D reconstruction results of the spheres using the different binary encoding patterns. The results in Figs. 16(a)-(b) suffer from serious errors, and the TSB result in Fig. 16(c) is relatively acceptable, while the surface reconstructed using the proposed binary encoding fringes is the smoothest, as shown in Fig. 16(d). For further evaluation of the reconstruction accuracy, spheres are fitted to the four sets of reconstructed points, and the diameter deviations and surface errors are calculated. The RMSEs of the sphere A and sphere B diameters obtained by the proposed method are approximately 0.0238 mm and 0.0251 mm, respectively. Table 1 lists the RMSEs of sphere A, sphere B, and the center distance for the different binary encoding patterns; the RMSE of the proposed method is the smallest among all the binary encoding algorithms.
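
The sphere fitting used for this evaluation can be performed with a standard algebraic least-squares fit, sketched below; the authors do not specify their exact fitting procedure, so this is a generic approach with our own function name.

```python
import numpy as np

def fit_sphere(pts):
    """Algebraic least-squares sphere fit for pts of shape (N, 3).
    From |x - c|^2 = r^2:  2*c.x + d = |x|^2,  with d = r^2 - |c|^2.
    Returns (center, radius)."""
    pts = np.asarray(pts, dtype=float)
    A = np.c_[2 * pts, np.ones(len(pts))]     # unknowns: [cx, cy, cz, d]
    b = (pts ** 2).sum(axis=1)
    sol, *_ = np.linalg.lstsq(A, b, rcond=None)
    center = sol[:3]
    radius = np.sqrt(sol[3] + center @ center)
    return center, radius
```

The center distance then follows as `np.linalg.norm(center_A - center_B)`, and the diameter deviation as the difference between `2 * radius` and the nominal 50 mm.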

Fig. 16. Reconstructed results of spheres using the four binary encoding methods. (a) FS; (b) FSDT; (c) TSB; (d) Proposed.

Table 1. Reconstruction accuracy of dumbbell gauge (Units: mm)

4.3 Dynamic object experiment

Binary encoding patterns, which exploit the high-speed pattern switching rate of a DMD-based projector, are natural candidates for fast, even high-speed, 3D measurement. To verify that the proposed method can be used for dynamic object measurement, a 3D measurement experiment on a moving palm is conducted. A three-step phase-shifting algorithm with two-frequency (37 and 38) heterodyne phase unwrapping is adopted to retrieve the absolute phase. Since the resolution of the projector is 1140$\times$912 pixels, the fringe periods are $P_1 = 912/37$ pixels and $P_2 = 912/38 = 24$ pixels, respectively. The binary fringe patterns with $P_1$ are generated using the FS method, while the higher-frequency binary fringe patterns with $P_2$ are generated using the four comparative methods with $K$ = 2. With the integral imaging strategy [26], 6 binary fringe images are required for one reconstruction result, so at a capture rate of 200 Hz the reconstruction frame rate is approximately 200/6 $\approx$ 33 Hz. Figure 17 displays several key frames of the dynamic measurement process, with the first row showing the captured deformed fringe images and the second row the reconstruction results.
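
The two-frequency heterodyne unwrapping step can be sketched as follows (a noise-free illustration with our own helper name; in practice the fringe-order rounding amplifies phase noise by the factor $f_{hi}$, which is why the beat of frequencies 37 and 38 must be of high quality):

```python
import numpy as np

def heterodyne_unwrap(phi_hi, phi_lo, f_hi):
    """Two-frequency heterodyne unwrapping for f_hi - f_lo = 1.
    phi_hi, phi_lo: wrapped phases in (-pi, pi]. The beat phase spans a
    single period over the whole field, so it is itself absolute; scaling
    it up selects the fringe order of the high-frequency phase."""
    beat = np.mod(phi_hi - phi_lo, 2 * np.pi)            # absolute beat phase
    k = np.round((f_hi * beat - phi_hi) / (2 * np.pi))   # fringe order
    return phi_hi + 2 * np.pi * k

# noise-free check with the frequencies used here (37 and 38)
x = np.linspace(0.001, 0.999, 500)          # normalized projector coordinate
Phi_hi, Phi_lo = 2 * np.pi * 38 * x, 2 * np.pi * 37 * x
phi_hi = np.angle(np.exp(1j * Phi_hi))      # wrapped high-frequency phase
phi_lo = np.angle(np.exp(1j * Phi_lo))      # wrapped low-frequency phase
recovered = heterodyne_unwrap(phi_hi, phi_lo, 38)
```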

Fig. 17. Several key frames of the moving palm (Visualization 1).

5. Conclusion and future work

This paper proposes a temporal-spatial binary encoding method based on dynamic threshold optimization, which is unaffected by projector nonlinearity and can be used for fast, high-precision 3D shape measurement while maintaining the measurement depth range under nearly focused projection. Computer simulations and object experiments show that the proposed encoding method is considerably more accurate than the compared spatial and temporal-spatial binary encoding methods. Experimental results on a mask with complex surface features verify that the proposed method enables accurate 3D reconstruction. By synchronizing the camera and projector and generating sinusoidal fringes through integral imaging, both the number of captured images and the final 3D reconstruction closely approximate those of traditional sinusoidal fringe projection. The applicability of the method to dynamic measurement is verified on a moving palm. In future work, we will focus on improving the efficiency of phase unwrapping and on binary encoding methods that jointly optimize the diffusion kernel and the threshold, enabling more accurate and faster 3D reconstruction.

Funding

National Natural Science Foundation of China (62101364); Sichuan Provincial Central Guidance Local Science and Technology Development Project (22ZYD0111); Key Research and Development Project of Sichuan Province (2021YFG0195, 2022YFG0053); China Postdoctoral Science Foundation (2021M692260).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. C. Zuo, L. Huang, M. Zhang, Q. Chen, and A. Asundi, “Temporal phase unwrapping algorithms for fringe projection profilometry: A comparative review,” Opt. Lasers Eng. 85, 84–103 (2016). [CrossRef]  

2. S. Zhang, “Recent progresses on real-time 3d shape measurement using digital fringe projection techniques,” Opt. Lasers Eng. 48(2), 149–158 (2010). [CrossRef]  

3. J. Xu and S. Zhang, “Status, challenges, and future perspectives of fringe projection profilometry,” Opt. Lasers Eng. 135, 106193 (2020). [CrossRef]  

4. C. Zuo, S. Feng, L. Huang, T. Tao, W. Yin, and Q. Chen, “Phase shifting algorithms for fringe projection profilometry: A review,” Opt. Lasers Eng. 109, 23–59 (2018). [CrossRef]  

5. Z. Song, H. Jae-Sang, and L. Beiwen, “High-speed 3d imaging using digital binary defocusing method vs sinusoidal method,” Proc. SPIE 10117, 1011707 (2017). [CrossRef]  

6. Y. Liu, X. Yu, J. Xue, Q. Zhang, and X. Su, “A flexible phase error compensation method based on probability distribution functions in phase measuring profilometry,” Opt. Laser Technol. 129, 106267 (2020). [CrossRef]  

7. S. Lei and S. Zhang, “Flexible 3-d shape measurement using projector defocusing,” Opt. Lett. 34(20), 3080–3082 (2009). [CrossRef]  

8. G. Ayubi, J. Ayubi, M. Di Martino, and J. Ferrari, “Pulse-width modulation in defocused three-dimensional fringe projection,” Opt. Lett. 35(21), 3682–3684 (2010). [CrossRef]  

9. Y. Wang and S. Zhang, “Superfast multifrequency phase-shifting technique with optimal pulse width modulation,” Opt. Express 19(6), 5149–5155 (2011). [CrossRef]  

10. C. Zuo, Q. Chen, S. Feng, F. Feng, G. Gu, and X. Sui, “Optimized pulse width modulation pattern strategy for three-dimensional profilometry with projector defocusing,” Appl. Opt. 51(19), 4477–4490 (2012). [CrossRef]  

11. C. Zuo, Q. Chen, G. Gu, S. Feng, F. Feng, R. Li, and G. Shen, “High-speed three-dimensional shape measurement for dynamic scenes using bi-frequency tripolar pulse-width-modulation fringe projection,” Opt. Lasers Eng. 51(8), 953–960 (2013). [CrossRef]  

12. B. Cai, L. Liu, J. Wu, X. Chen, and Y. Wang, “Three-dimensional shape measurement based on spatial-temporal binary-coding method,” Meas. Sci. Technol. 32(9), 095014 (2021). [CrossRef]  

13. T. Xian and X. Su, “Area modulation grating for sinusoidal structure illumination on phase-measuring profilometry,” Appl. Opt. 40(8), 1201–1206 (2001). [CrossRef]  

14. Y. Wang and S. Zhang, “Comparison of the squared binary, sinusoidal pulse width modulation, and optimal pulse width modulation methods for three-dimensional shape measurement with projector defocusing,” Appl. Opt. 51(7), 861–872 (2012). [CrossRef]  

15. A. Silva, A. Muñoz, J. L. Flores, and J. Villa, “Exhaustive dithering algorithm for 3d shape reconstruction by fringe projection profilometry,” Appl. Opt. 59(13), D31–D38 (2020). [CrossRef]  

16. P. Zhou, N. Cai, T. Wang, X.-Q. Cao, and B. Lin, “High-quality 3d shape measurement by kernel-optimized high sinusoidal similarity dither patterns,” Appl. Opt. 59(34), 10645–10650 (2020). [CrossRef]  

17. J. Zhu, C. Zhu, P. Zhou, Z. He, and D. You, “An optimizing diffusion kernel-based binary encoding strategy with genetic algorithm for fringe projection profilometry,” IEEE Trans. Instrum. Meas. 71, 1–8 (2022). [CrossRef]  

18. Z. Zheng, J. Gao, Y. Zhuang, L. Zhang, and X. Chen, “Robust binary fringe generation method with defocus adaptability,” Opt. Lett. 47(14), 3483–3486 (2022). [CrossRef]  

19. N. Cai, Z.-B. Chen, X.-Q. Cao, and B. Lin, “Optimized dithering technique in frequency domain for high-quality three-dimensional depth data acquisition,” Chin. Phys. B 28(8), 084202 (2019). [CrossRef]  

20. X. Wang, S. Mai, and J. Yu, “High-quality binary fringe generation by multi-scale optimization on intensity and phase,” Appl. Opt. 61(32), 9405–9414 (2022). [CrossRef]  

21. S. Meng, N. Cai, and B. Lin, “Threshold-optimized dithering technique for high-quality 3d shape measurement,” Acta Photonica Sinica 48(8), 812002 (2019). [CrossRef]  

22. J. Dai, B. Li, and S. Zhang, “Intensity-optimized dithering technique for three-dimensional shape measurement with projector defocusing,” Opt. Lasers Eng. 53, 79–85 (2014). [CrossRef]  

23. J. Dai and S. Zhang, “Phase-optimized dithering technique for high-quality 3d shape measurement,” Opt. Lasers Eng. 51(6), 790–795 (2013). [CrossRef]  

24. J. Zhu, X. Feng, C. Zhu, and P. Zhou, “Optimal frequency selection for accuracy improvement in binary defocusing fringe projection profilometry,” Appl. Opt. 61(23), 6897–6904 (2022). [CrossRef]  

25. X. Li, Y. Xing, S. Jiang, J. Hu, C. Feng, and W. Zhang, “Generation method of binary patterns based on the space-time combination with projector defocusing,” in Innovative Computing, C.-T. Yang, Y. Pei, and J.-W. Chang, eds. (Springer Singapore, 2020), pp. 1655–1663.

26. J. Zhu, P. Zhou, X. Su, and Z. You, “Accurate and fast 3d surface measurement with temporal-spatial binary encoding structured illumination,” Opt. Express 24(25), 28549–28560 (2016). [CrossRef]  

27. P. Zhou, J. Zhu, X. Su, Z. You, H. Jing, C. Xiao, and M. Zhong, “Experimental study of temporal-spatial binary pattern projection for 3d shape acquisition,” Appl. Opt. 56(11), 2995–3003 (2017). [CrossRef]  

28. N. Liu and Y. Liu, “An accurate phase-height mapping algorithm by using a virtual reference plane,” Optik 206, 164083 (2020). [CrossRef]  

Supplementary Material (1)

Visualization 1: Temporal-spatial binary encoding for dynamic object 3D measurement.




Figures (17)

Fig. 1. The 2D and 1D schematic diagram of the temporal-spatial binary encoding process when $K$=4: (a) sinusoidal fringe pattern $I_s$; (b) temporal-spatial binary encoding patterns $\{B_k\}$; (c) the approximate sinusoidal fringe pattern $\tilde{I}_s$ after low-pass filtering by the system point spread function and integral imaging; (d) the detailed 1D curve of sinusoidal fringe pattern $I_s$; (e) 1D encoding patterns $\{B_k\}$ and the integral imaging pattern $\tilde{I}_s$.
Fig. 2. The selection strategy of $\gamma$: (a) $K$=2; (b) $K$=4.
Fig. 3. When $K$=4, the phase calculation process of the four-step phase-shifting binary fringe patterns in the simulation experiment: (a) standard four-step sinusoidal fringe patterns; (b) the proposed temporal-spatial binary encoding patterns; (c) approximate sinusoidal fringe patterns by integral imaging; (d) the wrapped phase and the absolute phase.
Fig. 4. The 1D sinusoidal property of encoding patterns for different methods: (a) FS; (b) FSDT; (c) TSB ($K$=2); (d) TSB ($K$=4); (e) Proposed ($K$=2); (f) Proposed ($K$=4).
Fig. 5. Strategy for generating $K$ binary patterns of the two spatial encoding methods FS and FSDT.


Equations (13)

$$I_{s,i}(u,v) = 0.5 + 0.5\cos\!\left(\frac{2\pi x}{P} + \frac{2\pi i}{N}\right),\quad i = 0,1,\ldots,N-1 \tag{1}$$

$$\epsilon_k = \left[\frac{k}{K} + \frac{k-1}{K}\right]\!\Big/2 = \frac{2k-1}{2K}\quad (k = 1,2,3,\ldots,K) \tag{2}$$

$$H_{kernel} = \frac{1}{16}\begin{bmatrix} - & * & 7 \\ 3 & 5 & 1 \end{bmatrix} \tag{3}$$

$$I_s'(u,v) = I_s(u,v) + \sum_{m,n\in\Omega}\left[H_{kernel}(m,n)\,E_k(u-m,v-n)\right] \tag{4}$$

$$B_k(u,v) = \begin{cases} 0, & k = j \ \&\ I_s'(u,v) < \epsilon_k \\ 1, & k = j \ \&\ I_s'(u,v) \ge \epsilon_k \\ 0, & k > j \\ 1, & k < j \end{cases} \tag{5}$$

$$E_k(u,v) = \begin{cases} I_s'(u,v) - k/K, & \mathrm{if}\ I_s'(u,v) > \epsilon_k \\ I_s'(u,v) - (k-1)/K, & \mathrm{if}\ I_s'(u,v) \le \epsilon_k \end{cases} \tag{6}$$

$$E_1(u,v) = \begin{cases} I_s'(u,v) - 0, & \mathrm{while}\ B_1(u,v) = 0 \\ I_s'(u,v) - 1/4, & \mathrm{while}\ B_1(u,v) = 1 \end{cases} \tag{7}$$

$$E_2(u,v) = \begin{cases} I_s'(u,v) - 1/4, & \mathrm{while}\ B_2(u,v) = 0 \\ I_s'(u,v) - 1/2, & \mathrm{while}\ B_2(u,v) = 1 \end{cases} \tag{8}$$

$$E_3(u,v) = \begin{cases} I_s'(u,v) - 1/2, & \mathrm{while}\ B_3(u,v) = 0 \\ I_s'(u,v) - 3/4, & \mathrm{while}\ B_3(u,v) = 1 \end{cases} \tag{9}$$

$$E_4(u,v) = \begin{cases} I_s'(u,v) - 3/4, & \mathrm{while}\ B_4(u,v) = 0 \\ I_s'(u,v) - 1, & \mathrm{while}\ B_4(u,v) = 1 \end{cases} \tag{10}$$

$$\begin{cases} A_{min} = (k-1)/K + 1/(10K) \\ A_{max} = k/K - 1/(10K) \\ A_{avg} = (A_{min} + A_{max})/2 \end{cases} \tag{11}$$

$$\hat{\epsilon}_k = \gamma\epsilon_k + (1-\gamma)I_{avg} \tag{12}$$

$$I_{avg}(u,v) = \frac{1}{W_x W_y}\sum_{(\tilde{u},\tilde{v})\in\Omega(u,v)} I_s(\tilde{u},\tilde{v}) \tag{13}$$