
Averaging approaches for highly accurate image-based edge localization

Open Access

Abstract

We introduce an optical and a digital averaging technique that considerably improve edge localization performance. Especially for high quality images, the optical method achieves measurement uncertainties down to the level of millipixels. The approach uses an optical replication scheme based on a computer-generated hologram to reduce noise and discretization errors. The second method is based on a neural network denoising architecture and is especially suited for high levels of photon noise. Edge localization can be improved by up to 60% while preserving high lateral and temporal resolution. The methods are first tested using high quality images obtained by a scientific CMOS sensor imaging a razor blade mounted on a mechanical stage. Then, the laboratory results are tested at larger distances to validate the methods for building deformation measurements.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

The detection and localization of edges is of fundamental importance for many vision-based applications. Edge detection is one of the earliest processes in computer as well as human vision, and geometrical two- and three-dimensional measurements most often rely on the localization of edges [1–4]. The accuracy of edge localization, therefore, is strongly related to the accuracy of two- and three-dimensional image-based measurements.

Moment-based (e.g. center-of-gravity computation [5], Zernike moments [6,7] or others [8]) or model-based algorithms (e.g. fitting to a distribution or interpolation) [4,9,10] are typically used for processing the images. The achievable edge localization accuracy depends on a multitude of factors, and one can distinguish between systematic and random or statistical errors. Although often difficult in practice, in principle, systematic errors can be calibrated. Also, a lot of applications, e.g. the measurement of deformations, do not need absolute accuracy, especially when applied to the continuous monitoring and adaptive control of buildings [11–13]. In this contribution we focus on the statistical errors, namely the measurement uncertainty or precision, and do not touch the issue of calibration. If necessary, highly accurate calibration using standard methods can be achieved once precision is given [14,15].

Random errors are typically denoted as "noise". Some errors, like discretization due to pixels or the photo-response non-uniformity of the image sensor, are constant over time and could be calibrated. But calibration would have to be performed for each possible edge position and orientation – an approach that is rarely possible in real world applications. Therefore, such error contributions are also typically considered as "noise".

A lot of theoretical as well as experimental work has already been done towards improving edge localization accuracy. Reported results are often hard to compare, but in general one can say that the achieved measurement uncertainties are in the range of 0.3 pixel down to 0.01 pixel [1,8,9,16–22].

Sometimes, quite sophisticated algorithms are used. These algorithms are often designed, however, based on edge and noise models, which for the highest accuracies of localization (in the range of hundredths of a pixel) are questionable. As the model of the intensity distribution of the image of an edge, sometimes hard steps, linear steps or Gaussian functions are used. The Gaussian approximation is already quite good, but in principle one would have to start with the coherent, incoherent or partially coherent edge transfer function [23]. Then, the position dependent aberrations of the imaging system have to be taken into account, and finally also the microlenses as well as the spatially inhomogeneous sensitivity of the image sensor pixels would have to be considered [24]. For most practical applications this would be a rather unrealistic approach, and one has to live with simplified edge models.

In most practical applications, the overall noise is dominated by photon noise, which is given by a Poissonian probability distribution. In practice, for moderate to high numbers of photons, the Poissonian distribution can be approximated by a Gaussian distribution and the standard deviation of the recorded intensity is proportional to the square root of the number of photons (or electrons after photo-detection). This means that for every pixel one should compute the noisy output based on a distribution with a mean of the average number of photons $N$ and the standard deviation $\sqrt {N}$. To keep the signal to noise ratio high it is, therefore, in practice advantageous to employ image sensors with a high enough quantum well capacity (preferably more than 10,000 electrons).
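
Written out (the standard shot-noise relation behind these numbers), the per-pixel signal-to-noise ratio for pure photon noise is
$$\mathrm{SNR} = \frac{N}{\sigma_N} = \frac{N}{\sqrt{N}} = \sqrt{N},$$
so a quantum well capacity of 10,000 electrons corresponds to an SNR of about 100, whereas 1000 electrons (the level simulated in section 5.) corresponds to an SNR of only about 31.6.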

Also, for highly accurate measurements one should take the other noise contributions into account [25]. Read-out noise and thermal noise can be reduced by temporal averaging, whereas the fixed pattern noise contributions dark-signal non-uniformity (DSNU) and photo-response non-uniformity (PRNU) cannot be removed that way. PRNU can even lead to a systematic but illumination dependent shift of the edge location if one exposes the sensor such that the brightly lit area is already near the sensor saturation. In this case the noise in the bright area leads to some clipped pixel values and, as a result, the edge seems to be shifted toward the dim side of the edge. Especially for sensors in the infrared spectral region, the fixed pattern noise contributions can be quite high and dead pixels have to be taken into account. Also, in infrared applications thermal noise is most often dominant and cooling of the sensor becomes necessary.

In any case, noise is the major concern when trying to achieve highly accurate edge location measurements, and averaging is the main method to reduce it. Temporal averaging effectively reduces photon noise, image sensor (dark current) thermal noise as well as read-out noise. It also reduces deviations due to air turbulence (an important factor even for a lot of indoor applications) and mechanical vibrations. The main drawback is, of course, the corresponding reduction of temporal resolution. In a concrete application one should use as much temporal averaging as the application allows.

Spatial averaging leads to reduced spatial (lateral) resolution. A common example is the measurement of the distance between two edges, where one might measure the distance along each pair of pixels of the edges. The average distance then will have a reduced measurement uncertainty. Especially the discretization error can be strongly reduced, e.g. using so-called slanted-edge techniques [26]. However, fine variations of the distance will then not be visible anymore. The same is true for the parameters of circle localization. Since all points along the circumference of the circle can be used, the achievable localization accuracies are very high, but small scale variations will not be detected. Often, the employed image processing algorithms already implicitly perform some sort of spatial averaging (e.g. Hough transform [27] or correlation-based approaches [28]).

As a result, performing spatial averaging (e.g. by a Gaussian filter) will reduce the measurement uncertainty of edge localization, but it will also effectively reduce the lateral resolution of the measurement. Most often, spatial averaging perpendicular to the edge orientation can and should be performed for best results because it will not affect the lateral resolution of the edge localization.

Since edge localization results vary strongly with noise, it only makes sense to compare algorithms or techniques that have comparable lateral resolutions; one can always trade edge localization accuracy against lateral resolution.

In this contribution we introduce two preprocessing techniques, one optical and one purely digital, that can be used with any edge localization algorithm. We demonstrate the techniques in combination with a very simple center-of-gravity (COG) based algorithm (Sobel followed by COG).

The two techniques considerably improve the edge localization uncertainty for real (not simulated) images while still preserving high lateral resolution.

The first approach (sections 2.–4.) optically replicates the image before photo-detection so that the same image of the edge falls onto different positions on the image sensor. In this way, the discretization errors, which differ between the copies, as well as most time-varying errors can be reduced by spatial averaging. The second approach (section 5.) uses a purely digital method, a neural network based image denoising approach.

We show in section 4. that these methods can be applied to building deformation measurement. Certain easily visible edges of the building can be used, such as window frames. Clearly, if continuous measurement or monitoring is required, the edges have to be illuminated during times when little ambient light is available. For that purpose the experiment uses a laminated white wooden frame which is illuminated by halogen light.

The raw images for the measurements have been obtained first under nearly ideal conditions (air-damped table, turbulence shielding, high quality image sensor). We then show that for real world applications (window frame) the principle still works.

2. Image replication for edge localization

The core idea for optically achieving an improved edge localization is to use an optical element or system that replicates the image before photo-detection. The edge can then be localized in each copy of the image. Averaging the edge positions obtained that way increases the precision because noise and discretization errors are reduced. In principle one expects (for uncorrelated errors) an improvement by $\sqrt {N}$ if the number of copies is denoted by $N$.
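
Stated as a formula (the standard relation for averaging statistically independent estimates): if the edge position obtained from a single copy has standard deviation $\sigma_{SI}$, the average over $N$ copies ideally has
$$\sigma_{MI} = \frac{\sigma_{SI}}{\sqrt{N}},$$
so the nine replications used below correspond to an expected improvement factor of $\sqrt{9} = 3$.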

This method has already been successfully used for spot-based measurements [29,30], but will be extended in the following towards image based edge localization. To this end a computer-generated hologram (CGH) can be inserted in the optical system (see Fig. 1). In addition, an intermediate image can be generated and a field stop (or a more complicated mask) can be added in the intermediate image plane to block irrelevant image parts (see Fig. 1).

Fig. 1. Experimental setup. The object (razor blade) and the light source (LED) are placed on a mechanical stage. A computer-generated hologram (CGH) is used for optical replication. The image is captured by a CMOS image sensor.

The hologram can be regarded as a superposition of individual gratings. The 0th order thereby corresponds to the image without the use of a hologram. Each grating leads to a shift of the image. According to the Fourier shift theorem the shift is proportional to the grating frequency (for systems optimized according to the sine condition). In practice, field-dependent aberrations of the imaging system lead to small changes, but if the object point is shifted, all copies of the image are shifted as well. If one chooses a symmetric replication pattern, part of the aberrations will cancel later during the averaging process, but especially distortion might still be a problem. Anyway, we do not have to care about these deviations if the system is calibrated or just used for relative measurements (deformation measurements).
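
As a rough guide (a textbook relation, stated here under the assumption of a grating acting on a collimated beam in front of a lens of focal length $f'$): a grating with spatial frequency $\nu$ deflects the first diffraction order by an angle whose sine is $\lambda\nu$, which the lens converts into an image shift of
$$\Delta x = f' \lambda \nu,$$
i.e. the shift grows linearly with the grating frequency, which is the property used to design the replication pattern.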

The binary hologram itself is optimized using the binary search algorithm [31] and photographically manufactured using a circular laser writing system [32]. Of course, the effect of the hologram itself is highly wavelength dependent so that considerable chromatic aberrations are present for polychromatic light. This might lead to strongly blurred edges that can cause problems during digital processing. All measurements have been performed using LED illumination with an additional bandpass filter to reduce this effect.

The individual shifts of the replication pattern have to be chosen such that for a given application the different images do not overlap. A mask in an intermediate image plane can help to achieve this for more complex scenes. For applications with changing scenes it is possible to use a programmable mask, e.g. a digital micromirror device.

3. Experimental setup and evaluation

Fig. 1 shows the experimental setup that we realized on an air-damped optical table. We used a scientific CMOS image sensor (PCO edge 3.1, 2048 $\times$ 1536 pixels, 6.5 $\times$ 6.5 ${\rm{\mu} \rm{m}}$ pixel size, 30,000 electron quantum well capacity) in combination with a simple and inexpensive objective lens.

The object is illuminated from behind by an LED ($\lambda _0$ = 554 nm, full-width-half-maximum: 29 nm) followed by an interference filter (Thorlabs FL05532-10, center wavelength $\lambda _0$ = 532 nm, full-width-half-maximum: 10 nm) and a collimating lens (focal length 45 mm). As a near perfect edge object we use a razor blade which is moved by a mechanical stage (Thorlabs MLJ050). The blade is imaged by a 35 mm focal length lens (Fujian China TV Lens GDS-35) at an f-number of 1.7 and a magnification of $\beta ' = -0.057$.

Air turbulence has been reduced (but not completely eliminated) by a cylindrical housing of the light path and the quite long exposure time of 260 ms. Also, for each of the 100 equidistant stage positions (step size: 4 ${\rm{\mu} \rm{m}}$, total travel: 0.4 mm) we recorded 10 images. Therefore, during evaluation we can use averaged images (in order to reduce remaining time-variant noise contributions) or individual images. After each movement of the stage we delay the image recording in order to make sure that potential vibrations due to the movement have decayed. An exemplary image is shown in Fig. 2. Nine replications of the original object are recorded by the image sensor.

Fig. 2. Exemplary image used for evaluation. Nine replications of the original object (razor blade) are recorded by a CMOS sensor. The blue boxes indicate the regions that were used for the evaluation. The edge position is determined along the direction of the vertical red lines. The blurriness in the image is caused by chromatic dispersion, i.e. wavelength-dependent refraction and diffraction. In the case of the wavelength dependent diffraction, the spectral components of the illumination are spatially separated at the diffraction grating. This degrades the lateral resolution, which is visible here as blurriness.

The control of the experiment as well as the subsequent image processing has been done using the open source measurement software itom [33]. For the evaluation, the images are smoothed using a truncated Gaussian filter (nearly a boxcar filter) with a standard deviation of 5 in both directions in a 7 $\times$ 7 pixel kernel, followed by a one-dimensional Sobel filter with kernel size 5 across the edge. The one-dimensional COG across the edge of the resulting image is computed for each column to yield the edge position. This is done for each replicated edge image and then the average over the nine edge positions is computed. This gives the final edge position.
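
A minimal Python sketch of this processing chain (illustrative only; the function name and the use of OpenCV/NumPy are assumptions made for the sketch, the parameter values follow the text):

```python
import cv2
import numpy as np

def locate_edge_per_column(img: np.ndarray) -> np.ndarray:
    """Return the sub-pixel edge row position for every image column
    (horizontal edge assumed, as in the experiments)."""
    # Truncated Gaussian smoothing: 7x7 kernel, sigma = 5 (nearly a boxcar filter).
    smoothed = cv2.GaussianBlur(img.astype(np.float64), (7, 7), 5)

    # One-dimensional derivative of length 5 across the (horizontal) edge,
    # standing in for the one-dimensional Sobel filter described in the text.
    deriv = np.array([[-2.0], [-1.0], [0.0], [1.0], [2.0]])
    grad = np.abs(cv2.filter2D(smoothed, cv2.CV_64F, deriv))

    # Column-wise center of gravity (COG) of the gradient magnitude
    # yields the sub-pixel edge position in each column.
    rows = np.arange(grad.shape[0], dtype=np.float64)[:, None]
    return (rows * grad).sum(axis=0) / grad.sum(axis=0)

# The final edge position of the multi-image method is the average of the
# per-column positions over the nine replicated edge images.
```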

This simple and computationally inexpensive procedure is performed for each time-averaged image, resulting in edge positions along the edge.

To keep things as fundamental as possible, in this contribution we concentrate on horizontally aligned edges. Different orientations of the edges would demand further preprocessing and calibration or more sophisticated evaluation algorithms. It has been shown [34–36] that different methods can be used to implement rotation invariance in the system when needed.

For the statistical evaluation of the measurement uncertainty of the edge positions we perform the procedure at 100 different positions of the mechanical stage and concentrate on 50 pixels along the edge. Experiments using a commercial interferometer-based length measurement system have shown that for a unidirectional movement of the stage the positioning errors are below 0.4 ${\rm{\mu} \rm{m}}$ (standard deviation). In combination with the magnification of the imaging system this leads to an image-side position error below 0.4 ${\rm{\mu} \rm{m}}~\times ~0.057 = 23$ nm, corresponding to 3.5/1000-th of a pixel.

We fit a polynomial of degree two through the edge positions determined for one pixel of the edge (defined by one column, see Fig. 2) in order to obtain a good estimate of the low-frequency error due to remaining errors of the stage and distortions of the optics. This estimate is subtracted from the original edge positions to yield the localization error of the edge at this position (column). The standard deviation of these residuals is the measurement uncertainty. We repeat this for 50 columns (corresponding to 50 edge positions) in order to obtain the average measurement uncertainty $\sigma$.
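
A short sketch of this detrending step (Python/NumPy; array names are placeholders):

```python
import numpy as np

def column_uncertainty(edge_positions: np.ndarray, stage_positions: np.ndarray) -> float:
    """Edge positions (pixel) of one column over all stage positions ->
    standard deviation of the residuals after removing the low-frequency trend."""
    coeffs = np.polyfit(stage_positions, edge_positions, deg=2)    # degree-two fit
    residuals = edge_positions - np.polyval(coeffs, stage_positions)
    return residuals.std(ddof=1)                                    # sample standard deviation

# Averaging over the evaluated columns gives the reported uncertainty, e.g.:
# sigma = np.mean([column_uncertainty(p, stage_positions) for p in positions_per_column])
```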

This results in a multi-image measurement uncertainty of $\sigma _{MI} = 0.0078$ pixel without temporal averaging and $\sigma '_{MI} = 0.0055$ pixel with temporal averaging (10 images).

The reference benchmark is to apply exactly the same evaluation procedure to the case without optical image replication. To be able to use exactly the same measurement data we just use one of the replications for evaluation. This leads to a single-image measurement uncertainty of $\sigma _{SI} = 0.0183$ pixel without temporal averaging and $\sigma '_{SI} = 0.0089$ pixel with temporal averaging.

We, therefore, have an improvement by a factor of 2.4 without temporal averaging. If we assume that all errors are independent, one would expect an improvement by a factor of $\sqrt {9} = 3$. For the already time-averaged images (10 images) the improvement is less pronounced (a factor of 1.6).

We conclude that with temporal averaging so much averaging is already included that the additional averaging of the proposed multi-image method has, as expected, a reduced potential for further improvement.

Despite the very simple and fast evaluation algorithm, the achieved results of about 5/1000 of a pixel are excellent and better than the previously reported results in the literature. Of course, one might use the basic method of image replication also in combination with other edge localization algorithms (e.g. [37]).

4. Scenario with intermediate image

To show that this method is also applicable to building deformation measurements, we applied the approach to the detection of window edges. Since window frames tend to have vertical or horizontal edges, we focus on one of the two orientations to keep the algorithm fast and simple. To achieve this, we changed the camera setup, adding an intermediate image plane as shown in Fig. 3. This allows us to block irrelevant parts of the scene which might otherwise overlap with the copies of the window.

Fig. 3. Schematical setup showing the addition of an intermediate image to block irrelevant image parts using a movable rectangular aperture. We used a field lens (located in the intermediate image plane) to match the entrance pupil of the second imaging system (from L2 to Sensor) with the exit pupil of the first system (from Object to L2). The image is captured by a CMOS image sensor. L1: Pentax, L2 & L3: achromatic doublets, L4: Kowa.

The window phantom is mounted on a vertical height-adjustable stage (Thorlabs MLJ150) in order to allow controlled movement with a step size of 0.04 mm and a total travel of 4 mm. The upper edge of the window is illuminated with a spotlight providing sufficient surface brightness.

For the imaging into the intermediate image plane we use a 50 mm objective lens (Pentax, f-number of 1.7). Two achromatic doublets (focal lengths 50 mm and 100 mm) and one 50 mm objective lens (Kowa, f-number of 1.2) are used for the second imaging stage. A bandpass filter with a central wavelength of 525 nm (FWHM 25 nm) is used. The images are captured using a CMOS sensor.

The movable mask in the intermediate image is realized using a rectangular aperture of $\approx 0.4$ mm. The resulting regions of interest (ROIs, 48 $\times$ 48 pixels) which were processed by the algorithm are outlined by blue borders in Fig. 4.

Fig. 4. Overview of the 8 replications ($-4^{th}$ to $+4^{th}$ order diffractions). The ROIs which were processed (48 $\times$ 48 pixels in size) are outlined by blue borders.

Single- and multi-image evaluation have further been used in order to detect small shifts of the window. The multi-image evaluation is based on averaging over all ROIs contained in one image, whereas the single-image evaluation uses only one ROI. For each y-stage increment, the camera captures 10 images which are then averaged to provide a reference for the (time-averaged) measurement uncertainty and to obtain a ground truth for the neural network (section 5.).

As described in the previous section, all ROIs are preprocessed by Gaussian and Sobel filters.

In this scenario with intermediate image, the multi-image evaluation of the time-averaged images achieved an improvement factor of $\approx 2.7$, which is not far from the theoretical limit of $\sqrt {9} = 3$. The median measurement uncertainty for one ROI is $\sigma _{SI} = 0.0252$ pixel. The median measurement uncertainty for the multi-image evaluation is $\sigma _{MI} = 0.0154$ pixel, which corresponds to an improvement of $\approx ~64\%$ for the multi-image technique. The median measurement uncertainty for the multi-image evaluation with temporal averaging is $\sigma '_{MI} = 0.0090$ pixel.

5. Neural network for noise reduction

The method presented in the previous sections already gives very good results, limited in practice mainly by air turbulence and vibrations. However, for applications where cheaper image sensors are used and/or worse lighting or weather conditions (e.g. fog, rain, snow) are present, the signal-to-noise ratio is considerably worse.

In this section we concentrate on a second, purely digital preprocessing method that helps to improve the results without optical preprocessing if strong (photon) noise is present. Typically, Gaussian filters offer good performance for such tasks but lead to a considerable reduction of lateral resolution.

We therefore investigated the use of neural network (U-Net) based denoising approaches [38]. The net consists of two paths, a contracting and an expansive one, whereby corresponding levels of the two paths are concatenated and the paths are largely symmetric to each other (shown in Fig. 5). Ronneberger et al. originally used the U-Net for biomedical image segmentation [38]. Further segmentation applications employing the U-Net architecture have been presented [39–42]. Apart from these and other applications, the U-Net architecture has also been employed to denoise images [43–45].

Fig. 5. U-Net architecture. The boxes represent the feature maps whereby the number of channels is given above the boxes and their size on the left side.

To train the neural network, the time-averaged images of the razor blade at the different positions were used (see section 3.). A 64 $\times$ 64 pixel area of each edge has been selected. As a result, 900 images of edges were created (100 images for each of the 9 replications). In most cases the overall noise is dominated by photon noise, which is given by a Poisson probability distribution. To generate the noisy images, we artificially added photon noise (quantum well capacity QWC = 1000, corresponding to a signal-to-noise ratio of 31.6) to the captured images by calculating a Poisson probability distribution for each pixel depending on its gray-scale value. The pixelwise substitution of the gray-scale values of the original image with a random value drawn from the corresponding Poisson distribution yields the noisy images.
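
A minimal sketch of this noise simulation (Python/NumPy; the normalization of the gray values to the quantum well capacity is an assumption made for the sketch):

```python
import numpy as np

rng = np.random.default_rng()

def add_photon_noise(clean: np.ndarray, full_well: int = 1000) -> np.ndarray:
    """Replace every pixel by a Poisson sample whose mean is the pixel's
    gray value scaled to `full_well` photo-electrons (QWC = 1000 in the text)."""
    electrons = clean / clean.max() * full_well        # mean photo-electrons per pixel
    noisy = rng.poisson(electrons).astype(np.float64)  # shot-noise realization
    return noisy / full_well * clean.max()             # back to the original gray-scale range
```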

The U-Net has been implemented using TensorFlow and Keras [46,47]. It consists of blocks of two convolutional layers each (kernel size = 3, activation function = ReLU) in the contracting path, which are followed by max pooling (pool size = 2). In the expansive path, upsampling is used instead of pooling. Each upsampling operation is followed by a convolutional layer (kernel size = 2, activation function = ReLU), a concatenation with the corresponding part of the contracting path, and a block of two convolutional layers (kernel size = 3) (Fig. 5).
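
A compact Keras sketch following this block structure (the channel counts, network depth, optimizer and loss function are assumptions made for the sketch; the actual configuration is the one shown in Fig. 5):

```python
from tensorflow.keras import layers, models

def conv_block(x, filters):
    # two 3x3 convolutions with ReLU activation
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    return layers.Conv2D(filters, 3, padding="same", activation="relu")(x)

def build_unet(size=64, base=32):
    inp = layers.Input((size, size, 1))
    # contracting path: conv blocks followed by 2x2 max pooling
    c1 = conv_block(inp, base)
    p1 = layers.MaxPooling2D(2)(c1)
    c2 = conv_block(p1, base * 2)
    p2 = layers.MaxPooling2D(2)(c2)
    c3 = conv_block(p2, base * 4)                        # bottleneck
    # expansive path: upsampling, 2x2 convolution, skip concatenation, conv block
    u2 = layers.UpSampling2D(2)(c3)
    u2 = layers.Conv2D(base * 2, 2, padding="same", activation="relu")(u2)
    c4 = conv_block(layers.Concatenate()([u2, c2]), base * 2)
    u1 = layers.UpSampling2D(2)(c4)
    u1 = layers.Conv2D(base, 2, padding="same", activation="relu")(u1)
    c5 = conv_block(layers.Concatenate()([u1, c1]), base)
    out = layers.Conv2D(1, 1, activation="linear")(c5)   # denoised image
    return models.Model(inp, out)

model = build_unet()
model.compile(optimizer="adam", loss="mse")
# model.fit(noisy_images, clean_images, epochs=500)  # training setup described below
```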

The noisy images have been used as input data and the corresponding original noiseless images as target data so that the neural network is trained to denoise images. 700 of the 900 images (7 replications) have been used for training and the remaining 200 images (2 replications, replication A and replication B) have been used as test data on which the edge localization has been performed. The U-Net was trained for 500 epochs. Fig. 6 shows denoised images obtained with the U-Net.

Fig. 6. Neural network denoiser. Top: Original noiseless images of the edges; Middle: Images after artificially adding photon noise (QWC = 1000); Bottom: Output of the trained U-Net (denoised images).

We compare the standard approach of preprocessing by one-dimensional Gaussian filtering across the edge with one-dimensional Gaussian filtering followed by the trained U-Net. The parameters of the Gaussian filters are chosen such that the lateral resolution of the preprocessed edges stays constant. For the Gaussian filtering we employed the methods implemented in OpenCV [48]. To compare the lateral resolutions visually we used (noisy) line structures with different line separations (Fig. 7).

Fig. 7. Comparison of lateral resolution using (noisy) line structures with different line separations. The standard approach of preprocessing by one-dimensional Gaussian filtering across the edge is compared with one-dimensional Gaussian filtering followed by the trained U-Net. a-e: Smoothed images (Gaussian filtering with a 9 $\times$ 9 pixel kernel and a standard deviation of 3 in X and Y direction) of noisy line structures with different line separations; f: U-Net output for the same noisy line structure that was used under d (without any prefiltering); g: U-Net output for the same noisy line structure that was used under d, prefiltered with a Gaussian filter with a 9 $\times$ 9 pixel kernel and a standard deviation of 1.5 in X and Y direction before applying the U-Net to the image.

It turned out that for the U-Net version a Gaussian prefiltering with a 9 $\times$ 9 pixel kernel and a standard deviation of 1.5 in X and Y direction provided the same lateral resolution as a Gaussian filtering with a 9 $\times$ 9 pixel kernel and a standard deviation of 3 in X and Y direction (Fig. 7).

Due to this, we prefiltered the 200 noisy images of the test dataset (replication A and replication B) before applying the trained U-Net to the images to enable a better comparison. For the images which were preprocessed by Gaussian filtering only, we have chosen a 9 $\times$ 9 pixel kernel with a standard deviation of 3 in X and Y direction. For the U-Net variant, we prefiltered the images with a 9 $\times$ 9 pixel kernel and standard deviations of $\sigma _X$ = 1.5 and $\sigma _Y$ = 3 before applying the U-Net.
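
The two preprocessing variants, as a brief Python/OpenCV sketch (the reshaping for the Keras `predict` call is an assumption; `trained_unet` stands for the denoiser of Fig. 5):

```python
import cv2
import numpy as np

def preprocess_gaussian_only(img: np.ndarray) -> np.ndarray:
    # reference pipeline: 9x9 kernel, sigma = 3 in X and Y
    return cv2.GaussianBlur(img.astype(np.float32), (9, 9), 3)

def preprocess_with_unet(img: np.ndarray, trained_unet) -> np.ndarray:
    # prefilter with sigma_X = 1.5 and sigma_Y = 3, then denoise with the U-Net
    pre = cv2.GaussianBlur(img.astype(np.float32), (9, 9), 1.5, sigmaY=3.0)
    return trained_unet.predict(pre[None, :, :, None], verbose=0)[0, :, :, 0]
```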

We compared the images that have been denoised by the neural network with the images that have been smoothed by the Gaussian filter with respect to the standard deviation of the localization error. The evaluation has basically been performed in the same way as described in section 3. For careful spatial averaging without strongly reducing the lateral resolution we applied a one-dimensional Sobel filter with kernel size 5 across the edge to the resulting images, as described in section 3. Also, the rest of the evaluation is the same as for the optically preprocessed images (section 3.). By doing this for the edge positions of the 10 evaluated columns, the average measurement uncertainty $\sigma$ can be computed.

For replication A we get an average measurement uncertainty of $\sigma _{rep A, Gauss} = 0.0630$ pixel for the Gaussian filter and $\sigma _{rep A, NN} = 0.0422$ pixel for the neural network denoiser; for replication B we get $\sigma _{rep B, Gauss} = 0.0417$ pixel for the Gaussian filter and $\sigma _{rep B, NN} = 0.0319$ pixel for the neural network denoiser. The neural network thus shows better results than the Gaussian filter if strong (photon) noise is present. We get an improvement of 50% for replication A and 30% for replication B with the U-Net denoiser compared to the Gaussian filter.

In order to show that the neural network method can also provide an improvement in the real-world scenario, we trained a U-Net with the same architecture as the one shown in Fig. 5. Again, the evaluation was performed as described in section 3. The results were evaluated for the single-image and multi-image methods.

As for the scenario with intermediate image (section 4.), the results for the window edge could be improved as well. The values for a single image can be further improved when applying the U-Net, resulting in $\sigma _{SI, NN} = 0.0163$ pixel (compared to $\sigma _{SI} = 0.0252$ pixel), an improvement of $\approx ~55\%$. The multi-image approach certainly already improves the measurement uncertainty. Nevertheless, the U-Net yields a further improvement to $\sigma _{MI, NN} = 0.0134$ pixel compared to $\sigma _{MI} = 0.0154$ pixel when only the multi-image method is used.

6. Conclusions

We demonstrated two methods for precisely determining edge positions. The techniques have been demonstrated for horizontal edges captured under laboratory conditions as well as for a real world application (window frame) using a scientific CMOS image sensor. The optical replication technique using a computer-generated hologram achieves measurement uncertainties down to 5/1000-th of a pixel (36 nm in image space). In our experimental scheme this uncertainty is actually limited by the precision of the employed mechanical stage. Of course, the field of view is reduced by the technique in the current experimental setup. For such a limited field of view scenario, the results have to be compared to optical systems using longer focal lengths. For a 3 $\times$ 3 replication this would correspond to three times the focal length and, therefore, one would end up with a comparable precision of 15/1000-th of a pixel. However, for some applications the replication can be done along one dimension (as shown in Fig. 4). Then the full gain of the method becomes obvious and the full field of the original imaging can still be preserved.

In addition, it is also possible to preserve the field of view by using a segmented mask in an intermediate image plane in order to filter all non-interesting parts of the scene (section 4.). We demonstrated that under such conditions an improvement up to 60% can be achieved comparing single-image to multi-image evaluation.

Under less fortunate image conditions, most of the time the edge localization is limited by photon noise. In this case, a neural-net based denoising can considerably (improvement up to 50%) improve edge localization.

Both methods have been used in combination with a simple, straightforward moment-based edge localization algorithm, but other, more sophisticated algorithms might be employed as well.

Funding

Deutsche Forschungsgemeinschaft (SFB1244).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. M. Baba and K. Ohtani, “A novel subpixel edge detection system for dimension measurement and object localization using an analogue-based approach,” J. Opt. A: Pure Appl. Opt. 3(4), 276 (2001). [CrossRef]  

2. Y. Yang, B. Yang, S. Zhu, and X. Chen, “Online quality optimization of the injection molding process via digital image processing and model-free optimization,” J. Mater. Process. Technol. 226, 85–98 (2015). [CrossRef]  

3. Z. Duan, N. Wang, J. Fu, W. Zhao, B. Duan, and J. Zhao, “High precision edge detection algorithm for mechanical parts,” Meas. Sci. Rev. 18(2), 65–71 (2018). [CrossRef]  

4. J. Ye, G. Fu, and U. P. Poudel, “High-accuracy edge detection with blurred edge model,” Image Vis. Comput. 23(5), 453–467 (2005). [CrossRef]  

5. E. P. Lyvers, O. R. Mitchell, M. L. Akey, and A. P. Reeves, “Subpixel measurements using a moment-based edge operator,” IEEE Trans. Pattern Anal. Machine Intell. 11(12), 1293–1309 (1989). [CrossRef]  

6. S. Ghosal and R. Mehrotra, “Orthogonal moment operators for subpixel edge detection,” Pattern recognition 26(2), 295–306 (1993). [CrossRef]  

7. Q. Ying-Dong, C. Cheng-Song, C. San-Ben, and L. Jin-Quan, “A fast subpixel edge detection method using Sobel–Zernike moments operator,” Image Vis. Comput. 23(1), 11–17 (2005). [CrossRef]  

8. T. Bin, A. Lei, C. Jiwen, K. Wenjing, and L. Dandan, “Subpixel edge location based on orthogonal Fourier–Mellin moments,” Image Vis. Comput. 26(4), 563–569 (2008). [CrossRef]  

9. M. Hagara and P. Kulla, “Edge detection with sub-pixel accuracy based on approximation of edge with erf function,” Radioengineering 20, 516–524 (2011).

10. G.-S. Xu, “Sub-pixel edge detection based on curve fitting,” in 2009 Second International Conference on Information and Computing Science, vol. 2 (IEEE, 2009), pp. 373–375.

11. F. Guerra, T. Haist, A. Warsewa, S. Hartlieb, W. Osten, and C. Tarín, “Precise building deformation measurement using holographic multipoint replication,” Appl. Opt. 59(9), 2746–2753 (2020). [CrossRef]  

12. Z. Ibrahim, H. Adeli, K. Ghaedi, and A. Javanmardi, “Invited review: Recent developments in vibration control of building and bridge structures,” J VIBROENG 19(5), 3564–3580 (2017). [CrossRef]  

13. R. Mitchell, Y.-J. Cha, Y. Kim, and A. A. Mahajan, “Active control of highway bridges subject to a variety of earthquake loads,” Earthq. Eng. Eng. Vib. 14(2), 253–263 (2015). [CrossRef]  

14. S. Hartlieb, M. Tscherpel, T. Haist, W. Osten, M. Ringkowski, and O. Sawodny, “Hochgenaue Kalibrierung eines Multipoint-Positionsmesssystems,” (2019).

15. S. Hartlieb, M. Tscherpel, F. Guerra, T. Haist, W. Osten, M. Ringkowski, and O. Sawodny, “Hochgenaue kalibrierung eines holografischen multi-punkt positionsmesssystems,” Accepted for publication in tm - Technisches Messen (2020).

16. F. Da and H. Zhang, “Sub-pixel edge detection based on an improved moment,” Image Vis. Comput. 28(12), 1645–1658 (2010). [CrossRef]  

17. A. Trujillo-Pino, K. Krissian, M. Alemán-Flores, and D. Santana-Cedrés, “Accurate subpixel edge location based on partial area effect,” Image Vis. Comput. 31(1), 72–90 (2013). [CrossRef]  

18. J. Guo and C. Zhu, “Dynamic displacement measurement of large-scale structures based on the Lucas–Kanade template tracking algorithm,” Mechanical Systems and Signal Processing 66, 425–436 (2016). [CrossRef]  

19. X. Hu, “Subpixel line extraction based on blurred edge model and adaptive least square template matching,” in ASPRS Annual Conference, Reno Nevada, (2006).

20. B. Liang, M. Dong, J. Wang, and B. Yan, “Sub-pixel location of center of target based on Zernike moment,” in Sixth International Symposium on Precision Engineering Measurements and Instrumentation, vol. 7544 (International Society for Optics and Photonics, 2010), p. 75443A.

21. Y. Y. Bylinsky, A. Kotyra, K. Gromaszek, and A. Iskakova, “Subpixel edge detection method based on low-frequency filtering,” in Photonics Applications in Astronomy, Communications, Industry, and High-Energy Physics Experiments 2016, vol. 10031 (International Society for Optics and Photonics, 2016), p. 1003152.

22. X. Guo-Sheng, “Linear array CCD image sub-pixel edge detection based on wavelet transform,” in 2009 Second International Conference on Information and Computing Science, vol. 2 (IEEE, 2009), pp. 204–206.

23. W. Singer, M. Totzeck, and H. Gross, Handbook of optical systems, volume 2: Physical image formation (John Wiley & Sons, 2006).

24. A. Hornberg, Handbook of machine vision (John Wiley & Sons, 2007).

25. B. Jähne, “Emva 1288 standard for machine vision: Objective specification of vital camera data,” Optik & Photonik 5(1), 53–54 (2010). [CrossRef]  

26. K. Masaoka, T. Yamashita, Y. Nishida, and M. Sugawara, “Modified slanted-edge method and multidirectional modulation transfer function estimation,” Opt. Express 22(5), 6040–6046 (2014). [CrossRef]  

27. N. Cherabit, F. Z. Chelali, and A. Djeradi, “Circular Hough transform for iris localization,” Science and Technology 2(5), 114–121 (2012). [CrossRef]  

28. P. Bing, X. Hui-min, X. Bo-qin, and D. Fu-long, “Performance of sub-pixel registration algorithms in digital image correlation,” Meas. Sci. Technol. 17(6), 1615–1621 (2006). [CrossRef]  

29. T. Haist, S. Dong, T. Arnold, M. Gronle, and W. Osten, “Multi-image position detection,” Opt. Express 22(12), 14450–14463 (2014). [CrossRef]  

30. T. Haist, M. Gronle, D. A. Bui, B. Jiang, C. Pruss, F. Schaal, and W. Osten, “Towards one trillion positions,” in Automated Visual Inspection and Machine Vision, vol. 9530 (International Society for Optics and Photonics, 2015), p. 953004.

31. M. A. Seldowitz, J. P. Allebach, and D. W. Sweeney, “Synthesis of digital holograms by direct binary search,” Appl. Opt. 26(14), 2788–2798 (1987). [CrossRef]  

32. M. Häfner, C. Pruss, and W. Osten, “Laser direct writing: Recent developments for the making of diffractive optics,” Optik & Photonik 6(4), 40–43 (2011). [CrossRef]  

33. M. Gronle, W. Lyda, M. Wilke, C. Kohler, and W. Osten, “itom: an open source metrology, automation, and data evaluation software,” Appl. Opt. 53(14), 2974–2982 (2014). [CrossRef]  

34. V. Torre and T. A. Poggio, “On edge detection,” IEEE Trans. Pattern Anal. Mach. Intell. 8(2), 147–163 (1986). [CrossRef]  

35. D. Ziou and S. Tabbone, “Edge detection techniques-an overview,” Pattern Recognition and Image Analysis C/C of Raspoznavaniye Obrazov I Analiz Izobrazhenii 8, 537–559 (1998).

36. B. Li, A. Jevtic, U. Söderström, S. Ur Réhman, and H. Li, “Fast edge detection by center of mass,” in The 1st IEEE/IIAE International Conference on Intelligent Systems and Image Processing 2013 (ICISIP2013), (2013), pp. 103–110.

37. B. Jähne, Digital image processing, vol. 4 (Springer Berlin, 2005).

38. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical image computing and computer-assisted intervention, (Springer, 2015), pp. 234–241.

39. B. Norman, V. Pedoia, and S. Majumdar, “Use of 2d u-net convolutional neural networks for automated cartilage and meniscus segmentation of knee mr imaging data to determine relaxometry and morphometry,” Radiology 288(1), 177–185 (2018). [CrossRef]  

40. H. Dong, G. Yang, F. Liu, Y. Mo, and Y. Guo, “Automatic brain tumor detection and segmentation using U-Net based fully convolutional networks,” in Medical image understanding and analysis, vol. 723 of Communications in Computer and Information Science, M. V. Hernández and V. González-Castro, eds. (Springer, Cham, 2017), pp. 506–517.

41. A. Sevastopolsky, “Optic disc and cup segmentation methods for glaucoma detection with modification of u-net convolutional neural network,” Pattern Recognit. Image Anal. 27(3), 618–624 (2017). [CrossRef]  

42. Z. Zhou, M. M. Rahman Siddiquee, N. Tajbakhsh, and J. Liang, “UNet++: A nested U-Net architecture for medical image segmentation,” in Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, vol. 11045 of Image Processing, Computer Vision, Pattern Recognition, and Graphics, D. Stoyanov, Z. Taylor, G. Carneiro, T. Syeda-Mahmood, A. Martel, L. Maier-Hein, J. M. R. Tavares, A. Bradley, J. P. Papa, V. Belagiannis, J. Nascimento, Z. Lu, M. Moradi, H. Greenspan, and A. Madabhushi, eds. (Springer International Publishing and Springer, Cham, 2018), pp. 3–11.

43. J. Batson and L. Royer, “Noise2Self: Blind denoising by self-supervision,” in Proceedings of the 36th International Conference on Machine Learning (2019).

44. J. Lehtinen, J. Munkberg, J. Hasselgren, S. Laine, T. Karras, M. Aittala, and T. Aila, “Noise2Noise: Learning image restoration without clean data,” in Proceedings of the 35th International Conference on Machine Learning (2018).

45. A. Krull, T.-O. Buchholz, and F. Jug, “Noise2Void - learning denoising from single noisy images,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (2019).

46. M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mané, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, I. Sutskever, K. Talwar, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viégas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “TensorFlow: Large-scale machine learning on heterogeneous systems,” (2015). Software available from tensorflow.org.

47. F. Chollet, “Keras,” https://keras.io (2015).

48. G. Bradski, “The OpenCV Library,” Dr. Dobb’s Journal of Software Tools (2000).
