Use of numerical orthogonal transformation for the Zernike analysis of lateral shearing interferograms

Fengzhao Dai; Feng Tang; Xiangzhao Wang; Peng Feng; Osami Sasaki

doi:10.1364/OE.20.001530

1. Introduction

In lateral shearing interferometry (LSI), the wavefront interferes with a laterally shifted copy of itself, which eliminates the need for a separate reference wavefront. In addition, the two interfering beams travel almost identical paths, which makes LSI insensitive to mechanical vibration and environmental turbulence [1]. Owing to these advantages, LSI has been used extensively in many applications [2,3]. However, a lateral shearing interferogram is directly related to the difference between the wavefront and its sheared copy rather than to the wavefront under test itself. Thus, an inverse problem has to be solved: the original wavefront must be reconstructed from two wavefront differences (difference fronts) in two perpendicular shear directions. A variety of approaches to this problem have been devised [4–19]. Most are categorized as either zonal or modal reconstruction. In zonal reconstruction, the wavefront is evaluated directly at specific grid points [4–9], whereas in modal reconstruction the wavefront is expanded into a set of basis functions and the corresponding coefficients are evaluated [10–18]. As shown in [19], if the wavefront under test can be decomposed into a set of basis functions, modal reconstruction is superior to zonal reconstruction because of its better noise propagation properties and its computational efficiency.

For modal reconstruction, Zernike polynomials are commonly used as the basis functions for wavefront expansion [14–17]. They were first applied to the analysis of lateral shearing interferograms by Rimmer and Wyant (1975) [14]. The two difference fronts in perpendicular shear directions, as well as the original wavefront, are expanded in Zernike polynomials, and the Zernike coefficients of the difference fronts and of the original wavefront are related to each other by a shear matrix whose elements are functions of the shear ratio. In general, a practical wavefront requires an infinite number of Zernike terms. However, the dimensions of the shear matrix cannot be extended to infinity. Consequently, the Zernike terms of the original wavefront above a certain order have to be neglected because they are not represented by any element of the shear matrix. These neglected terms are referred to as remaining high-order terms, and they degrade the estimates of the lower-order terms. As shown below, this effect cannot be ignored even if the contributions of the remaining high-order terms to the original wavefront are small compared with those of the lower-order terms. This problem was analyzed theoretically with matrix formulations by Herrmann [12], but he did not evaluate the error that results from it. This error was defined as the remaining error and was evaluated theoretically and numerically by Dai [18], whose results showed that the remaining error can be reduced by using Karhunen–Loève (K-L) functions instead of Zernike polynomials as the basis functions for expanding the wavefront. The K-L functions are the optimal set of basis functions for modal phase compensation of atmospheric turbulence. However, they are not employed in the field of optical testing.

In this paper, we propose a numerical orthogonal transformation method for reconstructing the wavefront from difference fronts based on Zernike polynomials. With this method, the sensitivity of the lower-order estimates to the remaining high-order terms is decreased; as a result, the remaining error is reduced and the reconstruction accuracy is improved. The method is easily implemented as an extension of the Rimmer-Wyant method, and it can be applied to reconstruct a wavefront on an aperture of arbitrary shape from its difference fronts. Theoretical analysis and numerical calculations confirm that the accuracy of the proposed method is superior to that of the Rimmer-Wyant method.

2. Numerical orthogonal transformation

It is supposed that a wavefront $W (x, y)$ under test is represented by the first $J$ fringe Zernike polynomials [20] as

W (x, y) = \sum_{j = 2}^{J} a_{j} Z_{j} (x, y) , \qquad (1)

where $x$ and $y$ are normalized Cartesian coordinates, $Z_{j} (x, y)$ denotes the $j$th fringe Zernike polynomial, and $a_{j}$ is the corresponding weighting coefficient. In Eq. (1) the piston $Z_{1}$ is omitted because it is of no concern. We assume that the two lateral shearing interferograms in two perpendicular shear directions have been obtained on a series of discrete points within the two overlap regions $Σ_{x}$ and $Σ_{y}$, as shown in Figs. 1(a) and 1(b). The shear ratio $s$ is assumed to be identical in the two shear directions in this paper, so the number of measurement points $N$ is the same for the two difference fronts. The two difference fronts on these discrete points can be evaluated from the two lateral shearing interferograms.

Fig. 1 Schematic diagrams of lateral shearing interferograms. (a) Lateral shearing interferogram when the shearing is in the x direction. (b) Lateral shearing interferogram when the shearing is in the y direction.

The difference fronts are also expanded in Zernike polynomials. For the moment, we analyze the difference front when the shearing is in the $x$ direction. Then we have

Δ W_{x} (x, y; s) = W (x + s, y) - W (x - s, y) = \sum_{j = 1}^{J} A_{x, j} Z_{j} (x, y) , \qquad (2)

where $A_{x, j}$ denotes the $j$th Zernike coefficient of $Δ W_{x} (x, y; s)$. The summation starts from $j = 1$ because the piston of the difference front cannot be omitted: it is related to some terms of the original wavefront, e.g. tilt and coma. Note that some high-order $A_{x, j}$ may be zero, but they are retained to facilitate the subsequent derivation. Substituting Eq. (1) and the Cartesian expressions of the Zernike polynomials into Eq. (2), the relationship between the Zernike coefficients of the difference front and those of the original wavefront can be derived. For simplicity it is written in matrix form as

A_{x} = N_{x} a , \qquad (3)

where $A_{x}$ and $a$ are $J \times 1$ and $(J - 1) \times 1$ vectors that hold the Zernike coefficients of the difference front and of the original wavefront, respectively, and $N_{x}$ is the shear matrix that relates them.
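The sampling of a difference front on the overlap region can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the helper names `overlap_points` and `difference_front`, the 64 × 64 grid, and the unit-radius circular pupil are all hypothetical choices. The sketch also illustrates why the piston of the difference front must be retained: the difference front of pure tilt is a constant.

```python
import numpy as np

def overlap_points(s=0.1, axis="x", n_grid=64):
    """Grid points of a unit-radius pupil that lie inside both sheared copies."""
    g = np.linspace(-1.0, 1.0, n_grid)
    x, y = np.meshgrid(g, g)
    dx, dy = (s, 0.0) if axis == "x" else (0.0, s)
    inside = (((x + dx) ** 2 + (y + dy) ** 2 <= 1.0)
              & ((x - dx) ** 2 + (y - dy) ** 2 <= 1.0))
    return x[inside], y[inside]

def difference_front(wavefront, x, y, s=0.1, axis="x"):
    """W shifted by +s minus W shifted by -s along the shear axis, as in Eq. (2)."""
    dx, dy = (s, 0.0) if axis == "x" else (0.0, s)
    return wavefront(x + dx, y + dy) - wavefront(x - dx, y - dy)

def tilt(x, y):
    return x                          # Z_2 in the fringe ordering

def defocus(x, y):
    return 2 * (x ** 2 + y ** 2) - 1  # Z_4 in the fringe ordering

x_pts, y_pts = overlap_points(s=0.1, axis="x")
d_tilt = difference_front(tilt, x_pts, y_pts, s=0.1)        # constant 2s (piston)
d_defocus = difference_front(defocus, x_pts, y_pts, s=0.1)  # 8 s x (x tilt)
```

With an identical shear ratio, the two overlap regions are mirror images of each other on a symmetric grid, so they contain the same number of sample points.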

However, the Zernike coefficients of the difference fronts are not independent of each other, because Zernike polynomials are orthogonal over the unit circle but not over the overlap region $Σ_{x}$ of the two circular beams, in which the difference front lies. It can therefore be expected that the reconstruction error may be reduced by expanding the difference fronts in orthogonal polynomials instead of Zernike polynomials and by using a new shear matrix that relates the Zernike coefficients of the original wavefront to the orthogonal coefficients of the difference fronts. Indeed, as shown below, reconstructing the original wavefront with the orthogonal transformation yields a smaller remaining error and a more accurate result.

2.1 Calculation of the numerical orthonormal polynomials

Although methods have been presented to construct orthogonal polynomials analytically for noncircular pupils of simple shape, e.g. annular, hexagonal, elliptical and rectangular pupils [21–26], it is still challenging to construct analytically a set of orthogonal polynomials over the overlap region of two interference beams, especially when the wavefront under test is defined on a more complicated region, e.g. an annular region with a spider such as arises in extreme ultraviolet lithography [27]. In fact, what we need is a set of polynomials that are orthogonal over the discrete points of the difference-front data set within $Σ_{x}$ rather than polynomials that are orthogonal over the full region $Σ_{x}$. Polynomials that are orthogonal over the discrete points can be obtained conveniently by numerical calculation, regardless of the shape of the region. Since Zernike polynomials form a complete set, the numerical polynomials $\{F_{x, j} (x_{n}, y_{n})\}$, $(n = 1, 2, ..., N)$, that are orthonormal over the discrete points within $Σ_{x}$ can be formed as linear combinations of Zernike polynomials. The Zernike polynomials and the set $\{F_{x, j} (x_{n}, y_{n})\}$ are related to each other according to

F_{x, i} (x_{n}, y_{n}) = \sum_{j = 1}^{J} M_{x, i j} Z_{j} (x_{n}, y_{n}) , \qquad (4)

where $M_{x, i j}$ is the element in the $i$th row and $j$th column of the conversion matrix $M_{x}$, which is a lower triangular matrix. Equation (4) can be written in matrix form as

F_{x} = Z_{x} M_{x}^{T} , \qquad (5)

where $F_{x}$ and $Z_{x}$ are $N \times J$ matrices whose columns hold the $J$ orthonormal polynomials and the $J$ Zernike polynomials, respectively, evaluated at the $N$ data points within the region $Σ_{x}$, and $M_{x}^{T}$ is the transpose of the matrix $M_{x}$.

An elegant nonrecursive method for determining the conversion matrix $M_{x}$ in the continuous (integrable) domain has been described by Dai [24]; it can be extended conveniently to the discrete domain. Multiplying both sides of Eq. (5) from the left by $F_{x}^{T}$ and by $Z_{x}^{T}$, respectively, we obtain

F_{x}^{T} F_{x} = F_{x}^{T} Z_{x} M_{x}^{T} , \qquad (6)

Z_{x}^{T} F_{x} = Z_{x}^{T} Z_{x} M_{x}^{T} . \qquad (7)

Because the columns of $F_{x}$ are orthonormal over the data set, we have $(F_{x}^{T} F_{x}) / N = I$, where $I$ is the $J \times J$ identity matrix. Applying this condition to Eq. (6) and transposing, we get

M_{x} Z_{x}^{T} F_{x} = N I . \qquad (8)

Substituting Eq. (7) into Eq. (8) gives $M_{x} Z_{x}^{T} Z_{x} M_{x}^{T} = N I$; letting $M_{x} = {(Q_{x}^{T})}^{- 1}$, we obtain

Q_{x}^{T} Q_{x} = (Z_{x}^{T} Z_{x}) / N . \qquad (9)

Equation (9) can be solved uniquely for $Q_{x}$ by, e.g., Cholesky decomposition, because $Z_{x}^{T} Z_{x}$ is a symmetric positive definite matrix [24]; the conversion matrix $M_{x}$ is then determined from $M_{x} = {(Q_{x}^{T})}^{- 1}$. The Cartesian expressions of the Zernike polynomials are known, so the matrix $Z_{x}$ can be calculated by substituting the coordinates of the measurement points into these expressions. Therefore, the numerical orthonormal polynomials $F_{x}$ can be determined from Eq. (5) with the matrix $Z_{x}$ and the conversion matrix $M_{x}$.
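The procedure of Eqs. (5) to (9) can be sketched numerically as follows. This is an illustrative sketch, not the authors' implementation: the helper `fringe_zernikes` (the first nine fringe terms in their commonly quoted unnormalized Cartesian form), the 64 × 64 grid, and $J = 9$ are assumptions made for the example. `numpy.linalg.cholesky` returns a lower triangular factor $L$ with $L L^{T}$ equal to its argument, so $Q_{x} = L^{T}$.

```python
import numpy as np

def fringe_zernikes(x, y):
    """First nine fringe Zernike terms in Cartesian form (unnormalized, assumed)."""
    r2 = x ** 2 + y ** 2
    return np.column_stack([
        np.ones_like(x), x, y, 2 * r2 - 1, x ** 2 - y ** 2, 2 * x * y,
        (3 * r2 - 2) * x, (3 * r2 - 2) * y, 6 * r2 ** 2 - 6 * r2 + 1,
    ])

# Discrete points inside the overlap region for an x shear with s = 0.1.
s = 0.1
g = np.linspace(-1.0, 1.0, 64)
X, Y = np.meshgrid(g, g)
mask = ((X + s) ** 2 + Y ** 2 <= 1.0) & ((X - s) ** 2 + Y ** 2 <= 1.0)
Z = fringe_zernikes(X[mask], Y[mask])   # N x J matrix Z_x
n_pts = Z.shape[0]

G = Z.T @ Z / n_pts                     # right-hand side of Eq. (9)
Q = np.linalg.cholesky(G).T             # upper triangular, Q^T Q = G
M = np.linalg.inv(Q.T)                  # conversion matrix M_x, lower triangular
F = Z @ M.T                             # Eq. (5): orthonormal polynomials on the points
```

The columns of `F` then satisfy the discrete orthonormality condition $(F_{x}^{T} F_{x}) / N = I$ up to rounding error, whatever the shape of the sampled region.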

2.2 Derivation of the corresponding shear matrix

By expanding the difference front within $Σ_{x}$ in the first $J$ numerical orthonormal polynomials $\{F_{x, j} (x_{n}, y_{n})\}$, the corresponding orthonormal coefficients are obtained as a $J \times 1$ vector $B_{x}$. Since each term in the set $\{F_{x, j} (x_{n}, y_{n})\}$ is a linear combination of the Zernike polynomials, the difference front expanded in the orthonormal polynomials is identical to that expanded in the corresponding set of Zernike polynomials. Then, we have the matrix formula

Δ W_{x} = Z_{x} A_{x} = F_{x} B_{x} , \qquad (10)

where $Δ W_{x}$ is an $N \times 1$ vector holding the difference-front values at the $N$ points within the region $Σ_{x}$.

Substituting Eq. (5) into Eq. (10), the relationship between the Zernike coefficients $A_{x}$ and the orthonormal coefficients $B_{x}$ is obtained as

A_{x} = M_{x}^{T} B_{x} . \qquad (11)

Substituting Eq. (11) into Eq. (3), we have

B_{x} = {(M_{x}^{T})}^{- 1} N_{x} a = H_{x} a . \qquad (12)

Here, $H_{x} = {(M_{x}^{T})}^{- 1} N_{x}$ denotes the shear matrix that relates the Zernike coefficients of the original wavefront to the orthonormal coefficients of the difference front.
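Equation (12) can be checked numerically with a small sketch. The analytic Rimmer-Wyant matrix $N_{x}$ is not reproduced here; as a stand-in, and purely as an assumption of this example, `N_x` is obtained by fitting the sheared difference of each Zernike term to the Zernike basis, which is exact for the nine low-order terms used. The helper `fringe_zernikes`, the grid size, and the random coefficients are hypothetical.

```python
import numpy as np

def fringe_zernikes(x, y):
    """First nine fringe Zernike terms in Cartesian form (unnormalized, assumed)."""
    r2 = x ** 2 + y ** 2
    return np.column_stack([
        np.ones_like(x), x, y, 2 * r2 - 1, x ** 2 - y ** 2, 2 * x * y,
        (3 * r2 - 2) * x, (3 * r2 - 2) * y, 6 * r2 ** 2 - 6 * r2 + 1,
    ])

s = 0.1
g = np.linspace(-1.0, 1.0, 64)
X, Y = np.meshgrid(g, g)
mask = ((X + s) ** 2 + Y ** 2 <= 1.0) & ((X - s) ** 2 + Y ** 2 <= 1.0)
x, y = X[mask], Y[mask]
n_pts = x.size

Z = fringe_zernikes(x, y)
# Sheared difference of every non-piston term Z_2 ... Z_9 on the overlap points.
dZ = (fringe_zernikes(x + s, y) - fringe_zernikes(x - s, y))[:, 1:]

# Numerical stand-in for the analytic shear matrix N_x, shape J x (J - 1).
N_x = np.linalg.lstsq(Z, dZ, rcond=None)[0]

L = np.linalg.cholesky(Z.T @ Z / n_pts)
M = np.linalg.inv(L)                    # M_x = (Q_x^T)^{-1} with Q_x = L^T
F = Z @ M.T                             # Eq. (5)
H_x = np.linalg.solve(M.T, N_x)         # Eq. (12): H_x = (M_x^T)^{-1} N_x

rng = np.random.default_rng(0)
a = rng.standard_normal(N_x.shape[1])   # hypothetical coefficients a_2 ... a_9
delta_w = dZ @ a                        # sampled difference front
B = F.T @ delta_w / n_pts               # orthonormal coefficients by projection
```

In this sketch the projected coefficients `B` agree with `H_x @ a`, and both `N_x` and `H_x` come out upper triangular, consistent with the description of the shear matrices in the text.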

2.3 Wavefront reconstruction

For the analysis of the difference front within $Σ_{y}$, where the shearing is in the $y$ direction, we find, similarly to Eq. (3) and Eq. (12),

A_{y} = N_{y} a , \qquad (13)

B_{y} = {(M_{y}^{T})}^{- 1} N_{y} a = H_{y} a , \qquad (14)

where $A_{y}$ and $B_{y}$ are $J \times 1$ vectors that hold the Zernike coefficients and the orthonormal coefficients of the difference front within $Σ_{y}$, respectively. $M_{y}$ is the conversion matrix that transforms the Zernike polynomials into the numerical polynomials $\{F_{y, j} (x_{n}, y_{n})\}$, which are orthonormal over the discrete points within $Σ_{y}$. $N_{y}$ and $H_{y} = {(M_{y}^{T})}^{- 1} N_{y}$ are the shear matrices corresponding to $N_{x}$ and $H_{x}$, respectively. All four shear matrices $N_{x}$, $N_{y}$, $H_{x}$ and $H_{y}$ are $J \times (J - 1)$ upper triangular matrices whose elements are either functions of the shear ratio $s$ or zero. Note that $N_{x}$ and $N_{y}$ are given in analytical form, whereas $H_{x}$ and $H_{y}$ are given in numerical form.

Combining $A_{x}$ and $A_{y}$ into a single vector and the matrices $N_{x}$ and $N_{y}$ into a single matrix, we obtain from Eq. (3) and Eq. (13)

\begin{pmatrix} A_{x} \\ A_{y} \end{pmatrix} = \begin{pmatrix} N_{x} \\ N_{y} \end{pmatrix} a , \qquad (15)

which can be abbreviated to

A = N a . \qquad (16)

Applying the same treatment to Eq. (12) and Eq. (14), we obtain

\begin{pmatrix} B_{x} \\ B_{y} \end{pmatrix} = \begin{pmatrix} H_{x} \\ H_{y} \end{pmatrix} a , \qquad (17)

which can likewise be abbreviated to

B = H a . \qquad (18)

Since Eqs. (16) and (18) are overdetermined, their least-squares solutions are

\hat{a} = N^{†} \hat{A} , \qquad (19)

\hat{a} = H^{†} \hat{B} , \qquad (20)

where $\hat{A}$ and $\hat{B}$ are estimates of $A$ and $B$ obtained by fitting the two difference fronts $Δ W_{x} (x_{n}, y_{n})$ and $Δ W_{y} (x_{n}, y_{n})$ to Zernike polynomials and to numerical orthonormal polynomials, respectively, and $N^{†} = {(N^{T} N)}^{- 1} N^{T}$ and $H^{†} = {(H^{T} H)}^{- 1} H^{T}$ are the generalized inverses of the matrices $N$ and $H$, respectively. Equations (19) and (20) represent the Rimmer-Wyant method and the proposed method, respectively.
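The two least-squares solutions, Eqs. (19) and (20), can be compared in a small noise-free sketch in which every term of the wavefront is included in the analysis. As before, this is an illustration under assumptions rather than the authors' code: the shear matrices are obtained numerically instead of from the analytic Rimmer-Wyant expressions, and `fringe_zernikes`, `shear_direction`, the grid size, and the random coefficients are hypothetical. `numpy.linalg.lstsq` is used in place of forming the generalized inverse explicitly.

```python
import numpy as np

def fringe_zernikes(x, y):
    """First nine fringe Zernike terms in Cartesian form (unnormalized, assumed)."""
    r2 = x ** 2 + y ** 2
    return np.column_stack([
        np.ones_like(x), x, y, 2 * r2 - 1, x ** 2 - y ** 2, 2 * x * y,
        (3 * r2 - 2) * x, (3 * r2 - 2) * y, 6 * r2 ** 2 - 6 * r2 + 1,
    ])

def shear_direction(s, axis, n_grid=64):
    """Return (Z, F, dZ, N_dir, H_dir) for one shear direction."""
    g = np.linspace(-1.0, 1.0, n_grid)
    X, Y = np.meshgrid(g, g)
    dx, dy = (s, 0.0) if axis == "x" else (0.0, s)
    mask = (((X + dx) ** 2 + (Y + dy) ** 2 <= 1.0)
            & ((X - dx) ** 2 + (Y - dy) ** 2 <= 1.0))
    x, y = X[mask], Y[mask]
    Z = fringe_zernikes(x, y)
    # Sheared difference of every non-piston term on the overlap points.
    dZ = (fringe_zernikes(x + dx, y + dy) - fringe_zernikes(x - dx, y - dy))[:, 1:]
    N_dir = np.linalg.lstsq(Z, dZ, rcond=None)[0]   # stand-in for N_x or N_y
    Q = np.linalg.cholesky(Z.T @ Z / x.size).T      # Eq. (9)
    F = Z @ np.linalg.inv(Q)                        # F = Z M^T with M^T = Q^{-1}
    H_dir = Q @ N_dir                               # H = (M^T)^{-1} N
    return Z, F, dZ, N_dir, H_dir

rng = np.random.default_rng(1)
a_true = rng.standard_normal(8)                     # hypothetical a_2 ... a_9
A_hat, B_hat, N_blocks, H_blocks = [], [], [], []
for axis in ("x", "y"):
    Z, F, dZ, N_dir, H_dir = shear_direction(0.1, axis)
    delta_w = dZ @ a_true                           # noise-free difference front
    A_hat.append(np.linalg.lstsq(Z, delta_w, rcond=None)[0])
    B_hat.append(F.T @ delta_w / F.shape[0])
    N_blocks.append(N_dir)
    H_blocks.append(H_dir)

# Eq. (19) and Eq. (20) with the stacked matrices of Eqs. (15) to (18).
a_rw = np.linalg.lstsq(np.vstack(N_blocks), np.concatenate(A_hat), rcond=None)[0]
a_not = np.linalg.lstsq(np.vstack(H_blocks), np.concatenate(B_hat), rcond=None)[0]
```

Because no term is left out in this sketch, both `a_rw` and `a_not` reproduce `a_true`; the two methods differ only when remaining high-order terms are present.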

We note that a similar method based on an elliptical orthogonal transformation has been presented in Ref. 16. In that method, the difference front in the $x$ direction within $Σ_{x}$ was reduced to an elliptical region, and elliptical Zernike polynomials that are orthogonal over this elliptical region were generated from Zernike polynomials by a coordinate transformation. The reduced difference front was then expanded in the elliptical Zernike polynomials. The shear matrix that relates the elliptical Zernike coefficients of the difference front to the Zernike coefficients of the original wavefront was derived from a double integral. The difference front in the $y$ direction within $Σ_{y}$ was analyzed in the same way.

However, the elliptical Zernike polynomials are orthogonal over the full elliptical region but not over the discrete points at which the difference front is measured. Moreover, the method is limited to circular wavefronts under test, because the overlap region of the two interfering beams cannot be approximated by an ellipse when the wavefront under test comes from a noncircular aperture such as an annular aperture.

In the proposed method, the numerical orthogonal transformation is implemented directly on the discrete points of the difference-front data set, and the resulting numerical polynomials are orthogonal over these discrete points rather than over the full overlap region. In addition, the orthogonality of the numerical polynomials is not influenced by the shape of the region. Thus, the proposed method can be applied to apertures of arbitrary shape.

The fact that the estimates of the lower-order terms are influenced by the remaining high-order terms was pointed out and illustrated by an example in Ref. 16. However, the authors did not explain how the orthogonal transformation reduces this influence. This effect is made clear by a cross-coupling formula derived in the following section.

3. Impact of remaining high-order terms

Here, we let $J$ tend to infinity, so that any practical wavefront can be represented completely by Eq. (1). Returning to Eq. (10) under this assumption, the number of columns of the matrix $Z_{x}$ becomes infinite. If the matrix $Z_{x}$ and the vector $A_{x}$ are each split into two blocks, Eq. (10) becomes

Δ W_{x} = Z_{x} A_{x} = \begin{pmatrix} Z_{x f} & Z_{x r} \end{pmatrix} \begin{pmatrix} A_{x f} \\ A_{x r} \end{pmatrix} = Z_{x f} A_{x f} + Z_{x r} A_{x r} . \qquad (21)

Here, $Z_{x f}$ and $Z_{x r}$ are the two blocks containing the first $K$ and the remaining columns of the matrix $Z_{x}$, respectively, and $A_{x f}$ and $A_{x r}$ are the first $K$ and the remaining elements of the vector $A_{x}$.

3.1 Least-square fitting of difference fronts

Since the difference front cannot be fitted with an infinite number of Zernike terms, we assume that it is fitted with the first $K$ terms; the corresponding coefficients are then obtained by least-squares fitting,

{\hat{A}}_{x f} = Z_{x f}^{†} Δ W_{x} , \qquad (22)

where ${\hat{A}}_{x f}$ is the estimate of $A_{x f}$ and $Z_{x f}^{†} = {(Z_{x f}^{T} Z_{x f})}^{- 1} Z_{x f}^{T}$ is the generalized inverse of the matrix $Z_{x f}$. Substituting Eq. (21) into Eq. (22), we obtain

{\hat{A}}_{x f} = Z_{x f}^{†} (Z_{x f} A_{x f} + Z_{x r} A_{x r}) = A_{x f} + C_{x} A_{x r} , \qquad (23a)

where $C_{x} = Z_{x f}^{†} Z_{x r}$. A similar expression is obtained for the fitting of the difference front within $Σ_{y}$, where the shearing is in the $y$ direction,

{\hat{A}}_{y f} = A_{y f} + C_{y} A_{y r} , \qquad (23b)

where ${\hat{A}}_{y f}$ is the estimate of $A_{y f}$, the first $K$ Zernike coefficients of the difference front; $A_{y r}$ holds the Zernike coefficients of the remaining high-order terms; and $C_{y} = Z_{y f}^{†} Z_{y r}$, where $Z_{y f}^{†} = {(Z_{y f}^{T} Z_{y f})}^{- 1} Z_{y f}^{T}$ is the generalized inverse of the matrix $Z_{y f}$. $Z_{y f}$ and $Z_{y r}$ are the first $K$ and the remaining columns of the matrix $Z_{y}$, respectively, and $Z_{y}$ is the $N \times J$ matrix whose columns hold the $J$ Zernike polynomials evaluated at the $N$ data points within the region $Σ_{y}$.

3.2 Splitting of the shear matrices

To facilitate the derivation, Eq. (3) is rewritten as

\begin{pmatrix} A_{x f} \\ A_{x r} \end{pmatrix} = \begin{pmatrix} N_{x f 1} & N_{x f 2} \\ N_{x r 1} & N_{x r 2} \end{pmatrix} \begin{pmatrix} a_{f} \\ a_{r} \end{pmatrix} = \begin{pmatrix} N_{x f 1} & N_{x f 2} \\ 0 & N_{x r 2} \end{pmatrix} \begin{pmatrix} a_{f} \\ a_{r} \end{pmatrix} . \qquad (24a)

Here, the shear matrix $N_{x}$, with infinitely many rows and columns, is split into four blocks by a horizontal line under the $K$th row and a vertical line after the $(K - 1)$th column. The block $N_{x f 1}$ is thus a $K \times (K - 1)$ matrix, and the dimensions of the other blocks follow. In addition, $N_{x r 1} = 0$ because $N_{x}$ is an upper triangular matrix. The vectors $a_{f}$ and $a_{r}$ hold the first $K - 1$ and the remaining elements of the vector $a$, respectively.

The shear matrix $N_{y}$ can be split into four blocks in the same way, and an expression similar to Eq. (24a) is obtained for the difference front in the $y$ shear direction,

\begin{pmatrix} A_{y f} \\ A_{y r} \end{pmatrix} = \begin{pmatrix} N_{y f 1} & N_{y f 2} \\ N_{y r 1} & N_{y r 2} \end{pmatrix} \begin{pmatrix} a_{f} \\ a_{r} \end{pmatrix} = \begin{pmatrix} N_{y f 1} & N_{y f 2} \\ 0 & N_{y r 2} \end{pmatrix} \begin{pmatrix} a_{f} \\ a_{r} \end{pmatrix} . \qquad (24b)

From Eq. (24a) and Eq. (24b) the following relations follow directly,

A_{x f} = N_{x f 1} a_{f} + N_{x f 2} a_{r} , \qquad (25a)

A_{x r} = N_{x r 2} a_{r} , \qquad (25b)

A_{y f} = N_{y f 1} a_{f} + N_{y f 2} a_{r} , \qquad (25c)

A_{y r} = N_{y r 2} a_{r} . \qquad (25d)

3.3 Reconstruction with finite-dimensional shear matrices

As discussed in Sec. 1, the wavefront cannot be reconstructed completely because the dimensions of the shear matrix are finite. As in Eq. (19), the estimate of the first $K$ Zernike coefficients of the original wavefront, excluding piston, is obtained from the shear matrices $N_{x f 1}$ and $N_{y f 1}$ and the estimated coefficients ${\hat{A}}_{x f}$ and ${\hat{A}}_{y f}$ as

{\hat{a}}_{f} = N_{f 1}^{†} \begin{pmatrix} {\hat{A}}_{x f} \\ {\hat{A}}_{y f} \end{pmatrix} \quad \text{with} \quad N_{f 1} = \begin{pmatrix} N_{x f 1} \\ N_{y f 1} \end{pmatrix} , \qquad (26)

where $N_{f 1}^{†} = {(N_{f 1}^{T} N_{f 1})}^{- 1} N_{f 1}^{T}$ is the generalized inverse of the matrix $N_{f 1}$. Substituting Eqs. (23a) and (23b) into Eq. (26), we obtain

{\hat{a}}_{f} = N_{f 1}^{†} \begin{pmatrix} A_{x f} + C_{x} A_{x r} \\ A_{y f} + C_{y} A_{y r} \end{pmatrix} . \qquad (27)

Substituting Eqs. (25a) to (25d) into Eq. (27), we have

\begin{aligned} {\hat{a}}_{f} &= N_{f 1}^{†} \begin{pmatrix} N_{x f 1} a_{f} + N_{x f 2} a_{r} + C_{x} N_{x r 2} a_{r} \\ N_{y f 1} a_{f} + N_{y f 2} a_{r} + C_{y} N_{y r 2} a_{r} \end{pmatrix} \\ &= N_{f 1}^{†} \left[ \begin{pmatrix} N_{x f 1} \\ N_{y f 1} \end{pmatrix} a_{f} + \begin{pmatrix} N_{x f 2} \\ N_{y f 2} \end{pmatrix} a_{r} + \begin{pmatrix} C_{x} N_{x r 2} \\ C_{y} N_{y r 2} \end{pmatrix} a_{r} \right] \\ &= a_{f} + T_{Z} a_{r} , \end{aligned} \qquad (28)

where

T_{Z} = N_{f 1}^{†} N_{f 2} + N_{f 1}^{†} N_{c r 2} , \quad N_{f 2} = \begin{pmatrix} N_{x f 2} \\ N_{y f 2} \end{pmatrix} \quad \text{and} \quad N_{c r 2} = \begin{pmatrix} C_{x} N_{x r 2} \\ C_{y} N_{y r 2} \end{pmatrix} .

$T_{Z}$ is referred to as the cross-coupling matrix; it represents the impact of the remaining high-order terms on the estimates of the lower-order terms when the wavefront is reconstructed with Eq. (19). Note that this cross-coupling matrix differs from the cross-coupling matrix derived by Herrmann [12] and from the cross-talk matrix derived by Dai [18].

3.4 Analysis of the numerical orthogonal transformation method

Although Eq. (28) was derived for the case in which Zernike polynomials are used as the basis for expanding the difference fronts, the same derivation applies when the numerical orthonormal polynomials are used as the basis. To analyze the proposed method, by analogy with Eq. (28) we obtain

{\hat{a}}_{f} = a_{f} + T_{H} a_{r} , \qquad (29)

where

T_{H} = H_{f 1}^{†} H_{f 2} + H_{f 1}^{†} H_{c r 2} , \quad H_{f 2} = \begin{pmatrix} H_{x f 2} \\ H_{y f 2} \end{pmatrix} \quad \text{and} \quad H_{c r 2} = \begin{pmatrix} D_{x} H_{x r 2} \\ D_{y} H_{y r 2} \end{pmatrix} .

By analogy with $C_{x} = Z_{x f}^{†} Z_{x r}$ and $C_{y} = Z_{y f}^{†} Z_{y r}$, we have $D_{x} = F_{x f}^{†} F_{x r}$ and $D_{y} = F_{y f}^{†} F_{y r}$. Note that, owing to the orthonormality of $F_{x}$ and $F_{y}$, we have

D_{x} = F_{x f}^{†} F_{x r} = {(F_{x f}^{T} F_{x f})}^{- 1} F_{x f}^{T} F_{x r} = 0 , \qquad (30a)

D_{y} = F_{y f}^{†} F_{y r} = {(F_{y f}^{T} F_{y f})}^{- 1} F_{y f}^{T} F_{y r} = 0 . \qquad (30b)

Substituting Eqs. (30a) and (30b) into the expression for $H_{c r 2}$ gives $H_{c r 2} = 0$, and substituting this result into the expression for $T_{H}$ gives $T_{H} = H_{f 1}^{†} H_{f 2}$.

Obviously, both $T_{Z}$ and $T_{H}$ are zero matrices when $K = J$; under this condition we have ${\hat{a}}_{f} = a_{f}$, that is, the wavefront can be reconstructed without error regardless of the orthogonality of the basis functions used to expand the difference front. However, as discussed in Sec. 1, this condition cannot be met in practice. Consequently, the estimates of the lower-order coefficients ${\hat{a}}_{f}$ are inevitably affected by the high-order coefficients $a_{r}$ as long as $K \neq J$. This problem is an inherent deficiency of modal methods and cannot be eliminated by the proposed method. However, it can be alleviated by the proposed method, because the level of cross coupling of $T_{H}$ is far lower than that of $T_{Z}$, as demonstrated below.
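The difference between $C_{x}$ and $D_{x}$ in Eqs. (23a) and (30a) can be illustrated with a small numerical sketch. The values $J = 9$ and $K = 6$, the grid size, and the helper `fringe_zernikes` are assumptions chosen for the example; `numpy.linalg.pinv` plays the role of the generalized inverse. The sketch covers only the fitting step, not the full cross-coupling matrices $T_{Z}$ and $T_{H}$.

```python
import numpy as np

def fringe_zernikes(x, y):
    """First nine fringe Zernike terms in Cartesian form (unnormalized, assumed)."""
    r2 = x ** 2 + y ** 2
    return np.column_stack([
        np.ones_like(x), x, y, 2 * r2 - 1, x ** 2 - y ** 2, 2 * x * y,
        (3 * r2 - 2) * x, (3 * r2 - 2) * y, 6 * r2 ** 2 - 6 * r2 + 1,
    ])

s = 0.1
g = np.linspace(-1.0, 1.0, 64)
X, Y = np.meshgrid(g, g)
mask = ((X + s) ** 2 + Y ** 2 <= 1.0) & ((X - s) ** 2 + Y ** 2 <= 1.0)
Z = fringe_zernikes(X[mask], Y[mask])

Q = np.linalg.cholesky(Z.T @ Z / Z.shape[0]).T   # Eq. (9)
F = Z @ np.linalg.inv(Q)                         # orthonormal over the points

K = 6                                            # terms kept in the fit
C_x = np.linalg.pinv(Z[:, :K]) @ Z[:, K:]        # Zernike basis: leakage from dropped terms
D_x = np.linalg.pinv(F[:, :K]) @ F[:, K:]        # orthonormal basis: no leakage
```

In this sketch `C_x` has clearly nonzero entries, because the Zernike terms are not orthogonal over the overlap points, whereas `D_x` vanishes to rounding error.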

4. Numerical simulation

4.1 Simulation condition

To verify the proposed method and the theoretical analysis above, numerical simulations were carried out. A digitized wavefront $W (x_{n}, y_{n})$ within a circular pupil was generated on a $256 \times 256$ square grid from the first 20 fringe Zernike polynomials, that is, $J = 20$. The corresponding coefficients were generated randomly, but the last five were multiplied by $10^{- 1}$. This attenuation reflects the assumption that all terms contributing significantly to the wavefront are contained in the first 15 terms and that the contributions of the last five terms are very weak by comparison. Noise-free difference-front data in two perpendicular shear directions were calculated with the same shear ratio, $s = 0.1$. The test wavefront and the difference fronts in the two directions are shown in Figs. 2(a) to 2(c), and the Zernike coefficients of the test wavefront are shown in Fig. 2(d). For comparison, the wavefront was reconstructed from the two difference fronts by the Rimmer-Wyant method, Eq. (19), and by the proposed method, Eq. (20).

Fig. 2 Simulation condition, (a) simulated wavefront under test $W (x_{n}, y_{n})$ , (b) difference front $Δ W_{x} (x_{n}, y_{n})$ when the shearing was in the $x$ direction, (c) difference front $Δ W_{y} (x_{n}, y_{n})$ when the shearing was in the $y$ direction and (d) random Zernike coefficients of the wavefront under test.

4.2 Reconstruction without remaining high-order terms

First, we consider the reconstruction without remaining high-order terms, that is, with $K = J = 20$. In this case, every Zernike coefficient of the original wavefront is represented by one or more elements of the shear matrices $N$ and $H$. The dimensions of both shear matrices were $40 \times 19$. The evaluated Zernike coefficients of the original wavefront are shown in Table 1. When $K = 20$, the coefficients evaluated by both the Rimmer-Wyant method and the proposed method are identical to the input coefficients. These results make it clear that both methods can reconstruct the wavefront without error provided that all terms of the original wavefront are included in the analysis, that is, that there are no remaining high-order terms. Unfortunately, this condition can be met only in simulation, because remaining high-order terms are in general unavoidable in the analysis of a practical wavefront, as discussed in Sec. 1.

Table 1. Input and evaluated Zernike coefficients of the original wavefront by the Rimmer-Wyant method (M1) and the proposed method (M2) when the reconstruction was performed without and with the influence of the remaining high-order terms

4.3 Reconstruction under the impact of remaining high-order terms

To examine the impact of the remaining high-order terms on the estimates of the lower-order terms, the wavefront was reconstructed with $K = 15$, so that the last five terms were the remaining high-order terms. The dimensions of both shear matrices $N$ and $H$ were reduced to $30 \times 14$. The evaluated Zernike coefficients of the original wavefront are also shown in Table 1. The values in the “Percentage Error” column were calculated as $r_{j} = (a_{j} - {\hat{a}}_{j}) / a_{j} \times 100$, where $a_{j}$ is the input coefficient and ${\hat{a}}_{j}$ is the corresponding value evaluated by each method. The values of $r_{j}$ are also plotted in Fig. 3(d). The “Percentage Error” column of Table 1 shows that the absolute coefficient errors of the proposed method are all smaller than those of the Rimmer-Wyant method. The proposed method retrieved the input coefficients with differences of less than $1\%$ in most cases, whereas the differences exceed $30\%$ in some cases for the Rimmer-Wyant method.
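The percentage-error formula above can be written as a short helper. The function name `percentage_error` and the sample numbers are hypothetical and are not taken from Table 1.

```python
import numpy as np

def percentage_error(a_input, a_evaluated):
    """r_j = (a_j - a_hat_j) / a_j * 100, evaluated element by element."""
    a_input = np.asarray(a_input, dtype=float)
    a_evaluated = np.asarray(a_evaluated, dtype=float)
    return (a_input - a_evaluated) / a_input * 100.0

# Made-up coefficients, only to show the sign convention of r_j.
r = percentage_error([2.0, -4.0], [1.0, -5.0])
```

Note that $r_{j}$ is undefined for an input coefficient equal to zero, so such terms would need separate handling.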

Fig. 3 Original wavefront and reconstructed results of the two methods: (a) a part of the test wavefront $W_{f}$, (b) the reconstruction result of the Rimmer-Wyant method, (c) the reconstruction result of the proposed method, (d) the percentage error of the Zernike coefficients retrieved by the Rimmer-Wyant method and the proposed method, (e) the difference between the reconstruction result $W_{1}$ of the Rimmer-Wyant method and $W_{f}$, and (f) the difference between the reconstruction result $W_{2}$ of the proposed method and $W_{f}$.

To facilitate the analysis, the wavefront under test is divided into two parts: $W_{f}$, which represents the contributions of the first 15 terms, and $W_{r}$, which represents the contributions of the remaining 5 terms. In this simulation, we evaluate the reconstruction accuracy of $W_{f}$. The wavefront was reconstructed from the evaluated Zernike coefficients shown in the “K=15” column of Table 1. $W_{1}$ and $W_{2}$ denote the reconstruction results of the Rimmer-Wyant method and the proposed method, respectively. Contour plots of these wavefronts and of the reconstruction errors of the two methods are shown in Fig. 3. As Figs. 3(e) and 3(f) show, the reconstruction error of the proposed method is clearly smaller than that of the Rimmer-Wyant method.

4.4 Comparison of RMS and PV value

The root mean square (RMS) and peak-to-valley (PV) values were used to characterize the reconstruction accuracy. The RMS and PV values of the original wavefront and of the reconstruction errors of the two methods are shown in Table 2. The RMS reconstruction error is $6.25\%$ for the Rimmer-Wyant method and $0.81\%$ for the proposed method. In other words, the remaining error of the proposed method is about $1 / 8$ of that of the Rimmer-Wyant method. Moreover, Table 2 shows that the PV reconstruction error of the proposed method is about $1 / 10$ of that of the Rimmer-Wyant method. In brief, the reconstruction accuracy of the proposed method is superior to that of the Rimmer-Wyant method, which is also confirmed by Figs. 3(a) to 3(c). Note that the reconstruction accuracy changes with parameters such as the number $K$ of Zernike polynomials used in the reconstruction and the shear ratio $s$, but the fact that the proposed method is more accurate does not change.
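The RMS and PV figures of merit can be sketched as follows. The text does not state whether the mean is removed before the RMS is taken or how the percentage is normalized, so this sketch assumes a mean-removed RMS and a relative error defined as the RMS of the error map divided by the RMS of the reference wavefront. The function names and the sample data are hypothetical.

```python
import numpy as np

def rms_value(w):
    """RMS of the samples after removing their mean (assumed convention)."""
    w = np.asarray(w, dtype=float)
    return np.sqrt(np.mean((w - w.mean()) ** 2))

def pv_value(w):
    """Peak-to-valley value: maximum minus minimum of the samples."""
    w = np.asarray(w, dtype=float)
    return w.max() - w.min()

def relative_rms_error(reference, reconstruction):
    """RMS of the error map as a percentage of the RMS of the reference."""
    reference = np.asarray(reference, dtype=float)
    reconstruction = np.asarray(reconstruction, dtype=float)
    return rms_value(reconstruction - reference) / rms_value(reference) * 100.0

# Made-up samples standing in for the pupil values of W_f and a reconstruction.
w_ref = np.array([1.0, -1.0, 1.0, -1.0])
w_rec = 1.1 * w_ref
err_percent = relative_rms_error(w_ref, w_rec)
```

For real data, only the samples inside the pupil would be passed to these helpers.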

Table 2. RMS and PV values of the original wavefront and of the reconstruction error

4.5 Evaluation of the cross-coupling matrix

To explain why the reconstruction accuracy of the proposed method is superior to that of the Rimmer-Wyant method in the presence of remaining high-order terms, the cross-coupling matrices $T_{Z}$ and $T_{H}$ of the two methods were evaluated; they are shown in Figs. 4(a) and 4(b). As discussed in Sec. 3, the cross-coupling matrix represents the impact of the remaining high-order terms on the estimates of the lower-order terms. For example, as shown in Fig. 4(a), the high-order coefficient $a_{16}$ affects the estimates of all the lower-order coefficients, especially $a_{14}$ and $a_{15}$, when the Rimmer-Wyant method is used. When the proposed method is used, $a_{16}$ affects only $a_{4}$ and $a_{9}$, and to a lesser degree, as shown in Fig. 4(b).

Fig. 4 Cross-coupling matrices of the two methods: (a) the cross-coupling matrix $T_{Z}$ of the Rimmer-Wyant method, (b) the cross-coupling matrix $T_{H}$ of the proposed method.

Conversely, the cross-coupling matrix also shows how sensitive the outcome of each lower-order term is to the remaining high-order terms. For example, as shown in Fig. 4(a), when the Rimmer-Wyant method is used the evaluation of the coefficient $a_{13}$ is sensitive to almost all of the five remaining terms, especially $a_{16}$ and $a_{19}$. When the proposed method is used, the evaluation of $a_{13}$ is not sensitive to any of the five remaining terms, as can be seen from Fig. 4(b). The calculation error of $a_{13}$ is therefore far smaller for the proposed method than for the Rimmer-Wyant method, which is also confirmed by Table 1.

In summary, Fig. 4 clearly shows that the level of cross coupling in $T_{H}$ is far below that in $T_{Z}$. The proposed numerical orthogonal transformation method therefore reduces the sensitivity of the lower-order terms to the remaining high-order terms, which leads to a smaller remaining error and a more accurate reconstruction than the Rimmer-Wyant method.
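The general mechanism behind cross coupling can be illustrated with a one-dimensional toy problem. The sketch below does not compute the paper's $T_{Z}$ or $T_{H}$, whose definitions involve the shear matrices of Sec. 3. It only shows how a truncated least-squares fit lets an omitted term leak into the retained coefficients when the basis is not orthogonal over the sample points, and how the leakage vanishes once the basis is orthogonalized over those points. The Legendre polynomials sampled on $[0, 1]$, the Gram-Schmidt step, and all function names are assumptions chosen for illustration.

```python
def dot(u, v):
    """Discrete inner product over the sample points."""
    return sum(a * b for a, b in zip(u, v))

def fit_two_terms(b0, b1, data):
    """Least-squares coefficients of data on basis vectors b0 and b1
    (2x2 normal equations solved in closed form)."""
    g00, g01, g11 = dot(b0, b0), dot(b0, b1), dot(b1, b1)
    r0, r1 = dot(b0, data), dot(b1, data)
    det = g00 * g11 - g01 * g01
    return ((g11 * r0 - g01 * r1) / det, (g00 * r1 - g01 * r0) / det)

def orthogonalize(vectors):
    """Gram-Schmidt over the sample points; returns unit-norm vectors."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(q, w)
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = dot(w, w) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis

# Sample points covering only part of [-1, 1]: a hypothetical stand-in
# for a region on which the original basis is no longer orthogonal.
n = 101
x = [i / (n - 1) for i in range(n)]

# Legendre polynomials: orthogonal on [-1, 1], but not on these points.
p0 = [1.0 for _ in x]
p1 = list(x)
p2 = [(3.0 * xi * xi - 1.0) / 2.0 for xi in x]

# Retain p0 and p1, omit p2: how much of p2 leaks into the retained terms?
coupling_raw = fit_two_terms(p0, p1, p2)

# Same question after orthogonalizing all three vectors on the points.
q0, q1, q2 = orthogonalize([p0, p1, p2])
coupling_orth = fit_two_terms(q0, q1, q2)

print("non-orthogonal basis:", coupling_raw)   # clearly nonzero
print("orthogonalized basis:", coupling_orth)  # zero to rounding error
```

In this toy case the coupling disappears completely because only a direct fit is involved. In the shearing problem the conversion from difference-front coefficients to wavefront coefficients adds further coupling, which is why $T_{H}$ in Fig. 4(b) is small rather than exactly zero.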

4.6. Comparison of the computation time

To compare the computation time of the proposed method with that of the Rimmer-Wyant method, several simulations were run with different sample sizes. The computation time of the proposed method was divided into two parts: the time to perform the numerical orthogonal transformation, and the time to reconstruct the wavefront with the new shear matrices and numerical orthogonal polynomials. The total reconstruction time is the sum of the two parts. Table 3 clearly shows that the proposed method is faster than the Rimmer-Wyant method. The reason is that obtaining the orthogonal coefficients of the difference fronts takes less time than obtaining their Zernike coefficients: the orthogonal coefficients of the difference front in the x direction are obtained directly as ${\hat{A}}_{x} = F_{x}^{T} \Delta W_{x} / N$, whereas the Zernike coefficients are obtained by least-squares fitting as ${\hat{A}}_{x} = {(Z_{x}^{T} Z_{x})}^{-1} Z_{x}^{T} \Delta W_{x}$, which requires more computation. The time saved in this step exceeds the time spent on the numerical orthogonal transformation. The calculations were performed in MATLAB (Version 7.8.0) on a personal computer equipped with a 2.80 GHz Pentium 4 processor.
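The difference between the two coefficient formulas can be seen in a small sketch. It assumes that the columns of $F_{x}$ satisfy $F_{x}^{T} F_{x} = N I$, which is what makes the projection $F_{x}^{T} \Delta W_{x} / N$ a least-squares fit with no linear system to solve. The paper builds $F_{x}$ with its own numerical orthogonal transformation (Sec. 2); the Gram-Schmidt routine, the two-term basis, the one-dimensional sample grid, and the test data below are hypothetical stand-ins.

```python
import math

def dot(u, v):
    """Discrete inner product over the sample points."""
    return sum(a * b for a, b in zip(u, v))

def orthonormalize(columns):
    """Gram-Schmidt so that dot(f_i, f_j) equals N for i == j and 0 otherwise,
    where N is the number of samples."""
    n = len(columns[0])
    basis = []
    for col in columns:
        w = list(col)
        for f in basis:
            c = dot(f, w) / n
            w = [wi - c * fi for wi, fi in zip(w, f)]
        scale = math.sqrt(n / dot(w, w))
        basis.append([wi * scale for wi in w])
    return basis

def least_squares_two_terms(z0, z1, data):
    """Solve the 2x2 normal equations (Z^T Z) a = Z^T data in closed form."""
    g00, g01, g11 = dot(z0, z0), dot(z0, z1), dot(z1, z1)
    r0, r1 = dot(z0, data), dot(z1, data)
    det = g00 * g11 - g01 * g01
    return ((g11 * r0 - g01 * r1) / det, (g00 * r1 - g01 * r0) / det)

n = 200
x = [i / (n - 1) for i in range(n)]
z0 = [1.0] * n                                    # non-orthogonal basis columns
z1 = list(x)
delta_w = [math.sin(3.0 * xi) + 0.5 for xi in x]  # made-up difference data

# Route 1: least-squares fit on the non-orthogonal basis.
a0, a1 = least_squares_two_terms(z0, z1, delta_w)
fit_lsq = [a0 * u + a1 * v for u, v in zip(z0, z1)]

# Route 2: plain projection on the orthonormalized basis, F^T dW / N.
f0, f1 = orthonormalize([z0, z1])
b0 = dot(f0, delta_w) / n
b1 = dot(f1, delta_w) / n
fit_proj = [b0 * u + b1 * v for u, v in zip(f0, f1)]

# Both routes describe the same fitted function.
max_diff = max(abs(p - q) for p, q in zip(fit_lsq, fit_proj))
print("largest difference between the two fits:", max_diff)
```

The coefficients themselves differ between the two routes because they refer to different bases, but the fitted functions agree. The cost of building the orthonormal basis is paid once, in the orthogonal transformation step timed separately in Table 3.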

Table 3. Comparison of Computation Time of the Two Methods for Different Sample Sizes


4.7. Simulation with a general wavefront

Although all of the test wavefronts used above were constructed for a special case, the conclusions can reasonably be extended to the general situation. To support this, another simulation was carried out. The test wavefront was constructed from the first 36 fringe Zernike polynomials, with coefficients taken from the experimental results of Ref. 28. The tilt and defocus coefficients were set to zero. The difference-front data in the two directions were calculated with the same shear ratio $s = 0.1$. The wavefront was reconstructed with the first 31 fringe Zernike polynomials, so the last 5 terms acted as the remaining high-order terms. Note that in this case the coefficients of the last 5 terms were comparable to those of the first 31 terms. The RMS reconstruction error was 11.29% for the Rimmer-Wyant method and 2.98% for the proposed method.

5. Conclusion

The Zernike expansion of a practical wavefront generally has an infinite number of terms, but the dimensions of the shear matrix, which relates the Zernike coefficients of the original wavefront to those of the difference fronts, cannot be extended to infinity. As a result, when the modal method is used to reconstruct a practical wavefront from its difference fronts, the outcomes of the lower-order terms are inevitably affected by the remaining high-order terms. This effect can nevertheless be reduced by the orthogonal transformation, which results in a smaller reconstruction error. This is the central idea of the proposed method, which indeed gives a more accurate result than the Rimmer-Wyant method. In principle, the procedure described in this paper can be applied to reconstruct a wavefront on an aperture of arbitrary shape from its difference fronts, and it can also be applied when other polynomials are used as basis functions to expand the wavefront under test.

Acknowledgments

This work was supported by the National Natural Science Foundation of China under grant no. 60938003.

References and links

1. D. Malacara, Optical Shop Testing, 3rd ed. (CRC Press, Taylor & Francis, 2007).

2. A. Dubra, C. Paterson, and C. Dainty, “Study of the tear topography dynamics using a lateral shearing interferometer,” Opt. Express 12(25), 6278–6288 (2004).

3. Y. Zhu, K. Sugisaki, M. Okada, K. Otaki, Z. Liu, J. Kawakami, M. Ishii, J. Saito, K. Murakami, M. Hasegawa, C. Ouchi, S. Kato, T. Hasegawa, A. Suzuki, H. Yokota, and M. Niibe, “Wavefront measurement interferometry at the operational wavelength of extreme-ultraviolet lithography,” Appl. Opt. 46(27), 6783–6792 (2007).

4. M. P. Rimmer, “Method for evaluating lateral shearing interferograms,” Appl. Opt. 13(3), 623–629 (1974).

5. D. L. Fried, “Least-square fitting a wave-front distortion estimate to an array of phase-difference measurements,” J. Opt. Soc. Am. 67(3), 370–375 (1977).

6. R. H. Hudgin, “Wave-front reconstruction for compensated imaging,” J. Opt. Soc. Am. 67(3), 375–378 (1977).

7. B. R. Hunt, “Matrix formulation of the reconstruction of phase values from phase differences,” J. Opt. Soc. Am. 69(3), 393–399 (1979).

8. J. Herrmann, “Least-squares wave front errors with minimum norm,” J. Opt. Soc. Am. 70(1), 28–35 (1980).

9. X. Liu, Y. Gao, and M. Chang, “A partial differential equation algorithm for wavefront reconstruction in lateral shearing interferometry,” J. Opt. A, Pure Appl. Opt. 11(4), 045702 (2009).

10. S. Okuda, T. Nomura, K. Kamiya, H. Miyashiro, K. Yoshikawa, and H. Tashiro, “High-precision analysis of a lateral shearing interferogram by use of the integration method and polynomials,” Appl. Opt. 39(28), 5179–5186 (2000).

11. P. Liang, J. Ding, Z. Jin, C. S. Guo, and H. T. Wang, “Two-dimensional wave-front reconstruction from lateral shearing interferograms,” Opt. Express 14(2), 625–634 (2006).

12. J. Herrmann, “Cross coupling and aliasing in modal wavefront estimation,” J. Opt. Soc. Am. 71(8), 989–992 (1981).

13. K. R. Freischlad and C. L. Koliopoulos, “Modal estimation of a wave front from difference measurements using the discrete Fourier transform,” J. Opt. Soc. Am. A 3(11), 1852–1861 (1986).

14. M. P. Rimmer and J. C. Wyant, “Evaluation of large aberrations using a lateral-shear interferometer having variable shear,” Appl. Opt. 14(1), 142–150 (1975).

15. W. Shen, M. W. Chang, and D. S. Wan, “Zernike polynomial fitting of lateral shearing interferometry,” Opt. Eng. 36(3), 905–913 (1997).

16. G. Harbers, P. J. Kunst, and G. W. R. Leibbrandt, “Analysis of lateral shearing interferograms by use of Zernike polynomials,” Appl. Opt. 35(31), 6162–6172 (1996).

17. H. van Brug, “Zernike polynomials as a basis for wave-front fitting in lateral shearing interferometry,” Appl. Opt. 36(13), 2788–2790 (1997).

18. G.-M. Dai, “Modal wavefront reconstruction with Zernike polynomials and Karhunen-Loève functions,” J. Opt. Soc. Am. A 13(6), 1218–1225 (1996).

19. W. H. Southwell, “Wave-front estimation from wave-front slope measurements,” J. Opt. Soc. Am. 70(8), 998–1006 (1980).

20. J. C. Wyant and K. Creath, Basic Wavefront Aberration Theory for Optical Metrology, Vol. XI of Applied Optics and Optical Engineering Series (Academic, 1992), 28.

21. V. N. Mahajan, “Zernike annular polynomials for imaging systems with annular pupils,” J. Opt. Soc. Am. 71(1), 75–85 (1981).

22. R. Upton and B. Ellerbroek, “Gram-Schmidt orthogonalization of the Zernike polynomials on apertures of arbitrary shape,” Opt. Lett. 29(24), 2840–2842 (2004).

23. V. N. Mahajan and G. M. Dai, “Orthonormal polynomials for hexagonal pupils,” Opt. Lett. 31(16), 2462–2464 (2006).

24. G. M. Dai and V. N. Mahajan, “Nonrecursive determination of orthonormal polynomials with matrix formulation,” Opt. Lett. 32(1), 74–76 (2007).

25. V. N. Mahajan and G. M. Dai, “Orthonormal polynomials in wavefront analysis: analytical solution,” J. Opt. Soc. Am. A 24(9), 2994–3016 (2007).

26. G. M. Dai and V. N. Mahajan, “Orthonormal polynomials in wavefront analysis: error analysis,” Appl. Opt. 47(19), 3433–3445 (2008).

27. M. Hasegawa, C. Ouchi, T. Hasegawa, S. Kato, A. Ohkubo, A. Suzuki, K. Sugisaki, M. Okada, K. Otaki, K. Murakami, J. Saito, M. Niibe, and M. Takeda, “Recent progress of EUV wavefront metrology in EUVA,” Proc. SPIE 5533, 27–36 (2004).

28. Y. Zhu, S. Odate, A. Sugaya, K. Otaki, K. Sugisaki, C. Koike, T. Koike, and K. Uchikawa, “Method for designing phase-calculation algorithms for two-dimensional grating phase-shifting interferometry,” Appl. Opt. 50(18), 2815–2822 (2011).

Table 1

j	Input	Evaluated, K = 20 (M1)	Evaluated, K = 20 (M2)	Evaluated, K = 15 (M1)	Evaluated, K = 15 (M2)	Percentage error, K = 15 (M1)	Percentage error, K = 15 (M2)
2	0.0326	0.0326	0.0326	0.0129	0.0339	60.5309	−4.2218
3	0.5525	0.5525	0.5525	0.5235	0.5510	5.2612	0.2750
4	1.1006	1.1006	1.1006	1.1223	1.1165	−1.9687	−1.4454
5	1.5442	1.5442	1.5442	1.5474	1.5442	−0.2096	−0.0000
6	0.0859	0.0859	0.0859	0.0873	0.0859	−1.5875	−0.0000
7	−1.4916	−1.4916	−1.4916	−1.496	−1.4911	−0.2670	0.0324
8	−0.7423	−0.7423	−0.7423	−0.744	−0.7428	−0.2809	−0.0720
9	−1.0616	−1.0616	−1.0616	−1.197	−1.0788	−12.3554	−1.6226
10	2.3505	2.3505	2.3505	2.2496	2.3451	4.2918	0.2269
11	−0.6156	−0.6156	−0.6156	−0.524	−0.6215	14.3200	−0.9577
12	0.7481	0.7481	0.7481	0.7434	0.7481	0.6235	0.0000
13	−0.1924	−0.1924	−0.1924	−0.124	−0.1924	34.3174	0.0000
14	0.8886	0.8886	0.8886	0.7596	0.8900	14.5175	−0.1577
15	−0.7648	−0.7648	−0.7648	−0.8883	−0.7664	−16.1373	−0.2025
16	−0.1402	−0.1402	−0.1402
17	−0.1422	−0.1422	−0.1422
18	0.0488	0.0488	0.0488
19	−0.0177	−0.0177	−0.0177
20	−0.0196	−0.0196	−0.0196

Table 2

Metric	$W$	$W_{f}$	$W_{r}$	$W_{1} - W_{f}$	$W_{2} - W_{f}$
RMS	1.5380	1.5364	0.0713	0.0960	0.0124
PV	8.8069	8.9616	0.4600	0.5492	0.0497

Table 3

Sample size	Rimmer-Wyant method: reconstruction (s)	Proposed method: orthogonal transformation (s)	Proposed method: reconstruction (s)
64×64	0.0168	0.0097	0.0010
128×128	0.0886	0.0331	0.0027
256×256	0.3504	0.1286	0.0104
512×512	1.4348	0.5074	0.0429


Optics Express