A maximum likelihood approach to the inverse problem of scatterometry

Mark-Alexander Henn; Hermann Gross; Frank Scholze; Matthias Wurm; Clemens Elster; Markus Bär

doi:10.1364/OE.20.012771

1. Introduction

Similar to classical techniques such as electron and optical microscopy, scatterometry is widely used to evaluate structure dimensions of diffractive elements in lithography [1] and [2]. Since critical dimensions (CDs) of such elements decrease continuously, extreme ultraviolet (EUV) scatterometry using radiation with wavelengths around 13 nm is becoming more important in the characterization of photomasks and wafers [3,4]. Deep ultraviolet (DUV) scatterometry can additionally be used to measure critical dimensions of the absorbing structure [5, 6] and is less sensitive to variations in the substrate properties, e. g. in the multilayer stack of a typical EUV mask.

However, the solution to the inverse problem of scatterometry, i.e. the determination of the CDs from measurement data is a computationally challenging task, starting with the modelling of the measurement procedure. In several works [7–12] the so called rigorous coupled-wave analysis (RCWA) has been used to simulate the scattering process. Since the RCWA method is limited to gratings that consist of stacks of rectangular structures, the numerical simulation of the scattering process in the present paper is achieved by a more flexible rigorous modelling starting from Maxwell’s equations. Since periodic line structures resp. gratings are the main targets for scatterometric measurements, the model is reduced to the two dimensional Helmholtz equation [13, 14]. In this work a finite element method (FEM), that is also capable of computing electromagnetic fields characterized by spatial oscillations with high frequency, is applied [15–19].

Solving the inverse problem of scatterometry amounts to determining the optical grating whose diffraction pattern fits a given set of measurement data best, a problem related to the optimal design of diffractive optics [20]. The inverse problem of scatterometry is severely ill-posed [21]. However this can be reduced by including a priori information into the treatment.

The surface structures under investigation are typically sought in a certain class of gratings that can be described by a small number of parameters. Each parameter is assumed to lie in a certain interval according to the design specifications of the photomasks under investigation. The inverse problem is then typically solved by finding the parameters resp. critical dimensions that minimize a least-square function, quantifying the difference between the measurement data and the model. The weight factors in the least squares function account for the variances of the measurement errors and therefore represent knowledge of the underlying measurement error model.

The minimization of the least squares function (LSQ) can be done using either the comparison of the measurement data to a library of scatterograms previously calculated for several combinations of input data parameters [22] or using iterative optimization algorithms [23, 24]. In previous publications the reconstructions have been performed either without taking measurement errors into account [7] or with fixed weight factors in the LSQ function [8,23], commonly representing an assumed relative measurement uncertainty of about 1–3%. A comparison of the reconstructed profiles using EUV-scatterometry and the results obtained by atomic force or electron microscopy [25] has revealed that the sidewall angle, an important feature of the EUV-mask, is often severely underestimated by up to 10 degrees in scatterometry. One reason for such a discrepancy could be the presence of systematic errors in the measurement data, e.g. imperfect modelling [23, 26–28].

In the present paper, we start by studying the impact of the weight factors in the LSQ function on the reconstructed profile parameters. When the weight factors, resp. the variances of the measurements are known exactly, weighted LSQ can be recommended for solving the inverse problem. However, determining the variances with sufficient accuracy is time consuming and not always possible. Accurate variance estimates, on the other hand, are crucial for the application of weighted LSQ. A wrong assumption about the corresponding measurement accuracies of the scattering efficiencies can lead to an unacceptable bias in the solution of the inverse problem. To overcome this systematic problem we employ a maximum likelihood estimation (MLE) [29]. This approach quantifies the variances of the scatterometric measurement data by a statistical error model that depends on two parameters. Both, the geometrical parameters of the mask and the parameters of the statistical error model, are treated as variables to be determined simultaneously. In a second step the variances associated with the reconstructed geometric parameters (critical dimensions, sidewall angles) are evaluated employing the Fisher information matrix. This approach overall leads to the necessary consistency between the measurement data and the diffraction pattern of the optimized solution.

After a short description of the measurement set-ups and the investigated photomasks in Sections 2.1 and 2.2, we present in Sections 2.3–2.6 the mathematical tools that are used to model the diffraction and to solve the inverse problem of scatterometry. We furthermore describe the impact of the used measurement error model and a possibility to overcome problems arising from the use of a conventional least-square approach by application of the MLE. In Section 3.2 the evaluation of MLE is demonstrated by means of simulated data. In Sections 3.3 and 3.4 measurement data from EUV scatterometry [30] and from DUV scatterometry respectively are evaluated both by MLE and LSQ.

2. Methods

2.1. Scatterometric measuring setup for the EUV mask

The first object under investigation is a photo mask with a periodic line pattern designed for use in EUV. The cross section profile for one spatial period and a schematic representation of the measurement apparatus used for EUV scatterometry are shown in Fig. 1 below.

Fig. 1 a) scheme of the the spectroscopic reflectometer and b) cross section of the investigated EUV-mask.

Download Full Size | PDF

The cross section of the shown line-space structure is a symmetric polygonal domain composed of three trapezoidal layers of different materials (TaO, TaN, and SiO₂). These trapezoids are defined by the heights of the three layers p_i, i = 1, 6, 11 and by coordinates p_i, i = 2, 3, 7, 8, 12, 13 describing the corner positions. Beneath the line-space structure, there are two capping layers of SiO₂ and of Si on top of a Molybdenum silicide (MoSi) multilayer stack (MLS). The latter stack consists of a periodically repeated structure composed of a Mo layer and a Si layer separated by two intermediate layers. Note, that the MLS is added to enable the reflection of EUV waves. It acts as a Bragg mirror at the design wavelength of about 13 nm. Important geometric profile parameters are the height p₆ of the TaN layer (55 – 60 nm) and the x-coordinates p₂ and p₇ of the right corners of the TaN layer. The complex indices of refraction for the involved materials are given in Table 1 for one of the three wavelengths in the range of 13 nm.

Table 1. Geometrical parameters and optical constants at a wavelength of 13.0 nm of the EUV mask used for simulations, period d=720 nm.

View Table | View all tables in this article

A symmetric profile is imposed, i.e., the x-coordinates of the corresponding left corners depend on those of the right corners such as p₃ = d – p₂ or p₈ = d – p₇, where d is the period of the EUV mask. Furthermore the sidewall angle (SWA) for the TaO layer was fixed. The cross-section area of this trapezoidal layer is equal to a corresponding TaO layer having curved upper edges with a radius of about 6 nm. Additionally, we have assumed that the SWA of the SiO₂ layer is always equal to the SWA of the TaN layer above. The latter angle depends on the corners and the height of the TaN layer: $tan (SWA) = \frac{p_{6}}{p_{2} - p_{7}}$ . The geometrical features of main interest, i.e., the critical dimensions (CDs) to be determined by scatterometry are the height, top width and bottom width of the absorbing structure that depend on the parameters p₆, p₇ and p₂ (cf. Fig. 1). In the following we will refer to these parameters as height, top CD and bottom CD. An important quantity that derives from these three parameters is the sidewall angle (SWA) of the TaN absorber layer with a design value of 90°. In our evaluations all remaining parameters were set to the values given in Table 1 that represent the manufacturer’s design values [31].For a wavelength in the range of 13 nm the complex indices of refraction for the involved materials are given in Table 1 too.

The experimental data used in this work were measured with a EUV spectroscopic reflectometer in the synchrotron radiation laboratory of PTB at BESSY [30]. The beam line provides monochromatic radiation in the spectral range from 0.7 nm to 35 nm, including the EUV spectral range around 13.5 nm. The beam size at the sample is about 1 mm in both directions and the beam divergence is below 0.5 mrad. The area of interest is placed in the incoming photon beam and either wavelength, angle of incidence or the angle of detection are scanned. The measurements used here are obtained by scanning the detector angle in-plane for three different wavelengths and a fixed angle of incidence of 6°. For further processing we only used the measured diffraction efficiencies for the discrete diffraction orders, no diffusely scattered radiation. A typical set of measurement data consists of 69 to 75 efficiencies for diffraction orders in the range of −10 to 14.

2.2. Scatterometric measuring setup for the MoSi mask

The second object under investigation is a state-of-the-art molybdenum silicide (MoSi) photomask. Its cross section profile for one period and a scheme of the measurement set-up are shown in Fig. 2 below, its optical constants can be found in Table 2.

Fig. 2 a) scheme of the goniometric reflectometer and b) cross section of the investigated MoSi photomask.

Download Full Size | PDF

Table 2. Geometrical parameters and optical constants at a wavelength of 193 nm of the MoSi mask, period d=560 nm.

View Table | View all tables in this article

The cross section of the line-space structure is a trapezoidal domain made of MoSi based on a glass substrate. The trapezium is completely defined by its height h and by the x-coordinates p_i, i = 1, 2, 3, 4 of its corners. Again a symmetric profile is imposed and the sidewall angle (SWA) of the MoSi absorber layer has a design value of 90°.

The experimental data for the MoSi mask derives from measurements with a DUV goniometric scatterometer. It can be operated in a reflectometric, ellipsometric, and diffractometric measurement mode in a spectral range from 190 nm to 840 nm. A measurement data set in the present case includes reflected and transmitted diffraction orders at seven different incident angles for a laser beam with a wavelength of 193 nm and consists typically of 43 data points, see [6] for further details on the experiment.

2.3. Mathematical model of scatterometry

The mathematical model to describe the propagation of electromagnetic waves in matter used here is based on Maxwell’s equations. From the data of the incident light and from characteristic parameters of the irradiated surface profile, the efficiencies and phase shifts for the different diffraction directions are calculated. The time-harmonic Maxwell equations reduce to the two-dimensional Helmholtz equation if geometry and material properties are assumed to be invariant in one direction [13, 14]

Δ u (x, y) + k^{2} u (x, y) = 0.

In this equation u is the transversal field component that oscillates in the groove direction, resp. along the z-axis (cf. Fig. 2) and k is the wave number

k (x, y) = ω {(μ_{0} ε (x, y))}^{1 / 2},

that is assumed to be constant for areas filled with the same material. The boundary conditions that are imposed on this partial differential equation are the common transmission conditions on the interfaces between domains, quasi-periodic boundary conditions on the lateral boundaries due to the periodic nature of the structure and the usual outgoing wave conditions in the infinite regions [16]. This boundary value problem can be solved with the finite element method (FEM) for elliptic PDEs [32]. We use the software package DIPOG [33] for our investigations, developed by Weierstrass Institute for Applied Analysis and Stochastics (WIAS) in Berlin.

The diffraction pattern with a given parameter vector p = (p₁, . . . , p_n, λ_k, θ_l) characterizing the surface structure by its profile and optical indices p₁, . . . , p_n and the incidental light by its wavelength λ_k and angle of incidence θ_l is called the solution of the forward problem. It can be regarded as a nonlinear operator mapping p to the corresponding diffraction pattern: f(p) = (f₁ (p), . . . , f_m (p)). The model function f is calculated by solving the PDE in Eq. (1).

2.4. Measurement error model

A measurement data set, characterizing the diffraction pattern is usually given by a vector y = (y₁, . . . , y_m), consisting of efficiencies or phase shift differences for different wavelengths, incident angles or polarization states perturbed by measurement noise. The jth data point is a sum of the value of the model function and a noise contribution:

y_{j} = f_{j} (p) + ε_{j} .

Assuming no correlation between the measurements and the absence of systematic errors, the errors for the jth data point are modelled as normally distributed with zero mean, i.e.

ε_{j} \sim 𝒩 (0, σ_{j}^{2}),

where the variances are composed of the variances of two independent terms:

σ_{j}^{2} = {(a \cdot f_{j} (p))}^{2} + b^{2} .

The first term (a · f_j (p))² indicates the contribution of a linearly dependent noise. From an experimental point of view, power fluctuations of the incidental beam during the recording of the diffraction patterns are the main determinant for the values of a. The second term b² is the contribution of the background noise independent of the measured light intensities.

2.5. The solution of the inverse problem

2.5.1. Least squares method

The inverse problem of scatterometry is typically [23, 24] formulated as a least squares regression problem (LSQ) such that

{‖ f (\hat{p}) - y ‖}^{2} = min_{p} .

The squared norm in Eq. (6) is given by

χ^{2} (p) = {‖ f (p) - y ‖}^{2} = \sum_{j = 1}^{m} ω_{j} {[f_{j} (p) - y_{j}]}^{2} .

The weight factors ω_j are usually chosen to be the reciprocal variances of the measurement values (cf. Eq. (5)). The solution to the inverse problem is then found by minimizing the resulting weighted least squares function. Below, this is achieved with the DIPOG software that employs a Gauß-Newton type iterative algorithm. Note that this method requires knowledge of the noise level of the measurement process. This information is, of course, not always available, and the question arises how an inappropriate choice of the weights influences the reconstruction.

2.5.2. Maximum likelihood estimation

The additional parameters a and b can not be determined by LSQ. We employ the maximum likelihood estimation (MLE) [29] for this purpose. From the assumption that the measurement errors are normally distributed we obtain the likelihood function as a function of a, b and the geometry parameters p for given measurement data y [34]:

𝒧 (a, b, p) = \prod_{j = 1}^{m} {(2 π ({(a \cdot f_{j} (p))}^{2} + b^{2}))}^{- 1 / 2} exp [- \frac{{(f_{j} (p) - y_{j})}^{2}}{2 {((a \cdot f_{j} (p))}^{2} + b^{2}}]

The maximum likelihood estimator is the parameter combination that maximizes this function, i.e.

{\hat{θ}}_{ML} = (\hat{a}, \hat{b}, \hat{p}) = arg max_{a, b, p} 𝒧 (a, b, p) .

The above problem (9) is solved with a gradient-based trust region algorithm implemented in MATLAB^®, DIPOG [33] is in this case only used as a black box to evaluate the model function f (p), i.e. to solve the forward problem and also to calculate the partial derivatives of f w.r.t. the parameters p_i.

2.6. Variance estimation

2.6.1. Least squares method

It is clear that the variances of the input data have an influence on the variances of the reconstructed parameters. A higher variance in the measurement noise typically leads to a higher variance of the reconstructed parameter values. One way to estimate these variances is to calculate the approximative covariance matrix as proposed in [24, 26]. If we assume that the model function f is approximately linear in the relevant regions of the parameter values p_i, then the errors of the reconstructed parameters are, again, normally distributed random numbers with zero mean. The standard deviations u_i of the quantities p_i are given by the square root of the main diagonal entries of the covariance matrix Σ of the parameters. The matrix Σ can be approximated as

\begin{matrix} Σ \approx {[J^{T} \cdot J]}^{- 1}, & J = {(\frac{\partial f_{j}}{\partial p_{i}} \frac{1}{σ_{j}})}_{i, j}, \end{matrix}

hence

u (p_{i}) = {(Σ_{i, i})}^{1 / 2}

Note that in earlier work [23,24,26] the scaling factors σ_j in Eq. (10) were chosen according to the predefined error model. A modified approach to the variance estimation for LSQ is to chose the scaling factors according to the resulting residuals of the optimal solution, based on the following reasoning. A consistent solution p̂ to the optimization problem (cf. Eq. (7)) should pass the χ²-test, namely

χ_{min}^{2} = \sum_{j = 1}^{m} σ_{j}^{- 2} {[f_{j} (\hat{p}) - y_{j}]}^{2} \in [\begin{matrix} χ_{ν, α / 2}^{2}, & χ_{ν, 1 - α / 2}^{2} \end{matrix}]

where ν = m – n denotes the degrees of freedom and α = 5 ·10⁻² e.g.. If this is not the case we can fulfill the criterium of Eq. (12) by rescaling the variances of the input data resp. the weights with some scaling factor κ:

{\hat{σ}}_{j}^{2} = σ_{j}^{2} κ,

chosen such that the rescaled

χ_{min}^{2}

equals ν. Note, that this rescaling of the weights does not affect the result of the optimization given by the parameter values describing the minimum of the function in Eq. (7). In contrast, it increases only the variances of the reconstructed parameters. In this work LSQ refers to the minimization of the function in Eq. 6 followed by a rescaling of the variances according to Eq. 13. In [23, 35, 36] it has been shown that the variances of the reconstructed parameters calculated using the above approximation are comparable to those obtained using a more time-consuming Monte Carlo type method under the assumption of known error model parameters a and b and sufficient linearity of the model function f.

2.6.2. Maximum likelihood estimation

Asymptotically the standard error for the maximum likelihood estimator can be expressed in terms of the negative second derivative of the logarithm of the likelihood function, the so-called Fisher-information matrix [34]

I = {(- \frac{\partial^{2} log 𝒧}{\partial θ_{i} \partial θ_{j}})}_{i, j} .

If we denote by θ̂ the maximum likelihood estimator, then its standard error can be calculated as

u ({\hat{θ}}_{i}) = {(Σ_{i, i})}^{1 / 2}, with Σ = I^{- 1} .

3. Results

3.1. Dependency of the reconstruction on the chosen weight factors for the EUV measurement setup

In this section, we solve the reconstruction problem using simulated data that are superposed by a noise representing the variances of the input values that are parametrized by a and b. Different methods to solve the inverse problems are used and compared. We start with a least-squares approach assuming fixed uncertainty parameters a and b. Figure 3 shows the values of the χ²-function defined in Eq. (7) for a simulated data set perturbed by a noise contribution with a = 10% and b = 10⁻³ (cf. Eq. (5)) in dependence on the bottom CD and top CD for two different noise models, i.e. two different weightings in the least squares function from Eq. (7). Figure 4 shows the dependency of the reconstructed CDs on the ratio b/a for the same analysis. Note, that the reconstruction results depend only on the ratio of b to a and not on the absolute values of the two parameters.

Fig. 3 χ² in dependence on bottom CD and top CD for different ratios b/a.

Download Full Size | PDF

Fig. 4 a) reconstructed CDs and b) reconstructed SWA in dependence on ratio b/a for a simulated data set (SWA values are shown with approximate 95 % confidence intervalls based on LSQ variance estimation).

Download Full Size | PDF

The geometrical parameters of the mask used in the simulations were set to a bottom CD of 550 nm and a top CD of 546.9 nm, corresponding to a sidewall angle of 90° for the TaN and the SiO₂ layer and 82.6° for the TaO layer on top of the absorber line (cf. Fig. 1). In Fig. 3 one clearly sees how the minimum (the blue dot) shifts upon change of the ratio b/a resulting in two different solutions to the inverse problem. This dependency of course also has an impact on the reconstructed sidewall angle (see Fig. 4). One sees that the reconstructed value of the top CD (bottom CD) decreases (increases) by several nanometers as the value of b/a in the reconstruction is chosen by one order of magnitude bigger than the true value. It is this sensitivity that suggested to us to treat the noise parameters a and b as additional variables that need to be reconstructed as well, since a wrong or incomplete assessment regarding the variances of the input parameters (scattering efficiencies) will not only lead to an under- or overestimation of the variances of the output parameters (reconstructed geometry parameters) but also causes significant systematic errors in the results. This needs to be considered especially in the case of a newly set-up experiment when the knowledge of the variances of the measurement errors is usually incomplete.

3.2. Application to simulated EUV data

In the following, MLE is applied to data sets obtained by simulating the diffraction pattern of an EUV-mask with bottom CD=550 nm, top CD=546.9 nm and height=58 nm. The results are compared to those of a LSQ method applied to the same numerically generated data sets. The parameters used in the simulation are similar to the ones used in the actual EUV measurements on such masks. The −10th to 12th diffraction order of a laser beam at a fixed angle of incidence of 6° are numerically computed for three different wavelengths of the incoming EUV radiation, resulting in a set of 3 × 23 data points. The simulated data have been perturbed assuming a normally distributed measurement error with a = 10% and b = 10⁻² (cf. Eq. (5)) resulting in a total of 50 “noisy” datasets. The weights for the LSQ in Eq. (7) were set to a = 1% and b = 10⁻³ representing a typical estimate of the variances in the real measurement processes used in earlier publications [23]. For MLE those noise parameters were treated as unknowns and hence needed to be reconstructed as well.

The results for the ratio b/a along with the approximate 95% confidence intervals based on the variance estimation described in Section 2.6 are presented in Fig. 5. MLE is found to be capable to reconstruct the noise parameter a from a data set of limited size with a typical relative error of 10 – 20 %, while errors in b can be substantially larger. Figure 6 below presents the reconstructed sidewall angles along with the approximate 95% confidence intervals for the solutions of the two methods. Note that there is a systematic shift of about 1.5° between the actual value of 90° and the mean estimated sidewall angle obtained by LSQ, while the mean estimated sidewall angle obtained by MLE is almost identical to the actual value. This bias is observed because of the nonlinearity of the mathematical model. It is found to vanish for the present model if the “correct” weights are chosen.

Fig. 5 a) reconstructed noise parameter a and b) reconstructed ratio b/a with approximate 95% confidence intervals for MLE, the green dotted lines represent the actual values, the red dotted lines represent the mean values.

Download Full Size | PDF

Fig. 6 Comparison of the reconstructed sidewall angles with approximate 95% confidence intervals for the solutions a) for LSQ and b) for MLE, the green dotted lines represent the actual values, the red dotted lines represent the mean values.

Download Full Size | PDF

For a consistent reconstruction the true value of the geometry parameter should lie within the 95% confidence interval around the reconstructed value. Figure 6 shows that the percentage of consistent reconstructed sidewall angles is comparable for both methods (92% for LSQ, 94% for MLE). However the root mean square deviations (RMSD) of the reconstructed values from the true values as shown in Fig. 7 are about twice as high for LSQ as they are for MLE.

Fig. 7 Comparison of the RMSD and mean estimated standard deviations in % of the actual value a) for LSQ and b) for MLE.

Download Full Size | PDF

3.3. Application to measured EUV data

In the following we apply the maximum likelihood estimation to measurement data from EUV scatterometry at PTB. The EUV mask under investigation is structured into 121 dies that contain each two different scatterometry fields with periodic line and space structures. While the dies D4, D8, F6 and H8 have identical design values, dies H4, H5 and H6 have different periods but are assumed to have the same bottom CD, for detailed values see Table 3. All dies share the remaining geometrical and structural parameters given in Table 1.

Table 3. Design values of the EUV mask.

View Table | View all tables in this article

Note that the dependence of the reconstructed solution on the choice of weights shown in Fig. 8 has the same qualitative shape as the one obtained for simulated input data in Fig. 4. The reconstructed noise parameters a and the ratios b/a for all measured data sets are plotted in Fig. 9. The mean estimated value for b/a lies around 1.2·10⁻², a value that differs significantly from the value of ca. 3 ·10⁻² – 10⁻¹ (a = 1 − 3%, b = 10⁻³) used for the LSQ method in previous publications [23], the mean estimated value for the background noise b of about 2 ·10⁻³ agrees quite well with the value given by the experimenters, the relative error a lies around 16%, which is about a magnitude larger than expected. Similar to the observations in the case of simulated input data we observe a systematic shift of the sidewall angles reconstructed using MLE towards the design value of 90° compared to the LSQ solutions with fixed weights, see Fig. 10. Note that the mean sidewall angle obtained by SEM in [31] was approximately 86°.

Fig. 8 a) reconstructed CDs and b) reconstructed SWA in dependence on ratio b/a for measurement data set D4.

Download Full Size | PDF

Fig. 9 a) reconstructed noise parameter a and b) reconstructed ratio b/a with approximate 95% confidence intervals for measured data from the EUV scatterometer, the dotted lines represent the mean values of the reconstructed values.

Download Full Size | PDF

Fig. 10 Reconstructed sidewall angles with approximate 95% confidence intervals for measured data from EUV scatterometry, the dotted lines represent the mean values for the two methods.

Download Full Size | PDF

3.4. Application to measured DUV data

Finally, we apply the MLE approach to measurement data from the DUV scatterometer. The dataset consists of eleven measurements on different mask fields, that were fabricated with identical design specification, e. g. at a period of 560 nm [37]. The MLE results for the parameters a and b characterizing the uncertainty of the input data with their approximate 95% confidence intervals are presented in Fig. 11. The parameter b representing the background noise of the measurement setup lies around 3 ·10⁻², which is in a fair agreement with the value of 4 ·10⁻² given by the experimenters in [38]. Remember that the sensitivity of the reconstructed geometrical parameters decreases with respect to the ratio b/a (cf. Fig. 4 and 8), hence the geometrical features obtained by LSQ with fixed weights a = 2.5% and b = 4 ·10⁻² (b/a = 1.6) are almost identical to those obtained by MLE (b/a ≈ 1), both in terms of consistency and variance, as shown in Fig. 12.

Fig. 11 a) reconstructed noise parameter a and b) reconstructed ratio b/a with approximate 95% confidence intervals for measured data from DUV scatterometry, the dotted lines represent the mean values of the reconstructed values.

Download Full Size | PDF

Fig. 12 Reconstructed sidewall angles with approximate 95% confidence intervals for measured data from the DUV scatterometer a) for LSQ and b) for MLE, the dotted lines represent the mean values.

Download Full Size | PDF

4. Discussion and conclusions

Scatterometry is a technology that is employed in dimensional measurements of nanostructures. It has great potential and offers advantages over the more common scanning microscopy methods with respect to speed and non-destructiveness. It is, however, a non-imaging method that requires the solution of an inverse problem involving a mathematical model of the scattering process at the investigated nanostructures. In other words, the measured scattering efficiencies (input data) have to be transformed into the desired critical dimensions (output data) of the objects under study. The accuracy of scatterometric measurements of critical dimensions depends not only on the accuracy of the input data, but also on the accuracy and correctness of the mathematical model.

In the present paper, we have analysed in detail the influence of the applied statistical method and found that the results for the critical dimensions are substantially improved if a maximum likelihood estimation (MLE) method is used instead of the standard least squares (LSQ) approach used in earlier papers on the subjects [8, 23, 24]. By using simulation data, we have demonstrated the sensitivity of the LSQ approach with respect to the used statistical model. An inappropriate choice of the weights in the least squares function that accounts for the measurement errors of the input data can cause strong systematic deviations on the reconstructed critical dimensions and the resulting sidewall angles. Therefore an alternative method (MLE) has been proposed here, where the parameters modelling the measurement errors have been included as variables in the optimization process.

First the capability of the method to solve the inverse problem has been investigated by applying it to simulated data sets with known variances of the input data. It has been shown that the MLE procedure can reconstruct the geometrical parameters CDs and the statistical parameters that model the measurement errors with sufficient accuracy. The variances of the reconstructed CDs were approximated by a computation of the variances from the Fisher information matrix in the MLE procedure.

Second MLE has been applied to various sets of data from measurement of different photomasks obtained by EUV and DUV scatterometry. It has been shown that the inclusion of the parameters of the error model into the optimization improves the reconstruction of the mask’s geometry and leads to a much better agreement between results of the optimized and the corresponding measurement data. The obtained knowledge of the variances allows also a more realistic estimate of the accuracy of the reconstructed parameters.

The first seven experimental data sets were obtained by using EUV scatterometry with wavelengths in the range 13 – 14 nm to determine the dimensions of EUV photomask gratings with a fixed period of 720 nm, CDs between 140 and 550 nm and profile heights around 70 nm. Application of MLE yielded relative variances of 10 – 20 % superposed by absolute background noises in the range 1–2·10⁻³ for the measurement data. The resulting variances of the CDs were found to be in the range of 2 – 3 nm, whereas the height variances are ca. 0.5 nm. The sidewall angles were systematicly larger than with application of the LSQ method and showed better agreement with the design values as well as with independent measurement with scanning microscopy [31].

The second eleven data sets were obtained by DUV scatterometry for different arrays on a MoSi mask with identical periods of 560 nm, CDs around 250 nm and profile height around 70 nm. Application of MLE yielded relative variances of 3 % superposed by absolute background noises in the range of 3·10⁻². The variances of the CDS were found to be in the range of 1 – 2 nm and the height variances were determined to be 0.5 nm.

The analysed measurement data achieved with the EUV scatterometer are less noisy than those achieved with the DUV scatterometer. The relatively high variances for the geometrical parameters of the EUV mask are presumably caused by the much higher sensitivity of the EUV mask to systematic errors stemming from oversimplifications in the mathematical model (e. g. assumption of perfect periodic line structure without roughness) as well as from incomplete knowledge of crucial model parameters (e. g. the periods of the multilayer). In contrast, the MoSi mask has a much simpler structure and is therefore more robust against model errors. Nevertheless in both cases, accuracies in the 1 nm range are within reach. The MLE method suggested in this paper is sufficient to treat the MoSi mask in a DUV scatterometer. A reliable treatment of EUV masks in an EUV scatterometer will require further improvement of the mathematical modelling and inverse methods.

For EUV scatterometry, the reconstructed noise parameters represent a relative measurement noise of about 10–20%, a value that is much higher than the references given by the experimenters. A simple error model as proposed in this paper tends to compensate errors caused by an inappropriate modelling of the photomask by overestimating the measurement noise. Systematic errors due to inappropriate modelling can have many sources [26] and may play a crucial part in the treatment of the inverse problem. For future work, it is therefore of utmost importance to incorporate systematic errors, in particular roughness [27] and deviations in the multilayer structure [23] into the modelling and in the MLE procedure employed in the solution of the inverse problem. A second possibility of further improvement of the method is given by the inclusion of additional knowledge about the object under study obtained by alternative measurement methods. Such additional information can be included as prior knowledge of the inverse problem of scatterometry in terms of a Bayesian framework [29, 39, 40].

In summary, we have shown with a large number of simulated data sets that an MLE approach to the inverse problem of scatterometry avoids strong systematic deviations that can be inherent in the LSQ method used in earlier work. The MLE method was then applied to measurement data, where it yielded a removal of systematic shifts in the result that appear if LSQ is used. Moreover, consistent results with substantially smaller measurement errors for the critical dimensions than the LSQ method were obtained. Hence, compared to LSQ, MLE avoids a potential overestimation of errors and it can be easily extended to include known systematic errors.

Acknowledgments

The presented results are part of the CDuR32 project funded by the German Federal Ministry of Education and Research. The authors particularly thank A. Kato and A. Rathsfeld for many fruitful discussions. This research is supported by the European Metrology Research Program(EMRP) in Project IND17. The EMRP is jointly funded by the EMRP participating countries within EURAMET and the European Union.

References and links

1. H. Huang and F. Terry Jr., “Spectroscopic ellipsometry and reflectometry from gratings (scatterometry) for critical dimension measurement and in situ, real-time process monitoring,” Thin Solid Films 455, 828–836 (2004). [CrossRef]

2. C. Raymond, M. Murnane, S. Prins, S. Sohail, H. Naqvi, J. McNeil, and J. Hosch, “Multiparameter grating metrology using optical scatterometry,” J. Vac. Sci. Technol. B 15, 361–368 (1997). [CrossRef]

3. J. Perlich, F. Kamm, J. Rau, F. Scholze, and G. Ulm, “Characterization of extreme ultraviolet masks by extreme ultraviolet scatterometry,” J. Vac. Sci. Technol. B 22, 3059 (2004). [CrossRef]

4. F. Scholze and C. Laubis, “Use of EUV scatterometry for the characterization of line profiles and line roughness on photomasks,” Proc. SPIE 6792, 6792OU (2008).

5. M.-A. Henn, R. Model, M. Bär, M. Wurm, B. Bodermann, A. Rathsfeld, and H. Gross, “On numerical reconstructions of lithographic masks in DUV scatterometry,” Proc. SPIE 7390, 73900Q (2009). [CrossRef]

6. M. Wurm, S. Bonifer, B. Bodermann, and J. Richter, “Deep ultraviolet scatterometer for dimensional characterization of nanostructures: system improvements and test measurements,” Meas. Sci. Technol. 22, 094024 (2011). [CrossRef]

7. X. Niu, N. Jakatdar, J. Bao, C. Spanos, and S. Yedur, “Specular spectroscopic scatterometry in DUV lithography,” Proc. SPIE 3677, 159–168 (1999). [CrossRef]

8. H. Patrick, T. Germer, Y. Ding, H. Ro, L. Richter, and C. Soles, “In situ measurement of annealing-induced line shape evolution in nanoimprinted polymers using scatterometry,” Proc. SPIE 7271, 727128 (2009). [CrossRef]

9. M. Moharam and T. Gaylord, “Rigorous coupled-wave analysis of planar-grating diffraction,” J. Opt. Soc. Am. A 71, 811–818 (1981). [CrossRef]

10. M. Moharam, E. Grann, D. Pommet, and T. Gaylord, “Formulation for stable and efficient implementation of the rigorous coupled-wave analysis of binary gratings,” J. Opt. Soc. Am. A 12, 1068–1076 (1995). [CrossRef]

11. A. Tavrov, M. Totzeck, N. Kerwien, and H. Tiziani, “Rigorous coupled-wave analysis calculus of submicrometer interference pattern and resolving edge position versus signal-to-noise ratio,” Opt. Eng. 41, 1886 (2002).

12. P. Lalanne and G. Morris, “Highly improved convergence of the coupled-wave method for TM polarization,” J. Opt. Soc. Am. A 13, 779–784 (1996). [CrossRef]

13. R. Petit and L. Botten, Electromagnetic theory of gratings (Springer, 1980). [CrossRef]

14. L. Landau and J. Lifschitz, Lehrbuch der theoretischen Physik: 2, Klassische Feldtheorie (Akademie Verlag, 1977).

15. J. Elschner, R. Hinder, and G. Schmidt, “Finite element solution of conical diffraction problems,” Adv. Comput. Math. 16, 139–156 (2002). [CrossRef]

16. H. Gross, R. Model, M. Bär, M. Wurm, B. Bodermann, and A. Rathsfeld, “Mathematical modelling of indirect measurements in scatterometry,” Measurement 39, 782–794 (2006). [CrossRef]

17. O. Cessenat and B. Despres, “Application of an ultra weak variational formulation of elliptic PDEs to the two-dimensional Helmholtz problem,” SIAM J. Numer. Anal. 35, 255–299 (1998). [CrossRef]

18. F. Ihlenburg, Finite element analysis of acoustic scattering (Springer, 1998). [CrossRef]

19. J. Melenk and I. Babuška, “The partition of unity finite element method: basic theory and applications,” Comput. Meth. Appl. Mech. Eng. 139, 289–314 (1996). [CrossRef]

20. J. Turunen and F. Wyrowski, eds., Diffractive optics for industrial and commercial applications (Wiley-VCH, 1997).

21. A. Tarantola, Inverse problem theory (Elsevier, 1987).

22. S. Coulombe, B. Minhas, C. Raymond, S. Naqvi, and J. McNeil, “Scatterometry measurement of sub-0.1 μm linewidth gratings,” J. Vac. Sci. Technol. B 16, 80 (1998). [CrossRef]

23. H. Gross, A. Rathsfeld, F. Scholze, and M. Bär, “Profile reconstruction in extreme ultraviolet (EUV) scatterometry: modeling and uncertainty estimates,” Meas. Sci. Technol. 20, 105102 (2009). [CrossRef]

24. R. Al-Assaad and D. Byrne, “Error analysis in inverse scatterometry. I. Modeling,” J. Opt. Soc. Am. A 24, 326–338 (2007). [CrossRef]

25. M. Wurm, “Über die dimensionelle Charakterisierung von Gitterstrukturen auf Fotomasken mit einem neuartigen DUV-Scatterometer,” Ph.D. thesis, Friedrich-Schiller-Universität Jena (2008).

26. H. Patrick, T. Germer, R. Silver, and B. Bunday, “Developing an uncertainty analysis for optical scatterometry,” Proc. SPIE 7272, 72720T (2009).

27. A. Kato and F. Scholze, “Effect of line roughness on the diffraction intensities in angular resolved scatterometry,” Appl. Opt. 49, 6102–6110 (2010). [CrossRef]

28. H. Gross, M.-A. Henn, A. Rathsfeld, and M. Bär, “Stochastic modelling aspects for an improved solution of the inverse problem in scatterometry,” in “Advanced mathematical and computational tools in metrology and testing: AMCTM IX,” F. Pavese, M. Bär, J.-R. Filtz, A. B. Forbes, L. Pendrill, and H. Shirono, eds. (World Scientific Pub. Co. Inc., 2012), 202–209.

29. J. Ruanaidh and W. Fitzgerald, Numerical Bayesian Methods Applied to Signal Processing (Springer, 1996). [CrossRef]

30. C. Laubis, C. Buchholz, A. Fischer, S. Plöger, F. Scholz, H. Wagner, F. Scholze, G. Ulm, H. Enkisch, S. Müllender, M. Wedowski, E. Louis, and E. Zoethout, “Characterization of large off-axis EUV mirrors with high accuracy reflectometry at PTB,” Proc. SPIE 6151, 61510I (2006). [CrossRef]

31. J. Pomplun, S. Burger, F. Schmidt, F. Scholze, C. Laubis, and U. Dersch, “Metrology of EUV Masks by EUV-Scatterometry and Finite Element Analysis,” Proc. SPIE 7028, 70280P (2008). [CrossRef]

32. P. Ciarlet, The finite element method for elliptic problems (North-Holland, 1978).

33. J. Elschner, R. Hinder, A. Rathsfeld, and G. Schmidt, http://www.wias-berlin.de/software/DIPOG.

34. R. Millar, Maximum Likelihood Estimation and Inference (Wiley, 2011). [CrossRef]

35. H. Gross, A. Rathsfeld, and M. Bär, “Modelling and uncertainty estimates for numerically reconstructed profiles in scatterometry,” in “Advanced mathematical and computational tools in metrology and testing: AMCTM VIII,” F. Pavese, M. Bär, A. B. Forbes, J.-M. Linares, C. Perruchet, and N.-F. Zhang, eds. (World Scientific Pub. Co. Inc., 2009), 142–147.

36. H. Gross, R. Model, F. Scholze, M. Wurm, B. Bodermann, M. Bär, and A. Rathsfeld, “Modellbildung, Bestimmung der Messunsicherheit und Validierung für diskrete inverse Probleme am Beispiel der Scatterometrie,” VDI-Berichte 2011, 337–346 (2008).

37. J. Richter, J. Rudolf, B. Bodermann, and J. C. Lam, “Comparative scatterometric CD measurements on a MoSi photo mask using different metrology tools,” Proc. SPIE 7122, 71222U (2008). [CrossRef]

38. M. Wurm, S. Bonifer, B. Bodermann, and M. Gerhard, “Comparison of far field characterisation of DOEs with a goniometric DUV-scatterometer and a CCD-based system,” J. Eur. Opt. Soc. Rapid Publ. 6, 11015s (2011). [CrossRef]

39. J. Kaipio and E. Somersalo, Statistical and computational inverse problems (Springer, 2005).

40. J. Berger, Statistical decision theory and Bayesian analysis (Springer, 1985).

Material	height (nm)	SWA (°)	n	k
TaO	12.00	82.6	0.951	0.0030
TaN	58	90	0.946	0.0326
SiO₂ (buffer)	8.000	90	0.976	0.0141
SiO_x (capping)	1.234		0.976	0.0141
Si	12.869		1.002	0.0018
MoSi	0.147		0.975	0.0040
Mo	2.141		0.931	0.0056
MoSi	1.972		0.975	0.0040
Si	2.838		1.002	0.0018
Substrate	6.35 ·10⁶		0.976	0.0141

Material	height (nm)	SWA (°)	n	k
MoSi	75	90	2.308	0.5975
Substrate	6.35 ·10⁶		1.5748	0

Material	height (nm)	SWA (°)	n	k
TaO	12.00	82.6	0.951	0.0030
TaN	58	90	0.946	0.0326
SiO₂ (buffer)	8.000	90	0.976	0.0141
SiO_x (capping)	1.234		0.976	0.0141
Si	12.869		1.002	0.0018
MoSi	0.147		0.975	0.0040
Mo	2.141		0.931	0.0056
MoSi	1.972		0.975	0.0040
Si	2.838		1.002	0.0018
Substrate	6.35 ·10⁶		0.976	0.0141

Material	height (nm)	SWA (°)	n	k
MoSi	75	90	2.308	0.5975
Substrate	6.35 ·10⁶		1.5748	0

A maximum likelihood approach to the inverse problem of scatterometry

Abstract

1. Introduction

2. Methods

2.1. Scatterometric measuring setup for the EUV mask

2.2. Scatterometric measuring setup for the MoSi mask

2.3. Mathematical model of scatterometry

2.4. Measurement error model

2.5. The solution of the inverse problem

2.5.1. Least squares method

2.5.2. Maximum likelihood estimation

2.6. Variance estimation

2.6.1. Least squares method

2.6.2. Maximum likelihood estimation

3. Results

3.1. Dependency of the reconstruction on the chosen weight factors for the EUV measurement setup

3.2. Application to simulated EUV data

3.3. Application to measured EUV data

3.4. Application to measured DUV data

4. Discussion and conclusions

Acknowledgments

References and links

Cited By

Figures (12)

Tables (3)

Equations (15)

Optics Express