Multi-objective and multi-solution source mask optimization using NSGA-II for more direct process window enhancement

Qingyan Zhang; Liu Junbo; Haifeng Sun; Ji Zhou; Chuan Jin; Jian Wang; Yanli Li; Song Hu

doi:10.1364/OE.515546

1. Introduction

The lithography process, a critical step in the fabrication of integrated circuits (ICs), follows the scaling trend described by Moore’s Law [1]. As lithography systems have evolved to 193-nm ArF lithography for nano-scale feature printing, the printability and process window (PW) of desired patterns are severely challenged by the lower process factor $k_1$ [2,3]. Fig. 1 illustrates the imaging system of deep ultraviolet lithography (DUVL) with a 193 nm ArF source. First, the light illuminating the mask is diffracted. The diffracted light is then transferred by the optical projection system to the wafer, which is coated with photoresist, inducing a photochemical reaction within the photoresist. After development, the pattern is printed on the wafer.

Fig. 1. The optical system in 193 nm ArF lithography machine

However, the fidelity of the printed wafer pattern is compromised by factors such as the optical proximity effect (OPE) [4–6]. To improve the imaging quality of lithography and compensate for OPE, various resolution enhancement techniques (RETs) have been developed. A. Poonawala et al. [6] introduced a method known as optical proximity correction (OPC), which adds extra features to the original masks. Another RET, the phase-shifting mask (PSM) method, was first introduced by Levenson et al. [7]; it modulates the phase of the incident light by changing the mask structure, thereby increasing the contrast of the aerial image. The illumination mode also influences the imaging result: off-axis illumination (OAI) [8,9] shifts more diffraction orders into the objective lens by tilting the illumination away from normal incidence.

Inverse lithography technology (ILT) encompasses methods such as mask optimization (MO) and source mask optimization (SMO). The primary concept of these methods is to invert the imaging process to identify the optimal solution for achieving the imaging targets. Much excellent research [10–12] has followed the pioneering MO work of Saleh et al. [13]. The development of beam-shaping elements such as diffractive optical elements (DOE) [14,15] and programmable illuminators [16,17] has made freeform source optimization (SO) more flexible. Rosenbluth et al. introduced the concept of SMO, which significantly increased the degrees of freedom compared to earlier MO research, making SMO a necessary technology for the 22 nm node and below.

SMO methods can be classified, according to the optimization principle, into gradient-based and heuristic algorithms. Gradient-based SMO methods [18–21] use the gradient of a cost function that mainly measures the difference between the desired pattern and the pattern obtained after the imaging and resist models. Gradient-based optimization algorithms such as steepest descent and conjugate gradient converge quickly, but they may become trapped in local optima. Heuristic algorithms such as particle swarm optimization (PSO) [21] and the genetic algorithm (GA) [22], on the other hand, search for the global optimum more effectively and require neither gradients nor prior knowledge, but they are less efficient and computationally more expensive.

Because the wafer surface is uneven [23] in the actual lithography process, the exposure position does not always lie in the focal plane. In addition, other process variations, ranging from lens aberrations [24] and thermal aberrations [25] to non-ideal mask effects [26,27], further shift the focal plane. Because imaging at advanced nodes is highly sensitive to process variation, the imaging performance is closely related to the depth of focus (DOF). To account for DOF, Peng et al. [28] and Ma et al. [29] added the resist error at a specific defocus position as a penalty term to the cost function. The image fidelity is improved at that fixed position, but fidelity over the whole focus range is not guaranteed. Jia et al. [30] refined this strategy by choosing the defocus position randomly, which enlarged the process window to some extent. Li et al. proposed the defocus-robust SMO (DRSMO) method [31], which introduced mini-batch gradient descent (MBGD) on this basis and further improved the imaging robustness. The process window is also considered in the MO research of Gao et al. [32], where the edge placement error and the process variation band (PVB) are combined in the optimization objective. Recently, Peng et al. proposed a defocus generative and adversarial method (DGASMO), which enlarges the process window more directly in combination with the Adam optimization method. Another strategy to compensate for aberration is to modify the pupil [33–35] with the help of ASML’s wavefront manipulator, which is known as pupil wavefront optimization (PWO).

The multi-objective SMO methods mentioned above focus on adding penalty terms related to printing quality in the general objective function and only a single alternative solution is obtained in the final stage. However, the optimization process of multi-objective problems aims to illustrate the trade-offs among objectives. The weights of different terms in traditional methods significantly influence the final SMO result. In real integrated circuit (IC) manufacturing, based on different process conditions and mask printing requirements, multiple source and mask solutions provide more flexibility for fabrication. Additionally, the computation of the process window, especially defocus, is complex due to the computationally intensive imaging model. Meanwhile, the gradient of the process window is challenging to express, thus unfriendly to gradient-based SMO methods. Previous methods tend to use resist pattern error in specific defocused positions to avoid calculating DOF and process window directly.

In this paper, we propose a more direct process window enhancement SMO method to reduce the sensitivity to process variation and improve the process window in SMO. The optimization of the abstract process window is transformed into the optimization of the normalized image log slope (NILS) and DOF. The latter, with the help of a fast focus-variation aerial image simulation method, the variational lithography model (VLIM), can be optimized directly and efficiently for the first time. The method finds multiple reasonable solutions in a single optimization run by means of the non-dominated sorting genetic algorithm II (NSGA-II) [36] and our rank-based selector. A final mask and source solution can then be selected after considering multiple objectives comprehensively.

2. Partially coherent imaging model

The aerial image, as depicted in Fig. 1, is the intensity distribution on the surface of the resist. After exposure and development, the resist forms the patterns. In practice, the optical projection in lithography is modeled as partially coherent imaging, for which there are two basic imaging models: Abbe’s model and Hopkins’ model. Abbe’s imaging model decomposes the source into individual source points, computes the coherent image formed by each point, and sums the resulting intensities incoherently. Hopkins’ model, which is derived from Abbe’s model, separates the aerial image calculation into two terms. One term represents the effect of the imaging system and source, while the other represents the impact of the mask on imaging.

Abbe’s model is more convenient for source optimization because its intensity computation is highly related to the source pattern. In this paper, the source optimization is based on Abbe’s imaging model [37], and the aerial image is formulated as:

(1)$$I({{x_i},{y_i}} )= \mathop {\int\!\!\!\int }\limits_{ - \infty }^{ + \infty } J({f,g} )\left[ {{{\left|{\mathop {\int\!\!\!\int }\limits_{ - \infty }^{ + \infty } P({f + f^{\prime},g + g^{\prime}} )M({f^{\prime},g^{\prime}} ){e^{ - i2\pi [{{x_i}f^{\prime} + {y_i}g^{\prime}} ]}}df^{\prime}dg^{\prime}} \right|}^2}} \right]dfdg$$

where $({{x_i},{y_i}} )$ represents the coordinates in the image plane, $({f,g} )$ represents the coordinates in the pupil plane, and $J({f,g} )$ represents the source distribution. $P({f,g} )$ is the pupil function of the projection system and can be regarded as a low-pass filter, and $M({f,g} )$ is the spectrum of the mask pattern in the frequency domain. The inner double integral of Eq. (1) can be pre-calculated and extracted as the illumination cross coefficient (ICC). Abbe’s equation can then be rewritten with the ICC as

(2)$$I({{x_i},{y_i}} )= \mathop {\int\!\!\!\int }\limits_{ - \infty }^{ + \infty } J({f,g} )ICC({f,g;{x_i},{y_i}} )dfdg,$$

(3)$$\begin{array}{{c}} {ICC({f,g;{x_i},{y_i}} )= {{\left|{\mathop {\int\!\!\!\int }\limits_{ - \infty }^{ + \infty } P({f + f^{\prime},g + g^{\prime}} )M({f^{\prime},g^{\prime}} ){e^{ - i2\pi [{{x_i}f^{\prime} + {y_i}g^{\prime}} ]}}df^{\prime}dg^{\prime}} \right|}^2},} \end{array}$$
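Once the ICC maps are pre-computed, Eq. (2) becomes a weighted sum over source points, which is why Abbe’s model suits source optimization. The following is a minimal sketch of that discretised sum, not the authors’ implementation; the function name `aerial_image_from_icc` and the toy 2 × 2 ICC maps are hypothetical and chosen only for illustration.

```python
def aerial_image_from_icc(source, icc):
    """Discretised Eq. (2): I[r][c] = sum over source points s of J[s] * ICC[s][r][c].

    source: list of source-point intensities J (one per sampled source point).
    icc:    list of pre-computed ICC maps, one 2-D list per source point.
    """
    rows, cols = len(icc[0]), len(icc[0][0])
    image = [[0.0] * cols for _ in range(rows)]
    for weight, icc_map in zip(source, icc):
        if weight == 0.0:
            continue  # dark source points contribute nothing
        for r in range(rows):
            for c in range(cols):
                image[r][c] += weight * icc_map[r][c]
    return image


# Toy data (hypothetical): two source points, 2 x 2 image grid.
icc_maps = [
    [[1.0, 0.5], [0.5, 0.0]],
    [[0.0, 0.5], [0.5, 1.0]],
]
image_a = aerial_image_from_icc([1.0, 0.0], icc_maps)  # only the first point is lit
image_b = aerial_image_from_icc([0.5, 0.5], icc_maps)  # both points at half intensity
```

Because the image is linear in the source weights, changing the source during SO only requires re-weighting the stored maps; the inner integral of Eq. (3) does not have to be recomputed while the mask is fixed.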

Compared to Abbe’s model, Hopkins’ model introduces transmission cross-coefficient (TCC), which is described by Eq. (4) and (5),

(4)$$I({x,y} )= \mathrm{\int\!\!\!\int \int\!\!\!\int }TCC({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )M({f^{\prime},g^{\prime}} ){M^\mathrm{\ast }}({f^{\prime\prime},g^{\prime\prime}} ){e^{ - i2\pi [{({f^{\prime} - f^{\prime\prime}} )x + ({g^{\prime} - g^{\prime\prime}} )y} ]}}df^{\prime}dg^{\prime}df^{\prime\prime}dg^{\prime\prime},$$

(5)$$TCC({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )= \mathrm{\int\!\!\!\int }J({f,g} )P({f + f^{\prime},g + g^{\prime}} ){P^\ast }({f + f^{\prime\prime},g + g^{\prime\prime}} )dfdg,$$

where $J({f,g} )$ is the source pattern, P is the pupil function, and both $({f^{\prime},g^{\prime}} )$ and $({f^{\prime\prime},g^{\prime\prime}} )$ are normalized spatial frequencies. Because the TCC does not depend on the mask pattern, it stays fixed as the mask changes, which makes Hopkins’ model much better suited to mask optimization than Abbe’s model. The sum of coherent systems (SOCS) model is used in mask optimization to reduce the computational complexity of Hopkins’ model. It decomposes the TCC into a few kernels via singular value decomposition (SVD):

(6)$$\begin{array}{{c}} {TCC({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )\approx \mathop \sum \nolimits_{k = 1}^n {\alpha _k}{\Phi _k}({f^{\prime},g^{\prime}} )\Phi _k^\mathrm{\ast }({f^{\prime\prime},g^{\prime\prime}} ),} \end{array}$$

where ${\alpha _k}$ is the kth singular value and ${\mathrm{\Phi }_k}$ is the corresponding kernel. n is the number of kernels retained, which directly determines the accuracy of the approximation. In this paper, n is set to 12. The final aerial image is expressed as

(7)$${I_{SOCS}} = \mathop \sum \nolimits_{k = 1}^n {\alpha _k}{|{IFFT\{{{\Phi _k}({f,g} )} \}\otimes M({f,g} )} |^2}.$$

In Eq. (7), $IFFT\{{\cdot} \}$ denotes the inverse Fourier transform. Because Abbe’s model and the SOCS model have different properties, they are applied in different stages of SMO.

Mask patterns are projected onto the photoresist afterward. In this paper, the resist model is approximated using the sigmoid function while calculating the error between the printed image and the desired image.

(8)$${I_p} = sig\left\{ {\frac{I}{{{I_N}}}} \right\} = \frac{1}{{1 + exp [ - \alpha \left( {\frac{I}{{{I_N}}} - {t_r}} \right)]}},$$

where ${I_p}$ is the printed image and ${I_N}$ is the intensity used for normalization. The steepness and threshold of the resist are represented by the parameters $\alpha $ and ${t_r}$, respectively. In addition, the resist model is simplified to a hard threshold function when evaluating the image contrast or critical dimension (CD) in the follow-up simulation. The hard threshold function can be written as:

(9)$${I_p} = \mathrm{\Gamma }\left( {\frac{I}{{{I_N}}} - {t_r}} \right),$$

where the meaning and value of each parameter remain the same. The hard threshold function, denoted as $\mathrm{\Gamma }({\cdot} )$, equals 1 if its argument is above zero and 0 otherwise.
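The two resist models of Eqs. (8) and (9) can be sketched per pixel as below. The default steepness (85) and threshold (0.21) are the values quoted in Section 6.1; the default normalization `i_norm=1.0` and the function names are assumptions made for this illustration.

```python
import math


def sigmoid_resist(intensity, i_norm=1.0, alpha=85.0, t_r=0.21):
    """Eq. (8): smooth resist response in (0, 1), used when computing the pattern error."""
    return 1.0 / (1.0 + math.exp(-alpha * (intensity / i_norm - t_r)))


def hard_threshold_resist(intensity, i_norm=1.0, t_r=0.21):
    """Eq. (9): Gamma(I / I_N - t_r), 1 strictly above the threshold and 0 otherwise."""
    return 1.0 if intensity / i_norm - t_r > 0 else 0.0


at_threshold = sigmoid_resist(0.21)      # argument of the exponential is zero here
well_exposed = sigmoid_resist(0.50)      # far above threshold, close to 1
binary_print = [hard_threshold_resist(v) for v in (0.10, 0.21, 0.30)]
```

The sigmoid is differentiable and grades the error near the contour, whereas the hard threshold gives the binary print used for CD and contrast evaluation.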

3. Coding of the source and mask

In pixel-based source mask optimization (SMO), the light source and mask are sampled based on pixels in the Cartesian coordinate system. For heuristic optimization algorithms, an excessive number of optimization parameter units can hinder result convergence and lead to significant inefficiencies. Therefore, these discrete and pixelated elements are challenging to utilize in the optimization stage without encoding. Encoding involves using fewer variables to represent the redundant elements, and the actual source and mask pattern can be recovered by decoding transformation. This process allows for a more efficient and effective optimization process, improving the overall performance of the SMO.

Every source pixel in the spatial frequency coordinates is denoted as $({\hat{f},\hat{g}} )$. The size of the source S is ${N_s} \times {N_s}$, and the intensity of each source point is normalized to $[{0,1} ]$. The partial coherence factor is denoted by σ. Because the source is symmetric about the vertical and horizontal axes, only source points in the first quadrant are encoded for optimization. These selected elements of S are then stacked into a vector S(n). The flow of source encoding and decoding in the optimization stage is illustrated in Fig. 2.

Fig. 2. Flow of encoding and decoding source in the optimization stage

As shown in Fig. 2, the vector S is updated to R by the optimization algorithm and decoded to the actual source ${S_i}$. A Gaussian filter is then applied to simulate the effect of the finite-resolution illumination optics [38]. The updated and blurred source ${S_b}$ is expressed as

(10)$${S_b} = {S_i} \otimes {G_k},$$

where ${G_k}$ is a Gaussian convolution kernel with zero mean and variance ${\sigma _k}$.
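The source encoding, decoding, and blurring steps can be sketched as follows. This is an illustration under stated assumptions rather than the authors’ code: the grid size is assumed odd so that the axes fall on a pixel row and column, the first quadrant is taken to include both axes, the blur uses zero padding at the grid edge, and the kernel radius and variance in the example are arbitrary. All function names are hypothetical.

```python
import math


def encode_source(full):
    """Stack the first quadrant (axes included) of a symmetric source into a vector."""
    c = len(full) // 2  # index of the on-axis row/column (odd grid size assumed)
    return [full[c + i][c + j] for i in range(c + 1) for j in range(c + 1)]


def decode_source(vector, n):
    """Mirror the encoded quadrant about both axes to rebuild the n x n source."""
    c = n // 2
    h = c + 1
    return [[vector[abs(i - c) * h + abs(j - c)] for j in range(n)] for i in range(n)]


def gaussian_kernel(radius, variance):
    """Zero-mean 2-D Gaussian kernel, normalised to unit sum."""
    raw = [[math.exp(-(dy * dy + dx * dx) / (2.0 * variance))
            for dx in range(-radius, radius + 1)]
           for dy in range(-radius, radius + 1)]
    total = sum(map(sum, raw))
    return [[v / total for v in row] for row in raw]


def blur(source, kernel):
    """Eq. (10): S_b = S_i convolved with G_k (zero padding outside the grid)."""
    n, r = len(source), len(kernel) // 2
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            acc = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < n and 0 <= x < n:
                        acc += source[y][x] * kernel[di + r][dj + r]
            out[i][j] = acc
    return out


# Toy 5 x 5 source: 9 encoded values describe all 25 pixels.
encoded = [0.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0]
source = decode_source(encoded, 5)
blurred = blur(source, gaussian_kernel(radius=1, variance=0.5))
```

With this layout an $N_s \times N_s$ source is described by $((N_s + 1)/2)^2$ variables, roughly a quarter of the pixel count.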

Unlike the source, masks employ a completely different coding strategy. This is due to the mask patterns exhibiting a higher unit density and more complex structure compared to source patterns. In this paper, the pixel-based mask is transformed into the frequency domain using the discrete cosine transform (DCT) and then represented by the frequency components. During the optimization stage, the updated frequency components can be decoded into a mask pattern. The detailed process of the above encoding and decoding of the mask is shown in Fig. 3.

Fig. 3. Flow of encoding and decoding mask in the optimization stage

As depicted in Fig. 3, a 4-fold symmetric mask is used for illustration. The dark areas in the mask pattern are opaque, while the bright areas are transparent. As with the source encoding, only pixels in the first quadrant of the mask take part in the optimization. All geometric information of the ${N_m} \times {N_m}$ mask matrix, M, can be encoded in frequency components, which can be understood as the weights of different basis functions in the final summation. The DCT is formulated as follows:

(11)$$\begin{array}{c} {{D_{pq}} = {\alpha _p}{\alpha _q}\mathop \sum \limits_{m = 0}^{{N_m} - 1} \mathop \sum \limits_{n = 0}^{{N_m} - 1} {M_{mn}}\cos \frac{{\pi ({2m + 1} )p}}{{2{N_m}}}\cos \frac{{\pi ({2n + 1} )q}}{{2{N_m}}},\quad 0 \le p,q \le {N_m} - 1,}\\ {{\alpha _p} = \left\{ \begin{array}{ll} {\frac{1}{{\sqrt {{N_m}} }},}&{p = 0}\\ {\sqrt {\frac{2}{{{N_m}}}} ,}&{1 \le p \le {N_m} - 1} \end{array} \right.\quad {\alpha _q} = \left\{ \begin{array}{ll} {\frac{1}{{\sqrt {{N_m}} }},}&{q = 0}\\ {\sqrt {\frac{2}{{{N_m}}}} ,}&{1 \le q \le {N_m} - 1} \end{array} \right.} \end{array}$$

where ${D_{pq}}$ are the DCT coefficients, or frequency components, of the mask pattern, and p and q are the indices of matrix D. The frequency increases gradually from the top-left corner of the matrix along both the horizontal and vertical directions. Subsequently, certain coefficients are truncated while others are padded with zeros to achieve compression. The number of truncated coefficients directly affects the quality of the mask pattern reconstruction. This encoding method also offers a significant advantage: using only low-frequency components for reconstruction enhances manufacturability [39]. The vectors, arranged as depicted in Fig. 3, are updated during the optimization process. The updated vectors are then decoded via the inverse discrete cosine transform (IDCT), as given in Eq. (12),

(12)$$M_{mn}^{\prime} = \mathop \sum \limits_{p = 0}^{{N_m} - 1} \mathop \sum \limits_{q = 0}^{{N_m} - 1} {\alpha _p}{\alpha _q}{D_{pq}}\cos \frac{{\pi ({2m + 1} )p}}{{2{N_m}}}\cos \frac{{\pi ({2n + 1} )q}}{{2{N_m}}},\quad 0 \le m,n \le {N_m} - 1,$$

where $M^{\prime}$ denotes the reconstructed first quadrant of the mask in optimization. The values of ${\alpha _p}$ and ${\alpha _q}$ are the same as in Eq. (11). The IDCT is thus the sum of the frequency components multiplied by their basis functions, which can be written as:

(13)$$\begin{array}{c} {{F_{pq}} = {\alpha _p}{\alpha _q}\cos \frac{{\pi ({2m + 1} )p}}{{2{N_m}}}\cos \frac{{\pi ({2n + 1} )q}}{{2{N_m}}},\quad 0 \le p,q \le {N_m} - 1,}\\ {\textrm{where}\; M_{mn}^{\prime} = \mathop \sum \limits_{p = 0}^{{N_m} - 1} \mathop \sum \limits_{q = 0}^{{N_m} - 1} {D_{pq}}{F_{pq}}.} \end{array}$$

Because some frequency coefficients are discarded, the reconstructed mask $M^{\prime}$ is generally no longer binary. A hard threshold function is therefore applied:

(14)$${M_b} = \mathrm{\Gamma }({M^{\prime} - 0.5} ),$$

which means that pixels in the $M^{\prime}$ matrix are set to zero if they are smaller than 0.5 and to one otherwise.
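The mask coding chain of Eqs. (11), (12), and (14) can be sketched directly from the formulas. The sketch below is written for clarity rather than speed, and it assumes that truncation keeps a square block of the lowest `order` × `order` coefficients; the paper does not spell out the exact truncation pattern, so that choice, like the function names and the 8 × 8 toy mask, is hypothetical.

```python
import math


def _alpha(k, n):
    """Normalisation factor of Eq. (11)."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def _basis(m, p, n):
    return math.cos(math.pi * (2 * m + 1) * p / (2 * n))


def dct2(mask):
    """Eq. (11): 2-D DCT of an n x n mask quadrant."""
    n = len(mask)
    return [[_alpha(p, n) * _alpha(q, n) * sum(
                mask[m][k] * _basis(m, p, n) * _basis(k, q, n)
                for m in range(n) for k in range(n))
             for q in range(n)] for p in range(n)]


def idct2(coeffs):
    """Eq. (12): inverse 2-D DCT."""
    n = len(coeffs)
    return [[sum(_alpha(p, n) * _alpha(q, n) * coeffs[p][q]
                 * _basis(m, p, n) * _basis(k, q, n)
                 for p in range(n) for q in range(n))
             for k in range(n)] for m in range(n)]


def truncate(coeffs, order):
    """Keep the low-frequency order x order block and zero the rest (assumed pattern)."""
    n = len(coeffs)
    return [[coeffs[p][q] if p < order and q < order else 0.0
             for q in range(n)] for p in range(n)]


def binarize(mask, level=0.5):
    """Eq. (14): hard threshold at 0.5."""
    return [[1 if v > level else 0 for v in row] for row in mask]


# Toy 8 x 8 quadrant containing one rectangular opening.
mask = [[1 if 2 <= r <= 5 and 3 <= c <= 4 else 0 for c in range(8)] for r in range(8)]
coeffs = dct2(mask)
lossless = binarize(idct2(coeffs))                      # no truncation
compressed = binarize(idct2(truncate(coeffs, order=4)))  # 16 of 64 coefficients kept
```

Only the retained coefficients need to be handed to the optimizer, so the number of variables per mask quadrant drops from $N_m^2$ to `order`² in this sketch.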

4. Evaluation criterion for SMO

When assessing the quality of lithographic imaging in the SMO stage, multiple factors must be taken into account beyond the commonly evaluated pattern error (PAE). The PAE measures the fidelity of lithographic imaging and is defined as the squared Euclidean distance between the desired pattern (${I_d}$) and the actual printed image (${I_p}$):

(15)$$PAE = ||{I_d} - {I_p}||_2^2.$$

In addition, the process window (PW) is introduced as a metric for evaluating the robustness of the imaging process to process variations, including dose and defocus. A detailed evaluation of this aspect is presented in Sections 4.1 and 4.2. Furthermore, to ensure the production of inspectable masks, the manufacturability criterion (MC) is considered as higher-level information in NSGA-SMO. This criterion is evaluated using the opening and closing operations proposed by Erdmann and Fühner [40].

4.1 Depth of focus (DOF) and Variational lithography model (VLIM)

The conventional indicator of process robustness is the focus-exposure process window (PW). The lithographic result under the variation of exposure latitude (EL) and depth of focus (DOF) in the PW satisfies the condition, ΔCD ≤ 10%. However, acquiring the PW directly presents challenges due to simulation speed problems and model consistency issues, as it requires the recalculation of the image result under different positions and exposure doses, based on the conventional imaging model. Consequently, the PW, especially the DOF, does not directly participate in the optimization process. Conventional source mask optimization (SMO) methods with process window awareness tend to use the PAE under defocused positions, thereby indirectly evaluating the PW. In this paper, we introduce the variational lithography model (VLIM) [41] to efficiently calculate the focus-exposure process window. This model derives an analytical formula for the defocus latent image from Hopkins’ model. The image at the defocus position can be expressed as follows:

(16)$${I_G}({x,y} )= \mathop \sum \limits_{n = 0}^\infty {z^{2n}}{I_{2n}}({x,y} ).$$

Equation (16) is called the defocus aerial image expansion; its derivation is given in Appendix 1. Here z is the focal error from best focus normalized by $\frac{\lambda }{{N{A^2}}}$, ${I_0}$ is the in-focus (z = 0) aerial image, and ${I_{2n}}({x,y} )$ are called the variational aerial images. Equation (16) shows that the defocus latent image can be decomposed into the in-focus image plus correction terms. If z is small, the higher-order terms can be neglected because ${z^n}$ (n > 2) is much smaller than ${z^2}$, and Eq. (16) can be approximated as

(17)$${I_G}({x,y} )\cong {I_0}({x,y} )+ {z^2}{I_2}({x,y} )$$

The above approximation holds within $\pm 200\,\textrm{nm}$ of defocus for $\mathrm{\lambda } = 193\,\textrm{nm}$ [42]. The corresponding variational TCC of ${I_2}({x,y} )$ is given by Eq. (18).

(18)$$\begin{array}{{c}} {TC{C_2}({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )={-} \frac{{{\pi ^2}}}{4}\mathrm{\int\!\!\!\int }({{{(f + f^{\prime})}^2} + {{(g + g^{\prime})}^2}} )({{{(f + f^{\prime\prime})}^2} + {{(g + g^{\prime\prime})}^2}} )}\\ { \times J({f,g} ){P_0}({f + f^{\prime},g + g^{\prime}} )P_0^\mathrm{\ast }({f + f^{\prime\prime},g + g^{\prime\prime}} )dfdg} \end{array}$$

Like the in-focus image, the 2nd-order variational aerial image can be calculated by Eq. (4) and decomposed by Eq. (6). Because the aerial image intensity is an explicit function of the defocus z, once the variational TCCs have been calculated and stored, the aerial image under different focus conditions can be obtained far more efficiently than by direct calculation. DOF is then determined by checking whether ΔCD ≤ 10% at different defocus values. VLIM thus enables faster DOF evaluation and makes more direct variation-aware SMO possible.
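The DOF check described above can be sketched in one dimension as follows. The profiles `i0` and `i2` here are synthetic stand-ins (a Gaussian line and a scaled negative copy of it), not outputs of the lithography model, and the threshold, scan step, and scan limit are arbitrary assumptions; in the actual flow both images would come from the stored TCC and variational TCC kernels. The defocus z is in the normalized units of Eq. (16).

```python
import math


def defocus_image(i0, i2, z):
    """Eq. (17): I_G = I_0 + z^2 * I_2, evaluated pixel by pixel."""
    return [a + z * z * b for a, b in zip(i0, i2)]


def printed_cd(image, threshold, dx):
    """Width of the region above threshold (hard-threshold resist), by pixel counting."""
    return dx * sum(1 for v in image if v > threshold)


def depth_of_focus(i0, i2, threshold, dx, z_step=0.01, z_max=2.0, tol=0.10):
    """Largest symmetric focus range over which |CD - CD(0)| <= tol * CD(0)."""
    cd0 = printed_cd(i0, threshold, dx)
    z_ok = 0.0
    for k in range(1, int(round(z_max / z_step)) + 1):
        z = k * z_step
        cd = printed_cd(defocus_image(i0, i2, z), threshold, dx)
        if abs(cd - cd0) > tol * cd0:
            break
        z_ok = z
    return 2.0 * z_ok  # only z^2 appears, so the range is symmetric about best focus


# Synthetic 1-D example: 1 nm sampling, Gaussian bright line, defocus lowers the peak.
x = [float(k) for k in range(-100, 101)]
i0 = [math.exp(-(v / 40.0) ** 2) for v in x]
i2 = [-0.5 * v for v in i0]
cd0 = printed_cd(i0, threshold=0.3, dx=1.0)
dof = depth_of_focus(i0, i2, threshold=0.3, dx=1.0)
```

Each focus step costs only one multiply-add per pixel, since `i0` and `i2` are computed once; this is the source of the speed-up over re-simulating the image at every defocus value.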

4.2 Normalized image log-slope (NILS)

Exposure latitude (EL) indicates how the linewidth changes as the exposure dose changes. Mathematically, EL can also be expressed through the slope of critical dimension (CD) versus exposure dose (E), $\partial CD/\partial E$. The normalized image log-slope (NILS) is another metric, which measures the steepness of the image transition from bright to dark and can be expressed as

(19)$$NILS = CD\frac{{d\ln (I)}}{{dx}} = \frac{{CD}}{{{I_{con}}}} \times {\left. {\frac{{dI}}{{dx}}} \right|_{{I_{con}}}}.$$

In Eq. (19), ${I_{con}}$ is the intensity on the contour and ${\left. {\frac{{dI}}{{dx}}} \right|_{{I_{con}}}}$ is the derivative of the intensity with respect to x on the contour of interest. In ideal and simplified cases, EL is closely related to NILS by [43]

(20)$$CD \times \frac{{\partial \ln E}}{{\partial CD}} = \frac{1}{2}NILS$$

If the EL is defined to be the range of exposure dose that keeps $\Delta CD \le 10\%$, the exposure latitude can be further approximated by

(21)$${\%}EL \approx 10 \times NILS$$
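The step from Eq. (20) to Eq. (21) can be sketched as follows, assuming that the condition $\Delta CD \le 10\%$ is read as a ±10% tolerance about the nominal CD (a total range $\Delta CD/CD \approx 0.2$) and that the slope in Eq. (20) is constant over this range:

$$\frac{{\Delta E}}{E} \approx \Delta \ln E \approx \frac{{\partial \ln E}}{{\partial CD}}\Delta CD = \frac{1}{2}NILS \times \frac{{\Delta CD}}{{CD}} \approx \frac{1}{2}NILS \times 0.2 = 0.1 \times NILS$$

Expressed in percent, this gives ${\%}EL \approx 10 \times NILS$; for example, under these assumptions an image with NILS = 2 corresponds to an exposure latitude of roughly 20%.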

Because NILS is a much easier and faster metric to calculate and quantifies the impact of exposure dose changes on image fidelity, it serves as a substitute for EL during optimization.
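As an illustration of Eq. (19), NILS can be evaluated on a sampled 1-D intensity profile by locating the two threshold crossings of a feature and taking a finite-difference slope at the edge. The helper names, the linear interpolation of the contour position, and the sinusoidal test profile are assumptions made for this sketch; the contour intensity $I_{con}$ is taken to be the print threshold.

```python
import math


def threshold_crossings(x, intensity, threshold):
    """Return (position, slope) for every sampling interval where the profile crosses the threshold."""
    out = []
    for k in range(len(x) - 1):
        a = intensity[k] - threshold
        b = intensity[k + 1] - threshold
        if a * b < 0:  # sign change inside this interval
            slope = (intensity[k + 1] - intensity[k]) / (x[k + 1] - x[k])
            out.append((x[k] - a / slope, slope))  # linear interpolation of the contour
    return out


def nils(x, intensity, threshold):
    """Eq. (19): NILS = CD / I_con * |dI/dx| at the contour, using the first two crossings."""
    (x_left, s_left), (x_right, _) = threshold_crossings(x, intensity, threshold)[:2]
    cd = x_right - x_left
    return cd / threshold * abs(s_left)


# Test profile: two-beam sinusoid with a 100 nm pitch, sampled every 0.4 nm.
pitch = 100.0
x = [-50.0 + 0.4 * k for k in range(251)]
profile = [0.5 + 0.5 * math.cos(2.0 * math.pi * v / pitch) for v in x]
value = nils(x, profile, threshold=0.5)
```

For this profile the analytic result is $CD = 50$ nm and $|dI/dx| = \pi/100$ at the contour, so NILS equals $\pi$; the numerical value agrees to within the finite-difference error.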

5. Framework of NSGA-SMO

The goal of SMO is to find the most suitable source ($\hat{S}$) and mask ($\hat{M}$) that satisfy the constraints, based on a cost function. SMO can be formulated generally as follows:

(22)$$({\hat{S},\hat{M}} )= \mathop {argmin}\limits_{S,M} D({S,M} )\;\textrm{or}\;\mathop {argmax}\limits_{S,M} D({S,M} )$$

where D is the criterion of image quality, also called the cost function of SMO. Traditional methods model SMO as a single-objective, single-solution optimization. However, there are numerous criteria for judging the quality of SMO results, and a single solution that performs exceptionally well on one objective may perform poorly on another. The weighted sum [44] is a popular way to tackle multi-objective problems: the cost function is designed as the weighted sum of different metrics, $D = \mathop \sum \limits_{i = 1}^n {\omega _i}{D_i}$, where ${D_i}$ is the ith target and ${\omega _i}$ is the corresponding weight. However, an improvement in one target may lead to a deterioration in another, and fixed weights severely restrict the diversity of solutions. The proposed NSGA-SMO method instead treats the objectives separately and aims to find a set of trade-off optimal solutions, known as Pareto-optimal solutions. It sorts the solutions in a population into different levels based on their dominance relationships. Moreover, because the image quality criteria differ in importance, we treat them differently.

In the NSGA-SMO method, the optimization process relies on selection and update procedures. Selection chooses individuals with superior performance, while the update procedure aims to produce potentially better-performing individuals. In this paper, selection is a 2-tournament selection: two candidates are picked at random and compared, the winner (the one with better fitness or diversity) is retained, and the loser is eliminated. The update of source and mask patterns relies on the operators of the basic genetic algorithm (GA), namely crossover and mutation. Both the source and mask in the proposed method are real-coded and normalized to [0,1]. The real-coded crossover operator is defined as follows:

(23)$${O_1} = {P_1} + \alpha \times ({{P_2} - {P_1}} ),$$

where ${O_1}$ is the offspring of the chosen parents ${P_1}$ and ${P_2}$, and $\alpha $ is a scaling factor drawn uniformly at random from the interval [-0.25,1.25]. The mutation operator randomly replaces one element of an array with another value; in this study, the new value is drawn at random from the range [0,1]. The principles of these operators are illustrated in Fig. 4.
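The three operators can be sketched as below. Several details are assumptions not stated in the text: a fresh $\alpha$ is drawn for every gene, offspring values are clipped back to [0,1] (Eq. (23) can otherwise leave that range when $\alpha$ is outside [0,1]), the mutation rate is applied per gene, and the tournament compares a single scalar fitness that is minimised. The default rate of 0.04 is the mutation rate quoted in Section 6.1.

```python
import random


def crossover(p1, p2, rng, low=-0.25, high=1.25):
    """Eq. (23): O = P1 + alpha * (P2 - P1), alpha ~ U[low, high] per gene, then clipped."""
    child = []
    for a, b in zip(p1, p2):
        alpha = rng.uniform(low, high)
        child.append(min(1.0, max(0.0, a + alpha * (b - a))))
    return child


def mutate(genes, rng, rate=0.04):
    """Replace each gene, with probability `rate`, by a new random value in [0, 1)."""
    return [rng.random() if rng.random() < rate else g for g in genes]


def tournament(population, fitness, rng):
    """2-tournament: draw two distinct individuals and keep the one with lower fitness."""
    i, j = rng.sample(range(len(population)), 2)
    return population[i] if fitness[i] <= fitness[j] else population[j]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
parent_1 = [0.0, 0.2, 0.9, 1.0]
parent_2 = [1.0, 0.4, 0.1, 1.0]
child = mutate(crossover(parent_1, parent_2, rng), rng)
winner = tournament([parent_1, parent_2], [0.2, 0.7], rng)
```

Allowing $\alpha$ slightly outside [0,1] lets offspring land just beyond the segment joining the parents, which helps keep the population from shrinking toward its centre over the generations.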

Fig. 4. The principles of crossover, mutation, and selection.

5.1 Source optimization

Source optimization (SO) in the proposed method employs a traditional genetic algorithm. Fig. 5 schematically shows the steps of source optimization.

Fig. 5. The procedure of source optimization in the proposed method.

As illustrated in Fig. 5, the initial population is generated by filling the encoded arrays with random real values between 0 and 1. To speed up convergence, conventionally shaped sources are injected into the population as seeds. In the third step, the pattern error of every solution in the population is evaluated after forward calculation of the aerial image with the Abbe model described in Section 2. Subsequently, it is determined whether the termination criterion has been met. The fifth step selects superior individuals using the 2-tournament selection operator. The selected solutions are then updated using the crossover and mutation operators. The new population repeats the above steps until the termination criterion is met.

5.2 Mask optimization

Mask optimization is the part of the proposed method that mainly reflects our multi-solution and multi-objective idea. Multi-objective optimization problems generally do not have a single best solution. Instead, it is possible to find Pareto-optimal, or non-dominated, solutions. For minimization problems, domination can be expressed mathematically as

(24)$${\textrm{x}_1}\;\textrm{dominates}\;{\textrm{x}_2}:\;\forall i \in \{ 1,2, \ldots ,m\} ,\;{D_i}({{\textrm{x}_1}} )\le {D_i}({{\textrm{x}_2}} ),\;\textrm{and}\;\exists j \in \{ 1,2, \ldots ,m\} ,\;{D_j}({{\textrm{x}_1}} )< {D_j}({{\textrm{x}_2}} ).$$

A solution is non-dominated if no other solution dominates it; in other words, non-dominated solutions are those that cannot be improved in one objective without deteriorating at least one other objective. The goal of the non-dominated sorting genetic algorithm II (NSGA-II) is to find non-dominated solutions, but its ability to search for Pareto-optimal solutions has been shown to diminish as the number of objectives increases [36]. Therefore, in the proposed NSGA-SMO, the evaluation metrics in Section 4 are divided into two groups: critical metrics (PAE, DOF, and NILS) and higher-level information (the manufacturability criterion, MC). The critical metrics are directly related to the imaging quality of SMO results, while MC determines whether the masks are easy to produce. The critical metrics serve as the optimization objectives, and the higher-level information together with the critical metrics determines which solution is finally selected among the multiple solutions.
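The dominance test for minimization problems can be written as a short predicate. The objective vectors in the example are hypothetical numbers, and negating DOF and NILS so that every entry is minimised is an assumption of this sketch, not something stated in the paper.

```python
def dominates(a, b):
    """True if objective vector a dominates b when every objective is minimised:
    a is no worse in all objectives and strictly better in at least one."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


# Hypothetical (PAE, -DOF, -NILS) vectors; larger DOF/NILS are better, hence negated.
sol_a = (120.0, -150.0, -1.8)
sol_b = (140.0, -150.0, -1.6)
sol_c = (100.0, -110.0, -1.9)

a_over_b = dominates(sol_a, sol_b)  # a is at least as good everywhere and better in PAE, NILS
a_vs_c = (dominates(sol_a, sol_c), dominates(sol_c, sol_a))  # a trade-off: neither dominates
```

Solutions such as `sol_a` and `sol_c`, where neither dominates the other, are the trade-offs that a Pareto front is meant to expose.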

Compared with traditional GA methods, NSGA-II has three significant improvements: the elitist principle, a diversity-preserving mechanism, and non-dominated sorting [36]. In a simple GA, the population generated by the genetic operators may lose well-performing individuals from the parent population. The elitist principle, which is used in the proposed NSGA-SMO, avoids this by merging the parent population with the population produced by the genetic operators. The solutions in the merged population are ranked into different Pareto levels by non-dominated sorting. The Pareto levels are illustrated in Fig. 6(a). In non-dominated sorting, the individuals are ranked based on their dominance relationships with each other. The detailed implementation of non-dominated sorting is presented in Algorithm 1.

Fig. 6. The example of (a) Front level (b) Crowding distance, assuming to find the minimum for D₁, D₂

Diversity is preserved by introducing the crowding distance, whose assignment is shown in Fig. 6(b). Before the crowding distance of each individual is computed, the individuals are sorted in ascending order according to each objective. As shown in Fig. 6(b), the crowding distance along one objective is the difference between the objective values of the two adjacent individuals. If an individual lies on the border, its crowding distance is set to infinity. The crowding distances over all objectives are summed to give the overall value. When different individuals are in the same Pareto level, the selection operator favors the individual with higher diversity.

Algorithm 1. Non-dominated sorting
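The following sketch shows non-dominated sorting and crowding-distance assignment as they are commonly formulated for NSGA-II; it is not a transcription of Algorithm 1. All objectives are assumed to be minimised, and the crowding distance is normalised by each objective's range within the front, which is the usual NSGA-II convention and an assumption relative to the description above.

```python
def dominates(a, b):
    """a dominates b when all objectives are minimised."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def non_dominated_sort(objs):
    """Return a list of fronts (lists of indices); front 0 is the non-dominated set."""
    n = len(objs)
    dominated = [[] for _ in range(n)]  # dominated[i]: indices that i dominates
    counts = [0] * n                    # counts[i]: number of solutions dominating i
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            if dominates(objs[i], objs[j]):
                dominated[i].append(j)
            elif dominates(objs[j], objs[i]):
                counts[i] += 1
    fronts = [[i for i in range(n) if counts[i] == 0]]
    while fronts[-1]:
        next_front = []
        for i in fronts[-1]:
            for j in dominated[i]:
                counts[j] -= 1
                if counts[j] == 0:  # every dominator of j is already ranked
                    next_front.append(j)
        fronts.append(next_front)
    return fronts[:-1]  # drop the trailing empty front


def crowding_distance(objs, front):
    """Sum, over objectives, of the normalised gap between each solution's two neighbours."""
    dist = {i: 0.0 for i in front}
    for m in range(len(objs[0])):
        order = sorted(front, key=lambda i: objs[i][m])
        lo, hi = objs[order[0]][m], objs[order[-1]][m]
        dist[order[0]] = dist[order[-1]] = float("inf")  # border solutions
        if hi == lo:
            continue
        for k in range(1, len(order) - 1):
            gap = objs[order[k + 1]][m] - objs[order[k - 1]][m]
            dist[order[k]] += gap / (hi - lo)
    return dist


# Toy population with two objectives (D1, D2), both minimised.
objs = [(1, 5), (2, 3), (4, 1), (3, 4), (5, 5)]
fronts = non_dominated_sort(objs)
distances = crowding_distance(objs, fronts[0])
```

In this toy population, solutions 0, 1, and 2 form the first Pareto level; solution 3 is dominated only by solution 1, and solution 4 is dominated by all the others.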

The procedure of MO is shown in Fig. 7. At the beginning of MO, the mask population, composed of randomly generated masks and encoded seed masks, is initialized. The TCCs are also calculated in advance. Crossover and mutation operations are then carried out in sequence to produce the offspring. Next, the newly generated individuals are decoded as shown in Fig. 3. The recovered masks are passed to the aerial image calculation models: Hopkins’ model for PAE and VLIM for the process window. If the termination requirements are satisfied, the iteration ends. Otherwise, the original parent population and the newly generated individuals form the new population. Based on the individuals’ performance on the different objectives, non-dominated sorting is executed to rank them. The top half of the ranking is chosen as the parents for generating new solutions, while the other half is discarded. Solutions in the Pareto level that straddles the halfway line are additionally sorted by crowding distance, and those with better diversity are retained. The selected individuals repeat the first step in the new iteration until the termination criterion is reached.
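The survivor selection at the end of each MO iteration (keep the top half, break ties in the straddling level by crowding distance) can be sketched as follows. The function takes fronts and crowding distances that have already been computed; its name, the input layout, and the example numbers are hypothetical.

```python
def select_survivors(fronts, crowding, n_keep):
    """Fill the next parent population front by front.

    fronts:   list of index lists, best Pareto level first.
    crowding: dict mapping index -> crowding distance (needed for the straddling front).
    n_keep:   number of parents to retain (half of the merged population).
    """
    survivors = []
    for front in fronts:
        if len(survivors) + len(front) <= n_keep:
            survivors.extend(front)  # the whole level fits
        else:
            room = n_keep - len(survivors)
            # The level straddles the cut: prefer the least crowded individuals.
            ranked = sorted(front, key=lambda i: crowding[i], reverse=True)
            survivors.extend(ranked[:room])
            break
    return survivors


# Merged population of 8 individuals in three Pareto levels; keep 4.
fronts = [[0, 1, 2], [3, 4, 5], [6, 7]]
crowding = {3: 0.2, 4: float("inf"), 5: 1.5}
parents = select_survivors(fronts, crowding, n_keep=4)
```

Here the first level fits entirely, and only one slot remains for the second level, which goes to the individual with the largest crowding distance.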

Fig. 7. The procedure of mask optimization in NSGA-SMO


Mask optimization is not the end of the whole optimization, and the final population contains several potential solutions. A suitable solution must be selected to continue the remaining optimization, so a rank-based selector is added. In this paper, the selector is built on the ranks of the solutions. The detailed selection strategy is given in Algorithm 2. Its main idea is to gradually increase the rank threshold used for selection until the number of selected candidates reaches the set value.

Algorithm 2. Rank-based selector (image oe-32-4-5301-i002 in the original article)
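Since Algorithm 2 is only available as an image, the following Python sketch shows one possible reading of the rank-based selector, not the authors' exact procedure. It assumes that each solution is ranked separately per objective (1 = best), that a solution is admitted once its worst rank is within the current threshold, and that the threshold grows by one until at least $N_c$ candidates are found. The function name `rank_based_select` is hypothetical; the test data are the PAE, DOF and NILS values of Table 1 in Section 6.2.

```python
def rank_based_select(objs, minimize, n_candidates):
    """Return indices of solutions whose worst per-objective rank is within
    the smallest threshold that yields at least n_candidates solutions."""
    n = len(objs)
    worst_rank = [0] * n
    for j, is_min in enumerate(minimize):
        # Rank 1 is the best value of objective j.
        order = sorted(range(n), key=lambda i: objs[i][j], reverse=not is_min)
        for rank, i in enumerate(order, start=1):
            worst_rank[i] = max(worst_rank[i], rank)
    for threshold in range(1, n + 1):
        picked = [i for i in range(n) if worst_rank[i] <= threshold]
        if len(picked) >= n_candidates:
            return picked
    return list(range(n))

# (PAE, DOF in nm, NILS) for Mask#1..Mask#5, copied from Table 1.
TABLE_1 = [
    (533.7059, 110, 3.0726),
    (429.9276, 95, 3.7042),
    (423.2260, 85, 3.8036),
    (384.5076, 105, 3.6602),
    (371.4987, 75, 3.2856),
]
MINIMIZE = (True, False, False)  # PAE is minimised, DOF and NILS are maximised

best = rank_based_select(TABLE_1, MINIMIZE, n_candidates=1)
```

Under these assumptions, `best` contains a single index, 3, which corresponds to Mask#4. The function may return more than `n_candidates` entries when several solutions become admissible at the same threshold.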

5.3 Source mask optimization

The proposed SMO structure is sequential, which means that SO and MO are performed alternately. The SMO flow starts by determining the desired mask and the initial source pattern. Next, the optimization parameters, such as the number of iterations, are initialized, and the source population is formed by combining the initial source with randomly generated solutions. After each source optimization, the mask is optimized with the source from SO held fixed. In SO, only PAE is optimized. Mask optimization then optimizes several objectives, including PAE, DOF, and NILS, and outputs the final mask population. If the termination criterion is satisfied, SMO stops, and the mask can be selected manually from the output of the rank-based selector, based on all objectives and higher-level information such as MC. Otherwise, the best-performing mask is selected from the mask population by the rank-based selector (${N_c} = 1$) and fed into the source optimization of the next iteration. Fig. 8 shows this flow schematically.
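The alternating flow can be summarized as a short control loop. The sketch below is schematic only: `optimize_source`, `optimize_mask`, `select` and `converged` are hypothetical callables standing in for the SO stage, the MO stage, the rank-based selector and the termination test, and none of them is specified by the paper at this level of detail.

```python
def sequential_smo(source, mask, optimize_source, optimize_mask,
                   select, converged, max_rounds=10, n_final=5):
    """Alternate source optimization (SO) and mask optimization (MO).

    optimize_source(source, mask) -> new source        (SO, PAE only)
    optimize_mask(source, mask)   -> mask population   (MO, PAE/DOF/NILS)
    select(population, n)         -> list of n candidates
    converged(population)         -> bool
    """
    population = [mask]
    for _ in range(max_rounds):
        source = optimize_source(source, mask)    # SO with the current mask
        population = optimize_mask(source, mask)  # MO with the source held fixed
        if converged(population):
            break
        # Feed the single best-ranked mask (N_c = 1) into the next SO stage.
        mask = select(population, 1)[0]
    # Final candidates are left for manual selection.
    return source, select(population, n_final)
```

The final call returns several candidates rather than one mask, which mirrors the manual choice among the selector's outputs at the end of the flow.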

Fig. 8. SMO optimization flow


6. Simulation and analysis

6.1 Simulation settings

In this paper, conventional circular illumination, as shown in Fig. 9, is used as the initial source of SO. The illumination wavelength is 193 nm, and the numerical aperture (NA) is 1.35. The mask patterns are discretized at $5.625\textrm{nm} \times 5.625\textrm{nm}$ per pixel, with $211 \times 211$ pixels in total. Fig. 9(a) shows the initial source in simulation with $\sigma = 0.95$. Two dark-field masks, a simple grating pattern and a complicated pattern, as shown in Fig. 9(b) and (c), are used in the simulation. The parameters of the sigmoid resist model are 85 for steepness and 0.21 for threshold. The order of DCT coding is 24, and the number of TCC SVD kernels is 12. The population size is ${N_{pop}} = 50$, and the mutation and crossover rates are 0.04 and 0.6, respectively. The candidate number of the final selector ${N_c}$ is 5. The simulation platform is a PC with a 3.6 GHz CPU and 16 GB RAM.
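For reference, the settings above can be collected in one place. The sketch below is only a convenient restatement: the class and field names are hypothetical, and the logistic form of the resist function is the commonly used sigmoid model, assumed here because the resist equation itself is not repeated in this section.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class SimulationSettings:
    wavelength_nm: float = 193.0
    numerical_aperture: float = 1.35
    pixel_nm: float = 5.625
    grid_pixels: int = 211
    sigma: float = 0.95             # initial circular source
    resist_steepness: float = 85.0
    resist_threshold: float = 0.21
    dct_order: int = 24
    tcc_kernels: int = 12
    population_size: int = 50
    mutation_rate: float = 0.04
    crossover_rate: float = 0.6
    n_candidates: int = 5

    @property
    def field_nm(self):
        """Side length of the simulated mask area in nm."""
        return self.pixel_nm * self.grid_pixels

def resist(intensity, settings):
    """Assumed sigmoid resist response: 1 / (1 + exp(-a * (I - t)))."""
    a, t = settings.resist_steepness, settings.resist_threshold
    return 1.0 / (1.0 + math.exp(-a * (intensity - t)))

settings = SimulationSettings()
```

With these values the simulated field is 211 × 5.625 nm = 1186.875 nm on a side, and the assumed resist response equals 0.5 exactly at the threshold intensity.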

Fig. 9. The initial source and mask patterns in simulation: (a) initial source, (b) grating pattern, (c) complicated pattern. Red lines mark the tested positions in the simulation.

We conducted comparative experiments on different optimization modes. SMO methods with an objective function of the form $D = {\lambda _1}D_{PAE}^{z = 0} + {\lambda _2}D_{PAE}^{z = {z_0}}$ are collectively referred to as weighted-sum multi-objective SMO. Single-objective SMO is optimized by GA with the objective function $D = D_{PAE}^{z = 0}$. The weighted-sum SMO method [28] with objective $D = D_{PAE}^{z = 0} + D_{PAE}^{z = 100}$ and the single-objective GA SMO are used as baselines to evaluate our proposed method, NSGA-SMO.

6.2 Simple grating pattern

The iteration process of NSGA-SMO is illustrated in Fig. 10, where the red, green, and blue curves represent the convergence of PAE, DOF, and NILS, respectively. The optimization took 1300 iterations to converge. The dashed lines in the figure separate the SO and MO stages of SMO. Since direct process window optimization is applied only in the MO stage, the DOF and NILS curves are absent in the SO stage. The best PAE decreased from 1741.88 to 371.50, a significant improvement. Moreover, the best DOF and NILS in the population were gradually and directly optimized to 110 nm and 4.29, respectively.

Fig. 10. The iteration process of NSGA-SMO for the simple grating pattern. The red line denotes the best PAE optimization steps. The green and blue lines denote the best DOF and NILS respectively in the population in the optimization.


Since the proposed method produces multiple solutions, the final solution needs to be selected from them. Table 1 shows the mask candidates after the rank-based selector. None of the mask solutions dominates another across all objectives. For example, Mask#5 has the best pattern error but the worst DOF among all solutions, so it improves pattern fidelity at the expense of DOF. After comprehensive consideration, Mask#4 is regarded as the most balanced solution: it ranks among the best for all objectives and is not difficult to fabricate. Of course, any of the candidates can be selected as the final choice according to specific requirements. Mask#4 is used in the following comparative simulation experiments.
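The statement that no candidate dominates another can be checked directly from the Table 1 values. The short Python check below is an illustration added for the reader, restricted to the three optimized objectives; the names `CANDIDATES` and `dominates` are hypothetical.

```python
# (PAE, DOF in nm, NILS) per mask id, copied from Table 1.
# PAE is minimised; DOF and NILS are maximised.
CANDIDATES = {
    1: (533.7059, 110, 3.0726),
    2: (429.9276, 95, 3.7042),
    3: (423.2260, 85, 3.8036),
    4: (384.5076, 105, 3.6602),
    5: (371.4987, 75, 3.2856),
}

def dominates(a, b):
    """True if a is no worse than b on all three objectives and better on one."""
    pae_a, dof_a, nils_a = a
    pae_b, dof_b, nils_b = b
    no_worse = pae_a <= pae_b and dof_a >= dof_b and nils_a >= nils_b
    better = pae_a < pae_b or dof_a > dof_b or nils_a > nils_b
    return no_worse and better

# All ordered pairs (i, j) where mask i dominates mask j.
dominated_pairs = [
    (i, j)
    for i in CANDIDATES
    for j in CANDIDATES
    if i != j and dominates(CANDIDATES[i], CANDIDATES[j])
]
```

For these five candidates `dominated_pairs` is empty, which is consistent with all of them lying on the same Pareto level.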

Table 1. Mask candidates after the rank-based selector for the simple grating pattern

Figure 11 presents the SMO simulation results for the simple grating pattern under the different optimization modes. The four columns, from left to right, show the optimized source, the optimized mask, the resist pattern at the focal plane, and the resist pattern at 100 nm defocus. Rows (a-c), from top to bottom, show the results of single-objective GA SMO, weighted-sum SMO, and NSGA-SMO, respectively. For the source and mask solutions optimized by the single-objective SMO and the weighted-sum SMO, the PAEs at the focal plane are 408 and 373, respectively, which are comparable to the PAE of 384 for the NSGA-SMO resist pattern. However, when the imaging plane is defocused by 100 nm, the printed images for the single-objective SMO and the weighted-sum SMO deteriorate significantly, and spurious lines and assist features start to appear to varying degrees. By contrast, the NSGA-SMO resist pattern remains stable at 100 nm defocus, with PAE = 536, about half that of the single-objective SMO. This demonstrates that NSGA-SMO optimizes DOF more directly and thoroughly.

Fig. 11. SMO results for the five-bar grating pattern. Rows (a-c): the results of Single-objective SMO, Weighted-sum SMO, and NSGA-SMO. Left to right: the optimized source, the optimized mask, the resist pattern at the focal plane, and the resist pattern at 100 nm defocus.


Fig. 12(a) shows the optimized process window of the target pattern. Fig. 12(b) plots the defocus-log(PAE) curve of the desired pattern, with defocus ranging from -200 nm to 200 nm. The dotted line, the green line, and the red line correspond to the methods from top to bottom in Fig. 11, respectively. Compared with the single-objective SMO and the weighted-sum SMO, NSGA-SMO increases the DOF by 37% and 23%, and increases the EL by 50% and 29%, respectively. NSGA-SMO reduces the sensitivity of imaging to defocus and exploits more of the optimization potential of the process window, resulting in better tolerance to process variation.

Fig. 12. Comparison of PW and defocus-log (PAE) for the simple grating pattern (a) The DOF-EL process window (b) the defocus-log(PAE) curve of grating patterns for Single-objective SMO (dotted line), the Weighted-sum SMO (green line) and NSGA-SMO (red line).


6.3 Complicated pattern

Fig. 13 illustrates the optimization process for the complicated pattern. Convergence was reached after 1300 iterations. The curves in different colors, from top to bottom, represent PAE, DOF, and NILS, respectively. NSGA-SMO improved the best PAE, DOF, and NILS to 1320, 78 nm, and 3.784, respectively. The diagram also shows that the MO stage is more critical than the SO stage for optimizing DOF and NILS: after an SO stage, these two objectives change little.

Fig. 13. The iteration process of NSGA-SMO for the complicated pattern. The red line denotes the best PAE optimization steps. The green and blue lines denote the best DOF and NILS respectively in the population in the optimization.


Among all the potential solutions in the first Pareto level, the rank-based selector outputs five alternative solutions. Table 2 shows the mask candidates after the rank-based selector for the complicated pattern. Among them, Mask#1 and Mask#2 perform similarly and are not inferior to the others across the objectives. However, Mask#1 has better mask manufacturability than Mask#2, with an MC of 58 versus 76. Therefore, we chose Mask#1 as the optimal solution for the complicated pattern. Mask#1 is visualized in the bottom row of Fig. 14.

Table 2. Mask candidates after the rank-based selector for the complicated pattern

Fig. 14. SMO results of the complicated pattern. Rows (a-c): the results of single objective SMO, Weighted-sum SMO, and NSGA-SMO. Left to right: the optimized source, the optimized mask, the resist pattern at the focal plane, and the resist pattern at 100 nm defocus.


Figure 14 depicts the optimization results for the complicated pattern; rows (a) to (c) are optimized by single-objective SMO, weighted-sum SMO, and NSGA-SMO, respectively. The first two columns are the optimized source and mask. The other columns are the resist patterns on the wafer at the focal plane and at 100 nm defocus. As in Fig. 11, the results of NSGA-SMO are the most robust against defocus, although the image on the wafer still changes noticeably, with the PAE increasing from 1336 to 3616. For the SMO optimized with the weighted-sum loss or the single-objective loss, the narrow lines almost disappear completely under 100 nm defocus, with the PAE increasing from 1349 and 1330 to 6905 and 8663, respectively.

Figure 15(a) shows the process windows of the complicated pattern for the three optimization modes (red for NSGA-SMO, green for weighted-sum SMO, and the blue dashed line for the traditional single-objective SMO). The PW of the traditional SMO result is very limited, which means that even a small process variation causes the imaging result to exceed the tolerance. The PW is significantly improved by the weighted-sum SMO, and the PW of NSGA-SMO (both DOF and EL) is enhanced further: it is several times larger than that of the traditional SMO and over 25% larger than that of the weighted-sum SMO. The defocus-log(PAE) curves from -100 nm to 100 nm in Fig. 15(b) support this observation. The red line (NSGA-SMO) has a gentle plateau within about ±70 nm, whereas the dashed line (single-objective SMO) and the green line (weighted-sum SMO) have none. From the results in Fig. 11 and Fig. 14, the main difference between NSGA-SMO and the other SMO methods is the placement of the assist features in the mask, which plays an increasingly important role in PW improvement.

Fig. 15. Comparison of PW and defocus-log (PAE) for the complicated pattern (a) The DOF-EL process window (b) the defocus-log(PAE) curve of the complicated pattern for single-objective SMO (dotted line), the Weighted-sum SMO (green line) and NSGA-SMO (red line).


7. Conclusion

This paper proposes a multi-solution, multi-objective source mask optimization method based on NSGA-II. The multi-objective optimization problem is formulated as finding the non-dominated solution set. The source and mask are encoded using their symmetry and DCT coefficients, which improves the optimization efficiency. The Hopkins imaging model and the Abbe imaging model are used for the aerial image calculation in mask optimization and source optimization, respectively. In addition, VLIM decomposes the imaging result under defocus into the ideal imaging result and correction terms caused by defocus, which enables fast calculation of the depth of focus in mask optimization. By combining the NSGA-II multi-objective optimization algorithm with these optimization strategies, the imaging indicators, including the process window, are optimized directly, which exploits more of the optimization potential of the process window. Multiple advantageous solutions are obtained, and the solutions that best match the production requirements are selected manually from the output of the rank-based selector. The simulation results show that non-dominated ranking with direct optimization of the process window effectively improves the imaging indicators and yields better imaging performance and a larger process window than the single-objective SMO mode and the weighted-sum SMO mode.

Appendix 1: Derivation of VLIM

In the Hopkins model, pupil function with defocus can be written as

(25)$$P({f,g} )= {P_0}({f,g} ){e^{i\pi z({{f^2} + {g^2}} )}}$$

where ${P_0}({f,g} )= \left\{ {\begin{array}{{c}} {1,\; \; {f^2} + {g^2} \le 1}\\ {0,\; \; {f^2} + {g^2} > 1} \end{array}} \right.$

By adopting the moment expansion method [41], ${e^{i\pi z({{f^2} + {g^2}} )}}$ can be expanded as $\sum\limits_{n = 0}^\infty \frac{{{{[{i\pi z({{f^2} + {g^2}} )} ]}^n}}}{{n!}}$. Substituting this series into the TCC formula (5) and using the binomial expansion, we have

(26)$$\begin{array}{c} {TCC(f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}) = \sum\limits_{n = 0}^\infty \frac{(i\pi z)^n}{n!}\sum\limits_{k = 0}^n \binom{n}{k} (-1)^{n - k} \iint \left[ (f + f^{\prime})^2 + (g + g^{\prime})^2 \right]^k}\\ {\times \left[ (f + f^{\prime\prime})^2 + (g + g^{\prime\prime})^2 \right]^{n - k} J(f,g) P_0(f + f^{\prime},g + g^{\prime}) P_0^{\ast}(f + f^{\prime\prime},g + g^{\prime\prime})\,df\,dg.} \end{array}$$

Then, TCC can be written as

(27)$$TCC({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )= \mathop \sum \limits_{n = 0}^\infty {z^n}TC{C_n}({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} ).$$

Substituting the above equation into Eq. (4) gives

(28)$$I({x,y} )= \mathop \sum \limits_{n = 0}^\infty {z^n}{I_n}({x,y} ),$$

where

(29)$$\begin{array}{{c}} {{I_n}({x,y} )= \mathrm{\int\!\!\!\int \int\!\!\!\int }TC{C_n}({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )M({f^{\prime},g^{\prime}} ){M^\mathrm{\ast }}({f^{\prime\prime},g^{\prime\prime}} )}\\ { \times {e^{ - i2\pi [{({f^{\prime} - f^{\prime\prime}} )x + ({g^{\prime} - g^{\prime\prime}} )y} ]}}df^{\prime}dg^{\prime}df^{\prime\prime}dg^{\prime\prime}} \end{array},$$

(30)$$\begin{array}{{c}} {TC{C_n}({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} )= \frac{{{{({ - i\pi } )}^n}}}{{n!}}\mathop \sum \limits_{k = 0}^n \left( {\begin{array}{{c}} n\\ k \end{array}} \right){{({ - 1} )}^k}TC{C_{n,k}}({f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}} ),} \end{array}$$

and

(31)$$\begin{array}{c} {TC{C_{n,k}}(f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}) = \iint \left[ (f + f^{\prime})^2 + (g + g^{\prime})^2 \right]^k \left[ (f + f^{\prime\prime})^2 + (g + g^{\prime\prime})^2 \right]^{n - k}}\\ {\times J(f,g) P_0(f + f^{\prime},g + g^{\prime}) P_0^{\ast}(f + f^{\prime\prime},g + g^{\prime\prime})\,df\,dg.} \end{array}$$

Writing ${P_k}(f,g) = {(f^2 + g^2)^k}{P_0}(f,g)$, Eq. (31) gives

(32)$$\begin{array}{c} {TC{C_{n,k}}(f^{\prime},g^{\prime};f^{\prime\prime},g^{\prime\prime}) = \iint J(f,g) P_k(f + f^{\prime},g + g^{\prime}) P_{n - k}^{\ast}(f + f^{\prime\prime},g + g^{\prime\prime})\,df\,dg}\\ { = TCC_{n,n - k}^{\ast}(f^{\prime\prime},g^{\prime\prime};f^{\prime},g^{\prime}).} \end{array}$$

Because the mask function in the spatial domain is real-valued, the mask function in the frequency domain satisfies

(33)$$M(f,g) = M^{\ast}(-f,-g).$$

Similarly,

(34)$$I({f,g} )= {I^\ast }({ - f, - g} ).$$

From the above formula,

(35)$$\begin{array}{{c}} {\mathrm{\int\!\!\!\int }TC{C_{n,n - k}}({f^{\prime} + f,g^{\prime} + g;f^{\prime},g^{\prime}} )M({f^{\prime} + f,g^{\prime} + g} ){M^\mathrm{\ast }}({f^{\prime},g^{\prime}} )df^{\prime}dg^{\prime} = }\\ {{{\left( {\mathrm{\int\!\!\!\int }TC{C_{n,k}}({f^{\prime} - f,g^{\prime} - g;f^{\prime},g^{\prime}} )M({f^{\prime} - f,g^{\prime} - g} ){M^\mathrm{\ast }}({f^{\prime},g^{\prime}} )df^{\prime}dg^{\prime}} \right)}^\mathrm{\ast }}} \end{array}$$

When n is odd, the below formula holds,

(36)$${I_n}({f,g} )={-} I_n^\ast ({ - f, - g} ).$$

So the odd terms have no contribution to the final image. The defocus aerial image expansion can be rewritten as

(37)$$I({x,y} )= \mathop \sum \limits_{n = 0}^\infty {z^{2n}}{I_{2n}}({x,y} ),$$

When defocus range is small, the above equation can be approximated as

(38)$$I({x,y} )\cong {I_0}({x,y} )+ {z^2}{I_2}({x,y} )$$
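The two properties used in this appendix, that the image depends only on even powers of $z$ and that a quadratic model is adequate for small defocus, can be illustrated numerically. The Python sketch below is a toy check under strong simplifications that are not part of the paper's model: a one-dimensional, fully coherent system with an ideal pupil, normalized defocus units, and a hypothetical five-bar test pattern. The function names are hypothetical.

```python
import cmath
import math

def dft(x, sign):
    """Plain O(n^2) discrete Fourier transform (sign=-1 forward, +1 inverse, unscaled)."""
    n = len(x)
    return [sum(x[k] * cmath.exp(sign * 2j * math.pi * k * m / n) for k in range(n))
            for m in range(n)]

def aerial_image(mask, z, cutoff=0.25):
    """1D coherent toy image with pupil P0(f) * exp(i*pi*z*f^2), cf. Eq. (25)."""
    n = len(mask)
    filtered = []
    for m, c in enumerate(dft(mask, -1)):
        cycles = (m if m <= n // 2 else m - n) / n   # signed frequency, cycles/pixel
        f = cycles / cutoff                          # normalised: pupil edge at |f| = 1
        pupil = cmath.exp(1j * math.pi * z * f * f) if abs(f) <= 1.0 else 0.0
        filtered.append(c * pupil)
    return [abs(v / n) ** 2 for v in dft(filtered, +1)]

# Hypothetical five-bar grating: 8-pixel lines on a 16-pixel pitch.
mask = [0.0] * 128
for start in range(24, 104, 16):
    for p in range(start, start + 8):
        mask[p] = 1.0

i0 = aerial_image(mask, 0.0)
z_cal, z_test = 0.1, 0.05
# Fit I_2 from a single defocused image, following Eq. (38).
i2 = [(a - b) / z_cal ** 2 for a, b in zip(aerial_image(mask, z_cal), i0)]

truth = aerial_image(mask, z_test)
mirrored = aerial_image(mask, -z_test)
err_vlim = max(abs(b + z_test ** 2 * c - t) for b, c, t in zip(i0, i2, truth))
err_no_defocus = max(abs(b - t) for b, t in zip(i0, truth))
asymmetry = max(abs(t - m) for t, m in zip(truth, mirrored))
```

Here `asymmetry` measures the difference between the images at $+z$ and $-z$, and `err_vlim` and `err_no_defocus` compare the quadratic model of Eq. (38) with simply ignoring defocus. In this toy setting the images at opposite defocus coincide up to rounding error, as expected when the odd-order terms vanish.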

Funding

Sichuan Province Science and Technology Support Program (2023JDRC0104); West Light Foundation, Chinese Academy of Sciences; Youth Innovation Promotion Association of the Chinese Academy of Sciences (2021380); National Natural Science Foundation of China (61604154); National Key Research and Development Program of China (2021YFB3200204, 61875201, 61975211, 62005287).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. G. E. Moore, “Cramming more components onto integrated circuits,” Proc. IEEE 86(1), 82–85 (1998).

2. C. Han, Y. Li, X. Ma, et al., “Robust hybrid source and mask optimization to lithography source blur and flare,” Appl. Opt. 54(17), 5291–5302 (2015).

3. N. Jia and E. Y. Lam, “Pixelated source mask optimization for process robustness in optical lithography,” Opt. Express 19(20), 19384–19398 (2011).

4. X. Ma and G. R. Arce, Computational Lithography, 1st ed. (John Wiley and Sons, 2010).

5. A. K. Wong, Optical Imaging in Projection Lithography (SPIE, 2005).

6. A. Poonawala and P. Milanfar, “Mask design for optical microlithography—an inverse imaging problem,” IEEE Trans. on Image Process. 16(3), 774–788 (2007).

7. M. D. Levenson, N. S. Viswanathan, and R. A. Simpson, “Improving resolution in photolithography with a phase-shifting mask,” IEEE Trans. Electron Devices 29(12), 1828–1836 (1982).

8. C. A. Mack, “Optimum stepper performance through image manipulation,” SPIE milestone series 178, 614–620 (2004).

9. D. L. Fehers, H. B. Lovering, and R. T. Scruton, “Illuminator modification of an optical aligner,” KTI Microelectronics Seminar 217–230 (1989).

10. Y. Granik, “Fast pixel-based mask optimization for inverse lithography,” J. Micro/Nanolith. MEMS MOEMS 5(4), 043002 (2006).

11. X. Ma and G. R. Arce, “Pixel-based OPC optimization based on conjugate gradients,” Opt. Express 19(3), 2165–2180 (2011).

12. W. Lv, S. Liu, Q. Xia, et al., “Level-set-based inverse lithography for mask synthesis using the conjugate gradient and an optimal time step,” J. Vac. Sci. Technol. B 31(4), 041605 (2013).

13. B. E. A. Saleh and S. I. Sayegh, “Reduction of errors of microphotographic reproductions by optimal corrections of original masks,” Opt. Eng. 20(5), 781–787 (1981).

14. Y. V. Miklyaev, W. Imgrunt, V. S. Pavelyev, et al., “Novel continuously shaped diffractive optical elements enable high-efficiency beam shaping,” in Proc. SPIE, 2010, vol. 7640, pp. 7640–7674.

15. J. T. Carriere, J. Stack, A. D. Kathman, et al., “Advances in DOE modeling and optical performance for SMO applications in immersion lithography at the 32-nm node and beyond,” in Proc. SPIE, 2010, vol. 7640, 793

16. J. Zimmermann, P. Gräupner, D. Hellweg, et al., “Generation of arbitrary freeform source shapes using advanced illumination systems in high-NA immersion scanners,” in Proc. SPIE, 2010, vol. 7640, pp. 36–50.

17. M. Mulder, A. Engelen, O. Noordman, et al., “Performance of flexray: A fully programmable illumination system for generation of freeform sources on high-NA immersion systems,” in Proc. SPIE, 2010, vol. 7640, pp. 647–656.

18. Y. Shen, F. Peng, X. Huang, et al., “Adaptive gradient-based source and mask co-optimization with process awareness,” Chin. Opt. Lett. 17(12), 121102 (2019).

19. S. Li, X. Wang, and Y. Bu, “Robust pixel-based source and mask optimization for inverse lithography,” Opt. Laser Technol. 45, 285–293 (2013).

20. Y. Shen, F. Peng, and Z. Zhang, “Semi-implicit level set formulation for lithographic source and mask optimization,” Opt. Express 27(21), 29659–29668 (2019).

21. L. Wang, S. Li, X. Wang, et al., “Source mask projector optimization method of lithography tools based on particle swarm optimization algorithm,” Acta Opt. Sin. 37(10), 1022001 (2017).

22. T. Fühner and A. Erdmann, “Improved mask and source representations for automatic optimization of lithographic process conditions using a genetic algorithm,” Proc. SPIE 5754, 41–426 (2005).

23. T. Fujisawa, M. Asano, and S. Sutani, “Wafer flatness for CD control in photolithography,” in Proceedings of the SPIE’s 27th Annual International Symposium on Microlithography, Santa Clara, CA, USA, 3–8 March 2002; Volume 4691.

24. A. K. Wong, Optical Imaging in Projection Lithography (SPIE, 2005).

25. A. Erdmann, J. Kye, Y. Mao, et al., “The thermal aberration analysis of a lithography projection lens,” presented at Optical Microlithography XXX (2017).

26. A. Erdmann, “Mask modeling in the low k1 and ultrahigh NA regime: phase and polarization effects (invited paper),” Proc. SPIE 5835, 69–81 (2005).

27. J. T. Azpiroz and A. E. Rosenbluth, Impact of Sub-Wavelength Electromagnetic Diffraction in Optical Lithography for Semiconductor Chip Manufacturing (IEEE, 2013).

28. Y. Peng, J. Zhang, Y. Wang, et al., “Gradient-based source and mask optimization in optical lithography,” IEEE Trans. on Image Process. 20(10), 2856–2864 (2011).

29. W. Conley, X. Ma, Y. Li, et al., “Robust resolution enhancement optimization methods to process variations based on vector imaging model,” presented at Optical Microlithography XXV (2012).

30. A. C. Chen, N. Jia, B. Lin, et al., “Robust mask design with defocus variation using inverse synthesis,” presented at Lithography Asia (2008).

31. P. Wei, Y. Li, T. Li, et al., “Multi-Objective Defocus Robust Source and Mask Optimization Using Sensitive Penalty,” Appl. Sci. 9(10), 2151 (2019).

32. J.-R. Gao, X. Xu, Y. Bei, et al., “MOSAIC: Mask optimizing solution with process window aware inverse correction,” Proc. ACM/IEEE Design Autom. Conf. (DAC), San Francisco, CA, USA, 2014 pp. 1–6.

33. M. K. Sears, G. Fenger, J. Mailfert, et al., “Extending SMO into the lens pupil domain,” Proc. SPIE 7973, 79731B (2011).

34. M. K. Sears, J. Bekaert, and B. W. Smith, “Pupil wavefront manipulation for optical nanolithography,” Proc. SPIE 8326, 832611 (2012).

35. M. K. Sears, J. Bekaert, and B. W. Smith, “Lens wave front compensation for 3D photomask effects in subwavelength optical lithography,” Appl. Opt. 52(3), 314–322 (2013).

36. K. Deb, “A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II,” IEEE Trans. Evol. Computat. 6(2), 182–197 (2002).

37. A. K.-K. Wong, Optical Imaging in Projection Microlithography (SPIE, 2005).

38. A. R. Rosenbluth and N. Seong, “Global Optimization of the Illumination Distribution to Maximize Integrated Process Window,” Proc. SPIE 6154, 61540H (2006).

39. L. Wang, S. Li, X. Wang, et al., “Pixel-based mask optimization via particle swarm optimization algorithm for inverse lithography,” Proc. SPIE 9780, 97801V (2016).

40. T. Fühner, A. Erdmann, and S. Seifert, “Direct optimization approach for lithographic process conditions,” Journal of Micro/Nanolithography, MEMS and MOEMS 6(3), 031006 (2007).

41. Y.-T. Wang, Y. C. Pati, and T. Kailath, “Depth of focus and the moment expansion,” Opt. Lett. 20(18), 1841–1843 (1995).

42. P. Yu, S. X. Shi, and D. Z. Pan, “True process variation aware optical proximity correction with variational lithography modeling and model calibration,” J. Micro/Nanolithography, MEMS, and MOEMS, 6(3), 574–576 (2007).

43. C. A. Mack, “Using the normalized image log-slope, Part 2,” Microlithography World (2001).

44. R. T. Marler and J. S. Arora, “The weighted sum method for multi-objective optimization: new insights,” Struct. Multidiscipl. Optim. 41(6), 853–862 (2010).

Table 1 (simple grating pattern)
Mask#	PAE (rank)	DOF in nm (rank)	NILS (rank)	MC (rank)
1	533.7059(5)	110(1)	3.0726(5)	72(5)
2	429.9276(4)	95(3)	3.7042(2)	23(1)
3	423.2260(3)	85(4)	3.8036(1)	28(2)
4	384.5076(2)	105(2)	3.6602(3)	40(3)
5	371.4987(1)	75(5)	3.2856(4)	43(4)

Table 2 (complicated pattern)
Mask#	PAE (rank)	DOF in nm (rank)	NILS (rank)	MC (rank)
1	1349.52(3)	78(1)	3.752(3)	58(2)
2	1346.19(2)	76(2)	3.760(4)	76(4)
3	1322.08(1)	65(5)	2.929(5)	83(5)
4	1352.66(3)	72(3)	3.733(2)	62(3)
5	1355.72(4)	68(4)	3.784(1)	32(1)



Optics Express