PSM design for inverse lithography with partially coherent illumination

Xu Ma; Gonzalo R. Arce

doi:10.1364/OE.16.020126

1. Introduction

Due to the resolution limits of optical lithography, the electronics industry has relied on resolution enhancement techniques (RET) to compensate and minimize mask distortions as they are projected onto semiconductor wafers [1]. Resolution in optical lithography obeys the Rayleigh resolution limit $R = k \frac{λ}{NA}$ , where λ is the wavelength, NA is the numerical aperture, and k is the process constant which can be minimized through RET methods [2, 3, 4, 5]. In optical proximity correction (OPC) methods, mask amplitude patterns are modified by the addition of sub-resolution features that can pre-compensate for imaging distortions [6]. Phase-shifting mask (PSM) methods, commonly attributed to Levenson [7], induce phase shifts in the transmitted field which have a favorable constructive or destructive interference effect. Thus, a suitable modulation of both the intensity and the phase of the incident light can be used to effectively compensate for some of the resolution-limiting phenomena in optical diffraction.

Several approaches to PSM for inverse lithography have been proposed in the literature. Liu and Zakhor developed a binary and phase-shifting mask design strategy based on the branch and bound algorithm and simulated annealing [8]. Pati and Kailath developed sub-optimal projections onto convex sets for PSM designs [9]. Neither method, however, is based on gradient-type optimization, so the search for a suitable solution is computationally expensive or inefficient. Recently, Poonawala and Milanfar introduced a novel PSM optimization framework for inverse lithography relying on a pixel-based, continuous function formulation, well suited for gradient-based search [10]. Ma and Arce generalized this algorithm so as to admit multi-phase components having arbitrary PSM patterns [11, 12]. Although these algorithms are computationally effective, they focus on coherent illumination systems.

Most practical illumination sources, however, have a nonzero line width and their radiation is more generally described as partially coherent [13]. Partially coherent illumination (PCI) is desired, as it can improve the theoretical resolution limit [1]. In partially coherent imaging, the mask is illuminated by light travelling in various directions. The source points giving rise to these incident rays are incoherent with one another, such that there is no interference that could lead to nonuniform light intensity impinging on the mask [1]. A schematic of an optical lithography system with partially coherent illumination is illustrated in Fig. 1. The light source with a wavelength of λ is placed at the focal plane of the first condenser (L₁), illuminating the mask. The image of the photomask is formed by the projection optics onto the wafer [1]. The partial coherence factor $σ = \frac{a}{b}$ is defined as the ratio between the size of the source image and the size of the pupil [14].

Fig. 1. Optical lithography system with partially coherent illuminations


The optical lithography system reduces to simple forms in the two limits. When the illumination source is a single point, the system reduces to the completely coherent case. On the other hand, when the illumination source is of infinite extent, the system reduces to the completely incoherent case. Phase-shifting masks provide no advantages in the completely incoherent case, while they make their most significant contributions to the output intensity in the completely coherent case. Common partially coherent illumination modes lie between these two limits, and include dipole, quadrupole and annular shapes, which provide small to large partial coherence factors. Illumination with large partial coherence factors is closer to the completely incoherent case, while small partial coherence factors approach completely coherent illumination. There are tradeoffs in the extent to which partial coherence is used. Large partial coherence factors, such as σ=0.9, lead to improvements in resolution and contrast. On the other hand, small partial coherence factors, such as σ=0.3, are advantageous for forming sparse patterns, which can be exploited effectively by phase-shifting masks. Medium partial coherence factors such as σ=0.5 and σ=0.6 are preferred for mask patterns containing both sparse and dense features. The smallest usable partial coherence factor is approximately σ=0.3 [1].

While inverse lithography optimization methods have been studied extensively in the past two decades for the case of coherent illumination, equivalent methods for inverse lithography under partially coherent illumination have not been addressed until recently [15]. In [16, 17], Ma and Arce used the sum of coherent systems (SOCS) model and the average coherent approximation model for partially coherent illumination to develop gradient-based binary mask design algorithms for inverse lithography under partially coherent illumination. The goal of this paper is to extend the concepts in [16, 17] to gradient-based inverse optimization algorithms for the design of PSMs under partially coherent illumination. As will be described shortly, the proposed methods are most effective with small to medium partial coherence factors. Since the imaging synthesis and analysis of partially coherent systems are much more complex than in the coherent case, the singular value decomposition (SVD) is applied to expand the partially coherent imaging equation by eigenfunctions into a sum of coherent systems [9, 18]. An iterative optimization framework for the PSM design is formulated in which the partially coherent imaging system is approximated by the first order coherent approximation corresponding to the largest eigenvalue. The first order coherent approximation removes the influence among different coherent components during the inverse optimization process and reduces the computational complexity of the algorithms.

In order to simplify manufacturing, a reduction of mask pattern complexity is desired. To this end, we extend the method in [19] to develop a 2D DCT post-processing method that reduces the detail complexity of the optimized PSM, yet preserves or improves the fidelity of the output pattern. In essence, the approach is to reduce the solution space by cutting off the high frequency components of the desired pattern in the discrete cosine spectrum. Low frequency components of the mask pattern have been shown to have more influence on the fidelity of the output pattern than the high frequency components. Thus, a fraction of the high frequency components of the optimized PSM is deleted. Interestingly, the fidelity of the output pattern may be improved by cutting off high frequency components. The relationships among the number of maintained DCT low frequency components, mask complexity and the output pattern fidelity are discussed.

The final contribution of this paper is the use of photoresist tone reversing in PSM design so as to project extremely sparse patterns [1]. Reversing the photoresist tone allows an optimized PSM to produce targets with spatial frequencies beyond the resolution limit that applies without photoresist tone reversing. The remainder of the paper is organized as follows: Partially coherent imaging models are discussed in Section 2. The PSM optimization process for partially coherent illumination with small to medium partial coherence factors is discussed in Section 3. The post-processing of the mask pattern based on 2D DCT is described in Section 4. The photoresist tone reversing technique for projecting extremely sparse patterns is described in Section 5. Conclusions are provided in Section 6.

2. Partially coherent imaging models

Practical lithography systems most often operate under partially coherent illuminations due to non-zero width sources and off-axis illuminations from spatially extended sources. Common partially coherent illumination modes include dipole, quadrupole and annular shapes, as well as small, medium, and large partial coherence factors. In order to formulate the optimization problem of ILT with partially coherent illuminations, the Hopkins diffraction model and the SOCS model based on SVD are discussed in this section.

2.1. Hopkins diffraction model

According to the Hopkins diffraction model, the light intensity distribution exposed on the wafer with PCI is bilinear and described by [20]

I (r) = \int \int_{- \infty}^{+ \infty} M (r_{1}) M (r_{2}) γ (r_{1} - r_{2}) h^{*} (r - r_{1}) h (r - r_{2}) d r_{1} d r_{2},

where r=(x,y), r ₁=(x ₁,y ₁) and r ₂=(x ₂,y ₂). M(r) is the mask pattern, γ(r ₁-r ₂) is the complex degree of coherence, and h(r) represents the amplitude impulse response of the optical system. The complex degree of coherence γ(r ₁-r ₂) is generally a complex number, whose magnitude represents the extent of optical interaction between two spatial locations r ₁=(x ₁,y ₁) and r ₂=(x ₂,y ₂) of the light source [1]. The complex degree of coherence in the spatial domain is the inverse 2-D Fourier transform of the illumination shape. In general, the intensity distribution equation in Eq. (1) is tedious to compute, both analytically and numerically [13]. The system reduces to simple forms in the two limits of complete coherence or complete incoherence. For the completely coherent case, the illumination source is at a single point, thus, γ(r)=1. In this case, the intensity distribution in Eq. (1) is separable on r ₁ and r ₂, and thus I(r)=|M(r)⊗h(r)|², where ⊗ is the convolution operation. For the completely incoherent case, the illumination source is of infinite extent and thus, γ(r)=δ(r). In this case, the intensity distribution reduces to I(r)=|M(r)|²⊗|h(r)|². In the frequency domain, Eq. (1) is translated as

I (x, y) = \int \int \int \int_{- \infty}^{+ \infty} TCC (f_{1}, g_{1}; f_{2}, g_{2}) \tilde{M} (f_{1}, g_{1}) {\tilde{M}}^{*} (f_{2}, g_{2}) \exp \{- i 2 π [(f_{1} - f_{2}) x + (g_{1} - g_{2}) y]\} d f_{1} d g_{1} d f_{2} d g_{2},

where M̃(f₁,g₁) and M̃(f₂,g₂) are the Fourier transforms of M(x₁,y₁) and M(x₂,y₂), respectively. TCC(f₁,g₁; f₂,g₂) is the transmission cross-coefficient, which indicates the interaction between M̃(f₁,g₁) and M̃(f₂,g₂). Specifically,

TCC (f_{1}, g_{1}; f_{2}, g_{2}) = \int \int_{- \infty}^{+ \infty} \tilde{γ} (f, g) \tilde{h} (f + f_{1}, g + g_{1}) {\tilde{h}}^{*} (f + f_{2}, g + g_{2}) d f d g,

where γ̃(f,g), referred to as the effective source, is the Fourier transform of γ(x,y), and h̃(f,g) is the Fourier transform of h(x,y).
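The two limiting cases of Eq. (1) can be illustrated numerically. The sketch below is only a toy: it uses circular (FFT-based) convolution and a hypothetical Gaussian kernel `toy_kernel` as a stand-in for h(r), neither of which comes from this paper. It shows that the incoherent limit depends only on |M(r)|², so a sign (phase) change in the mask has no effect there, whereas the coherent limit is sensitive to it.

```python
import numpy as np

def circ_conv2(a, b):
    """Circular 2-D convolution of two equally sized arrays via the FFT."""
    return np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b))

def coherent_intensity(mask, h):
    # Completely coherent limit, gamma(r) = 1: I = |M (*) h|^2
    return np.abs(circ_conv2(mask, h)) ** 2

def incoherent_intensity(mask, h):
    # Completely incoherent limit, gamma(r) = delta(r): I = |M|^2 (*) |h|^2
    return np.real(circ_conv2(np.abs(mask) ** 2, np.abs(h) ** 2))

def toy_kernel(n, width=1.5):
    # Hypothetical Gaussian stand-in for h(r), centred on pixel (0, 0)
    # with wrap-around so that it suits circular convolution.
    x = np.fft.fftfreq(n, d=1.0 / n)          # 0, 1, ..., -2, -1
    xx, yy = np.meshgrid(x, x, indexing="ij")
    h = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * width ** 2))
    return h / h.sum()
```

For a mask whose left half is +1 and right half is -1, `incoherent_intensity` returns the same image as for an all-ones mask, while `coherent_intensity` shows a dark line at the phase edge.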

2.2. SVD model as the sum of coherent systems

The singular value decomposition (SVD) model, described in this section, decomposes the Hopkins diffraction model of Section 2.1 into a sum of coherent systems [18]. The result is a bank of linear systems whose outputs are squared, scaled and summed. The SVD model is summarized as follows.

Given the discretization of the mask pattern M(x,y), referred to as M(m,n), m,n=1,2, …,N, the intensity distribution on the wafer shown in Eq. (2) can be reformulated as a function of matrices

I (m, n) = {\tilde{s}}^{H} A \tilde{s}, m, n = 1, 2, \dots, N,

where H is the conjugate transposition operator, s̃ is an N²×1 vector, and the ith entry of s̃ is

{\tilde{s}}_{i} = \tilde{M} (p, q) \exp [i 2 π (pm + qn)] i = 1, 2, \dots, N^{2},

where M̃(p,q)=FFT{M(m,n)}, and FFT{·} is the FFT operator. p=i mod N, $q = ⌈ \frac{i}{N} ⌉$ , and ⌈·⌉ is the smallest integer not less than the argument. A is an N²×N² matrix containing the information of the transmission cross-coefficient TCC. Specifically, the entry in the ith row and jth column of A is A_{ij}=TCC(p,q; r,u), where p=i mod N, $q = ⌈ \frac{i}{N} ⌉$ , r=j mod N, and $u = ⌈ \frac{j}{N} ⌉$ . In order to reformulate Eq. (4) into the sum of coherent systems, the variable pairs (p,q) and (r,u) in the argument of TCC should be separated by the singular value decomposition. The result of the singular value decomposition of A is $A = \sum_{k = 1}^{N^{2}} α_{k} V_{k} V_{k}^{*}$ , where α_k is the kth eigenvalue, and α₁>α₂>…>α_{N²}. The N²×1 vector V_k is the eigenfunction corresponding to α_k. Thus Eq. (4) becomes

I (m, n) = \sum_{k = 1}^{N^{2}} α_{k} {∣ {\tilde{s}}^{T} V_{k} ∣}^{2} .

Let S^{-1}(·) be the inverse column stacking operation which converts the N²×1 column vector V_k into an N×N square matrix S^{-1}(V_k). In particular,

{\tilde{h}}_{k} (p, q) = S^{- 1} (V_{k}) = (\begin{matrix} V_{k, 1} & V_{k, N + 1} & \dots & V_{k, N (N - 1) + 1} \\ V_{k, 2} & V_{k, N + 2} & \dots & V_{k, N (N - 1) + 2} \\ ⋮ & ⋮ & ⋱ & ⋮ \\ V_{k, N} & V_{k, 2 N} & \dots & V_{k, N^{2}} \end{matrix}),

where V _k,_i is the ith entry of V _k. Taking the inverse FFT of h̃_k(p,q) leads to the kth equivalent kernel of the SOCS model,

h_{k} (m, n) = IFFT {{\tilde{h}}_{k} (p, q)}, m, n = 1, 2, \dots, N .
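In array terms, the inverse column stacking of Eq. (7) is a column-major reshape, and Eq. (8) is a 2-D inverse FFT of the result. The helper names below are illustrative and not part of the paper.

```python
import numpy as np

def inverse_column_stack(v, n):
    """S^{-1}(V): entries 1..N fill the first column, N+1..2N the second, etc."""
    return np.asarray(v).reshape((n, n), order="F")

def equivalent_kernel(v, n):
    """k-th SOCS kernel as the 2-D inverse FFT of the reshaped eigenvector."""
    return np.fft.ifft2(inverse_column_stack(v, n))
```

For example, `inverse_column_stack(np.arange(1, 10), 3)` places 1, 2, 3 in the first column, matching the layout of the matrix in Eq. (7).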

Substituting Eq. (7) and Eq. (8) into Eq. (6),

I (m, n) = \sum_{k = 1}^{N^{2}} α_{k} {∣ h_{k} (m, n) \otimes M (m, n) ∣}^{2} .

Note that the partially coherent system is decomposed into the superposition of N ² coherent systems. The scheme of the SOCS decomposition by SVD is depicted in Fig. 2. The ith order coherent approximation to the partially coherent system is defined as

I (m, n) \approx \sum_{k = 1}^{i} α_{k} {∣ h_{k} (m, n) \otimes M (m, n) ∣}^{2}, i = 1, 2, \dots, N^{2} .
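The decomposition can be checked on a small 1-D toy problem. The sketch below is an illustration under stated assumptions, not the paper's simulation: it uses a 1-D ideal low-pass pupil, a uniformly weighted discrete source, and integer frequency indices, all chosen for brevity. It builds the TCC matrix, eigendecomposes it, and compares the sum of coherent systems with a direct source-point (Abbe) summation.

```python
import numpy as np

def build_tcc(n=32, cutoff=6, sigma=0.3):
    """1-D toy TCC matrix A[k1, k2] = sum_s w_s P(k1 + s) P*(k2 + s)."""
    k = np.fft.fftfreq(n, d=1.0 / n)                 # integer frequency indices
    s_max = int(round(sigma * cutoff))               # source half-width in samples
    shifts = np.arange(-s_max, s_max + 1)
    weights = np.full(shifts.size, 1.0 / shifts.size)
    # Ideal low-pass pupil as seen from each source point (one row per point)
    pupils = np.stack([(np.abs(k + s) <= cutoff).astype(float) for s in shifts])
    tcc = np.einsum("s,si,sj->ij", weights, pupils, pupils.conj())
    return tcc, pupils, weights

def socs_kernels(tcc):
    """Eigenvalues (descending) and eigenvectors of the Hermitian TCC matrix."""
    alpha, vecs = np.linalg.eigh(tcc)
    order = np.argsort(alpha)[::-1]
    return alpha[order], vecs[:, order]

def abbe_intensity(mask, pupils, weights):
    """Reference intensity: weighted sum of coherent images, one per source point."""
    spectrum = np.fft.fft(mask)
    fields = np.fft.ifft(spectrum[None, :] * pupils, axis=1)
    return weights @ np.abs(fields) ** 2

def socs_intensity(mask, alpha, vecs, order):
    """Order-`order` coherent approximation: sum_k alpha_k |h_k (*) M|^2."""
    spectrum = np.fft.fft(mask)
    fields = np.fft.ifft(spectrum[:, None] * vecs[:, :order], axis=0)
    return np.abs(fields) ** 2 @ alpha[:order]
```

Calling `socs_intensity` with `order=1` gives the first order coherent approximation; with all eigenvalues retained it reproduces `abbe_intensity` up to rounding.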

Fig. 2. A partially coherent system represented by a SVD decomposition of a sum of coherent systems.


An example of the first 50 eigenvalues of the SOCS decomposition with small and medium partial coherence factors is illustrated in Fig. 3. In this simulation, the partially coherent illuminations are circular illuminations with partial coherence factors σ=0.3 and σ=0.6. The dimension of the discretization is N=51. The pixel size is 11nm×11nm. The effective source is

\tilde{γ} (f, g) = \frac{λ^{2}}{π {(σ NA)}^{2}} circ (\frac{λ \sqrt{f^{2} + g^{2}}}{σ NA})

where NA=1.35 and λ=193nm. The amplitude impulse response is defined as the Fourier transform of the circular lens aperture with cutoff frequency NA/λ [21, 22]; therefore,

h (r) = h (x, y) = \frac{J_{1} (2 π r NA ⁄ λ)}{2 π r NA ⁄ λ} .

The Fourier transform of h(x,y) is

\tilde{h} (f, g) = \frac{λ^{2}}{π {(NA)}^{2}} circ (\frac{λ \sqrt{f^{2} + g^{2}}}{NA}) = \begin{cases} \frac{λ^{2}}{π {(NA)}^{2}} & for \sqrt{f^{2} + g^{2}} \leq \frac{NA}{λ} \\ 0 & elsewhere \end{cases} .
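To get a feel for how coarsely the circ functions in Eqs. (11) and (13) are sampled at N=51 with 11nm pixels, the following sketch builds their supports on the FFT frequency grid. The function names and the choice of an unshifted FFT grid are assumptions made for illustration; the constant amplitude factors are omitted because only the support is shown.

```python
import numpy as np

def frequency_radius(n=51, pixel_nm=11.0):
    """Radial spatial frequency (cycles per nm) on the n x n FFT grid."""
    f = np.fft.fftfreq(n, d=pixel_nm)
    fx, fy = np.meshgrid(f, f, indexing="ij")
    return np.sqrt(fx ** 2 + fy ** 2)

def source_and_pupil(sigma, na=1.35, wavelength_nm=193.0, n=51, pixel_nm=11.0):
    """Boolean supports of the effective source and of the pupil."""
    rho = frequency_radius(n, pixel_nm)
    pupil = rho <= na / wavelength_nm
    source = rho <= sigma * na / wavelength_nm
    return source, pupil
```

On this grid the frequency spacing is 1/561 nm⁻¹, so the pupil covers 45 samples, while the source covers 5 samples for σ=0.3 and 21 samples for σ=0.6.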

Fig. 3. Eigenvalues α _k of sum of coherent systems decomposition by SVD.


The amplitude of the first and second equivalent kernels corresponding to the first and second largest eigenvalues with σ=0.3 are illustrated in Figs. 4(a) and 4(b), respectively. It is noted that for the illuminations having small partial coherence factors, the eigenvalues decay very rapidly. It was proved that for partial coherence factors σ≤0.5, a partially coherent imaging system may be approximated to within 10% error by the first order coherent approximation [9].

3. PSM Optimization Using Inverse Lithography

3.1. Optimization using the SOCS model

Fig. 4. (a) The amplitude of the first equivalent kernel corresponding to the largest eigenvalue |ϕ₁(x,y)|, (b) second equivalent kernel corresponding to the second largest eigenvalue |ϕ₂(x,y)|.

Let M(m,n) be the input phase-shifting mask to an optical lithography system T{·}, with a partially coherent illumination. The PCI optical system is approximated by a Hopkins diffraction model described in Eq. (1). The effect of the photoresist is modelled by a hard threshold operation. The output pattern is denoted as Z(m,n)=T{M(m,n)}. Given an N×N desired output pattern Z*(m,n), the goal of PSM design is to find the optimized M(m,n) called M̂(m,n) such that the distance

D = d (Z (m, n), Z^{*} (m, n)) = d (T {M (m, n)}, Z^{*} (m, n))

is minimized, where d(·, ·) is the mean square error criterion. The PSM inverse lithography optimization problem can thus be formulated as the search of M̂(m,n) over the N×N real space ℜ^{N×N} such that

\hat{M} (m, n) = \arg min_{M (m, n) \in ℜ^{N \times N}} d (T {M (m, n)}, Z^{*} (m, n)) .

Figure 5 depicts the approximated forward process model [10]. The phase-shifting mask is the input of the system. Light propagating through the mask pattern is affected by diffraction and mutual interference, phenomena described by the Hopkins diffraction model [6, 21, 23]. In this optimization approach, the PCI optical system is approximated by the first order coherent approximation of the SOCS model. |·| is the element-by-element absolute operation, and the output of the convolution and the absolute operation model is the intensity distribution of the aerial image. The aerial image is projected on a light-sensitive photoresist, which is subsequently developed through the use of solvents. The thickness of the remaining photoresist after development is proportional to the exposure dose exceeding a given threshold intensity. In a positive photoresist process, almost all the photoresist material remains in the low-exposure area on the wafer and is removed in the high-exposure area. Between these two extremes is the transition region. For mathematical simplicity, it is assumed that when the light field exceeds a threshold, the exposed area becomes a high-exposure area; otherwise, it is a low-exposure area. Thus, a hard threshold operation can adequately represent the exposure effect described above. Further, since the sigmoid function is differentiable, it is used to approximate the hard threshold function. The hard threshold function is a shifted unit step function U(x-t_r), which is approximated by the sigmoid function

sig (x) = \frac{1}{1 + \exp [- a (x - t_{r})]},

where t _r is the process threshold, and a dictates the steepness of the sigmoid function.
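A short sketch of Eq. (16) next to the hard threshold it replaces. The default values a=200 and t_r=0.003 are the ones quoted later in the simulations; the function names are illustrative. The derivative a·sig·(1-sig) is what makes the model usable in a gradient search.

```python
import numpy as np

def sigmoid_resist(intensity, a=200.0, t_r=0.003):
    """Smooth photoresist model: 1 / (1 + exp(-a (x - t_r)))."""
    x = np.asarray(intensity, dtype=float)
    return 1.0 / (1.0 + np.exp(-a * (x - t_r)))

def hard_threshold(intensity, t_r=0.003):
    """Shifted unit step U(x - t_r) that the sigmoid approximates."""
    return (np.asarray(intensity, dtype=float) > t_r).astype(float)

def sigmoid_slope(intensity, a=200.0, t_r=0.003):
    """Derivative of the sigmoid with respect to intensity: a * s * (1 - s)."""
    s = sigmoid_resist(intensity, a, t_r)
    return a * s * (1.0 - s)
```

The sigmoid equals 0.5 at the threshold, where its slope is a/4, and it approaches the step function as a grows.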

Fig. 5. Approximated forward process model.


Following the definitions above, the following notation is used:

1) The N×N real-valued matrix M represents the mask pattern, with an N²×1 equivalent raster-scanned vector representation denoted as m̱.

2) A convolution matrix H₁ is an N²×N² matrix. Its equivalent two-dimensional filter is the first equivalent kernel h₁(m,n) of the SOCS model.

3) The desired N×N binary output pattern is denoted as Z*. It is the desired light distribution sought on the wafer. Its vector representation is denoted as ẕ*.

4) The output of the sigmoid function is the N×N image denoted as: Z=sig{|H ₁{m̱}|²}. The equivalent vector is denoted as ẕ.

5) The hard threshold version of Z is the binary output pattern denoted as Z _b. Its equivalent vector is denoted as ẕ_b, with all entries constrained to 0 or 1.

6) The optimized N×N real-valued mask denoted as M̂ minimizes the distance between Z and Z*, i.e.,

\hat{M} = \arg min_{M} d (sig {{∣ H_{1} {\underset{̲}{m}} ∣}^{2}}, Z^{*}) .

Its equivalent vector is denoted as m̱̂∈[-1,1].

7) The trinary optimized mask M̂_tri is the quantization of M̂. Its equivalent vector is denoted as m̱̂_tri, with all entries constrained to -1, 0 or 1.

Given the gray level pattern ẕ=sig{|H ₁{m̱}|²}, the ith entry in this vector can be represented as

{\underset{̲}{z}}_{i} = \frac{1}{1 + \exp [- a {∣ \sum_{j = 1}^{N^{2}} h_{1, ij} {\underset{̲}{m}}_{j} ∣}^{2} + a t_{r}]}, i = 1, \dots, N^{2},

where h_{1,ij} is the (i,j)th entry of the convolution matrix H₁ defined by the first equivalent kernel h₁(m,n). In the optimization process, m̱̂ is searched to minimize the L₂ norm of the difference between ẕ and ẕ*. Therefore,

\hat{\underset{̲}{m}} = \arg min_{\underset{̲}{m}} {F (\underset{̲}{m})},

where the cost function F(·) is defined as:

F (\underset{̲}{m}) = {∥ {\underset{̲}{z}}^{*} - \underset{̲}{z} ∥}_{2}^{2} = \sum_{i = 1}^{N^{2}} {({\underset{̲}{z}}_{i}^{*} - {\underset{̲}{z}}_{i})}^{2},

where ẕ_i in Eq. (20) is represented in Eq. (18). In order to reduce the above bound-constrained optimization problem to an unconstrained optimization problem, we adopt the parametric transformation [10]. Let m̱_j=cos(θ̱_j), j=1,…,N ², where θ̱_j∈(-∞,∞) and m̱_j∈[-1,1]. Defining the vector $\underset{̲}{θ} = {[θ_{1}, \dots, θ_{N^{2}}]}^{T}$ , the optimization problem is formulated as

\hat{\underset{̲}{θ}} = \arg min_{\underset{̲}{θ}} {F (\underset{̲}{θ})} .

The steepest-descent method is used to optimize the above problem. The gradient ∇F(θ̱) can be calculated as follows:

\nabla F (\underset{̲}{θ}) = {\underset{̲}{d}}_{\underset{̲}{θ}} = 2 a \times \sin \underset{̲}{θ} ⊙ (H_{1}^{* H} [({\underset{̲}{z}}^{*} - \underset{̲}{z}) ⊙ \underset{̲}{z} ⊙ (\underset{̲}{1} - \underset{̲}{z}) ⊙ (H_{1}^{*} \underset{̲}{m})]),

where $\nabla F (\underset{̲}{θ}) \in ℜ^{N^{2} \times 1}$ , ⊙ is the element-by-element multiplication operator, and $\underset{̲}{1} = {[1, \dots, 1]}^{T} \in ℜ^{N^{2} \times 1}$ . Assuming θ̱^k is the result of the kth iteration, then at the (k+1)th iteration:

{\underset{̲}{θ}}^{k + 1} = {\underset{̲}{θ}}^{k} - s_{\underset{̲}{θ}} {\underset{̲}{d}}_{\underset{̲}{θ}}^{k},

where s_θ̱ is the step size.
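The optimization loop can be sketched with a small dense matrix standing in for H₁. The gradient below is obtained by applying the chain rule directly to the cost of Eq. (20) with m̱=cos(θ̱), keeping the real part that arises for a complex-valued kernel; its constant scaling may therefore differ from the expression printed in Eq. (22), which only rescales the step size. The function names, the dense-matrix representation, and the fixed iteration count are assumptions made for illustration.

```python
import numpy as np

def forward(theta, H, a, t_r):
    """Mask m = cos(theta), coherent field H m, and sigmoid output z."""
    m = np.cos(theta)
    field = H @ m
    z = 1.0 / (1.0 + np.exp(-a * (np.abs(field) ** 2 - t_r)))
    return m, field, z

def cost(theta, H, z_star, a, t_r):
    """F = sum_i (z*_i - z_i)^2."""
    return float(np.sum((z_star - forward(theta, H, a, t_r)[2]) ** 2))

def gradient(theta, H, z_star, a, t_r):
    """dF/dtheta by the chain rule through the sigmoid, |.|^2 and cos."""
    m, field, z = forward(theta, H, a, t_r)
    w = (z_star - z) * z * (1.0 - z)
    # d|Hm|_i^2 / dm_j = 2 Re(conj(H_ij) (Hm)_i)
    back = 2.0 * np.real(H.conj().T @ (w * field))
    return 2.0 * a * np.sin(theta) * back

def steepest_descent(theta0, H, z_star, a, t_r, step, iterations):
    """theta^{k+1} = theta^k - step * gradient(theta^k)."""
    theta = np.array(theta0, dtype=float)
    for _ in range(iterations):
        theta = theta - step * gradient(theta, H, z_star, a, t_r)
    return theta
```

In practice H₁ is a convolution, so the two matrix products would be replaced by FFT-based convolutions with h₁(m,n) and its flipped conjugate; the dense form is used here only so that the gradient can be checked against finite differences.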

The iterative optimization above, in general, leads to a real-valued mask with pixel values between -1 and 1. Therefore a post-processing step is needed to obtain the trinary optimized mask, m̱̂_{tri,i}=sgn(m̱̂_i)U(|m̱̂_i|-t_m), i=1,…,N², where t_m is a global threshold. We define the pattern error E as the distance between the desired output image Z* and the actual binary output pattern Z_b evaluated by Eq. (2) and a hard threshold operator,

E = \sum_{i = 1}^{N^{2}} {∣ {\underset{̲}{z}}_{i}^{*} - {\underset{̲}{z}}_{bi} ∣}^{2} = \sum_{i = 1}^{N^{2}} {∣ {\underset{̲}{z}}_{i}^{*} - T {M_{tri}} ∣}^{2} .
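The trinarization rule and the pattern error translate directly into array operations. In this sketch the unit step is taken as U(x)=1 for x≥0, which is an assumption about the boundary case, and the function names are illustrative.

```python
import numpy as np

def trinarize(m_hat, t_m=0.33):
    """m_tri = sgn(m) * U(|m| - t_m), giving entries in {-1, 0, 1}."""
    m_hat = np.asarray(m_hat, dtype=float)
    return np.sign(m_hat) * (np.abs(m_hat) >= t_m)

def pattern_error(z_star, z_b):
    """E = sum_i |z*_i - z_b,i|^2; for binary patterns, the count of differing pixels."""
    diff = np.asarray(z_star, dtype=float) - np.asarray(z_b, dtype=float)
    return float(np.sum(np.abs(diff) ** 2))
```

Here `z_b` is assumed to be the thresholded output of the full imaging model for the trinary mask, computed elsewhere.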

3.2. Discretization regularization

In the formulation above, the fact that the optimized mask should be trinary is not taken into account. An additional post-processing step (trinarization) of the gray optimized mask pattern is suboptimal, with no guarantee that the pattern error meets the goal [10]. One approach to overcome this disadvantage is through regularization during the optimization process [10, 24]. Regularization is formulated as follows:

\underset{̲}{\hat{m}} = \arg min_{\underset{̲}{m}} {F (\underset{̲}{m}) + γ R (\underset{̲}{m})},

where F(m̱) is the data-fidelity term and R(m̱) is the regularization term, which is used to reduce the solution space and constrain the optimized results. γ is a user-defined parameter that sets the weight of the regularization. In the following, the discretization penalty is discussed.

In order to attain a near-trinary optimized mask pattern through the optimization process, we adopt the discretization penalty [10]. The formulation of the discretization penalty is summarized as follows. For each pixel value, the discretization penalty term is

r_{D} ({\underset{̲}{m}}_{i}) = - 4.5 {\underset{̲}{m}}_{i}^{4} + {\underset{̲}{m}}_{i}^{2} + 3.5, i = 1, \dots, N^{2} .

Thus, the cost function in Eq. (20) is adjusted as J(m̱)=F(m̱)+γ_D R_D(m̱). In our simulations for σ=0.3, discretization regularization yields a near-trinary optimized mask and reduces the output pattern error by 30%.
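The per-pixel penalty of Eq. (26) and its derivative are easy to evaluate. The sketch below assumes that R_D(m̱) is the sum of r_D over all pixels, which the text implies but does not state explicitly; the function names and the default γ_D=0.1 (the value used in the simulations) are for illustration.

```python
import numpy as np

def discretization_penalty(m):
    """r_D(m) = -4.5 m^4 + m^2 + 3.5, evaluated per pixel."""
    m = np.asarray(m, dtype=float)
    return -4.5 * m ** 4 + m ** 2 + 3.5

def discretization_penalty_grad(m):
    """d r_D / d m = -18 m^3 + 2 m, per pixel."""
    m = np.asarray(m, dtype=float)
    return -18.0 * m ** 3 + 2.0 * m

def regularized_cost(fidelity, m, gamma_d=0.1):
    """J = F + gamma_D * R_D, with R_D assumed to be the sum of r_D over pixels."""
    return fidelity + gamma_d * float(np.sum(discretization_penalty(m)))
```

When optimizing over θ̱ with m̱=cos(θ̱), the penalty gradient with respect to θ̱ follows from the chain rule as `-np.sin(theta) * discretization_penalty_grad(np.cos(theta))`.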

3.3. Simulations

In order to demonstrate the validity of the optimization algorithms, consider the desired pattern shown in Fig. 6, with dimension of 561nm×561nm. The matrices representing all of the patterns have dimension of N×N, where N=51. The pixel size is 11nm×11nm. The partially coherent illumination is a circular illumination with small partial coherence factor σ=0.3. In Fig. 6, the top row illustrates the input masks: (left) the desired pattern Z*, (center) the optimized real-valued mask M̂, and (right) the optimized trinary mask M̂_tri. The optimized trinary mask, referred to as the alternating phase-shifting mask, includes clear areas and shifting areas, which introduce a 180° phase difference with respect to each other. In particular, the transmission coefficients of the clear area and shifting area are assigned to 1 and -1, respectively. The binary output patterns are shown in the bottom row. White, grey and black represent 1, 0 and -1, respectively. The effective source and the amplitude impulse response are given in Eq. (11) and Eq. (12) with NA=1.35 and λ=193nm. In the sigmoid function, we assign parameters a=200 and t_r=0.003. The binary output patterns in the bottom row are evaluated by Eq. (2) followed by a hard threshold operator with threshold ${\bar{t}}_{r} = t_{r} \times \sum_{k = 1}^{N^{2}} α_{k}$ . The global threshold is t_m=0.33. The step length and the regularization weight are s_θ̱=0.2 and γ_D=0.1. The initial mask pattern is the same as the desired binary output pattern Z*. For θ̱, we assign the phase of $\frac{π}{5}$ corresponding to the areas having a magnitude of 1 and the phase of $\frac{π}{2}$ for the areas with magnitude of 0. As shown in Fig. 6, this approach is effective for small partial coherence factors. These results are consistent with those obtained in numerous other simulations we have run with small partial coherence factors.

Fig. 6. Top row (input masks), left to right: desired pattern, optimized real-valued mask and optimized trinary mask. The bottom row illustrates the corresponding binary output patterns. White, grey and black represent 1, 0 and -1, respectively. σ=0.3.


These simulations are then repeated in Fig. 7 with the same parameters except for the partial coherence factor, which in this case is raised to σ=0.6. The output pattern error corresponding to the optimized trinary mask in this case is larger than that of Fig. 6. This degradation results from the less accurate first order coherent approximation to the partially coherent system when medium or large partial coherence factors are used. In fact, the SVD approach taken here gradually degrades as the partial coherence factor increases from small to large values. Nevertheless, the optimized PSM attains a 65% reduction of the output pattern error even with the medium partial coherence factor.

Fig. 7. Top row (input masks), left to right: desired pattern, optimized real-valued mask and optimized trinary mask. The bottom row illustrates the corresponding binary output patterns. White, grey and black represent 1, 0 and -1, respectively. σ=0.6.


4. Post-processing of Mask Pattern Based on 2D DCT

In order to simplify manufacturing, a reduction of the complexity of mask patterns is desired. Recently, J. Zhang et al. proposed an efficient mask design for inverse lithography based on 2D DCT [19]. The solution space is greatly reduced by cutting off the high frequency components of the desired pattern in the discrete cosine spectrum. Low frequency components of the mask pattern were shown to have more influence on the fidelity of the output pattern than the high frequency components. From this point of view, a post-processing of the mask pattern based on 2D DCT is introduced in this section. It is observed that most of the energy of the mask pattern is concentrated in the low frequency components. In order to reduce the complexity of the mask pattern, a subset of high frequency components of the optimized real-valued mask is cut off. The post-processed real-valued mask M̂′ is the inverse 2D DCT of the maintained low frequency components. The post-processed trinary mask M̂′_tri is the discretization of M̂′.
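A minimal sketch of the truncation step follows. The paper does not spell out which coefficients count as "low frequency"; the code assumes a triangular (zig-zag) region u+v<K, motivated by the observation that the component counts quoted in this paper (136, 276 and 666) are the triangular numbers K(K+1)/2 for K=16, 23 and 36. The orthonormal DCT-II matrix is built explicitly in NumPy to keep the example self-contained; the function names are illustrative.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    x = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct_lowpass(mask, diagonals):
    """Keep 2-D DCT coefficients with u + v < diagonals (assumed selection rule).

    Returns the filtered real-valued square mask and the number of kept coefficients.
    """
    n = mask.shape[0]
    c = dct_matrix(n)
    coeffs = c @ mask @ c.T                 # forward 2-D DCT
    u, v = np.indices(coeffs.shape)
    keep = (u + v) < diagonals
    filtered = c.T @ (coeffs * keep) @ c    # inverse 2-D DCT of kept coefficients
    return filtered, int(keep.sum())
```

The filtered mask would then be trinarized with the same global threshold t_m as before to obtain M̂′_tri.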

Figure 8 illustrates the relationship between the number of maintained DCT low frequency components and the output pattern errors with partial coherence factors σ=0.3 (solid line) and σ=0.6 (dotted line). Since inverse lithography is an ill-posed problem, numerous input patterns can lead to the same binary output pattern. Thus the post-processing based on 2D DCT can simultaneously reduce the complexity of the masks and the output pattern errors. It is shown in Fig. 8 that the fidelity of the output patterns is improved by maintaining just 136 low frequency components with σ=0.3 and 666 low frequency components with σ=0.6. Figure 9 illustrates the simulations of the PSM optimization using the DCT post-processing. All of the parameters are the same as in the simulations shown in Fig. 6. In Fig. 9, the left figure shows the output pattern obtained when the desired pattern is used as the mask. The middle figure shows the post-processed trinary PSM obtained by DCT post-processing maintaining 136 low frequency components. The right figure shows the binary output pattern of the post-processed trinary mask. White, grey and black represent 1, 0 and -1, respectively.

Fig. 8. The relationship between the number of maintained DCT low frequency components and the output pattern errors.


Fig. 9. Left to right: output pattern when the desired pattern is inputted, post-processed trinary mask with the DCT post-processing maintaining 136 low frequency components, and the binary output pattern of the post-processed trinary mask. White, grey and black represent 1, 0 and -1, respectively.


5. PSM Optimization with Photoresist Tone Reversing

As a final extension of the PSM design method introduced in this paper, we focus on inverse lithography optimization where photoresist tone reversing is used. Photoresist can be divided by its polarity. In a positive photoresist process, more photoresist material remains in the low-exposure area on the wafer and less in the high-exposure area. Negative photoresist responds in the opposite manner. The photoresist tone reversing method exploits both kinds of photoresist material on the wafer. Reversal of the photoresist tone has been proposed to improve the lithography performance of sub-resolution features [1]. In this section, photoresist tone reversing is used to image a sparse pattern with the desired resolution and contrast, where the pattern contains features finer than can be resolved in the traditional case without photoresist tone reversing. Consider an optical lithography system applying a single-tone positive photoresist with the parameters k=0.29, λ=193nm and NA=1.35. The resolution limit is

R = k \frac{λ}{NA} = 41.5 nm .
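The arithmetic in Eq. (27) can be reproduced in a few lines; the function name is illustrative.

```python
def rayleigh_resolution(k, wavelength_nm, na):
    """Rayleigh resolution limit R = k * lambda / NA, in nanometres."""
    return k * wavelength_nm / na

# Parameters quoted in the text: k = 0.29, lambda = 193 nm, NA = 1.35
R = rayleigh_resolution(0.29, 193.0, 1.35)
```

This evaluates to approximately 41.46nm, which rounds to the 41.5nm quoted above and is nearly twice the 22nm feature size considered in the example that follows.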

PSM optimization without use of photoresist tone reversing fails to print an aerial image containing features with dimensions smaller than the limit in Eq. (27). An example is illustrated in Fig. 10, where the desired image of dimension 561nm×561nm contains two pairs of vertical bars, each with pitch width of 22nm<R. In this simulation, all of the parameters are the same as the simulation shown in Fig. 6, except for t _r=0.0003. In Fig. 10, from left to right, the first figure shows the desired pattern. The second figure shows the binary output pattern of the desired pattern. The third figure shows the optimized trinary PSM. The last figure shows the binary output pattern of the optimized PSM. White, grey and black represent 1, 0 and -1, respectively. Note that the binary output pattern of the optimized PSM is totally different from the desired pattern, indicating that the PSM optimization approach cannot attain the desired output pattern on the wafer. The reason is that the dimension of the features in the desired pattern is smaller than the resolution limit without application of the photoresist tone reversing.

Fig. 10. Left to right: desired pattern, output pattern when the desired pattern is inputted, optimized trinary mask, and the output pattern of the optimized trinary mask. White, grey and black represent 1, 0 and -1, respectively. σ=0.3.


In order to overcome the limit, photoresist tone reversing is exploited to find an adequate distribution of positive and negative photoresist on the wafer. One possible distribution is shown in the left figure in Fig. 11. White and black represent positive and negative photoresist, respectively. Negative photoresist is assigned to the gaps in each pair of bars and positive photoresist to all other areas. If the optimized mask is able to expose a rectangular aerial image in the area lying over each pair of bars, the negative photoresist will prevent the exposure in the gaps. Thus, the binary output pattern is the same as the desired pattern. Therefore, the PSM optimization approach is used to search for the binary output pattern of two rectangles on the wafer. In Fig. 11, t_r=0.01, and the optimized trinary mask is shown in the middle figure. The binary output pattern of the optimized trinary mask is shown in the right figure. White, grey and black represent 1, 0 and -1, respectively. The results show that the photoresist tone reversing method is effective for exposing a sparse pattern containing features finer than can be resolved in the traditional case without photoresist tone reversing.
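The idea can be illustrated with a one-dimensional toy row of pixels. The model below is a deliberate simplification and an assumption of this sketch: a pixel prints only if it is exposed and covered by positive photoresist, so exposed negative photoresist simply blocks the output. The arrays and the function name are hypothetical and are not taken from the simulations.

```python
import numpy as np

def tone_reversed_output(exposed, negative_resist):
    """Printed pattern under the simplified rule: exposed AND positive resist."""
    exposed = np.asarray(exposed, dtype=bool)
    negative_resist = np.asarray(negative_resist, dtype=bool)
    return (exposed & ~negative_resist).astype(int)

# One pair of bars (2 pixels each) separated by a 2-pixel gap
desired = np.array([0, 0, 1, 1, 0, 0, 1, 1, 0, 0])
# The mask only needs to print one wide rectangle covering bars and gap
exposed = np.array([0, 0, 1, 1, 1, 1, 1, 1, 0, 0])
# Negative photoresist is placed in the gap between the bars
negative = np.array([0, 0, 0, 0, 1, 1, 0, 0, 0, 0])
printed = tone_reversed_output(exposed, negative)
```

In this toy case `printed` equals `desired`, even though the exposure itself contains no feature narrower than the six-pixel rectangle.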

Fig. 11. Left to right: photoresist distribution, optimized trinary mask using the photoresist tone reversing method, and output pattern of the optimized trinary mask. In the first figure, white and black represent positive and negative photoresist, respectively. In the second and third figures, white, grey and black represent 1, 0 and -1, respectively. σ=0.3.

Fig. 12. Left to right: photoresist distribution, post-processed trinary mask using the DCT post-processing with 276 low-frequency components retained, and binary output pattern of the post-processed trinary mask. In the first figure, white and black represent positive and negative photoresist, respectively. In the second and third figures, white, grey and black represent 1, 0 and -1, respectively. σ=0.3.

The DCT post-processing of the mask developed in Section 4 can be combined with the photoresist tone reversing method. The simulation is illustrated in Fig. 12. The left figure shows the photoresist distribution. The middle figure shows the post-processed trinary mask obtained by DCT post-processing with 276 low-frequency components retained. The right figure shows the binary output pattern of the post-processed trinary mask. Note that the DCT post-processing reduces the error of the binary output pattern from 66 to 54.
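As a rough illustration of this kind of post-processing, the sketch below keeps a fixed number of low-frequency 2D DCT coefficients of a mask and re-quantizes the result to the trinary levels 1, 0 and -1. It is a sketch under stated assumptions, not the procedure of Section 4: the zigzag-like ordering of coefficients by u+v, the quantization threshold `t_m`, and the function names are all hypothetical choices made here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct_postprocess(mask, keep, t_m=0.5):
    """Keep `keep` low-frequency DCT coefficients, then quantize to {-1, 0, 1}.

    Assumptions (not from the paper): coefficients are ranked by u + v,
    lowest first, and values within [-t_m, t_m] are quantized to 0.
    """
    n_rows, n_cols = mask.shape
    c_r, c_c = dct_matrix(n_rows), dct_matrix(n_cols)
    coeffs = c_r @ mask @ c_c.T                      # forward 2D DCT

    u, v = np.indices(mask.shape)
    order = np.argsort((u + v).ravel(), kind="stable")
    kept = np.zeros(mask.size, dtype=bool)
    kept[order[:keep]] = True
    coeffs = np.where(kept.reshape(mask.shape), coeffs, 0.0)

    smooth = c_r.T @ coeffs @ c_c                    # inverse 2D DCT
    trinary = np.zeros(mask.shape, dtype=int)
    trinary[smooth > t_m] = 1
    trinary[smooth < -t_m] = -1
    return trinary

# Toy trinary mask: a +1 block next to a -1 block on a zero background.
mask = np.zeros((8, 8))
mask[2:6, 1:4] = 1.0
mask[2:6, 4:7] = -1.0

full = dct_postprocess(mask, keep=mask.size)   # all coefficients retained
reduced = dct_postprocess(mask, keep=10)       # only 10 low-frequency terms
```

Retaining every coefficient reproduces the input mask, while smaller values of `keep` give smoother masks with fewer fine details. Whether the output pattern error also drops, as reported for Fig. 12, depends on the imaging model and has to be checked by simulation.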

The heuristic photoresist distribution design approach described above is suitable for simple target patterns. Joint optimization of the photoresist distribution and the PSM is desirable and is left for future research.

6. Conclusions

This paper focuses on PSM inverse lithography under partially coherent illumination. First, two partially coherent imaging models are described: the Hopkins diffraction model and the sum of coherent systems (SOCS) decomposition model obtained by SVD. Based on the second model, a first-order coherent approximation is used to represent the Hopkins diffraction model, and the inverse lithography PSM optimization process is formulated. To simultaneously reduce the complexity of the mask and the output pattern error, a post-processing step based on the 2D DCT is introduced. Finally, the photoresist tone reversing technique is exploited to design PSMs capable of projecting extremely sparse patterns at a resolution well beyond the limit of the traditional case without photoresist tone reversing.

Acknowledgments

We wish to thank Christof Krautschik, Yan Borodovsky and the TCAD group at Intel Corporation for their comments and support.

References and links

1. A. K. Wong, Resolution enhancement techniques, 1 (SPIE Press, 2001).

2. S. A. Campbell, The science and engineering of microelectronic fabrication, 2nd ed. (Publishing House of Electronics Industry, Beijing, China, 2003).

3. F. Schellenberg, “Resolution enhancement technology: The past, the present, and extensions for the future,” in Optical Microlithography, Proc. SPIE 5377, 1–20 (2004).

4. F. Schellenberg, Resolution enhancement techniques in optical lithography (SPIE Press, 2004).

5. L. Liebmann, S. Mansfield, A. Wong, M. Lavin, W. Leipold, and T. Dunham, “TCAD development for lithography resolution enhancement,” IBM J. Res. Dev. 45, 651–665 (2001).

6. A. Poonawala and P. Milanfar, “Fast and low-complexity mask design in optical microlithography - An inverse imaging problem,” IEEE Trans. Image Process. 16, 774–788 (2007).

7. M. D. Levenson, N. S. Viswanathan, and R. A. Simpson, “Improving resolution in photolithography with a phase-shifting mask,” IEEE Trans. Electron Devices ED-29, 1828–1836 (1982).

8. Y. Liu and A. Zakhor, “Binary and phase shifting mask design for optical lithography,” IEEE Trans. Semicond. Manuf. 5, 138–152 (1992).

9. Y. C. Pati and T. Kailath, “Phase-shifting masks for microlithography: Automated design and mask requirements,” J. Opt. Soc. Am. A 11, 2438–2452 (1994).

10. A. Poonawala and P. Milanfar, “OPC and PSM design using inverse lithography: A non-linear optimization approach,” in Proc. SPIE 6154, 1159–1172 (San Jose, CA, 2006).

11. X. Ma and G. R. Arce, “Generalized inverse lithography methods for phase-shifting mask design,” in Proc. SPIE (San Jose, CA, 2007).

12. X. Ma and G. R. Arce, “Generalized inverse lithography methods for phase-shifting mask design,” Opt. Express 15, 15066–15079 (2007).

13. B. Salik, J. Rosen, and A. Yariv, “Average coherent approximation for partially coherent optical systems,” J. Opt. Soc. Am. A 13 (1996).

14. E. Wolf, “New spectral representation of random sources and of the partially coherent fields that they generate,” Opt. Commun. 38 (1981).

15. P. S. Davids and S. B. Bollepalli, “Generalized inverse problem for partially coherent projection lithography,” in Proc. SPIE (San Jose, CA, 2008).

16. X. Ma and G. R. Arce, “Binary mask optimization for inverse lithography with partially coherent illumination,” in Proc. SPIE (Taiwan, 2008).

17. X. Ma and G. R. Arce, “Binary mask optimization for inverse lithography with partially coherent illumination,” J. Opt. Soc. Am. A 25 (2008).

18. N. Cobb, “Fast optical and process proximity correction algorithms for integrated circuit manufacturing,” Ph.D. thesis, University of California at Berkeley (1998).

19. J. Zhang, W. Xiong, M. Tsai, Y. Wang, and Z. Yu, “Efficient mask design for inverse lithography technology based on 2D discrete cosine transformation (DCT),” in Simulation of Semiconductor Processes and Devices, 12 (2007).

20. B. E. A. Saleh and M. Rabbani, “Simulation of partially coherent imagery in the space and frequency domains and by modal expansion,” Appl. Opt. 21 (1982).

21. M. Born and E. Wolf, Principles of optics (Cambridge University Press, United Kingdom, 1999).

22. R. Wilson, Fourier Series and Optical Transform Techniques in Contemporary Optics (John Wiley and Sons, New York, 1995).

23. N. Cobb and A. Zakhor, “Fast sparse aerial image calculation for OPC,” in BACUS Symposium on Photomask Technology, Proc. SPIE 2440, 313–327 (1995).

24. C. Vogel, Computational methods for inverse problems (SIAM Press, 2002).

Optics Express