Image based adaptive optics through optimisation of low spatial frequencies

Delphine Débarre; Martin J. Booth; Tony Wilson

doi:10.1364/OE.15.008176

1. Introduction

The objective of all adaptive optics systems is to reduce the wave front aberrations to an acceptable level. Normally this would involve a wave front sensor to measure the aberrations, which are in turn corrected using an adaptive element, such as a deformable mirror [1, 2]. In imaging systems, however, direct wave front sensing is not straightforward and wave front sensorless schemes are often employed. In certain situations, some aberration information can be extracted from a single image using phase retrieval methods; further information is obtained from two or more defocused images using the methods of phase diversity [3]. These methods use iterative calculations based upon a model of the imaging process to retrieve the aberrations and the object structure. However, these calculations are not guaranteed to converge to a unique solution for arbitrary objects [4]. In other wave front sensorless systems, the adaptive element is reconfigured in order to optimise a metric related to image quality. The optimisation procedure involves measurement of the metric for a number of trial correction aberrations, followed by the estimation of an improved correction aberration. This process is repeated until the image quality is considered acceptable. The number of measurements required during this process depends upon the optimisation algorithm and parameters used, the mathematical representation of the aberration, and the object structure.

Most previous work in this area has used model-free algorithms, such as simplex optimisation, conjugate gradient search or multidithering [5, 6, 7, 8, 9, 10, 11]. These schemes have employed several optimisation metrics that are appropriate for image-based adaptive optics. However, the derivation of a constructive model based upon these metrics is complicated by the image formation process, which depends on both the aberrations and the object structure. An effective model-based adaptive optics scheme should be object independent, so the model should permit the separation of aberration and object influences on the measurements. We show that this separation is possible through the appropriate choice of optimisation metric and aberration representation. A similar approach has been demonstrated for adaptive optics in focussing systems. Using a Strehl intensity metric in conjunction with a Zernike mode aberration expansion has led to efficient schemes for the correction of small aberrations [12]. Another system capable of correcting larger aberrations has been demonstrated, using a metric related to the focal spot radius and an alternative aberration expansion in Lukosz modes [13].

In this paper we describe and demonstrate an image-based adaptive optics scheme that is predominantly independent of object structure. This scheme uses the low spatial frequency content of the image as the optimisation metric but leads to correction for all spatial frequencies. The aberration is represented in terms of Lukosz modes; these modes are ideal for modelling the effects of aberrations on the imaging of low spatial frequencies [14]. We describe the imaging process in terms of spectral densities and the optical transfer function. The optimisation metric g is introduced as the sum of a range of low frequencies and is related to the coefficients of the aberration expansion, {a_i}. Because of this choice of aberration expansion and optimisation metric, the function g({a_i}) is found to have a paraboloidal maximum that permits the use of simple maximisation algorithms. Moreover, we show that this optimisation can be performed as a sequence of independent maximisations in each aberration coefficient. The correction scheme is demonstrated for imaging in an incoherent transmission microscope.

Fig. 1. The calculation of the incoherent optical transfer function: (a) the circular pupil P with circumference C; (b) The geometry used in the autocorrelation calculation, showing the pupil overlap A; (c) The resulting aberration-free incoherent optical transfer function.


2. Image formation in an incoherent imaging system

For incoherent imaging, the image I(x) is given by the convolution of the object function t(x) and the intensity point spread function (IPSF), h(x), of the system:

I (x) = t (x) * h (x),

where x is the position vector in the image plane; for clarity we have omitted magnification factors. The object is, of course, independent of any aberrations in the optical system and all aberration effects are therefore manifested in the IPSF. If instead we consider the imaging process in the frequency domain, the image Fourier transform (FT), J(m), is given by

J (m) = H (m) T (m),

where m is the spatial frequency vector, H(m) is the optical transfer function (OTF), which is equivalent to the FT of h(x), and T(m) is the FT of t(x). In general, each of the terms in Eq. 2 is a complex quantity. In order to deal solely with real quantities, we can also describe the imaging process in terms of spectral density functions. Defining the object spectral density function as S_T (m) = |T(m)|² and the image spectral density as S_J(m) = |J(m)|², then

S_{J} (m) = {∣ H (m) ∣}^{2} S_{T} (m),

where |H(m)| is the modulation transfer function (MTF). In Eq. 3, all aberration effects are confined to the MTF.
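The relation between the spectral densities can be checked numerically in a discrete, one dimensional setting. The following is a minimal sketch, assuming a circular convolution as the discrete analogue of Eq. 1; the sample object `t`, the sample IPSF `h` and the helper names are illustrative choices and not part of the experimental system.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]

def circular_convolve(t, h):
    """Circular convolution, the discrete analogue of I = t * h."""
    n = len(t)
    return [sum(t[k] * h[(i - k) % n] for k in range(n)) for i in range(n)]

# Hypothetical 1-D object and intensity point spread function (sums to 1).
t = [0.0, 1.0, 0.0, 0.0, 2.0, 0.0, 0.5, 0.0]
h = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

image = circular_convolve(t, h)

S_J = [abs(J) ** 2 for J in dft(image)]   # image spectral density
S_T = [abs(T) ** 2 for T in dft(t)]       # object spectral density
mtf_sq = [abs(H) ** 2 for H in dft(h)]    # squared MTF

# Largest deviation from S_J = |H|^2 S_T over all frequency samples.
max_error = max(abs(sj - h2 * st) for sj, h2, st in zip(S_J, mtf_sq, S_T))
print(max_error)
```

The deviation should be at the level of floating point rounding, since the convolution theorem holds exactly for circular convolution.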

3. Image spectral density for low spatial frequencies

In this section we derive approximations for S_J(m) at low spatial frequencies. Expressions for heavily aberrated OTFs at low spatial frequencies have been derived using the geometrical OTF [14, 15, 16]. We present an alternative derivation, based upon the diffraction OTF, providing equivalent expressions that are valid for all aberration magnitudes. The diffraction OTF is calculated as the autocorrelation of the effective pupil function, P(r):

H (m) = P (r) \otimes P^{*} (r) = \iint P (r - m) P^{*} (r) d A,

where r is the position vector in the pupil and P ^* is the complex conjugate of P. This is illustrated in Fig. 1. When the pupils are circular and aberration free, we define the pupil function P(r) = ∏(r), where ∏(r) = 1 for |r| ≤ 1 and zero otherwise. From this we obtain the familiar expression for the OTF as

H_{0} (m) = \frac{1}{π} \iint \prod (r - m) \prod (r) d A = \frac{2}{π} [\cos^{- 1} (\frac{m}{2}) - \frac{m}{2} \sqrt{1 - {(\frac{m}{2})}^{2}}],

where m = |m|. A normalisation factor has been introduced so that H ₀(0) = 1. The spatial frequencies are also normalised such that the cut-off of the incoherent imaging system corresponds to |m| = 2. We model phase aberrations as a function Φ(r), such that the pupil function P(r) = ∏(r)exp[jΦ(r)], where j = √(-1). The corresponding, aberrated OTF can be calculated as
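For reference, the aberration-free OTF of Eq. 5 can be evaluated directly. This is a small sketch; the function name `otf_aberration_free` is our own, and frequencies at or beyond the cut-off are simply mapped to zero.

```python
import math

def otf_aberration_free(m):
    """Aberration-free incoherent OTF H_0(m) of a circular pupil (cut-off at m = 2)."""
    m = abs(m)
    if m >= 2.0:
        return 0.0
    half = m / 2.0
    return (2.0 / math.pi) * (math.acos(half) - half * math.sqrt(1.0 - half * half))

for m in (0.0, 0.06, 0.4, 1.0, 2.0):
    print(m, otf_aberration_free(m))
```

For small m the expression behaves as 1 - 2m/π to first order, which gives a quick sanity check on any implementation.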

H (m) = \frac{1}{π} \iint \prod (r - m) \prod (r) \exp j [Φ (r - m) - Φ (r)] d A .

This is valid for all spatial frequencies. However, for small spatial frequencies, the phase term Φ(r - m) can be approximated using a Taylor series expansion and Eq. 6 can therefore be approximated by

H (m) \approx \frac{1}{π} \iint \prod (r - m) \prod (r) \exp j [- m ∙ \nabla Φ + O (m^{2})] d A,

where the dot represents the scalar product, ∇ is the gradient operator and O(m ²) represents error terms of at least the second order in m. For small arguments, the exponential term can also be expanded as a Taylor series, so that Eq. 7 becomes

H (m) \approx \frac{1}{π} \iint \prod (r - m) \prod (r) d A - \frac{j}{π} \iint \prod (r - m) \prod (r) (m ∙ \nabla Φ) d A - \frac{1}{2 π} \iint \prod (r - m) \prod (r) {(m ∙ \nabla Φ)}^{2} d A + O (m^{2}) .

The first integral is equivalent to H ₀(m). For the other integrals, the effective region of integration is defined by the overlap of the two offset pupils ∏(r-m)∏(r), shown as A in Fig. 1(b). If we approximate this region by the circular pupil P, then the corresponding approximation error is at least second order in m. Equation 8 can therefore be written

H (m) \approx H_{0} (m) - \frac{j}{π} \iint_{P} (m ∙ \nabla Φ) d A - \frac{1}{2 π} \iint_{P} {(m ∙ \nabla Φ)}^{2} d A + O (m^{2}) .

An approximation for the squared MTF, |H(m)|², follows as

{∣ H (m) ∣}^{2} \approx H_{0} {(m)}^{2} - \frac{1}{π} \iint_{P} {(m ∙ \nabla Φ)}^{2} d A + {[\frac{1}{π} \iint_{P} (m ∙ \nabla Φ) d A]}^{2},

where the error terms have now been omitted. The final term in Eq. 10 is non-zero only if Φ(r) contains a component of constant phase gradient (see Appendix A). The effect of this aberration component, which corresponds to the tip and tilt modes, is simply to shift the image laterally. We assume in the rest of this paper that these aberration modes play no role, although we note that the components are readily extracted from the phase of the image FT. If these modes are not present, Eq. 10 becomes

{∣ H (m) ∣}^{2} \approx H_{0} {(m)}^{2} - \frac{1}{π} \iint_{P} {(m ∙ \nabla Φ)}^{2} d A .

Substitution of this expression into Eq. 3 yields an expression for the image spectral density at low spatial frequencies:

S_{J} (m) \approx [H_{0} {(m)}^{2} - \frac{1}{π} \iint_{P} {(m ∙ \nabla Φ)}^{2} d A] S_{T} (m) .

In the equivalent derivation using the geometric OTF, the term H ₀(m)² would be replaced by unity. The corresponding error in the calculation of the MTF would be of order m, leading to significant differences compared to the diffraction OTF calculations.
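As an illustrative check of Eq. 11, consider a pure defocus aberration Φ = a(r² - 1)/√2, which is the Lukosz mode with n = 2, m = 0 under the definitions of Section 5. For this aberration, Φ(r - m) - Φ(r) = (a/√2)(m² - 2mx) when m lies along the x axis, so the diffraction integral of Eq. 6 reduces to a one dimensional integral, and the pupil integral in Eq. 11 evaluates to a²m²/2. The sketch below compares the two; the function names, the midpoint integration and the number of steps are our own assumptions.

```python
import cmath
import math

def h0(m):
    """Aberration-free OTF of Eq. 5 for 0 <= m < 2."""
    half = m / 2.0
    return (2.0 / math.pi) * (math.acos(half) - half * math.sqrt(1.0 - half * half))

def defocus_otf(m, a, steps=20000):
    """Diffraction OTF (Eq. 6) for Phi = a (r^2 - 1)/sqrt(2), with m along x."""
    lo, hi = m - 1.0, 1.0              # x-extent of the overlap of the two pupils
    dx = (hi - lo) / steps
    total = 0.0 + 0.0j
    for i in range(steps):
        x = lo + (i + 0.5) * dx        # midpoint rule
        # Half-height of the overlap region at this x.
        y_max = math.sqrt(max(0.0, min(1.0 - x * x, 1.0 - (x - m) ** 2)))
        phase = (a / math.sqrt(2.0)) * (m * m - 2.0 * m * x)
        total += 2.0 * y_max * cmath.exp(1j * phase) * dx
    return total / math.pi

def low_frequency_mtf_sq(m, a):
    """Eq. 11 for the defocus mode: H_0(m)^2 - a^2 m^2 / 2."""
    return h0(m) ** 2 - 0.5 * a * a * m * m

m, a = 0.05, 1.0
exact = abs(defocus_otf(m, a)) ** 2
approx = low_frequency_mtf_sq(m, a)
print(exact, approx)
```

At this low frequency the two values should agree closely; the agreement is expected to degrade as m or a increases, in line with the assumptions of the derivation.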

4. Optimisation metric based upon low spatial frequencies

In this section, we derive an optimisation metric that uses the low spatial frequency content of an image. The metric permits the separation of the effects of specimen structure and aberrations. Let g(M ₁,M ₂) be defined as the total “energy” in all image spatial frequencies lying within the annulus for which M ₁ ≤ |m| ≤ M ₂, where M ₂ is small:

g (M_{1}, M_{2}) = \int_{ξ = 0}^{2 π} \int_{m = M_{1}}^{M_{2}} S_{J} (m) m d m d ξ

If we define m = (mcosξ, msinξ), then S_T(m) must be a periodic function of the polar angle ξ and can be represented by its Fourier series:

S_{T} (m) = \frac{α_{0} (m)}{2} + \sum_{i = 1}^{\infty} [α_{i} (m) \cos (2 iξ) + β_{i} (m) \sin (2 iξ)] .

Note that this series has fundamental period π as the spectral density always has even symmetry about the origin, such that S_T(m) = S_T(-m). This permits us to calculate the first integral with respect to ξ in Eq. 13 as

\int_{ξ = 0}^{2 π} S_{T} (m) d ξ = π α_{0} (m) .

In order to calculate the second integral with respect to ξ, we first expand the scalar product in the pupil integral, which becomes

\iint_{P} {(m ∙ \nabla Φ)}^{2} d A = \frac{m^{2}}{2} \iint_{P} {∣ \nabla Φ ∣}^{2} [1 + \cos (2 ξ - 2 χ)] d A,

where χ(r) is the polar angle of the vector ∇Φ(r). The second integral is then calculated as

\int_{ξ = 0}^{2 π} S_{T} (m) [\iint_{P} {(m ∙ \nabla Φ)}^{2} d A] d ξ = \frac{π m^{2}}{2} \iint_{P} {∣ \nabla Φ ∣}^{2} [α_{0} (m) + α_{1} (m) \cos (2 χ) + β_{1} (m) \sin (2 χ)] d A .

Hence, we find that only the non-azimuthally variant component α ₀(m) and the first order terms α ₁(m) and β ₁(m) contribute to the value of g. Significant values of these first order coefficients will indicate that the object has noticeable periodicity in a predominant direction, such as a one dimensional grid. For other object structures more likely to be encountered in applications α ₁(m) and β ₁(m) are expected to be small so the corresponding terms in Eq. 17 can be neglected. Hence, we find that

g (M_{1}, M_{2}) \approx q_{0} (M_{1}, M_{2}) - q_{1} (M_{1}, M_{2}) \frac{1}{π} \iint_{P} {∣ \nabla Φ ∣}^{2} d A,

where

q_{0} (M_{1}, M_{2}) = π \int_{m = M_{1}}^{M_{2}} H_{0} {(m)}^{2} α_{0} (m) m d m

and

q_{1} (M_{1}, M_{2}) = \frac{π}{2} \int_{m = M_{1}}^{M_{2}} α_{0} (m) m^{3} d m

are both positive quantities if α ₀(m) ≠ 0 in the frequency range of interest. If M ₁ and M ₂ are fixed and the object contains frequencies in this range, it can be seen from Eq. 18 that g is maximum if and only if |∇Φ(r)| = 0 for all r or equivalently when Φ(r) is a constant. Although g is based only on low spatial frequencies, the optimisation process will remove all phase aberrations and hence improve imaging quality for all spatial frequencies. It is therefore appropriate to use g as an optimisation metric for aberration correction. It can also be seen from Eq. 18 that the variation of g for different aberrations can be derived entirely from the properties of the integral

I_{1} = \frac{1}{π} \iint_{P} {∣ \nabla Φ ∣}^{2} d A .
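In a discrete implementation, the metric of Eq. 13 can be approximated by summing the sampled image spectral density over the annulus, since the Cartesian frequency area element equals m dm dξ. The following sketch assumes a square spectral density stored in unshifted DFT order and a known calibration value `cutoff_index` relating DFT index distance to the normalised cut-off |m| = 2; both conventions and the function name are assumptions for illustration, and the result equals g only up to a constant sampling factor.

```python
import math

def low_frequency_metric(spectral_density, cutoff_index, m1, m2):
    """Sum a square spectral density over the annulus M1 <= |m| <= M2.

    spectral_density: N x N list of |J|^2 values in unshifted DFT order
                      (zero frequency at [0][0]).
    cutoff_index:     DFT index distance corresponding to |m| = 2
                      (a calibration value, assumed known).
    """
    n = len(spectral_density)
    total = 0.0
    for row in range(n):
        ky = row if row <= n // 2 else row - n      # signed frequency index
        for col in range(n):
            kx = col if col <= n // 2 else col - n
            m = 2.0 * math.hypot(kx, ky) / cutoff_index
            if m1 <= m <= m2:
                total += spectral_density[row][col]
    return total

# A flat spectral density makes the result equal to the number of samples
# that fall inside the annulus.
flat = [[1.0] * 8 for _ in range(8)]
print(low_frequency_metric(flat, cutoff_index=4, m1=0.4, m2=0.6))
```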

5. Aberration expansion in Lukosz functions

In order to understand the effects of the aberration on I ₁, it is useful to represent the aberration as a combination of Lukosz functions. These functions, based upon the Zernike polynomials, were first derived by Lukosz [14] and later, independently, by Braat [15]. Like Zernike circle functions, the Lukosz functions are each expressed as the product of a radial polynomial and an azimuthal function and use the same dual index and numbering scheme. They can be defined as

L_{n}^{m} (r, θ) = B_{n}^{m} (r) \times \begin{cases} \cos (mθ) & m \geq 0 \\ \sin (mθ) & m < 0 \end{cases}

with

B_{n}^{m} (r) = \begin{cases} \frac{1}{2 \sqrt{n}} [R_{n}^{0} (r) - R_{n - 2}^{0} (r)] & n \neq m = 0 \\ \frac{1}{\sqrt{2 n}} [R_{n}^{m} (r) - R_{n - 2}^{m} (r)] & n \neq m \neq 0 \\ \frac{1}{\sqrt{n}} R_{n}^{n} (r) & m = n \neq 0 \\ 1 & m = n = 0 \end{cases}

where n and m are the radial and azimuthal indices, respectively, and R^m_n (r) is the Zernike radial polynomial given by

R_{n}^{m} (r) = \sum_{k = 0}^{\frac{n - m}{2}} \frac{{(- 1)}^{k} (n - k)! r^{n - 2 k}}{k! (\frac{n + m}{2} - k)! (\frac{n - m}{2} - k)!} .

It is also convenient to refer to the Lukosz polynomials using a single index numbering scheme, which is explained in Appendix B. We express the aberration as a series of N Lukosz modes with coefficients a_i:

Φ (r) = \sum_{i = 4}^{N + 3} a_{i} L_{i} (r, θ),

where the piston, tip and tilt modes (i = 1,2,3 respectively) have been omitted. Using this aberration expansion, we find that each mode contributes independently to I ₁:

I_{1} = \frac{1}{π} \iint_{P} {∣ \nabla Φ ∣}^{2} d A = \sum_{i = 4}^{N + 3} a_{i}^{2} .

Note that, in contrast to the derivations of Lukosz and Braat, we have chosen the normalisation of the radial polynomials B^m_n (r) to ensure that the weighting of each coefficient in Eq. 26 is independent of the coefficient’s indices. The normalisation of B^m_n (r) used here is also slightly different to that employed in Reference [13].
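The normalisation and independence expressed by Eq. 26 can be verified for low-order modes by evaluating the gradient integrals in closed form. The sketch below represents each radial polynomial as a dictionary of powers and coefficients; this representation and the function names are our own choices. Modes with different azimuthal indices are orthogonal through the angular integration, so only pairs sharing the same m are handled.

```python
from math import factorial, sqrt

def zernike_radial(n, m):
    """Coefficients {power: value} of the Zernike radial polynomial R_n^m(r) (Eq. 24)."""
    m = abs(m)
    coeffs = {}
    for k in range((n - m) // 2 + 1):
        value = ((-1) ** k * factorial(n - k)
                 / (factorial(k) * factorial((n + m) // 2 - k)
                    * factorial((n - m) // 2 - k)))
        coeffs[n - 2 * k] = value
    return coeffs

def lukosz_radial(n, m):
    """Coefficients of the Lukosz radial polynomial B_n^m(r) (Eq. 23)."""
    m = abs(m)
    if n == 0:
        return {0: 1.0}
    if m == n:
        return {p: c / sqrt(n) for p, c in zernike_radial(n, n).items()}
    scale = 1.0 / (2.0 * sqrt(n)) if m == 0 else 1.0 / sqrt(2.0 * n)
    coeffs = {p: scale * c for p, c in zernike_radial(n, m).items()}
    for p, c in zernike_radial(n - 2, m).items():
        coeffs[p] = coeffs.get(p, 0.0) - scale * c
    return coeffs

def gradient_inner(n1, n2, m):
    """(1/pi) times the pupil integral of grad(L_n1^m) . grad(L_n2^m)."""
    m = abs(m)
    b1, b2 = lukosz_radial(n1, m), lukosz_radial(n2, m)
    radial = 0.0      # integral over [0, 1] of B1' B2' r dr
    azimuthal = 0.0   # integral over [0, 1] of m^2 B1 B2 / r dr
    for p1, c1 in b1.items():
        for p2, c2 in b2.items():
            if p1 > 0 and p2 > 0:
                radial += c1 * c2 * p1 * p2 / (p1 + p2)
                azimuthal += m * m * c1 * c2 / (p1 + p2)
    # The angular integration contributes 2*pi for m = 0 and pi otherwise.
    return 2.0 * radial if m == 0 else radial + azimuthal

for n, m in [(2, 0), (2, 2), (3, 1), (3, 3), (4, 0), (4, 2)]:
    print((n, m), gradient_inner(n, n, m))
print("cross terms:", gradient_inner(2, 4, 0), gradient_inner(2, 4, 2))
```

Each diagonal value should equal unity and each cross term should vanish, so that a single mode of coefficient a_i contributes a_i² to I ₁.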

The optimisation metric g in Eq. 18 can now be directly expressed in terms of the set of aberration coefficients, {a_i}, to give

g ({a_{i}}) \approx q_{0} - q_{1} \sum_{i = 4}^{N + 3} a_{i}^{2},

where for clarity we have omitted the explicit dependence on M ₁ and M ₂. This approximation is accurate for small aberration amplitudes. However, for larger amplitudes it can give inaccurate, even negative values, whereas in practice g would tend to zero. A more appropriate approximation is a Lorentzian function, which is always positive, tends to zero for large aberrations, and retains an identical form to relation 27 for small aberrations:

g ({a_{i}}) \approx \frac{1}{q_{2} + q_{3} \sum_{i = 4}^{N + 3} a_{i}^{2}},

where q ₂ = 1/q ₀ and q ₃ = q ₁/q ₀². This Lorentzian approximation provides a close fit to empirical measurements of g, as shown in the next Section.
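The relationship between the quadratic form of relation 27 and the Lorentzian form of relation 28 can be illustrated numerically. In this sketch the values of q ₀ and q ₁ are arbitrary placeholders, since in practice they depend on the object and the frequency range.

```python
def metric_quadratic(sum_sq, q0, q1):
    """Relation 27: g = q0 - q1 * sum(a_i^2)."""
    return q0 - q1 * sum_sq

def metric_lorentzian(sum_sq, q0, q1):
    """Relation 28 with q2 = 1/q0 and q3 = q1/q0^2."""
    q2 = 1.0 / q0
    q3 = q1 / q0 ** 2
    return 1.0 / (q2 + q3 * sum_sq)

q0, q1 = 1.0, 0.02   # hypothetical values for illustration only
for sum_sq in (0.01, 1.0, 25.0, 100.0):
    print(sum_sq, metric_quadratic(sum_sq, q0, q1), metric_lorentzian(sum_sq, q0, q1))
```

The two forms agree for small aberrations, while only the Lorentzian form remains positive when the summed squared coefficients become large.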

6. Experimental investigation of the optimisation metric

The properties of the optimisation metric g({a_i}) were investigated experimentally using the system shown in Fig. 2(a). The system comprised an incoherent transmission microscope with a deformable mirror (DM) and a CCD camera. For the purposes of this demonstration, the DM acted as both aberration source and correction element. A light emitting diode (LED; Lumileds, Luxeon Star/O, centre wavelength 650nm) provided incoherent illumination to a transmissive specimen which was imaged using a 150mm focal length achromatic doublet as the objective lens. An iris provided the 5mm diameter limiting aperture of the imaging system at the pupil plane of the objective. This pupil plane was imaged onto the DM (Boston Micromachines Corp., Multi-DM, 140 element, 1.5μm stroke) using a 4f system. The DM was then re-imaged through the same 4f system onto the pupil plane of the tube lens, which formed an image of the specimen on the CCD camera.

The DM was driven using a set of control signals, {c_i}, where each control signal was proportional to the square root of the corresponding electrode voltage. This was found to produce linear operation over most of the DM’s deflection range. In order to produce desired combinations of Lukosz modes, the control signals were obtained from Lukosz modal coefficients {a_i} through a matrix multiplication:

c = B^{†} D a,

where the elements of the vectors a and c are identical to the elements of {a_i} and {c_i}, respectively. The matrix D provides conversion from Lukosz coefficients into Zernike coefficients (see Appendix B). The pseudo-inverse matrix B ^† permits the calculation of the control signals from the Zernike coefficients. This matrix was obtained using an interferometric method described in Reference [17], which also enabled flattening of the initial DM aberration figure.

Fig. 2. (a) Schematic diagram of the experimental apparatus; (b) Raw image of scatterer without aberrations; (c) Spectral density of the scatterer image (log scale) with M ₁, M ₂ and incoherent cut-off frequencies marked. The horizontal and vertical lines at the edge of this image are FT artefacts arising from the sharp image boundaries.

In order to characterise the properties of g, we used a holographic scatterer (Physical Optics Corp.) as the transmissive specimen. An aberration-free image of the scatterer is shown in Fig. 2(b). This specimen is ideal for this characterisation as it contains all spatial frequencies within the pass band of the imaging system; this can be seen in the image spectral density (Fig. 2(c)). Figure 3(a) shows the measured variation of g with the root mean square (rms) aberration amplitude using different spatial frequency ranges. The aberration consisted of eight Lukosz modes (i = 4 to 11). The rms amplitude was calculated from the Lukosz coefficients as a = |a| (see Appendix B). Each data point shows the mean and standard deviations for an ensemble of 200 random aberrations of magnitude a. Each aberration was constructed by generating random coefficients in the range (-1,1) with uniform probability; the resulting vector was then scaled to the magnitude a. When only small spatial frequencies are used in the calculation of g, the deviation from the mean is small and the response is predominantly quadratic, as predicted by Eq. 27. When larger frequencies are also included, so that the low frequency approximations no longer hold, the value of g drops off more quickly and the deviation from the mean is more significant. However, the Lorentzian approximation is found to provide a close fit to the mean value of g for all curves. In Fig. 3(b), it can be seen that the width of the experimentally determined g response fits the theoretical prediction for low spatial frequencies.
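The construction of the random test aberrations described above can be sketched as follows. The function name and the use of a seeded generator are our own choices; the sketch does not guard against the (vanishingly unlikely) case of an all-zero draw.

```python
import math
import random

def random_aberration(n_modes, magnitude, rng=random):
    """Uniform random coefficients in (-1, 1), rescaled so that |a| = magnitude."""
    coeffs = [rng.uniform(-1.0, 1.0) for _ in range(n_modes)]
    norm = math.sqrt(sum(c * c for c in coeffs))
    return [magnitude * c / norm for c in coeffs]

rng = random.Random(0)                       # seeded for repeatability
a = random_aberration(8, 11.5, rng)          # eight modes, i = 4 to 11
print(a, math.sqrt(sum(c * c for c in a)))
```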

It is important to note that the width of the g response can be tuned by changing the spatial frequency range. In other words, the use of smaller spatial frequencies in the metric permits the measurement of larger aberrations. This property presents the possibility of schemes where aberrations are corrected in a series of optimisations covering first the large magnitude aberrations (using the smallest spatial frequencies), progressing to correction of less significant aberrations using larger frequencies.

Fig. 3. Experimental measurement of the optimisation metric: (a) Variation of g with aberration magnitude a = |a| for the different frequency ranges given by the figures in parentheses (M ₁,M ₂). The solid lines are Lorentzian fits to the mean values; (b) The measured and calculated half width of g(a) for different frequency ranges. The theoretical curve is valid for small frequencies.


7. Optimisation scheme

The aberration correction process can be performed as the maximisation of g({a_i}). Using relation 28, we see that this is equivalent to the minimisation of a different metric G({a_i}) defined as

G ({a_{i}}) = g {({a_{i}})}^{- 1} \approx q_{2} + q_{3} \sum_{i = 4}^{N + 3} a_{i}^{2},

where the approximation is valid over all aberration amplitudes. Using the Lukosz coefficients {a_i} as an N-dimensional coordinate basis, it is clear from Eq. 30 that G({a_i}) has a uniform paraboloidal shape in the neighbourhood of its minimum. This representation is particularly advantageous for optimisation, as the minimum of a paraboloidal function is readily found from a small number of metric evaluations. Moreover, the minimisation of G can be decomposed into a sequence of N independent one dimensional parabolic minimisations in each of the coefficients a_i. In order to perform a minimisation with respect to the coefficient a_k of a particular mode L_k, we can express G as

G (a_{k}) \approx {q'}_{2} + q_{3} a_{k}^{2},

where

{q'}_{2} = q_{2} + q_{3} \sum_{i \neq k} a_{i}^{2}

can be considered a constant. As the values of q′₂ and q ₃ are not known, the value of a_k that minimises G(a_k) can be calculated from a minimum of three measurements of G. In practice, we took these three measurements by intentionally introducing different aberrations using the adaptive element. We refer to these aberrations as biases. The biases were chosen to be Φ = -bL_k, Φ = 0 and Φ = +bL_k, where b is a suitable bias amplitude. For each bias, an image was acquired and its FT and spectral density were calculated. The appropriate range of frequency components was summed, giving the metric measurements g₋, g₀ and g₊ respectively, and the reciprocal of each result was calculated, giving G₋, G₀ and G₊. The optimum correction aberration was then estimated by parabolic minimisation as [18]

a_{corr} = - \frac{b (G_{+} - G_{-})}{2 G_{+} - 4 G_{0} + 2 G_{-}},

which is exactly equivalent to the Lorentzian maximisation of the metric g. The negative sign places a _corr at the vertex of the parabola through the three measurements, so that adding it cancels the coefficient already present. To correct this single mode, the correction aberration Φ = a _corr L_k would be added to the deformable mirror. For multiple mode correction, each modal coefficient would be measured in this manner before applying the full correction aberration containing all modes.

Fig. 4. Correction of a single Lukosz aberration mode (astigmatism, i = 5) using the scatterer specimen with M ₁ = 0.06 and M ₂ = 0.4. The first row shows the raw images and the second row contains the corresponding spectral densities. The third row illustrates schematically the sampling of the Lorentzian curve used in the optimisation calculation. The diagrams correspond to: (a1-a3) initial aberration of magnitude a ₅ = -4.9; (b1-b3) additional negative bias -b = -11.5 applied; (c1-c3) additional positive bias b = 11.5 applied; (d1-d3) correction applied.
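The three-measurement estimate can be sketched in simulation. Here the metric is generated from the Lorentzian model of relation 28 with placeholder values of q ₂ and q ₃, rather than from images, and the estimator returns the vertex of the parabola through the three (bias, G) samples, signed so that the returned value is the coefficient to add to the adaptive element (a _in + a _corr ≈ 0). The initial coefficient and bias are the values quoted in the caption of Fig. 4; everything else is an assumption for illustration.

```python
def lorentzian_metric(a_total, q2=1.0, q3=0.02):
    """Model metric g for one mode with total coefficient a_total (relation 28)."""
    return 1.0 / (q2 + q3 * a_total ** 2)

def estimate_correction(g_minus, g_zero, g_plus, b):
    """Coefficient to add to the adaptive element, from three metric readings.

    Works on G = 1/g, modelled as a parabola in the applied bias, and returns
    the bias value at the vertex of that parabola.
    """
    G_minus, G_zero, G_plus = 1.0 / g_minus, 1.0 / g_zero, 1.0 / g_plus
    curvature = 2.0 * G_plus - 4.0 * G_zero + 2.0 * G_minus
    return -b * (G_plus - G_minus) / curvature

a_in = -4.9    # initial coefficient of the mode
b = 11.5       # bias amplitude
readings = [lorentzian_metric(a_in + bias) for bias in (-b, 0.0, b)]
a_corr = estimate_correction(*readings, b)
residual = a_in + a_corr
print(a_corr, residual)
```

Because the simulated metric follows the model exactly, the residual vanishes to rounding error; with measured images the estimate is only as good as the Lorentzian approximation over the sampled range.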

7.1. Correction of a single mode

The correction process is illustrated in Fig. 4 for the correction of one Lukosz mode using the scatterer specimen. A suitable range of spatial frequencies and the bias amplitude were chosen based upon the curves in Fig. 3. An initial aberration was added using the DM, an image was acquired and the value of g was calculated. Positive and negative bias aberrations were added in turn and the corresponding values of g were calculated. The correction aberration was obtained using Eq. 33 and the correction was applied to the DM. The final rms phase aberration was found to be 0.18 rad, corresponding to a Strehl ratio of 0.97.

Fig. 5. Correction of multiple Lukosz aberration modes showing images (upper row) and spectral densities (lower row). The initial aberration was a random combination of eight modes, i = 4 to 11, with overall amplitude a = 11.5. For the correction procedure M ₁ = 0.06, M ₂ = 0.4, and the bias b = 9.8. The images show: (a) Scatterer, initial aberration; (b) Scatterer, corrected; (c) USAF test chart, initial aberration; (d) USAF test chart, corrected. The values of ϕ _rms show the root mean square phase aberration in radians.


7.2. Correction of multiple modes

The correction of multiple aberration modes is illustrated in Fig. 5 for both the scatterer specimen and a US Air Force test chart. In each case, the initial aberration, a _in, introduced by the DM consisted of eight Lukosz modes (i = 4 to 11) and had a total amplitude of |a _in| = 11.5. Each modal coefficient was estimated in turn using a bias amplitude b = 9.8. Once all eight coefficients had been estimated, the full correction aberration, a _corr, was added to the DM. We note that the unbiased measurement was identical for each modal estimation, so was only taken once. The final Strehl ratios were found to be 0.87 for the scatterer and 0.91 for the test chart. A further cycle of correction was also performed (not shown in the figure) using bias b = 4.9, giving final Strehl ratios of 0.99 for the scatterer and 0.98 for the test chart.
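The multiple mode procedure, with its shared unbiased measurement, requires 2N + 1 metric measurements for N modes. The sketch below simulates this sequence using the Lorentzian model in place of image measurements; the initial coefficients, the model constants and the function names are hypothetical, and the sign convention is such that the returned coefficients are added to the adaptive element.

```python
import math

def model_metric(coeffs, q2=1.0, q3=0.02):
    """Lorentzian model of g for a list of total modal coefficients."""
    return 1.0 / (q2 + q3 * sum(c * c for c in coeffs))

def correct_modes(measure, n_modes, b):
    """Estimate all modal corrections from 2N + 1 metric measurements.

    measure(applied) returns g with the coefficient list `applied` on the
    adaptive element; the unknown aberration is hidden inside `measure`.
    """
    zero = [0.0] * n_modes
    G_zero = 1.0 / measure(zero)             # shared unbiased measurement
    correction = []
    for k in range(n_modes):
        plus = list(zero)
        plus[k] = b
        minus = list(zero)
        minus[k] = -b
        G_plus = 1.0 / measure(plus)
        G_minus = 1.0 / measure(minus)
        correction.append(-b * (G_plus - G_minus)
                          / (2.0 * G_plus - 4.0 * G_zero + 2.0 * G_minus))
    return correction

a_in = [3.0, -2.0, 1.5, 0.5, -4.0, 2.5, -1.0, 6.0]   # hypothetical initial aberration
calls = []

def measure(applied):
    calls.append(1)                          # count the simulated image acquisitions
    return model_metric([x + y for x, y in zip(a_in, applied)])

a_corr = correct_modes(measure, len(a_in), b=9.8)
residual = math.sqrt(sum((x + y) ** 2 for x, y in zip(a_in, a_corr)))
print(len(calls), residual)
```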

7.3. Accuracy of correction

We investigated the correction accuracy for various spatial frequency ranges, bias amplitudes and input aberrations using the scatterer specimen. The results are summarised in Fig. 6 and Table 1. As in the previous section, the initial aberration, a _in, introduced by the DM consisted of a random combination of the eight Lukosz modes i = 4 to 11. The values of optimisation metric were obtained in a similar manner and the correction aberration, a _corr, was determined. We define the aberration error to be a _err = a _in + a _corr and the error magnitude as

ε = ∣ a_{err} ∣ = ∣ a_{in} + a_{corr} ∣ .

The approximate Strehl ratio was calculated using the Gaussian approximation [16] and related to a _err using the expression derived in Appendix B:

S = \exp (- ϕ_{rms}^{2}) = \exp (- {∣ D a_{err} ∣}^{2}) .
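As a worked instance of Eq. 35, the single mode result of Section 7.1 (an rms phase aberration of 0.18 rad) can be converted to a Strehl ratio, and a Strehl ratio can be converted back to an rms phase. The function names are our own.

```python
import math

def strehl_from_rms(phi_rms):
    """Gaussian approximation of the Strehl ratio for an rms phase in radians."""
    return math.exp(-phi_rms ** 2)

def rms_from_strehl(strehl):
    """Inverse of the Gaussian approximation."""
    return math.sqrt(-math.log(strehl))

print(strehl_from_rms(0.18))    # rounds to 0.97
print(rms_from_strehl(0.8))     # rms phase (rad) at a Strehl ratio of 0.8
```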

Fig. 6. Correction accuracy for the correction of eight Lukosz modes (i = 4 to 11) using different frequency ranges A-F (see Table 1). The upper graphs correspond to bias b = 4.9 whereas in the lower graphs b = 1.6. (a1,a2) show the mean correction error ε; (b1,b2) show the mean Strehl ratio S.


Table 1. Spatial frequency ranges for the results of Fig. 6.


A sequence of 50 measurements was taken for each input aberration magnitude |a _in| and the mean values of ε and S were calculated and plotted in Fig. 6. The data show that each of the frequency combinations can provide effective correction over a particular range, but that the range is largest when the lowest spatial frequencies are used. This property is related to the width of the g(a) curves shown in Fig. 3(a). Some of the data points in Fig. 6(a1) and (a2) lie above the ε = |a _in| line, indicating that the corrected aberration is larger than the initial aberration. These points correspond to metrics using higher spatial frequencies, where we do not expect the quadratic approximations used in the derivation of Eq. 27 to hold. We note that for small input aberrations there is an offset error. However, in all cases shown here, this corresponds to a Strehl ratio of greater than 0.8, close to the diffraction limit.

Fig. 7. The effect of additional modes on the correction procedure using M ₁ = 0.06, M ₂ = 0.4, and b = 4.9.


These results indicate that, when aberration statistics are unknown, a sensible strategy would involve choosing small spatial frequencies for an initial correction. This would be accompanied by a bias that is no larger than the half width of the response curve, as shown in Fig. 3(b). If further correction is required, this could be performed using a larger range of frequencies and a corresponding smaller bias. If the maximum expected aberration magnitude is known, then the bias could be chosen to be similar to this maximum.

The effect of additional aberration modes on the correction process was investigated by including random combinations of an extra eight modes (i = 12 to 19) in the initial aberration. The original eight modes (i = 4 to 11) were corrected in the same manner as before and ε was calculated taking into account only the modes that were corrected. The results obtained when different amounts of the additional modes were present are shown in Fig. 7. The error ε shows only a small variation as the amplitude of the additional modes is increased. This illustrates that different aberration modes can be corrected independently using this procedure.

8. Discussion and Conclusions

We have introduced a model-based adaptive optics scheme for correcting aberrations in an incoherent imaging system. Using an optimisation metric based upon the low spatial frequency content of the image and an aberration expansion in terms of Lukosz modes, we have been able to separate the effects of the different aberration modes. This allowed the optimisation to be performed as a sequence of independent corrections of each mode. Although only low spatial frequencies are used in the optimisation process, correction of all aberrations (aside from piston, tip and tilt) results. Consequently, imaging quality is improved for all spatial frequencies and not solely the frequencies used in the optimisation metric. The correction scheme is predominantly independent of object structure: the model is valid when the low spatial frequency components are not significantly concentrated in one orientation. Such a concentration would occur, for example, if the image were dominated by a one dimensional grid-like pattern. Even if the object has this form, we expect the scheme to be robust, as indicated by preliminary results. Although the discussion in this paper was framed in the context of an incoherent imaging system, we expect this approach also to be valid for coherent or partially coherent systems.

The optimisation metric used in this paper can be related to image sharpness measures that have been employed in many other image-based adaptive optics systems [5, 6, 7, 8, 9, 10, 11]. A common definition for image sharpness, σ, is obtained by integrating the square of the image intensity, I(x):

σ = \iint I {(x)}^{2} d x d y .

As noted by Hamaker et al. [19], by using Parseval’s theorem, σ can also be calculated in the Fourier domain as

σ = \int_{ξ = 0}^{2 π} \int_{m = 0}^{2} S_{J} (m) m d m d ξ

and is thus equivalent to the metric g if using the spatial frequency range (M ₁,M ₂) = (0,2). The methods described in this paper could therefore be extended for use with image sharpness metrics, obviating the need to calculate the image FT. For example, they would be directly applicable if the object spectrum were dominated by low spatial frequency components.
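The equivalence between the spatial and Fourier domain definitions of sharpness rests on Parseval's theorem, which can be checked in a discrete, one dimensional form. The sample intensity values below are arbitrary, and the factor 1/N reflects the unnormalised DFT convention assumed in this sketch.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * f * k / n) for k in range(n))
            for f in range(n)]

image = [0.2, 1.0, 0.7, 0.0, 0.3, 0.9, 0.4, 0.1]    # hypothetical intensity samples

sharpness_spatial = sum(v * v for v in image)
sharpness_fourier = sum(abs(J) ** 2 for J in dft(image)) / len(image)
print(sharpness_spatial, sharpness_fourier)
```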

Appendix A: Evaluation of an integral

In this Appendix we examine the properties of the following integral that appears in the calculation of the OTF (Eq. 10):

I_{2} = \frac{1}{π} \iint_{P} (m ∙ \nabla Φ) d A .

The integration is independent of m, which can be removed from the integrand, so we find that

I_{2} = \frac{1}{π} m ∙ \iint_{P} (\nabla Φ) d A = \frac{1}{π} m ∙ \oint_{C} Φ n d c,

where we have employed Gauss’ theorem to convert the surface integral over the pupil P to a line integral around its circumference C (see Fig. 1). The term dc is an infinitesimal section of C that has the corresponding unit normal vector n. If we define m = (m cos ξ, m sin ξ) and ϕ(θ) = Φ(r) when |r| = 1 then Eq. 39 can be rewritten as

I_{2} = \frac{1}{π} m \int_{θ = 0}^{2 π} ϕ (θ) \cos (ξ - θ) d θ .

We can expand ϕ(θ) as a Fourier series:

ϕ (θ) = \frac{μ_{0}}{2} + \sum_{i = 1}^{\infty} [μ_{i} \cos (iθ) + v_{i} \sin (iθ)],

which, when substituted into Eq. 40, yields

I_{2} = m [μ_{1} \cos (ξ) + v_{1} \sin (ξ)] .

Hence, the integral I₂ is zero unless μ₁ or v₁ is non-zero. If Φ(r) is expressed as a series of Lukosz modes, then the only modes that can contribute to μ₁ or v₁ are mode 2 (tip) and mode 3 (tilt). All other Lukosz modes of the first azimuthal order are identically zero at r = 1 so do not contribute to ϕ(θ). The effect of the tip and tilt modes is simply to translate the image laterally. They introduce a phase variation in the OTF but have no effect on the OTF magnitude. The role of I₂ in Eq. 10 is therefore to compensate for the previous second order term so that the OTF magnitude is not affected by tip and tilt.
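The result of this Appendix can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the original analysis: it evaluates the boundary integral of Eq. 40 by a uniform-sample sum for the edge values (r = 1) of several Lukosz modes taken from Table 2. The function name, the dictionary of edge values and the chosen values of m and ξ are hypothetical.

```python
import numpy as np


def boundary_integral(phi, m, xi, samples=256):
    """Approximate (m / pi) * integral of phi(theta) * cos(xi - theta) over [0, 2*pi)."""
    theta = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    d_theta = 2.0 * np.pi / samples
    return m / np.pi * np.sum(phi(theta) * np.cos(xi - theta)) * d_theta


# Values of some Lukosz modes from Table 2 on the pupil edge, r = 1.
edge_values = {
    "tip": lambda t: np.cos(t),
    "tilt": lambda t: np.sin(t),
    "defocus": lambda t: np.zeros_like(t),                # (r^2 - 1) vanishes at r = 1
    "astigmatism": lambda t: np.cos(2 * t) / np.sqrt(2),
    "coma": lambda t: (3 - 3) * np.cos(t) / np.sqrt(6),   # (3r^3 - 3r) vanishes at r = 1
    "trefoil": lambda t: np.cos(3 * t) / np.sqrt(3),
}

m, xi = 0.1, 0.7  # hypothetical spatial frequency magnitude and orientation
for name, phi in edge_values.items():
    print(f"{name:12s} I2 = {boundary_integral(phi, m, xi):+.6f}")
```

Under these assumptions only tip and tilt give non-zero values, equal to m cos(ξ) and m sin(ξ) respectively, in agreement with Eq. 42 for μ₁ = 1 or v₁ = 1.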

Appendix B: Zernike and Lukosz functions

Some low-order Zernike and Lukosz functions are listed in Table 2. The mode indexing schemes, using the single index i or the dual indices (n,m), are explained by Neil et al. [20]. The Zernike functions are normalised such that a coefficient of value 1 corresponds to a wave front variance of 1 rad². The Lukosz functions are normalised such that a coefficient of value 1 corresponds to a focal spot second moment (or equivalently rms spot radius) of λ/(2πNA), where λ is the wavelength and NA is the numerical aperture of the focussing lens.

Conversion between sets of Lukosz modal coefficients {a^m_n} and Zernike coefficients {z^m_n} can be performed using the following relationships:

z_{n}^{m} = \begin{cases} \frac{1}{2 \sqrt{n (n + 1)}} a_{n}^{m} - \frac{1}{2 \sqrt{(n + 1) (n + 2)}} a_{n + 2}^{m}, & m \neq n \\ \frac{1}{\sqrt{2 n (n + 1)}} a_{n}^{n}, & m = n \neq 0 \\ a_{0}^{0}, & m = n = 0 \end{cases}

If the sets of Lukosz and Zernike coefficients are represented by the vectors a and z respectively then we can use the matrix-vector equation z = Da for the conversion, where the elements of the sparse matrix D are calculated from Eq. 43. The rms phase aberration can be easily calculated from the Zernike mode coefficients as |z|. It follows that, in terms of Lukosz coefficients, the rms phase aberration is given by

ϕ_{rms} = | Da | .

Using a geometric optics approximation, the rms focal spot radius ρ_rms is related to the Lukosz coefficients by [14, 15]

ρ_{rms} = \frac{λ}{2 π NA} | a | .
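The relation between ρ_rms and |a| rests on the normalisation of the Lukosz functions: in the geometric optics picture the transverse ray displacement is proportional to the wave front gradient, so the mean square spot radius is proportional to the pupil average of |∇Φ|². The following sketch, which is an illustrative check and not code from this work, uses the Lukosz modes 2 to 11 of Table 2 and confirms numerically that the pupil-averaged inner products of their gradients form an identity matrix. The data layout, function names and quadrature settings are assumptions made for this example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

S2, S3, S6 = np.sqrt(2), np.sqrt(3), np.sqrt(6)
# Lukosz modes from Table 2: index i -> (radial coefficients in ascending
# powers of r, azimuthal order m; m >= 0 means cos(m*theta), m < 0 means sin(|m|*theta)).
LUKOSZ = {
    2: (np.array([0, 1.0]), 1),
    3: (np.array([0, 1.0]), -1),
    4: (np.array([-1, 0, 1]) / S2, 0),
    5: (np.array([0, 0, 1]) / S2, 2),
    6: (np.array([0, 0, 1]) / S2, -2),
    7: (np.array([0, -3, 0, 3]) / S6, 1),
    8: (np.array([0, -3, 0, 3]) / S6, -1),
    9: (np.array([0, 0, 0, 1]) / S3, 3),
    10: (np.array([0, 0, 0, 1]) / S3, -3),
    11: (np.array([1, 0, -4, 0, 3]) / 2, 0),
}


def gradient(i, r, theta):
    """Radial and azimuthal components of grad L_i on a polar grid (r > 0)."""
    coeffs, m = LUKOSZ[i]
    radial = P.polyval(r, coeffs)
    d_radial = P.polyval(r, P.polyder(coeffs))
    if m >= 0:
        ang, d_ang = np.cos(m * theta), -m * np.sin(m * theta)
    else:
        ang, d_ang = np.sin(-m * theta), -m * np.cos(-m * theta)
    return d_radial * ang, radial / r * d_ang


def gradient_gram_matrix(indices, n_r=16, n_theta=64):
    """Pupil-averaged inner products (1/pi) * integral of grad L_p . grad L_q dA."""
    x, w = np.polynomial.legendre.leggauss(n_r)   # Gauss-Legendre nodes on [-1, 1]
    r = (0.5 * (x + 1.0))[:, None]                # mapped to (0, 1)
    w_r = (0.5 * w)[:, None]
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)[None, :]
    weight = r * w_r * (2.0 * np.pi / n_theta) / np.pi
    grads = [gradient(i, r, theta) for i in indices]
    gram = np.empty((len(indices), len(indices)))
    for p, (gr_p, gt_p) in enumerate(grads):
        for q, (gr_q, gt_q) in enumerate(grads):
            gram[p, q] = np.sum((gr_p * gr_q + gt_p * gt_q) * weight)
    return gram


def rms_spot_radius(a, wavelength, na):
    """rho_rms = wavelength / (2 * pi * NA) * |a| for Lukosz coefficients a."""
    return wavelength / (2.0 * np.pi * na) * np.linalg.norm(a)


gram = gradient_gram_matrix(sorted(LUKOSZ))
print(np.allclose(gram, np.eye(len(LUKOSZ))))
```

Because the gradient inner products are orthonormal for these modes, the pupil-averaged |∇Φ|² of any combination of them equals |a|², which is the property that the expression for ρ_rms relies on. Piston is omitted from the sketch since its gradient is zero.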

Table 2. Zernike and Lukosz mode definitions


Acknowledgements

D. Débarre is supported by the Délégation Générale pour l’Armement. M. J. Booth is a Royal Academy of Engineering/EPSRC Research Fellow. This work was supported by a grant from the John Fell OUP Research Fund at the University of Oxford. Thanks are due to O. Shatrovoy and T. Bifano of Boston University for providing LabView drivers for the deformable mirror.

References and links

1. R. K. Tyson, Principles of Adaptive Optics, Academic Press, London, 1991.

2. J. W. Hardy, Adaptive Optics for Astronomical Telescopes, Oxford University Press, 1998.

3. R. A. Gonsalves, “Phase retrieval and diversity in adaptive optics,” Opt. Eng. 21, 829–832, 1982.

4. D. R. Luke, J. V. Burke, and R. G. Lyon, “Optical Wavefront Reconstruction: Theory and Numerical Methods,” SIAM Review 44, 169–224, 2002.

5. R. A. Muller and A. Buffington, “Real-time correction of atmospherically degraded telescope images through image sharpening,” J. Opt. Soc. Am. 64, 1200–1210, 1974.

6. A. Buffington, F. S. Crawford, R. A. Muller, A. J. Schwemin, and R. G. Smits, “Correction of atmospheric distortion with an image-sharpening telescope,” J. Opt. Soc. Am. 67, 298–303, 1977.

7. M. A. Vorontsov, G. W. Carhart, D. V. Pruidze, J. C. Ricklin, and D. G. Voelz, “Image quality criteria for an adaptive imaging system based on statistical analysis of the speckle field,” J. Opt. Soc. Am. A 13, 1456–1466, 1996.

8. M. A. Vorontsov, G. W. Carhart, and J. C. Ricklin, “Adaptive phase-distortion correction based on parallel gradient-descent optimization,” Opt. Lett. 22, 907–909, 1997.

9. N. Doble, Image Sharpness Metrics and Search Strategies for Indirect Adaptive Optics. PhD thesis, University of Durham, United Kingdom, 2000.

10. J. R. Fienup and J. J. Miller, “Aberration correction by maximizing generalized sharpness metrics,” J. Opt. Soc. Am. A 20, 609–620, 2003.

11. L. Murray, J. C. Dainty, and E. Daly, “Wavefront correction through image sharpness maximisation,” in Opto-Ireland 2005: Imaging and Vision, Proc. SPIE 5823, 40–47, 2005.

12. M. J. Booth, “Wave front sensor-less adaptive optics: a model-based approach using sphere packings,” Opt. Express 14, 1339–1352, 2006.

13. M. J. Booth, “Wavefront sensorless adaptive optics for large aberrations,” Opt. Lett. 32, 5–7, 2007.

14. W. Lukosz, “Der Einfluß der Aberrationen auf die optische Übertragungsfunktion bei kleinen Orts-Frequenzen,” Optica Acta 10, 1–19, 1963.

15. J. Braat, “Polynomial expansion of severely aberrated wave fronts,” J. Opt. Soc. Am. A 4, 643–650, 1987.

16. V. N. Mahajan, Optical Imaging and Aberrations, Part II. Wave Diffraction Optics, SPIE, Bellingham, Wash., 2001.

17. M. J. Booth, T. Wilson, H.-B. Sun, T. Ota, and S. Kawata, “Methods for the characterisation of deformable membrane mirrors,” Appl. Opt. 44, 5131–5139, 2005.

18. W. Press, S. Teukolsky, W. Vetterling, and B. Flannery, Numerical Recipes in C, Cambridge University Press, 2nd ed., 1992.

19. J. P. Hamaker, J. D. O’Sullivan, and J. E. Noordam, “Image sharpness, Fourier optics, and redundant-spacing interferometry,” J. Opt. Soc. Am. 67, 1122–1123, 1977.

20. M. A. A. Neil, M. J. Booth, and T. Wilson, “New modal wavefront sensor: a theoretical analysis,” J. Opt. Soc. Am. A 17, 1098–1107, 2000.

Frequency range	M₁	M₂	Half width of g(a)
A	0.06	0.08	17.0
B	0.06	0.20	9.82
C	0.06	0.40	6.05
D	0.06	0.80	4.25
E	0.06	1.20	3.88
F	0.06	2.00	3.81

i	n	m	Zernike mode Z_i (r, θ)	Lukosz mode L_i (r, θ)	Name
1	0	0	$1$	$1$	Piston
2	1	1	$2 r \cos (θ)$	$r \cos (θ)$	Tip
3	1	-1	$2 r \sin (θ)$	$r \sin (θ)$	Tilt
4	2	0	$\sqrt{3} (2 r^{2} - 1)$	$\frac{1}{\sqrt{2}} (r^{2} - 1)$	Defocus
5	2	2	$\sqrt{6} r^{2} \cos (2 θ)$	$\frac{1}{\sqrt{2}} r^{2} \cos (2 θ)$	Astigmatism
6	2	-2	$\sqrt{6} r^{2} \sin (2 θ)$	$\frac{1}{\sqrt{2}} r^{2} \sin (2 θ)$	Astigmatism
7	3	1	$2 \sqrt{2} (3 r^{3} - 2 r) \cos (θ)$	$\frac{1}{\sqrt{6}} (3 r^{3} - 3 r) \cos (θ)$	Coma
8	3	-1	$2 \sqrt{2} (3 r^{3} - 2 r) \sin (θ)$	$\frac{1}{\sqrt{6}} (3 r^{3} - 3 r) \sin (θ)$	Coma
9	3	3	$2 \sqrt{2} r^{3} \cos (3 θ)$	$\frac{1}{\sqrt{3}} r^{3} \cos (3 θ)$	Trefoil
10	3	-3	$2 \sqrt{2} r^{3} \sin (3 θ)$	$\frac{1}{\sqrt{3}} r^{3} \sin (3 θ)$	Trefoil
11	4	0	$\sqrt{5} (6 r^{4} - 6 r^{2} + 1)$	$\frac{1}{2} (3 r^{4} - 4 r^{2} + 1)$	Spherical


Optics Express