Tutorial on the Strehl ratio, wavefront power series expansion, and Zernike polynomial expansion in optical systems with small aberrations
1. Strehl Ratio
The wave aberration function, W(x,y), is defined as the distance, in optical path length (the product of the refractive index and the path length), from the reference sphere to the wavefront in the exit pupil, measured along the ray as a function of the transverse coordinates (x,y) of the ray's intersection with a reference sphere centered on the ideal image point. It is not the wavefront itself but the departure of the wavefront from the reference sphere, i.e., the optical path difference (OPD), as indicated in Figure 1.
Figure 1. Wave Aberration Function for a distant point object
For small aberrations, the Strehl ratio is defined as the intensity at the Gaussian image point in the presence of aberration (the center of the reference sphere is the point of maximum intensity in the observation plane), divided by the intensity that would be obtained if no aberration were present:
$$S = \frac{I(\Phi)}{I(\Phi = 0)}$$
Here $\Phi = (2\pi/\lambda)\,W$ is the phase aberration and $\lambda$ is the wavelength. The Strehl ratio is a very important figure of merit for systems with small aberrations, e.g., astronomical systems, where aberrations are almost always well corrected, so a good understanding of the relationship between the Strehl ratio and the aberration variance is essential. Switching to polar exit pupil coordinates (ρ,θ), with ρ normalized to the pupil radius, the definition of the Strehl ratio becomes
$$S = \frac{1}{\pi^2}\left|\int_0^1\!\int_0^{2\pi} \exp\left[i\,\Phi(\rho,\theta)\right]\rho\,d\rho\,d\theta\right|^2 \qquad (1)$$
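Expanding the complex exponential of equation (1) in a power series and keeping terms up to second order in Φ, which is adequate for small aberrations, gives the following approximate expression for the Strehl ratio (the Maréchal formula):
$$S \approx \left(1 - \tfrac{1}{2}\sigma_\Phi^2\right)^2 \qquad (2)$$
where $\sigma_\Phi$ is the standard deviation of the phase aberration. Neglecting the $\sigma_\Phi^4$ term in formula (2) gives another well-known approximate expression for the Strehl ratio:
$$S \approx 1 - \sigma_\Phi^2 \qquad (3)$$
where $\sigma_\Phi^2$ is the variance of the phase aberration across the exit pupil:
$$\sigma_\Phi^2 = \langle\Phi^2\rangle - \langle\Phi\rangle^2, \qquad \langle\Phi^k\rangle = \frac{1}{\pi}\int_0^1\!\int_0^{2\pi} \Phi^k(\rho,\theta)\,\rho\,d\rho\,d\theta \qquad (4)$$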
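As a rough numerical check, the sketch below evaluates the pupil integral for the Strehl ratio by quadrature and compares it with the two small-aberration approximations, $(1 - \sigma_\Phi^2/2)^2$ and $1 - \sigma_\Phi^2$. The test phase $\Phi = A\rho^2$ (pure defocus) with A = 1 rad, the helper names, and the quadrature sizes are assumptions chosen for illustration; none of them come from the source.

```python
import numpy as np


def pupil_quadrature(n_rho=32, n_theta=64):
    """Nodes and weights for averaging a function over the unit-radius pupil.

    Gauss-Legendre nodes are used in rho and uniform nodes in theta. The
    weights include the area element rho*drho*dtheta and the 1/pi
    normalization, so they sum to 1.
    """
    x, wx = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (x + 1.0)                 # map [-1, 1] onto [0, 1]
    w_rho = 0.5 * wx * rho                # includes the Jacobian rho
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    w_theta = np.full(n_theta, 2.0 * np.pi / n_theta)
    R, T = np.meshgrid(rho, theta, indexing="ij")
    weights = np.outer(w_rho, w_theta) / np.pi
    return R, T, weights


def strehl_from_phase(phi, weights):
    """Strehl ratio as the squared modulus of the pupil average of exp(i*phi)."""
    return float(np.abs(np.sum(weights * np.exp(1j * phi))) ** 2)


def phase_variance(phi, weights):
    """Variance of the phase over the pupil: <phi^2> - <phi>^2."""
    mean = np.sum(weights * phi)
    return float(np.sum(weights * phi**2) - mean**2)


R, T, weights = pupil_quadrature()
A = 1.0                                   # assumed peak defocus phase in radians
phi = A * R**2

var = phase_variance(phi, weights)
s_exact = strehl_from_phase(phi, weights)
s_marechal = (1.0 - 0.5 * var) ** 2       # (1 - sigma^2/2)^2
s_linear = 1.0 - var                      # 1 - sigma^2

print(f"variance      = {var:.6f}")
print(f"S (integral)  = {s_exact:.6f}")
print(f"S (Marechal)  = {s_marechal:.6f}")
print(f"S (1 - var)   = {s_linear:.6f}")
```

For pure defocus the integral has the closed form $[\sin(A/2)/(A/2)]^2$ and the variance is $A^2/12$, which gives an independent check on the quadrature.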
From equations (1)-(4), we can see that for a system with small aberrations the Strehl ratio is maximized by minimizing the wavefront variance. We therefore need a standard way to evaluate and minimize $\sigma_\Phi^2$ for any given wave aberration function W(x,y).
2. Power series expansion of W(ρ,θ)
A standard way of describing the wave aberration is to use a Taylor (power series) expansion in the field (object height) and pupil coordinates:
$$W(h; r, \theta) = \sum_{k,l,m} W_{klm}\, h^k\, r^l \cos^m\theta$$
where $W_{klm}$ are the wave aberration coefficients for the various terms or modes, h is the height of the object, and r, θ are the polar coordinates in the pupil plane.
From equation (4) above, the wavefront variance is an integral over the pupil only, so we can suppress the field dependence h by absorbing it into the wave aberration coefficients. The wave aberration polynomial is also typically expressed in terms of the normalized pupil radius, ρ = r/a, where a is the exit pupil radius. The power series expansion can then be rewritten as:
$$W(\rho,\theta) = \sum_{n,m} A_{nm}\,\rho^n \cos^m\theta \qquad (5)$$
However, the terms in the Taylor series do not form an orthogonal set of basis functions, so substituting (5) into (4) produces many cross terms and complicates the calculation. The power series expansion is therefore not recommended for data fitting or for describing experimental measurements of wavefront aberrations.
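The cross terms can be made concrete with a small example. The sketch below is limited to the two rotationally symmetric power series terms ρ² and ρ⁴, for which the pupil average of ρ^p over the unit disk is exactly 2/(p + 2); the function names and the choice of terms are illustrative assumptions.

```python
from fractions import Fraction


def pupil_mean_rho_power(p):
    """Mean of rho**p over the unit disk: (1/pi) * integral of rho**p * rho drho dtheta."""
    return Fraction(2, p + 2)


def covariance(p, q):
    """Covariance of rho**p and rho**q over the unit disk."""
    return pupil_mean_rho_power(p + q) - pupil_mean_rho_power(p) * pupil_mean_rho_power(q)


def variance_of_sum(a, b, p=2, q=4):
    """Variance of a*rho**p + b*rho**q, including the cross term."""
    a, b = Fraction(a), Fraction(b)
    return (a * a * covariance(p, p)
            + b * b * covariance(q, q)
            + 2 * a * b * covariance(p, q))


cross = covariance(2, 4)                                # nonzero: terms are not orthogonal
sum_of_variances = covariance(2, 2) + covariance(4, 4)  # ignores the cross term
true_variance = variance_of_sum(1, -1)                  # W = rho^2 - rho^4

print("covariance(rho^2, rho^4) =", cross)
print("sum of variances         =", sum_of_variances)
print("variance of rho^2-rho^4  =", true_variance)
```

The covariance of ρ² and ρ⁴ is 1/12 rather than zero, so the variance of ρ² - ρ⁴ is 1/180, not the 31/180 obtained by adding the two individual variances.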
3. Why Use Zernike Polynomials?
Optical system aberrations have historically been described, characterized, and catalogued by power series expansions, where the wave aberration is expressed as a weighted sum of power series terms that are functions of the pupil coordinates. Each term is associated with a particular aberration or mode, for example spherical aberration, coma, astigmatism, field curvature, distortion, and other higher-order modes.
Many optical systems have circular pupils. So many analyses and calculations (e.g. diffraction) will involve the integration of the pupil function and wave aberration function over a circular pupil. Experimental measurements will also be performed over a circular pupil and will commonly require some form of data fitting. It is, therefore, convenient to expand the wave aberration in terms of a complete set of basis functions that are orthogonal over the interior of a circle. Experimental data can be fit to a weighted sum of these orthogonal basis functions.
Zernike polynomials form a complete set of functions or modes that are orthogonal over a circle of unit radius and are convenient for serving as a set of basis functions. They are unique in that they are the only polynomials in two variables ρ and θ which (a) are orthogonal over a unit circle, (b) are invariant in form with respect to rotation of the coordinate axes about the origin, and (c) include a polynomial for each permissible pair of n and m values. This makes them suitable for accurately describing wave aberrations as well as for data fitting. They are usually expressed in polar coordinates and are readily convertible to Cartesian coordinates. These polynomials are mutually orthogonal, and are therefore mathematically independent, making the variance of a sum of modes equal to the sum of the variances of the individual modes. They can be scaled so that all modes other than the constant (piston) term have zero mean and unit variance. This puts all modes in a common reference frame that enables meaningful relative comparison between them.
Different Zernike polynomial definitions are currently in use. The convention adopted by the OSA has x "horizontal", y "vertical", and θ measured counter-clockwise from the x-axis (i.e., a right-handed coordinate system). More traditional notation measures θ clockwise from the y-axis. There is the orthogonal type, where the polynomials are normalized to have unit magnitude at the edge of the pupil. There is also the orthonormal type, where the terms are normalized so that the coefficient of a particular term or mode is the RMS contribution of that term. Here I am going to adopt Prof. Mahajan's definition.
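To illustrate the orthonormal type, the following sketch builds Zernike circle polynomials and checks numerically that their pupil-averaged products form an identity matrix. It assumes the normalization factor sqrt(2(n+1)/(1+δ_m0)) and uses a signed index m (cosine for m > 0, sine for m < 0) as a bookkeeping device; the function names and the signed-index ordering are assumptions and may differ from the ordering used in the source's table.

```python
import numpy as np
from math import factorial


def zernike_radial(n, m, rho):
    """Radial polynomial R_n^|m|(rho); n - |m| must be even and non-negative."""
    k = abs(m)
    if k > n or (n - k) % 2:
        raise ValueError("need |m| <= n and n - |m| even")
    total = np.zeros_like(rho, dtype=float)
    for s in range((n - k) // 2 + 1):
        coeff = (-1) ** s * (
            factorial(n - s)
            // (factorial(s) * factorial((n + k) // 2 - s) * factorial((n - k) // 2 - s))
        )
        total += coeff * rho ** (n - 2 * s)
    return total


def zernike(n, m, rho, theta):
    """Orthonormal Zernike mode: cos(m*theta) for m > 0, sin(|m|*theta) for m < 0."""
    norm = np.sqrt(n + 1.0) if m == 0 else np.sqrt(2.0 * (n + 1.0))
    if m > 0:
        angular = np.cos(m * theta)
    elif m < 0:
        angular = np.sin(-m * theta)
    else:
        angular = 1.0
    return norm * zernike_radial(n, m, rho) * angular


def pupil_quadrature(n_rho=32, n_theta=64):
    """Quadrature nodes and weights (summing to 1) for averages over the unit disk."""
    x, wx = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (x + 1.0)
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    weights = np.outer(0.5 * wx * rho, np.full(n_theta, 2.0 * np.pi / n_theta)) / np.pi
    return R, T, weights


def gram_matrix(n_max):
    """Pupil averages <Z_i Z_j> for all modes through radial order n_max."""
    R, T, weights = pupil_quadrature()
    modes = [(n, m) for n in range(n_max + 1) for m in range(-n, n + 1, 2)]
    values = [zernike(n, m, R, T) for n, m in modes]
    G = np.array([[np.sum(weights * zi * zj) for zj in values] for zi in values])
    return modes, G


modes, G = gram_matrix(4)
print("modes through n = 4:", len(modes))
print("orthonormal:", np.allclose(G, np.eye(len(modes))))
```

Because the constant mode equals 1, orthonormality against it implies that every other mode has zero mean, and the unit diagonal then implies unit variance.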
4. Zernike Polynomials expansion of W(ρ,θ)
The orthonormal Zernike polynomials, and the names associated with some of them when identified with aberrations, are listed in Table 1 below for n ≤ 8. The number of Zernike (or orthogonal) aberration terms in the expansion of an aberration function through a certain order n is given by
$$N_n = \frac{(n+1)(n+2)}{2}$$
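The count of terms through order n is (n + 1)(n + 2)/2, which can be checked by enumerating the permissible index pairs. In this sketch a signed m is used to distinguish cosine (m > 0) from sine (m < 0) terms; that indexing and the function names are assumptions for illustration.

```python
def zernike_indices(n_max):
    """All (n, m) pairs through radial order n_max, with n - |m| even and |m| <= n."""
    return [(n, m) for n in range(n_max + 1) for m in range(-n, n + 1, 2)]


def zernike_count(n_max):
    """Closed-form number of terms through radial order n_max."""
    return (n_max + 1) * (n_max + 2) // 2


for n_max in (2, 4, 8):
    print(n_max, len(zernike_indices(n_max)), zernike_count(n_max))
```

Each radial order n contributes n + 1 terms, so the total through order n is the sum 1 + 2 + ... + (n + 1).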
Table 1. Orthonormal Zernike circle polynomials
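Consider a typical Zernike aberration term:
$$W_n^m(\rho,\theta) = c_{nm}\, Z_n^m(\rho,\theta)$$
where $c_{nm}$ is the expansion coefficient of the orthonormal polynomial $Z_n^m$. Unless n = m = 0, its mean value is zero; i.e.
$$\langle W_n^m \rangle = \frac{1}{\pi}\int_0^1\!\int_0^{2\pi} W_n^m(\rho,\theta)\,\rho\,d\rho\,d\theta = 0$$
Hence, its variance is given by
$$\sigma_{nm}^2 = \langle (W_n^m)^2 \rangle - \langle W_n^m \rangle^2 = c_{nm}^2$$
Thus, each expansion coefficient, with the exception of $c_{00}$, represents the standard deviation of the corresponding aberration term. The variance of the aberration function is accordingly given by:
$$\sigma_W^2 = \langle W^2 \rangle - \langle W \rangle^2 = \sum_{(n,m)\neq(0,0)} c_{nm}^2$$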
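The step from orthonormality to the variance can be written out explicitly. The lines below are a sketch in assumed notation: $c_{nm}$ is the coefficient of the orthonormal polynomial $Z_n^m$, angle brackets denote the average over the unit pupil, $Z_0^0 = 1$, and $\langle Z_n^m Z_{n'}^{m'} \rangle = \delta_{nn'}\delta_{mm'}$.

```latex
\begin{aligned}
W(\rho,\theta) &= \sum_{n,m} c_{nm}\, Z_n^m(\rho,\theta), \\
\langle W \rangle &= \sum_{n,m} c_{nm}\, \langle Z_n^m Z_0^0 \rangle = c_{00}, \\
\langle W^2 \rangle &= \sum_{n,m}\sum_{n',m'} c_{nm}\, c_{n'm'}\,
    \langle Z_n^m Z_{n'}^{m'} \rangle = \sum_{n,m} c_{nm}^2, \\
\sigma_W^2 &= \langle W^2 \rangle - \langle W \rangle^2
    = \sum_{n,m} c_{nm}^2 - c_{00}^2 .
\end{aligned}
```

All cross terms vanish, so only the squared coefficients remain and the piston coefficient $c_{00}$ drops out of the variance.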
Thus, with a Zernike polynomial expansion, the variance of the aberration function becomes a simple sum of squared expansion coefficients, which greatly simplifies the calculation of the wavefront variance needed for the Strehl ratio.
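A minimal end-to-end sketch of this shortcut is given below. It assumes orthonormal Zernike coefficients expressed in waves, a signed index m (cosine for m > 0, sine for m < 0), and made-up coefficient values; none of these come from the source. The variance obtained by summing squared coefficients is compared with a direct pupil integration, and the resulting estimate 1 - σ_Φ² is compared with the Strehl ratio from the full pupil integral.

```python
import numpy as np
from math import factorial


def zernike(n, m, rho, theta):
    """Orthonormal Zernike mode (assumed convention: cos for m > 0, sin for m < 0)."""
    k = abs(m)
    radial = np.zeros_like(rho, dtype=float)
    for s in range((n - k) // 2 + 1):
        coeff = (-1) ** s * (
            factorial(n - s)
            // (factorial(s) * factorial((n + k) // 2 - s) * factorial((n - k) // 2 - s))
        )
        radial += coeff * rho ** (n - 2 * s)
    norm = np.sqrt(n + 1.0) if m == 0 else np.sqrt(2.0 * (n + 1.0))
    if m > 0:
        return norm * radial * np.cos(m * theta)
    if m < 0:
        return norm * radial * np.sin(-m * theta)
    return norm * radial


def pupil_quadrature(n_rho=32, n_theta=64):
    """Quadrature nodes and weights (summing to 1) for averages over the unit disk."""
    x, wx = np.polynomial.legendre.leggauss(n_rho)
    rho = 0.5 * (x + 1.0)
    theta = 2.0 * np.pi * np.arange(n_theta) / n_theta
    R, T = np.meshgrid(rho, theta, indexing="ij")
    weights = np.outer(0.5 * wx * rho, np.full(n_theta, 2.0 * np.pi / n_theta)) / np.pi
    return R, T, weights


# Hypothetical coefficients in waves: piston, defocus, astigmatism, spherical.
coeffs = {(0, 0): 0.5, (2, 0): 0.02, (2, 2): 0.03, (4, 0): -0.01}

R, T, weights = pupil_quadrature()
W = sum(c * zernike(n, m, R, T) for (n, m), c in coeffs.items())

# Variance by direct integration over the pupil.
var_numeric = float(np.sum(weights * W**2) - np.sum(weights * W) ** 2)
# Variance from the coefficients alone; the piston term is excluded.
var_coeffs = sum(c**2 for (n, m), c in coeffs.items() if (n, m) != (0, 0))

sigma_phi_sq = (2.0 * np.pi) ** 2 * var_coeffs       # phase variance in rad^2
s_approx = 1.0 - sigma_phi_sq                        # small-aberration estimate
s_exact = float(np.abs(np.sum(weights * np.exp(2j * np.pi * W))) ** 2)

print(f"variance (integration)  = {var_numeric:.6e} waves^2")
print(f"variance (coefficients) = {var_coeffs:.6e} waves^2")
print(f"S (1 - sigma^2)         = {s_approx:.4f}")
print(f"S (pupil integral)      = {s_exact:.4f}")
```

The two variance values should agree to rounding error, while the two Strehl values differ only by higher-order terms in the phase variance.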
References
1. V. N. Mahajan, “Strehl ratio for primary aberrations: some analytical results for circular and annular pupils,” J. Opt. Soc. Am. 72, 1258–1266 (1982), Errata, 10, 2092 (1993); “Strehl ratio for primary aberrations in terms of their aberration variance,” J. Opt. Soc. Am. 73, 860–861 (1983).
2. V. N. Mahajan, “Symmetry properties of aberrated point-spread functions,” J. Opt. Soc. Am. A 11, 1993–2003 (1994).
3. V. N. Mahajan, “Line of sight of an aberrated optical system,” J. Opt. Soc. Am. 2, 833–846 (1985).
4. V. N. Mahajan, “Zernike annular polynomials for imaging systems with annular pupils,” J. Opt. Soc. Am. 71, 75–85 (1981); 71, 1408 (1981); 1, 685 (1984); “Zernike annular polynomials and optical aberrations of systems with annular pupils,” Appl. Opt. 33, 8125–8127 (1994).
5. V. N. Mahajan, “Uniform versus Gaussian beams: a comparison of the effects of diffraction, obscuration, and aberrations,” J. Opt. Soc. Am. A 3, 470–485 (1986).
6. V. N. Mahajan, “Zernike circle polynomials and optical aberrations of systems with circular pupils,” Appl. Opt. 33, 8121–8124 (1994); and “Zernike polynomials and optical aberrations,” Appl. Opt. 34, 8060–8062 (1995).
7. V. N. Mahajan, “Zernike-Gauss polynomials for optical systems with Gaussian pupils,” Appl. Opt. 34, 8057–8059 (1995).
8. V. N. Mahajan, Optical Imaging and Aberrations, Part II: Wave Diffraction Optics, SPIE Press, Bellingham, Washington (2001).
9. V. N. Mahajan, “Zernike polynomials and aberration balancing,” Appl. Opt. 34, 8057–8059 (1995).
Proc. of SPIE Vol. 5173