- Single surface optics and definitions
- Multi-surface systems
- a lens (has two surfaces)
- plane-parallel plate
- Two-mirror telescopes:
- Definitions for multi-surface system: stops and pupils

- Aberrations
- Surface requirements for unaberrated images
- Aberrations: general description and low-order aberrations
- Aberration compensation and different telescope types

- Sources of aberrations
- Ray tracing
- Physical (diffraction) optics
- Adaptive Optics


Because astronomical sources are faint, we need to collect light. We use
telescopes/cameras to make images of astronomical sources. Example: a
20th magnitude star gives about
0.01 photons/s/cm^{2} at 5000 Å through a
1000 Å filter! However, using a 4m telescope gives about 1200 photons/s.
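As a quick check of the numbers above, the collected photon rate is just the flux multiplied by the collecting area. The sketch below assumes an unobscured circular aperture; the function name `photon_rate` is only illustrative.

```python
import math

def photon_rate(photons_per_s_per_cm2, aperture_diameter_m):
    """Photons per second collected by an unobscured circular aperture."""
    radius_cm = 100.0 * aperture_diameter_m / 2.0
    collecting_area_cm2 = math.pi * radius_cm ** 2
    return photons_per_s_per_cm2 * collecting_area_cm2

# 20th magnitude star (0.01 photons/s/cm^2) seen by a 4 m telescope
rate_4m = photon_rate(0.01, 4.0)
print(f"{rate_4m:.0f} photons/s")  # about 1257, consistent with the ~1200 quoted above
```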

Telescopes/optics are the bread and butter tool of the observational astronomer, so it is worthwhile to be familiar with how they work.

We will define an optical system as a system which collects light;
usually, the system will also make images. This requires the bending
of light rays, which is accomplished using lenses (refraction) and/or
mirrors (reflection), using *curved surfaces*.

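The operation of refractive optical systems is given by Snell's law of refraction:

$n \sin i = n' \sin i'$

where $i$ and $i'$ are the angles of incidence and refraction, measured from the surface normal, in media with indices of refraction $n$ and $n'$.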

An optical element takes a source at *s* and makes an image at *s'*. The
image can be *real* or *virtual*. A real image exists at
some point in space; a virtual image is formed at a location from which
light rays appear to emanate, or toward which they appear to converge, but
where no light actually arrives. For example, in a Cassegrain telescope,
the image formed by the primary is virtual, because the secondary intercepts
the light and redirects it before it reaches the focus of the primary.

Considering an azimuthally symmetric optic, we can define the optical
axis to go through the center of the optic.
The image made by the optic will not necessarily be a perfect image: rays at
different height at the surface, *y*, might not cross at the same point.
This is the subject of aberrations, which we will get into in a while. For
a "smooth" surface, the amount of aberration will depend on how much
the different rays differ in *y*, which depends on the shape of the surface.
We define *paraxial* and *marginal* rays, as rays near
the center of the aperture and those on the edge of the aperture. We
define the *chief ray* as the ray that passes through the center
of the aperture. To define nominal (unaberrated) quantities,
we consider the *paraxial* regime, i.e. a small region near the
optical axis, surrounding the chief ray. In this regime, all angles are small,
aberrations vanish, and a surface can be wholly specified by its radius
of curvature R.

The *field angle* gives the angle formed between the chief ray
from an object and the z-axis. Note that paraxial does not necessarily
mean a field angle of zero; one can have an object at a field angle and still
consider the paraxial approximation.

Note also that for the time being, we are ignoring
*diffraction*. But we'll get back to that too. We are considering
*geometric* optics, which is what you get from diffraction as
wavelength tends to 0. For nonzero wavelength, geometric optics applies
at scales $x \gg \lambda$.

We can derive the basic relation between object and image location as a function of a surface where the index of refraction changes (Schroeder, chapter 2).

$\frac{n'}{s'} - \frac{n}{s} = \frac{n' - n}{R}$

The points at *s* and *s'* are called *conjugate*; the behavior
is independent of which direction the light is going.
If either *s* or *s'* is at infinity
(true for astronomical sources for *s*), the other distance is defined as the
*focal length*, *f*, of the optical element. For $s = \infty$, *f* = *s'*.

We can define the quantity on the right side of the equation, which depends
only on the surface parameters (not the image or object locations), as
the *power*, *P*, of the surface:

$P \equiv \frac{n' - n}{R}$

We can make a similar derivation for the case of reflection:

$\frac{1}{s'} + \frac{1}{s} = \frac{2}{R}$

This shows that the focal length for a mirror is given by *R*/2.

Note that one can treat reflection by considering refraction with *n'* = - *n*,
and get the same result:

$\frac{1}{s'} + \frac{1}{s} = \frac{2}{R}$
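The single-surface relation can be turned into a small solver. This is a minimal sketch, assuming the relation $n'/s' - n/s = (n'-n)/R$ with all distances measured from the surface vertex; the function name `image_distance` and the sample numbers are illustrative. It also shows the trick of treating a mirror as a refracting surface with *n'* = -*n*, which returns a focal length of *R*/2.

```python
import math

def image_distance(s, R, n=1.0, n_prime=1.0):
    """Solve n'/s' - n/s = (n' - n)/R for the image distance s'.

    Pass s = -math.inf (or math.inf) for an object at infinity.
    """
    n_prime_over_s_prime = (n_prime - n) / R + n / s   # n/s vanishes for infinite s
    if n_prime_over_s_prime == 0.0:
        return math.inf                                # outgoing beam is collimated
    return n_prime / n_prime_over_s_prime

# Refraction at an air-to-glass surface, object at infinity: s' = n' R / (n' - n)
f_glass = image_distance(-math.inf, R=10.0, n=1.0, n_prime=1.5)

# Reflection treated as refraction with n' = -n: the focal length is R/2
f_mirror = image_distance(-math.inf, R=-10.0, n=1.0, n_prime=-1.0)

print(f_glass, f_mirror)
```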

Given the focal length, we define the *focal ratio*
to be the focal length divided by the
aperture diameter. The focal ratio is also called the *F-number* and
is denoted by the abbreviation *f*/. Note *f*/10 means a focal ratio of
ten; *f* is not a variable in this! The focal ratio gives the
beam "width"; systems with a small focal ratio have a short focal
length compared with the diameter and hence the incoming beam to the
image is wide. Systems with small focal ratios are called "fast" systems;
systems with large focal ratios are called "slow" systems.

The *magnification* of a system gives the ratio of the
image height to the object height:

$m \equiv \frac{h'}{h} = \frac{s' - R}{s - R} = \frac{n s'}{n' s}$

The magnification is negative for this case, because the image is inverted
relative to the object. The magnification is also negative for reflection,
where *n'* = -*n*. Magnification is an important quantity for multi-element
systems.

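We define the *scale* as the motion of the image for a given incident angle
of a parallel beam from infinity. From a consideration of the chief rays for
objects on-axis and at field angle $\theta$, we get:

$\tan\theta \approx \theta = \frac{x}{f}$

or

scale $\equiv \frac{\theta}{x} = \frac{1}{f}$

where *x* is the displacement of the image in the focal plane.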

In other words, the scale, in units of angular motion per physical
motion in the focal plane, is given by 1/*f*. For a fixed aperture
diameter, systems with a small focal ratio (smaller focal length)
have a larger scale, i.e. more light in a patch of fixed physical
size: hence, these are "faster" systems.
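The scale is easy to evaluate numerically. The sketch below assumes scale = 1/*f* in radians per unit length and converts it to arcseconds per millimeter and per pixel; the telescope (2 m aperture, *f*/8) and the 24 micron pixel size are hypothetical values chosen only for illustration.

```python
ARCSEC_PER_RADIAN = 206264.806

def plate_scale_arcsec_per_mm(aperture_diameter_m, focal_ratio):
    """Scale 1/f, converted from radians per mm to arcseconds per mm."""
    focal_length_mm = 1000.0 * aperture_diameter_m * focal_ratio
    return ARCSEC_PER_RADIAN / focal_length_mm

def pixel_scale_arcsec(aperture_diameter_m, focal_ratio, pixel_size_um):
    """Angle on the sky subtended by one detector pixel."""
    scale = plate_scale_arcsec_per_mm(aperture_diameter_m, focal_ratio)
    return scale * pixel_size_um / 1000.0

# Hypothetical 2 m f/8 telescope with 24 micron pixels
scale = plate_scale_arcsec_per_mm(2.0, 8.0)
pixel = pixel_scale_arcsec(2.0, 8.0, 24.0)
print(f"{scale:.2f} arcsec/mm, {pixel:.3f} arcsec/pixel")
```

The same two functions can be applied to any aperture, focal ratio, and pixel size.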

Exercise: the APO 3.5m telescope is an f/10 system. The ARCTIC imager has 15 micron pixels. What angle in the sky does one pixel subtend? Once you get this, comment on whether you think this is a good pixel scale, and why or why not.

To combine surfaces, one just takes the image from the first surface as the source for the second surface, etc., for each surface. We can generally describe the basic parameters of multi-surface systems by equivalent single-surface parameters, e.g. you can define an effective focal length of a multi-surface system as the focal length of some equivalent single-surface system. The effective focal length is the focal length of the first element multiplied by the magnification of each subsequent element. The two systems (single and multi) are equivalent in the paraxial approximation ONLY.

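Consider a lens of index *n* in air (index $\approx 1$). The first surface gives

$\frac{n}{s_1'} - \frac{1}{s_1} = \frac{n - 1}{R_1} = P_1$

The second surface gives:

$\frac{1}{s_2'} - \frac{n}{s_2} = \frac{1 - n}{R_2} = P_2$

but we have $s_2 = s_1' - d$, where *d* is the thickness of the lens.
After some algebra, we find the effective focal length (from center of lens), which in the thin-lens limit ($d \to 0$) is:

$\frac{1}{f} = \frac{n - 1}{R_1} + \frac{1 - n}{R_2}$

For a lens of finite thickness, the right side gains the additional term $-(d/n) P_1 P_2$.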

A plane-parallel plate has zero power, but it moves the image
along the optical axis by
$\Delta = d[1 - (1/n)]$, where *d* is the thickness of the plate. Application to filters: variation of focus.
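The recipe of taking the image from one surface as the object for the next can be sketched in a few lines. This is an illustrative sketch, assuming the single-surface relation $n'/s' - n/s = (n'-n)/R$ and $s_2 = s_1' - d$; the function names and sample values are hypothetical. Note that `lens_back_focus` returns the focus measured from the second surface, which equals the focal length only in the thin-lens limit.

```python
import math

def surface_image(s, R, n, n_prime):
    """Image distance from n'/s' - n/s = (n' - n)/R; R = math.inf is a flat surface."""
    inv = (n_prime - n) / R + n / s
    return math.inf if inv == 0.0 else n_prime / inv

def lens_back_focus(R1, R2, n, d):
    """Focus of a collimated beam, measured from the second surface of a lens in air."""
    s1_prime = surface_image(-math.inf, R1, 1.0, n)  # image formed by the first surface
    s2 = s1_prime - d                                # becomes the object for the second
    return surface_image(s2, R2, n, 1.0)

def plate_focus_shift(s, n, d):
    """Shift of a focus that would lie a distance s past the front face of a flat plate."""
    s1_prime = surface_image(s, math.inf, 1.0, n)
    s2_prime = surface_image(s1_prime - d, math.inf, n, 1.0)
    return (d + s2_prime) - s   # new minus old focus position, both from the front face

thin = lens_back_focus(100.0, -100.0, 1.5, 0.0)    # thin-lens limit
thick = lens_back_focus(100.0, -100.0, 1.5, 10.0)  # 10 units of glass shortens the back focus
shift = plate_focus_shift(50.0, 1.5, 6.0)          # equals d (1 - 1/n)
print(thin, thick, shift)
```

In the thin-lens limit the result agrees with $1/f = (n-1)/R_1 + (1-n)/R_2$, and the plate shift reproduces $d[1 - (1/n)]$, which is why adding or swapping a filter requires refocusing.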

In astronomy, most telescopes are two-mirror telescopes of Newtonian,
Cassegrain, or Gregorian design. All 3 types have a concave primary.
The Newtonian has a flat secondary, the Cassegrain a convex secondary,
and the Gregorian a concave secondary. The Cassegrain is the most
common for research astronomy; it is more compact than a Gregorian
and allows for magnification by the secondary. Basic parameters
are outlined
here.
Each of these telescope types defines a *family* of telescopes
with different first-order performances. From the usage/instrumentation point
of view, important quantities are:

- the diameter of the primary, which defines the light collecting power
- the scale of the telescope, which is
related to the focal length of the primary and the magnification of the
secondary:
*f*_{eff} = *f*_{1}*m*
- the back focal distance, which is the distance of the focal plane behind the telescope

From the design point of view, we need to specify:

- the radii of curvature of the mirrors
- the separation between the mirrors

The relation between the usage and design parameters can be derived from simple geometry. First, accept some basic definitions:

- ratio of focal lengths, $\rho$: $\rho = R_2/R_1 = f_2/f_1$
- magnification of the secondary, *m* (beware that *s*_{2}' is negative for a Cassegrain!): *m* = -*s*_{2}'/*s*_{2}
- *back focal distance*, the distance from the primary vertex to the focal plane (often expressed in units of the primary focal length, or primary diameter): $f_1 \beta = D \eta$
- primary focal ratio, *F*_{1}: *F*_{1} = *f*_{1}/*D*
- ratio of marginal ray heights, *k* (directly related to separation of mirrors): *k* = *y*_{2}/*y*_{1}

Using some geometry, we can derive some basic relations between these quantities, in particular:

$\rho = \frac{mk}{m - 1}$

and

$(1 + \beta) = k(m + 1)$

Usually, *f*_{1} is limited by technology/cost. One then chooses *m*
to match the desired scale. *k* is related to the separation of the mirrors, and
is a compromise between making the telescope shorter and blocking out more
light vs. longer and blocking less light; in either case, one has to keep the
focal plane behind the primary!
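To see how the usage and design parameters connect, here is a minimal sketch. It assumes the relations $(1+\beta) = k(m+1)$ and $\rho = mk/(m-1)$, and a mirror separation of $(1-k) f_1$, which follows from similar triangles on the marginal ray. The function name and the sample values (an 8 m primary focal length, *m* = 4, $\beta$ = 0.25) are illustrative only.

```python
def two_mirror_design(f1, m, beta):
    """Paraxial design parameters of a two-mirror telescope.

    f1   : primary focal length
    m    : secondary magnification (positive for Cassegrain, negative for Gregorian)
    beta : back focal distance in units of f1
    """
    k = (1.0 + beta) / (m + 1.0)       # from (1 + beta) = k (m + 1)
    rho = m * k / (m - 1.0)            # rho = R2 / R1 = f2 / f1
    return {
        "k": k,
        "rho": rho,
        "f_eff": m * f1,               # system focal length
        "separation": (1.0 - k) * f1,  # primary-secondary spacing
        "f2": rho * f1,                # secondary focal length
    }

cassegrain = two_mirror_design(f1=8.0, m=4.0, beta=0.25)
gregorian = two_mirror_design(f1=8.0, m=-4.0, beta=0.25)
print(cassegrain)
print(gregorian)
```

For the same primary, the Gregorian comes out with a larger mirror separation than the Cassegrain, which is the length penalty mentioned in the text.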

One final thing to note is how we focus a Cassegrain telescope. Most instruments are placed at a fixed location behind the primary. Ideally, this will be at the back focal distance, and everything should be set as designed. However, sometimes the instrument may not be exactly at the correct back focal distance, or it might move slightly because of thermal expansion/contraction. In this case, focussing is usually then done by moving the secondary mirror.

The amount of image motion for a given secondary motion follows from the relation between $\beta$, *k*, and *m*:

$\beta = k(m + 1) - 1$

Working through the relations above (with $\rho$ held fixed), this gives:

$\frac{d\beta}{dk} = m^2 + 1$

so the amount of focal plane motion ($f_1\,d\beta$) is $(m^2 + 1)$ times the amount of secondary motion ($f_1\,dk$).
If you move the secondary you change *k*. Since $\rho$ is fixed by the
mirror shapes, it's also clear that you change the magnification as
you move the secondary; this is expected since you are changing the
system focal length, *f* = *mf*_{1}. So it's possible that a given instrument
could have a slightly varying scale if its position is not perfectly
fixed relative to the primary. Alternatively, if you need to independently
focus and set the scale (e.g., SDSS!), then you need to be able to move
two things!

Note that even if the instrument is at exactly the back focal distance, movement of the secondary is required to account for mechanical changing of spacing between the primary and secondary as a result of thermal expansion/contraction.
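The sensitivity of the focus to secondary motion can be checked numerically. The sketch below assumes $m = \rho/(\rho - k)$ (a rearrangement of the relation between $\rho$, *m*, and *k*) and $\beta = k(m+1) - 1$, holds $\rho$ fixed, and differentiates $\beta$ with respect to *k* by finite differences. The sample values $\rho = 1/3$ and *k* = 0.25 are illustrative.

```python
def back_focal_distance(k, rho):
    """beta as a function of k for fixed rho (i.e. fixed mirror shapes)."""
    m = rho / (rho - k)          # magnification changes as the secondary moves
    return k * (m + 1.0) - 1.0

rho = 1.0 / 3.0
k0 = 0.25
m0 = rho / (rho - k0)            # magnification at the nominal position (4 here)

step = 1e-6
slope = (back_focal_distance(k0 + step, rho)
         - back_focal_distance(k0 - step, rho)) / (2.0 * step)

print(slope, m0 ** 2 + 1.0)      # the two numbers should agree closely
```

With *m* = 4, a 1 micron motion of the secondary moves the focal plane by about 17 microns.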

- aperture stop: determines the amount of light reaching an image
(usually the primary mirror)
- field stop: determines the angular size of the field. This is
usually the detector, but for a large enough detector, it could be
the secondary.
- pupil: location where rays from all field angles fill the same aperture.
- entrance pupil: image of aperture stop as seen from source object
(usually the primary).
- exit pupil: image of aperture stop formed by all subsequent optical elements.

In a two-mirror telescope, the location of the exit pupil is where the
image of the primary is formed by the secondary. This can be calculated
using *s* = *d* as the object distance (where *d* is the separation of the
mirrors), then with the reflection equation, we can solve for *s'* which
gives the location of the exit pupil relative to the secondary mirror.
If one defines the quantity $\delta$, such that
$f_1 \delta$ is the
distance between the exit pupil and the focal plane, then (algebra not
shown):

$\delta = \frac{m^2 k}{m + k - 1} = \frac{m^2 (1 + \beta)}{m^2 + \beta}$

This pupil is generally not accessible, so if one needs access to a pupil, additional optics are used.
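A short numerical check of the exit pupil location, assuming the two standard closed forms $\delta = m^2 k/(m + k - 1)$ and $\delta = m^2(1+\beta)/(m^2+\beta)$ together with $(1+\beta) = k(m+1)$. The function names and sample values are illustrative.

```python
def exit_pupil_delta_from_k(m, k):
    """Exit pupil to focal plane distance, in units of f1, in terms of m and k."""
    return m ** 2 * k / (m + k - 1.0)

def exit_pupil_delta_from_beta(m, beta):
    """The same quantity written in terms of m and the back focal distance beta."""
    return m ** 2 * (1.0 + beta) / (m ** 2 + beta)

m, k = 4.0, 0.25
beta = k * (m + 1.0) - 1.0       # consistent back focal distance
delta_k = exit_pupil_delta_from_k(m, k)
delta_beta = exit_pupil_delta_from_beta(m, beta)
print(delta_k, delta_beta)       # both forms should give the same value
```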

The exit pupil is an important concept. When we discuss aberrations, it is the total wavefront error at the exit pupil which gives the system aberration. Pupils are important for aberration compensation. They can also be used to put light at a location that is independent of pointing errors.

Next we consider non-paraxial rays. We first consider what surface is required to make an unaberrated image.

We can derive the surface using Fermat's principle.
Fermat's principle states that light travels along a path such that
infinitesimally small variations in the path do not change the travel
time to first order: the travel time is stationary, d(time)/d(path) = 0.
For a single surface, this reduces to the statement that
light travels the path which takes the least time.
An alternate way of stating Fermat's
principle is that the *optical path length* (OPL) is unchanged to first
order for a small change in path. The OPL is given by:

$\mathrm{OPL} = \int c\,dt = \int \frac{c}{v}\,ds = \int n\,ds$

Fermat's principle has a physical interpretation when one considers the wave nature of light. Around a stationary point of the optical path length, the maximum amount of light can be accumulated over different paths with a minimum of destructive interference. By the wave theory, light travels over all possible paths, but the light coming over the "wrong" paths destructively interferes, and only the light coming over the "right" path constructively interferes.

Fermat's principle can be used to derive the basic laws of reflection and refraction (Snell's law).
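As an illustration of how Snell's law follows from Fermat's principle, the sketch below searches numerically for the point on a flat interface that minimizes the optical path between two fixed points, then compares $n \sin i$ on the two sides. The geometry (source at (0, 1), target at (1, -1), interface along *y* = 0) and the indices are arbitrary choices for the example.

```python
import math

def optical_path(x, n1, n2, source=(0.0, 1.0), target=(1.0, -1.0)):
    """Optical path length via the point (x, 0) on a flat interface at y = 0."""
    (xs, ys), (xt, yt) = source, target
    return n1 * math.hypot(x - xs, ys) + n2 * math.hypot(xt - x, yt)

def fastest_crossing(n1, n2, lo=0.0, hi=1.0, iterations=200):
    """Ternary search for the crossing point with the least travel time."""
    for _ in range(iterations):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if optical_path(a, n1, n2) < optical_path(b, n1, n2):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)

n1, n2 = 1.0, 1.5
x = fastest_crossing(n1, n2)
sin_i = x / math.hypot(x, 1.0)                        # incidence angle, from the normal
sin_r = (1.0 - x) / math.hypot(1.0 - x, 1.0)          # refraction angle
print(n1 * sin_i, n2 * sin_r)                         # equal, to numerical precision
```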

Now consider a perfect imaging system that takes all rays from an object point and makes them all converge to an image point. Since Fermat's principle says the only paths taken will be those for which the OPL is unchanged to first order for small changes in path, the only way a perfect image will be formed is when all optical path lengths along a surface between an image and object point are the same; otherwise the light doesn't get to this point!

Instead of using Fermat's principle, we could solve for the parameters of a perfect surface using analytic geometry, but this would require an inspired guess for the correct functional form of the surface.

We find that the perfect surface depends on the situation: whether the
light comes from a source at finite or infinite distance, and whether
the mirror is concave or convex. We consider the various cases now,
quoting the results without actually doing the geometry. In all cases,
consider the z-axis to be the optical axis, with the y-axis running
perpendicular. We want to know the shape of the surface, *y*(*z*), that
gives a perfect image.

*Concave mirror with one conjugate at infinity*

Sample application: primary mirror of telescope looking at stars.

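Fermat's principle gives:

$y^2 = 2Rz$

i.e. a parabola with vertex radius of curvature *R* and focal length *R*/2. This is perfect only for field angle = 0.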

*Concave mirror with both conjugates at finite distance*

Sample application: Gregorian secondary looking at image formed by primary.

For a concave mirror with both conjugates finite, we get an ellipse. Again, this is perfect only for field angle = 0.

(*z* - *a*)^{2}/*a*^{2} + *y*^{2}/*b*^{2} = 1

*Convex mirror with both conjugates at finite distance*

Sample application: Cassegrain secondary looking at image formed by primary.

For a convex mirror with both conjugates finite, we get a hyperbola:

(*z* - *a*)^{2}/*a*^{2} - *y*^{2}/*b*^{2} = 1

*Convex mirror with one conjugate at infinity*

For a convex mirror with one conjugate at infinity, we get a parabola.

*2D to 3D*

Note that in all cases we've considered a one-dimensional curve.
We can generalize to 2D surfaces by rotating around the z-axis; for the
equations, simply replace *y*^{2} with
(*x*^{2} + *y*^{2}).

*Conic sections*

As you may recall from analytic geometry, all of these figures are
*conic sections*, and it is possible to describe all of these
figures with a single equation:

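$\rho^2 - 2Rz + (1 + K)z^2 = 0$

where $\rho^2 = x^2 + y^2$ (here $\rho$ is the radial distance from the optical axis, not the ratio of focal lengths used for two-mirror telescopes), *R* is the radius of curvature at the vertex, and *K* is the *conic constant*:

*K* > 0 gives an oblate ellipsoid

*K* = 0 gives a sphere

-1 < *K* < 0 gives a prolate ellipsoid

*K* = -1 gives a paraboloid

*K* < -1 gives a hyperboloid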
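The conic equation can be solved for the surface height (the sag) at a given radius. The sketch below assumes the conic equation $\rho^2 - 2Rz + (1+K)z^2 = 0$ and picks the root that passes through the vertex; the function name `conic_sag` and the sample values are illustrative.

```python
import math

def conic_sag(rho, R, K):
    """Sag z(rho) of a conic surface: the root of rho^2 - 2 R z + (1 + K) z^2 = 0
    that goes to zero at the vertex (valid for either sign of R)."""
    return rho ** 2 / (R * (1.0 + math.sqrt(1.0 - (1.0 + K) * rho ** 2 / R ** 2)))

def conic_residual(rho, z, R, K):
    """Left side of the conic equation; zero for a point on the surface."""
    return rho ** 2 - 2.0 * R * z + (1.0 + K) * z ** 2

R, rho = 2.0, 0.1
for name, K in [("sphere", 0.0), ("paraboloid", -1.0), ("hyperboloid", -2.0)]:
    z = conic_sag(rho, R, K)
    print(name, z, conic_residual(rho, z, R, K))
```

For the paraboloid the sag reduces to $\rho^2/(2R)$; the other conics differ from it first at order $\rho^4$.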

Now consider what happens for surfaces that are not perfect, e.g. for the cases considered above at field angle $\neq 0$ (since only a sphere is symmetric for all field angles), or at field angle = 0 for a conic surface which doesn't give a perfect image?

You get *aberrations*; the light from all locations in aperture
does **not** land at **any** common point.

One can consider aberrations in either of two ways:

- aberrations arise from all rays not landing at a common point,
- aberrations arise because wavefront deviates from a spherical wavefront.

In general, the angular and transverse aberrations can be determined from the optical path difference (OPD) between a given ray and that of a spherical wavefront. The relations are given by:

angular aberration = $\frac{d(\mathrm{OPD})}{d\rho}$

transverse aberration = $s' \frac{d(\mathrm{OPD})}{d\rho}$

If the aberrations are not symmetric in the pupil, then we could define
angular and transverse aberrations separately in the *x* and *y* directions.

*Spherical aberration*

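First, consider the axisymmetric case of looking at an object on
axis (field angle equal to zero) with an optical element that is a conic
section. We can consider where rays land as a function of the height $\rho$
at which they strike the surface, and derive the
effective focal length, $f_e(\rho)$, for an arbitrary conic section.
A ray parallel to the axis is deviated by $2\phi$ on reflection, where $\phi$ is the slope angle of the surface:

$\tan\phi = \frac{dz}{d\rho}$

so it crosses the axis at $f_e = z + \rho/\tan 2\phi$. Taking the slope from the conic equation:

$\rho^2 - 2Rz + (1 + K)z^2 = 0$

gives, to fourth order in $\rho$:

$f_e(\rho) = \frac{R}{2} - \frac{(1+K)\rho^2}{4R} - \frac{(1+K)(3+K)\rho^4}{16R^3} - \ldots$

Note that $f_e$ is independent of $\rho$ only for *K* = -1, a parabola. Also
note that $f_e$ is symmetric with respect to $\rho$.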

We define spherical aberration as the aberration resulting from $K \neq -1$.
Rays from different radial positions in the entrance aperture focus at
different locations. It is an aberration which is present on axis as seen
here.
Spherical aberration is symmetric in the pupil. There is no location
in space where all rays focus at a point. Note that the behavior (image size) as
a function of focal position is not symmetric. One can define several
criteria for where the "best focus" might be, leading to the terminology
paraxial focus, marginal focus, diffraction focus, and the circle of
least confusion.
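Spherical aberration can be seen directly by tracing rays parallel to the axis onto a concave conic mirror and finding where each crosses the axis. This is a minimal sketch, assuming the conic equation above with *R* > 0 and the law of reflection (a ray is deviated by twice the surface slope angle); it compares the exact crossing point with the lowest-order estimate $R/2 - (1+K)\rho^2/(4R)$. Function names and sample values are illustrative.

```python
import math

def axis_crossing(rho, R, K):
    """Distance from the vertex at which a ray parallel to the axis, striking a
    concave conic mirror at height rho, crosses the axis after reflection."""
    z = rho ** 2 / (R * (1.0 + math.sqrt(1.0 - (1.0 + K) * rho ** 2 / R ** 2)))  # sag
    slope = rho / (R - (1.0 + K) * z)              # dz/drho = tan(phi)
    tan_2phi = 2.0 * slope / (1.0 - slope ** 2)    # reflected ray makes angle 2 phi with axis
    return z + rho / tan_2phi

def lowest_order_focus(rho, R, K):
    """Focal position including only the rho^2 term of the series."""
    return R / 2.0 - (1.0 + K) * rho ** 2 / (4.0 * R)

R = 2.0
for rho in (0.02, 0.05, 0.1):
    sphere = axis_crossing(rho, R, 0.0)
    parabola = axis_crossing(rho, R, -1.0)
    print(rho, sphere, lowest_order_focus(rho, R, 0.0), parabola)
```

For the parabola every ray crosses at *R*/2, while for the sphere the marginal rays cross closer to the mirror than the paraxial rays.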

The asymmetric nature of spherical aberration as a function of focal
position distinguishes it from other aberrations and is a useful diagnostic
for whether a system has this aberration. This is shown in
this figure
which shows a sequence of images at different focal positions in the
presence of spherical aberration.
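We define *transverse spherical aberration* (TSA) as the image size
at paraxial focus. This is not the location of the minimum image size.
To lowest order, a ray from height $\rho$ misses the paraxial focus by

$|\mathrm{TSA}| = \frac{|1+K|\,\rho^3}{2R^2}$

The difference in angle between the "perfect" ray from the parabola and
the actual ray is called the *angular aberration*, in this case
*angular spherical aberration*, or ASA.

This is simply related to the transverse aberration: $\mathrm{TSA} = \frac{R}{2}\,\mathrm{ASA}$, so that $|\mathrm{ASA}| = |1+K|\,\rho^3/|R|^3$.

We can also consider aberration as the difference between our wavefront and a spherical wavefront, which in this case is the wavefront given by a parabolic surface. For a mirror, the optical path difference is $2\Delta z$, where $\Delta z = (1+K)\rho^4/(8R^3)$ is the difference in surface height between the conic and the parabola, and the ASA is the derivative of $2\Delta z$ with respect to $\rho$.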

This result can be generalized to any sort of aberration: the angular and transverse aberrations can be determined from the optical path difference (OPD) between a given ray and that of a spherical wavefront. The relations are given by:

angular aberration = $\frac{d(\mathrm{OPD})}{d\rho}$

transverse aberration = $s' \frac{d(\mathrm{OPD})}{d\rho}$

If the aberrations are not symmetric in the pupil, then we could define
angular and transverse aberrations separately in the *x* and *y* directions.

*General aberration description*

We can describe deviations from a spherical wavefront generally. Since
all we care about are optical path *differences*, we write an
expression for the optical path difference between an arbitrary ray
and the chief ray, and in doing this, we can also include the
possibility of an off-axis image, and get

$\mathrm{OPD} = A_0 y + A_1 y^2 + A_1' x^2 + A_2 y^3 + A_2' x^2 y + A_3 \rho^4$

where (*x*, *y*) is the position of the ray in the pupil, $\rho^2 = x^2 + y^2$, and the coefficients depend on the field angle $\theta$.

Note that rays along the y-axis are called *tangential* rays, while
rays along the x-axis are called *sagittal* rays.

Analytically, people generally restrict themselves to talking about
*third-order* aberrations, which are fourth-order (in powers of
*x*, *y*, $\rho$, or $\theta$) in the optical path difference, because of the
derivative we take to get transverse or angular aberrations. In the
third-order limit, one finds that
*A*_{2} = *A*_{2}', and
*A*_{1} = -*A*_{1}'.
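Working out the geometry, we find for a mirror that:

$A_1 = -\frac{n\theta^2}{R}, \quad A_2 = -\frac{n\theta}{R^2}\left(\frac{m+1}{m-1}\right), \quad A_3 = \frac{n}{4R^3}\left[K + \left(\frac{m+1}{m-1}\right)^2\right]$

where *n* is the index of the medium in front of the mirror and *m* is the magnification of the surface (*m* = 0 for an object at infinity).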

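From the general expression, we can derive the angular or the transverse aberrations in either the x or y direction, by differentiating the optical path difference with respect to *x* or *y*. Considering the aberrations in the two separate directions, we find (using *A*_{2} = *A*_{2}', and to within a sign convention):

$AA_y = 2A_1 y + A_2(x^2 + 3y^2) + 4A_3 y\rho^2$

$AA_x = 2A_1' x + 2A_2 xy + 4A_3 x\rho^2$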

The first term is proportional to
$\theta^2 y$ and is called *astigmatism*.
The second term is proportional to
$\theta(x^2 + 3y^2)$ and is called
*coma*. The final term, proportional to $y\rho^2$, is *spherical
aberration*, which we've already discussed (note for spherical,
*AA*_{x} = *AA*_{y}
and in fact the AA in any direction is equal, hence the aberration is
circularly symmetric).

*Astigmatism*

For astigmatism, rays from opposite sides of the pupil focus in different
locations relative to the paraxial rays. At the paraxial focus, we end
up with a circular image. As you move away from this image location, you
move towards the tangential focus in one direction and the sagittal focus
in the other direction. At either of these locations, the astigmatic image
looks like an elongated ellipse. Astigmatism goes as $\theta^2$, and consequently looks
the same for opposite field angles. Astigmatism is characterized in the
image plane by the *transverse* or *angular* astigmatism
(TAS or AAS), which refer to the height of the marginal rays at the
paraxial focus. Astigmatism is symmetric around zero field angle.

This figure
shows the rays
in the presence of astigmatism.

This figure
shows the behavior of astigmatism as one passes through paraxial focus.

*Coma*

For coma, rays from opposite sides of the pupil focus at the same focal distance.
However, the tangential rays focus at a different location than the
sagittal rays, and neither of these focus at the paraxial focus. The net
effect is to make an image that vaguely looks like a comet, hence the
name coma. Coma goes as $\theta$, so the direction of the comet
flips sign for opposite field angles. Coma is characterized by either
the *tangential* or *sagittal transverse/angular coma* (TTC, TSC,
ATC, ASC) which
describe the height/angle of either the tangential or sagittal marginal rays
at the paraxial focus:
*TTC* = 3*TSC*.

This figure
shows the rays
in the presence of coma.

This figure
shows the behavior of coma as one passes through paraxial focus.

In fact, there are two more third-order aberrations:
*distortion* and *field curvature*. Neither
affects image quality, only location (unless you are forced to use a
flat image plane!). Field curvature gives a curved focal plane: if imaging
onto a flat detector, this will lead to focus deviations as one goes off-axis.
Distortion affects the location of images in the focal plane, and
goes as $\theta^3$. The amount of field curvature and distortion
can be derived from the aberration coefficients and the mirror parameters.

We can also determine the relevant coefficients for a surface with a displaced stop (Schroeder p 77), or for a surface with a decentered pupil (Schroeder p 89-90); it's just more geometry and algebra. With all these relations, we can determine the optical path differences for an entire system: for a multi-surface system, we just add the OPDs as we go from surface to surface. The final aberrations can be determined from the system OPD.

Using the techniques above, we can write expressions for the system aberrations as a function of the surface figures (and field angles). If we give ourselves the freedom to choose surface figures, we can eliminate one (or more) aberrations.

For example, given a conic constant of the primary mirror, *K*_{1},
we can use the aberration
relations to determine *K*_{2} such that spherical aberration is zero;
this will give us perfect images on-axis. We find that:

$K_2 = -\left(\frac{m+1}{m-1}\right)^2 + \frac{m^3}{k(m-1)^3}(K_1 + 1)$

If we allow ourselves the freedom to choose both *K*_{1} and *K*_{2}, we
can eliminate both spherical aberration and coma. Designs of this
sort are called *aplanatic*. The relevant expressions, in terms
of the magnification and back focal distance (we could use the relations
discussed earlier to present these in terms of other paraxial parameters), are:

$K_1 = -1 - \frac{2(1+\beta)}{m^2(m-\beta)}$

$K_2 = -\left(\frac{m+1}{m-1}\right)^2 - \frac{2m(m+1)}{(m-\beta)(m-1)^3}$

We can only eliminate two aberrations with two mirrors, so even this telescope will be left with astigmatism.
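The conic constants for the classical and aplanatic designs are simple to evaluate. The sketch below assumes the standard third-order expressions (as given, for example, in Schroeder) for the secondary conic constant that cancels spherical aberration and for the aplanatic pair; the function names and the sample values *m* = ±4, $\beta$ = 0.25 are illustrative.

```python
def secondary_conic_for_zero_sa(K1, m, k):
    """K2 that cancels third-order spherical aberration for a given primary K1."""
    return -((m + 1.0) / (m - 1.0)) ** 2 + m ** 3 * (K1 + 1.0) / (k * (m - 1.0) ** 3)

def aplanatic_conics(m, beta):
    """(K1, K2) that cancel both spherical aberration and coma."""
    K1 = -1.0 - 2.0 * (1.0 + beta) / (m ** 2 * (m - beta))
    K2 = (-((m + 1.0) / (m - 1.0)) ** 2
          - 2.0 * m * (m + 1.0) / ((m - beta) * (m - 1.0) ** 3))
    return K1, K2

beta = 0.25
for label, m in [("Cassegrain", 4.0), ("Gregorian", -4.0)]:
    k = (1.0 + beta) / (m + 1.0)
    classical_K2 = secondary_conic_for_zero_sa(-1.0, m, k)   # parabolic primary
    K1, K2 = aplanatic_conics(m, beta)
    print(label, classical_K2, K1, K2)

rc_K1, rc_K2 = aplanatic_conics(4.0, 0.25)
ag_K1, ag_K2 = aplanatic_conics(-4.0, 0.25)
```

With these sample values, both aplanatic Cassegrain mirrors come out with *K* < -1 (hyperboloids), and both aplanatic Gregorian mirrors with -1 < *K* < 0 (ellipsoids).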

There are two different classes of two-mirror telescopes that allow for
freedom in the shape of both mirrors: Cassegrain
telescopes and Gregorian telescopes (Newtonians have a flat secondary).
For the classical telescope with a
parabolic primary, the Cassegrain secondary is hyperbolic, whereas for
a Gregorian it is ellipsoidal (because of the appropriate conic sections
derived above for convex and concave mirrors with finite conjugates). For
the aplanatic design, the Cassegrain telescope has two hyperbolic mirrors,
while the Gregorian telescope has two ellipsoidal mirrors. An aplanatic
Cassegrain telescope is called a *Ritchey-Chretien* telescope.

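The following table gives some characteristics of "typical" telescopes: classical Cassegrain (CC), classical Gregorian (CG), Ritchey-Chretien (RC), and aplanatic Gregorian (AG). Aberrations are given at a field angle of 18 arc-min in units of arc-seconds. Coma is given in terms of tangential coma.

Characteristics of Two-Mirror Telescopes

| Parameter | CC | CG | RC | AG |
|---|---|---|---|---|
| m | 4.00 | -4.00 | 4.00 | -4.00 |
| k | 0.25 | -0.417 | 0.25 | -0.417 |
| 1 - k | 0.75 | 1.417 | 0.75 | 1.417 |
| mk | 1.000 | 1.667 | 1.000 | 1.667 |
| ATC | 2.03 | 2.03 | 0.00 | 0.00 |
| AAS | 0.92 | 0.92 | 1.03 | 0.80 |
| ADI | 0.079 | 0.061 | 0.075 | 0.056 |
| $\kappa_m R_1$ | 7.25 | -4.75 | 7.625 | -5.175 |
| $\kappa_p R_1$ | 4.00 | -8.00 | 4.00 | -8.00 |

Here ATC, AAS, and ADI are the angular tangential coma, angular astigmatism, and angular distortion, and $\kappa_m$ and $\kappa_p$ are the curvatures of the median and Petzval image surfaces.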

The image quality is clearly better for the aplanatic designs than for
the classical designs, as expected because coma dominates off-axis in
the classical design. In the aplanatic design, the Gregorian is slightly
better. However, when considerations other than just optical quality
are considered, the Cassegrain usually is favored: for the same primary
mirror, the Cassegrain is considerably shorter and thus it is less
costly to build an enclosure and telescope structure. To keep the
physical length the same, the Gregorian would have to have a faster
primary mirror, which is more difficult (i.e. costly) to fabricate, and
which will result in a greater sensitivity to alignment errors. Both
types of telescopes have a *curved* focal plane.
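The coma and astigmatism entries in the table can be reproduced from the standard third-order expressions for two-mirror telescopes (as given, for example, in Schroeder). These expressions are not written out in the text, and the system focal ratio *F* = 10 and back focal distance $\beta$ = 0.25 used below are assumptions, chosen because they reproduce the tabulated values. Treat this as a sketch rather than as the calculation behind the table.

```python
ARCSEC_PER_RADIAN = 206264.806

def classical_aberrations(theta, F, m, beta):
    """Angular tangential coma and astigmatism (radians) for a classical design."""
    atc = 3.0 * theta / (16.0 * F ** 2)
    aas = theta ** 2 / (2.0 * F) * (m ** 2 + beta) / (m * (1.0 + beta))
    return abs(atc), abs(aas)

def aplanatic_aberrations(theta, F, m, beta):
    """The same for an aplanatic design: coma vanishes, astigmatism remains."""
    aas = (theta ** 2 / (2.0 * F)
           * (m * (2.0 * m + 1.0) + beta) / (2.0 * m * (1.0 + beta)))
    return 0.0, abs(aas)

theta = 18.0 * 60.0 / ARCSEC_PER_RADIAN   # 18 arc-min field angle, in radians
F, beta = 10.0, 0.25                      # assumed values, not stated in the text

results = {
    "CC": classical_aberrations(theta, F, 4.0, beta),
    "CG": classical_aberrations(theta, F, -4.0, beta),
    "RC": aplanatic_aberrations(theta, F, 4.0, beta),
    "AG": aplanatic_aberrations(theta, F, -4.0, beta),
}
for name, (atc, aas) in results.items():
    print(name, round(atc * ARCSEC_PER_RADIAN, 2), round(aas * ARCSEC_PER_RADIAN, 2))
```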

So far, we have been discussing aberrations which arise from the optical design of a system when we have a limited number of elements. However, it is important to realize that aberrations can arise from other sources as well. These other sources can give additional third-order aberrations, as well as higher order aberrations. Some possible sources include:

- misfigured or imperfectly figured optics: rarely is an element
made exactly to specification!
- misalignments. If the mirrors in a multiple-element system are not
perfectly aligned, aberrations will result. These can be derived
(third-order) from the aberration expressions for decentered elements. For
two mirror systems, one finds that decentering or tilting the secondary
introduces a *constant* amount of coma over the field. Coma dominates
astigmatism for a misaligned telescope.
- mechanical/support problems. When the mirrors are mounted in mirror
cells, the weight of the mirror is distributed over some support
structures. Because the mirrors are not infinitely stiff, some distortion
of the mirror shape will occur. Generally, such distortion will
change as a function of which way the telescope is pointing. Separate
from this, because the telescope structure itself is not perfectly stiff,
one expects some flexure which gives a different secondary (mis)alignment
as a function of where one is pointing. Finally, one might expect the
spacing between the primary and secondary to vary with temperature,
if the telescope structure is made of materials which have non-zero
coefficients of expansion.
- chromatic aberration. Generally, we've only been discussing mirrors
since this is what is used in telescopes. However, astronomers often put
additional optics (e.g., cameras or spectrographs) behind telescopes which
may use refractive elements rather than mirrors. There are aberration
relations for refractive elements just as we've discussed, but these
have additional dependences on the indices of refraction of the optical
elements. For most refractive elements, the index of refraction varies
with wavelength, so one will get wavelength-dependent aberrations,
called chromatic aberrations. These can be minimized by good choices of
materials or by using combinations of different materials for different
elements; however, it is an additional source of aberration.
- seeing. The earth's atmosphere introduces optical path differences
between the rays across the aperture of the telescope. This is generally
the **dominant** source of image degradation from a ground-based telescope.
Consequently, one builds telescopes in good sites, and as far as design and
other sources of image degradation are concerned, one is generally only
interested in getting these errors small when compared with the smallest
expected seeing errors.

For a fully general calculation of image quality, one does not wish
to be limited to third-order aberrations, nor does one often wish
to work out all of the relations for the complex set of aberrations
which result from all of the sources of aberration mentioned above.
Real world situations also have to deal with *vignetting* in
optical systems, in which certain rays may be blocked by something
and never reach the image plane (e.g., in a two-mirror telescope, the
central rays are blocked by the secondary).

Because of these and other considerations, analysis of optical systems
is usually done using *ray tracing*, in which the parameters of an
optical system are entered into a computer, and the computer calculates
the expected images on the basis of geometric optics. Many programs
exist with many features: one can produce *spot diagrams* which
show the location of rays from across the aperture at an image plane
(or any other location), plots of transverse aberrations, plots of
optical path differences, etc., etc.

( Demo ray trace program. Start with on-axis object, single mirror. Where is focus? What will image look like with spherical mirror? What do we need to do to make it perfect? How does it depend on aperture size? Now how do off-axis images look like? spot diagrams, through focus, ray fan, opd plots, etc. Now introduce second mirror. What determines where focus will be? Magnification? What shape to make a perfect on-axis image? What do off-axis images look like? How do we make them better? Now how is performance? Real 3.5m and 1m prescriptions. Issue: guider. )

Up until now, we have avoided considering the wave nature of light which
introduces *diffraction* from interference of light coming from
different parts of the aperture. Because of diffraction, images of
a point source will be slightly blurred. From simple geometric arguments,
we can estimate the size of the blur introduced from diffraction:

$\theta \sim \lambda / D$

where $\lambda$ is the wavelength and *D* is the diameter of the aperture.

To work out in detail the shape of the images formed from diffraction
involves understanding wave propagation. Basically, one integrates
over all of the source points in the aperture (or exit pupil for
an optical system), determining the contribution of each point at
each place in the image plane. The contributions are all summed taking
into account phase differences at each image point, which cause
reinforcement at some points and cancellation at others. The expression
which sums all of the individual source points is called the
*diffraction integral*. When the details are worked out, one finds
that the intensity in the image plane is related to the intensity
and phase at the exit pupil. In fact the wavefront is described at
any plane by the *optical transfer function*, which gives the
intensity and phase of the wave at all locations in that plane. The
OTF at the pupil plane and at the image plane are a Fourier transform
pair. Consequently, we can determine the light distribution in the
image plane by taking the Fourier transform of the pupil plane;
the light distribution, or point spread function, is just the
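modulus-squared of the OTF at the image plane. Symbolically, we have

$\mathrm{PSF} = \left|\mathrm{OTF}(\mathrm{image})\right|^2 = \left|\mathrm{FT}\left[\mathrm{OTF}(\mathrm{pupil})\right]\right|^2$

where $\mathrm{OTF}(\mathrm{pupil}) = P(x,y)\,\exp[ik\phi(x,y)]$, with $P$ the pupil function (the amplitude transmitted at each point in the pupil), $k = 2\pi/\lambda$, and $\phi$ the phase error expressed as an optical path difference.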

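For the simple case of a plane wave with no phase errors, the diffraction integral can be solved analytically. For a circular aperture with a central obscuration whose fractional radius is $\epsilon$, the expression for the PSF is:

$\mathrm{PSF}(v) = \frac{1}{(1-\epsilon^2)^2}\left[\frac{2J_1(v)}{v} - \epsilon^2\,\frac{2J_1(\epsilon v)}{\epsilon v}\right]^2$

where $J_1$ is the first-order Bessel function and $v = \pi r/(\lambda F)$ for a physical distance *r* in the image plane and focal ratio *F*.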

This expression gives the so-called *Airy pattern* which has a
central disk surrounded by concentric dark and bright rings. For an
unobscured aperture, one finds
that the radius of the first dark ring is at the physical distance
$r = 1.22\lambda F$, or alternatively, the angular distance
$\theta = 1.22\lambda/D$. This gives the size of the *Airy disk*.
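The position of the first dark ring can be checked numerically. This is a minimal sketch using only the standard library: it evaluates $J_1$ from Bessel's integral, forms the normalized amplitude of the (optionally obscured) Airy pattern with $v = \pi r/(\lambda F)$, and locates its first zero by bisection. The obscuration fraction 0.33, the 3.5 m aperture, and the 5000 Å wavelength are illustrative values.

```python
import math

def bessel_j1(x, steps=2000):
    """J1(x) from Bessel's integral (1/pi) int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = sum(math.cos(t - x * math.sin(t))
                for t in (h * (i + 0.5) for i in range(steps)))
    return total * h / math.pi

def airy_amplitude(v, eps=0.0):
    """Normalized amplitude of the Airy pattern; its square is the PSF."""
    if v == 0.0:
        return 1.0
    amplitude = 2.0 * bessel_j1(v) / v
    if eps > 0.0:
        amplitude -= eps ** 2 * 2.0 * bessel_j1(eps * v) / (eps * v)
    return amplitude / (1.0 - eps ** 2)

def first_dark_ring(eps=0.0, lo=2.0, hi=4.5):
    """Bisection for the first zero of the amplitude, in units of v."""
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if airy_amplitude(lo, eps) * airy_amplitude(mid, eps) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

v_clear = first_dark_ring(0.0)
v_obscured = first_dark_ring(0.33)
print(v_clear / math.pi, v_obscured / math.pi)    # ring radius in units of lambda * F

# Angular radius of the Airy disk for a 3.5 m aperture at 5000 Angstroms
theta_arcsec = (v_clear / math.pi) * 5000e-10 / 3.5 * 206264.806
print(theta_arcsec)
```

A central obscuration moves the first dark ring inward, making the core slightly narrower while putting more light into the rings.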

For more complex cases, the diffraction integral is solved numerically by doing a Fourier transform. The pupil function is often more complex than a simple circle, because there are often additional items which block light in the pupil, such as the support structures for the secondary mirror.

This figure shows the Airy pattern, both without obscurations, and with a central obscuration and spiders in a setup typical of a telescope.

In addition, there may be phase errors in the exit pupil, because of
the existence of any one of the sources of aberration discussed above.
For general use, the phase error across the pupil, $\phi$, is often
expressed as a series, where the
expansion is over a set of orthogonal polynomials for the aperture
which is being used. For circular apertures with (or without) a central
obscuration (the case most often found in astronomy), the appropriate
polynomials are called *Zernike* polynomials. The lowest order
terms are just uniform slopes of phase across the pupil, called tilt,
and simply correspond to motion in the image plane. The next terms
correspond to the expressions for the OPD which we found above for
focus, astigmatism, coma, and spherical aberration, generalized to allow
any orientation of the phase errors in the pupil. Higher order terms
correspond to higher order aberrations.
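To make the idea of an orthogonal expansion concrete, the sketch below writes out the first eleven Zernike terms and checks numerically that they are orthonormal over the unit disk. It assumes Noll's ordering and normalization (an assumption, although it matches the $2r\cos$ and $2r\sin$ tilt terms used in these notes) and an unobscured circular pupil; the grid resolution is an arbitrary choice.

```python
import numpy as np

# Low-order Zernike terms, assuming Noll's ordering and normalization.
# r is the radius normalized to the pupil edge, t is the azimuthal angle.
ZERNIKE = {
    1: lambda r, t: np.ones_like(r),                               # constant
    2: lambda r, t: 2 * r * np.cos(t),                             # tilt
    3: lambda r, t: 2 * r * np.sin(t),                             # tilt
    4: lambda r, t: np.sqrt(3) * (2 * r**2 - 1),                   # defocus
    5: lambda r, t: np.sqrt(6) * r**2 * np.sin(2 * t),             # astigmatism
    6: lambda r, t: np.sqrt(6) * r**2 * np.cos(2 * t),             # astigmatism
    7: lambda r, t: np.sqrt(8) * (3 * r**3 - 2 * r) * np.sin(t),   # coma
    8: lambda r, t: np.sqrt(8) * (3 * r**3 - 2 * r) * np.cos(t),   # coma
    9: lambda r, t: np.sqrt(8) * r**3 * np.sin(3 * t),             # trefoil
    10: lambda r, t: np.sqrt(8) * r**3 * np.cos(3 * t),            # trefoil
    11: lambda r, t: np.sqrt(5) * (6 * r**4 - 6 * r**2 + 1),       # spherical
}

def gram_matrix(n_r=400, n_theta=360):
    """Pupil-averaged products <Z_i Z_j> over the unit disk (midpoint rule)."""
    r = (np.arange(n_r) + 0.5) / n_r
    theta = np.arange(n_theta) * 2.0 * np.pi / n_theta
    rr, tt = np.meshgrid(r, theta, indexing="ij")
    # Area element r dr dtheta, divided by the disk area pi.
    weight = rr * (1.0 / n_r) * (2.0 * np.pi / n_theta) / np.pi
    modes = np.array([ZERNIKE[j](rr, tt) for j in sorted(ZERNIKE)])
    return np.einsum("iab,jab,ab->ij", modes, modes, weight)

gram = gram_matrix()
print(np.round(gram, 3))
```

The printed matrix should be close to the identity, which is what allows each coefficient of the expansion to be determined independently of the others.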

This figure shows the form of some of the low order Zernike terms: the first corresponds to focus aberration, the next two to astigmatism, the next two to coma, the next two to trefoil aberration, and the last to spherical aberration.

A wonderful example of the application of all of this was the
diagnosis of spherical aberration in the Hubble Space Telescope. The
aberration has been corrected in subsequent instruments in the telescope,
which introduce spherical aberration of the opposite sign. To perform this
correction, however, required an accurate understanding of the amplitude of
the aberration. This was derived from analysis of on-orbit images, as shown
in this figure.
Note that it is possible in some cases to recover the phase errors
from analysis of images. This is called *phase retrieval*. There
are several ways of trying to do this, some of which are complex, so
we won't go into them, but it's good to know that it is possible. For HST,
an accurate amplitude of spherical aberration was derived from these
images. This derived value was later found to correspond almost exactly
to the error expected from a mistake which was made in the testing facility
for the HST primary mirror, and the agreement of these two values allowed
the construction of new corrective optics to proceed.

Some figures from HST Optical Systems Failure Report.

The goal of *adaptive optics* is to partially or entirely remove
the effects of atmospheric seeing. Note that these days, this is to
be distinguished from *active optics*, which works at lower
frequency, and whose main goal is to remove aberrations coming from
the change in telescope configuration as the telescope moves (e.g.,
small changes in alignment from flexure or sag of the primary mirror
surface as the telescope moves). Active optics generally works at
frequencies less than (usually significantly less than) 1 Hz, whereas
adaptive optics must work at 10 to 1000 Hz. At low frequencies, the active
optics can be done with actuators on the primary and secondary mirrors
themselves. At the high frequencies required for adaptive optics,
however, these large mirrors cannot respond fast enough, so one is
required to form a pupil on a smaller mirror which can be rapidly
adjusted; hence adaptive optics systems are really separate astronomical
instruments.

Many adaptive optics systems are functioning and/or under development: see ESO/VLT adaptive optics, CFHT adaptive optics, Keck adaptive optics, Gemini adaptive optics, http://www.cfht.hawaii.edu/Instruments/Imaging/AOB/other-aosystems.html

The basic idea of an adaptive optics system is to rapidly sense the wavefront errors and then to correct for them on timescales faster than those at which the atmosphere changes. Consequently, there are really three parts to an adaptive optics system:

- a component which senses wavefront errors,
- a control system which figures out how to correct these errors, and
- an optical element which receives the signals from the control system and implements wavefront corrections.

There are several methods used for wavefront sensing. Two in fairly common use among today's adaptive optics systems are Shack-Hartmann sensors and wavefront curvature sensing devices. In a Shack-Hartmann sensor, an array of lenslets is put in a pupil plane and each lenslet images a small part of the pupil. Measuring the shift of each of these images gives a measure of the local wavefront tilts. Wavefront curvature devices look at the intensity distribution in out-of-focus images. Other wavefront sensing techniques include pyramid wavefront sensors and phase diversity techniques. Usually, a star is used as the source, but this is not required for some wavefront sensors (i.e., an extended source can be used).
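The Shack-Hartmann measurement can be sketched in a few lines: find the centroid of the spot behind each lenslet, compare it with the unaberrated position, and convert the displacement to a local tilt using the small-angle relation slope = displacement / lenslet focal length. The example below is only an illustration on synthetic Gaussian spots; the 2 x 2 lenslet layout, the pixel size, the focal length, and the function names are all assumptions, and the unaberrated spot is assumed to fall at the center of each cell.

```python
import numpy as np

def centroid(image):
    """Intensity-weighted centroid (x, y) in pixel coordinates."""
    y, x = np.indices(image.shape)
    total = image.sum()
    return (x * image).sum() / total, (y * image).sum() / total

def local_slopes(frame, n_sub, pixel_size, focal_length):
    """Wavefront slopes (radians) per subaperture from spot displacements."""
    cell = frame.shape[0] // n_sub
    center = (cell - 1) / 2.0  # assumed unaberrated spot position in a cell
    slopes_x = np.zeros((n_sub, n_sub))
    slopes_y = np.zeros((n_sub, n_sub))
    for i in range(n_sub):
        for j in range(n_sub):
            sub = frame[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell]
            cx, cy = centroid(sub)
            slopes_x[i, j] = (cx - center) * pixel_size / focal_length
            slopes_y[i, j] = (cy - center) * pixel_size / focal_length
    return slopes_x, slopes_y

def synthetic_frame(shifts_x, shifts_y, cell=16, sigma=1.5):
    """Fake sensor frame: one Gaussian spot per lenslet, shifted in pixels."""
    n_sub = shifts_x.shape[0]
    frame = np.zeros((n_sub * cell, n_sub * cell))
    y, x = np.indices((cell, cell))
    center = (cell - 1) / 2.0
    for i in range(n_sub):
        for j in range(n_sub):
            dx = x - center - shifts_x[i, j]
            dy = y - center - shifts_y[i, j]
            spot = np.exp(-(dx**2 + dy**2) / (2 * sigma**2))
            frame[i * cell:(i + 1) * cell, j * cell:(j + 1) * cell] = spot
    return frame

true_dx = np.array([[0.0, 1.0], [-1.5, 0.5]])   # spot shifts in pixels
true_dy = np.array([[0.5, 0.0], [1.0, -2.0]])
frame = synthetic_frame(true_dx, true_dy)
# Hypothetical detector: 10 micron pixels behind 5 mm focal length lenslets.
sx, sy = local_slopes(frame, n_sub=2, pixel_size=10e-6, focal_length=5e-3)
print(sx, sy, sep="\n")
```

A real sensor would also have to deal with photon and readout noise, sky background, and calibration of the reference spot positions, none of which are modeled here.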

To correct wavefront errors, some sort of deformable mirror is used. These can be generically split into two categories: segmented and continuous faceplate mirrors, where the latter are more common. A deformable mirror is characterized by the number of adjustable elements: the more elements, the more correction can be done. LCD arrays have also been used for wavefront correction.

In general, it is very difficult to achieve complete correction even
for ideal performance, and one needs to consider the effectiveness of
different adaptive optics systems. This effectiveness depends on the size
of the aperture, the wavelength, the number of resolution elements on the
deformable mirror, and the quality of the site. Clearly, more resolution
elements are needed for larger apertures. Equivalently, the effectiveness
of a system will decrease as the aperture is increased for a fixed number
of resolution elements. One can consider the return as a function of
Zernike order corrected and aperture size. For large telescopes, you'll
only get partial correction unless a very large number of resolution
elements on the deformable mirror are available. The following table
gives the mean square residual phase amplitude, $\Delta_j$ (in rad$^2$),
for Kolmogorov turbulence after removal of the first *j* terms, where
$S \equiv (D/r_0)^{5/3}$; the rms phase variation in waves is just
$\sqrt{\Delta_j}/2\pi$. For small apertures, you can make significant
gains with removal of just low order terms, but for large apertures you
need very high order terms. Note that there are various criteria for
quality of imaging, e.g. $\lambda/4$, etc.

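| $Z_j$ | $n$ | $m$ | Expression | Description | $\Delta_j$ | $\Delta_{j-1} - \Delta_j$ |
|---|---|---|---|---|---|---|
| $Z_1$ | 0 | 0 | $1$ | constant | 1.030 S | |
| $Z_2$ | 1 | 1 | $2r\cos\theta$ | tilt | 0.582 S | 0.448 S |
| $Z_3$ | 1 | 1 | $2r\sin\theta$ | tilt | 0.134 S | 0.448 S |
| $Z_4$ | 2 | 0 | $\sqrt{3}(2r^2-1)$ | defocus | 0.111 S | 0.023 S |
| $Z_5$ | 2 | 2 | $\sqrt{6}\,r^2\sin 2\theta$ | astigmatism | 0.0880 S | 0.023 S |
| $Z_6$ | 2 | 2 | $\sqrt{6}\,r^2\cos 2\theta$ | astigmatism | 0.0648 S | 0.023 S |
| $Z_7$ | 3 | 1 | $\sqrt{8}(3r^3-2r)\sin\theta$ | coma | 0.0587 S | 0.0062 S |
| $Z_8$ | 3 | 1 | $\sqrt{8}(3r^3-2r)\cos\theta$ | coma | 0.0525 S | 0.0062 S |
| $Z_9$ | 3 | 3 | $\sqrt{8}\,r^3\sin 3\theta$ | trefoil | 0.0463 S | 0.0062 S |
| $Z_{10}$ | 3 | 3 | $\sqrt{8}\,r^3\cos 3\theta$ | trefoil | 0.0401 S | 0.0062 S |
| $Z_{11}$ | 4 | 0 | $\sqrt{5}(6r^4-6r^2+1)$ | spherical | 0.0377 S | 0.0024 S |

Here $r$ is the radial pupil coordinate normalized to the aperture radius and $\theta$ is the azimuthal angle in the pupil.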
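The sketch below turns the tabulated residuals into an rms wavefront error in waves for a given aperture. It assumes that S in the table stands for $(D/r_0)^{5/3}$ and that the tabulated variances are in rad$^2$, as in Noll's tabulation for Kolmogorov turbulence; the two $D/r_0$ values are arbitrary examples and the names are hypothetical.

```python
import math

# Residual mean square phase (coefficient of S) after removing the first j
# Zernike terms, copied from the table above.
RESIDUAL = {
    1: 1.030, 2: 0.582, 3: 0.134, 4: 0.111, 5: 0.0880, 6: 0.0648,
    7: 0.0587, 8: 0.0525, 9: 0.0463, 10: 0.0401, 11: 0.0377,
}

def rms_waves(j, d_over_r0):
    """RMS residual phase in waves after correcting the first j terms.

    Assumes S = (D / r0)**(5/3) and variances in rad^2.
    """
    variance = RESIDUAL[j] * d_over_r0 ** (5.0 / 3.0)
    return math.sqrt(variance) / (2.0 * math.pi)

for d_over_r0 in (2.0, 40.0):  # e.g. a small and a large aperture
    row = ", ".join(f"j={j}: {rms_waves(j, d_over_r0):.3f}" for j in (1, 3, 11))
    print(f"D/r0 = {d_over_r0:g} -> rms (waves): {row}")
```

For $D/r_0 = 2$, removing tilt alone (j = 3) already brings the rms error down to about a tenth of a wave, whereas for $D/r_0 = 40$ even removing all eleven terms leaves well over half a wave.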

Another important limitation is that one needs an object from which the wavefront can be derived. Measurements of the wavefront are subject to noise just like any other photon detection, so bright sources may be required. This is even more evident when one considers that you need a source which is within the same isoplanatic patch as your desired object, and when you recall that the wavefront changes on time scales of milliseconds. These requirements place limitations on the amount of sky over which it is possible to get good correction. They also place limitations on the sorts of detectors which are needed in the wavefront sensors (fast readout and low or zero readout noise!).

| Band | $\lambda$ ($\mu$m) | $r_0$ (cm) | $\tau_0$ (s) | $\tau_{det}$ (s) | $V_{lim}$ | $\theta_0$ (arcsec) | Coverage (%) |
|---|---|---|---|---|---|---|---|
| U | 0.365 | 9.0 | 0.009 | 0.0027 | 7.4 | 1.2 | 1.8E-5 |
| B | 0.44 | 11.4 | 0.011 | 0.0034 | 8.2 | 1.5 | 6.1E-5 |
| V | 0.55 | 14.9 | 0.015 | 0.0045 | 9.0 | 1.9 | 2.6E-4 |
| R | 0.70 | 20.0 | 0.020 | 0.0060 | 10.0 | 2.6 | 0.0013 |
| I | 0.90 | 27.0 | 0.027 | 0.0081 | 11.0 | 3.5 | 0.006 |
| J | 1.25 | 40 | 0.040 | 0.0120 | 12.2 | 5.1 | 0.046 |
| H | 1.62 | 55 | 0.055 | 0.0164 | 13.3 | 7.0 | 0.22 |
| K | 2.2 | 79 | 0.079 | 0.024 | 14.4 | 10.1 | 1.32 |
| L | 3.4 | 133 | 0.133 | 0.040 | 16.2 | 17.0 | 14.5 |
| M | 5.0 | 210 | 0.21 | 0.063 | 17.7 | 27.0 | 71 |
| N | 10 | 500 | 0.50 | 0.150 | 20.4 | 64 | 100 |
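To get a feeling for why bright reference stars are needed, the sketch below estimates how many photons land in a single wavefront sensor subaperture during one integration. It reuses the figure quoted at the start of these notes (a 20th magnitude star gives about 0.01 photons/s/cm$^2$ at 5000 Å through a 1000 Å filter) and takes the V-band row of the table, reading the 14.9 entry as $r_0$ in cm and the 0.0045 entry as an integration time in seconds. The square subaperture of side $r_0$ and the perfect throughput are simplifying assumptions, and the function name is hypothetical.

```python
def photons_per_subaperture(v_mag, r0_cm, integration_s, flux_at_20th=0.01):
    """Photons collected by one r0 x r0 subaperture in one integration.

    flux_at_20th is in photons/s/cm^2 for a 20th magnitude star; losses in
    the atmosphere, optics, and detector are ignored (an assumption).
    """
    flux = flux_at_20th * 10.0 ** (-0.4 * (v_mag - 20.0))
    return flux * r0_cm**2 * integration_s

# V-band row of the table: limiting magnitude 9.0, r0 = 14.9 cm, 0.0045 s.
n_photons = photons_per_subaperture(9.0, 14.9, 0.0045)
print(f"{n_photons:.0f} photons per subaperture per integration")
```

Under these assumptions a 9th magnitude star delivers only a couple of hundred photons per subaperture per integration, and every 2.5 magnitudes fainter costs another factor of ten.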

The isoplanatic patch limitation is severe. In many cases, we might expect non-optimal performance if the reference object is not as close to the target as it ideally should be.

In most cases, both because of lack of higher order correction and because of reference star vs. target wavefront differences, adaptive optics works in the partially correcting regime. This typically gives PSFs with a sharp core, but still with extended wings.

The problem of sky coverage can be avoided if one uses so-called laser guide stars. The idea is to create a star by shining a laser up into the atmosphere. To date, two generic classes of lasers have been used, Rayleigh and sodium beacons. The Rayleigh beacons work by scattering off a layer roughly 30 km above the Earth's surface; the sodium beacons work by scattering off a layer roughly 90 km above the Earth's surface. Laser guide stars still have some limitations. For one, the path through the atmosphere which the laser light traverses does not exactly correspond to the path that light from a star traverses, because the latter comes from an essentially infinite distance; this leads to the effect called focal anisoplanatism. In addition, laser guide stars cannot generally be used to track image motion, since the laser passes up and down through the same atmosphere and image motion is cancelled out. To correct for image motion, separate tip-tilt tracking is required.
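The geometry behind focal anisoplanatism can be illustrated with similar triangles: light from a beacon at finite height fills a cone, so at a turbulent layer of height h it samples only a fraction 1 - h/H of the diameter that starlight samples, where H is the beacon height. The sketch below evaluates this for the two beacon heights quoted above; the 10 km layer height is an arbitrary example, the zenith-pointing geometry is a simplification, and the function name is hypothetical.

```python
def sampled_diameter_fraction(layer_height_km, beacon_height_km):
    """Fraction of the starlight beam diameter sampled by the laser cone.

    Simple similar-triangle geometry for a beacon at finite height; layers
    above the beacon are not sampled at all.
    """
    if layer_height_km >= beacon_height_km:
        return 0.0
    return 1.0 - layer_height_km / beacon_height_km

layer_km = 10.0  # hypothetical height of a turbulent layer
for name, beacon_km in (("Rayleigh", 30.0), ("sodium", 90.0)):
    fraction = sampled_diameter_fraction(layer_km, beacon_km)
    print(f"{name} beacon at {beacon_km:g} km: {fraction:.2f} of the diameter")
```

The higher sodium beacon samples more of the column that the starlight passes through, which is one reason it suffers less from this effect than a Rayleigh beacon.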

Note that even with perfect correction, one is still limited by the isoplanatic patch size. As one moves further and further away from the reference object, the correction will gradually degrade, because a different path through the atmosphere is being probed.

To get around this, one can consider the use of multiple laser guide
stars (a laser guide star constellation) to characterize the atmosphere over
a broader column. However, if this is done, one cannot correct all field
angles simultaneously at the telescope pupil, because the aberrations are
different for different field angles. Instead, one could choose to correct
them in a plane conjugate to the location of the dominant source of
atmospheric aberration. This is the basis of a *ground layer adaptive
optics (GLAO)* system, where a correction is made for aberration in the
lower atmosphere.

In principle, even better correction over a wider field of view is possible with
*multiple* deformable mirrors,
giving rise to the concept of *multi-conjugate* adaptive optics
(MCAO) systems. In such systems, each adaptive optic would correct at a
different location in the atmosphere.

Systems with single laser guide stars have certainly been tested and appear to work; but remember, they work only over an isoplanatic patch, and often with partially corrected images. Several implementations of systems with multiple guide stars actually exist (at VLT and Keck?) to allow sampling of a larger cylinder/cone through the atmosphere; some of these are designed to correct at particular layers to maximize the field of view, e.g. ground layer adaptive optics (GLAO). The bulk of adaptive optics work has been done in the near-IR.

Extreme (high-contrast) AO.

A variant on adaptive optics: lucky imaging.

Science with adaptive optics. Typical AO PSFs. Morphology vs. photometry.

http://www.alpao.com/Applications/Adaptive_optics_for_Astronomy.htm

Galactic center AO (see bottom of page, note scale)