next_inactive up previous
Up: AY535 class notes Previous: Detectors


Data Reduction: details and subtleties

(Entire section in one PDF file).

Data reduction steps

Generally, remove instrument signatures in reverse order from which they were introduced. First consider basic processing steps applicable to all array detectors, i.e. regardless of whether instrument is imaging or spectroscopy. Possible correction steps include:

  1. A/D correction: rarely done.

  2. Bias level subtraction: always done for CCDs, usually not for IR detectors (because of nature of IR readout, i.e. doubly-correlated sampling)

  3. Bias structure subtraction (superbias): often done, but not always warranted.

  4. Linearity correction: rarely done with CCDs, typically done with IR arrays.

  5. Dark current subtraction (superdark): occasionally done with CCDs, depending on detector, exposure times; usually done for IR arrays.

  6. Preflash subtraction, or deferred charge correction: rarely done with modern CCDs.

  7. Shutter shading correction: required if effect is significant.

  8. Flat field correction: always done.

  9. additonal complicated things: scattered light (has different flat field characteristics). Amplifier hysteresis correction (probably only good qualitatively!). Similarly, CTE correction. Hot pixel fixup. Residual image subtraction. Pixel area correction.

  10. Fringing subtraction/sky subtraction: occasionally done where warranted.

For each of these steps, it is important to understand whether the correction is additive or multiplicative.

Bias level subtraction

Use overscan/underscan if available. If not, you must use bias frames or dark frames of the same exposure time. Note IR detectors should be already bias-subtracted if they are using doubly-correlated reads, but often there is residual bias which is often exposure-time dependent, and if so, can be subtracted out with darks of the same exposure time (at the same time doing dark subtraction!). Warnings about overscan regions: amplifier hysteresis. Vertical and horizontal overscan. (vertical often not taken, rarely applied). Note possible drift of bias level with time (apparent in vertical overscan), so possibly fit. If fitting for varying bias, beware of bias jumps.

Bias structure subtraction

Readout electronics sometimes introduce fixed pattern noise, and in some cases, it is possible to subtract this from your data. However, you can only do this if the pattern is repeatable. You must check; beware of a ``fixed'' pattern which changes with, e.g., telescope pointing. Often, this noise is variable in time or location, and consequently you only make things worse by attempting to correct for it.

Check bias frame histograms. Should be zero mean and Gaussian with width given by readout noise. This gives you first observed estimate of readout noise. If it's not zero-mean or has significantly larger width, look for fixed pattern. Construct superbias frame from lots of bias frames. Repeat several times and see if histogram is repeatable.

Required number of bias frames depends to some level on how bad the bias structure is. If it's really bad, i.e., much worse than the readout noise, you'll improve things with only a few bias frames. However, usually, the bias structure is subtle and less than the readout noise. In this case, you can often add noise by doing a superbias subtraction. In the absence of any bias structure, the superbias will have noise of rn/$ \sqrt{{N}}$, and consequently, your superbias subtracted frame will have effective readout noise of $ \sqrt{{rn^2 (1 + 1/N)}}$. So probably want at least 10 frames, more if you really want to go faint and will be readout noise limited.

IR data should not required bias structure subtraction with doubly-correlated reads, but note that some instruments do!

Note that it is possible to do bias subtraction simultaneous with dark subtraction if you take calibration of the same length. This is typcially what is done with IR detectors. In fact, in the IR, as we'll discuss shortly, sky frames are often subtracted from object frames, and if this is done, the bias and dark contributions are automatically included in the sky subtraction.

Linearity correction

For some detectors, especially in the infrared, you must make a correction for the nonlinearity of the devices. To do this, you must measure the output intensity at a range of known input intensities. To do this, you must be able to accurately control the input intensities. This is probably best done by taking data at different exposure times; since shutter timing is usually quite precise, the ratio of the exposure times give the ratio of the input intensities. However, for this to be true, you must use a light source which is stable, i.e., does not vary with time (at the one percent level); this is often hard to come by.

To make a linearity curve, you take a series of frames of increasing exposure level, usually by increasing the exposure time. To do linearity, you must know the relative levels of your different frames, which is the ratio of exposure times if the light source is stable. However, most light sources are not stable. To do a linearity correction, you need at least a light source which does not vary on short time scales. Then you can intersperse lamp monitoring exposures to try to calibrate out the light variations. Generally you can do this by interspersing some fiducial exposure between each of your successive light levels. For example, a linearity sequence might consist of an exposure sequence like: 1s, 2s, 1s, 4s, 1s, 8s, 1s, 16s, etc. The longer exposures are used to make the linearity curve, and the 1s exposures are there to test for source variability; if the latter occurs, the short exposures can be used to correct the other exposures for changing source intensity (as long as the changes are of sufficiently long timescale that they are well sampled by the short exposures). When you are done, make a plot of light level in each of the fiducial exposures. Look for any drifts in the lamp brightness, and if they exist, you will need to calibrate them out of your linearity exposures. Plot the linearity exposures as DN/s vs mean count level. You need to do this in regions which has a roughly constant sensitivity; don't average over the whole array; it is even possible to do it on a per pixel basis. Fit some sort of curve (low order polynomial) through the data. Then for every observation, use this curve to correct the observed count rate to the count rate you would have received at some fiducial level.

At many observatories, the linearity correction may have been previously measured and will be available for you to use without needing to measure it yourself.

Note that the correction will be larger the further your program observation is from the fiducial light level. To minimize errors from errors in your linearity correction, take your program exposures as close as possible to a single light level; when exposure levels differ a lot, you will need to be trusting your linearity correction more.

Dark current subtraction

Detectors generate dark current arising from thermal excitation. Generally, the dark current in a device is different for each different pixel. Most pixels have very low dark current (because an operating temperature is chosen to bring dark current to an acceptably low level.) However, some pixels will have larger dark current (warm pixels), and some pixels may have very high dark current (hot pixels). The amount of dark current on a frame will obviously be larger for longer exposure times.

If the dark rate in a given pixel is constant in time, one can subtract the dark current by using dark frame, i.e. frames with the shutter closed but of finite exposure time (not bias frames), and making ``superdark'' frames by combining multiple dark frames (to beat down photon and readout noise). Sometimes the dark rate in a pixel does not appear to scale linearly with time, so sometimes you have to beware of using dark frames at some exposure time to correct object frames of a different exposure time; in the ideal world, you would take dark frames at each different exposure time you will be using for your object frames. However, dark frames are time-consuming to take. Also, you need to be careful about taking dark frames during the daytime, because sometimes instruments have small light leaks which make daytime darks useless. Cloudy nights are an ideal time to take dark frames!

For pixels with very low dark rates, which is often the majority of the pixels, you need to take very large numbers of dark frames to avoid being dominated by readout noise. Often, however, you only have time to take a relatively small number of darks, since they must be of long exposure time. Conseqeuntly, you might wish to consider a strategy where you make a superdark frame, but set all pixels for which dark rate is too small to be accurately measured to be identically zero. This way you avoid just adding readout noise to these pixels, but you still use the measured dark rate to correct for warmer pixels.

Preflash subtraction, or deferred charge correction

For CCDs with charge transfer problems at low light levels (mostly older devices), nonlinearities can be avoided by preflashing each exposure with a small pedestal of light, although this is not ideal as it adds background noise to your object frames. If you add a preflash, it must be subtracted during data reduction; to do so, one would take a series of preflash-only frames, and combine them to make a preflash subtraction frame.

If your object frames have sufficient background by themselves so that all pixels are above the nonlinear low levels, it is possible to correct for deferred charge effects without the use of a preflash frame. Details of how to obtain the required calibration data and construct the relevant calibration frames are given in Gilliland in ``Astronomical CCD Observing and Reduction Techniques'' (PASP Conf. Series 23).

Shutter shading correction

Since shutters take a finite amount of time to open, the true exposure time may be a function of location on the frame; in particular, this will be true for iris shutters (note, however, the existence of rotating shutters or dual blade linear shutters, e.g. Bonn shutters). Generally, shutters move quite fast, so this is only a measureable effect for short exposure times, and even then, only for detectors which are physically large (i.e. have large shutters). However, this is becoming more common with very large format CCDs. If you care about photometry at the one percent level on short exposures with such a device (and you often do, because standard star exposures are often relatively short), you may need to consider a shutter shading correction. This can be derived by comparing N short exposures with exposure time t with a single long exposures of exposure time Nt; for example 20 1s exposures with 1 20s exposure.

For any frame, the observed number of counts, O, will be given by:

O = S(t + $\displaystyle \Delta$t(x, y))

where S is the true count rate, t is the commanded exposure time, and $ \Delta$t gives the difference in exposure time at different positions on the detector. From this you would get:

NOt = NS(t + $\displaystyle \Delta$t)

ONt = S(Nt + $\displaystyle \Delta$t)

If a sufficiently long exposure is taken in the latter case, then, to first order,

ONt $\displaystyle \approx$ SNt


$\displaystyle {N O_t \over O_{Nt}}$ $\displaystyle \approx$ $\displaystyle {(t + \Delta t)\over t}$

so from a ratio of the two frames, one could compute $ \Delta$t, and use the first equation above to correct every observed frame to the true count rate for the commanded exposure time:

S = $\displaystyle {O t\over t+\Delta t}$

Flat fielding

Flat fielding is the calibration step which corrects for varying throughput as a function of position in the field. This arises from variations in pixel-to-pixel sensitivity, illumination patterns, and filter throughput variations, i.e. it is not necessarily just a detector effect. All of these source cause multiplicative field-dependent errors.

In principle, these are easy to correct for; all one needs to do is to observe a source which is spatially flat, and any deviations from a flat response give the field-dependent response variations. One then corrects all program observations by dividing by this flat-field frame. However, for accurate correction, one needs to observe a perfectly flat source, which can be difficult. Often, a white spot on the inside of the dome is used which is way out of focus, but, at some level this provides a different illumination than sky because it is a near-field target. Also,one needs to consider the color dependence of the flat field response; you need separate flats for every filter. For broad filters, you also need to consider the color of the light source and how it compares with the color of your objects.

Individual pixel response variations require high S/N to measure; generally, you should aim for accuracy of a small fraction of a percent. However, be aware that computed S/N doesn't give true accuracy for large spatial scales, because the intrinsic limitation is the uniformity and color of light source. The main problem in flat fielding is systematic errors, not random errors.

Generally, flats are constructed using one of three techniques, or some combination of these three: dome flats, twilight flats, and dark sky flats.

Dome flats are good for measuring the pixel-to-pixel variation if color match is good or if your array doesn't have strong color dependence; often, however, color match is poor. Because of illumination problems, however, dome flats are usually limited to a few percent accuracy for large scales.

Twilight flats can be good for pixel-to-pixel if you don't have too many filters and if they're not too narrow and if it doesn't take too long to read array. They are generally much better than dome flats for large scale structure. They can have some color mismatch problems. The biggest problem with twilight flats is that twilight is short, so it can be difficult to get data in all filters; certainly, one usually needs to use both evening and morning twilight. A lesser problem is that one will occasionally see stars in the twilight flats; this requires that you take several frames in each filter, moving the telescope between frames, and then reject stars in the process of combining the multiple frames. Additional problem is if you miss twilight, due to weather or scheduling.

Alternatively you can use your dark sky exposures to make the flats. However, generally, the level level is low (in the optical) so you can't build up high S/N. Consquently, when using these ``superflats'', one determines the pixel-to-pixel response with dome flats, and then you get the large scale variations with "superflats". Combine many frames from the night rejecting stars and fit surface to get large scale variation. A great advantage of this technique is that you match the color of the night sky perfectly.

What are the effects of systematic flat fielding errors? For no backgrounds, the fractional error in your measurement is the fractional error in your flat. However, in the real world, we need to subtract off background, and the presence of flat-fielding errors make you estimate the sky wrong, if you measure the sky at a different location from your object. You observe S + B. If you subtract xB, this gives S + (1-x)B, which differs from S by:

error = $\displaystyle {S + (1-x)B\over S}$ = 1 + (1 - x)*$\displaystyle {B\over S}$

so, for example, if S = B and x = 0.99 (one percent error in sky), then you make a one percent systematic error. However, if B = 100S, then you make a factor of two error!

Thus flat-fielding becomes critically important if you are measuring sources which are fainter than the sky level. This occurs in the optical as you go very faint, and almost always occurs in the IR, because the background is so great.

Note that the same problem occurs if the background is intrinsically spatially non-uniform. This can be the case in the IR because of potential spatial variations in sky emission as well as thermal contributions from the telescope/dome itself. This is especially troublesome because the background non-uniformity may very well change with time and telescope pointing.

What can you do about this? If all your frames are subject to the same flat-fielding or additive background errors, then you can beat the problem by observing the sky on the same pixels as your object. If you then subtract the sky frame from the object frame, you end up with a difference frame which has a much lower sky level as compared with your object (but still possibly non-zero if the background level varies in time). Consequently, the errors introduced by the flat-fielding errors go down. In fact, if the sky subtraction is perfect, you can reduce your object errors down to the percentage error in the flat field. However, remember that in doing this you introduce noise from photon statistics, since the subtracted frame has $ \sqrt{{2}}$ the error of the original frame (if the sky dominates). So you don't necessarily win by doing this every time: only if the sky is brighter than the object. Depending on your requirements, you might be able to get away with observing the sky for somewhat less time than the object, if you want to keep the big systematic errors small and don't care about losing S/N at the faintest levels; but beware, you may pay something for your time saved. You can win this back if you average several sky frames for each object frame, but you have to beware for the sky frames changing with time.

Note if your object is small, you can do this without losing any observing time by dithering the object around the field. If your object is large, however, you increase your overhead substantially. But you don't have any choice.

Flats in the IR

Flats are additionally complicated in the infrared, because every frame you take, including flat field frames, have an additive extra component arising from the radiation coming from the telescope and dome structures. This is more of a problem at longer IR wavelengths, because at shorter ones, the sky dominates. Given this, how do you make a flat at all? One possibility is dome flats with and without lights: subtract the two for your flat. Of course, this has the potential problems associated with dome flats: incorrect illumination and color mismatches. But what can you do? Take both this kind and sky flats. If the variable spatial additive component from your telescope is small in the sky flats, you'll probably do better with these.

Since we are talking systematic errors in sky determination causing the biggest problems in your photometric accuracy, it is very difficult to estimate the errors analytically. Remeber, systematic errors are the most troublesome! Pretty much the only way you can check the accuracy of your flats is to take images of the same object at several different field locations and empirically determine how well the measurements repeat. This is highly advised in the IR, at least once with every instrument.

Scattered light and flat fielding

Another problem sometimes encountered with flat fields occurs if there is light that makes it to the focal plane without having come through the normal optical path; this is generically called scattered light.

If there is scattered light in object frames, it may be possible to remove it as an extra background contribution (that is not uniform across the frame).

But if there is scattered light in your flat fields, it can cause significant problems for photometry in all of your frames since you'll divide your frames by an incorrect field-dependent response function. Scattered light in flats can be espeically problematic since flats are often taking when there is significant light in the dome.

You may be able to check for the presence of scattered light by looking at flats taken with the telescope in different positions: in particular, if you can take flats at several different position angles, comparing these often can indicate whether there is a problem.

Flat fielding and distortion: pixel area effects

In many modern instruments, the field of view is sufficiently large that optical distortion can be significant, leading to a different pixel scale at different locations in the focal plane. This can have a significant effect on flat fields and requires you to carefully consider your flats.

If different pixels subtend different areas on the sky, the pixel area changes will be reflected in your flat field, so if you divide by a raw flat field, you will also be normalizing out different pixel area effects. But if you are observing unresolved sources, the light will just be spread out over a different number of pixels (i.e. a field dependent point-spread function in pixel coordinates), so if you divide by a flat field with the pixel area effects in it, you'll erroneously change the relative brightnesses of stars across the field-of-view!

If this is a significant effect, you will want to either remove the pixel area contribution from the flat-field (which you can do if a distortion solution has been obtained for the instrument), or you can resample the data onto a grid ofequal pixel areas (but this has the disadvantage of introducing correlated noise).

Drift-scanning and flat fielding

One technique for reducing errors from flat fielding: drift scanning (also called TDI, time-delay integration mod). Mechanics of drift scanning. Implications for improving flat fielding. SDSS as an example. Time variation along the strip: issues/opportunities for photometry and moving objects.

Spectroscopic data reduction

  1. normal CCD processing: overscan, bias, dark

  2. flat fielding. Note problem that dome flats have spectral energy distribution of light source. ``Flatten'' the flats in the wavelength direction to preserve error analysis, i.e. remove the large scale wavelength dependence, but preserve the pixel-to-pixel response variations. In the spatial direction, flat fielding is like imaging, but often the requirements on accuracy are less stringent.

  3. wavelength calibration. Use arc lamps with known lines. Identify lines, determine line centers (centroid or fitting), and fit function to centers vs. wavelength.

    Issues: slit dependent wavelength calibration.

    Resampling of spectra onto ``homogeneous'' wavelength scale, e.g., linear or log-linear. Sampling issues!

  4. flux calibration: correction for throughput as a function of wavelength. Not always required, e.g. if measuring strengths relative to nearby continuum. Spectrophotometric standards, e.g. Massey et al. ApJ 328, 315 (1988).

    If fluxing is performed, usually also want to correct for atmospheric extinction as a function of wavelength and airmass: use of mean extinction coefficients.

    Issues: differential refraction, accuracy of spectrophotometric standards.

  5. Object reduction: extracting object spectrum (``tracing'' the object) and sky spectrum. Aperture extraction vs. optimal extraction.

    Issues: spectral curvature. Two-dimensional extraction

  6. Sky subtraction: standard and IR; nod and shuffle

    Issues: spatial and temporal variation

  7. Telluric feature correction (esp in IR).

    Issues: spatial and temporal variation

Modern data reduction pipelines

Pipelines, data models (e.g., SDSS-III datamodel source and APOGEE data model, and version control.

next_inactive up previous
Up: AY535 class notes Previous: Detectors
Jon Holtzman 2017-11-17