Image file formats

There are a number of ways one might consider storing a 2-D image in a disk file. Since these images may be large, we are generally interested in formats which minimize the amount of disk space used and the amount of I/O time needed to read the images into memory or write them out. For this reason, images are rarely saved as formatted (text) numbers: since the dynamic range of most detectors is at least 16 bits (-32768 to 32767), this would require at least 5 bytes per pixel (one per decimal digit, more with a sign or separator), for data which is intrinsically only 2 bytes of information. Consequently, images are generally stored as unformatted (binary) numbers. The binary representation of numbers can differ between machine types. Most machines use 2's complement for integers and an IEEE definition for floating point numbers; machines differ, however, on the byte order of these representations, with some architectures putting the least significant byte first, while others put the most significant byte first.
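As a rough illustration of the two points above (storage size and byte order), the following sketch uses Python's standard `struct` module. The pixel value is a made-up example, and the one-character separator in the text form is an assumption about how formatted numbers would be delimited.

```python
import struct

value = -12345  # hypothetical 16-bit pixel value

# Formatted (text) storage: one byte per character, plus an assumed
# one-character separator between pixels.
as_text = f"{value} ".encode("ascii")

# Unformatted (binary) storage: always 2 bytes for a 16-bit integer.
big_endian = struct.pack(">h", value)     # most significant byte first
little_endian = struct.pack("<h", value)  # least significant byte first

print(len(as_text), len(big_endian))
print(big_endian.hex(), little_endian.hex())

# Reading the bytes with the wrong byte order raises no error;
# it simply yields a different number.
misread = struct.unpack("<h", big_endian)[0]
print(value, misread)
```

The last line is the practical danger: a byte-order mismatch does not fail loudly, so a file format has to state its byte order explicitly.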

Once a representation for the numbers is chosen, the values of individual pixels are simply dumped sequentially to a file. However, some additional information is needed to reconstruct these numbers into a 2-D image. In particular, one needs to know the number of rows and columns in the image in order to properly wrap the sequence of numbers back into a 2-D format. In addition, there may be further information one wants to convey with an image, e.g., the scale and coordinate values along each axis, auxiliary information about how the image was obtained (exposure time, telescope used), etc.
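A minimal sketch of this wrapping step, assuming a made-up 4-column by 3-row image of 16-bit integers stored row by row with the most significant byte first. The dimensions and pixel values are hypothetical; in a real file they would have to come from somewhere other than the pixel stream itself.

```python
import struct

ncols, nrows = 4, 3  # hypothetical image dimensions

# Simulate the byte stream that would sit in the file: 12 pixels,
# 16-bit integers, most significant byte first.
stream = struct.pack(f">{ncols * nrows}h", *range(ncols * nrows))

# Unpack the stream into a flat sequence of numbers ...
flat = struct.unpack(f">{ncols * nrows}h", stream)

# ... and wrap it into rows of ncols pixels each.
image = [list(flat[r * ncols:(r + 1) * ncols]) for r in range(nrows)]

for row in image:
    print(row)

# The same stream wrapped with the wrong dimensions is a different image.
wrong = [list(flat[r * nrows:(r + 1) * nrows]) for r in range(ncols)]
```

Nothing in the stream distinguishes the 4 x 3 interpretation from the 3 x 4 one, which is why the dimensions must be recorded alongside the data.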

The standard file format for astronomical images is the FITS format. This format is described in a series of papers (refs: A&AS 44, 363; A&AS 44, 371; A&AS 73, 359). In its simplest form, a FITS file consists of two sections: a header section and a data section. The header section gives the information required to interpret and unwrap the data, and also includes any optional information about the frame. This information is formatted in an arbitrary number of 80-byte ASCII records (cards), each of which associates a value (logical, integer, floating point, or character) with a predefined keyword; keywords can be up to 8 characters in length. The required keywords for FITS format are:

1) SIMPLE, which is T or F depending on whether the file conforms to the basic FITS standard.
2) BITPIX, which gives the number of bits/pixel, telling the computer what format the data is stored in (e.g., 16 for 16-bit integer, 32 for 32-bit integer, -32 for 32-bit floating point). Note that FITS data by definition uses 2's complement integers or IEEE floating point, with the most significant byte first.
3) NAXIS, which gives the total dimensionality of the image, since FITS can be used for 1-D images (spectra), 2-D images (single images), 3-D images (image stacks), etc. For image reconstruction from the file, the order of FITS pixels is that the column index varies fastest, then the row index (that is, pixels are stored row by row).
4) a series of NAXIS[1-N] cards giving the dimension of each axis.
5) an END card, which signifies the end of the header section.
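The header layout described above can be sketched in a few lines of Python. This is a simplified illustration, not a complete FITS reader: `make_card` and `parse_header` are hypothetical helper names, the header values (a 512 x 256 image of 16-bit integers) are made up, and the parser ignores complications such as quoted strings that contain a "/" character.

```python
def make_card(keyword, value):
    """Build one 80-character card: keyword in columns 1-8, then '= ',
    then the value right-justified in a 20-character field."""
    return f"{keyword:<8}= {value:>20}".ljust(80)

# A hypothetical minimal header for a 512 x 256 image of 16-bit integers.
header = "".join([
    make_card("SIMPLE", "T"),
    make_card("BITPIX", 16),
    make_card("NAXIS", 2),
    make_card("NAXIS1", 512),
    make_card("NAXIS2", 256),
    "END".ljust(80),
])

def parse_header(text):
    """Return a dict of keyword -> value string, stopping at the END card."""
    cards = {}
    for i in range(0, len(text), 80):
        card = text[i:i + 80]
        keyword = card[:8].strip()
        if keyword == "END":
            break
        if card[8:10] == "= ":
            # Drop any trailing "/ comment" (simplification, see lead-in).
            cards[keyword] = card[10:].split("/")[0].strip()
    return cards

cards = parse_header(header)
naxis = int(cards["NAXIS"])
shape = [int(cards[f"NAXIS{i}"]) for i in range(1, naxis + 1)]

# The sign of BITPIX distinguishes integer from floating point data;
# its magnitude gives the number of bits per pixel.
bytes_per_pixel = abs(int(cards["BITPIX"])) // 8
npixels = 1
for n in shape:
    npixels *= n
data_bytes = bytes_per_pixel * npixels

print(shape, bytes_per_pixel, data_bytes)
```

With only the required keywords, a reader already knows how many bytes of data to expect and how to wrap them into an image.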

Optional header cards include: CRVAL[1-N], CRPIX[1-N], CDELT[1-N], CTYPE[1-N], OBJECT, DATE-OBS (dd/mm/yy in older files; yyyy-mm-dd under the current standard), INSTRUME, TELESCOP, OBSERVER, RA, DEC, EPOCH, COMMENT, HISTORY, etc. Note in particular BZERO and BSCALE, which relate the values stored on disk to the true pixel values: true = disk * BSCALE + BZERO.
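As a small worked example of the BZERO/BSCALE relation, the sketch below assumes BSCALE = 1 and BZERO = 32768, a commonly used combination that lets signed 16-bit disk values represent unsigned counts from 0 to 65535. The disk values are made up for illustration.

```python
# Assumed header values for this example.
bscale = 1.0
bzero = 32768.0

# Hypothetical 16-bit integers as read from the data section.
disk_values = [-32768, 0, 32767]

# true = disk * BSCALE + BZERO
true_values = [d * bscale + bzero for d in disk_values]

print(true_values)
```

A reader that ignores these two cards would report the raw disk values, which in this case would be offset by 32768 from the real counts.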

Both the FITS header and the data section are padded, with blanks (header) or zeros (data), to fill a multiple of 2880 bytes (which was done to ensure compatibility with machine types of essentially all possible word lengths). Consequently, a FITS file contains a set of 80-byte header cards, a set of blank cards after the END card to pad the header to the next larger multiple of 2880 bytes (36 cards), then the raw data in a byte stream, padded with zeros so that the data section also occupies a multiple of 2880 bytes.
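The padding arithmetic can be sketched as follows. The function name `padded_size` is hypothetical, and the example sizes assume a header of 6 cards and a 512 x 256 image of 16-bit integers.

```python
BLOCK = 2880  # FITS block size in bytes (36 cards of 80 bytes each)

def padded_size(nbytes):
    """Round nbytes up to the next multiple of BLOCK (ceiling division)."""
    return -(-nbytes // BLOCK) * BLOCK

header_bytes = 6 * 80        # assumed: 5 keyword cards plus END
data_bytes = 512 * 256 * 2   # assumed: 512 x 256 pixels, 2 bytes each

header_on_disk = padded_size(header_bytes)
data_on_disk = padded_size(data_bytes)

print(header_on_disk, data_on_disk, header_on_disk + data_on_disk)

# The data section therefore starts at byte offset header_on_disk,
# not at header_bytes.
```

The practical consequence is that a reader must skip to the next 2880-byte boundary after the END card before it starts interpreting bytes as pixels.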

A variety of software packages read and write FITS images, and several subroutine libraries have been made publicly available to make it easy for you to do I/O with FITS images in your own programs (FITSLIB, among others). IDL also contains FITS I/O routines.

Note that IRAF is almost unique in that it uses its own internally defined file format. It provides the capability for converting between FITS and IRAF format, but all IRAF calculations (which, remember, are disk based) are done in the internal format. The IRAF developers do not document the exact format of IRAF files, so that they may reserve the right to change it in the future (with backward compatibility). So that users can still access IRAF images from their own software, they provide subroutines which can be called from FORTRAN programs through the IMFORT library.

In recent versions of IRAF, it has become possible to use the FITS data format for all calculations instead of the IRAF internal format; I highly recommend using this (it is not the default unless you add something in your login.cl file).

Rene Walterbos 2003-04-14