Next: Noise equation: how do Up: AY535 class notes Previous: Light, magnitudes, and the


Uncertainties and error analysis


For a given rate of emitted photons, there is a probability distribution for the number of photons we detect, even assuming 100% detection efficiency, because of statistical fluctuations. In addition, there may also be instrumental uncertainties. Consequently, we now turn to the concepts of probability distributions, with particular interest in the distribution that applies to the detection of photons.

Distributions and characteristics thereof

Some definitions of quantities that characterize a distribution $p(x)$:

mean $\equiv \mu = \int x\,p(x)\,dx$

variance $\equiv \sigma^2 = \int (x - \mu)^2\,p(x)\,dx$

standard deviation $\equiv \sigma = \sqrt{\mathrm{variance}}$

median : the mid-point value, $x_{median}$, defined by

$\displaystyle{\int_{-\infty}^{x_{median}} p(x)\,dx \over \int_{-\infty}^{\infty} p(x)\,dx} = {1 \over 2}$

mode : the most probable value, i.e., where $p(x)$ peaks.
Note that the geometric interpretation of the above quantities depends on the nature of the distribution; although we all carry around a picture of the mean and variance of a Gaussian distribution, that picture does not apply to other distributions, but the quantities are still well-defined.

Also, note that there is a difference between the sample mean, variance, etc. and the population quantities. The latter apply to the true distribution, while the former are estimates of the latter from some finite sample (N measurements) of the population. The sample quantities are derived from:

sample mean $\equiv \bar{x} = \displaystyle{\sum x_i \over N}$

sample variance $\equiv \displaystyle{\sum (x_i - \bar{x})^2 \over N-1} = {\sum x_i^2 - (\sum x_i)^2/N \over N-1}$

The sample mean and variance approach the true mean and variance as N approaches infinity. But note that, especially for small samples, your estimates of the mean and variance may differ from the true (population) values (more below)!
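This is easy to check numerically. The following is a minimal Python sketch (not part of the notes; the population and sample size are arbitrary choices) that draws a finite sample from a known population and compares the sample estimators to the true values:

```python
import random

# Population: standard normal, so the true mean is 0 and true variance is 1.
random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(10000)]

n = len(sample)
mean = sum(sample) / n                                   # sample mean
var = sum((x - mean) ** 2 for x in sample) / (n - 1)     # sample variance (N-1)

# Both estimates should be close to the population values (0 and 1),
# and get closer as N grows.
print(mean, var)
```

Repeating the experiment with, say, N = 10 instead of 10000 shows how badly a small sample can misestimate the population quantities.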

\textit{Understand the concept of a probability distribution function, and the difference between population quantities and sample quantities.}

The binomial distribution

Now we consider what distribution is appropriate for the detection of photons. The photon distribution can be derived from the binomial distribution, which gives the probability of observing x occurrences of some particular outcome out of a total of n independent trials, where p is the probability of that outcome in any single trial:

$P(x, n, p) = \displaystyle{n!\, p^x (1-p)^{n-x} \over x!\,(n-x)!}$

For the binomial distribution, one can derive:

mean $\equiv \mu = \sum_x x\,P(x) = np$

variance $\equiv \sigma^2 = \sum_x (x - \mu)^2\,P(x) = np(1 - p)$
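These two results can be verified by evaluating the formula directly. A Python sketch (not from the notes; n = 20 and p = 0.3 are arbitrary illustrative values):

```python
import math

def binom_pmf(x, n, p):
    """P(x; n, p) = n! p^x (1-p)^(n-x) / (x! (n-x)!)"""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 20, 0.3
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

mean = sum(x * pmf[x] for x in range(n + 1))
var = sum((x - mean) ** 2 * pmf[x] for x in range(n + 1))

print(mean, var)  # should equal n*p = 6.0 and n*p*(1-p) = 4.2
```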

The Poisson distribution

In the case of detecting photons, n is the total number of photons emitted, and p is the probability of detecting one of them during our observation. We don't know either of these numbers! However, we do know that $p \ll 1$, and we know, or at least can estimate, the mean number detected:

μ = np


In this limit, the binomial distribution asymptotically approaches the Poisson distribution:

$P(x, \mu) = \displaystyle{\mu^x e^{-\mu} \over x!}$

From the expressions for the binomial distribution in this limit, the mean of the distribution is μ, and the variance is

variance $= \sum_x (x - \mu)^2\,P(x, \mu) = np = \mu$

$\sigma = \sqrt{\mu}$

This is an important result: the standard deviation of a counting measurement is the square root of its mean.

Note that the Poisson distribution is generally the appropriate distribution not only for counting photons, but for any counting experiment in which events occur at a known average rate, independently of the time since the last event.
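The variance = mean result can be checked numerically by summing over the Poisson distribution. A Python sketch (μ = 10 is an arbitrary choice; the sum is truncated at x = 100, where the tail is negligible):

```python
import math

def poisson_pmf(x, mu):
    """P(x; mu) = mu^x e^(-mu) / x!"""
    return mu**x * math.exp(-mu) / math.factorial(x)

mu = 10.0
xs = range(100)  # truncate the infinite sum; tail is negligible for mu = 10
mean = sum(x * poisson_pmf(x, mu) for x in xs)
var = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in xs)

print(mean, var)  # both equal mu, so sigma = sqrt(mu)
```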

What does the Poisson distribution look like? [Figure: plots of the Poisson distribution for μ = 2, μ = 10, and μ = 10000.]

\textit{Understand the Poisson distribution and when it applies, and how the width (variance) of the Poisson distribution is related to the mean of the distribution.}

The normal, or Gaussian, distribution

Note that, for large μ, the Poisson distribution is well approximated around its peak by a Gaussian, or normal, distribution:

$P(x, \mu, \sigma) = \displaystyle{1 \over \sqrt{2\pi}\,\sigma}\, e^{-{(x-\mu)^2 \over 2\sigma^2}}$

This is important because it allows us to use ``simple'' least-squares techniques to fit observational data, since these generally assume normally distributed data. However, beware that in the tails of the distribution, and at low mean rates, the Poisson distribution can differ significantly from a Gaussian. In these cases least squares may not be appropriate for modeling observational data, and one might need to consider maximum likelihood techniques instead.
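Both the agreement at large μ and the disagreement at small μ are easy to see numerically. A Python sketch (the values μ = 100 and μ = 2 are arbitrary illustrative choices), comparing the Poisson probability to the Gaussian with σ = √μ:

```python
import math

def poisson_pmf(x, mu):
    """Poisson probability of x counts for mean rate mu."""
    return mu**x * math.exp(-mu) / math.factorial(x)

def gauss_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma**2)) / (math.sqrt(2 * math.pi) * sigma)

# Near the peak at large mu, the two agree to better than a percent...
print(poisson_pmf(100, 100.0), gauss_pdf(100, 100.0, 10.0))

# ...but at small mu they differ substantially (the Gaussian even assigns
# probability to negative counts, which a Poisson process cannot produce).
print(poisson_pmf(0, 2.0), gauss_pdf(0, 2.0, math.sqrt(2.0)))
```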

The normal distribution is also important because many physical variables seem to be distributed accordingly. This may be no accident, because of the central limit theorem: a quantity that is the sum of many independent random variables with (almost) ANY distribution of finite variance will itself be approximately normally distributed (see statistics texts). In observational techniques, we encounter the normal distribution because one important source of instrumental noise, readout noise, is normally distributed.

\textit{Know what a Gaussian (normal) distribution is, including its functional form, and under what circumstances the Poisson distribution is similar to a normal distribution.}

Importance of error distribution analysis

You need to understand the expected uncertainties in your observations in order to judge whether your measurements are consistent with expectations; the following examples illustrate this.

Confidence levels

For example, say we want to know whether some single point is consistent with expectations; e.g., we see a bright point in multiple measurements of a star and want to know whether the star flared. Say we have a time sequence with known mean and variance, and we obtain a new point: is it consistent with the known distribution?

If the form of the probability distribution is known, then you can calculate the probability of getting a measurement more than some observed distance from the mean. In the case where the observed distribution is Gaussian (or approximately so), this is done using the error function (often written erf(x)), which is the integral of a Gaussian from some starting value.

Some simple guidelines to keep in mind follow (the actual situation often requires more sophisticated statistical tests). First, for Gaussian distributions, 68.3% of the points should fall within one sigma of the mean, and 95.4% within two sigma. Thus, if you have a time line of photon fluxes for a star, with N observed points and a photon noise σ on each measurement, you can test whether the number of points deviating by more than 2σ from the mean is much larger than expected. To decide whether any single point is really significantly different, you might want to use a more stringent criterion, e.g., 5σ rather than 2σ; a 5σ deviation has a much higher level of significance. On the other hand, if you have far more points in the range 2σ to 4σ brighter or fainter than you would expect, you may also have a significant detection of intensity variations (provided you really understand the uncertainties on your measurements!).
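The 68.3% and 95.4% figures follow directly from the error function mentioned above; a minimal Python sketch:

```python
import math

def frac_within(k):
    """Fraction of a normal distribution lying within +/- k sigma of the
    mean, via the error function: erf(k / sqrt(2))."""
    return math.erf(k / math.sqrt(2.0))

print(frac_within(1))  # ~0.683
print(frac_within(2))  # ~0.954
print(frac_within(5))  # ~0.9999994, hence the stringency of a 5-sigma cut

# Expected number of >2-sigma outliers in N independent Gaussian points:
N = 1000
print(N * (1.0 - frac_within(2)))  # ~46 points
```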

Also note that your observed distribution should be consistent with your uncertainty estimates, given the above guidelines. If you have a whole set of points that all fall within 1σ of each other, something is wrong with your uncertainty estimates (or perhaps your measurements are correlated with each other)!

For a series of measurements, one can calculate the χ2 statistic, and determine how probable this value is, given the number of points.

$\chi^2 = \displaystyle\sum_i {(\mathrm{observed}_i - \mathrm{model}_i)^2 \over \sigma_i^2}$

For a given value of χ2 and number of measurements, one can calculate the probability that the measurements are consistent with the model (and that the uncertainties are correctly estimated). A quick estimate of the consistency of the model with the observed data points can be made using the reduced χ2, defined as χ2 divided by the number of degrees of freedom (the number of data points minus the number of fitted parameters), which should be near unity if the measurements are consistent with the model.
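A minimal Python sketch of this calculation, using made-up flux measurements fit by a constant-flux model (one fitted parameter; all values are illustrative):

```python
def reduced_chi2(observed, model, sigma, n_params):
    """Chi-squared per degree of freedom; should be near unity if the model
    and the uncertainty estimates are consistent with the data."""
    chi2 = sum((o - m) ** 2 / s**2 for o, m, s in zip(observed, model, sigma))
    dof = len(observed) - n_params          # degrees of freedom
    return chi2 / dof

# Hypothetical time line of fluxes, modeled as constant at 10.0,
# with an assumed uncertainty of 0.25 on each point.
obs = [10.2, 9.7, 10.1, 9.9, 10.4, 9.8]
model = [10.0] * len(obs)
sigma = [0.25] * len(obs)

print(reduced_chi2(obs, model, sigma, n_params=1))  # near 1: consistent
```

A reduced χ2 much greater than 1 would indicate either real variability or underestimated uncertainties; a value much less than 1 would suggest the uncertainties are overestimated.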
