How do you characterize a rhythm? To study biological rhythms and compare one rhythm with another, we must be able to describe and analyze them. CATkit can be used to investigate four characteristics of a rhythm: period, phase, amplitude, and MESOR. The CATkit techniques are described below. Each provides a slightly different perspective on the data: actogram; periodogram; smoothing; auto-correlation; cross-correlation; cosinor; least-squares spectrum; multiple-component cosinor; gliding spectrum.
First, a few considerations that are relevant to time-series analysis.
Initial considerations
The type and frequency of data sampling affect the methods that can be used.
Continuous vs discrete data: Body temperature and the fluorescence of a tag are continuous. They change gradually over time, although they are observed at discrete time points. The number of times an animal crosses a laser beam, or the number of people taken by ambulance for a heart attack, in a unit of time is discrete.
Sampling frequency for continuous data: Sampling must be done rapidly enough to avoid "aliasing" in the periodicity region of interest. Aliasing occurs when the sampling interval is longer than half the period being considered (Hamming, 1983). Put another way, the sampling frequency must be more than twice the frequency of the sampled process; half the sampling frequency is the Nyquist or fold-over frequency (Chatfield, 1989). Sampling a bit faster is better, to be sure detail is not lost, but this is the theoretical tipping point (Dowse Ch6).
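The fold-over effect can be illustrated numerically. The sketch below is not part of CATkit; the function name apparent_period is hypothetical, and it assumes a pure sinusoid sampled at a fixed interval.

```python
import math

def apparent_period(true_period, sampling_interval):
    """Period at which a sinusoid of `true_period` appears when it is sampled
    every `sampling_interval` (same time units), after fold-over."""
    fs = 1.0 / sampling_interval   # sampling frequency
    f = 1.0 / true_period          # true frequency of the rhythm
    # Fold the true frequency back into the range [0, fs/2].
    f_alias = abs(f - fs * round(f / fs))
    return math.inf if f_alias == 0 else 1.0 / f_alias

# A 24 h rhythm sampled hourly keeps its period.
print(apparent_period(24.0, 1.0))
# The same rhythm sampled every 20 h shows up as a much longer, spurious period.
print(apparent_period(24.0, 20.0))
```

In the second call the sampling interval exceeds half the period, so the 24 h rhythm cannot be distinguished from a slower one.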
Sampling frequency for non-continuous data: The Nyquist interval must still be factored in. Data such as the number of times an animal crosses a laser beam must be counted and broken into discrete "bins". Bin size has been shown to affect the output of time-series analysis, and the effect can be profound when the bin size is too small (review: Dowse and Ringo, 1994). Over an arbitrarily short interval of the day, say half an hour, the series of occurrences of events, such as a fly breaking a light beam in a chamber, is described by a Poisson process: there is no time structure or pattern, and events occur stochastically. Dowse says, "Based on empirical and practical considerations, bin size much smaller than 10 minutes may cause artifact, in that perfectly good periodicities may be obscured in the presence of a lot of noise. Half-hour bins are generally small enough for good results in our experience." (Dowse Ch6)
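As a rough illustration of binning, the sketch below counts event times into fixed-width bins. The function name bin_counts and the crossing times are made up for the example; this is not how CATkit reads data.

```python
def bin_counts(event_times, bin_width, start, end):
    """Count events in consecutive bins [start + k*w, start + (k+1)*w)."""
    n_bins = int((end - start) // bin_width)
    counts = [0] * n_bins
    for t in event_times:
        k = int((t - start) // bin_width)
        if 0 <= k < n_bins:          # ignore events outside the recording window
            counts[k] += 1
    return counts

# Hypothetical beam-crossing times, in minutes from the start of recording.
crossings = [2, 5, 7, 31, 33, 64, 65, 66, 118]

print(bin_counts(crossings, 30, 0, 120))  # half-hour bins
print(bin_counts(crossings, 10, 0, 120))  # 10-minute bins: sparser, noisier counts
```

The same events spread over smaller bins give many zero counts, which is the noisier picture the text warns about.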
The effects of sampling frequency and/or binning may vary depending on the data being used. These effects should be considered when planning data collection.
Interval and completeness of data: All of the analyses below require a regular interval between samples (or bins) and no missing data, except for cosinor, which handles missing or irregularly spaced data. CATkit interpolates to fill in missing data for those functions that require equidistant data.
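The text does not say which interpolation CATkit applies. The sketch below assumes simple linear interpolation onto a regular grid, only to show what filling a gap means in practice; fill_linear is a hypothetical helper.

```python
def fill_linear(times, values, step):
    """Resample (times, values) onto a regular grid with spacing `step`,
    filling gaps by linear interpolation between neighbouring observations.
    Assumes `times` is sorted and has at least two entries."""
    n = int((times[-1] - times[0]) / step + 1e-9) + 1
    grid = [times[0] + i * step for i in range(n)]
    filled = []
    j = 0
    for t in grid:
        # Move to the observed interval [times[j], times[j + 1]] that contains t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        t0, t1 = times[j], times[j + 1]
        w = (t - t0) / (t1 - t0)
        filled.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, filled

times = [0, 1, 2, 4, 5]                  # the reading at t = 3 is missing
values = [10.0, 12.0, 14.0, 18.0, 17.0]
grid, filled = fill_linear(times, values, 1)
print(list(zip(grid, filled)))
```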
Actogram
The actogram gives an easy visualization of the period of the data and a common-sense check of the shape of the data. The period can be adjusted to fit the actogram to the data, to see whether the data follow a repeating pattern. If diagonals are observed, the period should be adjusted until vertical lines can be seen along the leading or trailing edge of the peak activity areas. p149 (Palmer 1994)
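The diagonal-versus-vertical behaviour can be reproduced on synthetic data. The sketch below folds a series into one row per assumed period and reports where activity starts in each row. The helper names fold and onset_columns are hypothetical, and no plotting is attempted.

```python
def fold(series, samples_per_period):
    """Split a series into consecutive rows, one row per assumed period."""
    n_rows = len(series) // samples_per_period
    return [series[r * samples_per_period:(r + 1) * samples_per_period]
            for r in range(n_rows)]

def onset_columns(rows, threshold):
    """Column of the first above-threshold sample in each row (None if absent)."""
    return [next((c for c, v in enumerate(row) if v > threshold), None)
            for row in rows]

# Synthetic hourly activity: active for 8 of every 25 samples (a 25 h rhythm).
activity = [1 if (t % 25) < 8 else 0 for t in range(25 * 6)]

print(onset_columns(fold(activity, 24), 0.5))  # folded at 24 h: onsets drift (diagonal)
print(onset_columns(fold(activity, 25), 0.5))  # folded at 25 h: onsets line up (vertical)
```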
Smoothing
This technique also allows visualization of the data, but with more detail than the actogram. The data are binned and averaged, then plotted, giving a line that is smoother than the original data.
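A minimal sketch of bin-and-average smoothing, assuming non-overlapping bins of equal size (the text does not specify the binning CATkit uses; bin_average is a hypothetical name):

```python
def bin_average(series, bin_size):
    """Average consecutive, non-overlapping groups of `bin_size` samples.
    Trailing samples that do not fill a whole bin are dropped."""
    n_bins = len(series) // bin_size
    return [sum(series[b * bin_size:(b + 1) * bin_size]) / bin_size
            for b in range(n_bins)]

raw = [3, 5, 4, 8, 10, 9, 2, 1, 3]
print(bin_average(raw, 3))
```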
Auto Correlation
Autocorrelation identifies the presence of rhythmicity, period and phase. It should always be used to determine the significance of any period (Levine 2002). It compares a dataset to itself repeatedly, lagging one copy by one point with respect to the other each time. The correlation between the two copies is calculated at each lag, giving an autocorrelation value. The correlogram, a graph of the autocorrelation results, is a time-domain analysis that allows assessment of the presence or absence of any periodicities in the data, as well as their regularity. The autocorrelation values, r, are without units. Horizontal lines above and below the abscissa represent the 95% confidence interval, calculated as ±2/sqrt(N), where N is the number of data points. The height of the third peak, counting the peak at lag 0 as #1, constitutes the Rhythmicity Index. The decay envelope of the function indicates the stability of the rhythm (Dowse Ch6).
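The correlogram quantities described above can be sketched as follows. This uses the standard sample autocorrelation on a synthetic 24 h rhythm; the function and variable names are illustrative, not CATkit's.

```python
import math

def autocorrelation(series, max_lag):
    """Sample autocorrelation r(k) for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[i] * dev[i + k] for i in range(n - k)) / denom
            for k in range(max_lag + 1)]

# Synthetic hourly series with a 24 h rhythm, 10 days long.
series = [math.cos(2 * math.pi * t / 24) for t in range(240)]
r = autocorrelation(series, 60)

limit = 2 / math.sqrt(len(series))       # the 2/sqrt(N) confidence limit

# Local maxima of the correlogram after lag 0.
peaks = [k for k in range(1, 60) if r[k - 1] < r[k] > r[k + 1]]

# Third peak counting lag 0 as #1, i.e. the second peak after lag 0.
rhythmicity_index = r[peaks[1]]
print(peaks, round(limit, 3), round(rhythmicity_index, 3))
```

The first peak after lag 0 estimates the period, and a Rhythmicity Index above the confidence limit suggests the rhythm is significant.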
Cross Correlation
Cross-correlation compares two different data sets to determine whether their period and phase are aligned. Whereas autocorrelation evaluates the relationship of a data set with itself over time, cross-correlation evaluates the relationship between two different data sets. A plot shows the correlation coefficients with respect to the lag (in units of time) between the two time courses. This comparison can be used to determine whether there is a difference in phase between the two data sets. The lag is read as the phase offset from 0 on the abscissa. (Dowse Ch6) p149 (Levine 2002)
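Reading the phase offset from the lag of maximum correlation can be sketched on two synthetic series. The names below are hypothetical, and the lag convention (x at time t compared with y at time t + lag) is an assumption of this example.

```python
import math

def cross_correlation(x, y, max_lag):
    """Correlation between x[t] and y[t + lag] for lag in -max_lag..max_lag."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]
    dy = [v - my for v in y]
    denom = math.sqrt(sum(v * v for v in dx) * sum(v * v for v in dy))
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        # Only indices where both x[i] and y[i + lag] exist.
        pairs = range(max(0, -lag), min(n, n - lag))
        result[lag] = sum(dx[i] * dy[i + lag] for i in pairs) / denom
    return result

# Two hourly series with the same 24 h period; y peaks 4 h later than x.
x = [math.cos(2 * math.pi * t / 24) for t in range(240)]
y = [math.cos(2 * math.pi * (t - 4) / 24) for t in range(240)]

cc = cross_correlation(x, y, 12)
best_lag = max(cc, key=cc.get)   # lag (in hours) at which the correlation peaks
print(best_lag)
```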
Cosinor
Up to this point, all of the analysis methods have required the data to be equally spaced, with no missing data. The cosinor method does not: it will identify periodicity in sparse or non-equidistant data. And unlike the earlier non-parametric methods, it is a parametric technique, building a model based on the expected period.
Cosinor fits one or more cosine curves to the data. For any given period, regression analysis is used to solve a system of linear equations from which the amplitude and MESOR (Midline Estimating Statistic Of Rhythm, a rhythm-adjusted mean) of the model curve can be calculated.
The peak of the cosine wave provides a suitable measure of phase, the acrophase.
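A single-component fit can be sketched by rewriting the model y = M + A*cos(2*pi*t/period + phi) in the linear form y = M + beta*cos(wt) + gamma*sin(wt) and solving the least-squares normal equations. This is a minimal illustration on noise-free synthetic data, not CATkit's implementation. The helper names and the sign convention for the acrophase are assumptions of the example.

```python
import math

def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [u - f * v for u, v in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]

def cosinor(times, values, period):
    """Least-squares fit of y = M + A*cos(2*pi*t/period + phi).
    Returns (MESOR, amplitude, acrophase phi in radians)."""
    w = 2 * math.pi / period
    rows = [(1.0, math.cos(w * t), math.sin(w * t)) for t in times]
    # Normal equations (X'X) coef = X'y for y = M + beta*cos(wt) + gamma*sin(wt).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    mesor, beta, gamma = solve3(xtx, xty)
    amplitude = math.hypot(beta, gamma)
    acrophase = math.atan2(-gamma, beta)   # since beta = A*cos(phi), gamma = -A*sin(phi)
    return mesor, amplitude, acrophase

# Unequally spaced sampling times (hours) of a synthetic rhythm with
# MESOR 37, amplitude 0.5 and a peak at 16:00.
times = [0, 1, 3, 4, 7, 9, 10, 14, 15, 19, 22, 26, 29, 31, 36, 40]
values = [37 + 0.5 * math.cos(2 * math.pi * (t - 16) / 24) for t in times]

mesor, amplitude, acrophase = cosinor(times, values, 24)
peak_time = (-acrophase * 24 / (2 * math.pi)) % 24   # clock time of the fitted peak
print(round(mesor, 3), round(amplitude, 3), round(peak_time, 3))
```

Because the fit is an ordinary regression on cos and sin terms, nothing in it requires equally spaced sampling times.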
Population-Mean Cosinor (PMC)
When data are collected as a function of time on 3 or more individuals, the population-mean cosinor procedure makes it possible to draw inferences about a population rhythm, provided the individuals considered represent a random sample from that population. Each individual series is analyzed by the single- or multiple-component single cosinor. The PMC takes these results as input. Assuming that the within-individual variances are the same, the PMC estimates the population rhythm parameters by calculating the arithmetic mean of the individual MESORs and the vectorial average of the individual amplitude-acrophase pairs.
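The point estimates described above can be sketched as follows. The function name and the three sets of individual results are invented for illustration, acrophases are taken to be in radians, and confidence regions are not computed.

```python
import math

def population_mean_cosinor(individuals):
    """individuals: list of (mesor, amplitude, acrophase_radians) tuples,
    one per individual, taken from single-cosinor fits."""
    k = len(individuals)
    mesor = sum(m for m, _, _ in individuals) / k
    # Treat each (amplitude, acrophase) pair as a vector and average the vectors.
    x = sum(a * math.cos(p) for _, a, p in individuals) / k
    y = sum(a * math.sin(p) for _, a, p in individuals) / k
    return mesor, math.hypot(x, y), math.atan2(y, x)

# Hypothetical single-cosinor results for three individuals.
results = [(36.8, 0.4, -2.0), (37.0, 0.6, -2.2), (37.1, 0.5, -1.9)]
pop_mesor, pop_amplitude, pop_acrophase = population_mean_cosinor(results)
print(round(pop_mesor, 3), round(pop_amplitude, 3), round(pop_acrophase, 3))
```

Because the average is vectorial, individuals whose acrophases disagree pull the population amplitude below the plain mean of the individual amplitudes.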
Population-Mean Cosinor Parameter Tests (PMCtests)
Statistical tests of the equality of PMC rhythm parameters (MESOR, amplitude, acrophase) from two or more populations, with the parameters considered singly or with amplitude and acrophase considered jointly. CATparam uses the parameters output from a single- or multiple-component single cosinor as input.