Instructions for Calling the Chronomics Analysis Toolbox (CAT)
Examples of calling CAT, with associated instructions, can be found on the Vignettes page.
After installing R and RStudio:
1. Download the Installing CATkit pdf
2. Open "Installing CAT from tar" with a text editor (or R)
3. Following the directions, open Vignette.r and modify it to run a vignette
4. To run the vignette, double click on Vignette.r;
or, if you have Vignette.r open in R or RStudio, you can run it by selecting shift-cmd-S; or from the Menu, select Code-->Source
5. After the program completes running, it will give you the name of the output file (in the same folder as the input was found).
6. View Tips on interpreting CATCall
Parameters have default values that will be used if not explicitly set. The call to CAT, with defaults, is:
CatCall(TimeCol=1, timeFormat="%Y%m%d%H%M",lum=NA, valCols=c(), sumCols=c(), Avg=FALSE,
NEq=list(bin=FALSE,Interval=NA,Midpoint=NA, start=NA, end=NA),
Plex=list(plex=FALSE, nClasses=0, foldLen=0, RefTime, yLab, ySet),
export=FALSE, sizePts=2, binPts=5, Interval = 0, Increment=0, k=6, yLab="Activity Level (au)", modulo=1440,
funcName="", Rverbose=0, RmaxGap=400, Skip=0,header=FALSE, Smoothing=FALSE,
Actogram=FALSE,AutoCorr=FALSE,CrossCorr=FALSE,Console=FALSE,Graphics="pdf", Darkness=1,LagPcnt=.33,tz="GMT",title="",
fileName, file2=list(Name=NULL,TimeCol=1, timeFormat="%Y%m%d%H%M", lum=4, valCols=c(3,4), sumCols=c(5,6),sizePts=2, binPts=5,Darkness=0))
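In practice only a few of these defaults usually need to be overridden. The following is a minimal sketch of such a call, not a tested recipe: the input file name and the column layout are hypothetical, and the package name CATkit is an assumption taken from the installation instructions. The call is guarded so that nothing runs unless both the package and the file are present.

```r
# Sketch only: file name and column choices are hypothetical.
cat_args <- list(
  TimeCol    = 1,               # date and time in a single column
  timeFormat = "%Y%m%d%H%M",    # e.g. 201403151430
  lum        = 2,               # luminance column (assumed layout)
  sumCols    = c(3, 4),         # count-type columns, summed when binned
  sizePts    = 2,               # 2 minutes between samples
  binPts     = 5,               # 5 samples per bin, i.e. 10-minute bins
  Actogram   = TRUE,            # run only the functions that are needed
  Smoothing  = TRUE,
  fileName   = "mydata.txt"     # hypothetical input file
)

# Attempt the call only when the (assumed) package and the file exist.
if (requireNamespace("CATkit", quietly = TRUE) &&
    file.exists(cat_args$fileName)) {
  do.call(CATkit::CatCall, cat_args)
}
```

All parameters left out of the list keep the defaults shown in the call above.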
CAT Parameterization
CAT consists of a set of functions listed in Table 2, as well as the Cosinor function, detailed in the Cosinor section below. The functions in Table 2 can be called by executing an R script (like a .bat file) from a shortcut, or from within R. The R script contains the parameters for running CAT (see parameters below), and is edited with any text editor. CAT can be modified to solicit the data input file(s) at execution, or the data file(s) can be specified in the R script (see web site for specifics). The selected functions execute in the sequence shown in Table 2, and output is in the form of a series of graphs in a PDF file in the same folder where the input file was found. (See below for more information on Input and Output files.)
Parameters: Default values for each parameter are shown in the sample call above.
TimeCol: Lists the column(s) containing the date and time. Specify one column (a scalar) if the date and time are in a single column. Specify two time columns as a vector, c(1,2), if the date is in one column and the time is in another. The expected time format is given by the timeFormat parameter.
timeFormat: Using the R time-formatting codes, specify how your dates are formatted. Default for a 1 column time is "%Y%m%d%H%M". A two column time will be concatenated without spaces, and your specified format applied: DateTime. See strptime function for R time-formatting codes.
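The format codes can be tried out in base R before running CAT. The sketch below parses a one-column timestamp in the default format, then a two-column timestamp pasted together without a space; the sample values and the second format string are made up for illustration.

```r
# One-column timestamp in the default format.
one_col <- "201403151430"
t1 <- as.POSIXct(strptime(one_col, format = "%Y%m%d%H%M", tz = "GMT"))

# Two-column timestamp: date and time are concatenated without a space,
# so the format string is written without a separator as well.
date_col <- "15/03/2014"
time_col <- "14:30"
t2 <- as.POSIXct(strptime(paste0(date_col, time_col),
                          format = "%d/%m/%Y%H:%M", tz = "GMT"))

# Both parse to the same instant.
t1 == t2
```

If strptime returns NA for your own data, the format string does not match the text, and CAT is unlikely to read the times correctly either.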
lum: The column number containing luminance values, or NA. Luminance values are used to determine where the light level drops sharply, and this point is used as the starting point for analysis. Data points prior to dark onset are not used. (This can be reversed to use only data after light onset by setting Darkness=0.)
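To illustrate the idea of starting at dark onset, here is a small base R sketch. It is not CAT's actual detection code: it simply assumes that readings below 10 mean dark (the threshold mentioned under Darkness) and takes the first light-to-dark transition as the starting point. The luminance values are made up.

```r
lum <- c(120, 118, 115, 4, 3, 2, 3, 110, 112)   # made-up luminance readings

# Assumption for this sketch: values >= 10 are light, values < 10 are dark.
is_light <- lum >= 10

# A light-to-dark transition is light at position i and dark at i + 1.
onset <- which(diff(is_light) == -1)[1] + 1

# Data before dark onset are dropped; analysis would start at `onset`.
kept <- lum[onset:length(lum)]
```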
valCols: Specify which columns contain non-count-type data, such as temp or blood pressure, that should be averaged when binned. This is handled differently than count-type data. Specify valCols=c() if none.
sumCols: Specify which columns contain count-type data -- activity counts, for example, that should be summed when binned. Specify sumCols=c() if none.
Avg: A Boolean to indicate if you would like to see the output of an average of all data columns. If you tell CAT to analyze columns 4:8, and specify average, in addition to analyzing each column from 4 to 8, the columns will also be averaged, and that average will be analyzed.
NEq: Optional. Only needed when a certain kind of binning is required, placing the average of a range of data points at the Midpoint of that Interval. When using NEq, specify the data column in valCols. Only one column at a time.
bin: Should binning be done (only applies to the Plex).
Interval: The range of times over which to average data, in hours.
Midpoint: The midpoint time of the interval, where the mean is placed. This is specified in hours from the beginning of the interval.
start: The time at which to start analysis. Specified in timeFormat.
end: The time at which to end analysis. Specified in timeFormat.
Plex: Optional parameters for Plexogram. Other functions will not run at the same time. NEq can be specified with Plex if binning of data is needed.
plex: Should a Plexogram be run.
nClasses: The number of classes in each fold.
foldLen: The length of the fold, in hours.
RefTime: The time at which to start analysis. Specified in timeFormat.
yLab: The label on the y-axis of the plot.
ySet: The range on the y-axis of the plot.
export: Boolean. Default is False. If True, a data file is saved after interpolation and binning (per parameters). When True, each function (except Actogram) exports results to (separate) comma-delimited text files.
sizePts, binPts: sizePts is the number of minutes between samples. binPts is the number of samples to aggregate into one bin. Binning is very flexible since it can be so important. sizePts * binPts = number of minutes in each bin. Only full bins are used for analysis, so there could be a few data points at the end of the data (after binEnd) that are not used.
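The bin arithmetic can be worked through in base R. This sketch uses made-up data and is only an illustration of the rules described above (count-type data summed, value-type data averaged, incomplete trailing bin dropped); it is not CAT's internal code.

```r
sizePts <- 2                       # minutes between samples
binPts  <- 5                       # samples per bin
bin_minutes <- sizePts * binPts    # minutes covered by each bin

# Twelve made-up samples: two full bins plus two leftover points.
counts <- c(3, 0, 2, 5, 1,  4, 4, 0, 1, 3,  7, 1)
temps  <- c(36.0, 36.2, 36.4, 36.6, 36.8,
            37.0, 37.0, 37.0, 37.0, 37.0,
            37.5, 37.6)

# Only full bins are used; the trailing partial bin is discarded.
n_full <- (length(counts) %/% binPts) * binPts
bin_id <- rep(seq_len(n_full / binPts), each = binPts)

summed   <- tapply(counts[seq_len(n_full)], bin_id, sum)    # like sumCols
averaged <- tapply(temps[seq_len(n_full)],  bin_id, mean)   # like valCols
```

With sizePts=2 and binPts=5, each bin covers 10 minutes, and the last two samples here are not used.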
Interval, Increment: These two parameters are used together to specify a progressive analysis. The interval is the length of subsections of data to analyze, and the increment is how far to move ahead in the data to begin the next span. (Interval will need to be large enough not to trigger the error messages in the functions, which require more than 3 days for each analysis.) The entire data set will be analyzed (from LumStart to binEnd). A progressive analysis can be performed by the Auto-Correlation and Cross-Correlation. The Actogram and Smoothing functions are performed on the full dataset length, for each column, as normal. There is no benefit to viewing these graphs in subsections.
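The way Interval and Increment carve a data set into overlapping spans can be sketched as follows. The numbers are made up, the units are left as abstract positions in the data, and dropping a trailing partial span is an assumption of this sketch rather than documented CAT behavior.

```r
n         <- 100   # number of data points available (made up)
Interval  <- 40    # length of each span
Increment <- 20    # how far to move ahead for the next span

# Start of each span, keeping only spans that fit entirely in the data.
starts <- seq(1, n - Interval + 1, by = Increment)
ends   <- starts + Interval - 1

spans <- data.frame(start = starts, end = ends)
spans
```

Here consecutive spans overlap by Interval - Increment = 20 points; setting Increment equal to Interval would give non-overlapping spans.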
k: Only the Smoothing function uses this parameter. It is a count of the number of data points on each side of a point to include in the moving average. The moving average is calculated using 2k+1 data points.
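A centered moving average over 2k+1 points can be reproduced in base R with stats::filter. This is a sketch of the technique only; how CAT treats the first and last k points is not stated here, so the NA values at the edges below are a property of stats::filter, not necessarily of CAT.

```r
k <- 2
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)   # made-up data

# Equal weights over the point itself plus k neighbors on each side.
w <- rep(1 / (2 * k + 1), 2 * k + 1)

# sides = 2 centers the window on each point.
smoothed <- as.numeric(stats::filter(x, w, sides = 2))
smoothed
```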
yLab: Label for Y axis on Smoothing and Actogram functions. Default is "Activity Level (au)".
modulo: Only the Actogram function uses this parameter. It specifies the width in minutes to be used for displaying the Actogram. Default is 1440 min, or 1 day.
funcName: A short name for this run, used to differentiate multiple runs that are otherwise quite similar. Appears in the filenames, and in the .rtf file.
Rverbose, RmaxGap: Rverbose is for debug and can take on values of -1, 0, 1 or 2. 0 turns off debug information. 1 or 2 add increasing amounts of debug information. -1 displays minimal information on graphs. RmaxGap specifies the maximum allowable number of missing data points in any one block. An error will be returned if gaps larger than this are found in either data file.
Skip, header: These are housekeeping parameters.
Skip is a parameter to the R read.table function. Default is 0. It is needed if you have multiple lines of header – the first row will be read as the header (unless you set header=FALSE) and Skip indicates how many rows to skip before reading data. header indicates if the file columns have a header row. Default=FALSE. Headers are used to name variables.
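The effect of these two settings can be checked with read.table directly. The sketch below shows base R's own behavior (lines are skipped first, then the next line is read as the header) on made-up data; how CatCall forwards Skip and header to read.table has not been verified here.

```r
raw <- c(
  "Exported by logger v1",     # extra line that should be skipped
  "time,lum,act",              # header row
  "201403151430,120,3",
  "201403151432,118,0"
)

# skip = 1 drops the first line; header = TRUE names the columns
# from the next line. The time column is kept as text for strptime.
dat <- read.table(text = raw, sep = ",", skip = 1, header = TRUE,
                  colClasses = c("character", "numeric", "numeric"))
names(dat)
```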
(Function): Default is FALSE. For any function in Table 2, specifying (Function)=TRUE will cause it to run, thus you select only the functions you need for any purpose. For example: Actogram=TRUE, Smoothing=FALSE, AutoCorr=FALSE, CrossCorr=TRUE, ...
Console: Default is FALSE. When Console=TRUE output will be redirected to the RStudio Console, instead of an output file.
Graphics: Results of CatCall are sent to a file when Console=FALSE. Default file output type is "pdf". Possible values: "jpg", "pdf", "tif", "png". See Output Data below for more information on output.
Darkness: This refers to the illumination column in the first file. CAT analysis and graphing begins at darkness onset, as indicated by the luminance column. Normally, darkness is indicated by a very small number (<10) and light is a large number (>=10). If this needs to be reversed, changing the Darkness defaults will correct the interpretation of the luminance column for the respective files. Darkness=0 means that darkness is a small number (<10). Darkness=1 indicates light is a very small number, and darkness is a large number (>=10).
LagPcnt: Specifies the maximal lag used to calculate the Autocorrelation and Crosscorrelation functions, expressed as a proportion of the number of data points (the default, .33, corresponds to 33%).
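As an illustration of how a proportion translates into a maximal lag, the sketch below uses base R's acf on simulated data. Rounding down with floor is an assumption of this sketch; the rounding rule CAT uses is not stated here.

```r
set.seed(1)
# Simulated series: 144 points with a 24-point cycle plus noise.
x <- sin(2 * pi * (1:144) / 24) + rnorm(144, sd = 0.2)

LagPcnt <- 0.33
max_lag <- floor(LagPcnt * length(x))   # assumed rounding rule

# acf returns values for lags 0 through max_lag.
ac <- acf(x, lag.max = max_lag, plot = FALSE)
length(ac$acf)
```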
tz: R timezone code. Default GMT should be used in most cases.
title: A title for the run. Appears on the PDF output, and plots.
fileName: You can specify one or two files. Only one file is needed for most functions, but the Crosscorrelation requires two files. Crosscorrelation will perform a column-by-column crosscorrelation, so both files must have the same number of columns.
file2: Optional list of parameters for the second data file, mirroring those for the first file. Required only for the Crosscorrelation function. Default:
file2=list(Name=NULL,TimeCol=1, timeFormat="%Y%m%d%H%M", lum=4, valCols=c(3,4), sumCols=c(5,6),sizePts=2, binPts=5,Darkness=0)
Input Data:
Input data is assumed to be equidistant. All columns are expected to be numeric.
Data File format: Tab- or comma-delimited (.txt) file with the following columns: time, luminance, data col1 [, data col2] . . .[, data coln]
Cross-Correlation requires 2 data files, where other functions require only one data file. A single data file with many data columns can be specified for analysis, in which case the Cross-Correlation function is skipped; or 2 data sets can be analyzed. In all cases, interpolation is done to fill in missing data points; and then analysis is done on each specified data column in a file, as well as on the average of all columns individually analyzed (if selected).
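Filling a missing sample on an equidistant grid can be sketched with base R's approx. Linear interpolation is an assumption made for this example; the interpolation method CAT applies is not specified here, and the data are made up.

```r
# Samples every 2 minutes, with the reading at minute 6 missing.
t_obs <- c(0, 2, 4, 8, 10)
y_obs <- c(10, 12, 14, 18, 20)

# Rebuild the full equidistant time grid.
t_grid <- seq(min(t_obs), max(t_obs), by = 2)

# Linearly interpolate onto the grid (assumed method).
filled <- approx(t_obs, y_obs, xout = t_grid)$y
filled
```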
Output Data:
Sample graphics output file: The output file is a graphics file. See Output section on the web site, or Vignette folders, for a sample of a full output file. All output filenames contain the input data filename to clearly identify the data file under analysis, and a timestamp to show the time of analysis. Each graphic file also lists the column name being analyzed (or averaged), and the starting and ending times of analysis, as they vary slightly from the full data set (Lum to binEnd).
Possible output file types: jpg, pdf, tif, png
Binned data: Two different binning methods are possible. Using parameters sizePts=2 and binPts=5, the input data is interpolated and (optionally) binned. This transformed data can be exported using the export parameter. If export=True then each function (except Actogram) exports a file with the results of the function.
A second type of binning is done when the parameters in NEq() are used. This type always sums the data in each bin and finds an average. No interpolation is done. This binning works with the Plexogram function, if binning is needed.
For help Debugging Errors, see Error page.
Information on CAT Cosinor is also available.