Struct cmb_dataset
Defined in File cmb_dataset.h
Struct Documentation
-
struct cmb_dataset
An automatically resizing array of (possibly unordered) sample values, each sample a double.
Public Functions
-
struct cmb_dataset *cmb_dataset_create(void)
Allocate memory for a dataset.
Remember to call a matching cmb_dataset_destroy when done to avoid memory leakage.
- Returns:
A freshly allocated dataset object.
-
void cmb_dataset_initialize(struct cmb_dataset *dsp)
Initialize the dataset, clearing any data values.
- Parameters:
dsp – Pointer to an already allocated dataset object.
-
void cmb_dataset_reset(struct cmb_dataset *dsp)
Re-initialize the dataset, returning it to a newly initialized state.
- Parameters:
dsp – Pointer to an already allocated dataset object.
-
void cmb_dataset_terminate(struct cmb_dataset *dsp)
Un-initialize the dataset, freeing the internal data array and returning it to a newly created state.
- Parameters:
dsp – Pointer to an already allocated dataset object.
-
uint64_t cmb_dataset_copy(struct cmb_dataset *tgt, const struct cmb_dataset *src)
Copy src into tgt, overwriting whatever was in tgt.
- Parameters:
tgt – Pointer to the target dataset object.
src – Pointer to the source dataset object.
- Returns:
Number of data points copied.
-
uint64_t cmb_dataset_merge(struct cmb_dataset *tgt, const struct cmb_dataset *s1, const struct cmb_dataset *s2)
Merge datasets s1 and s2 into dataset tgt. The target may or may not be one of the two sources, but must not be NULL.
- Parameters:
tgt – Pointer to the target dataset object.
s1 – Pointer to the first source dataset object.
s2 – Pointer to the second source dataset object.
- Returns:
Number of data points in the merged data set.
-
void cmb_dataset_destroy(struct cmb_dataset *dsp)
Free memory allocated by cmb_dataset_create for the dataset and its arrays.
Do not call unless the dataset was created on the heap by cmb_dataset_create. Otherwise, only use cmb_dataset_terminate to free the internal data array.
- Parameters:
dsp – Pointer to a previously allocated dataset object.
-
void cmb_dataset_sort(const struct cmb_dataset *dsp)
Sort the data array in ascending order.
- Parameters:
dsp – Pointer to a dataset object.
-
uint64_t cmb_dataset_add(struct cmb_dataset *dsp, double x)
Add a single value to a dataset, resizing the array as needed.
- Parameters:
dsp – Pointer to a dataset object.
x – The new sample value to add.
- Returns:
The new number of data values in the array.
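The "resizing as needed" behavior can be pictured with a minimal, self-contained sketch. This is not the library's implementation: the type sketch_dataset, the function names, the starting capacity of 4, the doubling growth policy and the zero return on allocation failure are all assumptions made for illustration. Only the field names and their initial values follow the Public Members list of struct cmb_dataset.

```c
#include <assert.h>
#include <float.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cmb_dataset; not the library's type. */
struct sketch_dataset {
    uint64_t cursize;  /* allocated capacity, in samples */
    uint64_t count;    /* samples currently stored */
    double min;        /* smallest sample, DBL_MAX while empty */
    double max;        /* largest sample, -DBL_MAX while empty */
    double *xa;        /* data array, NULL until the first add */
};

/* Put the dataset in an empty state without allocating anything. */
void sketch_initialize(struct sketch_dataset *dsp)
{
    assert(dsp != NULL);
    dsp->cursize = 0;
    dsp->count = 0;
    dsp->min = DBL_MAX;
    dsp->max = -DBL_MAX;
    dsp->xa = NULL;
}

/* Append x, growing the array when it is full. Returns the new sample
 * count, or 0 if the allocation failed (assumed error convention). */
uint64_t sketch_add(struct sketch_dataset *dsp, double x)
{
    assert(dsp != NULL);
    if (dsp->count == dsp->cursize) {
        /* Assumed growth policy: start at 4 samples, then double. */
        uint64_t newsize = (dsp->cursize == 0) ? 4 : 2 * dsp->cursize;
        double *p = realloc(dsp->xa, newsize * sizeof *p);
        if (p == NULL) {
            return 0;
        }
        dsp->xa = p;
        dsp->cursize = newsize;
    }
    dsp->xa[dsp->count++] = x;
    if (x < dsp->min) dsp->min = x;
    if (x > dsp->max) dsp->max = x;
    return dsp->count;
}

/* Free the data array and return to the empty state. */
void sketch_terminate(struct sketch_dataset *dsp)
{
    assert(dsp != NULL);
    free(dsp->xa);
    sketch_initialize(dsp);
}
```

Because the minimum starts at DBL_MAX and the maximum at -DBL_MAX, the first sample added always replaces both, so no special case for an empty dataset is needed in the add path.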
-
uint64_t cmb_dataset_summarize(const struct cmb_dataset *dsp, struct cmb_datasummary *dsump)
Calculate summary statistics of the data series.
- Parameters:
dsp – Pointer to a dataset object.
dsump – Pointer to a data summary object to store the results.
- Returns:
The number of data values included in the summary.
-
double cmb_dataset_median(const struct cmb_dataset *dsp)
Calculate and return the median of the dataset.
May be somewhat time-consuming, since it first needs to sort the data array. Calling it on an empty dataset will generate a warning and return zero.
- Parameters:
dsp – Pointer to a dataset object.
- Returns:
The median of the data set, zero if no data yet.
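As a rough illustration of why the median needs a sort first, the following self-contained sketch computes it on a plain array. The names sketch_cmp_double and sketch_median are hypothetical. Sorting a private copy and averaging the two middle values for an even count are assumptions, not statements about how cmb_dataset_median works internally. Only the zero return for empty input follows the description above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles, ascending order. */
int sketch_cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of n samples. Returns 0.0 for an empty array or if the
 * temporary copy cannot be allocated. Sorts a private copy, so the
 * caller's array keeps its original order. */
double sketch_median(const double *xa, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    assert(xa != NULL);
    double *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) {
        return 0.0;
    }
    memcpy(tmp, xa, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, sketch_cmp_double);
    /* Odd count: the middle value. Even count: mean of the two middle values. */
    double med = (n % 2 == 1) ? tmp[n / 2]
                              : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);
    free(tmp);
    return med;
}
```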
-
void cmb_dataset_fivenum_print(const struct cmb_dataset *dsp, FILE *fp, bool lead_ins)
Calculate and print the “five-number” summary of dataset quantiles, i.e., minimum, first quartile, median, third quartile, and maximum.
- Parameters:
dsp – Pointer to a dataset object.
fp – A valid file pointer, possibly stdout.
lead_ins – Flag for whether to add lead-in texts or just print the numeric values.
-
void cmb_dataset_histogram_print(const struct cmb_dataset *dsp, FILE *fp, unsigned num_bins, double low_lim, double high_lim)
Print a simple character-based histogram. Will autoscale to the dataset range if low_lim == high_lim.
Will print the symbol ‘#’ for a full bar “pixel”, ‘=’ for one that is more than half full, and ‘-’ for one that is less than half full.
Adds overflow bins to the ends of the range to catch anything outside.
- Parameters:
dsp – Pointer to a dataset object.
fp – A valid file pointer, possibly stdout.
num_bins – The number of bins, not including the two overflow bins.
low_lim – The lower limit for the bin range.
high_lim – The upper limit for the bin range.
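The role of the two overflow bins can be sketched as a bin-counting step on a plain array. This is only an illustration under stated assumptions: the names sketch_bin_index and sketch_histogram are hypothetical, the choice of which bin edge is inclusive is a guess, and the character rendering done by cmb_dataset_histogram_print is not reproduced.

```c
#include <assert.h>
#include <stddef.h>

/* Map x to a bin index in 0 .. num_bins + 1.
 * Index 0 collects x < low_lim, index num_bins + 1 collects
 * x >= high_lim, and indices 1 .. num_bins split [low_lim, high_lim)
 * into equal widths. Which side of each edge is inclusive is an
 * assumption. x is assumed not to be NaN. */
unsigned sketch_bin_index(double x, unsigned num_bins,
                          double low_lim, double high_lim)
{
    assert(num_bins > 0 && low_lim < high_lim);
    if (x < low_lim) {
        return 0;
    }
    if (x >= high_lim) {
        return num_bins + 1;
    }
    double width = (high_lim - low_lim) / num_bins;
    unsigned i = (unsigned)((x - low_lim) / width);
    if (i >= num_bins) {
        i = num_bins - 1;  /* guard against floating-point rounding */
    }
    return i + 1;
}

/* Count n samples into counts[0 .. num_bins + 1]. */
void sketch_histogram(const double *xa, size_t n, unsigned num_bins,
                      double low_lim, double high_lim, unsigned *counts)
{
    assert(counts != NULL && (xa != NULL || n == 0));
    for (unsigned b = 0; b < num_bins + 2; b++) {
        counts[b] = 0;
    }
    for (size_t k = 0; k < n; k++) {
        counts[sketch_bin_index(xa[k], num_bins, low_lim, high_lim)]++;
    }
}
```

The caller supplies a counts array of num_bins + 2 entries, which mirrors the description of num_bins as not including the two overflow bins.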
-
void cmb_dataset_print(const struct cmb_dataset *dsp, FILE *fp)
Print the raw data values in a single column.
- Parameters:
dsp – Pointer to a dataset object.
fp – A valid file pointer, possibly stdout.
-
void cmb_dataset_ACF(const struct cmb_dataset *dsp, unsigned n, double *acf)
Calculate autocorrelation coefficients.
- Parameters:
dsp – Pointer to a dataset object.
n – The highest lag value to calculate
acf – The array where the ACFs will be stored, size n + 1.
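For readers unfamiliar with autocorrelation coefficients, here is a minimal sketch of one common estimator on a plain array. The function name sketch_acf is hypothetical, and the choice of estimator (lag-k autocovariance divided by the lag-0 autocovariance, both around the overall mean) is an assumption. The source does not state which estimator cmb_dataset_ACF uses. The output layout of n + 1 values for lags 0 through n follows the parameter description above.

```c
#include <assert.h>
#include <stddef.h>

/* Sample autocorrelation for lags 0 .. n, written to acf[0 .. n].
 * Uses r_k = c_k / c_0 with
 *   c_k = sum over t = 0 .. N-k-1 of (x_t - mean) * (x_{t+k} - mean).
 * Requires n < N and a series that is not constant. */
void sketch_acf(const double *xa, size_t N, unsigned n, double *acf)
{
    assert(xa != NULL && acf != NULL && N > n);

    double mean = 0.0;
    for (size_t t = 0; t < N; t++) {
        mean += xa[t];
    }
    mean /= (double)N;

    /* Lag-0 autocovariance sum, used to normalize every lag. */
    double c0 = 0.0;
    for (size_t t = 0; t < N; t++) {
        c0 += (xa[t] - mean) * (xa[t] - mean);
    }
    assert(c0 > 0.0);

    for (unsigned k = 0; k <= n; k++) {
        double ck = 0.0;
        for (size_t t = 0; t + k < N; t++) {
            ck += (xa[t] - mean) * (xa[t + k] - mean);
        }
        acf[k] = ck / c0;
    }
}
```

With this estimator the lag-0 coefficient is always 1, which is why the output array needs n + 1 entries rather than n.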
-
void cmb_dataset_PACF(const struct cmb_dataset *dsp, unsigned n, double *pacf, double *acf)
Calculate partial autocorrelation coefficients.
The first and most time-consuming step in the algorithm is to calculate the ACFs. If these already have been calculated, they can be given as the last argument acf[]. If this argument is NULL, they will be calculated directly from the dataset during the call.
- Parameters:
dsp – Pointer to a dataset object.
n – The highest lag value to calculate.
pacf – The array where the PACFs will be stored, size n + 1.
acf – Array of ACFs if already calculated, size n + 1, otherwise NULL.
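The step from ACFs to PACFs can be sketched with the Durbin-Levinson recursion, a standard way to derive partial autocorrelations from autocorrelations. Whether cmb_dataset_PACF uses this particular recursion is an assumption, and the function name sketch_pacf_from_acf and its integer return code are hypothetical. The array sizes of n + 1 follow the parameter descriptions above.

```c
#include <assert.h>
#include <stdlib.h>

/* Partial autocorrelations for lags 0 .. n from ACFs acf[0 .. n],
 * written to pacf[0 .. n]. Returns 0 on success, -1 if the work
 * arrays cannot be allocated. Assumes the recursion's denominator
 * never reaches zero, i.e. the series is not perfectly predictable. */
int sketch_pacf_from_acf(const double *acf, unsigned n, double *pacf)
{
    assert(acf != NULL && pacf != NULL);
    pacf[0] = 1.0;
    if (n == 0) {
        return 0;
    }

    /* prev holds the order k-1 coefficients, cur the order k ones. */
    double *prev = calloc((size_t)n + 1, sizeof *prev);
    double *cur = calloc((size_t)n + 1, sizeof *cur);
    if (prev == NULL || cur == NULL) {
        free(prev);
        free(cur);
        return -1;
    }

    prev[1] = acf[1];
    pacf[1] = acf[1];
    for (unsigned k = 2; k <= n; k++) {
        double num = acf[k];
        double den = 1.0;
        for (unsigned j = 1; j < k; j++) {
            num -= prev[j] * acf[k - j];
            den -= prev[j] * acf[j];
        }
        cur[k] = num / den;
        for (unsigned j = 1; j < k; j++) {
            cur[j] = prev[j] - cur[k] * prev[k - j];
        }
        pacf[k] = cur[k];
        /* The order-k coefficients become the input for order k + 1. */
        double *tmp = prev;
        prev = cur;
        cur = tmp;
    }
    free(prev);
    free(cur);
    return 0;
}
```

The recursion itself costs on the order of n squared operations and never touches the samples, which fits the remark that calculating the ACFs is the most time-consuming step and is worth reusing when already available.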
-
void cmb_dataset_correlogram_print(const struct cmb_dataset *dsp, FILE *fp, unsigned n, double *acf)
Print a simple correlogram of the autocorrelation coefficients previously calculated, either ACFs or PACFs.
If the data vector acf[] is NULL, ACFs will be calculated directly from the dataset by calling cmb_dataset_ACF.
To print PACFs, give a vector of PACFs as the acf argument.
- Parameters:
dsp – Pointer to a dataset object.
fp – A valid file pointer, possibly stdout.
n – The highest lag value to calculate.
acf – Array of previously calculated ACFs or PACFs to print, size n + 1, or NULL to calculate the ACFs from the dataset.
Public Members
-
uint64_t cookie
A “magic cookie” to catch uninitialized objects
-
uint64_t cursize
The currently allocated space as a number of samples
-
uint64_t count
The current number of samples in the array
-
double min
Smallest sample, initially DBL_MAX
-
double max
Largest sample, initially -DBL_MAX
-
double *xa
Pointer to the actual data array, initially NULL
Public Static Functions
-
static inline uint64_t cmb_dataset_count(const struct cmb_dataset *dsp)
Count the number of data values.
- Parameters:
dsp – Pointer to a dataset object.
- Returns:
The number of data values in the data set.
-
static inline double cmb_dataset_min(const struct cmb_dataset *dsp)
The minimum sample value in the dataset.
- Parameters:
dsp – Pointer to a dataset object.
- Returns:
The minimum data value in the data set, DBL_MAX if no data yet.
-
static inline double cmb_dataset_max(const struct cmb_dataset *dsp)
The maximum sample value in the dataset.
- Parameters:
dsp – Pointer to a dataset object.
- Returns:
The maximum data value in the data set, -DBL_MAX if no data yet.
-