autocorrelation

Autocorrelation header file

Functions

void write_auto_corr_file_from_data_file(std::string autocorr_filename, std::string datafile, int length, int dimension, int num_segments, double target_corr, int num_threads, bool cumulative)
Parameters:
  • length – length of input data

  • dimension – dimension of data

  • num_segments – number of segements to compute the auto-corr length

  • target_corr – Autocorrelation for which the autocorrelation length is defined (lag of autocorrelation for which it equals the target_corr)

  • num_threads – Total number of threads to use

  • cumulative – Boolean to calculate the autocorrelation cumulatively

void write_auto_corr_file_from_data(std::string autocorr_filename, double **data, int length, int dimension, int num_segments, double target_corr, int num_threads, bool cumulative)

Writes the autocorrelation file from a data array.

First row is the starting index of that segment

Second row is the length of that segment

If cumulative, the ac is calculated in the following format:

|—-—|

|———–—|

|———————-—|

Else, the ac is calculated as :

|—-—|

    |------|

            |--------|

Parameters:
  • autocorr_filename – Name of the file to write the autocorrelation to

  • data – Input chains

  • length – length of input data

  • dimension – dimension of data

  • num_segments – number of segements to compute the auto-corr length

  • target_corr – Autocorrelation for which the autocorrelation length is defined (lag of autocorrelation for which it equals the target_corr)

  • num_threads – Total number of threads to use

  • cumulative – Boolean to calculate the autocorrelation cumulatively

void auto_corr_from_data_batch(double ***data, int length, int dimension, int chain_N, int ***output, int num_segments, double target_corr, int num_threads, bool cumulative)

Calculates the autocorrelation length for a set of data for a number of segments for each dimension — completely host code, utilitizes FFTW3 for longer chuncks of the chains — Batch version for multiple chains at a time.

Takes in the data from a sampler, shape data[chain_N][N_steps][dimension]

Outputs lags that correspond to the target_corr — shape output[chain_N][dimension][num_segments]

If cumulative, the ac is calculated in the following format:

|—-—|

|———–—|

|———————-—|

Else, the ac is calculated as :

|—-—|

    |------|

            |--------|

Parameters:
  • data – Input data

  • length – length of input data

  • dimension – dimension of data

  • output – [out] array that stores the auto-corr lengths — array[num_segments]

  • num_segments – number of segements to compute the auto-corr length

  • target_corr – Autocorrelation for which the autocorrelation length is defined (lag of autocorrelation for which it equals the target_corr)

  • num_threads – Total number of threads to use

  • cumulative – Boolean to calculate the autocorrelation cumulatively

void auto_corr_from_data(double **data, int length, int dimension, int **output, int num_segments, double target_corr, int num_threads, bool cumulative)

Calculates the autocorrelation length for a set of data for a number of segments for each dimension — completely host code, utilitizes FFTW3 for longer chuncks of the chains.

Takes in the data from a sampler, shape data[N_steps][dimension]

Outputs lags that correspond to the target_corr — shape output[dimension][num_segments]

If cumulative, the ac is calculated in the following format:

|—-—|

|———–—|

|———————-—|

Else, the ac is calculated as :

|—-—|

    |------|

            |--------|

Parameters:
  • data – Input data

  • length – length of input data

  • dimension – dimension of data

  • output – [out] array that stores the auto-corr lengths — array[num_segments]

  • num_segments – number of segements to compute the auto-corr length

  • target_corr – Autocorrelation for which the autocorrelation length is defined (lag of autocorrelation for which it equals the target_corr)

  • num_threads – Total number of threads to use

  • cumulative – Boolean to calculate the autocorrelation cumulatively

void threaded_ac_spectral(int thread, threaded_ac_jobs_fft job)

Internal routine to calculate an spectral autocorrelation job.

Allows for a more efficient use of the threadPool class

void threaded_ac_serial(int thread, threaded_ac_jobs_serial job)

Internal routine to calculate a serial autocorrelation job.

Allows for a more efficient use of the threadPool class

double auto_correlation_serial(double *arr, int length, int start, double target)

Calculates the autocorrelation of a chain with the brute force method.

Parameters:
  • arr – input array

  • length – Length of input array

  • start – starting index (probably 0)

  • target – Target autocorrelation for which `length’ is defined

void auto_correlation_spectral_windowed(double *chain, int length, int start, double *autocorr, fftw_outline *plan_forw, fftw_outline *plan_rev)

Autocorrelation calculation following EMCEE’s implementation.

Incorporates a windowing function to reduce noise in the estimate of Tau

int autocorrelation_window(double *tau, int c, int length)
void auto_correlation_spectral(double *chain, int length, double *autocorr, fftw_outline *plan_forw, fftw_outline *plan_rev)

Wrapper function for convience — assumes the data array starts at 0.

void auto_correlation_spectral(double *chain, int length, int start, double *autocorr, fftw_outline *plan_forw, fftw_outline *plan_rev)

Faster approximation of the autocorrelation of a chain. Implements FFT/IFFT — accepts FFTW plan as argument for plan reuse and multi-threaded applications.

Based on the Wiener-Khinchin Theorem.

Algorithm used from https://lingpipe-blog.com/2012/06/08/autocorrelation-fft-kiss-eigen/

NOTE the length used in initializing the fftw plans should be L = pow(2, std::ceil( std::log2(length) ) ) — the plans are padded so the total length is a power of two

Option to provide starting index for multi-dimension arrays in collapsed to one dimension

length is the length of the segment to be analyzed, not necessarily the dimension of the chain

void auto_correlation_spectral(double *chain, int length, double *autocorr)

Faster approximation of the autocorrelation of a chain. Implements FFT/IFFT.

Based on the Wiener-Khinchin Theorem.

Algorithm used from https://lingpipe-blog.com/2012/06/08/autocorrelation-fft-kiss-eigen/

double auto_correlation(double *arr, int length, double tolerance)

OUTDATED — numerically finds autocorrelation length — not reliable.

double auto_correlation_serial_old(double *arr, int length)

OUTDATED Calculates the autocorrelation — less general version.

double auto_correlation_grid_search(double *arr, int length, int box_num = 10, int final_length = 50, double target_length = .01)

OUTDATED — Grid search method of computing the autocorrelation — unreliable.

Hopefully more reliable than the box-search method, which can sometimes get caught in a recursive loop when the stepsize isn’t tuned, but also faster than the basic linear, serial search

Parameters:
  • arr – Input array to use for autocorrelation

  • length – Length of input array

  • box_num – number of boxes to use for each iteration, default is 10

  • final_length – number of elements per box at which the grid search ends and the serial calculation begins

  • target_length – target correlation that corresponds to the returned lag

double auto_correlation_internal(double *arr, int length, int lag, double ave)

Internal function to compute the auto correlation for a given lag.

void auto_corr_intervals_outdated(double *data, int length, double *output, int num_segments, double accuracy)

Function that computes the autocorrelation length on an array of data at set intervals to help determine convergence.

outdated version — new version uses FFTs

Parameters:
  • data – Input data

  • length – length of input data

  • output – [out] array that stores the auto-corr lengths — array[num_segments]

  • num_segments – number of segements to compute the auto-corr length

  • accuracy – longer chains are computed numerically, this specifies the tolerance

void write_auto_corr_file_from_data(std::string autocorr_filename, double **output, int intervals, int dimension, int N_steps)

OUTDATED — writes autocorrelation lengths for a data array, but only with the serial method and only for a target correlation of .01.

void write_auto_corr_file_from_data_file(std::string autocorr_filename, std::string output_file, int intervals, int dimension, int N_steps)

OUTDATED — writes autocorrelation lengths for a data file, but only with the serial method and only for a target correlation of .01.

class threaded_ac_jobs_fft
#include <autocorrelation.h>

Class to contain spectral method jobs.

Public Members

double **data
int *length

Read only &#8212; Data to use &#8212; full chain

int *start

Read only &#8212; length of total data

int *end

Read only &#8212; start index

int dimension

Read only &#8212; end index

fftw_outline *planforward

Read only &#8212; dimension being analyzed

fftw_outline *planreverse

fftw plan to use for spectral method

int *lag

fftw plan to use for spectral method

double *target

READ AND WRITE &#8212; final lag

class threaded_ac_jobs_serial
#include <autocorrelation.h>

Class to contain serial method jobs.

Public Members

double **data
int *length

Read only &#8212; Data to use &#8212; full chain

int *start

Read only &#8212; length of total data

int *end

Read only &#8212; start index

int dimension

Read only &#8212; end index

int *lag

Read only &#8212; dimension being analyzed

double *target

READ AND WRITE &#8212; final lag

class comparator_ac_fft
#include <autocorrelation.h>

comparator to sort ac-jobs

Starts with the longest jobs, then works down the list

Public Functions

inline bool operator()(threaded_ac_jobs_fft t, threaded_ac_jobs_fft k)
class comparator_ac_serial
#include <autocorrelation.h>

comparator to sort ac-jobs

Starts with the longest jobs, then works down the list

Public Functions

inline bool operator()(threaded_ac_jobs_serial t, threaded_ac_jobs_serial k)