mcmc_sampler_internals

Internal functions of the generic MCMC sampler (nothing specific to GW)

Functions

int thermodynamic_integration(double *integrated_likelihoods, double *temps, int temps_N, double *evidence, double *error)
double calculate_evidence(sampler *samplerptr)
void combine_chain_evidence(sampler *samplerptr, double *evidences, int *total_terms, int ensemble_size)
void integrate_likelihood(sampler *samplerptr)
void iterate_fisher(sampler *samplerptr, int chain_id)
int mcmc_step(sampler *sampler, double *current_param, double *next_param, int *current_status, int *next_status, int *current_model_status, int *next_model_status, int chain_number)

interface function between the sampler and the internal step functions

void gaussian_step(sampler *sampler, double *current_param, double *proposed_param, int *current_status, int *proposed_status, int *current_model_status, int *proposed_model_status, int chain_id, int *selected_dimension)

Straight gaussian step.

Change this to pick one direction

Parameters:
  • sampler – Sampler struct

  • current_param – current position in parameter space

  • proposed_param – [out] Proposed position in parameter space

void fisher_step(sampler *sampler, double *current_param, double *proposed_param, int *current_status, int *proposed_status, int *current_model_status, int *proposed_model_status, int chain_index)

Fisher informed gaussian step.

Parameters:
  • sampler – Sampler struct

  • current_param – current position in parameter space

  • proposed_param – [out] Proposed position in parameter space

void update_fisher(sampler *sampler, double *current_param, int *param_status, int *model_status, int chain_index)
void mmala_step(sampler *sampler, double *current_param, double *proposed_param, int *current_status, int *proposed_status, int *current_model_status, int *proposed_model_status, int chain_index)

MMALA informed step — Currently not supported.

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!DON’T TRUST!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

NOTE: This assumes the Fisher doesn’t change between steps. The proposal ratio only accounts for the gradient of the log likelihood

Parameters:
  • sampler – Sampler struct

  • current_param – current position in parameter space

  • proposed_param – [out] Proposed position in parameter space

void diff_ev_step(sampler *sampler, double *current_param, double *proposed_param, int *current_status, int *proposed_status, int *current_model_status, int *proposed_model_status, int chain_id)

differential evolution informed step

Differential evolution uses the past history of the chain to inform the proposed step:

Take the difference of two random, accepted previous steps and step along that with some step size, determined by a gaussian

Parameters:
  • sampler – Sampler struct

  • current_param – current position in parameter space

  • proposed_param – [out] Proposed position in parameter space

void RJ_smooth_history(sampler *sampler, double *current_param, int *current_param_status, int base_history_id, double *eff_history_coord, int *eff_history_status, int chain_id)

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!

Parameters:
  • sampler – Current sampler

  • current_param – Current parameters to match

  • current_param_status – Current parameters to match

  • base_history_id – Original history element

  • eff_history_coord – [out] Modified history coord

  • eff_history_status – [out] Modified History status

  • chain_id – Chain ID of the current chain

void RJ_step(sampler *sampler, double *current_param, double *proposed_param, int *current_status, int *proposed_status, int *current_model_status, int *proposed_model_status, double *MH_corrections, int chain_number)

RJ-proposal step for trans-dimensional MCMCs.

This extra step may seem unnecessary, I’m just adding it in in case the extra flexibility is useful in the future for preprocessing of the chain before sending it to the user’s RJ_proposal

Parameters:
  • sampler – sampler

  • current_param – current coordinates in parameter space

  • proposed_param – [out] Proposed coordinates in parameter space

  • current_status – Current status of parameters

  • proposed_status – [out] Proposed status of parameters

  • current_model_status – Current status of parameters

  • proposed_model_status – [out] Proposed status of parameters

  • chain_number – chain mumber

void chain_swap(sampler *sampler, double ***output, int ***param_status, int **model_status, int step_num, int *swp_accepted, int *swp_rejected)

subroutine to perform chain comparison for parallel tempering

The total output file is passed, and the chains are swapped sequentially

This is the routine for `Deterministic’ sampling (parallel or sequential, but not pooled)

Parameters:
  • sampler – sampler struct

  • output – output vector containing chains

  • param_status – Parameter status

  • model_status – Parameter status

  • step_num – current step number

int single_chain_swap(sampler *sampler, double *chain1, double *chain2, int *chain1_status, int *chain2_status, int *chain1_model_status, int *chain2_model_status, int T1_index, int T2_index)

subroutine to actually swap two chains

This is the more general subroutine, which just swaps the two chains passed to the function

Parameters:
  • sampler – sampler structure

  • chain1 – parameter position of chain that could be changed

  • chain2 – chain that is not swapped, but provides parameters to be swapped by the other chain

  • chain1_status – Parameter status array for chain1

  • chain2_status – Parameter status array for chain2

  • chain1_model_status – Parameter status array for chain1

  • chain2_model_status – Parameter status array for chain2

  • T1_index – number of chain swappe in chain_temps

  • T2_index – number of chain swapper in chain_temps

void assign_probabilities(sampler *sampler, int chain_index)

update and initiate probabilities for each variety of step

Type 0: Gaussian step

Type 1: Differential Evolution step

Type 2: MMALA step (currently not supported)

Type 3: Fisher step

void transfer_chain(sampler *samplerptr_dest, sampler *samplerptr_source, int id_destination, int id_source, bool transfer_output)

Copies contents of one chain to another.

Transfers id_source in samplerptr_source to id_dest samplerptr_dest

NOTE: This copies the VALUE, not the reference. This could be expensive, so use with caution

id_dest is ERASED

samplerptr_dest and samplerptr_source MUST have the same dimension, the same sampling details (like having or not having a fisher) etc

samplerptr_dest must be previously allocated properly

As output is the largest transfer by far, the transfer_output flag can be used to allow the user to handle that manually.

bool check_sampler_status(sampler *samplerptr)

Checks the status of a sampler for the stochastic sampling.

Just loops through the ref_chain_status variables

void update_step_widths(sampler *samplerptr, int chain_id)

Updates the step widths, shooting for 20% acceptance ratios for each type of step.

void allocate_sampler_mem(sampler *sampler)
void deallocate_sampler_mem(sampler *sampler)
void update_history(sampler *sampler, double *new_params, int *new_param_status, int chain_index)
void write_stat_file(sampler *sampler, std::string filename)
void write_checkpoint_file(sampler *sampler, std::string filename)

Routine that writes metadata and final positions of a sampler to a checkpoint file.

int chain_number_from_checkpoint_file(std::string check_file, int *chain_N)

Load chain number from checkpoint file.

int dimension_from_checkpoint_file(std::string check_file, int *min_dimension, int *max_dimension)

Load dimension from checkpoint file.

If RJ — returns min_dimension and max_dimension

If not RJ — returns dimension in min_dimension and max_dimension

void load_checkpoint_file(std::string check_file, sampler *sampler)

load checkpoint file into sampler struct

NOTE — allocate_sampler called in function — MUST deallocate manually

NOTE — sampler->chain_temps allocated internally — MUST free manually

void load_temps_checkpoint_file(std::string check_file, double *temps, int chain_N)

load temperatures from checkpoint file

Assumed the temps array is already allocated in memory for the correct number of chains

Just a utility routine to read temperatures from checkpoint file

It would be easy to read in the chain number and allocate memory in the function, but I prefer to leave allocation/deallocation up to the client

void assign_ct_p(sampler *sampler, int step, int chain_index, int gauss_dim)
void assign_ct_m(sampler *sampler, int step, int chain_index, int gauss_dim)
void assign_initial_pos(sampler *samplerptr, double *initial_pos, int *initial_status, int initial_model_status, double **ensemble_initial_pos, int **ensemble_initial_status, int *ensemble_initial_model_status, double *seeding_var)
double PT_dynamical_timescale(int t0, int nu, int t)

Timescale of the PT dynamics.

kappa in the the language of arXiv:1501.05823v3

Parameters:
  • t0 – Timescale of the dyanmics

  • nu – Initial amplitude (number of steps to base dynamics on)

  • t – current time

void update_temp_neighborhoods(sampler *samplerptr)
void update_temperatures_full_ensemble(sampler *samplerptr, int t0, int nu, int t)
void update_temperatures(sampler *samplerptr, int t0, int nu, int t)

updates the temperatures for a sampler such that all acceptance rates are equal

Follows the algorithm outlined in arXiv:1501.05823v3

Fixed temperatures for the first and last chain

used in MCMC_MH_dynamic_PT_alloc_internal

For defined results, this should be used while the sampler is using non-pooling methods

void initiate_full_sampler(sampler *sampler_new, sampler *sampler_old, int chain_N_thermo_ensemble, int chain_N, std::string chain_allocation_scheme, std::string checkpoint_file_start)

For the dynamic PT sampler, this is the function that starts the full sampler with the max number of chains.

The output file will be reused, but the positions are set back to zero (copying the current position to position zero)

Assumes the output, chain_temps have been allocated in memory for the final number of chains chain_N and steps N_steps

Allocates memory for the new sampler sampler_new -> it’s the user’s responsibility to deallocate with deallocate_sampler_mem

checkpoint_file_start is the file used for continue_PTMCMC like functions, and can be used if the user wants to maintain the approximate positions and histories of the ensemble of chains after the dynamic temperature allocation. If the rest of the chains should just be copies of the base ensemble, pass an empty string (“”). Since the chain distribution can change the number and distribution of chain temperatures (that’s kinda the point..), the transfer is not perfect. Cold chains are prioritized, then the rest of the chains are filled in as close as possible.

Parameters:
  • sampler_old – Dynamic sampler

  • chain_N_thermo_ensemble – Number of chains used in the thermodynamic ensemble

  • chain_N – Number of chains to use in the static sampler

  • chain_allocation_scheme – Scheme to use to allocate any remaining chains

  • checkpoint_file_start – Base checkpoint file

void copy_base_checkpoint_properties(std::string check_file, sampler *samplerptr)

Copies positions and histories into chain ensemble, skipping the first set of temperatures.

NOTE — allocate_sampler called in function — MUST deallocate manually

NOTE — sampler->chain_temps allocated internally — MUST free manually

!ASSUMPTIONS! —

The checkpoint file and new sampler must have the same total number of chains, dimension, and the histories are the same length

The temperatures are in ascending order, as output in write_checkpoint

void write_output_file(std::string file, int step_num, int max_dimension, double ***output, int ***status, int **model_status, int chain_N, int nested_model_number, double *temps, bool RJ)

Utility to write out the parameters and status of a sampler to a file.

void reduce_output(int step_num, int max_dimension, double ***output_old, int ***status_old, int **model_status_old, double **output_new, int **status_new, int **model_status_new, int chain_N, double *temps, bool RJ)
int count_cold_chains(double *temps, int chain_N)
void assign_ensemble_temps(double *chain_temps, int chain_N, int max_chain_N_thermo_ensemble, double TMAX)
void update_kde_cov(sampler *sampler, int chain_id)
void kde_proposal(sampler *sampler, double *current_param, double *proposed_param, int *current_status, int *proposed_status, int *current_model_status, int *proposed_model_status, int chain_id)

KDE step from history file.

TODO: Right now, this assumes a diagonal covariance matrix. Maybe fix this in the future

Parameters:
  • sampler – Sampler struct

  • current_param – current position in parameter space

  • proposed_param – [out] Proposed position in parameter space

Variables

const double limit_inf = -std::numeric_limits<double>::infinity()
class mcmc_data_interface
#include <mcmc_sampler_internals.h>

Public Functions

inline ~mcmc_data_interface()

Public Members

int min_dim
int chain_id
int max_dim
int nested_model_number
int chain_number
double RJ_step_width = 1
bool burn_phase = false
class sampler
#include <mcmc_sampler_internals.h>

Class storing everything that defines an instance of the sampler

Public Members

mcmc_data_interface **interfaces
bool block_sample = false
double block_sample_prob = .5
int block_num = 2
int block_boundary_ids[2] = {7, 11}
bool kde_step = false
bool tune = true
int types_of_steps = 6
double **step_prob
double **prob_boundaries
double *chain_temps
int **swap_partners
int **swap_accepts
bool random_swaps = true
double **chain_neighborhoods
int *chain_neighbors
int **chain_neighborhoods_ids
int *chain_neighbors_ids
int chain_radius = 2
bool restrict_swapping = true
bool isolate_ensembles = false
bool isolate_ensembles_cold = true
double swap_rate = 1. / 2.
bool burn_phase = false
bool ensemble_proposal = true
double ensemble_proposal_rate = 0.25
double ensemble_prop_a = 1.
bool *waiting
bool *restarted_chain
bool *waiting_SWP
std::mutex *queue_mutexes
int *chain_pos
double swp_freq
int chain_N
int numThreads
int N_steps
int dimension
int min_dim
int max_dim
bool fisher_exist
bool proper_fisher = false
bool *de_primed
int *priority
bool *ref_chain_status
int *ref_chain_ids
int ref_chain_num
bool prioritize_cold_chains = false
double ***output
bool pool
int progress = 0
bool show_progress
int num_threads
int history_length = 1000
int history_update = 10
int *current_hist_pos
double ***history
int ***history_status
double *current_likelihoods
int *check_stepsize_freq
double *max_target_accept_ratio
double *min_target_accept_ratio
int *gauss_last_accept_ct
int *gauss_last_reject_ct
int **gauss_last_accept_ct_per_dim
int **gauss_last_reject_ct_per_dim
int *de_last_accept_ct
int *de_last_reject_ct
int *fish_last_accept_ct
int *fish_last_reject_ct
int *kde_last_accept_ct
int *kde_last_reject_ct
int *RJstep_last_accept_ct
int *RJstep_last_reject_ct
int **randgauss_width_number
double ***randgauss_width
double ***fisher_vecs
double **fisher_vals
double ***fisher_vecs_prop
double **fisher_vals_prop
double ***fisher_matrix
double ***fisher_matrix_prop
int *fisher_update_ct
double *prop_MH_factor
int fisher_update_number = 200
double ***kde_cov = NULL
double ***kde_fisher = NULL
double *kde_cov_lndet
int kde_cov_update_number = 200
int *kde_cov_update_ct = NULL
std::function<double(double*, int*, int, mcmc_data_interface*, void*)> lp
std::function<double(double*, int*, int, mcmc_data_interface*, void*)> ll
std::function<void(double*, int*, int, double**, mcmc_data_interface*, void*)> fish
void **user_parameters = NULL
bool local_param_allocation = false
gsl_rng **rvec
int *nan_counter
int *num_gauss
int *num_fish
int *num_de
int *num_mmala
int *num_RJstep
int *num_kde
double time_elapsed_cpu
double time_elapsed_wall
double time_elapsed_cpu_ac
double time_elapsed_wall_ac
int *kde_accept_ct
int *kde_reject_ct
int *fish_accept_ct
int *fish_reject_ct
int *de_accept_ct
int *de_reject_ct
int *gauss_accept_ct
int *gauss_reject_ct
int **gauss_accept_ct_per_dim
int **gauss_reject_ct_per_dim
int *mmala_accept_ct
int *mmala_reject_ct
int *RJstep_accept_ct
int *RJstep_reject_ct
int *swap_accept_ct
int *swap_reject_ct
int *step_accept_ct
int *step_reject_ct
double *thermodynamic_integrated_likelihood
int *thermodynamic_integrated_likelihood_terms
double ***ll_lp_output
bool log_ll = false
bool log_lp = false
int *A
bool PT_alloc = false
bool linear_swapping = true
int ***param_status
bool RJMCMC = false
std::function<void(double*, double*, int*, int*, int*, int*, double*, mcmc_data_interface*, void*)> rj
bool update_RJ_width = true
int **model_status = NULL
int nested_model_number = 0