bayesvalidrox.bayes_inference.mcmc.MCMC
- class bayesvalidrox.bayes_inference.mcmc.MCMC(engine, mcmc_params, discrepancy, observation=None, out_names=None, selected_indices=None, use_emulator=False, out_dir='')
Bases: PostSampler
A class for Bayesian inference via a Markov chain Monte Carlo (MCMC) sampler, used to approximate the posterior distribution given by Bayes' theorem:
$$p(\theta|\mathcal{Y}) = \frac{p(\mathcal{Y}|\theta)\, p(\theta)}{p(\mathcal{Y})}.$$
This class performs inference with the emcee package [1], using an Affine Invariant Ensemble Sampler (AIES) [2].
- [1] Foreman-Mackey, D., Hogg, D.W., Lang, D. and Goodman, J., 2013. emcee: the MCMC hammer. Publications of the Astronomical Society of the Pacific, 125(925), p.306. https://emcee.readthedocs.io/en/stable/
- [2] Goodman, J. and Weare, J., 2010. Ensemble samplers with affine invariance. Communications in Applied Mathematics and Computational Science, 5(1), pp.65-80.
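To illustrate the affine-invariant ensemble idea described in [2], here is a minimal, self-contained sketch of the stretch move for a one-dimensional standard-normal target. It is not the emcee or bayesvalidrox implementation. The names `log_target` and `stretch_move_sampler`, the scale `a = 2.0`, and the walker and step counts are illustrative assumptions.

```python
import math
import random


def log_target(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


def stretch_move_sampler(log_prob, n_walkers=20, n_steps=2000, a=2.0, seed=0):
    """Sample a 1D target with the stretch move (hypothetical helper).

    Returns the positions of all walkers after every update, including the
    early, not yet converged part of the chains.
    """
    rng = random.Random(seed)
    ndim = 1
    walkers = [rng.uniform(-1.0, 1.0) for _ in range(n_walkers)]
    log_p = [log_prob(w) for w in walkers]
    samples = []
    for _ in range(n_steps):
        for k in range(n_walkers):
            # Pick a different walker j from the rest of the ensemble.
            j = rng.randrange(n_walkers - 1)
            if j >= k:
                j += 1
            # Draw z from g(z) proportional to 1/sqrt(z) on [1/a, a].
            z = ((a - 1.0) * rng.random() + 1.0) ** 2 / a
            proposal = walkers[j] + z * (walkers[k] - walkers[j])
            log_p_new = log_prob(proposal)
            # Accept with probability min(1, z**(ndim - 1) * p(new) / p(old)).
            log_accept = (ndim - 1) * math.log(z) + log_p_new - log_p[k]
            if rng.random() < math.exp(min(0.0, log_accept)):
                walkers[k] = proposal
                log_p[k] = log_p_new
            samples.append(walkers[k])
    return samples
```

Because each proposal is built from the positions of two walkers, the move needs no hand-tuned proposal covariance. This is the property that makes ensemble samplers convenient for poorly scaled posteriors.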
Attributes
- engine: bvr.Engine
Engine object that contains the surrogate, model and exp_design.
- mcmc_params: dict
Dictionary of parameters for the MCMC. Required are:
- prior_samples: np.array of size [Nsamples, ndim]
Samples of the parameters to infer. If given, the walkers are initialized with equally spaced values within the bounds of these samples, and no burn-in is performed. Default is None, in which case the walkers are initialized randomly and a burn-in period is used.
- n_steps: int
Number of steps/samples to generate for each walker.
- n_walkers: int
Number of walkers/independent chains in the ensemble.
- n_burn: int
Number of samples in the burn-in period.
- moves: Obj
Sampling strategy that determines how new points are proposed. Must be a valid emcee move object. The following options are available (see the emcee documentation for details: https://emcee.readthedocs.io/en/stable/user/moves/):
emcee.moves.KDEMove()
emcee.moves.DEMove()
emcee.moves.StretchMove()
emcee.moves.DESnookerMove()
emcee.moves.WalkMove()
None (default). If None is given, emcee uses StretchMove().
- multiprocessing: bool
True to parallelize the different walkers. Default is False.
- verbose: bool
- discrepancy: object, optional
Object of class bvr.Discrepancy. The default is None.
- observation: dict, optional
Measurement/observation to use as reference. The default is None.
- out_names: list, optional
The list of requested output keys to be used for the analysis. The default is None. If None, all the defined outputs from the engine are used.
- selected_indices: dict, optional
A dictionary with the selected indices of each model output. The default is None. If None, all measurement points are used in the analysis.
- use_emulator: bool
Set to True if the emulator/metamodel should be used in the analysis. If False, the model is run.
- out_dir: string
Directory to write the outputs to.
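The entries described for mcmc_params can be collected in a plain dictionary. The sketch below uses hypothetical placeholder values, not recommended settings. The key spelling `multiprocessing` is an assumption and should be checked against the installed version.

```python
# Hypothetical settings; key names follow the mcmc_params description.
mcmc_params = {
    "prior_samples": None,     # None: walkers start randomly, with burn-in
    "n_steps": 10_000,         # steps/samples generated for each walker
    "n_walkers": 30,           # number of walkers in the ensemble
    "n_burn": 500,             # samples in the burn-in period
    "moves": None,             # None: emcee falls back to StretchMove()
    "multiprocessing": False,  # assumed key spelling; parallelize walkers
    "verbose": False,
}
```

A dictionary like this would be passed as the `mcmc_params` argument of the class constructor, together with an engine and a discrepancy object that are not constructed here.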
- __init__(engine, mcmc_params, discrepancy, observation=None, out_names=None, selected_indices=None, use_emulator=False, out_dir='')
Methods
- __init__(engine, mcmc_params, discrepancy[, ...])
- calculate_loglik_logbme(model_evals[, ...]): Calculate log-likelihoods and logbme on the perturbed data.
- eval_model(theta): Evaluates the (meta-)model at the given theta.
- is_valid_move(): Checks whether the user-provided move is a valid emcee move.
- log_likelihood(theta): Computes the log-likelihood, i.e. the log of \( p(\mathcal{Y}|\theta) \), of the (meta-)model reproducing the observation data.
- log_posterior(theta): Computes the log of the posterior \( p(\theta|\mathcal{Y}) \) for the given parameter set.
- log_prior(theta): Calculates the log of the prior \( p(\theta) \) for the given parameter set(s) \( \theta \).
- normpdf(outputs[, std_outputs, rmse]): Calculates the log-likelihood of simulation outputs compared with observation data.
- Run the MCMC sampler for the given observations and standard deviations.
- calculate_loglik_logbme(model_evals, surr_error=None, std_outputs=None) → tuple[ndarray, ndarray]
Calculate log-likelihoods and logbme on the perturbed data. This function assumes that all distributions are Gaussian.
Parameters
- model_evals: dict
Model or metamodel outputs as a dictionary.
- surr_error: dict, optional
A dictionary containing the root mean squared error as an array of shape (n_samples, n_measurement) for each model output. The default is None.
- std_outputs: dict of 2d np arrays, optional
Standard deviation (uncertainty) associated with the output. The default is None.
Returns
- log_likelihood: np.ndarray
The calculated log-likelihoods. Size: (n_samples, n_bootstrap_itr).
- log_bme: np.ndarray
The log BME. This also accounts for metamodel error if self.use_emulator is True. Size: (1, n_bootstrap_itr).
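The Bayesian model evidence (BME) is the likelihood averaged over the prior, so its logarithm can be estimated as the log of the mean of the exponentiated log-likelihoods. The following sketch shows a numerically stable way to do this. It assumes the log-likelihoods were evaluated on samples drawn from the prior, ignores the bootstrap dimension, and uses a hypothetical helper name. It is not the library's implementation.

```python
import math


def log_bme_from_logliks(log_likelihoods):
    """Return log(mean(exp(log_likelihoods))) without overflow or underflow."""
    finite = [ll for ll in log_likelihoods if ll != -math.inf]
    if not finite:
        return -math.inf
    # Shift by the maximum so that the largest exponent is exp(0) = 1.
    shift = max(finite)
    total = sum(math.exp(ll - shift) for ll in finite)
    # Samples with zero likelihood still count in the mean's denominator.
    return shift + math.log(total / len(log_likelihoods))
```

Shifting by the maximum keeps the computation finite even when every log-likelihood is strongly negative, which is common with many measurement points.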
- eval_model(theta) → tuple[dict, dict]
Evaluates the (meta-)model at the given theta.
Parameters
- theta: array of shape (n_samples, n_params)
Parameter set, i.e. proposals of the MCMC chains.
Returns
- mean_pred: dict
Mean model prediction.
- std_pred: dict
Std of model prediction.
- is_valid_move() → bool
Checks whether the user-provided move is a valid emcee move.
Raises
- ValueError
If the move is not a valid emcee move.
Returns
- bool
True if the move is valid.
- log_likelihood(theta) → ndarray
Computes the log-likelihood, i.e. the log of \( p(\mathcal{Y}|\theta) \), of the (meta-)model reproducing the observation data.
Parameters
- theta: array of shape (n_samples, n_params)
Parameter set, i.e. proposals of the MCMC chains.
Returns
- log_like: np.ndarray
Log-likelihood. Shape: (n_samples)
- log_posterior(theta) → ndarray
Computes the log of the posterior \( p(\theta|\mathcal{Y}) \) for the given parameter set.
Parameters
- theta: array of shape (n_samples, n_params)
Parameter set, i.e. proposals of the MCMC chains.
Returns
- log_like: np.ndarray
Log-posterior. Shape: (n_samples)
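Since MCMC only needs posterior ratios, the evidence term can be dropped and the log-posterior evaluated up to a constant as the log-prior plus the log-likelihood. The sketch below shows this combination for a single parameter set. The function name and its callable arguments are hypothetical and do not mirror the class internals.

```python
import math


def log_posterior_sketch(theta, log_prior, log_likelihood):
    """Unnormalized log-posterior: log p(theta) + log p(Y | theta)."""
    lp = log_prior(theta)
    if lp == -math.inf:
        # Outside the prior support: skip the possibly expensive model run.
        return -math.inf
    return lp + log_likelihood(theta)
```

Returning early for points outside the prior support avoids evaluating the model or surrogate for proposals that would be rejected anyway.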
- log_prior(theta) → ndarray
Calculates the log of the prior \( p(\theta) \) for the given parameter set(s) \( \theta \).
Parameters
- theta: array of shape (n_samples, n_params)
Parameter sets, i.e. proposals of MCMC chains.
Returns
- logprior: np.ndarray
Log-prior. If theta has only one row, a single value is returned; otherwise an array of shape (n_samples) is returned.
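The return convention described here (a scalar for one row, an array otherwise) can be illustrated with a hypothetical uniform box prior. The prior family, the function name, and the `lower`/`upper` arguments are assumptions made for this sketch. It uses plain lists instead of NumPy arrays to stay self-contained.

```python
import math


def log_uniform_prior(theta, lower, upper):
    """Log-density of independent uniform priors on [lower_i, upper_i].

    theta is a list of parameter sets (rows). Returns a float for a single
    row and a list with one entry per row otherwise.
    """
    log_volume = sum(math.log(u - l) for l, u in zip(lower, upper))
    values = []
    for row in theta:
        inside = all(l <= x <= u for x, l, u in zip(row, lower, upper))
        # Constant density inside the box, zero density outside.
        values.append(-log_volume if inside else -math.inf)
    return values[0] if len(values) == 1 else values
```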
- normpdf(outputs, std_outputs=None, rmse=None) → ndarray
Calculates the log-likelihood of simulation outputs compared with observation data.
Parameters
- outputs: dict
The metamodel outputs as an array of shape (n_samples, n_measurement) for each model output.
- std_outputs: dict of 2d np arrays, optional
Standard deviation (uncertainty) associated with the output. The default is None.
- rmse: dict, optional
A dictionary containing the root mean squared error as an array of shape (n_samples, n_measurement) for each model output. The default is None.
Returns
- logLik: np.ndarray
Log-likelihoods. Shape: (n_samples)
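As a rough illustration of a Gaussian log-likelihood of this kind, the sketch below treats the measurement points as independent and adds the squared surrogate error to the measurement variance. Both choices are simplifying assumptions, as are the function name, the `sigma2` argument, and the use of one rmse value per measurement point. The actual method may build its covariance differently.

```python
import math


def gaussian_loglik(outputs, observation, sigma2, rmse=None):
    """Independent-Gaussian log-likelihood for each sample (row) of outputs.

    outputs: list of rows, one per sample, each with n_measurement values.
    observation, sigma2: lists of length n_measurement.
    rmse: optional list with one surrogate error per measurement point.
    """
    log_liks = []
    for row in outputs:
        total = 0.0
        for i, (y, d, s2) in enumerate(zip(row, observation, sigma2)):
            # Total variance: measurement variance plus squared surrogate error.
            var = s2 + (rmse[i] ** 2 if rmse is not None else 0.0)
            total += -0.5 * (math.log(2.0 * math.pi * var) + (y - d) ** 2 / var)
        log_liks.append(total)
    return log_liks
```

The result has one entry per sample, matching the (n_samples) shape stated for the return value.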