Bayesian inference
******************

.. container:: twocol

   .. container:: leftside

      With Bayesian inference we ask the question 'How does our understanding of the inputs change given some observation of the outputs of the model?', i.e. we update the prior distributions to posterior distributions based on observations.
      Bayesvalidrox provides a dedicated class for this task, :any:`bayesvalidrox.bayes_inference.bayes_inference.BayesInference`, which infers the posterior via ``rejection-sampling`` or ``MCMC``.
      The likelihood in rejection sampling is estimated with the help of ``bootstrapping``.

      MCMC-specific parameters are given as a dictionary called ``mcmc_params`` and can include

      * ``init_samples``: initial samples
      * ``n_steps``: number of steps
      * ``n_walkers``: number of walkers
      * ``n_burn``: length of the burn-in
      * ``moves``: function to use for the moves, e.g. taken from ``emcee``
      * ``multiprocessing``: setting for multiprocessing
      * ``verbose``: verbosity

   .. container:: rightside

      .. image:: ./diagrams/bayesian_validation.png
         :width: 300
         :alt: UML diagram for classes related to Bayesian inference.

The observation should be set as ``Model.observations`` in the ``Engine``, and an estimate of its uncertainty can be provided as a :any:`bayesvalidrox.bayes_inference.discrepancy.Discrepancy` object.

Example
=======

For this example we need to add the following imports.

>>> from bayesvalidrox import Discrepancy, BayesInference

In order to run Bayesian inference we first need to provide an observation.
For this example we take an evaluation of the model on a chosen sample and add the resulting values as ``Model.observations``.
As this expects a 1D-array for each output key, we need to change the format slightly.
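To illustrate the target format: ``Model.observations`` is a plain dictionary mapping each output key to a 1D array, alongside the shared ``x_values`` axis. The key names and values below are illustrative stand-ins, not output of any particular model:

```python
import numpy as np

# Illustrative target format for Model.observations (key names are stand-ins):
# each model output key maps to a 1D array of observed values,
# and 'x_values' holds the coordinates of the observed points.
observations = {
    'x_values': np.array([0.0, 0.5, 1.0]),
    'A': np.array([1.2, 3.4, 5.6]),
}
```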
>>> true_sample = [[2]]
>>> observation = Model.run_model_parallel(true_sample)
>>> Model.observations = {}
>>> for key in observation:
...     if key == 'x_values':
...         Model.observations[key] = observation[key]
...     else:
...         Model.observations[key] = observation[key][0]

Next we define the uncertainty on the observation with the class :any:`bayesvalidrox.bayes_inference.discrepancy.Discrepancy`.
For this example we set the uncertainty to be zero-mean Gaussian and dependent on the values in the observation, i.e. larger values have a larger uncertainty associated with them.
The ``parameters`` contain the variance for each point in the observation.

.. warning::
    For models with only a single uncertain input parameter, numerical issues can appear if the discrepancy depends only on the observed data.
    To resolve this, a small value can be added to the variance of the discrepancy.

>>> import pandas as pd
>>> obsData = pd.DataFrame(Model.observations, columns=Model.Output.names)
>>> DiscrepancyOpts = Discrepancy('')
>>> DiscrepancyOpts.type = 'Gaussian'
>>> DiscrepancyOpts.parameters = obsData**2 + 0.01

Now we can initialize an object of the class :any:`bayesvalidrox.bayes_inference.bayes_inference.BayesInference` with all the desired properties.
This object has to be given our ``Engine``.
If the surrogate should be used during inference, set ``emulator`` to ``True``; otherwise the model will be evaluated directly.
We also set the defined ``Discrepancy`` and enable ``plot_post_pred`` so that the posterior predictions are visualized.

>>> BayesObj = BayesInference(Engine_)
>>> BayesObj.emulator = True
>>> BayesObj.Discrepancy = DiscrepancyOpts
>>> BayesObj.plot_post_pred = True

In order to run with rejection sampling, we set the ``inference_method`` accordingly and add properties for ``bootstrap``.
>>> BayesObj.inference_method = 'rejection'
>>> BayesObj.bootstrap = True
>>> BayesObj.n_bootstrap_itrs = 500
>>> BayesObj.bootstrap_noise = 2

If the sampling should instead be done with MCMC, this is set as the ``inference_method`` and additional properties are given in ``mcmc_params``.
For this example we use the Python package ``emcee`` to define the MCMC moves.

>>> BayesObj.inference_method = 'MCMC'
>>> import emcee
>>> BayesObj.mcmc_params = {
...     'n_steps': 1e4,
...     'n_walkers': 30,
...     'moves': emcee.moves.KDEMove(),
...     'multiprocessing': False,
...     'verbose': False
... }

Then we run the inference.

>>> BayesObj.create_inference()

If the output directory ``BayesObj.out_dir`` is not set otherwise, the outputs are written into the folder ``Outputs_Bayes_model_Calib``.
This folder contains the posterior distribution of the input parameters, as well as the predictions resulting from the mean of the posterior.
For inference with MCMC, chain diagnostics are also printed to the console.

.. container:: twocol

   .. container:: leftside

      .. code-block:: py

         ---------------Posterior diagnostics---------------
         Mean auto-correlation time: 2.057
         Thin: 1
         Burn-in: 4
         Flat chain shape: (13380, 1)
         Mean acceptance fraction*: 0.752
         Gelman-Rubin Test**: [1.001]
         * This value must lay between 0.234 and 0.5.
         ** These values must be smaller than 1.1.
         --------------------------------------------------

   .. container:: rightside

      .. image:: ../examples/user_guide/Outputs_Bayes_model_Calib/Posterior_Dist_model_emulator.pdf
         :width: 400
         :alt: Posterior distribution of the input parameter

      .. image:: ../examples/user_guide/Outputs_Bayes_model_Calib/Post_Prior_Perd_model_emulator_A.pdf
         :width: 400
         :alt: Comparison of posterior prediction to the observation
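The rejection-sampling idea used above can be sketched in a few lines of plain NumPy, independent of bayesvalidrox. The forward model, observation, and uniform prior below are illustrative stand-ins, and the discrepancy variance follows the same ``observation**2 + 0.01`` pattern as in the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forward model and observation (stand-ins, not bayesvalidrox API)
def model(theta):
    return theta ** 2

observation = 4.0
variance = observation ** 2 + 0.01  # Gaussian discrepancy variance, as above

# Draw samples from an assumed uniform prior on [0, 4]
prior_samples = rng.uniform(0.0, 4.0, size=10_000)

# Gaussian likelihood of each prior sample given the observation
residual = model(prior_samples) - observation
likelihood = np.exp(-0.5 * residual ** 2 / variance)

# Rejection step: accept each sample with probability proportional
# to its likelihood; the survivors approximate the posterior
accept = rng.uniform(0.0, likelihood.max(), size=prior_samples.size) < likelihood
posterior_samples = prior_samples[accept]
```

The accepted samples concentrate where the model output matches the observation; ``BayesInference`` additionally bootstraps this likelihood estimate over perturbed observations.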