Bayesian inference
******************

.. container:: twocol

   .. container:: leftside

      With Bayesian inference we ask the question 'How does our understanding of the inputs change given some observation of the outputs of the model?', i.e. we update the prior distributions to posterior distributions based on observations.
      Bayesvalidrox provides a dedicated class for this task, :any:`bayesvalidrox.bayes_inference.bayes_inference.BayesInference`, which infers the posterior via ``rejection-sampling`` or ``MCMC``.
      The likelihood in rejection sampling is estimated with the help of ``bootstrapping``.

      MCMC-specific parameters are given as a dictionary called ``mcmc_params`` and can include

      * ``init_samples``: initial samples
      * ``n_steps``: number of steps
      * ``n_walkers``: number of walkers
      * ``n_burn``: length of the burn-in
      * ``moves``: function to use for the moves, e.g. taken from ``emcee``
      * ``multiprocessing``: setting for multiprocessing
      * ``verbose``: verbosity

   .. container:: rightside

      .. image:: ./diagrams/bayesian_validation.png
         :width: 300
         :alt: UML diagram for classes related to Bayesian inference.

The observation should be set as ``Model.observations`` in the ``Engine``, and an estimate of its uncertainty can be provided as a :any:`bayesvalidrox.bayes_inference.discrepancy.Discrepancy` object.

Example
=======

For this example we need to add the following imports.

>>> from bayesvalidrox import Discrepancy, BayesInference

In order to run Bayesian inference we first need to provide an observation.
For this example we take an evaluation of the model on a chosen sample and add the resulting values as ``Model.observations``.
As this expects a 1D-array for each output key, we need to change the format slightly.
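To illustrate the target format: ``Model.observations`` is a plain dictionary mapping each output key to a 1D array, alongside the shared ``x_values`` axis. The key names and values below are illustrative stand-ins, not output of any particular model:

```python
import numpy as np

# Illustrative target format for Model.observations (key names are stand-ins):
# each model output key maps to a 1D array of observed values,
# and 'x_values' holds the coordinates of the observed points.
observations = {
    'x_values': np.array([0.0, 0.5, 1.0]),
    'A': np.array([1.2, 3.4, 5.6]),
}
```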
>>> true_sample = [[2]]
>>> observation = Model.run_model_parallel(true_sample)
>>> Model.observations = {}
>>> for key in observation:
...     if key == 'x_values':
...         Model.observations[key] = observation[key]
...     else:
...         Model.observations[key] = observation[key][0]

Next we define the uncertainty on the observation with the class :any:`bayesvalidrox.bayes_inference.discrepancy.Discrepancy`.
For this example we set the uncertainty to be zero-mean Gaussian and dependent on the values in the observation, i.e. larger values have a larger uncertainty associated with them.
The ``parameters`` contain the variance for each point in the observation.

.. warning::
    For models with only a single uncertain input parameter, numerical issues can appear if the discrepancy depends only on the observed data.
    To resolve this, a small value can be added to the variance of the discrepancy.

>>> import pandas as pd
>>> obsData = pd.DataFrame(Model.observations, columns=Model.Output.names)
>>> DiscrepancyOpts = Discrepancy('')
>>> DiscrepancyOpts.type = 'Gaussian'
>>> DiscrepancyOpts.parameters = obsData**2 + 0.01

Now we can initialize an object of the class :any:`bayesvalidrox.bayes_inference.bayes_inference.BayesInference` with all the desired properties.
This object has to be given our ``Engine``.
If the surrogate should be used during inference, set ``emulator`` to ``True``; otherwise the model will be evaluated directly.
We also set the defined ``Discrepancy`` and enable ``plot_post_pred`` so that the posterior predictions are visualized.

>>> BayesObj = BayesInference(Engine_)
>>> BayesObj.emulator = True
>>> BayesObj.Discrepancy = DiscrepancyOpts
>>> BayesObj.plot_post_pred = True

In order to run with rejection sampling, we set the ``inference_method`` accordingly and add properties for ``bootstrap``.
>>> BayesObj.inference_method = 'rejection'
>>> BayesObj.bootstrap = True
>>> BayesObj.n_bootstrap_itrs = 500
>>> BayesObj.bootstrap_noise = 2

If the sampling should instead be done with MCMC, this is set as the ``inference_method`` and additional properties are given in ``mcmc_params``.
For this example we use the Python package ``emcee`` to define the MCMC moves.

>>> BayesObj.inference_method = 'MCMC'
>>> import emcee
>>> BayesObj.mcmc_params = {
...     'n_steps': 1e4,
...     'n_walkers': 30,
...     'moves': emcee.moves.KDEMove(),
...     'multiprocessing': False,
...     'verbose': False
... }

Then we run the inference.

>>> BayesObj.create_inference()

If the output directory ``BayesObj.out_dir`` is not set otherwise, the outputs are written into the folder ``Outputs_Bayes_model_Calib``.
This folder contains the posterior distribution of the input parameters, as well as the predictions resulting from the mean of the posterior.
For inference with MCMC, chain diagnostics are also printed to the console.

.. container:: twocol

   .. container:: leftside

      .. code-block:: py

         ---------------Posterior diagnostics---------------
         Mean auto-correlation time: 2.057
         Thin: 1
         Burn-in: 4
         Flat chain shape: (13380, 1)
         Mean acceptance fraction*: 0.752
         Gelman-Rubin Test**: [1.001]
         * This value must lay between 0.234 and 0.5.
         ** These values must be smaller than 1.1.
         --------------------------------------------------

   .. container:: rightside

      .. image:: ../examples/user_guide/Outputs_Bayes_model_Calib/Posterior_Dist_model_emulator.pdf
         :width: 400
         :alt: Posterior distribution of the input parameter

      .. image:: ../examples/user_guide/Outputs_Bayes_model_Calib/Post_Prior_Perd_model_emulator_A.pdf
         :width: 400
         :alt: Comparison of posterior prediction to the observation
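The rejection-sampling idea used above can be sketched in a few lines of plain NumPy, independent of bayesvalidrox. The forward model, observation, and uniform prior below are illustrative stand-ins, and the discrepancy variance follows the same ``observation**2 + 0.01`` pattern as in the example:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical forward model and observation (stand-ins, not bayesvalidrox API)
def model(theta):
    return theta ** 2

observation = 4.0
variance = observation ** 2 + 0.01  # Gaussian discrepancy variance, as above

# Draw samples from an assumed uniform prior on [0, 4]
prior_samples = rng.uniform(0.0, 4.0, size=10_000)

# Gaussian likelihood of each prior sample given the observation
residual = model(prior_samples) - observation
likelihood = np.exp(-0.5 * residual ** 2 / variance)

# Rejection step: accept each sample with probability proportional
# to its likelihood; the survivors approximate the posterior
accept = rng.uniform(0.0, likelihood.max(), size=prior_samples.size) < likelihood
posterior_samples = prior_samples[accept]
```

The accepted samples concentrate where the model output matches the observation; ``BayesInference`` additionally bootstraps this likelihood estimate over perturbed observations.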