Training surrogate models

Surrogate models, also called metamodels, are models built on evaluations of a full model with the goal of capturing its behaviour while reducing the cost of evaluations.

The surrogate models are trained on datasets \(\mathcal{D}=(x_i, y_i)_{i=1,\dots,M}\) that consist of \(M\) samples of the uncertain parameters and the corresponding model outputs. We call this dataset the training data, with training samples \((x_i)_{i=1,\dots,M}\).
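Such a training dataset can be sketched as paired arrays of parameter samples and model outputs. The following is a minimal illustration with a hypothetical toy model, not part of BayesValidRox:

```python
import numpy as np

# Hypothetical stand-in for the expensive full model we want to replace.
def full_model(x):
    return np.sin(x[0]) + 0.5 * x[1] ** 2

rng = np.random.default_rng(0)
M = 20                                     # number of training samples
X_train = rng.uniform(-1.0, 1.0, (M, 2))  # M samples of 2 uncertain parameters
y_train = np.array([full_model(x) for x in X_train])  # corresponding outputs

# The dataset D = {(x_i, y_i)} is simply these paired arrays.
```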

BayesValidRox creates surrogate models as objects of the class bayesvalidrox.surrogate_models.surrogate_models.MetaModel. Training is performed by the class bayesvalidrox.surrogate_models.engine.Engine.

UML diagram for metamodel-related classes in bayesvalidrox

MetaModel options

In BayesValidRox, two types of surrogate model are available: Polynomial Chaos Expansion (PCE) and Gaussian Processes (GP). The Polynomial Chaos Expansion and its variant, the arbitrary Polynomial Chaos Expansion (aPC), build polynomials from the given distributions of the uncertain inputs. Gaussian Processes give kernel-based representations of the model results.
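The core idea behind a polynomial expansion can be sketched in one dimension: for a uniform input, a truncated Legendre expansion fitted by least squares already yields a cheap approximation of the model. This is only an illustrative sketch, not the BayesValidRox implementation:

```python
import numpy as np
from numpy.polynomial import legendre

# 1D illustration: approximate a model with a degree-3 Legendre expansion,
# the building block behind PCE for a uniform input on [-1, 1].
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, 50)
y = np.exp(x)                            # stand-in for an expensive model

coeffs = legendre.legfit(x, y, deg=3)    # least-squares expansion coefficients
y_hat = legendre.legval(0.3, coeffs)     # cheap surrogate evaluation at x = 0.3
```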

We provide a broad range of regression methods for use with PCE surrogates that can be set by the parameter MetaModel.pce_reg_method. These include Ordinary Least Squares (ols), Bayesian Ridge Regression (brr), Least Angle Regression (lars), Bayesian ARD Regression (ard), Fast Bayesian ARD Regression (fastard), Variational Bayesian Learning (vbl) and Empirical Bayesian Learning (ebl). Depending on the chosen regression method, the surrogate outputs a mean approximation and an associated standard deviation.
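How a Bayesian regression method yields both a mean and a standard deviation can be sketched with the closed-form posterior of Bayesian linear regression (the idea behind methods such as brr). This is a hypothetical sketch with fixed hyperparameters, not the bayesvalidrox implementation:

```python
import numpy as np

# Bayesian linear regression: posterior over weights gives a predictive
# mean and standard deviation at any new point.
rng = np.random.default_rng(2)
X = rng.uniform(-1, 1, (30, 1))
y = 2.0 * X[:, 0] + rng.normal(0, 0.1, 30)

Phi = np.column_stack([np.ones(30), X[:, 0]])  # polynomial basis (degree 1)
alpha, beta = 1e-3, 100.0                      # prior precision, noise precision

S = np.linalg.inv(alpha * np.eye(2) + beta * Phi.T @ Phi)  # posterior covariance
m = beta * S @ Phi.T @ y                                   # posterior mean weights

phi_new = np.array([1.0, 0.5])                     # basis at a new point x = 0.5
mean = phi_new @ m                                 # surrogate mean prediction
std = np.sqrt(1.0 / beta + phi_new @ S @ phi_new)  # predictive standard deviation
```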

Dimensionality reduction can be performed on the outputs with Principal Component Analysis (PCA). PCA is applied to the set of surrogates built for the x_values defined in the model.
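The effect of such output compression can be sketched with a plain SVD-based PCA: multi-point model outputs are projected onto a few principal components, surrogates would then be fit on the component scores, and full-length outputs are recovered by back-projection. A minimal sketch with synthetic data:

```python
import numpy as np

# Sketch: PCA-compress a multi-point model output before fitting surrogates.
rng = np.random.default_rng(3)
t = np.linspace(0, 1, 50)                # x_values at which the model is output
samples = rng.uniform(1, 2, (40, 1))     # 40 training parameter samples
Y = samples * np.sin(2 * np.pi * t)      # 40 outputs, each of length 50

Y_mean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)
k = 1                                    # keep one principal component
scores = (Y - Y_mean) @ Vt[:k].T         # surrogates would be fit on these

Y_rec = scores @ Vt[:k] + Y_mean         # reconstruct full-length outputs
```

Here the synthetic outputs have rank one after centering, so a single component reconstructs them exactly; real model outputs would trade some accuracy for the reduction.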

If bootstrapping is used, multiple surrogates will be created based on bootstrapped training data, and jointly evaluated. The final outputs will then be the mean and standard deviation of their approximations.
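The bootstrap aggregation described above can be sketched with simple polynomial surrogates in place of the PCE/GP models (an illustration only, not the bayesvalidrox internals):

```python
import numpy as np

# Sketch of bootstrap aggregation: several surrogates fitted on resampled
# training data, combined into a mean prediction and a spread.
rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, 25)
y = X**2 + rng.normal(0, 0.05, 25)

preds = []
for _ in range(20):                        # 20 bootstrap replicates
    idx = rng.integers(0, 25, 25)          # resample with replacement
    c = np.polyfit(X[idx], y[idx], deg=2)  # fit one cheap surrogate
    preds.append(np.polyval(c, 0.5))       # evaluate it at x = 0.5

mean = np.mean(preds)                      # final surrogate output
stdev = np.std(preds)                      # bootstrap uncertainty estimate
```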

Training with the engine

For training a surrogate model we use an object of the class bayesvalidrox.surrogate_models.engine.Engine. It needs to be given three things: the metamodel itself, the model that the metamodel should replace, and the experimental design that matches the uncertain inputs of the model and metamodel.

The standard way of training the surrogate is the method train_normal(). Other available training methods in BayesValidRox are presented in Active learning: iteratively expanding the training set.

For training, the engine performs three main steps:

  1. Generating training samples from the experimental design.

  2. Evaluating the model on the training samples.

  3. Fitting the surrogate to the training dataset.
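The three steps above can be condensed into a toy training routine (hypothetical names and a polynomial stand-in for the surrogate; not the bayesvalidrox Engine internals):

```python
import numpy as np

# Toy version of the training loop: sample, evaluate, fit.
def train_normal(sample_fn, model_fn, fit_fn, n_samples):
    X = sample_fn(n_samples)   # 1. generate training samples
    y = model_fn(X)            # 2. evaluate the model on the samples
    return fit_fn(X, y)        # 3. fit the surrogate to the dataset

rng = np.random.default_rng(5)
surrogate = train_normal(
    sample_fn=lambda n: rng.uniform(-1, 1, n),
    model_fn=np.sin,
    fit_fn=lambda X, y: np.poly1d(np.polyfit(X, y, deg=3)),
    n_samples=15,
)
```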

Example

We now build a surrogate model for the simple model from Models using the experimental design from Priors, input space and experimental design. For this we need the classes bayesvalidrox.surrogate_models.surrogate_models.MetaModel and bayesvalidrox.surrogate_models.engine.Engine.

>>> from bayesvalidrox import MetaModel, Engine

First we set up the surrogate model and tell it to consider the uncertain parameters defined in Inputs as its input parameters.

>>> MetaMod = MetaModel(Inputs)

Then we specify what type of surrogate we want and its properties. Here we use an aPCE with maximal polynomial degree 3 and FastARD as the regression method. We set the value of the q-norm truncation scheme to 0.85. This combination will give us a sparse aPCE.

>>> MetaMod.meta_model_type = 'aPCE'
>>> MetaMod.pce_reg_method = 'FastARD'
>>> MetaMod.pce_deg = 3
>>> MetaMod.pce_q_norm = 0.85

Before we start the actual training we set n_init_samples to the desired number of training samples.

>>> ExpDesign.n_init_samples = 10

This way, the experimental design will generate 10 samples according to our previously set sampling method. Alternatively, we can use the samples that we generated in Priors, input space and experimental design as the training samples. For this the sampling method should be set to 'user' and our samples given as X.

>>> ExpDesign.sampling_method = 'user'
>>> ExpDesign.X = samples

Now we create an engine object with the model, experimental design and surrogate model and run the training.

>>> Engine_ = Engine(MetaMod, Model, ExpDesign)
>>> Engine_.train_normal()

We can evaluate the trained surrogate model in two ways: via the engine, or directly. The evaluations return the mean approximation of the surrogate and its associated standard deviation. Evaluation via the engine can make use of the sampling in the experimental design,

>>> mean, stdev = Engine_.eval_metamodel(nsamples=10)

while for direct evaluation the exact set of samples has to be given.

>>> mean, stdev = Engine_.MetaModel.eval_metamodel(samples)