Training surrogate models
*************************
Surrogate models, also called metamodels, are built from evaluations of a full model. Their goal is to capture the behaviour of the full model while reducing the cost of each evaluation.
The surrogate models are trained on datasets :math:`\mathcal{D}=(x_i, y_i)_{i=1,\dots,M}` that consist of :math:`M` samples of the uncertain parameters and the corresponding model outputs.
We call this dataset the training data, with training samples :math:`(x_i)_{i=1,\dots,M}`.

BayesValidRox creates surrogate models as objects of the class :any:`bayesvalidrox.surrogate_models.surrogate_models.MetaModel`.
Training is performed by the class :any:`bayesvalidrox.surrogate_models.engine.Engine`.

.. image:: ./diagrams/metamod_training_reduced.png
   :width: 800
   :alt: UML diagram for metamodel-related classes in bayesvalidrox

MetaModel options
=================
In BayesValidRox two types of surrogate model are available: Polynomial Chaos Expansion (PCE) and Gaussian Processes (GP).
The PCE and its variant, the arbitrary Polynomial Chaos Expansion (aPC), build polynomials from the given distributions of the uncertain inputs.
Gaussian Processes give kernel-based representations of the model results.

We provide a broad range of regression methods for use with PCE surrogates, which can be set with the parameter ``MetaModel.pce_reg_method``.
These include Ordinary Least Squares (``ols``), Bayesian Ridge Regression (``brr``), Least Angle Regression (``lars``), Bayesian ARD Regression (``ard``), Fast Bayesian ARD Regression (``fastard``), Variational Bayesian Learning (``vbl``) and Empirical Bayesian Learning (``ebl``).
Depending on the chosen regression method, the surrogate outputs a mean approximation and an associated standard deviation.

Dimensionality reduction can be performed on the outputs with Principal Component Analysis (PCA).
PCA is applied on the set of surrogates built for the ``x_values`` defined in the model.

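
As a rough illustration of the PCA step, the sketch below compresses a set of model outputs with plain NumPy.
It is not the BayesValidRox implementation: the helper ``pca_reduce``, the toy ``outputs`` array and the 99% variance threshold are hypothetical choices made only for this example.

.. code-block:: python

    import numpy as np


    def pca_reduce(outputs, var_threshold=0.99):
        """Project outputs of shape (n_samples, n_x_values) onto principal components."""
        mean = outputs.mean(axis=0)
        centered = outputs - mean
        # SVD of the centered data: the rows of vt are the principal directions
        _, sing_vals, vt = np.linalg.svd(centered, full_matrices=False)
        explained = sing_vals**2 / np.sum(sing_vals**2)
        # Smallest number of components whose cumulative variance reaches the threshold
        n_comp = int(np.searchsorted(np.cumsum(explained), var_threshold) + 1)
        components = vt[:n_comp]
        scores = centered @ components.T
        return scores, components, mean


    rng = np.random.default_rng(0)
    x_values = np.linspace(0.0, 1.0, 50)
    params = rng.uniform(0.5, 1.5, size=(20, 2))
    # Toy model output: two parameters scale two fixed shapes over x_values
    outputs = params[:, [0]] * np.sin(2 * np.pi * x_values) + params[:, [1]] * x_values

    scores, components, mean = pca_reduce(outputs)
    # Map the reduced representation back to the original output space
    reconstructed = scores @ components + mean

In a scheme like this, the 50 output values per sample are replaced by at most two component scores, so far fewer quantities need to be approximated.
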
If bootstrapping is used, multiple surrogates are created from bootstrapped training data and evaluated jointly.
The final outputs are then the mean and standard deviation of their approximations.

Training with the engine
========================
To train a surrogate model we use an object of class :any:`bayesvalidrox.surrogate_models.engine.Engine`.
It needs to be given three things: the metamodel itself, the model that the metamodel should replace, and the experimental design that matches the uncertain inputs of the model and metamodel.

The standard method of training the surrogate is the function ``train_normal()``.
Other available training methods in BayesValidRox are presented in :any:`al_description`.

.. container:: twocol

    .. container:: leftside

        For training, the engine performs three main steps:

        1) Generating training samples from the experimental design.
        2) Evaluating the model on the training samples.
        3) Fitting the surrogate to the training dataset.

    .. container:: rightside

        .. image:: ./diagrams/engine_train_normal.png
           :width: 800
           :alt: Diagram of main steps in ``Engine.train_normal()``

Example
=======
We now build a surrogate model for the simple model from :any:`model_description` using the experimental design from :any:`input_description`.
For this we need the classes :any:`bayesvalidrox.surrogate_models.surrogate_models.MetaModel` and :any:`bayesvalidrox.surrogate_models.engine.Engine`.

>>> from bayesvalidrox import MetaModel, Engine

First we set up the surrogate model and tell it to consider the uncertain parameters defined in ``Inputs`` as its input parameters.

>>> MetaMod = MetaModel(Inputs)

Then we specify what type of surrogate we want and its properties.
Here we use an aPCE with maximal polynomial degree 3 and FastARD as the regression method.
We set the value of the q-norm truncation scheme to 0.85.
This combination will give us a sparse aPCE.

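
For intuition on why a q-norm below 1 yields a sparser expansion, the sketch below enumerates the polynomial multi-indices kept by the commonly used hyperbolic truncation rule, :math:`(\sum_i \alpha_i^q)^{1/q} \leq p`.
Whether BayesValidRox applies exactly this rule is an assumption here; the helper ``truncated_multi_indices`` and the choice of two input parameters are hypothetical.

.. code-block:: python

    from itertools import product


    def truncated_multi_indices(n_params, max_deg, q_norm):
        """Return multi-indices alpha with (sum_i alpha_i**q)**(1/q) <= max_deg."""
        kept = []
        for alpha in product(range(max_deg + 1), repeat=n_params):
            norm = sum(a**q_norm for a in alpha) ** (1.0 / q_norm)
            # Small tolerance guards against floating-point round-off
            if norm <= max_deg + 1e-10:
                kept.append(alpha)
        return kept


    # Total-degree truncation (q = 1) versus the q-norm used in this example
    full = truncated_multi_indices(n_params=2, max_deg=3, q_norm=1.0)
    sparse = truncated_multi_indices(n_params=2, max_deg=3, q_norm=0.85)

With two parameters and degree 3, the total-degree basis has 10 terms, while the q-norm of 0.85 keeps 8 by dropping the mixed terms ``(1, 2)`` and ``(2, 1)``.
The settings for this example are applied to the metamodel next.
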

>>> MetaMod.meta_model_type = 'aPCE'
>>> MetaMod.pce_reg_method = 'FastARD'
>>> MetaMod.pce_deg = 3
>>> MetaMod.pce_q_norm = 0.85

Before we start the actual training we set ``n_init_samples`` to the desired number of training samples.

>>> ExpDesign.n_init_samples = 10

The experimental design will then generate 10 samples according to the previously set sampling method.
Alternatively, we can use the samples that we generated in :any:`input_description` as the training samples.
For this the sampling method should be set to ``'user'`` and our samples given as ``X``.

>>> ExpDesign.sampling_method = 'user'
>>> ExpDesign.X = samples

Now we create an engine object with the surrogate model, the model and the experimental design, and run the training.

>>> Engine_ = Engine(MetaMod, Model, ExpDesign)
>>> Engine_.train_normal()

We can evaluate the trained surrogate model in two ways: via the engine, or directly.
The evaluations return the mean approximation of the surrogate and its associated standard deviation.
Evaluation via the engine can make use of the sampling in the experimental design,

>>> mean, stdev = Engine_.eval_metamodel(nsamples=10)

while for direct evaluation the exact set of samples has to be given.

>>> mean, stdev = Engine_.MetaModel.eval_metamodel(samples)
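
To make the three training steps and the bootstrapped mean and standard deviation more concrete, the following stand-alone sketch mimics them with NumPy only.
It does not use or reproduce the BayesValidRox classes: ``toy_model``, ``fit_poly``, the uniform input distribution and the number of bootstrap resamples are all hypothetical stand-ins.

.. code-block:: python

    import numpy as np


    def toy_model(x):
        """Stand-in for an expensive model with one uncertain parameter."""
        return np.sin(3.0 * x) + 0.5 * x


    def fit_poly(x, y, degree=3):
        """Ordinary least squares fit of a polynomial surrogate."""
        design = np.vander(x, degree + 1)
        coeffs, *_ = np.linalg.lstsq(design, y, rcond=None)
        return coeffs


    rng = np.random.default_rng(42)

    # 1) Generate training samples from the (here uniform) input distribution
    x_train = rng.uniform(-1.0, 1.0, size=10)
    # 2) Evaluate the model on the training samples
    y_train = toy_model(x_train)
    # 3) Fit the surrogate, once per bootstrap resample of the training data
    n_bootstrap = 20
    coeff_sets = []
    for _ in range(n_bootstrap):
        idx = rng.integers(0, x_train.size, size=x_train.size)
        coeff_sets.append(fit_poly(x_train[idx], y_train[idx]))

    # Evaluate all surrogates jointly and summarise their approximations
    x_new = np.linspace(-1.0, 1.0, 5)
    preds = np.array([np.polyval(c, x_new) for c in coeff_sets])
    mean = preds.mean(axis=0)
    stdev = preds.std(axis=0)

Here ``mean`` and ``stdev`` play the same role as the two return values of ``eval_metamodel``: a point estimate per evaluation sample and a measure of how much the individual surrogates disagree.
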