Active learning: iteratively expanding the training set
*******************************************************
Active learning (AL), also called sequential training, is the iterative choice of additional training samples after the initial training of a surrogate model.
The new samples can be chosen in an explorative manner or by exploiting available data and properties of the surrogate.
The relevant functions are contained in the classes :any:`bayesvalidrox.surrogate_models.sequential_design.SequentialDesign` and :any:`bayesvalidrox.surrogate_models.exploration.Exploration`.

.. warning::
   Exploration with 'voronoi' is disabled for release v1.1.0!

.. image:: ./diagrams/active_learning_reduced.png
   :width: 550
   :alt: UML diagram for the classes and functions used in active learning in BayesValidRox.
   
In BayesValidRox, AL is realized via additional properties of the :any:`bayesvalidrox.surrogate_models.exp_designs.ExpDesigns` and :any:`bayesvalidrox.surrogate_models.engine.Engine` classes, without any changes to the surrogate model itself.

Exploration, exploitation and tradeoff
======================================
**Exploration** methods choose the new samples in a space-filling manner, while **exploitation** methods make use of available data or properties of the surrogate model, such as the estimated surrogate standard deviation.
Exploration methods in BayesValidRox include random and Latin hypercube sampling, Voronoi sampling, and selection based on leave-one-out cross-validation or dual annealing.
Exploitation can be set to Bayesian designs, such as Bayesian active learning, or to variance-based designs.

The tradeoff between exploration and exploitation is defined by **tradeoff schemes**, such as an equal split, epsilon-decreasing, or adaptive schemes.
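
A scheme is selected via the ``tradeoff_scheme`` property of the experimental design. As a minimal sketch, assuming ``exp_design`` is an :any:`bayesvalidrox.surrogate_models.exp_designs.ExpDesigns` instance as in the example below, and that the scheme identifiers match the names above (see the API reference of :any:`bayesvalidrox.surrogate_models.sequential_design.SequentialDesign` for the exact strings):

>>> exp_design.tradeoff_scheme = 'epsilon-decreasing'

Conceptually, the scheme assigns weights to the exploration and exploitation scores of the candidate samples in each iteration; with ``None``, only the exploration weights are used (see the example below).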


Example
=======
We take the engine from :any:`surrogate_description` and change the settings to perform sequential training.

This mainly changes the experimental design.
For this example we start with the 10 initial samples from :any:`surrogate_description` and increase them iteratively to the number of samples given in ``n_max_samples``.
The parameter ``n_new_samples`` sets the number of new samples chosen in each iteration, while ``mod_loo_threshold`` sets an additional stopping condition.

>>> exp_design.n_max_samples = 14
>>> exp_design.n_new_samples = 1
>>> exp_design.mod_loo_threshold = 1e-16
    
Here we do not set a ``tradeoff_scheme``. 
This will result in all samples being chosen based on the exploration weights.

>>> exp_design.tradeoff_scheme = None
    
Since the proposed samples are generated by the exploration method, we still need to specify it, together with the number of candidate samples and the number of groups they are split into.
	
>>> exp_design.explore_method = 'random'
>>> exp_design.n_candidate = 1000
>>> exp_design.n_cand_groups = 4
    
For the exploitation method we use a variance-based design, as no observation data is available.
	
>>> exp_design.exploit_method = 'VarOptDesign'
>>> exp_design.util_func = 'EIGF'
    
Once all properties are set, we can assemble the engine and start it.
This time we use ``train_sequential``.
	
>>> engine = Engine(meta_model, model, exp_design)
>>> engine.train_sequential()
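
Once sequential training finishes, the engine can be used like any other trained engine. As a sketch, assuming ``x_test`` is a hypothetical array of parameter samples and that ``eval_metamodel`` returns the surrogate mean and standard deviation predictions:

>>> # x_test: user-provided parameter samples (hypothetical placeholder)
>>> mean, std = engine.eval_metamodel(samples=x_test)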