Base Controller Class
A controller class encapsulates the boilerplate code that often needs to be written in order to use gpytorch.
GP Controller
All controllers should inherit from the following class:
- class vanguard.base.gpcontroller.GPController(train_x, train_y, kernel_class, mean_class, y_std, likelihood_class, marginal_log_likelihood_class, optimiser_class, smart_optimiser_class, rng=None, **kwargs)[source]¶
Bases: BaseGPController
The base class for GP controllers.
The following class variables will persist unless changed (manually or by decorators):
device: The device of the tensors to be used. By default, this will be set to torch.device('cuda:0') if torch.cuda.is_available() returns True, but otherwise it defaults to torch.device('cpu').
dtype: The dtype for tensors, which defaults to torch.float32. Setting this to a higher precision will improve accuracy at the cost of memory.
gp_model_class: An uninstantiated subclass of ExactGP or ApproximateGP to be used in inference.
posterior_class: An uninstantiated subclass of Posterior to be used for all posteriors returned during prediction.
posterior_collection_class: An uninstantiated subclass of MonteCarloPosteriorCollection to be used for all posteriors returned during fuzzy prediction.
Note
The loss after each iteration of hyperparameter tuning is saved in the controller’s metrics tracker (accessed using the metrics_tracker property), and can be printed during fitting by using the print_metrics() method. Consider this example (GaussianGPController is used to simplify the example; metric printing is available for all controller classes):
- Example:
>>> from vanguard.datasets.synthetic import SyntheticDataset
>>> from vanguard.kernels import ScaledRBFKernel
>>> from vanguard.vanilla import GaussianGPController
>>>
>>> dataset = SyntheticDataset()
>>>
>>> controller = GaussianGPController(dataset.train_x, dataset.train_y,
...                                   ScaledRBFKernel, dataset.train_y_std)
>>> initial_loss = controller.fit(10)
>>> with controller.metrics_tracker.print_metrics():
...     loss = controller.fit(5)
iteration: 11, loss: ...
iteration: 12, loss: ...
iteration: 13, loss: ...
iteration: 14, loss: ...
iteration: 15, loss: ...
For more options see the MetricsTracker class.
- Parameters:
train_x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
train_y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[integer]], float])
y_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
likelihood_class (type[Likelihood])
marginal_log_likelihood_class (type[MarginalLogLikelihood])
smart_optimiser_class (type[SmartOptimiser])
- property metrics_tracker: MetricsTracker¶
Return the MetricsTracker associated with the controller.
- fit(n_sgd_iters=10, gradient_every=None)[source]¶
Run rounds of hyperparameter tuning.
Note
By default fit(n_sgd_iters=n, gradient_every=m) is equivalent to fit(n_sgd_iters=n). However, any changes to _sgd_round() could break this equivalence.
Warning
Do not override this method in order to alter SGD behaviour. Instead, override _sgd_round() to ensure that all added functionality propagates correctly.
- Parameters:
- Return type:
- Returns:
The loss.
- posterior_over_fuzzy_point(x, x_std)[source]¶
Return predictive posterior of the y-value over a fuzzy point.
- Parameters:
x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_predictions, n_features) The predictive inputs.
x_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – The input noise standard deviations:
array_like[np.floating]: (n_features,) The standard deviation per input dimension for the predictions,
np.floating: Assume homoskedastic noise.
- Return type:
- Returns:
The posterior.
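The idea of predicting over a fuzzy point can be sketched in pure Python as a Monte Carlo average over noisy inputs. This is only an illustration under stated assumptions: `linear_model` and `fuzzy_mean_prediction` are hypothetical names, the input is a scalar, and the sketch returns a single averaged number rather than a posterior object.

```python
import random
from statistics import fmean


def linear_model(x):
    """Hypothetical deterministic predictor used only for illustration."""
    return 2.0 * x + 1.0


def fuzzy_mean_prediction(predict, x, x_std, n_samples=2000, seed=0):
    """Average predictions over inputs drawn from N(x, x_std**2)."""
    rng = random.Random(seed)
    # Each sample perturbs the input with Gaussian noise of standard deviation x_std.
    noisy_inputs = (rng.gauss(x, x_std) for _ in range(n_samples))
    return fmean(predict(noisy_x) for noisy_x in noisy_inputs)


estimate = fuzzy_mean_prediction(linear_model, x=1.0, x_std=0.1)
```

For a linear predictor the average stays close to the prediction at the noise-free input; for a non-linear model the two would generally differ, which is why input noise needs separate handling.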
- fuzzy_predictive_likelihood(x, x_std)[source]¶
Calculate the predictive likelihood at an x-value, given variance.
- Parameters:
- Return type:
- Returns:
The marginal distribution.
- classmethod new(instance, **kwargs)[source]¶
Create an instance of the class with the same initialisation parameters as an existing instance.
Any keyword arguments passed in this method will overwrite the values used for the initialisation of the new instance. Calling type(instance).new(instance) is essentially equivalent to creating a copy of instance, albeit with some parameters potentially remaining connected.
Warning
This method is not guaranteed to return a deep copy of an instance if the classes match. Attributes such as the training data, and kernel are likely to be shared across instances. To mitigate this, explicitly pass copies of these as keyword parameters.
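The sharing behaviour described in the warning can be illustrated with a small pure-Python sketch. `SketchController` is a hypothetical stand-in, not Vanguard's implementation; it assumes that the initialisation parameters are stored and then reused by reference.

```python
import copy


class SketchController:
    """Hypothetical stand-in for a controller; not Vanguard's implementation."""

    def __init__(self, train_x, train_y, **kwargs):
        # Remember the initialisation parameters so that `new` can reuse them.
        self._init_params = {"train_x": train_x, "train_y": train_y, **kwargs}
        self.train_x = train_x
        self.train_y = train_y

    @classmethod
    def new(cls, instance, **kwargs):
        # Reuse the stored parameters by reference; keyword arguments override them.
        params = {**instance._init_params, **kwargs}
        return cls(**params)


original = SketchController(train_x=[0.0, 1.0, 2.0], train_y=[1.0, 3.0, 5.0])

# Without overrides, the training data is shared between the two instances.
shared = SketchController.new(original)

# Passing an explicit copy breaks the link for that parameter only.
independent = SketchController.new(original, train_x=copy.deepcopy(original.train_x))
```

In this sketch, mutating `original.train_x` would also change `shared.train_x`, while `independent.train_x` would be unaffected.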
Posteriors¶
Posterior classes.
Vanguard contains classes to represent posterior distributions, which are used to encapsulate the predictive posterior of a model at some input points.
- class vanguard.base.posteriors.Posterior(distribution)[source]¶
Represents a posterior predictive distribution over a collection of points.
Note
Various Vanguard decorators are expected to overwrite the prediction() and confidence_interval() methods of this class. However, the _tensor_prediction() and _tensor_confidence_interval() methods should remain untouched, in order to avoid accidental double transformations.
- Parameters:
distribution (Distribution) – The distribution.
- __init__(distribution)[source]¶
Initialise self.
- Parameters:
distribution (Distribution)
- property condensed_distribution: Distribution¶
Return the condensed distribution.
Return a representative distribution of the posterior, with 1-dimensional mean and 2-dimensional covariance. In standard cases, this will just return the distribution.
- confidence_interval(alpha=0.05)[source]¶
Construct confidence intervals around mean of predictive posterior.
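As a rough illustration of what a confidence interval at significance alpha means for a Gaussian posterior, the sketch below computes a central interval for a scalar normal distribution using only the standard library. `gaussian_confidence_interval` is a hypothetical helper; whether every posterior class computes its interval this way is an assumption.

```python
from statistics import NormalDist


def gaussian_confidence_interval(mean, std, alpha=0.05):
    """Return (lower, upper) of the central (1 - alpha) interval of N(mean, std**2)."""
    # Two-sided interval: put alpha / 2 of the probability mass in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return mean - z * std, mean + z * std


lower, upper = gaussian_confidence_interval(mean=2.0, std=0.5, alpha=0.05)
```

A smaller alpha gives a wider interval, because less probability mass is allowed to fall outside it.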
- nll(y, noise_variance=0, alpha=np.float64(0.31731050786291415))[source]¶
Compute the negative log-likelihood of some values under the posterior.
- Parameters:
y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n, d) or (d,) where d is the dimension of the space on which the posterior is defined. Summed over the first dimension if two-dimensional.
noise_variance (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – Additional variance to be included in the calculation.
alpha (float) – The significance of the confidence interval used to calculate the standard deviation.
- Return type:
- Returns:
The negative log-likelihood of the given y values.
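The default value of alpha in the signature above is the probability mass of a standard normal distribution lying more than one standard deviation from its mean, so the corresponding interval is the mean plus or minus one standard deviation. The following standard-library check illustrates that relationship; it does not reproduce how nll() uses the interval internally.

```python
import math
from statistics import NormalDist

# Probability mass of a standard normal lying outside one standard deviation.
alpha_one_sigma = 1 - math.erf(1 / math.sqrt(2))

# Half-width (in standard deviations) of the central (1 - alpha) interval.
half_width = NormalDist().inv_cdf(1 - alpha_one_sigma / 2)
```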
- log_probability(y)[source]¶
Compute the log-likelihood of some values under the posterior.
- Parameters:
y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n, d) or (d,) where d is the dimension of the space on which the posterior is defined. Summed over the first dimension if two-dimensional.
- Return type:
- Returns:
The log-likelihood of the given y values, i.e. \(\sum_{i} \log P(y_i)\) where \(P\) is the posterior density.
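The summation in the return value can be made concrete with a minimal sketch. It assumes, purely for illustration, that every value is scored under the same scalar Gaussian density; `summed_log_probability` is a hypothetical helper and not part of the library.

```python
import math
from statistics import NormalDist


def summed_log_probability(ys, mean, std):
    """Sum the log-densities of each y under N(mean, std**2), i.e. sum_i log P(y_i)."""
    density = NormalDist(mu=mean, sigma=std)
    return sum(math.log(density.pdf(y)) for y in ys)


ys = [0.0, 0.5, -0.5]
total = summed_log_probability(ys, mean=0.0, std=1.0)
```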
- class vanguard.base.posteriors.MonteCarloPosteriorCollection(posterior_generator)[source]¶
A collection of posteriors over a set of points.
Enables fuzzy predictions and confidence intervals for models without any specific method to handle input uncertainty. Samples are lazily loaded if more are needed for a better prediction.
- Parameters:
posterior_generator (Generator[Posterior, None, None]) – An infinite generator of Posterior objects.
Warning
In order to ensure reproducible output for predictions and confidence intervals, a cached sample is used.
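The combination of lazy loading and a cached sample can be sketched as follows. Both `noisy_mean_generator` and `LazySampleCache` are hypothetical names for illustration; each generated number stands in for one sampled posterior, and the real class may organise its cache differently.

```python
import itertools
import random


def noisy_mean_generator(rng, centre, spread):
    """Hypothetical infinite generator; each item stands in for one sampled posterior."""
    while True:
        yield rng.gauss(centre, spread)


class LazySampleCache:
    """Hypothetical cache that draws from an infinite generator only when needed."""

    def __init__(self, generator):
        self._generator = generator
        self._cache = []

    def first(self, n):
        # Extend the cache lazily if fewer than n samples have been drawn so far.
        missing = n - len(self._cache)
        if missing > 0:
            self._cache.extend(itertools.islice(self._generator, missing))
        return self._cache[:n]


cache = LazySampleCache(noisy_mean_generator(random.Random(0), centre=1.0, spread=0.1))
first_call = cache.first(5)
second_call = cache.first(5)  # served from the cache, so identical to first_call
extended = cache.first(8)     # draws three more samples and keeps the first five
```

Because repeated requests reuse the cached samples, repeated calls give the same answer, and more samples are only drawn when a request needs them.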
- MAX_POSTERIOR_ERRORS_BEFORE_RAISE: int = 100¶
The maximum number of RuntimeErrors that _yield_posteriors will suppress before raising.
- property condensed_distribution: Distribution¶
Return the condensed distribution.
Return a representative distribution of the posterior, with 1-dimensional mean and 2-dimensional covariance. In this case, return a distribution based on the mean and covariance returned by _tensor_prediction().
- classmethod from_mean_and_covariance(mean, covariance)[source]¶
Construct from the mean and covariance of a Gaussian.
- static _decide_mc_num_samples(alpha)[source]¶
Determine an appropriately large number of Monte Carlo samples.
Determine an appropriately large number of Monte Carlo samples for a desired confidence level when computing confidence intervals with Monte Carlo integration. This method is motivated by a simple remark in [Owen13]. The factor is arbitrary; we just want the number of samples to be much larger than \(\frac{1}{\min(\alpha, 1-\alpha)}\).
Warning
The current method should give reasonable default behaviour, but it doesn’t come with any guarantees. Moreover, we may be demanding too many samples, which is inefficient.
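A minimal sketch of this kind of rule is shown below. The multiplier `factor=100` and the name `sketch_num_samples` are hypothetical and chosen only for illustration; the factor actually used by the library is not stated here.

```python
import math


def sketch_num_samples(alpha, factor=100):
    """Return a sample count much larger than 1 / min(alpha, 1 - alpha).

    `factor` is a hypothetical multiplier chosen for illustration only.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return math.ceil(factor / min(alpha, 1 - alpha))


n_common = sketch_num_samples(0.05)
n_extreme = sketch_num_samples(0.001)
```

More extreme significance levels need more samples, since the tail of interest contains only a small fraction of the draws.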
Base Controller¶
The (non-user-facing) base class of Vanguard controllers.
The BaseGPController class contains the
machinery of the GPController.
- class vanguard.base.basecontroller.BaseGPController(train_x, train_y, kernel_class, mean_class, y_std, likelihood_class, marginal_log_likelihood_class, optimiser_class, smart_optimiser_class, rng=None, **kwargs)[source]¶
Contains the base machinery for the GPController class.
- Parameters:
train_x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_samples, n_features) The mean of the inputs (or the observed values).
train_y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[integer]], float]) – (n_samples,) or (n_samples, 1) The response values.
kernel_class (type[Kernel]) – An uninstantiated subclass of gpytorch.kernels.Kernel.
mean_class (type[Mean]) – An uninstantiated subclass of gpytorch.means.Mean to use in the prior GP.
y_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – The observation noise standard deviation:
ArrayLike[float] (n_samples,): known heteroskedastic noise,
float: known homoskedastic noise assumed.
likelihood_class (type[Likelihood]) – An uninstantiated subclass of gpytorch.likelihoods.Likelihood. The default is gpytorch.likelihoods.FixedNoiseGaussianLikelihood.
marginal_log_likelihood_class (type[MarginalLogLikelihood]) – An uninstantiated subclass of an MLL from gpytorch.mlls. The default is gpytorch.mlls.ExactMarginalLogLikelihood.
optimiser_class (type[Optimizer]) – An uninstantiated torch.optim.Optimizer class used for gradient-based learning of hyperparameters. The default is torch.optim.Adam.
smart_optimiser_class (type[SmartOptimiser]) – An uninstantiated SmartOptimiser class used to wrap the optimiser_class and enable early stopping.
rng (Optional[Generator]) – Generator instance used to generate random numbers.
- Keyword Arguments:
kernel_kwargs (dict): Keyword arguments to be passed to the kernel_class constructor.
mean_kwargs (dict): Keyword arguments to be passed to the mean_class constructor.
likelihood_kwargs (dict): Keyword arguments to be passed to the likelihood_class constructor.
gp_kwargs (dict): Keyword arguments to be passed to the gp_model_class constructor.
mll_kwargs (dict): Keyword arguments to be passed to the marginal_log_likelihood_class constructor.
optim_kwargs (dict): Keyword arguments to be passed to the optimiser_class constructor.
batch_size (int, None): The batch size to use in SGD. If None, the whole dataset is used at each iteration.
additional_metrics (List[function]): A list of additional metrics to track.
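The pattern behind these keyword arguments, one dictionary per uninstantiated class, can be sketched as below. `SketchKernel`, `SketchMean` and `build_components` are hypothetical stand-ins written for illustration and are not part of Vanguard or gpytorch.

```python
class SketchKernel:
    """Hypothetical kernel class standing in for a gpytorch kernel."""

    def __init__(self, lengthscale=1.0):
        self.lengthscale = lengthscale


class SketchMean:
    """Hypothetical mean class standing in for a gpytorch mean."""

    def __init__(self, constant=0.0):
        self.constant = constant


def build_components(kernel_class, mean_class, **kwargs):
    """Instantiate each class with its own keyword-argument dictionary."""
    # A missing dictionary is treated as "no extra arguments".
    kernel = kernel_class(**kwargs.get("kernel_kwargs", {}))
    mean = mean_class(**kwargs.get("mean_kwargs", {}))
    return kernel, mean


kernel, mean = build_components(
    SketchKernel,
    SketchMean,
    kernel_kwargs={"lengthscale": 0.5},  # forwarded to SketchKernel only
)
```

Keeping one dictionary per component means the classes can be passed uninstantiated while their constructor arguments remain configurable.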
- gp_model_class¶
alias of ExactGPModel
- __init__(train_x, train_y, kernel_class, mean_class, y_std, likelihood_class, marginal_log_likelihood_class, optimiser_class, smart_optimiser_class, rng=None, **kwargs)[source]¶
Initialise self.
- Parameters:
train_x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
train_y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[integer]], float])
y_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
likelihood_class (type[Likelihood])
marginal_log_likelihood_class (type[MarginalLogLikelihood])
smart_optimiser_class (type[SmartOptimiser])
- property _likelihood: Likelihood¶
Return the likelihood of the model.
- classmethod get_default_tensor_dtype()[source]¶
Get the default tensor type for this controller class.
- Return type:
- classmethod get_default_tensor_device()[source]¶
Get the default tensor device for this controller class.
- Return type:
- _fuzzy_predictive_likelihood(x, x_std)[source]¶
Calculate the predictive likelihood at an x-value, given variance.
- Parameters:
- Return type:
- Returns:
The marginal distribution.
- _get_posterior_over_fuzzy_point_in_eval_mode(x, x_std)[source]¶
Obtain Monte Carlo integration samples from the predictive posterior with Gaussian input noise.
- Parameters:
x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The predictive inputs.
x_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – The input noise standard deviations:
array_like[float]: (n_features,) The standard deviation per input dimension for the predictions,
float: Assume homoskedastic noise.
- Return type:
- Returns:
The posterior distribution.
- _sgd_round(n_iters=10, gradient_every=10)[source]¶
Use gradient based optimiser to tune the hyperparameters.
- _single_optimisation_step(x, y, retain_graph=False)[source]¶
Do a single forward pass and a single backward optimisation pass.
- _get_posterior_over_point_in_eval_mode(x)[source]¶
Predict the y-value of a single point in evaluation mode.
- _get_posterior_over_point(x)[source]¶
Predict the y-value of a single point. The mode (eval vs train) of the model is not changed.
- _process_x_std(std)[source]¶
Parse supplied std dev for input noise for different cases.
- Parameters:
std (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – The standard deviation. This can be:
array_like[float]: (n_point, self.dim) heteroskedastic input noise across feature dimensions, or
float: homoskedastic input noise across feature dimensions.
- Return type:
- Returns:
The parsed standard deviation of shape (self.dim,) or (std.shape[0], self.dim) depending on the shape of std. If std is None then trainable values are returned.
- _input_standardise_modules(*modules)[source]¶
Apply standard input scaling (mean zero, variance 1) to the supplied PyTorch nn.Modules.
The mean and variance are computed from the training inputs of self.
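Standard scaling with statistics taken from the training inputs can be sketched for a single feature as follows. The `standardise` helper is hypothetical, and the use of the population standard deviation is an assumption made for this illustration.

```python
from statistics import fmean, pstdev


def standardise(values, train_values):
    """Scale values using the mean and standard deviation of the training inputs."""
    centre = fmean(train_values)
    scale = pstdev(train_values)  # population standard deviation; an assumption
    return [(value - centre) / scale for value in values]


train_inputs = [1.0, 2.0, 3.0, 4.0, 5.0]
scaled_train = standardise(train_inputs, train_inputs)

# New inputs are scaled with the training statistics, not their own.
scaled_new = standardise([3.0, 6.0], train_inputs)
```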
- classmethod set_default_tensor_dtype(dtype)[source]¶
Set the default tensor dtype for the class, subsequent subclasses, and external tensors.
- classmethod set_default_tensor_device(device)[source]¶
Set the default tensor device for the class, subsequent subclasses, and external tensors.
- static _decide_noise_shape(posterior, x)[source]¶
Determine the correct shape of the likelihood noise.
Given a posterior distribution and an array of predictive inputs, determine the correct size of a noise term to supply to the likelihood to match the model and the number of predictive points.
- Parameters:
- Return type:
- Returns:
The correct shape for the likelihood noise in this case.
- vanguard.base.basecontroller._catch_and_check_module_errors(controller)[source]¶
Handle some hard to detect errors that may occur within GP model classes.
- Parameters:
controller (BaseGPController) – The controller that owns the module class.
- Return type:
Metrics¶
Keep track of loss and other metrics when training.
Vanguard attaches a number of metrics to all controller classes and tracks them by default. These are calculated per iteration by the MetricsTracker class.
- vanguard.base.metrics.loss(loss_value, controller)[source]¶
Return the loss value.
- Parameters:
loss_value (float)
controller (Optional[BaseGPController])
- Return type:
- class vanguard.base.metrics.MetricsTracker(*metrics)[source]¶
Tracks metrics for a controller.
Warning
Passing lambda functions is discouraged, as each lambda function will overwrite the previous. Instead, create distinct functions for your metric.
- Example:
>>> from vanguard.base.metrics import MetricsTracker, loss
>>>
>>> tracker = MetricsTracker(loss)
>>> for loss_value in range(5):
...     tracker.run_metrics(float(loss_value), controller=None)
>>> with tracker.print_metrics():
...     for loss_value in range(5):
...         tracker.run_metrics(float(loss_value), controller=None)
iteration: 6, loss: 0.0
iteration: 7, loss: 1.0
iteration: 8, loss: 2.0
iteration: 9, loss: 3.0
iteration: 10, loss: 4.0
>>> with tracker.print_metrics(every=2):
...     for loss_value in range(5):
...         tracker.run_metrics(float(loss_value), controller=None)
iteration: 12, loss: 1.0
iteration: 14, loss: 3.0
>>> with tracker.print_metrics(every=2, format_string="loss: {loss}"):
...     for loss_value in range(5):
...         tracker.run_metrics(float(loss_value), controller=None)
loss: 1.0
loss: 3.0
- Parameters:
metrics (Callable)
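One plausible reason for the lambda warning is that every lambda function in Python has the same name, "<lambda>". The sketch below assumes, for illustration only, that metrics are stored under their function name; the `register` helper is hypothetical and the tracker's actual storage scheme is not confirmed here.

```python
def register(registry, *metrics):
    """Store each metric under its function name (assumed keying scheme)."""
    for metric in metrics:
        registry[metric.__name__] = metric
    return registry


def loss(loss_value, controller):
    return loss_value


def doubled_loss(loss_value, controller):
    return 2 * loss_value


# Two lambdas share the name "<lambda>", so the second replaces the first.
lambda_registry = register({}, lambda l, c: l, lambda l, c: 2 * l)

# Distinct named functions are kept separately.
named_registry = register({}, loss, doubled_loss)
```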
- __init__(*metrics)[source]¶
Initialise self.
A metric takes the form of a function of (loss, controller) -> real number. The simplest and most obvious metric simply returns the loss value, e.g. see the function vanguard.base.metrics.loss. Other common examples might extract parameters from the controller, e.g. a kernel’s lengthscale, and return that.
- Parameters:
metrics (Callable)
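A metric of the (loss, controller) -> real number form might look like the following sketch. The attribute path `controller.kernel.lengthscale` and the `fake_controller` object are assumptions made for illustration; a real controller may expose its kernel parameters differently.

```python
from types import SimpleNamespace


def loss(loss_value, controller):
    """The simplest metric: return the loss unchanged."""
    return loss_value


def lengthscale(loss_value, controller):
    """Hypothetical metric reading a kernel lengthscale from the controller."""
    # `controller.kernel.lengthscale` is an assumed attribute path for illustration.
    return float(controller.kernel.lengthscale)


# A stand-in object with the assumed attributes; a real controller would be used instead.
fake_controller = SimpleNamespace(kernel=SimpleNamespace(lengthscale=0.7))

values = {metric.__name__: metric(1.25, fake_controller) for metric in (loss, lengthscale)}
```

Giving each metric a distinct function name keeps the recorded values distinguishable.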
- add_metrics(*metrics)[source]¶
Add metrics to the tracker.
See __init__ docstring for definition of metrics.
- print_metrics(every=1, format_string=None)[source]¶
Temporarily enable printing of the metrics within a context manager.
- Parameters:
every (int) – How often to print the output. Does not start on the first iteration. Defaults to 1 (print always).
format_string (Optional[str]) – Used to format the output. Keys passed here must match with information passed to the run_metrics() method. If None, all metrics will be printed.
- Return type:
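The effect of the every parameter can be sketched as follows. The rule used here is inferred from the output of the MetricsTracker example and is an assumption: counting restarts when the printing block is entered, and the k-th iteration inside the block is printed when k is a multiple of every. The `iterations_printed` helper is hypothetical.

```python
def iterations_printed(already_run, count, every=1):
    """List the iteration numbers printed inside one printing block.

    Assumes (inferred from the example output) that counting restarts when the
    block is entered, so the k-th iteration inside the block is printed when
    k is a multiple of `every`.
    """
    return [already_run + k for k in range(1, count + 1) if k % every == 0]


# Five iterations after ten already recorded, printing every second one.
printed_second_block = iterations_printed(already_run=10, count=5, every=2)

# Five further iterations in a new block; counting restarts from the block entry.
printed_third_block = iterations_printed(already_run=15, count=5, every=2)
```

Under this assumption the first iteration inside a block is skipped whenever every is greater than 1, which matches the note that printing does not start on the first iteration.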
- run_metrics(loss_value, controller, **additional_info)[source]¶
Register the components of an iteration.
Each metric in the tracker will be run on the arguments of this method, and then stored for future reference. Iterations do not need to be passed. Additional information passed as keyword arguments can be displayed to the user when combined with print_metrics() and a customised format string.
- Parameters:
loss_value (float) – The loss.
controller (Optional[BaseGPController]) – The controller instance.
- Return type: