Base Controller Class¶

A controller class is designed to encapsulate the boilerplate code that often needs to be written in order to use GPyTorch.

GP Controller¶

All controllers should inherit from the following class:

class vanguard.base.gpcontroller.GPController(train_x, train_y, kernel_class, mean_class, y_std, likelihood_class, marginal_log_likelihood_class, optimiser_class, smart_optimiser_class, rng=None, **kwargs)[source]¶

Bases: BaseGPController

The base class for GP controllers.

The following class variables will persist unless changed (manually or by decorators):

device: The device of the tensors to be used. By default, this will be set to torch.device('cuda:0') if torch.cuda.is_available() returns True, but otherwise it defaults to torch.device('cpu').
dtype: The dtype for tensors, which defaults to torch.float32. Using a higher-precision dtype (such as torch.float64) will improve accuracy at the cost of memory.
gp_model_class: An uninstantiated subclass of ExactGP or ApproximateGP to be used in inference.
posterior_class: An uninstantiated subclass of Posterior to be used for all posteriors returned during prediction.
posterior_collection_class: An uninstantiated subclass of MonteCarloPosteriorCollection to be used for all posteriors returned during fuzzy prediction.

Note

The loss after each iteration of hyperparameter tuning is saved in the controller’s metrics tracker (accessed using the metrics_tracker property), and can be printed during fitting by using the tracker’s print_metrics() method. Consider this example (GaussianGPController is used to simplify the example; metric printing is available for all controller classes):

Example:

>>> from vanguard.datasets.synthetic import SyntheticDataset
>>> from vanguard.kernels import ScaledRBFKernel
>>> from vanguard.vanilla import GaussianGPController
>>>
>>> dataset = SyntheticDataset()
>>>
>>> controller = GaussianGPController(dataset.train_x, dataset.train_y,
...                                   ScaledRBFKernel, dataset.train_y_std)
>>> initial_loss = controller.fit(10)
>>> with controller.metrics_tracker.print_metrics():
...     loss = controller.fit(5)  
iteration: 11, loss: ...
iteration: 12, loss: ...
iteration: 13, loss: ...
iteration: 14, loss: ...
iteration: 15, loss: ...

For more options see the MetricsTracker class.

Parameters:

train_x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
train_y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[integer]], float])
kernel_class (type[Kernel])
mean_class (type[Mean])
y_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
likelihood_class (type[Likelihood])
marginal_log_likelihood_class (type[MarginalLogLikelihood])
optimiser_class (type[Optimizer])
smart_optimiser_class (type[SmartOptimiser])
rng (Optional[Generator])

property likelihood_noise: Tensor¶: Return the noise of the likelihood.

property learning_rate: float¶: Return the learning rate of the parameter optimiser.

property metrics_tracker: MetricsTracker¶: Return the MetricsTracker associated with the controller.

fit(n_sgd_iters=10, gradient_every=None)[source]¶

Run rounds of hyperparameter tuning.

Note

By default fit(n_sgd_iters=n, gradient_every=m) is equivalent to fit(n_sgd_iters=n). However, any changes to _sgd_round() could break this equivalence.

Warning

Do not override this method in order to alter SGD behaviour. Instead, override _sgd_round() to ensure that all added functionality propagates correctly.

Parameters:

n_sgd_iters (int) – The number of gradient updates to perform in each round of hyperparameter tuning.
gradient_every (Optional[int]) – How often (in iterations) to do special HNIGP input gradient steps. Defaults to the value of n_sgd_iters, and is overridden to 1 in batch mode.

Return type:

Union[Tensor, float]

Returns:

The loss.
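The warning above can be illustrated with a toy class hierarchy. This is only a sketch: ToyController and LoggingController are hypothetical stand-ins, not Vanguard classes, and the halving "loss" is made up. It shows why extra behaviour placed in _sgd_round() is picked up by every call to fit(), while fit() itself stays untouched.

```python
class ToyController:
    """Hypothetical stand-in for a controller; not Vanguard's implementation."""

    def __init__(self):
        self.loss = 16.0

    def fit(self, n_sgd_iters=10, gradient_every=None):
        # fit() only resolves defaults and delegates the real work.
        if gradient_every is None:
            gradient_every = n_sgd_iters
        return self._sgd_round(n_iters=n_sgd_iters, gradient_every=gradient_every)

    def _sgd_round(self, n_iters=10, gradient_every=10):
        for _ in range(n_iters):
            self.loss /= 2  # pretend that each step halves the loss
        return self.loss


class LoggingController(ToyController):
    """Adds behaviour by overriding _sgd_round() rather than fit()."""

    def __init__(self):
        super().__init__()
        self.history = []

    def _sgd_round(self, n_iters=10, gradient_every=10):
        loss = super()._sgd_round(n_iters=n_iters, gradient_every=gradient_every)
        self.history.append((n_iters, gradient_every, loss))
        return loss


controller = LoggingController()
final_loss = controller.fit(2)
```

Because the subclass leaves fit() alone, the default handling of gradient_every is inherited unchanged.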

posterior_over_point(x)[source]¶

Return predictive posterior of the y-value over a point.

Parameters:: x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_predictions, n_features) The predictive inputs.
Return type:: Posterior
Returns:: The posterior.

posterior_over_fuzzy_point(x, x_std)[source]¶

Return predictive posterior of the y-value over a fuzzy point.

Parameters:

x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_predictions, n_features) The predictive inputs.
x_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) –
The input noise standard deviations:
- array_like[np.floating]: (n_features,) The standard deviation per input dimension for the predictions,
- np.floating: Assume homoskedastic noise.

Return type:

Posterior

Returns:

The posterior.
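As a rough illustration of what a prediction over a fuzzy point involves, the sketch below averages a toy prediction function over Gaussian perturbations of the input. The function fuzzy_predict and its arguments are hypothetical and use only the standard library; they are not part of Vanguard. A scalar x_std is treated as the same noise level for every feature, mirroring the two accepted forms listed above.

```python
import random


def fuzzy_predict(predict, x, x_std, n_samples=2000, seed=0):
    """Monte Carlo average of predict() over noisy copies of the point x."""
    rng = random.Random(seed)
    # A scalar standard deviation is broadcast across all features.
    if isinstance(x_std, (int, float)):
        x_std = [float(x_std)] * len(x)
    total = 0.0
    for _ in range(n_samples):
        noisy_x = [rng.gauss(x_i, std_i) for x_i, std_i in zip(x, x_std)]
        total += predict(noisy_x)
    return total / n_samples


def toy_model(point):
    # A stand-in for a predictive mean: linear in both features.
    return 2 * point[0] + point[1]


estimate = fuzzy_predict(toy_model, [1.0, 3.0], [0.1, 0.1])
```

For a linear toy model the estimate stays close to the noise-free prediction; for a non-linear model the input noise would shift and widen the result.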

predictive_likelihood(x)[source]¶

Calculate the predictive likelihood at an x-value.

Parameters:: x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_predictions, n_features) The points at which to obtain the likelihood.
Return type:: Posterior
Returns:: The marginal distribution.

fuzzy_predictive_likelihood(x, x_std)[source]¶

Calculate the predictive likelihood at an x-value, given variance.

Parameters:

x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_predictions, n_features) The points at which to obtain the likelihood.
x_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_predictions, n_features) The std-dev of input points.

Return type:

Posterior

Returns:

The marginal distribution.

classmethod new(instance, **kwargs)[source]¶

Create an instance of the class with the same initialisation parameters as an existing instance.

Any keyword arguments passed in this method will overwrite the values used for the initialisation of the new instance. Calling type(instance).new(instance) is essentially equivalent to creating a copy of instance, albeit with some parameters potentially remaining connected.

Warning

This method is not guaranteed to return a deep copy of an instance if the classes match. Attributes such as the training data and kernel are likely to be shared across instances. To mitigate this, explicitly pass copies of these as keyword parameters.

Parameters:: instance (Self)
Return type:: Self
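The sharing described in the warning can be sketched with a toy class. ToyController below is a hypothetical stand-in that stores its initialisation parameters in a dictionary; Vanguard's own bookkeeping may differ. The point is only that re-using stored parameters shares mutable objects unless a copy is passed explicitly.

```python
import copy


class ToyController:
    """Hypothetical stand-in; not Vanguard's implementation."""

    def __init__(self, train_x, kernel):
        self._init_params = {"train_x": train_x, "kernel": kernel}
        self.train_x = train_x
        self.kernel = kernel

    @classmethod
    def new(cls, instance, **kwargs):
        # Start from the stored parameters, then apply any overrides.
        params = dict(instance._init_params)
        params.update(kwargs)
        return cls(**params)


original = ToyController(train_x=[1.0, 2.0], kernel={"lengthscale": 1.0})

shared = ToyController.new(original)
independent = ToyController.new(original, train_x=copy.deepcopy(original.train_x))
```

Here shared.train_x is the very same list as original.train_x, whereas independent.train_x is an equal but separate copy.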

Posteriors¶

Posterior classes.

Vanguard contains classes to represent posterior distributions, which are used to encapsulate the predictive posterior of a model at some input points.

class vanguard.base.posteriors.Posterior(distribution)[source]¶

Represents a posterior predictive distribution over a collection of points.

Note

Various Vanguard decorators are expected to overwrite the prediction() and confidence_interval() methods of this class. However, the _tensor_prediction() and _tensor_confidence_interval() methods should remain untouched, in order to avoid accidental double transformations.

Parameters:: distribution (Distribution) – The distribution.

__init__(distribution)[source]¶

Initialise self.

Parameters:: distribution (Distribution)

property condensed_distribution: Distribution¶

Return the condensed distribution.

Return a representative distribution of the posterior, with 1-dimensional mean and 2-dimensional covariance. In standard cases, this will just return the distribution.

prediction()[source]¶

Return the predictive mean and covariance.

Return type:

tuple[Tensor, Tensor]

Returns:

(means, covar) where:

means: (n_predictions,) The posterior predictive mean,
covar: (n_predictions, n_predictions) The posterior predictive covariance matrix.

confidence_interval(alpha=0.05)[source]¶

Construct confidence intervals around the mean of the predictive posterior.

Parameters:: alpha (float) – The significance level of the CIs.
Return type:: tuple[Tensor, Tensor, Tensor]
Returns:: The (median, lower, upper) bounds of the confidence interval for the predictive posterior, each of shape (n_predictions,).
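For a Gaussian posterior, the role of alpha can be sketched with the standard library alone. The helper gaussian_confidence_interval is hypothetical and assumes independent Gaussian marginals described by a mean and a variance per point; it is not Vanguard's implementation.

```python
from statistics import NormalDist


def gaussian_confidence_interval(means, variances, alpha=0.05):
    """Return (median, lower, upper) lists for independent Gaussian marginals."""
    # Two-sided interval: alpha / 2 of the probability mass lies in each tail.
    z = NormalDist().inv_cdf(1 - alpha / 2)
    medians = list(means)  # the median of a Gaussian equals its mean
    lower = [m - z * v ** 0.5 for m, v in zip(means, variances)]
    upper = [m + z * v ** 0.5 for m, v in zip(means, variances)]
    return medians, lower, upper


median, lower, upper = gaussian_confidence_interval([0.0, 1.0], [1.0, 4.0])
```

A smaller alpha gives a larger multiplier z and therefore a wider interval.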

mse(y)[source]¶

Compute the mean-squared error of some values under the posterior.

Parameters:: y (Union[Tensor, float]) – (n, d) or (d,) where d is the dimension of the space on which the posterior is defined. Sum over first dimension if two dimensional.
Return type:: float
Returns:: The MSE of the given y values, i.e. \(\frac{1}{n}\sum_{i} (y_i - \hat{y}_i)^2\).
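The mean-squared error is the average of the squared residuals between the given values and the predicted means. A minimal standard-library sketch, in which mean_squared_error is a hypothetical helper rather than the method itself:

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_residuals = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_residuals) / len(y_true)


error = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```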

nll(y, noise_variance=0, alpha=np.float64(0.31731050786291415))[source]¶

Compute the negative log-likelihood of some values under the posterior.

Parameters:

y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n, d) or (d,) where d is the dimension of the space on which the posterior is defined. Sum over first dimension if two dimensional.
noise_variance (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – Additional variance to be included in the calculation.
alpha (float) – The significance of the confidence interval used to calculate the standard deviation.

Return type:

float

Returns:

The negative log-likelihood of the given y values.
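The default value of alpha in the signature is the two-sided tail probability of a Gaussian outside one standard deviation of its mean. The sketch below checks this numerically and shows how a standard deviation could be recovered from a symmetric interval; std_from_interval is a hypothetical helper, and the assumption that the method uses the interval half-width in this way is not confirmed by the documentation.

```python
import math
from statistics import NormalDist

# P(|Z| > 1) for a standard Gaussian Z: the mass outside one standard deviation.
one_sigma_alpha = 1 - math.erf(1 / math.sqrt(2))


def std_from_interval(lower, upper, alpha):
    """Recover a Gaussian standard deviation from a symmetric (1 - alpha) interval."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return (upper - lower) / (2 * z)


recovered_std = std_from_interval(-2.0, 2.0, one_sigma_alpha)
```

With this choice of alpha the multiplier z is 1, so the half-width of the interval is the standard deviation itself.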

log_probability(y)[source]¶

Compute the log-likelihood of some values under the posterior.

Parameters:: y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n, d) or (d,) where d is the dimension of the space on which the posterior is defined. Sum over first dimension if two dimensional.
Return type:: float
Returns:: The log-likelihood of the given y values, i.e. \(\sum_{i} \log P(y_i)\) where \(P\) is the posterior density.

sample(n_samples=1)[source]¶

Draw independent samples from the posterior.

Parameters:: n_samples (int) – The number of samples to draw.
Return type:: Tensor

classmethod from_mean_and_covariance(mean, covariance)[source]¶

Construct from the mean and covariance of a Gaussian.

Parameters:

mean (Tensor) – (d,) or (d, t) The mean of the Gaussian.
covariance (Tensor) – (d, d) or (dt, dt) The covariance matrix of the Gaussian.

Return type:

Self

Returns:

The multivariate Gaussian distribution for either a single task or multiple tasks, depending on the shape of the args.
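The shape convention above can be expressed as a small check. The helper describe_gaussian_shapes is hypothetical and works on plain shape tuples; it only illustrates how a (d,) mean pairs with a (d, d) covariance and a (d, t) mean pairs with a (dt, dt) covariance.

```python
def describe_gaussian_shapes(mean_shape, covariance_shape):
    """Classify a (mean, covariance) shape pair as single-task or multi-task."""
    if len(mean_shape) == 1:
        (d,) = mean_shape
        expected = (d, d)
        kind = "single-task"
    elif len(mean_shape) == 2:
        d, t = mean_shape
        expected = (d * t, d * t)
        kind = "multi-task"
    else:
        raise ValueError(f"unsupported mean shape: {mean_shape}")
    if tuple(covariance_shape) != expected:
        raise ValueError(f"expected covariance shape {expected}, got {covariance_shape}")
    return kind


single = describe_gaussian_shapes((3,), (3, 3))
multi = describe_gaussian_shapes((3, 2), (6, 6))
```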

class vanguard.base.posteriors.MonteCarloPosteriorCollection(posterior_generator)[source]¶

A collection of posteriors over a set of points.

Enables fuzzy predictions and confidence intervals for models without any specific method to handle input uncertainty. Samples are lazily loaded if more are needed for a better prediction.

Parameters:: posterior_generator (Generator[Posterior, None, None]) – An infinite generator of Posterior objects.

Warning

In order to ensure reproducible output for predictions and confidence intervals, a cached sample is used.

MAX_POSTERIOR_ERRORS_BEFORE_RAISE: int = 100¶: The maximum number of RuntimeErrors that _yield_posteriors will suppress before raising.

__init__(posterior_generator)[source]¶

Initialise self.

Parameters:: posterior_generator (Generator[Posterior, None, None])

property condensed_distribution: Distribution¶

Return the condensed distribution.

Return a representative distribution of the posterior, with 1-dimensional mean and 2-dimensional covariance. In this case, return a distribution based on the mean and covariance returned by _tensor_prediction().

sample(n_samples=1)[source]¶

Draw independent samples from the posterior.

Parameters:: n_samples (int) – An integer specifying the number of samples to draw.
Return type:: Tensor

classmethod from_mean_and_covariance(mean, covariance)[source]¶

Construct from the mean and covariance of a Gaussian.

Parameters:

mean (Tensor) – (d,) or (d, t) The mean of the Gaussian.
covariance (Tensor) – (d, d) or (dt, dt) The covariance matrix of the Gaussian.

Return type:

NoReturn

Returns:

Nothing. The NoReturn annotation indicates that this method always raises, so a collection cannot be constructed from a mean and covariance.

static _decide_mc_num_samples(alpha)[source]¶

Determine an appropriately large number of Monte Carlo samples.

Determine an appropriately large number of Monte Carlo samples for a desired confidence level when computing confidence intervals with Monte Carlo integration. This method is motivated by a simple remark in [Owen13]. The factor is arbitrary; the number of samples just needs to be much larger than \(\frac{1}{\min(\alpha, 1-\alpha)}\).

Warning

The current method should give reasonable default behaviour, but it doesn’t come with any guarantees. Moreover, we may be demanding too many samples, which is inefficient.

Parameters:: alpha (float) – The significance level.
Return type:: int
Returns:: The number of samples.
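The rule of thumb described above can be sketched as follows. The function decide_mc_num_samples and its factor of 100 are illustrative assumptions; the factor actually used by the library is not stated here.

```python
import math


def decide_mc_num_samples(alpha, factor=100):
    """Pick a sample count much larger than 1 / min(alpha, 1 - alpha)."""
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    # The rarer the tail event, the more samples are needed to observe it.
    return math.ceil(factor / min(alpha, 1 - alpha))


quarter = decide_mc_num_samples(0.25)
half = decide_mc_num_samples(0.5)
```

The count is symmetric in alpha and 1 - alpha, and it grows as the significance level moves towards either extreme.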

Base Controller¶

The (non-user-facing) base class of Vanguard controllers.

The BaseGPController class contains the machinery of the GPController.

class vanguard.base.basecontroller.BaseGPController(train_x, train_y, kernel_class, mean_class, y_std, likelihood_class, marginal_log_likelihood_class, optimiser_class, smart_optimiser_class, rng=None, **kwargs)[source]¶

Contains the base machinery for the GPController class.

Parameters:

train_x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) – (n_samples, n_features) The mean of the inputs (or the observed values).
train_y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[integer]], float]) – (n_samples,) or (n_samples, 1) The response values.
kernel_class (type[Kernel]) – An uninstantiated subclass of gpytorch.kernels.Kernel.
mean_class (type[Mean]) – An uninstantiated subclass of gpytorch.means.Mean to use in the prior GP.
y_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float]) –
The observation noise standard deviation:
- ArrayLike[float] (n_samples,): known heteroskedastic noise,
- float: known homoskedastic noise assumed.
likelihood_class (type[Likelihood]) – An uninstantiated subclass of gpytorch.likelihoods.Likelihood. The default is gpytorch.likelihoods.FixedNoiseGaussianLikelihood.
marginal_log_likelihood_class (type[MarginalLogLikelihood]) – An uninstantiated subclass of an MLL from gpytorch.mlls. The default is gpytorch.mlls.ExactMarginalLogLikelihood.
optimiser_class (type[Optimizer]) – An uninstantiated torch.optim.Optimizer class used for gradient-based learning of hyperparameters. The default is torch.optim.Adam.
smart_optimiser_class (type[SmartOptimiser]) – An uninstantiated SmartOptimiser class used to wrap the optimiser_class and enable early stopping.
rng (Optional[Generator]) – Generator instance used to generate random numbers.

Keyword Arguments:

kernel_kwargs (dict): Keyword arguments to be passed to the kernel_class constructor.
mean_kwargs (dict): Keyword arguments to be passed to the mean_class constructor.
likelihood_kwargs (dict): Keyword arguments to be passed to the likelihood_class constructor.
gp_kwargs (dict): Keyword arguments to be passed to the gp_model_class constructor.
mll_kwargs (dict): Keyword arguments to be passed to the marginal_log_likelihood_class constructor.
optim_kwargs (dict): Keyword arguments to be passed to the optimiser_class constructor.
batch_size (int, None): The batch size to use in SGD. If None, the whole dataset is used at each iteration.
additional_metrics (List[function]): A list of additional metrics to track.

gp_model_class¶: alias of ExactGPModel

__init__(train_x, train_y, kernel_class, mean_class, y_std, likelihood_class, marginal_log_likelihood_class, optimiser_class, smart_optimiser_class, rng=None, **kwargs)[source]¶

Initialise self.

Parameters:

train_x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
train_y (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], ndarray[tuple[Any, ...], dtype[integer]], float])
kernel_class (type[Kernel])
mean_class (type[Mean])
y_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]], float])
likelihood_class (type[Likelihood])
marginal_log_likelihood_class (type[MarginalLogLikelihood])
optimiser_class (type[Optimizer])
smart_optimiser_class (type[SmartOptimiser])
rng (Optional[Generator])

property dtype: dtype | None¶: Return the default dtype of the controller.

property device: device¶: Return the default device of the controller.

property _likelihood: Likelihood¶: Return the likelihood of the model.

set_to_training_mode()[source]¶

Set trainable parameters to training mode.

Return type:: None

set_to_evaluation_mode()[source]¶

Set trainable parameters to evaluation mode.

Return type:: None

classmethod get_default_tensor_dtype()[source]¶

Get the default tensor type for this controller class.

Return type:: dtype

classmethod get_default_tensor_device()[source]¶

Get the default tensor device for this controller class.

Return type:: device

_predictive_likelihood(x)[source]¶

Calculate the predictive likelihood at an x-value.

Parameters:: x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The points at which to obtain the likelihood.
Return type:: Posterior
Returns:: The marginal distribution.

_fuzzy_predictive_likelihood(x, x_std)[source]¶

Calculate the predictive likelihood at an x-value, given variance.

Parameters:

x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The points at which to obtain the likelihood.
x_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The std-dev of input points.

Return type:

Posterior

Returns:

The marginal distribution.

_get_posterior_over_fuzzy_point_in_eval_mode(x, x_std)[source]¶

Obtain Monte Carlo integration samples from the predictive posterior with Gaussian input noise.

Parameters:

x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The predictive inputs.
x_std (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) –
The input noise standard deviations:
- array_like[float]: (n_features,) The standard deviation per input dimension for the predictions,
- float: Assume homoskedastic noise.

Return type:

Posterior

Returns:

The posterior distribution.

_set_requires_grad(value)[source]¶

Set the requires_grad flag of all trainable parameters.

Parameters:: value (bool) – The value to set for the requires_grad attribute.
Return type:: None

_sgd_round(n_iters=10, gradient_every=10)[source]¶

Use a gradient-based optimiser to tune the hyperparameters.

Parameters:

n_iters (int) – The number of gradient updates.
gradient_every (int) – How often (in iterations) to do HNIGP (heteroskedastic noisy input GP) gradient steps.

Return type:

Tensor

Returns:

The training loss at the last iteration.

_single_optimisation_step(x, y, retain_graph=False)[source]¶

Do a single forward pass and a single backward optimisation pass.

Parameters:

x (Tensor) – (n_samples, n_features) The inputs.
y (Tensor) – (n_samples, ?) The response values.
retain_graph (bool)

Return type:

Tensor

Returns:

The loss.

_loss(train_x, train_y)[source]¶

Compute the training loss (negative marginal log likelihood).

Parameters:

train_x (Tensor) – The observed values.
train_y (Tensor) – The response values.

Return type:

Tensor

Returns:

The loss.

_get_posterior_over_point_in_eval_mode(x)[source]¶

Predict the y-value of a single point in evaluation mode.

Parameters:: x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The predictive inputs.
Return type:: Posterior
Returns:: The posterior distribution.

_gp_forward(x)[source]¶

Pass inputs through the base GPyTorch GP model.

Parameters:: x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float])
Return type:: Distribution

_get_posterior_over_point(x)[source]¶

Predict the y-value of a single point. The mode (eval vs train) of the model is not changed.

Parameters:: x (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) – (n_predictions, n_features) The predictive inputs.
Return type:: Posterior
Returns:: The posterior distribution.

_process_x_std(std)[source]¶

Parse supplied std dev for input noise for different cases.

Parameters:

std (Union[Tensor, ndarray[tuple[Any, ...], dtype[float]], float]) –

The standard deviation. This can be:

array_like[float]: (n_point, self.dim) heteroskedastic input noise across feature dimensions, or
float: homoskedastic input noise across feature dimensions.

Return type:

Tensor

Returns:

The parsed standard deviation of shape (self.dim,) or (std.shape[0], self.dim) depending on the shape of std. If std is None then trainable values are returned.

_input_standardise_modules(*modules)[source]¶

Apply standard input scaling (mean zero, variance 1) to the supplied PyTorch nn.Modules.

The mean and variance are computed from the training inputs of self.

Parameters:: modules (type[Module]) – Modules to apply mean and variance to.
Return type:: list[type[Module]]
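The scaling itself (mean zero, variance one per feature) can be sketched with the standard library. The helper standardise_columns is hypothetical and uses the population standard deviation, which is an assumption; it shows only the arithmetic, not how the modules are wrapped.

```python
from statistics import fmean, pstdev


def standardise_columns(rows):
    """Scale each feature column to zero mean and unit (population) variance."""
    columns = list(zip(*rows))
    means = [fmean(column) for column in columns]
    stds = [pstdev(column) for column in columns]
    scaled = [
        [(value - mean) / std for value, mean, std in zip(row, means, stds)]
        for row in rows
    ]
    return scaled, means, stds


train_x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
scaled_x, feature_means, feature_stds = standardise_columns(train_x)
```

The means and standard deviations come from the training inputs only, so the same statistics would be re-used for any later inputs.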

classmethod set_default_tensor_dtype(dtype)[source]¶

Set the default tensor dtype for the class, subsequent subclasses, and external tensors.

Parameters:: dtype (dtype) – The tensor dtype to use as the default.
Return type:: None

classmethod set_default_tensor_device(device)[source]¶

Set the default tensor device for the class, subsequent subclasses, and external tensors.

Parameters:: device (device) – The device to use as the default.
Return type:: None

static _decide_noise_shape(posterior, x)[source]¶

Determine the correct shape of the likelihood noise.

Given a posterior distribution and an array of predictive inputs, determine the correct size of a noise term to supply to the likelihood to match the model and the number of predictive points.

Parameters:

posterior (Posterior) – The posterior distribution that will be combined with the noise in a likelihood.
x (Union[Tensor, ndarray[tuple[Any, ...], dtype[floating]]]) – The predictive input points.

Return type:

tuple[int, ...]

Returns:

The correct shape for the likelihood noise in this case.

static warn_normalise_y()[source]¶

Give a warning to indicate that y values have not been standard scaled.

Can be overridden and disabled by subclasses when not relevant.

Return type:: None

vanguard.base.basecontroller._catch_and_check_module_errors(controller)[source]¶

Handle some hard-to-detect errors that may occur within GP model classes.

Parameters:: controller (BaseGPController) – The controller that owns the module class.
Return type:: Callable

Metrics¶

Keep track of loss and other metrics when training.

Vanguard pre-attaches a number of metrics to all controller classes and tracks them during training. These are calculated per iteration by the MetricsTracker class.

vanguard.base.metrics.loss(loss_value, controller)[source]¶

Return the loss value.

Parameters:

loss_value (float)
controller (Optional[BaseGPController])

Return type:

float

class vanguard.base.metrics.MetricsTracker(*metrics)[source]¶

Tracks metrics for a controller.

Warning

Passing lambda functions is discouraged, as each lambda function will overwrite the previous one. Instead, create distinct named functions for your metrics.

Example:

>>> from vanguard.base.metrics import MetricsTracker, loss
>>>
>>> tracker = MetricsTracker(loss)
>>> for loss_value in range(5):
...     tracker.run_metrics(float(loss_value), controller=None)
>>> with tracker.print_metrics():
...     for loss_value in range(5):
...         tracker.run_metrics(float(loss_value), controller=None)
iteration: 6, loss: 0.0
iteration: 7, loss: 1.0
iteration: 8, loss: 2.0
iteration: 9, loss: 3.0
iteration: 10, loss: 4.0
>>> with tracker.print_metrics(every=2):
...     for loss_value in range(5):
...         tracker.run_metrics(float(loss_value), controller=None)
iteration: 12, loss: 1.0
iteration: 14, loss: 3.0
>>> with tracker.print_metrics(every=2, format_string="loss: {loss}"):
...     for loss_value in range(5):
...         tracker.run_metrics(float(loss_value), controller=None)
loss: 1.0
loss: 3.0

Parameters:

metrics (Callable)

__init__(*metrics)[source]¶

Initialise self.

A metric takes the form of a function of (loss, controller) -> real number. The simplest and most obvious metric simply returns the loss value, e.g. see the function vanguard.base.metrics.loss. Other common examples might extract parameters from the controller, e.g. a kernel’s lengthscale, and return that.

Parameters:: metrics (Callable)
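A metric with the (loss, controller) signature can be sketched as below. The controller here is a SimpleNamespace stand-in and the attribute path controller.kernel.lengthscale is hypothetical; on a real controller the path to a kernel parameter will differ. The last line shows that all lambda functions share one name, which is presumably why they would overwrite each other if metrics are stored by function name (an assumption, not confirmed here).

```python
from types import SimpleNamespace


def loss(loss_value, controller):
    """The simplest metric: report the loss unchanged."""
    return loss_value


def lengthscale(loss_value, controller):
    """Report a parameter read from the controller (hypothetical attribute path)."""
    return controller.kernel.lengthscale


# A toy object standing in for a controller.
toy_controller = SimpleNamespace(kernel=SimpleNamespace(lengthscale=0.5))

results = {
    metric.__name__: metric(1.25, toy_controller)
    for metric in (loss, lengthscale)
}

# Every lambda has the same __name__, so name-keyed storage cannot tell them apart.
lambda_names = {(lambda l, c: l).__name__, (lambda l, c: 2 * l).__name__}
```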

add_metrics(*metrics)[source]¶

Add metrics to the tracker.

See __init__ docstring for definition of metrics.

Parameters:: metrics (Callable)
Return type:: None

print_metrics(every=1, format_string=None)[source]¶

Temporarily enable printing of the metrics within a context manager.

Parameters:

every (int) – How often to print the output. Does not start on the first iteration. Defaults to 1 (print always).
format_string (Optional[str]) – Used to format the output. Keys passed here must match with information passed to the run_metrics() method. If None, all metrics will be printed.

Return type:

Iterator[None]
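The behaviour of the every argument can be sketched with a toy tracker. ToyTracker is a hypothetical stand-in built on contextlib, not the real MetricsTracker; it assumes that a line is printed whenever the running iteration count is a multiple of every, which is consistent with the class example but is not confirmed as the actual rule.

```python
from contextlib import contextmanager


class ToyTracker:
    """Hypothetical stand-in for a metrics tracker; not Vanguard's implementation."""

    def __init__(self, printer=print):
        self.iteration = 0
        self._every = None  # None means that printing is disabled
        self._printer = printer

    def run_metrics(self, loss_value):
        self.iteration += 1
        if self._every is not None and self.iteration % self._every == 0:
            self._printer(f"iteration: {self.iteration}, loss: {loss_value}")

    @contextmanager
    def print_metrics(self, every=1):
        self._every = every
        try:
            yield
        finally:
            self._every = None  # printing is switched off again on exit


lines = []
tracker = ToyTracker(printer=lines.append)
tracker.run_metrics(9.0)  # outside the context manager: nothing is printed
with tracker.print_metrics(every=2):
    for value in range(4):
        tracker.run_metrics(float(value))
tracker.run_metrics(7.0)  # printing is disabled again
```

The iteration count keeps running across calls, so which iterations are printed depends on how many were recorded before the context manager was entered.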

reset()[source]¶

Remove the stored metrics outputs and reset the iteration count.

Return type:: None

run_metrics(loss_value, controller, **additional_info)[source]¶

Each metric in the tracker will be run on the arguments of this method, and then stored for future reference. Iterations do not need to be passed. Additional information passed as keyword arguments can be displayed to the user when combined with print_metrics() and a customised format string.

Parameters:

loss_value (float) – The loss.
controller (Optional[BaseGPController]) – The controller instance.

Return type:

None