Creating a Decorator¶
[1]:
# © Crown Copyright GCHQ
#
# Licensed under the GNU General Public License, version 3 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.gnu.org/licenses/gpl-3.0.en.html
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
Although Vanguard has a number of out-of-the-box decorators to allow for advanced Gaussian process techniques, one might need something more specialist. Luckily, decorators in Vanguard are designed to be as extensible as possible. This walkthrough explains how to create a new decorator that shuffles the input data passed to a controller.
[2]:
from collections.abc import Iterable
from typing import Any, Callable, TypeVar, Union
import numpy as np
import torch
from gpytorch.kernels import RBFKernel
from gpytorch.likelihoods import FixedNoiseGaussianLikelihood
from gpytorch.means import ConstantMean
from gpytorch.mlls import ExactMarginalLogLikelihood
from numpy.typing import ArrayLike, NDArray
from vanguard.base import GPController
from vanguard.decoratorutils import Decorator, process_args, wraps_class
from vanguard.optimise import SmartOptimiser
from vanguard.uncertainty import GaussianUncertaintyGPController
[3]:
T = TypeVar("T")
SeedT = Union[ArrayLike, np.random.BitGenerator, None]
Recapping Python Decorators¶
In Python, a decorator is a function which takes a function and returns another function, usually one that wraps the original. Consider the following function:
[4]:
def is_py_file(file_path: str) -> bool:
    """
    Determine if a path points to a Python file.

    :param file_path: Path to query
    :return: :data:`True` if ``file_path`` has a Python extension
    """
    return str(file_path).endswith(".py")


is_py_file("foo.py"), is_py_file("bar.js")
[4]:
(True, False)
This function does not validate its input. As written, the str() call coerces whatever it is passed, so is_py_file(42) quietly returns False; without that coercion, a non-string input would fail with an AttributeError from the missing endswith method.
It would be preferable for the function to raise a TypeError, which could be achieved with a simple
try/except block, but it’s possible that is_py_file cannot be edited, or that this change needs to be made
to multiple functions which could each take a different number of inputs. Instead, a decorator can be used to check that
file_path is a string, without modifying the source of is_py_file:
[5]:
CallableStringT = TypeVar("CallableStringT", bound=Callable[[str, ...], Any])


def check_string(func: CallableStringT) -> CallableStringT:
    """Check that the input is a string."""

    def inner_function(*args: str) -> Any:
        for arg in args:
            if not isinstance(arg, str):
                raise TypeError("All inputs must be strings.")
        return func(*args)

    return inner_function
The decorator can then be applied in the following fashion:
[6]:
@check_string  # equivalent to: is_py_file = check_string(is_py_file)
def is_py_file(file_path: str) -> bool:
    """
    Determine if a path points to a Python file.

    :param file_path: Path to query
    :return: :data:`True` if ``file_path`` has a Python extension
    """
    return str(file_path).endswith(".py")
Sometimes it is helpful for a decorator to accept some arguments to adjust its behaviour. In this case, the function in question just needs to return a decorator:
[7]:
CallableTT = TypeVar("CallableTT", bound=Callable[[T, ...], Any])


def check_type(t: type[T]) -> Callable[[CallableTT], CallableTT]:
    """Check that the input is of a certain type."""

    def decorator(func: CallableTT) -> CallableTT:
        def inner_function(*args: T) -> Any:
            for arg in args:
                if not isinstance(arg, t):
                    raise TypeError(f"All inputs must be of type {t}.")
            return func(*args)

        return inner_function

    return decorator
[8]:
@check_type(str)  # equivalent to: is_py_file = check_type(str)(is_py_file)
def is_py_file(file_path: str) -> bool:
    """
    Determine if a path points to a Python file.

    :param file_path: Path to query
    :return: :data:`True` if ``file_path`` has a Python extension
    """
    return str(file_path).endswith(".py")
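To see the parametrised decorator in action, the following self-contained sketch repeats a simplified check_type (with looser type hints than the version above) and calls the decorated function with valid and invalid input. The helper try_call is hypothetical and exists only to turn the outcome into a printable string.

```python
from typing import Any, Callable


def check_type(expected: type) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Return a decorator that rejects positional arguments not of type ``expected``."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        def inner_function(*args: Any) -> Any:
            for arg in args:
                if not isinstance(arg, expected):
                    raise TypeError(f"All inputs must be of type {expected}.")
            return func(*args)

        return inner_function

    return decorator


@check_type(str)
def is_py_file(file_path: str) -> bool:
    """Determine if a path points to a Python file."""
    return file_path.endswith(".py")


def try_call(value: Any) -> str:
    """Hypothetical helper: call is_py_file and report the outcome as a string."""
    try:
        return str(is_py_file(value))
    except TypeError as error:
        return f"TypeError: {error}"


print(try_call("foo.py"))  # True
print(try_call(42))  # TypeError: All inputs must be of type <class 'str'>.
```

The check happens in inner_function before the original function body runs, so the non-string input never reaches endswith.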
Decorators in Vanguard¶
All decorators should inherit from Decorator in order to ensure
consistency, and to make use of the in-built features. The Decorator
requires a framework_class argument, which should be an (uninstantiated) subclass of
GPController. Any new features added by a decorator should be relative to its
framework class. If the decorator is applied to a different GPController subclass,
then checks will be run to ensure that this class does not define any new methods, nor overwrite any existing ones. The
reason for this is to avoid any potential issues with any extended features, forcing the user to explicitly ignore such
problems if they are certain it will not affect the validity of the decorator.
Vanguard decorators also take a required_decorators parameter (usually a set but can be any iterable), which
references a number of uninstantiated decorator classes which must be applied before a particular decorator can be
applied. This allows for maximum separation between functionality, and the majority of decorators do not have any
requirements.
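The idea of checking a class against a framework class can be illustrated with plain Python. The sketch below is not Vanguard's implementation: compare_methods, Framework and Extended are hypothetical names, and the check only inspects attributes defined directly on the candidate class. It is meant only to show what "new" and "overwritten" methods mean relative to a framework class.

```python
def compare_methods(framework_class: type, candidate_class: type) -> tuple[set[str], set[str]]:
    """Return (new, overwritten) callable names defined directly on ``candidate_class``."""
    new: set[str] = set()
    overwritten: set[str] = set()
    for name, value in vars(candidate_class).items():
        if not callable(value):
            continue  # skip __module__, __doc__ and other non-callable entries
        if hasattr(framework_class, name):
            overwritten.add(name)  # the framework already provides this name
        else:
            new.add(name)  # the framework knows nothing about this name
    return new, overwritten


class Framework:
    """Hypothetical stand-in for a framework class."""

    def fit(self): ...

    def predict(self): ...


class Extended(Framework):
    """Hypothetical subclass which adds one method and overwrites another."""

    def predict(self): ...

    def predict_at_point(self): ...


new_methods, overwritten_methods = compare_methods(Framework, Extended)
print(new_methods)  # {'predict_at_point'}
print(overwritten_methods)  # {'predict'}
```

A decorator written against Framework cannot know how predict_at_point or the replaced predict behave, which is why such differences are worth surfacing to the user.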
Creating a Decorator: Shuffling Inputs¶
Consider the following function:
[9]:
def consistent_shuffle(*arrays: NDArray[float], seed: SeedT = None) -> list[NDArray[float]]:
    """Shuffle all arrays into the same order, to maintain consistency."""
    rng = np.random.RandomState(seed=seed)
    indices = np.arange(len(arrays[0]))
    rng.shuffle(indices)
    shuffled_arrays = [array[indices] for array in arrays]
    return shuffled_arrays
[10]:
x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])
[11]:
consistent_shuffle(x, y, seed=1)
[11]:
[array([3, 2, 5, 1, 4]), array([ 9, 4, 25, 1, 16])]
This function will be applied to the train_x, train_y and y_std inputs to the newly decorated class. In
order to work with these parameters, the process_args() function comes in handy:
[12]:
process_args(
    GPController.__init__,
    None,
    x,
    y,
    RBFKernel,
    mean_class=ConstantMean,
    y_std=0.1,
    likelihood_class=FixedNoiseGaussianLikelihood,
    marginal_log_likelihood_class=ExactMarginalLogLikelihood,
    optimiser_class=torch.optim.Adam,
    smart_optimiser_class=SmartOptimiser,
)
[12]:
{'self': None,
'train_x': array([1, 2, 3, 4, 5]),
'train_y': array([ 1, 4, 9, 16, 25]),
'kernel_class': gpytorch.kernels.rbf_kernel.RBFKernel,
'mean_class': gpytorch.means.constant_mean.ConstantMean,
'y_std': 0.1,
'likelihood_class': gpytorch.likelihoods.gaussian_likelihood.FixedNoiseGaussianLikelihood,
'marginal_log_likelihood_class': gpytorch.mlls.exact_marginal_log_likelihood.ExactMarginalLogLikelihood,
'optimiser_class': torch.optim.adam.Adam,
'smart_optimiser_class': vanguard.optimise.optimiser.SmartOptimiser,
'rng': None}
This returns a parameter mapping, ensuring that all parameters are treated as keyword arguments, even the
ones which were passed positionally. This function can be used in the
_decorate_class() method of the decorator to intercept arguments
passed to the decorated class:
[13]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)

    def _decorate_class(self, cls: type[T]) -> type[T]:
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                old_train_x = all_parameters_as_kwargs.pop("train_x")
                old_train_y = all_parameters_as_kwargs.pop("train_y")
                old_y_std = all_parameters_as_kwargs.pop("y_std")  # pop to avoid duplication

                if isinstance(old_y_std, (float, int)):
                    old_y_std = np.ones_like(old_train_x) * old_y_std

                new_train_x, new_train_y, new_y_std = consistent_shuffle(old_train_x, old_train_y, old_y_std)
                super().__init__(train_x=new_train_x, train_y=new_train_y, y_std=new_y_std, **all_parameters_as_kwargs)

        return InnerClass
There are a few things to note here:
- A call to verify_decorated_class() will be made in the __call__ method to run checks for any new or overwritten methods in the decorated class. In special circumstances this can be ignored, although it is not recommended.
- Since this code is using super(), a value for self doesn’t need to be passed to process_args().
- Parameters are “popped” rather than simply referenced in order to avoid forgetting to set them before passing them forward, and to avoid any duplication.
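The interception pattern used in InnerClass.__init__ can be sketched with only the standard library. The example below is not Vanguard's process_args: args_as_kwargs is a hypothetical stand-in built on inspect.signature, and Base and Reversed are toy classes. It shows the same three steps: bind every argument to its parameter name, pop the ones to be changed, and pass everything forward as keywords.

```python
import inspect
from typing import Any, Callable


def args_as_kwargs(func: Callable[..., Any], *args: Any, **kwargs: Any) -> dict[str, Any]:
    """Bind a call to ``func``'s signature and return every parameter by name."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # include parameters the caller left at their defaults
    return dict(bound.arguments)


class Base:
    """Hypothetical stand-in for a controller class."""

    def __init__(self, train_x, train_y, y_std=0.0, rng=None):
        self.train_x = train_x
        self.train_y = train_y
        self.y_std = y_std
        self.rng = rng


class Reversed(Base):
    """Reverses the training data before handing it to the parent class."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # super().__init__ is a bound method, so ``self`` is not part of its signature.
        params = args_as_kwargs(super().__init__, *args, **kwargs)
        train_x = params.pop("train_x")  # pop so the name is not passed twice
        train_y = params.pop("train_y")
        super().__init__(train_x=train_x[::-1], train_y=train_y[::-1], **params)


instance = Reversed([1, 2, 3], [1, 4, 9], y_std=0.1)
print(instance.train_x, instance.train_y, instance.y_std, instance.rng)  # [3, 2, 1] [9, 4, 1] 0.1 None
```

Forgetting to pop train_x here would raise a TypeError for a repeated keyword argument, which is the duplication the third note warns about.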
The decorator can now be applied to a controller class in one of two ways. The latter is recommended for readability and extension.
[14]:
ShuffledGPController = ShuffleDecorator()(GPController)


@ShuffleDecorator()
class ShuffledGPController(GPController):  # noqa: F811
    """Shuffles inputs to the controller."""

    pass
Class Wrapping¶
Although the new ShuffledGPController will now work as expected, there are some inconsistencies in the docstrings
and the names. This is best observed using help(), but can be seen by inspecting the __name__ and __doc__
attributes:
[15]:
print(ShuffledGPController.__name__)
print(ShuffledGPController.__doc__)
InnerClass
An inner class.
This can be fixed by using the wraps_class() decorator, which behaves a lot like
functools.wraps(), only for classes:
[16]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)

    def _decorate_class(self, cls: type[T]) -> type[T]:
        @wraps_class(cls)
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                old_train_x = all_parameters_as_kwargs.pop("train_x")
                old_train_y = all_parameters_as_kwargs.pop("train_y")
                old_y_std = all_parameters_as_kwargs.pop("y_std")  # pop to avoid duplication

                if isinstance(old_y_std, (float, int)):
                    old_y_std = np.ones_like(old_train_x) * old_y_std

                new_train_x, new_train_y, new_y_std = consistent_shuffle(old_train_x, old_train_y, old_y_std)
                super().__init__(train_x=new_train_x, train_y=new_train_y, y_std=new_y_std, **all_parameters_as_kwargs)

        return InnerClass
[17]:
@ShuffleDecorator()
class ShuffledGPController(GPController):
    """Shuffles inputs to the controller."""

    pass


print(ShuffledGPController.__name__)
print(ShuffledGPController.__doc__)
ShuffledGPController
Shuffles inputs to the controller.
Warning
When a decorator is applied in-line with NewClass = Decorator()(OldClass), then the values NewClass.__name__
and NewClass.__doc__ will correspond to OldClass.__name__ and OldClass.__doc__ respectively. This is
often not expected behaviour, so should be done with care.
Note
wraps_class() will also take care of the names and docstrings of methods
within the wrapped class.
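What class wrapping does to names and docstrings can be approximated in a few lines. The sketch below is an assumption about the general idea rather than Vanguard's wraps_class: copy_class_metadata is a hypothetical helper which copies only class-level metadata and, unlike wraps_class(), does not touch the methods of the wrapped class.

```python
def copy_class_metadata(source: type):
    """Return a class decorator which copies identifying metadata from ``source``."""

    def decorator(target: type) -> type:
        for attribute in ("__module__", "__name__", "__qualname__", "__doc__"):
            setattr(target, attribute, getattr(source, attribute))
        return target

    return decorator


class Original:
    """Original docstring."""


def decorate(cls: type) -> type:
    """Hypothetical decorating function which subclasses ``cls``."""

    @copy_class_metadata(cls)
    class InnerClass(cls):
        """An inner class."""

    return InnerClass


Wrapped = decorate(Original)
print(Wrapped.__name__)  # Original
print(Wrapped.__doc__)  # Original docstring.
print(Wrapped is Original)  # False
```

The wrapped class is still a distinct subclass; only the metadata that help() and repr() rely on has been replaced.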
Decorator Parameters¶
Sometimes it is necessary to implement additional arguments to allow a user to adjust the behaviour of the decorator.
Since consistent_shuffle takes a seed parameter, it would be good to allow the decorator to make use of it:
[18]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, seed: SeedT = None, **kwargs: Any) -> None:
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)
        self.seed = seed

    def _decorate_class(self, cls: type[T]) -> type[T]:
        seed = self.seed

        @wraps_class(cls)
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                old_train_x = all_parameters_as_kwargs.pop("train_x")
                old_train_y = all_parameters_as_kwargs.pop("train_y")
                old_y_std = all_parameters_as_kwargs.pop("y_std")  # pop to avoid duplication

                if isinstance(old_y_std, (float, int)):
                    old_y_std = np.ones_like(old_train_x) * old_y_std

                new_train_x, new_train_y, new_y_std = consistent_shuffle(old_train_x, old_train_y, old_y_std, seed=seed)
                super().__init__(train_x=new_train_x, train_y=new_train_y, y_std=new_y_std, **all_parameters_as_kwargs)

        return InnerClass
Note that the intermediate variable seed is defined before entering InnerClass. This is necessary because within
the methods of InnerClass, self refers to the controller instance and no longer to the decorator instance.
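The scoping point can be reproduced without Vanguard. In the sketch below, Tagger, Plain and describe are hypothetical names; the decorator-like object stores a value on itself, copies it to a local variable, and the inner class reads that local variable through a closure.

```python
class Tagger:
    """Hypothetical decorator-like object which holds a parameter."""

    def __init__(self, tag: str) -> None:
        self.tag = tag

    def decorate(self, cls: type) -> type:
        tag = self.tag  # local name, captured by the closure below

        class InnerClass(cls):
            def describe(self) -> str:
                # Here ``self`` is the InnerClass instance, which has no ``tag`` attribute.
                return f"{cls.__name__} instance tagged with {tag!r}"

        return InnerClass


class Plain:
    """Hypothetical class to be decorated."""


Tagged = Tagger("demo").decorate(Plain)
instance = Tagged()
print(instance.describe())  # Plain instance tagged with 'demo'
print(hasattr(instance, "tag"))  # False
```

Writing self.tag inside describe would raise an AttributeError, because the attribute lives on the Tagger object rather than on the decorated instance.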
Handling Different Controllers¶
A good decorator would ideally be re-usable for many different components. However, note what happens when
ShuffleDecorator is applied to the GaussianUncertaintyGPController class.
[19]:
@ShuffleDecorator()
class ShuffledGaussianUncertaintyGPController(GaussianUncertaintyGPController):
    """Shuffles inputs to the controller."""

    pass
/tmp/ipykernel_644/355334906.py:1: UnexpectedMethodWarning: 'ShuffleDecorator': The class 'GaussianUncertaintyGPController' has added the following unexpected methods:
* vanguard.uncertainty.GaussianUncertaintyGPController._get_additive_grad_noise
* vanguard.uncertainty.GaussianUncertaintyGPController._noise_transform
* vanguard.uncertainty.GaussianUncertaintyGPController.predict_at_point
@ShuffleDecorator()
/tmp/ipykernel_644/355334906.py:1: OverwrittenMethodWarning: 'ShuffleDecorator': The class 'GaussianUncertaintyGPController' has overwritten the following methods:
* vanguard.uncertainty.GaussianUncertaintyGPController._set_requires_grad
* vanguard.uncertainty.GaussianUncertaintyGPController._get_posterior_over_fuzzy_point_in_eval_mode
* vanguard.uncertainty.GaussianUncertaintyGPController._process_x_std
* vanguard.uncertainty.GaussianUncertaintyGPController._sgd_round
* vanguard.uncertainty.GaussianUncertaintyGPController.__init__
@ShuffleDecorator()
To acknowledge that these methods are not expected to affect the behaviour of the decorator, they must be explicitly ignored:
[20]:
@ShuffleDecorator(
    ignore_methods={
        "predict_at_point",
        "_get_additive_grad_noise",
        "_noise_transform",
        "_append_constant_to_infinite_generator",
    }
)
class ShuffledGaussianUncertaintyGPController(GaussianUncertaintyGPController):  # noqa: F811
    """Shuffles inputs to the controller."""

    pass
/tmp/ipykernel_644/2722914397.py:1: OverwrittenMethodWarning: 'ShuffleDecorator': The class 'GaussianUncertaintyGPController' has overwritten the following methods:
* vanguard.uncertainty.GaussianUncertaintyGPController.__init__
* vanguard.uncertainty.GaussianUncertaintyGPController._process_x_std
* vanguard.uncertainty.GaussianUncertaintyGPController._sgd_round
* vanguard.uncertainty.GaussianUncertaintyGPController._get_posterior_over_fuzzy_point_in_eval_mode
* vanguard.uncertainty.GaussianUncertaintyGPController._set_requires_grad
@ShuffleDecorator(
Note
It is possible to ignore all of these warnings by passing ignore_all=True to the decorator, although this is
only recommended if one is certain that changing the decorated controller will not cause any new errors. Also, passing
raise_instead=True will raise an error instead of emitting a warning, which will cause the program to stop
completely.
The remaining methods are expected, but have been overwritten. Most of them are not expected to affect the decorator
either, with the exception of __init__. Although __init__ could be ignored and the code would run,
GaussianUncertaintyGPController takes a train_x_std parameter which would also need to be
shuffled. Leaving it unshuffled would be a problem for a user of the decorator, and can be avoided by adding the ability to pass
additional parameters to be shuffled:
[21]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, seed: SeedT = None, additional_params_to_shuffle: Iterable[str] = (), **kwargs: Any) -> None:
        if additional_params_to_shuffle:
            # .get() avoids a KeyError when the user did not pass ignore_methods at all
            kwargs["ignore_methods"] = set(kwargs.get("ignore_methods", ())) | {"__init__"}
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)
        self.seed = seed
        self.params_to_shuffle = set.union({"train_x", "train_y", "y_std"}, set(additional_params_to_shuffle))

    def _decorate_class(self, cls: type[T]) -> type[T]:
        seed = self.seed
        params_to_shuffle = self.params_to_shuffle

        @wraps_class(cls)
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                # read (not pop) train_x first, so scalars can be broadcast to its shape
                array_for_reference = all_parameters_as_kwargs["train_x"]
                pre_shuffled_args = [all_parameters_as_kwargs.pop(param) for param in params_to_shuffle]
                pre_shuffled_args_as_arrays = [
                    np.ones_like(array_for_reference) * arg if isinstance(arg, (float, int)) else arg
                    for arg in pre_shuffled_args
                ]

                shuffled_args = consistent_shuffle(*pre_shuffled_args_as_arrays, seed=seed)
                shuffled_params_as_kwargs = dict(zip(params_to_shuffle, shuffled_args))
                super().__init__(**shuffled_params_as_kwargs, **all_parameters_as_kwargs)

        return InnerClass
There are a few changes to unpack here; take note of the following:
- If a user passes additional_params_to_shuffle, then it can be assumed that they have properly checked __init__, and it can be automatically ignored by the decorator.
- The popping and array-converting of parameters now needs to be less constrained, and done more programmatically.
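The programmatic pop, broadcast and shuffle steps can be sketched on plain lists. The example below is a simplified illustration, not the decorator's code: shuffle_named is a hypothetical helper, it uses the standard library random module instead of NumPy, and it assumes a train_x entry is always present as the reference for lengths.

```python
import random
from typing import Any, Iterable, Optional


def shuffle_named(params: dict[str, Any], names: Iterable[str], seed: Optional[int] = None) -> dict[str, Any]:
    """Shuffle the named entries of ``params`` into one shared order."""
    remaining = dict(params)  # copy, so the caller's mapping is left untouched
    length = len(remaining["train_x"])  # assumption: train_x sets the reference length
    indices = list(range(length))
    random.Random(seed).shuffle(indices)

    shuffled = {}
    for name in sorted(names):
        value = remaining.pop(name)  # pop so each name is passed forward exactly once
        if isinstance(value, (int, float)):
            value = [value] * length  # broadcast a scalar to the reference length
        shuffled[name] = [value[index] for index in indices]
    return {**shuffled, **remaining}


params = {"train_x": [1, 2, 3, 4, 5], "train_y": [1, 4, 9, 16, 25], "y_std": 0.1, "kernel": "rbf"}
result = shuffle_named(params, {"train_x", "train_y", "y_std"}, seed=1)

# Every y value should still sit next to the x value it was paired with.
pairs_intact = all(y == x * x for x, y in zip(result["train_x"], result["train_y"]))
print(pairs_intact)  # True
```

Because every named entry is indexed with the same list of indices, adding another parameter name to the set requires no other change to the function.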
[22]:
ignore_methods = {
    "_get_posterior_over_fuzzy_point_in_eval_mode",
    "__init__",
    "_sgd_round",
    "_process_x_std",
    "_set_requires_grad",
    "predict_at_point",
    "_get_additive_grad_noise",
    "_noise_transform",
    "_append_constant_to_infinite_generator",
}


@ShuffleDecorator(seed=1, additional_params_to_shuffle={"train_x_std"}, ignore_methods=ignore_methods)
class ShuffledGaussianUncertaintyGPController(GaussianUncertaintyGPController):  # noqa: F811
    """Shuffles inputs to the controller."""

    pass
There are plenty of other ways in which ShuffleDecorator can be improved or made more extendable, but the concepts
are more or less the same.
[23]:
train_x = np.array([1, 2, 3, 4, 5])
train_x_std = np.array([0.01, 0.02, 0.03, 0.04, 0.05])
train_y = np.array([1, 4, 9, 16, 25])
y_std = np.array([0.02, 0.04, 0.06, 0.08, 0.1])
[24]:
controller = ShuffledGaussianUncertaintyGPController(
    train_x,
    train_x_std,
    train_y,
    y_std,
    kernel_class=RBFKernel,
    mean_class=ConstantMean,
    likelihood_class=FixedNoiseGaussianLikelihood,
    marginal_log_likelihood_class=ExactMarginalLogLikelihood,
    optimiser_class=torch.optim.Adam,
)
/home/docs/checkouts/readthedocs.org/user_builds/vanguard/envs/latest/lib/python3.13/site-packages/vanguard/base/basecontroller.py:573: UserWarning: A regression problem with no warping may suffer from numerical instability in optimisation if the y values are not standard scaled. Using the NormaliseY decorator will likely help.
warnings.warn(
[25]:
print(controller.train_x.T)
print(controller.train_x_std)  # 1-D tensor: transposing it is a no-op and deprecated in PyTorch
print(controller.train_y.T)
print(controller._y_variance)  # 1-D tensor, printed without .T for the same reason
tensor([[3., 2., 5., 1., 4.]])
tensor([0.0300, 0.0200, 0.0500, 0.0100, 0.0400])
tensor([[ 9., 4., 25., 1., 16.]])
tensor([0.0036, 0.0016, 0.0100, 0.0004, 0.0064])