Creating a Decorator

[1]:
# © Crown Copyright GCHQ
#
# Licensed under the GNU General Public License, version 3 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.gnu.org/licenses/gpl-3.0.en.html
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Although Vanguard has a number of out-of-the-box decorators to allow for advanced Gaussian process techniques, one might need something more specialist. Luckily, decorators in Vanguard are designed to be as extensible as possible. This walkthrough will explain how to create a new decorator to shuffle the input data passed to a controller.

[2]:
from collections.abc import Iterable
from typing import Any, Callable, TypeVar, Union

import numpy as np
import torch
from gpytorch.kernels import RBFKernel
from gpytorch.likelihoods import FixedNoiseGaussianLikelihood
from gpytorch.means import ConstantMean
from gpytorch.mlls import ExactMarginalLogLikelihood
from numpy.typing import ArrayLike, NDArray

from vanguard.base import GPController
from vanguard.decoratorutils import Decorator, process_args, wraps_class
from vanguard.optimise import SmartOptimiser
from vanguard.uncertainty import GaussianUncertaintyGPController
[3]:
T = TypeVar("T")
SeedT = Union[ArrayLike, np.random.BitGenerator, None]

Recapping Python Decorators

In Python, a decorator is a function which takes a function as input and returns another function. Consider the following function:

[4]:
def is_py_file(file_path: str) -> bool:
    """
    Determine if a path points to a Python file.

    :param file_path: Path to query
    :return: :data:`True` if ``file_path`` has a Python extension
    """
    return file_path.endswith(".py")


is_py_file("foo.py"), is_py_file("bar.js")
[4]:
(True, False)

This function will fail if it is passed anything other than a string, but it will raise an AttributeError rather than a TypeError.
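
A minimal sketch of this failure mode, assuming a version of the function that calls .endswith() directly on its argument (named is_py_file_unchecked here purely for illustration). The helper error_name_for is also hypothetical; it only reports which exception type was raised.

```python
def is_py_file_unchecked(file_path):
    """Hypothetical variant that assumes ``file_path`` is already a string."""
    return file_path.endswith(".py")


def error_name_for(value):
    """Return the name of the exception raised for ``value``, or None if none is raised."""
    try:
        is_py_file_unchecked(value)
    except Exception as error:  # broad on purpose: only the exception type is reported
        return type(error).__name__
    return None


print(error_name_for("foo.py"))  # None
print(error_name_for(42))  # AttributeError
```

The error comes from inside the function body (an integer has no endswith attribute), which says little about what the caller did wrong.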

It would be preferable for the function to raise a TypeError, which could be achieved with a simple try/except block, but it’s possible that is_py_file cannot be edited, or that this change needs to be made to multiple functions which could each take a different number of inputs. Instead, a decorator can be used to check that file_path is a string, without mutating is_py_file:

[5]:
CallableStringT = TypeVar("CallableStringT", bound=Callable[[str, ...], Any])


def check_string(func: CallableStringT) -> CallableStringT:
    """Check that the input is a string."""

    def inner_function(*args: str) -> Any:
        for arg in args:
            if not isinstance(arg, str):
                raise TypeError("All inputs must be strings.")
        return func(*args)

    return inner_function

The decorator can then be applied in the following fashion:

[6]:
@check_string  # equivalent to: is_py_file = check_string(is_py_file)
def is_py_file(file_path: str) -> bool:
    """
    Determine if a path points to a Python file.

    :param file_path: Path to query
    :return: :data:`True` if ``file_path`` has a Python extension
    """
    return str(file_path).endswith(".py")
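
The effect of the decorator can be seen in a self-contained sketch. It repeats the definitions above so that it runs on its own, with the type hints simplified to Callable[..., Any] for brevity.

```python
from typing import Any, Callable


def check_string(func: Callable[..., Any]) -> Callable[..., Any]:
    """Check that every positional input is a string."""

    def inner_function(*args: str) -> Any:
        for arg in args:
            if not isinstance(arg, str):
                raise TypeError("All inputs must be strings.")
        return func(*args)

    return inner_function


@check_string
def is_py_file(file_path: str) -> bool:
    """Determine if a path points to a Python file."""
    return file_path.endswith(".py")


print(is_py_file("foo.py"))  # True

try:
    is_py_file(42)
except TypeError as error:
    message = str(error)
    print(message)  # All inputs must be strings.
```

Note that inner_function only accepts positional arguments, so a call such as is_py_file(file_path="foo.py") would be rejected; the sketch keeps that limitation.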

Sometimes it is helpful for a decorator to accept some arguments to adjust its behaviour. In this case, the function in question just needs to return a decorator:

[7]:
CallableTT = TypeVar("CallableTT", bound=Callable[[T, ...], Any])


def check_type(t: type[T]) -> Callable[[CallableTT], CallableTT]:
    """Check that the input is of a certain type."""

    def decorator(func: CallableTT) -> CallableTT:
        def inner_function(*args: T) -> Any:
            for arg in args:
                if not isinstance(arg, t):
                    raise TypeError(f"All inputs must be of type {t}.")
            return func(*args)

        return inner_function

    return decorator
[8]:
@check_type(str)  # equivalent to: is_py_file = check_type(str)(is_py_file)
def is_py_file(file_path: str) -> bool:
    """
    Determine if a path points to a Python file.

    :param file_path: Path to query
    :return: :data:`True` if ``file_path`` has a Python extension
    """
    return str(file_path).endswith(".py")
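
The parameterised decorator can be exercised in the same way. The sketch below repeats check_type so that it is self-contained, applies it to a hypothetical add function, and uses functools.wraps (an addition, not part of the code above) to show how a function decorator can preserve the name and docstring of the function it wraps.

```python
import functools
from typing import Any, Callable


def check_type(t: type) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    """Check that every positional input is of type ``t``."""

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        @functools.wraps(func)  # addition: copy __name__, __doc__, etc. from func
        def inner_function(*args: Any) -> Any:
            for arg in args:
                if not isinstance(arg, t):
                    raise TypeError(f"All inputs must be of type {t}.")
            return func(*args)

        return inner_function

    return decorator


@check_type(int)
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b


print(add(1, 2))  # 3
print(add.__name__)  # add

try:
    add(1, "2")
except TypeError as error:
    message = str(error)
    print(message)  # All inputs must be of type <class 'int'>.
```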

Decorators in Vanguard

All decorators should inherit from Decorator in order to ensure consistency, and to make use of the built-in features. Decorator requires a framework_class argument, which should be an (uninstantiated) subclass of GPController. Any new features added by a decorator should be relative to its framework class. If the decorator is applied to a different GPController subclass, then checks will be run to ensure that this class does not define any new methods, nor overwrite any existing ones. These checks guard against conflicts with the features added by the decorator: the user must explicitly ignore any flagged methods once they are certain that these do not affect the validity of the decorator.

Vanguard decorators also take a required_decorators parameter (usually a set but can be any iterable), which references a number of uninstantiated decorator classes which must be applied before a particular decorator can be applied. This allows for maximum separation between functionality, and the majority of decorators do not have any requirements.
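
The kind of check made against the framework class can be sketched with the standard library alone. This is not the Vanguard implementation: FrameworkBase, UserController and compare_methods are hypothetical names, and the sketch only compares callables defined directly on the subclass against those available on the framework class.

```python
import inspect


class FrameworkBase:
    """Hypothetical stand-in for a framework class."""

    def fit(self):
        return "fit"

    def predict(self):
        return "predict"


class UserController(FrameworkBase):
    """Hypothetical subclass which adds one method and overwrites another."""

    def predict(self):
        return "custom predict"

    def plot(self):
        return "plot"


def compare_methods(framework_class, cls, ignore_methods=()):
    """Return (unexpected, overwritten) method names of ``cls`` relative to ``framework_class``."""
    framework_methods = {name for name, _ in inspect.getmembers(framework_class, callable)}
    unexpected, overwritten = set(), set()
    for name, value in vars(cls).items():
        if not callable(value) or name in ignore_methods:
            continue
        if name in framework_methods:
            overwritten.add(name)  # defined on the framework class and redefined here
        else:
            unexpected.add(name)  # not known to the framework class at all
    return unexpected, overwritten


unexpected, overwritten = compare_methods(FrameworkBase, UserController)
print(unexpected)  # {'plot'}
print(overwritten)  # {'predict'}
```

Passing ignore_methods={"plot"} to this sketch would leave only the overwritten method reported, which mirrors the idea of acknowledging a method explicitly.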

Creating a Decorator: Shuffling Inputs

Consider the following function:

[9]:
def consistent_shuffle(*arrays: NDArray[float], seed: SeedT = None) -> list[NDArray[float]]:
    """Shuffle all arrays into the same order, to maintain consistency."""
    rng = np.random.RandomState(seed=seed)
    indices = np.arange(len(arrays[0]))
    rng.shuffle(indices)

    shuffled_arrays = [array[indices] for array in arrays]
    return shuffled_arrays
[10]:
x = np.array([1, 2, 3, 4, 5])
y = np.array([1, 4, 9, 16, 25])
[11]:
consistent_shuffle(x, y, seed=1)
[11]:
[array([3, 2, 5, 1, 4]), array([ 9,  4, 25,  1, 16])]

This function will be applied to the train_x, train_y and y_std inputs to the newly decorated class. In order to work with these parameters, the process_args() function comes in handy:

[12]:
process_args(
    GPController.__init__,
    None,
    x,
    y,
    RBFKernel,
    mean_class=ConstantMean,
    y_std=0.1,
    likelihood_class=FixedNoiseGaussianLikelihood,
    marginal_log_likelihood_class=ExactMarginalLogLikelihood,
    optimiser_class=torch.optim.Adam,
    smart_optimiser_class=SmartOptimiser,
)
[12]:
{'self': None,
 'train_x': array([1, 2, 3, 4, 5]),
 'train_y': array([ 1,  4,  9, 16, 25]),
 'kernel_class': gpytorch.kernels.rbf_kernel.RBFKernel,
 'mean_class': gpytorch.means.constant_mean.ConstantMean,
 'y_std': 0.1,
 'likelihood_class': gpytorch.likelihoods.gaussian_likelihood.FixedNoiseGaussianLikelihood,
 'marginal_log_likelihood_class': gpytorch.mlls.exact_marginal_log_likelihood.ExactMarginalLogLikelihood,
 'optimiser_class': torch.optim.adam.Adam,
 'smart_optimiser_class': vanguard.optimise.optimiser.SmartOptimiser,
 'rng': None}

This returns a parameter mapping, essentially ensuring that all parameters are treated as keyword arguments, even the ones which were passed as positional arguments. This function can be used in the _decorate_class() method of the decorator to intercept arguments passed to the decorated class:

[13]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)

    def _decorate_class(self, cls: type[T]) -> type[T]:
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                old_train_x = all_parameters_as_kwargs.pop("train_x")
                old_train_y = all_parameters_as_kwargs.pop("train_y")
                old_y_std = all_parameters_as_kwargs.pop("y_std")  # pop to avoid duplication

                if isinstance(old_y_std, (float, int)):
                    old_y_std = np.ones_like(old_train_x) * old_y_std

                new_train_x, new_train_y, new_y_std = consistent_shuffle(old_train_x, old_train_y, old_y_std)

                super().__init__(train_x=new_train_x, train_y=new_train_y, y_std=new_y_std, **all_parameters_as_kwargs)

        return InnerClass

There are a few things to note here:

  • A call to verify_decorated_class() will be made in the __call__ method to run checks for any new or overwritten methods in the decorated class. In special circumstances this can be ignored, although it is not recommended.

  • Since this code is using super(), a value for self doesn’t need to be passed to process_args().

  • Parameters are “popped” rather than simply referenced in order to avoid forgetting to set them before passing them forward, and to avoid any duplication.
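
The idea behind the parameter mapping can be approximated with the standard library. The sketch below uses inspect.signature rather than process_args(), so it illustrates the concept and not how Vanguard implements it; Controller and args_as_kwargs are hypothetical names.

```python
import inspect


class Controller:
    """Hypothetical class with a mix of required and optional parameters."""

    def __init__(self, train_x, train_y, kernel_class, y_std=0.0, rng=None):
        self.train_x = train_x
        self.train_y = train_y
        self.kernel_class = kernel_class
        self.y_std = y_std
        self.rng = rng


def args_as_kwargs(func, *args, **kwargs):
    """Map every argument of a call to ``func`` onto its parameter name."""
    bound = inspect.signature(func).bind(*args, **kwargs)
    bound.apply_defaults()  # include parameters which were left at their defaults
    return dict(bound.arguments)


# ``None`` stands in for ``self``, since the unbound ``__init__`` is inspected.
params = args_as_kwargs(Controller.__init__, None, [1, 2, 3], [1, 4, 9], "rbf", y_std=0.1)
print(params)

# Popping a parameter before forwarding avoids passing it twice.
train_x = params.pop("train_x")
print("train_x" in params)  # False
```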

The decorator can now be applied to a controller class in one of two ways. The latter is recommended for readability and extension.

[14]:
ShuffledGPController = ShuffleDecorator()(GPController)


@ShuffleDecorator()
class ShuffledGPController(GPController):  # noqa: F811
    """Shuffles inputs to the controller."""

    pass

Class Wrapping

Although the new ShuffledGPController will now work as expected, there are some inconsistencies in the docstrings and the names. This is best observed using help(), but can be seen by inspecting the __name__ and __doc__ attributes:

[15]:
print(ShuffledGPController.__name__)
print(ShuffledGPController.__doc__)
InnerClass
An inner class.

This can be fixed by using the wraps_class() decorator, which behaves a lot like functools.wraps(), only for classes:

[16]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, **kwargs: Any) -> None:
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)

    def _decorate_class(self, cls: type[T]) -> type[T]:
        @wraps_class(cls)
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                old_train_x = all_parameters_as_kwargs.pop("train_x")
                old_train_y = all_parameters_as_kwargs.pop("train_y")
                old_y_std = all_parameters_as_kwargs.pop("y_std")  # pop to avoid duplication

                if isinstance(old_y_std, (float, int)):
                    old_y_std = np.ones_like(old_train_x) * old_y_std

                new_train_x, new_train_y, new_y_std = consistent_shuffle(old_train_x, old_train_y, old_y_std)

                super().__init__(train_x=new_train_x, train_y=new_train_y, y_std=new_y_std, **all_parameters_as_kwargs)

        return InnerClass
[17]:
@ShuffleDecorator()
class ShuffledGPController(GPController):
    """Shuffles inputs to the controller."""

    pass


print(ShuffledGPController.__name__)
print(ShuffledGPController.__doc__)
ShuffledGPController
Shuffles inputs to the controller.

Warning

When a decorator is applied in-line with NewClass = Decorator()(OldClass), then the values NewClass.__name__ and NewClass.__doc__ will correspond to OldClass.__name__ and OldClass.__doc__ respectively. This is often not the expected behaviour, so in-line application should be used with care.

Note

wraps_class() will also take care of the names and docstrings of methods within the wrapped class.
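
To illustrate what wrapping a class involves, here is a minimal standard-library sketch. It is not wraps_class() itself: the helper copy_class_metadata is hypothetical and only copies the class-level name, qualified name, module and docstring, without touching the names or docstrings of methods.

```python
def copy_class_metadata(original):
    """Hypothetical helper: make a class look like ``original`` from the outside."""

    def decorator(new_class):
        for attribute in ("__name__", "__qualname__", "__module__", "__doc__"):
            setattr(new_class, attribute, getattr(original, attribute))
        return new_class

    return decorator


def add_greeting(cls):
    """Toy class decorator which subclasses ``cls`` and adds a method."""

    @copy_class_metadata(cls)
    class InnerClass(cls):
        """An inner class."""

        def greet(self):
            return f"Hello from {type(self).__name__}"

    return InnerClass


@add_greeting
class Widget:
    """A simple widget."""


print(Widget.__name__)  # Widget
print(Widget.__doc__)  # A simple widget.
print(Widget().greet())  # Hello from Widget
```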

Decorator Parameters

Sometimes it is necessary to implement additional arguments to allow a user to adjust the behaviour of the decorator. Since consistent_shuffle takes a seed parameter, it would be good to allow the decorator to make use of it:

[18]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, seed: SeedT = None, **kwargs: Any) -> None:
        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)
        self.seed = seed

    def _decorate_class(self, cls: type[T]) -> type[T]:
        seed = self.seed

        @wraps_class(cls)
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                old_train_x = all_parameters_as_kwargs.pop("train_x")
                old_train_y = all_parameters_as_kwargs.pop("train_y")
                old_y_std = all_parameters_as_kwargs.pop("y_std")  # pop to avoid duplication

                if isinstance(old_y_std, (float, int)):
                    old_y_std = np.ones_like(old_train_x) * old_y_std

                new_train_x, new_train_y, new_y_std = consistent_shuffle(old_train_x, old_train_y, old_y_std, seed=seed)

                super().__init__(train_x=new_train_x, train_y=new_train_y, y_std=new_y_std, **all_parameters_as_kwargs)

        return InnerClass

Note that the intermediate variable seed is defined before entering InnerClass. This is necessary because within the scope of InnerClass, self no longer refers to the decorator instance.
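
The scoping point can be seen in a small standard-library sketch. Tagger and Point are hypothetical classes unrelated to Vanguard; inside InnerClass.__init__, self is the new instance of the decorated class, so the attribute stored on the decorator has to be captured in a local variable first.

```python
class Tagger:
    """Hypothetical decorator class which stores a tag on each instance."""

    def __init__(self, tag):
        self.tag = tag

    def __call__(self, cls):
        tag = self.tag  # capture now: ``self`` is the Tagger instance only out here

        class InnerClass(cls):
            def __init__(self, *args, **kwargs):
                super().__init__(*args, **kwargs)
                # Here ``self`` is an InnerClass instance, not the Tagger,
                # so the captured local ``tag`` is used instead of ``self.tag``.
                self.applied_tag = tag

        return InnerClass


@Tagger("shuffled")
class Point:
    def __init__(self, x):
        self.x = x


point = Point(3)
print(point.applied_tag)  # shuffled
print(hasattr(point, "tag"))  # False
```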

Handling Different Controllers

A good decorator would ideally be re-usable for many different components. However, note what happens when ShuffleDecorator is applied to the GaussianUncertaintyGPController class.

[19]:
@ShuffleDecorator()
class ShuffledGaussianUncertaintyGPController(GaussianUncertaintyGPController):
    """Shuffles inputs to the controller."""

    pass
/tmp/ipykernel_644/355334906.py:1: UnexpectedMethodWarning: 'ShuffleDecorator': The class 'GaussianUncertaintyGPController' has added the following unexpected methods:
* vanguard.uncertainty.GaussianUncertaintyGPController._get_additive_grad_noise
* vanguard.uncertainty.GaussianUncertaintyGPController._noise_transform
* vanguard.uncertainty.GaussianUncertaintyGPController.predict_at_point
  @ShuffleDecorator()
/tmp/ipykernel_644/355334906.py:1: OverwrittenMethodWarning: 'ShuffleDecorator': The class 'GaussianUncertaintyGPController' has overwritten the following methods:
* vanguard.uncertainty.GaussianUncertaintyGPController._set_requires_grad
* vanguard.uncertainty.GaussianUncertaintyGPController._get_posterior_over_fuzzy_point_in_eval_mode
* vanguard.uncertainty.GaussianUncertaintyGPController._process_x_std
* vanguard.uncertainty.GaussianUncertaintyGPController._sgd_round
* vanguard.uncertainty.GaussianUncertaintyGPController.__init__
  @ShuffleDecorator()

To acknowledge that these methods are not expected to affect the behaviour of the decorator, they must be explicitly ignored:

[20]:
@ShuffleDecorator(
    ignore_methods={
        "predict_at_point",
        "_get_additive_grad_noise",
        "_noise_transform",
        "_append_constant_to_infinite_generator",
    }
)
class ShuffledGaussianUncertaintyGPController(GaussianUncertaintyGPController):  # noqa: F811
    """Shuffles inputs to the controller."""

    pass
/tmp/ipykernel_644/2722914397.py:1: OverwrittenMethodWarning: 'ShuffleDecorator': The class 'GaussianUncertaintyGPController' has overwritten the following methods:
* vanguard.uncertainty.GaussianUncertaintyGPController.__init__
* vanguard.uncertainty.GaussianUncertaintyGPController._process_x_std
* vanguard.uncertainty.GaussianUncertaintyGPController._sgd_round
* vanguard.uncertainty.GaussianUncertaintyGPController._get_posterior_over_fuzzy_point_in_eval_mode
* vanguard.uncertainty.GaussianUncertaintyGPController._set_requires_grad
  @ShuffleDecorator(

Note

It is possible to ignore all of these warnings by passing ignore_all=True to the decorator, although this is only recommended if one is certain that changing the decorated controller will not cause any new errors. Also, passing raise_instead=True will raise an error instead of emitting a warning, which will cause the program to stop completely.

The remaining methods are expected, but have been overwritten. Most of them are not expected to affect the decorator either, with the exception of __init__. Although __init__ could be ignored and the code would run, GaussianUncertaintyGPController takes a train_x_std parameter which would also need to be shuffled. This would be a problem for a user of the decorator, and can be avoided by adding the ability to pass additional parameters to be shuffled:

[21]:
class ShuffleDecorator(Decorator):
    """Shuffles input data."""

    def __init__(self, seed: SeedT = None, additional_params_to_shuffle: Iterable[str] = (), **kwargs: Any) -> None:
        if additional_params_to_shuffle:
            kwargs["ignore_methods"] = set(kwargs.get("ignore_methods", ())) | {"__init__"}

        super().__init__(framework_class=GPController, required_decorators={}, **kwargs)

        self.seed = seed
        self.params_to_shuffle = set.union({"train_x", "train_y", "y_std"}, set(additional_params_to_shuffle))

    def _decorate_class(self, cls: type[T]) -> type[T]:
        seed = self.seed
        params_to_shuffle = self.params_to_shuffle

        @wraps_class(cls)
        class InnerClass(cls):
            """An inner class."""

            def __init__(self, *args: Any, **kwargs: Any) -> None:
                all_parameters_as_kwargs = process_args(super().__init__, *args, **kwargs)

                array_for_reference = all_parameters_as_kwargs["train_x"]

                pre_shuffled_args = [all_parameters_as_kwargs.pop(param) for param in params_to_shuffle]
                pre_shuffled_args_as_arrays = [
                    np.ones_like(array_for_reference) * arg if isinstance(arg, (float, int)) else arg
                    for arg in pre_shuffled_args
                ]
                shuffled_args = consistent_shuffle(*pre_shuffled_args_as_arrays, seed=seed)

                shuffled_params_as_kwargs = dict(zip(params_to_shuffle, shuffled_args))

                super().__init__(**shuffled_params_as_kwargs, **all_parameters_as_kwargs)

        return InnerClass

There are a few changes to unpack here; take note of the following:

  • If a user passes additional_params_to_shuffle, then it can be assumed that they have properly checked __init__, and it can be automatically ignored by the decorator.

  • The popping and array-converting of parameters now needs to be less constrained, and done more programmatically.
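
The programmatic pop, convert and shuffle steps can be sketched without NumPy or Vanguard. The function shuffle_selected is hypothetical, lists stand in for arrays, and random.Random replaces np.random.RandomState, so the order drawn will differ from the NumPy output shown elsewhere in this walkthrough.

```python
import random


def shuffle_selected(params, names_to_shuffle, seed=None):
    """Shuffle the named entries of ``params`` into one shared random order."""
    remaining = dict(params)  # copy so the caller's mapping is left untouched
    reference_length = len(remaining["train_x"])
    names = sorted(names_to_shuffle)  # fix an order so names and values stay aligned

    # Pop each parameter, expanding scalars to the reference length.
    popped = [remaining.pop(name) for name in names]
    as_lists = [
        [value] * reference_length if isinstance(value, (float, int)) else list(value)
        for value in popped
    ]

    # Draw one permutation and apply it to every list.
    indices = list(range(reference_length))
    random.Random(seed).shuffle(indices)
    shuffled = [[values[i] for i in indices] for values in as_lists]

    return {**dict(zip(names, shuffled)), **remaining}


params = {"train_x": [1, 2, 3, 4, 5], "train_y": [1, 4, 9, 16, 25], "y_std": 0.1, "kernel": "rbf"}
result = shuffle_selected(params, {"train_x", "train_y", "y_std"}, seed=1)

# Each y is still the square of its x, whatever order was drawn.
print(all(y == x**2 for x, y in zip(result["train_x"], result["train_y"])))  # True
print(result["y_std"])  # [0.1, 0.1, 0.1, 0.1, 0.1]
print(result["kernel"])  # rbf
```

Popping the shuffled names from the copy means the untouched parameters can be forwarded alongside the shuffled ones without any name appearing twice.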

[22]:
ignore_methods = {
    "_get_posterior_over_fuzzy_point_in_eval_mode",
    "__init__",
    "_sgd_round",
    "_process_x_std",
    "_set_requires_grad",
    "predict_at_point",
    "_get_additive_grad_noise",
    "_noise_transform",
    "_append_constant_to_infinite_generator",
}


@ShuffleDecorator(seed=1, additional_params_to_shuffle={"train_x_std"}, ignore_methods=ignore_methods)
class ShuffledGaussianUncertaintyGPController(GaussianUncertaintyGPController):  # noqa: F811
    """Shuffles inputs to the controller."""

    pass

There are plenty of other ways in which ShuffleDecorator can be improved or made more extensible, but the concepts are more or less the same. The finished decorator can be tried out on a small dataset:

[23]:
train_x = np.array([1, 2, 3, 4, 5])
train_x_std = np.array([0.01, 0.02, 0.03, 0.04, 0.05])
train_y = np.array([1, 4, 9, 16, 25])
y_std = np.array([0.02, 0.04, 0.06, 0.08, 0.1])
[24]:
controller = ShuffledGaussianUncertaintyGPController(
    train_x,
    train_x_std,
    train_y,
    y_std,
    kernel_class=RBFKernel,
    mean_class=ConstantMean,
    likelihood_class=FixedNoiseGaussianLikelihood,
    marginal_log_likelihood_class=ExactMarginalLogLikelihood,
    optimiser_class=torch.optim.Adam,
)
/home/docs/checkouts/readthedocs.org/user_builds/vanguard/envs/latest/lib/python3.13/site-packages/vanguard/base/basecontroller.py:573: UserWarning: A regression problem with no warping may suffer from numerical instability in optimisation if the y values are not standard scaled. Using the NormaliseY decorator will likely help.
  warnings.warn(
[25]:
print(controller.train_x.T)
print(controller.train_x_std)
print(controller.train_y.T)
print(controller._y_variance)
tensor([[3., 2., 5., 1., 4.]])
tensor([0.0300, 0.0200, 0.0500, 0.0100, 0.0400])
tensor([[ 9.,  4., 25.,  1., 16.]])
tensor([0.0036, 0.0016, 0.0100, 0.0004, 0.0064])