Bike Dataset
The bike dataset contains messy information about bike rentals, and is a good dataset for testing performance.
Supplied by the UC Irvine Machine Learning Repository [FanaeeT2013].
- class vanguard.datasets.bike.BikeDataset(num_samples=None, training_proportion=0.9, significance=0.025, noise_scale=0.001, rng=None)
Bases: Dataset
Comparison of bike rentals to weather information.
Contains the hourly count of rental bikes between 2011 and 2012 in the Capital bike sharing system, together with the corresponding weather and seasonal information. Supplied by the UC Irvine Machine Learning Repository [FanaeeT2013].
- Parameters:
  - num_samples (Optional[int]) – The number of samples to use. If None, all samples will be used.
  - training_proportion (float) – The proportion of data used for training, defaults to 0.9.
  - significance (float) – The significance used, defaults to 0.025.
  - noise_scale (float) – The standard deviation of a given vector v is taken to be noise_scale * np.abs(v).mean(). Defaults to 0.001.
  - rng (Optional[Generator]) – Generator instance used to generate random numbers.
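The noise_scale rule above can be illustrated with a small sketch. It is not the library's implementation: it uses only the standard library, so `statistics.fmean` over absolute values stands in for `np.abs(v).mean()`. The helper name `noise_std` and the sample values are hypothetical. How the class goes on to use this standard deviation is not described here, so the final noise-drawing step is an assumption.

```python
import random
import statistics


def noise_std(v, noise_scale=0.001):
    """Return noise_scale times the mean absolute value of v.

    Pure-Python equivalent of noise_scale * np.abs(v).mean().
    The function name is hypothetical, not part of the library.
    """
    return noise_scale * statistics.fmean(abs(x) for x in v)


# Made-up rental counts, for illustration only.
counts = [16.0, 40.0, 32.0, 13.0, 1.0]
std = noise_std(counts)

# Assumed use of the value: draw zero-mean Gaussian noise with that
# standard deviation from a seeded generator, for reproducibility.
rng = random.Random(0)
noisy = [x + rng.gauss(0.0, std) for x in counts]
```

Because the standard deviation scales with the mean magnitude of the vector, the default of 0.001 gives noise of about 0.1% of a typical value, whatever the units of v.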
- __init__(num_samples=None, training_proportion=0.9, significance=0.025, noise_scale=0.001, rng=None)
Initialise self.
- plot_prediction(pred_y_mean, pred_y_lower, pred_y_upper, y_upper_bound=None, error_width=0.3)
Plot a prediction using its confidence interval.
- Parameters:
  - pred_y_mean (Tensor) – Array of prediction means.
  - pred_y_lower (Tensor) – Lower bound of predictions, e.g. from a prediction interval.
  - pred_y_upper (Tensor) – Upper bound of predictions, e.g. from a prediction interval.
  - y_upper_bound (Optional[Tensor]) – If provided, any points in the test set above this value will be discarded from plotting.
  - error_width (float) – Error bar line width.
- Return type: