Package 'scoringutils'

Title: Utilities for Scoring and Assessing Predictions
Description: scoringutils facilitates the evaluation of forecasts in a convenient framework based on data.table. It allows users to check their forecasts and diagnose issues, to visualise forecasts and missing data, to transform data before scoring, to handle missing forecasts, to aggregate scores, and to visualise the results of the evaluation. The package mostly focuses on the evaluation of probabilistic forecasts and allows evaluating several different forecast types and input formats. Find more information about the package in the vignettes as well as in the accompanying paper, <doi:10.48550/arXiv.2205.07090>.
Authors: Nikos Bosse [aut, cre], Sam Abbott [aut], Hugo Gruson [aut], Johannes Bracher [ctb], Toshiaki Asakura [ctb], James Mba Azam [ctb], Sebastian Funk [aut], Michael Chirico [ctb]
Maintainer: Nikos Bosse <[email protected]>
License: MIT + file LICENSE
Version: 1.2.2.9000
Built: 2024-09-30 23:16:21 UTC
Source: https://github.com/epiforecasts/scoringutils

Help Index


Add relative skill scores based on pairwise comparisons

Description

Adds columns with relative skill scores computed by running pairwise comparisons on the scores. For more information on the computation of relative skill, see get_pairwise_comparisons(). Relative skill will be calculated for the aggregation level specified in by.

Usage

add_relative_skill(
  scores,
  compare = "model",
  by = NULL,
  metric = intersect(c("wis", "crps", "brier_score"), names(scores)),
  baseline = NULL
)

Arguments

scores

An object of class scores (a data.table with scores and an additional attribute metrics as produced by score()).

compare

Character vector with a single column name that defines the elements for the pairwise comparison. For example, if this is set to "model" (the default), then elements of the "model" column will be compared.

by

Character vector with column names that define further grouping levels for the pairwise comparisons. By default this is NULL and there will be one relative skill score per distinct entry of the column selected in compare. If further columns are given here, for example, by = "location" with compare = "model", then one separate relative skill score is calculated for every model in every location.

metric

A string with the name of the metric for which a relative skill shall be computed. By default this is either "crps", "wis" or "brier_score" if any of these are available.

baseline

A string with the name of a model. If a baseline is given, then a scaled relative skill with respect to the baseline will be returned. By default (NULL), relative skill will not be scaled with respect to a baseline model.
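
Examples

The following is a minimal usage sketch, assuming the scoringutils package is installed and that its bundled example_quantile data can be scored as shown on the other help pages. It only uses functions named in this manual (as_forecast_quantile(), score(), add_relative_skill()). The names of the added columns are not documented here, so the code simply prints the resulting column names.

```r
library(scoringutils)

# Score the example quantile forecasts shipped with the package
scores <- score(as_forecast_quantile(example_quantile))

# One relative skill score per model within each location
scores_with_skill <- add_relative_skill(
  scores,
  compare = "model",
  by = "location"
)

# Inspect which columns were added
names(scores_with_skill)
```

Passing the name of one of the models as baseline should additionally return relative skill scaled with respect to that model, as described in the Arguments section.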


Absolute error of the median (quantile-based version)

Description

Compute the absolute error of the median calculated as

\textrm{abs}(\textrm{observed} - \textrm{median prediction})

The median prediction is the predicted value for which quantile_level == 0.5. The function therefore requires 0.5 to be among the quantile levels in quantile_level.

Usage

ae_median_quantile(observed, predicted, quantile_level)

Arguments

observed

Numeric vector of size n with the observed values.

predicted

Numeric nxN matrix of predictive quantiles, n (number of rows) being the number of forecasts (corresponding to the number of observed values) and N (number of columns) the number of quantiles per forecast. If observed is just a single number, then predicted can just be a vector of size N.

quantile_level

Vector of size N with the quantile levels for which predictions were made.

Value

Numeric vector of length n with the absolute error of the median (one value per forecast).

Input format

metrics-quantile.png

See Also

ae_median_sample()

Examples

observed <- rnorm(30, mean = 1:30)
predicted_values <- replicate(3, rnorm(30, mean = 1:30))
ae_median_quantile(
  observed, predicted_values, quantile_level = c(0.2, 0.5, 0.8)
)
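
For intuition, the calculation described above can be reproduced in a few lines of base R. This is a minimal sketch, assuming predicted is an n x N matrix whose columns line up with quantile_level. The helper name ae_median_sketch is hypothetical and not part of scoringutils, and the package's own implementation may differ.

```r
# Hypothetical helper (not part of scoringutils): absolute error of the
# median for quantile-based forecasts, using base R only.
ae_median_sketch <- function(observed, predicted, quantile_level) {
  # A single forecast may be passed as a plain vector of N quantiles
  if (!is.matrix(predicted)) {
    predicted <- matrix(predicted, nrow = 1)
  }
  # Locate the column that holds the 0.5 quantile (the median prediction)
  median_col <- which(abs(quantile_level - 0.5) < 1e-8)
  if (length(median_col) != 1) {
    stop("quantile_level must contain 0.5 exactly once")
  }
  abs(observed - predicted[, median_col])
}

# Three forecasts (rows), each with three quantiles (columns)
observed <- c(10, 20, 30)
predicted <- cbind(c(8, 18, 27), c(11, 19, 32), c(14, 25, 35))
result <- ae_median_sketch(observed, predicted, quantile_level = c(0.2, 0.5, 0.8))
result
```

The result has one entry per forecast, i.e. one per row of predicted.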

Absolute error of the median (sample-based version)

Description

Absolute error of the median calculated as

\textrm{abs}(\textrm{observed} - \textrm{median prediction})

Usage

ae_median_sample(observed, predicted)

Arguments

observed

A vector with observed values of size n

predicted

nxN matrix of predictive samples, n (number of rows) being the number of data points and N (number of columns) the number of Monte Carlo samples. Alternatively, predicted can just be a vector of size n.

Value

Numeric vector of length n with the absolute error of the median.

Input format

metrics-sample.png

See Also

ae_median_quantile()

Examples

observed <- rnorm(30, mean = 1:30)
predicted_values <- matrix(rnorm(30, mean = 1:30))
ae_median_sample(observed, predicted_values)
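
The sample-based variant can be sketched in base R as the absolute difference between each observation and the median of its predictive samples. The helper name ae_median_sample_sketch is hypothetical and only illustrates the formula above. It is not the package's implementation.

```r
# Hypothetical helper (not part of scoringutils): absolute error of the
# median for sample-based forecasts, using base R only.
ae_median_sample_sketch <- function(observed, predicted) {
  # Treat a plain vector as one sample per observation (an n x 1 matrix)
  if (!is.matrix(predicted)) {
    predicted <- matrix(predicted, ncol = 1)
  }
  # Row-wise median across the N samples of each forecast
  median_prediction <- apply(predicted, 1, median)
  abs(observed - median_prediction)
}

# Two forecasts (rows), each with three samples (columns)
observed <- c(5, 10)
samples <- rbind(c(4, 6, 9), c(7, 8, 12))
result <- ae_median_sample_sketch(observed, samples)
result
```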

General information on creating a forecast object

Description

There are several as_forecast_<type>() functions to process and validate a data.frame (or similar) with forecasts and observations. If the input passes all input checks, it will be converted to a forecast object. A forecast object is a data.table with a class forecast and an additional class that depends on the forecast type. Every forecast type has its own as_forecast_<type>() function. See the details section below for more information on the expected input formats.

The as_forecast_<type>() functions give users some control over how their data is parsed. Using the arguments observed, predicted, etc., users can rename existing columns of their input data to match the required columns for a forecast object. Using the argument forecast_unit, users can specify the columns that uniquely identify a single forecast (and remove the others, see docs for the internal set_forecast_unit() for details).

The following functions are available:

  • as_forecast_binary()

  • as_forecast_nominal()

  • as_forecast_point()

  • as_forecast_quantile()

  • as_forecast_sample()

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

Value

Depending on the forecast type, an object of the following class will be returned:

  • forecast_binary for binary forecasts

  • forecast_point for point forecasts

  • forecast_sample for sample-based forecasts

  • forecast_quantile for quantile-based forecasts

Forecast types and input formats

Various different forecast types / forecast formats are supported. At the moment, those are:

  • point forecasts

  • binary forecasts ("soft binary classification")

  • nominal forecasts ("soft classification with multiple unordered classes")

  • Probabilistic forecasts in a quantile-based format (a forecast is represented as a set of predictive quantiles)

  • Probabilistic forecasts in a sample-based format (a forecast is represented as a set of predictive samples)

Forecast types are determined based on the columns present in the input data. Here is an overview of the required format for each forecast type:

required-inputs.png

All forecast types require a data.frame or similar with columns observed, predicted, and model.

Point forecasts require a column observed of type numeric and a column predicted of type numeric.

Binary forecasts require a column observed of type factor with exactly two levels and a column predicted of type numeric with probabilities, corresponding to the probability that observed is equal to the second factor level. See details here for more information.

Nominal forecasts require a column observed of type factor with N levels (where N is the number of possible outcomes), a column predicted of type numeric with probabilities (which sum to one across all possible outcomes), and a column predicted_label of type factor with N levels, denoting the outcome for which a probability is given. Forecasts must be complete, i.e. there must be a probability assigned to every possible outcome.

Quantile-based forecasts require a column observed of type numeric, a column predicted of type numeric, and a column quantile_level of type numeric with quantile-levels (between 0 and 1).

Sample-based forecasts require a column observed of type numeric, a column predicted of type numeric, and a column sample_id of type numeric with sample indices.

For more information see the vignettes and the example data (example_quantile, example_sample_continuous, example_sample_discrete, example_point, example_binary, and example_nominal).
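
To make the column requirements concrete, here is a small base R sketch of input data in the binary and the quantile-based format. The data values, the location column and the helper looks_binary() are made up for illustration. The helper is not a scoringutils function and only mirrors the requirements listed above.

```r
# Toy binary forecast: factor observed with two levels, numeric probabilities
binary_forecast <- data.frame(
  model = c("model_a", "model_a", "model_b", "model_b"),
  location = c("DE", "FR", "DE", "FR"),
  observed = factor(c("no", "yes", "no", "yes"), levels = c("no", "yes")),
  predicted = c(0.2, 0.7, 0.4, 0.9)
)

# Hypothetical check (not a scoringutils function) of the binary requirements
looks_binary <- function(data) {
  is.factor(data$observed) &&
    nlevels(data$observed) == 2 &&
    is.numeric(data$predicted) &&
    all(data$predicted >= 0 & data$predicted <= 1)
}

looks_binary(binary_forecast)

# Toy quantile-based forecast: numeric observed, predicted and quantile_level,
# with one row per quantile level of the same forecast
quantile_forecast <- data.frame(
  model = "model_a",
  location = "DE",
  observed = 11,
  predicted = c(8, 10, 13),
  quantile_level = c(0.25, 0.5, 0.75)
)
quantile_forecast
```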

Forecast unit

In order to score forecasts, scoringutils needs to know which of the rows of the data belong together and jointly form a single forecast. This is easy e.g. for point forecasts, where there is one row per forecast. For quantile or sample-based forecasts, however, there are multiple rows that belong to a single forecast.

The forecast unit or unit of a single forecast is then described by the combination of columns that uniquely identify a single forecast. For example, we could have forecasts made by different models in various locations at different time points, each for several weeks into the future. The forecast unit could then be described as forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). scoringutils automatically tries to determine the unit of a single forecast. It uses all existing columns for this, which means that no columns may be present that are unrelated to the forecast unit. As a very simplistic example, if you had an additional column, "even", that is one if the row number is even and zero otherwise, then this would mess up scoring as scoringutils would then think that this column was relevant in defining the forecast unit.

In order to avoid issues, we recommend setting the forecast unit explicitly, usually through the forecast_unit argument in the as_forecast() functions. This will drop unneeded columns, while making sure that all necessary, 'protected columns' like "predicted" or "observed" are retained.
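
The effect of an unrelated column can be illustrated with a base R sketch. It assumes, for illustration only, that observed, predicted and quantile_level are the required columns and that every other column is treated as part of the forecast unit. The helper count_forecasts() is hypothetical and does not reproduce the package's internal logic.

```r
# Toy quantile forecast: one model, one location, three quantile levels
forecasts <- data.frame(
  model = "model_a",
  location = "DE",
  quantile_level = c(0.25, 0.5, 0.75),
  predicted = c(8, 10, 13),
  observed = 11
)

# Columns assumed (for this sketch) to be required rather than part of
# the forecast unit
protected <- c("observed", "predicted", "quantile_level")

# Hypothetical helper: number of distinct forecasts implied by the
# remaining columns
count_forecasts <- function(data) {
  unit_cols <- setdiff(names(data), protected)
  nrow(unique(data[, unit_cols, drop = FALSE]))
}

n_before <- count_forecasts(forecasts)

# An unrelated column splits the three rows into two apparent forecasts
forecasts$even <- as.integer(seq_len(nrow(forecasts)) %% 2 == 0)
n_after <- count_forecasts(forecasts)

c(before = n_before, after = n_after)
```

With the extra column, the three quantiles no longer appear to belong to one forecast, which is the situation that setting the forecast unit explicitly avoids.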

See Also

Other functions to create forecast objects: as_forecast_binary(), as_forecast_nominal(), as_forecast_point(), as_forecast_quantile(), as_forecast_sample()

Examples

as_forecast_binary(example_binary)
as_forecast_quantile(
  example_quantile,
  forecast_unit = c("model", "target_type", "target_end_date",
                    "horizon", "location")
)

Create a forecast object for binary forecasts

Description

Create a forecast object for binary forecasts. See more information on forecast types and expected input formats by calling ?as_forecast().

Usage

as_forecast_binary(
  data,
  forecast_unit = NULL,
  observed = NULL,
  predicted = NULL
)

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

See Also

Other functions to create forecast objects: as_forecast, as_forecast_nominal(), as_forecast_point(), as_forecast_quantile(), as_forecast_sample()


Common functionality for as_forecast_<type> functions

Description

Common functionality for as_forecast_<type> functions

Usage

as_forecast_generic(
  data,
  forecast_unit = NULL,
  observed = NULL,
  predicted = NULL
)

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

Details

This function splits out part of the functionality of as_forecast_<type> that is the same for all as_forecast_<type> functions. It renames the required columns, where appropriate, and sets the forecast unit.


Create a forecast object for nominal forecasts

Description

Nominal forecasts are a form of categorical forecasts where the possible outcomes that the observed values can assume are not ordered. In that sense, nominal forecasts represent a generalisation of binary forecasts.

Usage

as_forecast_nominal(
  data,
  forecast_unit = NULL,
  observed = NULL,
  predicted = NULL,
  predicted_label = NULL
)

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

predicted_label

(optional) Name of the column in data that denotes the outcome to which a predicted probability corresponds. This column will be renamed to "predicted_label". Only applicable to nominal forecasts.

See Also

Other functions to create forecast objects: as_forecast, as_forecast_binary(), as_forecast_point(), as_forecast_quantile(), as_forecast_sample()


Create a forecast object for point forecasts

Description

Create a forecast object for point forecasts. See more information on forecast types and expected input formats by calling ?as_forecast().

When converting a forecast_quantile object into a forecast_point object, the 0.5 quantile is extracted and returned as the point forecast.
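
The conversion from quantiles to a point forecast can be pictured with plain data frames. The sketch below uses base R on toy data and is only an illustration of "extract the 0.5 quantile". It does not create forecast objects and is not the method's actual code.

```r
# Toy quantile-based forecast in long format (plain data.frame, base R only)
quantile_forecast <- data.frame(
  model = "model_a",
  location = rep(c("DE", "FR"), each = 3),
  observed = rep(c(11, 20), each = 3),
  quantile_level = rep(c(0.25, 0.5, 0.75), times = 2),
  predicted = c(8, 10, 13, 17, 21, 26)
)

# Keep the median rows and drop the quantile_level column
is_median <- abs(quantile_forecast$quantile_level - 0.5) < 1e-8
keep_cols <- names(quantile_forecast) != "quantile_level"
point_forecast <- quantile_forecast[is_median, keep_cols]
rownames(point_forecast) <- NULL
point_forecast
```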

Usage

as_forecast_point(data, ...)

## Default S3 method:
as_forecast_point(
  data,
  forecast_unit = NULL,
  observed = NULL,
  predicted = NULL,
  ...
)

## S3 method for class 'forecast_quantile'
as_forecast_point(data, ...)

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

...

Unused

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

See Also

Other functions to create forecast objects: as_forecast, as_forecast_binary(), as_forecast_nominal(), as_forecast_quantile(), as_forecast_sample()


Create a forecast object for quantile-based forecasts

Description

Create a forecast object for quantile-based forecasts. See more information on forecast types and expected input formats by calling ?as_forecast().

When creating a forecast_quantile object from a forecast_sample object, the quantiles are estimated by computing empirical quantiles from the samples via quantile(). Note that empirical quantiles are a biased estimator for the true quantiles, in particular in the tails of the distribution and when the number of available samples is low.
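
As an illustration of estimating quantiles from samples, the following base R sketch computes empirical quantiles per forecast with stats::quantile(). The toy data and the location column are assumptions made for this example, and the code works on plain data frames rather than forecast objects.

```r
set.seed(1)

# Toy sample-based forecast: 2 forecasts with 100 samples each
sample_forecast <- data.frame(
  location = rep(c("DE", "FR"), each = 100),
  sample_id = rep(1:100, times = 2),
  predicted = c(rnorm(100, mean = 10), rnorm(100, mean = 20))
)

probs <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# Empirical quantiles per forecast; type = 7 is the default of stats::quantile()
per_location <- split(sample_forecast$predicted, sample_forecast$location)
quantile_forecast <- do.call(rbind, lapply(names(per_location), function(loc) {
  data.frame(
    location = loc,
    quantile_level = probs,
    predicted = unname(quantile(per_location[[loc]], probs = probs, type = 7))
  )
}))
quantile_forecast
```

With only 100 samples per forecast, the 0.05 and 0.95 quantiles rest on few observations in the tails, which is where the bias mentioned above matters most.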

Usage

as_forecast_quantile(data, ...)

## Default S3 method:
as_forecast_quantile(
  data,
  forecast_unit = NULL,
  observed = NULL,
  predicted = NULL,
  quantile_level = NULL,
  ...
)

## S3 method for class 'forecast_sample'
as_forecast_quantile(
  data,
  probs = c(0.05, 0.25, 0.5, 0.75, 0.95),
  type = 7,
  ...
)

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

...

Unused

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

quantile_level

(optional) Name of the column in data that contains the quantile level of the predicted values. This column will be renamed to "quantile_level". Only applicable to quantile-based forecasts.

probs

A numeric vector of quantile levels for which quantiles will be computed. Corresponds to the probs argument in quantile().

type

Type argument passed down to the quantile function. For more information, see quantile().

See Also

Other functions to create forecast objects: as_forecast, as_forecast_binary(), as_forecast_nominal(), as_forecast_point(), as_forecast_sample()


Create a forecast object for sample-based forecasts

Description

Create a forecast object for sample-based forecasts

Usage

as_forecast_sample(
  data,
  forecast_unit = NULL,
  observed = NULL,
  predicted = NULL,
  sample_id = NULL
)

Arguments

data

A data.frame (or similar) with predicted and observed values. See the details section of as_forecast() for additional information on required input formats.

forecast_unit

(optional) Name of the columns in data (after any renaming of columns) that denote the unit of a single forecast. See get_forecast_unit() for details. If NULL (the default), all columns that are not required columns are assumed to form the unit of a single forecast. If specified, all columns that are not part of the forecast unit (or required columns) will be removed.

observed

(optional) Name of the column in data that contains the observed values. This column will be renamed to "observed".

predicted

(optional) Name of the column in data that contains the predicted values. This column will be renamed to "predicted".

sample_id

(optional) Name of the column in data that contains the sample id. This column will be renamed to "sample_id". Only applicable to sample-based forecasts.

See Also

Other functions to create forecast objects: as_forecast, as_forecast_binary(), as_forecast_nominal(), as_forecast_point(), as_forecast_quantile()


Assert Inputs Have Matching Dimensions

Description

Function assesses whether input dimensions match. In the following, n is the number of observations / forecasts. Scalar values may be repeated to match the length of the other input. Allowed options are therefore:

  • observed is vector of length 1 or length n

  • predicted is:

    • a vector of length 1 or length n

    • a matrix with n rows and 1 column
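
The allowed combinations listed above can be expressed as a small base R predicate. This is a sketch under the stated rules only. The function dims_ok_point_sketch() is hypothetical, returns TRUE or FALSE instead of throwing an error, and is not the package's implementation.

```r
# Hypothetical helper (not the package's implementation): TRUE if the
# dimensions of observed and predicted are compatible for point forecasts
dims_ok_point_sketch <- function(observed, predicted) {
  n_obs <- length(observed)
  if (is.matrix(predicted)) {
    # A matrix must have exactly one column
    if (ncol(predicted) != 1) {
      return(FALSE)
    }
    n_pred <- nrow(predicted)
  } else {
    n_pred <- length(predicted)
  }
  # Scalars may be recycled, so lengths must match or one of them must be 1
  n_obs == n_pred || n_obs == 1 || n_pred == 1
}

same_length <- dims_ok_point_sketch(1:5, c(1.1, 2.3, 2.9, 4.2, 5.5))
scalar_pred <- dims_ok_point_sketch(1:5, 3)
wide_matrix <- dims_ok_point_sketch(1:5, matrix(1:10, ncol = 2))
c(same_length, scalar_pred, wide_matrix)
```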

Usage

assert_dims_ok_point(observed, predicted)

Arguments

observed

Input to be checked. Should be a factor of length n with exactly two levels, holding the observed values. The highest factor level is assumed to be the reference level. This means that predicted represents the probability that the observed value is equal to the highest factor level.

predicted

Input to be checked. predicted should be a vector of length n, holding probabilities. Alternatively, predicted can be a matrix of size n x 1. Values represent the probability that the corresponding value in observed will be equal to the highest available factor level.

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.


Assert that input is a forecast object and passes validations

Description

Assert that an object is a forecast object (i.e. a data.table with a class forecast and an additional class forecast_* corresponding to the forecast type).

Usage

assert_forecast(forecast, forecast_type = NULL, verbose = TRUE, ...)

## Default S3 method:
assert_forecast(forecast, forecast_type = NULL, verbose = TRUE, ...)

## S3 method for class 'forecast_binary'
assert_forecast(forecast, forecast_type = NULL, verbose = TRUE, ...)

## S3 method for class 'forecast_point'
assert_forecast(forecast, forecast_type = NULL, verbose = TRUE, ...)

## S3 method for class 'forecast_quantile'
assert_forecast(forecast, forecast_type = NULL, verbose = TRUE, ...)

## S3 method for class 'forecast_sample'
assert_forecast(forecast, forecast_type = NULL, verbose = TRUE, ...)

Arguments

forecast

A forecast object (a validated data.table with predicted and observed values, see as_forecast()).

forecast_type

(optional) The forecast type you expect the forecasts to have. If the forecast type as determined by scoringutils based on the input does not match this, an error will be thrown. If NULL (the default), the forecast type will be inferred from the data.

verbose

Logical. If FALSE (default is TRUE), no messages and warnings will be created.

...

Currently unused. You cannot pass additional arguments to scoring functions via .... See the Customising metrics section below for details on how to use purrr::partial() to pass arguments to individual metrics.

Value

Returns NULL invisibly.

Forecast types and input formats

Various different forecast types / forecast formats are supported. At the moment, those are:

  • point forecasts

  • binary forecasts ("soft binary classification")

  • nominal forecasts ("soft classification with multiple unordered classes")

  • Probabilistic forecasts in a quantile-based format (a forecast is represented as a set of predictive quantiles)

  • Probabilistic forecasts in a sample-based format (a forecast is represented as a set of predictive samples)

Forecast types are determined based on the columns present in the input data. Here is an overview of the required format for each forecast type:

required-inputs.png

All forecast types require a data.frame or similar with columns observed, predicted, and model.

Point forecasts require a column observed of type numeric and a column predicted of type numeric.

Binary forecasts require a column observed of type factor with exactly two levels and a column predicted of type numeric with probabilities, corresponding to the probability that observed is equal to the second factor level. See details here for more information.

Nominal forecasts require a column observed of type factor with N levels (where N is the number of possible outcomes), a column predicted of type numeric with probabilities (which sum to one across all possible outcomes), and a column predicted_label of type factor with N levels, denoting the outcome for which a probability is given. Forecasts must be complete, i.e. there must be a probability assigned to every possible outcome.

Quantile-based forecasts require a column observed of type numeric, a column predicted of type numeric, and a column quantile_level of type numeric with quantile-levels (between 0 and 1).

Sample-based forecasts require a column observed of type numeric, a column predicted of type numeric, and a column sample_id of type numeric with sample indices.

For more information see the vignettes and the example data (example_quantile, example_sample_continuous, example_sample_discrete, example_point, example_binary, and example_nominal).

Examples

forecast <- as_forecast_binary(example_binary)
assert_forecast(forecast)

Validation common to all forecast types

Description

The function runs input checks that apply to all input data, regardless of forecast type. The function

  • asserts that the forecast is a data.table which has columns observed and predicted

  • checks the forecast type and forecast unit

  • checks there are no duplicate forecasts

  • if appropriate, checks the number of samples / quantiles is the same for all forecasts.

Usage

assert_forecast_generic(data, verbose = TRUE)

Arguments

data

A data.table with forecasts and observed values that should be validated.

verbose

Logical. If FALSE (default is TRUE), no messages and warnings will be created.

Value

Returns the input.


Assert that forecast type is as expected

Description

Assert that forecast type is as expected

Usage

assert_forecast_type(data, actual = get_forecast_type(data), desired = NULL)

Arguments

data

A forecast object (see as_forecast()).

actual

The actual forecast type of the data

desired

The desired forecast type of the data

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.


Assert that inputs are correct for binary forecast

Description

Function assesses whether the inputs correspond to the requirements for scoring binary forecasts.

Usage

assert_input_binary(observed, predicted)

Arguments

observed

Input to be checked. Should be a factor of length n with exactly two levels, holding the observed values. The highest factor level is assumed to be the reference level. This means that predicted represents the probability that the observed value is equal to the highest factor level.

predicted

Input to be checked. predicted should be a vector of length n, holding probabilities. Alternatively, predicted can be a matrix of size n x 1. Values represent the probability that the corresponding value in observed will be equal to the highest available factor level.
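
The role of the reference level can be made explicit with a short base R sketch on made-up data. It shows which factor level the probabilities in predicted refer to. It does not perform any of the package's checks.

```r
observed <- factor(c("no", "yes", "yes", "no"), levels = c("no", "yes"))
predicted <- c(0.1, 0.8, 0.6, 0.3)

# The highest (here: second) factor level is the reference level
reference_level <- levels(observed)[nlevels(observed)]
reference_level

# Indicator that the reference level was observed; predicted is the
# forecast probability that this indicator equals 1
outcome <- as.numeric(observed == reference_level)
data.frame(observed, outcome, predicted)
```

Reordering the factor levels therefore changes the meaning of predicted, so the level order should be set deliberately when the factor is created.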

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.


Assert that inputs are correct for interval-based forecast

Description

Function assesses whether the inputs correspond to the requirements for scoring interval-based forecasts.

Usage

assert_input_interval(observed, lower, upper, interval_range)

Arguments

observed

Input to be checked. Should be a numeric vector with the observed values of size n.

lower

Input to be checked. Should be a numeric vector of size n that holds the predicted value for the lower bounds of the prediction intervals.

upper

Input to be checked. Should be a numeric vector of size n that holds the predicted value for the upper bounds of the prediction intervals.

interval_range

Input to be checked. Should be a vector of size n that denotes the interval range in percent. E.g. a value of 50 denotes a (25%, 75%) prediction interval.
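
The relation between an interval range and the quantile levels of its bounds follows from simple arithmetic for central prediction intervals. The sketch below spells it out in base R. The helper interval_to_quantile_levels() is hypothetical and assumes central intervals, consistent with the 50 percent example above.

```r
# Hypothetical helper: quantile levels of the lower and upper bound of a
# central prediction interval with the given range in percent
interval_to_quantile_levels <- function(interval_range) {
  # Probability mass outside the interval, split equally between both tails
  alpha <- 1 - interval_range / 100
  data.frame(
    interval_range = interval_range,
    lower_level = alpha / 2,
    upper_level = 1 - alpha / 2
  )
}

levels_table <- interval_to_quantile_levels(c(0, 50, 90))
levels_table
```

An interval range of 0 collapses to the median, since both bounds then sit at the 0.5 quantile level.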

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.


Assert that inputs are correct for nominal forecasts

Description

Function assesses whether the inputs correspond to the requirements for scoring nominal forecasts.

Usage

assert_input_nominal(observed, predicted, predicted_label)

Arguments

observed

Input to be checked. Should be a factor of length n with N levels holding the observed values. n is the number of observations and N is the number of possible outcomes the observed values can assume.

predicted

Input to be checked. predicted should be a vector of length n, holding probabilities. Alternatively, predicted can be a matrix of size n x 1. Values represent the probability that the corresponding value in observed will be equal to the highest available factor level.

predicted_label

Factor of length N with N levels, where N is the number of possible outcomes the observed values can assume.

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.


Assert that inputs are correct for point forecast

Description

Function assesses whether the inputs correspond to the requirements for scoring point forecasts.

Usage

assert_input_point(observed, predicted)

Arguments

observed

Input to be checked. Should be a numeric vector with the observed values of size n.

predicted

Input to be checked. Should be a numeric vector with the predicted values of size n.

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.


Assert that inputs are correct for quantile-based forecast

Description

Function assesses whether the inputs correspond to the requirements for scoring quantile-based forecasts.

Usage

assert_input_quantile(
  observed,
  predicted,
  quantile_level,
  unique_quantile_levels = TRUE
)

Arguments

observed

Input to be checked. Should be a numeric vector with the observed values of size n.

predicted

Input to be checked. Should be an nxN matrix of predictive quantiles, n (number of rows) being the number of data points and N (number of columns) the number of quantiles per forecast. If observed is just a single number, then predicted can just be a vector of size N.

quantile_level

Input to be checked. Should be a vector of size N that denotes the quantile levels corresponding to the columns of the prediction matrix.

unique_quantile_levels

Whether the quantile levels are required to be unique (TRUE, the default) or not (FALSE).

Value

Returns NULL invisibly if the assertion was successful and throws an error otherwise.
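
Examples

The requirements on the three inputs can be summarised in a base R sketch that collects problems instead of throwing an error. The function check_quantile_input_sketch() is hypothetical and only mirrors the argument descriptions above. The actual assertion may perform further checks.

```r
# Hypothetical helper (not the package's implementation): returns a character
# vector of problems, which is empty if the inputs look consistent
check_quantile_input_sketch <- function(observed, predicted, quantile_level,
                                        unique_quantile_levels = TRUE) {
  # A single forecast may be passed as a plain vector of N quantiles
  if (!is.matrix(predicted)) {
    predicted <- matrix(predicted, nrow = 1)
  }
  problems <- character(0)
  if (nrow(predicted) != length(observed)) {
    problems <- c(problems, "predicted needs one row per observed value")
  }
  if (ncol(predicted) != length(quantile_level)) {
    problems <- c(problems, "predicted needs one column per quantile level")
  }
  if (any(quantile_level < 0 | quantile_level > 1)) {
    problems <- c(problems, "quantile levels must lie between 0 and 1")
  }
  if (unique_quantile_levels && anyDuplicated(quantile_level) > 0) {
    problems <- c(problems, "quantile levels must be unique")
  }
  problems
}

predicted <- matrix(c(8, 18, 10, 21, 13, 26), nrow = 2)
ok <- check_quantile_input_sketch(c(11, 20), predicted, c(0.25, 0.5, 0.75))
duplicated_levels <- check_quantile_input_sketch(
  c(11, 20), predicted, c(0.25, 0.5, 0.5)
)
duplicates_allowed <- check_quantile_input_sketch(
  c(11, 20), predicted, c(0.25, 0.5, 0.5),
  unique_quantile_levels = FALSE
)
list(ok, duplicated_levels, duplicates_allowed)
```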