Modeling Formula 1 Fantasy GP points with PyMC

python
pymc
formula1
Author

Kiril Zvezdarov

Published

August 31, 2023

Code
%matplotlib inline

import json
import os
import warnings
from enum import Enum
from pathlib import Path

warnings.filterwarnings("ignore", category=UserWarning)
warnings.filterwarnings("ignore", category=FutureWarning)

import arviz as az
import numpy as np
import pandas as pd
import preliz as pz
import pymc as pm
import requests
import seaborn as sns
import tqdm
from matplotlib import pyplot as plt
from scipy import stats

sns.set_theme()
sns.set_style(style="darkgrid", rc={"axes.facecolor": ".9", "grid.color": ".8"})
sns.set_palette(palette="deep")
sns_c = sns.color_palette(palette="deep")

plt.rcParams["figure.figsize"] = [10, 7]

az.style.use("arviz-docgrid")
WARNING (pytensor.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

Like many others, I got into Formula 1 after binging through Drive to Survive. After enduring a nail-biting (and ultimately bitterly disappointing) 2021 season, and attending the 2022 Canadian Grand Prix, I organized a small fantasy league for 2023, hosted on Fantasy GP.

Fantasy GP’s rules give each player a set budget to build a team consisting of 3 drivers and 3 constructors. During the season, drivers and constructors are awarded points for their finishing position in each race (matching the actual world championship points awarded), as well as for bonuses such as beating their teammate and gaining positions during the race. This effectively brings lower midfield and backmarker drivers into the mix, since they can generate points from performance relative to their rivals.

This notebook will attempt to model individual driver and constructor performance in terms of Fantasy GP points earned, in order to provide a slightly more quantifiable basis for team construction. We focus on the most fundamental question: how many points per race is a given driver/constructor likely to earn? To answer this, we will create and evaluate a simple Bayesian linear model using the probabilistic programming framework PyMC.

Disclaimer

I have fairly limited knowledge of Bayesian statistics and modeling; this notebook shouldn’t be considered authoritative or prescriptive in any way. It is mostly a collection of notes reflecting ongoing learning.

1 Data acquisition

We start by acquiring data for the 2023 season. The Ergast API hosts statistics for Formula 1 races from the 1950s to the present day and exposes convenient endpoints for querying both qualifying and grand prix results.

First, we define some convenience functions to query the raw json data:

API = "http://ergast.com/api/f1"
Code
DATASET_PATH = {"gp": "results", "quali": "qualifying"}


def get_dataset(dataset: str, year: int, limit: int = 10000) -> dict:
    path = DATASET_PATH[dataset]
    url = f"{API}/{year}/{path}.json?limit={limit}"

    response = requests.get(url)
    response.raise_for_status()

    return response.json()


def load_dataset(dataset: str, year: int) -> dict:
    results_path = Path(f".data/{year}_{dataset}_results.json")

    if results_path.exists():
        with results_path.open() as f:
            results = json.loads(f.read())
    else:
        results = get_dataset(dataset, year)

        if not os.path.isdir(".data"):
            os.makedirs(".data")

        with results_path.open("w") as f:
            f.write(json.dumps(results))

    return results
gp_data = load_dataset("gp", 2023)
quali_data = load_dataset("quali", 2023)

Next, we unnest it into a row-wise format and load it as a Pandas dataframe for ease of use:

Code
def gp_results_to_dataframe(json_data):
    data = {}
    for gp in json_data["MRData"]["RaceTable"]["Races"]:
        for result in gp["Results"]:
            data.setdefault("circuit", []).append(gp["Circuit"]["circuitId"])
            data.setdefault("driver", []).append(result["Driver"]["code"])
            data.setdefault("round", []).append(gp["round"])
            data.setdefault("constructor", []).append(
                result["Constructor"]["constructorId"]
            )

            data.setdefault("grid", []).append(result["grid"])
            data.setdefault("position", []).append(result["position"])
            data.setdefault("points", []).append(result["points"])
            data.setdefault("fastest_lap", []).append(
                result.get("FastestLap", {}).get("rank")
            )
            data.setdefault("status", []).append(result["status"])

    return (
        pd.DataFrame(data)
        .astype(
            {
                "circuit": "category",
                "driver": "category",
                "round": "int",
                "constructor": "category",
                "status": "category",
                "grid": "Int64",
                "position": "Int64",
                "points": "float32",
                "fastest_lap": "Int64",
            }
        )
        .set_index(["circuit", "driver"])
    )


def quali_results_to_dataframe(json_data):
    data = {}
    for quali in json_data["MRData"]["RaceTable"]["Races"]:
        for result in quali["QualifyingResults"]:
            data.setdefault("circuit", []).append(quali["Circuit"]["circuitId"])
            data.setdefault("driver", []).append(result["Driver"]["code"])
            data.setdefault("round", []).append(quali["round"])
            data.setdefault("constructor", []).append(
                result["Constructor"]["constructorId"]
            )

            data.setdefault("qualifying_position", []).append(result["position"])

    return (
        pd.DataFrame(data)
        .astype(
            {
                "circuit": "category",
                "driver": "category",
                "round": "int",
                "constructor": "category",
                "qualifying_position": "Int64",
            }
        )
        .set_index(["circuit", "driver"])
    )
gp_df = gp_results_to_dataframe(gp_data)
gp_df
round constructor grid position points fastest_lap status
circuit driver
bahrain VER 1 red_bull 1 1 25.0 6 Finished
PER 1 red_bull 2 2 18.0 7 Finished
ALO 1 aston_martin 5 3 15.0 5 Finished
SAI 1 ferrari 4 4 12.0 14 Finished
HAM 1 mercedes 7 5 10.0 10 Finished
... ... ... ... ... ... ... ... ...
monza STR 14 aston_martin 20 16 0.0 17 Finished
HUL 14 haas 13 17 0.0 10 +1 Lap
MAG 14 haas 19 18 0.0 15 +1 Lap
OCO 14 alpine 18 19 0.0 19 Steering
TSU 14 alphatauri 11 20 0.0 <NA> Engine

280 rows × 7 columns

quali_df = quali_results_to_dataframe(quali_data)
quali_df
round constructor qualifying_position
circuit driver
bahrain VER 1 red_bull 1
PER 1 red_bull 2
LEC 1 ferrari 3
SAI 1 ferrari 4
ALO 1 aston_martin 5
... ... ... ... ...
monza ZHO 14 alfa 16
GAS 14 alpine 17
OCO 14 alpine 18
MAG 14 haas 19
STR 14 aston_martin 20

280 rows × 3 columns

Finally, we join the grand prix and qualifying dataframes together, forming the base dataset for our analysis; we also rename all entries credited to Liam Lawson (LAW) to Daniel Ricciardo (RIC), as Fantasy GP credits points from the temporary reserve driver to the driver they replace:

data_df = gp_df.join(quali_df[["qualifying_position"]]).rename(
    level="driver", index={"LAW": "RIC"}
)
data_df
round constructor grid position points fastest_lap status qualifying_position
circuit driver
bahrain VER 1 red_bull 1 1 25.0 6 Finished 1
PER 1 red_bull 2 2 18.0 7 Finished 2
ALO 1 aston_martin 5 3 15.0 5 Finished 5
SAI 1 ferrari 4 4 12.0 14 Finished 4
HAM 1 mercedes 7 5 10.0 10 Finished 7
... ... ... ... ... ... ... ... ... ...
monza STR 14 aston_martin 20 16 0.0 17 Finished 20
HUL 14 haas 13 17 0.0 10 +1 Lap 13
MAG 14 haas 19 18 0.0 15 +1 Lap 19
OCO 14 alpine 18 19 0.0 19 Steering 18
TSU 14 alphatauri 11 20 0.0 <NA> Engine 11

280 rows × 8 columns

2 Fantasy points

Fantasy GP points per team/driver are calculated in the following way:

  • drivers earn the WDC points scored during the grand prix, 10 points for taking pole position, 2 points per position gained on Sunday, and 5 points each for beating their teammate in qualifying and in the race itself.
  • constructors earn the WDC points scored by each of their drivers during the grand prix, 1 point per position gained per driver, and 5 points if both cars finish (2 if only one does).

To simplify the problem, we completely ignore sprint races (as should everyone).

The race finish status classification has several detailed categories:

data_df["status"].dtype
CategoricalDtype(categories=['+1 Lap', '+2 Laps', 'Accident', 'Brakes', 'Collision',
                  'Collision damage', 'Electrical', 'Engine', 'Finished',
                  'Mechanical', 'Oil leak', 'Overheating', 'Power loss',
                  'Retired', 'Steering', 'Undertray'],
, ordered=False, categories_dtype=object)

Fantasy GP scoring only cares whether a car finished or not, so we define the following set of categories to indicate a finish:

STATUS_FINISHED = ["Finished", "+1 Lap", "+2 Laps"]

While our dataset has the WDC points scored per driver/constructor already, positions gained and performance vs. teammate need to be added in:

def add_scoring_cols(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(
        # Grid value of 0 indicates a pit lane start; here we set that to 20
        # (last place) to simplify the check for who won out in qualifying.
        grid=lambda x: x["grid"].where(x["grid"] != 0, 20)
    ).assign(
        # Positions gained compared to the starting grid position; scoring doesn't
        # care about positions lost, so we set anything below 0 to 0.
        positions_gained=lambda x: np.maximum(x["grid"] - x["position"], 0),
        # Whether the driver won pole position
        has_pole=lambda x: x["qualifying_position"] == 1,
        # Whether the driver beat their teammate in qualifying
        beat_teammate_quali=lambda x: x.groupby(
            # Each group is per race, per constructor, so only 2 rows - one for each driver.
            ["circuit", "constructor"],
            group_keys=False,
        ).apply(
            # Smaller grid pos. = better; the grid position is compared
            # to the reversed grid array in the group (essentially
            # we create a cartesian product of the grid pos.)
            lambda g: g["grid"]
            < g["grid"].iloc[::-1].values
        ),
        # Same as the previous column, but for finishing position in the race.
        beat_teammate_race=lambda x: x.groupby(
            ["circuit", "constructor"], group_keys=False
        ).apply(
            lambda g: (g["position"] < g["position"].iloc[::-1].values)
            & g["status"].isin(STATUS_FINISHED)
        ),
        has_fastest_lap=lambda x: x["fastest_lap"] == 1,
    )


data_df = add_scoring_cols(data_df)

This leaves us with the following dataframe:

Code
data_df[
    [
        "grid",
        "positions_gained",
        "beat_teammate_quali",
        "beat_teammate_race",
        "has_pole",
        "has_fastest_lap",
    ]
]
grid positions_gained beat_teammate_quali beat_teammate_race has_pole has_fastest_lap
circuit driver
bahrain VER 1 0 True True True False
PER 2 0 False False False False
ALO 5 2 True True False False
SAI 4 0 False True False False
HAM 7 2 False True False False
... ... ... ... ... ... ... ...
monza STR 20 4 False False False False
HUL 13 0 True True False False
MAG 19 1 False False False False
OCO 18 0 False False False False
TSU 11 0 True False False <NA>

280 rows × 6 columns

Finally, we can score the Fantasy GP points; we define the scoring function for the driver to be the sum of:

  • WDC points
  • 10 for gaining pole position
  • 5 for outqualifying their teammate
  • 5 for outracing their teammate
  • 2 per position gained at the end of the race

Note that a driver who did not finish the race on Sunday can still earn points from the qualifying session.

def score_driver(x):
    return pd.Series(
        x["has_pole"] * 10 + x["beat_teammate_quali"] * 5, dtype=float
    ) + pd.Series(
        x["points"] + x["positions_gained"] * 2 + x["beat_teammate_race"] * 5,
        dtype=float,
    ).where(
        x["status"].isin(STATUS_FINISHED), 0
    )
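
As a quick sanity check, take Alonso at Bahrain: he qualified P5 and out-qualified his teammate (5 points), then finished P3 with 15 WDC points, gained 2 positions (2 × 2 = 4 points), and out-raced his teammate (another 5 points) - 29 points in total, matching the driver_points value in the scored dataframe below.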

Scoring for the constructors is defined as:

  • WDC points
  • 5 points if both cars finished, 2 if only one did
  • 1 point per position gained at the end of the race per car
def score_constructor(x):
    finished = x["status"].isin(STATUS_FINISHED)
    match finished.sum():
        case 2:
            finish_bonus = 5
        case 1:
            finish_bonus = 2
        case _:
            finish_bonus = 0

    return x["points"].sum() + x[finished]["positions_gained"].sum() + finish_bonus

Just from the rules definition it is clear that drivers have a higher points-earning ceiling but also much higher variance, whereas constructors are more consistent but earn fewer points.

Code
def score_fantasy_points(df: pd.DataFrame) -> pd.DataFrame:
    return df.assign(
        driver_points=score_driver,
        # Constructor points need to be joined back in on the grouping columns, in order to
        # fill in the missing spots with duplicate values - since we have 20 drivers, but 10 constructors,
        # the group has fewer rows and needs to be broadcast per group on the index.
        constructor_points=lambda x: x.join(
            x.groupby(["circuit", "constructor"])
            .apply(score_constructor)
            .rename("constructor_points"),
            on=["circuit", "constructor"],
        )["constructor_points"],
    )

The scored dataframe looks like this:

Code
data_df = score_fantasy_points(data_df)
data_df[["driver_points", "constructor_points"]]
driver_points constructor_points
circuit driver
bahrain VER 45.0 48.0
PER 18.0 48.0
ALO 29.0 32.0
SAI 17.0 14.0
HAM 19.0 23.0
... ... ... ...
monza STR 8.0 12.0
HUL 10.0 6.0
MAG 2.0 6.0
OCO 0.0 4.0
TSU 5.0 3.0

280 rows × 2 columns

Note

The constructor points are duplicated per race, since we broadcast those across both drivers.

3 Data exploration

Now we visualize the data to get a better idea of its properties. First, some basic box plots for each driver/constructor:

Code
sns.set_theme(style="ticks")
f, ax = plt.subplots(figsize=(12, 10))
sns.boxplot(
    data=data_df.reset_index(),
    x="driver_points",
    y="driver",
    palette="pastel",
    whis=[0, 100],
)

sns.stripplot(
    x="driver_points",
    y="driver",
    data=data_df.reset_index(),
    hue="circuit",
    linewidth=0,
    size=5,
    dodge=True,
    palette="muted",
)

ax.xaxis.grid(True)
ax.set(xlabel="points")
ax.set_title("Driver Fantasy GP points 2023");
Figure 1: Fantasy points earned by each driver in 2023

For convenience, we create a constructor-specific dataframe by taking the first row for each constructor for each race. This gets rid of the duplicate data (as constructor points were duplicated across both drivers):

const_data_df = data_df.groupby(["circuit", "constructor"]).first().reset_index()
Code
sns.set_theme(style="ticks")
f, ax = plt.subplots(figsize=(12, 6))
sns.boxplot(
    data=const_data_df.reset_index(),
    x="constructor_points",
    y="constructor",
    palette="pastel",
    whis=[0, 100],
)

sns.stripplot(
    x="constructor_points",
    y="constructor",
    data=const_data_df.reset_index(),
    hue="circuit",
    palette="muted",
    linewidth=0,
    size=4,
    dodge=True,
)

ax.xaxis.grid(True)
ax.set(xlabel="points")
ax.set_title("Constructor Fantasy GP points 2023");
Figure 2: Fantasy points earned by each constructor in 2023

These look quite similar and expose the brutal reality of Red Bull’s total dominance over the field. Matching our intuition from the rules definition, drivers appear to earn more and have wider variance in points scored compared to teams, with the exception of Max Verstappen and his absurd consistency.

Next, we plot the kernel density estimate of the driver points to gain a better idea of the overall distribution:

Code
g = sns.displot(
    data_df.reset_index(),
    x="driver_points",
    col="driver",
    kind="kde",
    rug=True,
    col_wrap=3,
    fill=True,
)
g.set_axis_labels("points", "density");
Figure 3: Kernel density estimates for fantasy points earned by each driver in 2023
Code
g = sns.displot(
    const_data_df.reset_index(),
    x="constructor_points",
    col="constructor",
    col_wrap=5,
    kind="kde",
    height=4,
    rug=True,
    fill=True,
)


g.set_axis_labels("points", "density");
Figure 4: Kernel density estimates for fantasy points earned by each constructor in 2023

The distribution of points is spread out and somewhat multi-peaked in both cases; again, constructors appear to score in a much more consistent range compared to drivers. The pooled observed points distributions are:

Code
f, axs = plt.subplots(1, 2, figsize=(12, 5), sharey=True)

sns.kdeplot(data_df.reset_index(), x="driver_points", fill=True, ax=axs[0])
sns.kdeplot(const_data_df.reset_index(), x="constructor_points", fill=True, ax=axs[1])
axs[0].set_title("Overall driver points density 2023")
axs[0].set(xlabel="driver points")

axs[1].set_title("Overall constructor points density 2023")
axs[1].set(xlabel="constructor points")

f.tight_layout();
Figure 5: Grouped kernel density estimates for fantasy points by drivers/constructors in 2023

Given that the point distributions between drivers and constructors appear very similar, we’ll focus on constructing models of the driver points, which we can later fit to the constructor data with minor adjustments.

4 Unpooled model

Our first model will be the simplest possible - we just directly estimate the mean points for each driver individually (an intercept-only, unpooled linear model).

4.1 Picking the likelihood

The points data has several properties that stem from the Fantasy GP ruleset:

  • points are discrete
  • points are bounded between 0 and 84 (25 for P1 + 1 for the fastest lap + 10 for pole position + 5 + 5 for beating the teammate in qualifying and the race + 19 * 2 for the maximum possible positions gained)
  • most drivers have a few 0 point finishes, but overall it’s very unlikely for a driver to score 0 points consistently.

Because we are modeling discrete, non-negative points, one of the count distributions is likely an appropriate choice for the likelihood. The Poisson distribution could be a good choice, but the observations appear overdispersed, as measured by the coefficient of dispersion (the variance divided by the mean):

Code
data_df = data_df.reset_index().astype(
    {
        "circuit": "category",
        "driver": "category",
    }
)

dispersion = (
    data_df.groupby("driver")["driver_points"].var()
    / data_df.groupby("driver")["driver_points"].mean()
)
dispersion
driver
ALB     2.016575
ALO     2.458689
BOT     4.663685
DEV     2.808279
GAS    12.389453
HAM     2.238189
HUL     3.298566
LEC     8.306281
MAG     4.909699
NOR     5.876923
OCO     7.848030
PER     3.748745
PIA     6.062937
RIC     2.466667
RUS     5.968760
SAI     6.063976
SAR     6.434705
STR     5.173077
TSU     2.345168
VER     1.030361
ZHO     7.205955
Name: driver_points, dtype: float64

Given that, we will pick the Negative Binomial as our likelihood, since it handles overdispersion better1.
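
To see why this matters, here is a small numerical sketch of the Negative Binomial’s mean-variance relationship; it assumes that PyMC’s NegativeBinomial(mu, alpha) corresponds to scipy’s nbinom with n=alpha and p=alpha/(alpha+mu), so that the variance works out to μ + μ²/α and can exceed the mean (which a Poisson cannot do):

from scipy import stats

mu, alpha = 15, 5
nb = stats.nbinom(n=alpha, p=alpha / (alpha + mu))
print(nb.mean(), nb.var())  # 15.0 60.0 - the variance is well above the mean
print(mu + mu**2 / alpha)   # 60.0, matching the closed form mu + mu^2 / alpha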

4.2 Choosing the priors

A Negative Binomial likelihood is defined by two parameters - its mean μ, and a shape parameter α. We’ll need to set priors on both.

The mean is restricted to be positive. The standard way of defining a negative binomial linear model is to place a Normal prior on the log of the mean and use exponentiation as the inverse link function, which makes the mean positive and usable for parameterizing the likelihood. For the actual prior parameters it would be reasonable to go with a fairly uninformative prior, such as Normal(mu=0, sigma=1); in this case, however, prior knowledge of the ruleset gives us a hunch that most drivers’ mean points per race should be somewhere between 5 (backmarker, occasionally beats their teammate, maybe gains a place) and 30 (top driver & team, consistently getting and winning from pole position).

To come up with parameter values for this, we use the PreliZ library to find a maximum entropy Normal distribution with 90% of its mass between 5 and 30 (we take the natural log of those, given the exponential inverse link):

mu_dist = pz.Normal()
pz.maxent(mu_dist, np.log(5), np.log(30), 0.90);
Figure 6: PDF of the prior chosen for μ
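
To double-check what this prior implies back on the points scale, we can exponentiate draws from the fitted distribution - a small sketch with plain numpy, not part of the original workflow; roughly 90% of the draws should land between 5 and 30 points:

rng = np.random.default_rng(42)
points_scale = np.exp(rng.normal(mu_dist.mu, mu_dist.sigma, size=10_000))
print(np.percentile(points_scale, [5, 50, 95]))  # roughly [5, ~12, 30]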

The shape parameter α is also strictly positive. Its effect on the likelihood can be best understood by plotting a Negative Binomial for a few different values, while keeping the mean fixed:

Code
f, axs = plt.subplots(2, 2, figsize=(12, 6), sharey=True)

pz.NegativeBinomial(alpha=0.5, mu=15).plot_pdf(ax=axs[0, 0], legend=False)
axs[0, 0].set_title(f"Negative Binomial with μ: 15, α: 0.5")

pz.NegativeBinomial(alpha=5, mu=15).plot_pdf(ax=axs[0, 1], legend=False)
axs[0, 1].set_title(f"Negative Binomial with μ: 15, α: 5")

pz.NegativeBinomial(alpha=50, mu=15).plot_pdf(ax=axs[1, 0], legend=False)
axs[1, 0].set_title(f"Negative Binomial with μ: 15, α: 50")

pz.NegativeBinomial(alpha=500, mu=15).plot_pdf(ax=axs[1, 1], legend=False)
axs[1, 1].set_title(f"Negative Binomial with μ: 15, α: 500");
Figure 7: Negative binomial distributions with a fixed mean at different shape parameters

As α increases, the tails shrink. We don’t want very high values, as that will restrict the tails too much, but very small values result in absurdly long tails. To handle that, we use a Gamma distribution, with mass strongly concentrated between 1 and 20:

alpha_dist = pz.Gamma()
pz.maxent(alpha_dist, 1, 20, .9);
Figure 8: PDF of the prior chosen for α

Finally, we prepare the data for use within the model. We factorize the driver column into labels and an index:

idx, labels = pd.factorize(data_df["driver"], sort=True)

labels is a categorical variable over the driver names:

CategoricalIndex(['ALB', 'ALO', 'BOT', 'DEV', 'GAS', 'HAM', 'HUL', 'LEC',
                  'MAG', 'NOR', 'OCO', 'PER', 'PIA', 'RIC', 'RUS', 'SAI',
                  'SAR', 'STR', 'TSU', 'VER', 'ZHO'],
                 categories=['ALB', 'ALO', 'BOT', 'DEV', ..., 'STR', 'TSU', 'VER', 'ZHO'], ordered=False, dtype='category')

idx maps each observed race result row to the integer code of its driver:

array([19, 11,  1, 15,  5, 17, 14,  2,  4,  0, 18, 16,  8,  3,  6, 20,  9,
       10,  7, 12, 11, 19,  1, 14,  5, 15,  7, 10,  4,  8, 18,  6, 20,  3,
       12, 16,  9,  2,  0, 17, 19,  5,  1, 17, 11,  9,  6, 12, 20, 18,  2,
       15,  4, 10,  3, 16,  8, 14,  0,  7, 11, 19,  7,  1, 15,  5, 17, 14,
        9, 18, 12,  0,  8,  4, 10, 16,  6,  2, 20,  3, 19, 11,  1, 14, 15,
        5,  7,  4, 10,  8, 18, 17,  2,  0,  6, 20,  9,  3, 12, 16, 19,  1,
       10,  5, 14,  7,  4, 15,  9, 12,  2,  3, 20,  0, 18, 11,  6, 16,  8,
       17, 19,  5, 14, 11, 15, 17,  1, 10, 20,  4,  7, 18, 12,  3,  6,  0,
        9,  8,  2, 16, 19,  1,  5,  7, 15, 11,  0, 10, 17,  2, 12,  4,  9,
       18,  6, 20,  8,  3, 14, 16, 19,  7, 11,  9,  1, 15, 14,  5, 17,  4,
        0, 20, 16, 10,  2, 12,  3,  8, 18,  6, 19,  9,  5, 12, 14, 11,  1,
        0,  7, 15, 16,  2,  6, 17, 20, 18,  3,  4,  8, 10, 19,  9, 11,  5,
       12, 14,  7, 15,  1, 17,  0,  2, 13,  6, 18, 20,  8, 16, 10,  4, 19,
       11,  7,  5,  1, 14,  9, 10, 17, 18,  4,  2, 20,  0,  8, 13, 16,  6,
       15, 12, 19,  1,  4, 11, 15,  5,  9,  0, 12, 10, 17,  6, 13,  2, 18,
        8, 14, 20,  7, 16, 19, 11, 15,  7, 14,  5,  0,  9,  1,  2, 13, 12,
       16, 20,  4, 17,  6,  8, 10, 18])
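
As a quick sanity check (a hypothetical one-liner, not part of the original pipeline), mapping the first few codes back through labels reproduces the Bahrain finishing order from the dataframe above:

# The integer codes index into labels; the first five rows are the Bahrain
# results, so this should print the top five finishers there.
print(labels[idx[:5]].tolist())  # ['VER', 'PER', 'ALO', 'SAI', 'HAM']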

We also extract the observed points:

driver_points = data_df["driver_points"]
driver_points
0      45.0
1      18.0
2      29.0
3      17.0
4      19.0
       ... 
275     8.0
276    10.0
277     2.0
278     0.0
279     5.0
Name: driver_points, Length: 280, dtype: float64

4.3 Model definition

Next, we define the model with PyMC. As discussed previously, we think that the process that generates points can be seen as a Negative Binomial distribution, so that is set as the likelihood. The mean (μ) and shape (α) are the parameters that we want to infer from the data. Finally, the prior guesses for these parameters are set from the mu_dist and alpha_dist objects we computed previously.

This translates to:

with pm.Model(coords={"driver": labels, "idx": idx}) as unpooled_negb:
    μ = pm.Normal("μ", mu_dist.mu, mu_dist.sigma, dims="driver")
    μ_ = pm.Deterministic("μ_", pm.math.exp(μ), dims="driver")

    α = pm.Gamma("α", alpha_dist.alpha, alpha_dist.beta, dims="driver")

    y = pm.NegativeBinomial(
        "y", alpha=α[idx], mu=μ_[idx], observed=driver_points, dims="idx"
    )

4.4 Prior predictive checks

To see if the choice for priors is acceptable, we sample from the prior predictive and plot the results:

with unpooled_negb:
    unpooled_prior_samples = pm.sample_prior_predictive(samples=1000)
Sampling: [y, α, μ]
az.plot_ppc(unpooled_prior_samples, group="prior", figsize=(12, 6));
Figure 9: Prior predictive distribution of the unpooled model

The prior predictive is quite wide, with a very long and somewhat fat tail, but overall the data could be plausibly seen as coming from it.

4.5 Inference and diagnostics

Next, we perform the actual inference step - we sample from the posterior and the posterior predictive distribution:

with unpooled_negb:
    unpooled_idata = pm.sample(target_accept=0.9, idata_kwargs={"log_likelihood": True})
    pm.sample_posterior_predictive(unpooled_idata, extend_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [μ, α]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 5 seconds.
Sampling: [y]
100.00% [8000/8000 00:04<00:00 Sampling 4 chains, 0 divergences]
100.00% [4000/4000 00:00<00:00]

There are no divergences detected, which is a good first sign that the model has converged.

To check the sampling process we plot the traces for the μ parameter. What we are looking for is similar densities on the left (indicating that the chains have converged to the same posterior distribution), and noisy/patternless trace plots on the right (which would indicate that the sampler was able to efficiently run through the sample space)2:

az.plot_trace(unpooled_idata, var_names=["μ"], compact=False);
Figure 10: Traceplots of the inferred parameter μ for the unpooled model

There are a few kinks here and there (see RIC), but overall these look good.

Next we plot the energy plot and BFMI. Ideally the marginal energy and energy transition distributions should overlap as much as possible, and the BFMI should be above 0.3:

Code
ax = az.plot_energy(unpooled_idata)
ax.set_title("Energy Plot");
Figure 11: Energy plot for the unpooled model
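
The BFMI values can also be read off numerically with az.bfmi, which returns one value per chain (a quick check, not shown in the original output):

# Per-chain BFMI; values comfortably above 0.3 indicate the sampler explored
# the energy distribution well.
print(az.bfmi(unpooled_idata))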

Finally, we check the summary statistics for the μ parameter:

  • bulk and tail effective sample size (ess_bulk and ess_tail respectively) should be > 400, otherwise the estimation of the rest of the summary statistics is considered unreliable 3
  • r_hat should ideally equal 1, with values over 1.01 indicating convergence issues 4
  • mcse_{mean, sd} is the Monte Carlo Standard Error, or the error introduced by approximating the posterior with a finite number of samples; lower = better 5
az.summary(unpooled_idata, var_names=["μ"])
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
μ[ALB] 2.559 0.113 2.356 2.786 0.002 0.001 5027.0 2717.0 1.0
μ[ALO] 3.124 0.101 2.925 3.308 0.002 0.001 4219.0 2969.0 1.0
μ[BOT] 2.523 0.190 2.158 2.878 0.003 0.002 4131.0 2642.0 1.0
μ[DEV] 1.766 0.220 1.350 2.179 0.004 0.003 3568.0 2298.0 1.0
μ[GAS] 2.459 0.221 2.039 2.874 0.003 0.002 4298.0 2631.0 1.0
μ[HAM] 3.085 0.099 2.907 3.279 0.001 0.001 4774.0 2917.0 1.0
μ[HUL] 2.159 0.151 1.888 2.452 0.002 0.002 4008.0 2570.0 1.0
μ[LEC] 2.723 0.209 2.311 3.114 0.003 0.002 4991.0 2884.0 1.0
μ[MAG] 2.030 0.258 1.552 2.523 0.004 0.003 4224.0 2823.0 1.0
μ[NOR] 2.695 0.146 2.416 2.960 0.002 0.002 4055.0 2487.0 1.0
μ[OCO] 2.275 0.280 1.764 2.818 0.004 0.003 4511.0 2902.0 1.0
μ[PER] 3.283 0.101 3.086 3.472 0.001 0.001 4856.0 3012.0 1.0
μ[PIA] 1.901 0.276 1.373 2.416 0.005 0.003 3901.0 2514.0 1.0
μ[RIC] 2.338 0.226 1.948 2.791 0.003 0.002 4746.0 2388.0 1.0
μ[RUS] 2.776 0.165 2.480 3.097 0.002 0.002 4741.0 2662.0 1.0
μ[SAI] 2.661 0.176 2.338 3.005 0.003 0.002 4280.0 2802.0 1.0
μ[SAR] 1.835 0.451 1.023 2.665 0.007 0.005 4104.0 2613.0 1.0
μ[STR] 2.182 0.260 1.709 2.681 0.004 0.003 3834.0 2291.0 1.0
μ[TSU] 2.420 0.122 2.196 2.660 0.002 0.001 4081.0 2713.0 1.0
μ[VER] 3.742 0.074 3.607 3.886 0.001 0.001 5036.0 2571.0 1.0
μ[ZHO] 2.255 0.237 1.798 2.699 0.003 0.003 4903.0 2624.0 1.0

All of the sampling diagnostics look good, which means the model has converged and we can move on to posterior predictive checks and model evaluation.

4.6 Posterior predictive checks

Next we examine the posterior predictive distribution, which will help validate that the predictions the model is making make sense given the data. Below we plot several posterior graphs (from top to bottom, left to right):

  • the (grouped) posterior predictive distribution should closely replicate the pattern seen in the observed data 6
  • the Bayesian p-value is the probability of the predicted data being equal to or less than the observed data. The ideal/expected outcome is represented by the dashed line (0.5), with the solid line being the KDE of the predicted data; the u-value represents the same idea as the p-value, but per observation - it should be roughly uniform 7
  • LOO-PIT is the probability integral transform checked with LOO-CV; it should be close to uniform. The ECDF variant plots the difference between the LOO-PIT empirical cumulative distribution function and the uniform CDF 8
Code
def plot_posterior_predictive_checks(idata):
    f = plt.figure(figsize=(12, 12))

    ax_ppc = f.add_subplot(3, 1, 1)
    az.plot_ppc(idata, ax=ax_ppc)
    ax_ppc.set_title("Posterior predictive checks")

    ax_bpv = f.add_subplot(3, 2, 3)
    az.plot_bpv(idata, kind="p_value", ax=ax_bpv)
    ax_bpv.set_title("Bayesian p_value")

    ax_uv = f.add_subplot(3, 2, 4)
    az.plot_bpv(idata, kind="u_value", ax=ax_uv)
    ax_uv.set_title("Marginal p_value")

    ax_loo_pit_ecdf = f.add_subplot(3, 2, 5)
    az.plot_loo_pit(idata, y="y", ecdf=True, ax=ax_loo_pit_ecdf)
    ax_loo_pit_ecdf.set_title("LOO-PIT (ecdf=True)")

    ax_loo_pit = f.add_subplot(3, 2, 6)
    az.plot_loo_pit(idata, y="y", ecdf=False, ax=ax_loo_pit)
    ax_loo_pit.set_title("LOO-PIT (ecdf=False)")

    f.tight_layout()

    return (ax_ppc, ax_bpv, ax_uv, ax_loo_pit_ecdf, ax_loo_pit)
plot_posterior_predictive_checks(unpooled_idata);
Figure 12: Posterior predictive checks, u/p-value, and LOO for the unpooled model

From the posterior predictive plots it seems that:

  • the posterior predictive samples roughly look like the observed data, but the fit around the right tail isn’t the best.
  • the p-value plot shows that the model’s predictions are somewhat shifted to the right and the u-value plot indicates that the model is missing observations in the left tail 9
  • LOO-PIT shows the model is likely biased, which agrees with the p/u-value plots.

Lastly, let’s plot the posterior predictive distributions for a few individual drivers:

Code
fig, axs = plt.subplots(2, 2, figsize=(12, 6))

DriverIdx = Enum("Driver", zip(labels.categories, labels.codes))

az.plot_ppc(unpooled_idata, coords={"idx": [DriverIdx.VER.value]}, ax=axs[0, 0])
axs[0, 0].set_title("VER")

az.plot_ppc(unpooled_idata, coords={"idx": [DriverIdx.SAR.value]}, ax=axs[0, 1])
axs[0, 1].set_title("SAR")

az.plot_ppc(unpooled_idata, coords={"idx": [DriverIdx.LEC.value]}, ax=axs[1, 0])
axs[1, 0].set_title("LEC")

az.plot_ppc(unpooled_idata, coords={"idx": [DriverIdx.HAM.value]}, ax=axs[1, 1])
axs[1, 1].set_title("HAM");
Figure 13: Posterior predictive plots of individual drivers for the unpooled model

4.7 Posterior evaluation

We are reasonably confident that the model has converged and that there were no issues during sampling. Predictions seem to be somewhat biased, but still reasonable for our purposes. Now it’s time to check the posterior distributions of the estimated parameter μ:

az.plot_posterior(unpooled_idata, var_names=["μ"]);
Figure 14: Posterior distributions of the μ parameter for the unpooled model

The estimates look well formed; to make comparing the parameter value between drivers easier, we plot a forest of the 94% HDI 10:

az.plot_forest(unpooled_idata, var_names=["μ"], combined=True, figsize=(12, 6));
Figure 15: Forest of the posterior distributions of μ for the unpooled model (log scale)

The probability distributions for the means broadly agree with our informal observations throughout the season - VER scores high above the rest, in an extremely consistent range, with HAM, ALO, and PER close to each other for second place, a mixed, variable set of results for the midfield, and a few clear backmarker drivers.

To translate the estimated parameter values to points, we can plot the transformed μ_ value (remember that this is just the exponentiated μ):

az.plot_forest(unpooled_idata, var_names=["μ_"], combined=True, figsize=(12, 6));
Figure 16: Forest of the posterior distributions of the exponentiated μ parameter for the unpooled model

4.8 Model evaluation

Finally, we inspect PSIS-LOO-CV to get an idea of the performance of the model:

az.loo(unpooled_idata)
Computed from 4000 posterior samples and 280 observations log-likelihood matrix.

         Estimate       SE
elpd_loo  -954.78    14.99
p_loo       44.52        -

There has been a warning during the calculation. Please check the results.
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)      274   97.9%
 (0.5, 0.7]   (ok)          5    1.8%
   (0.7, 1]   (bad)         1    0.4%
   (1, Inf)   (very bad)    0    0.0%

elpd_loo 11 will be more useful when comparing this baseline model to other, more complicated ones, but the fact that nearly all Pareto k values are below 0.7 (only a single observation falls in the bad range) is a good sign that the model can generalize decently.

5 Zero-inflated unpooled model

One of the common issues with both Poisson and Negative Binomial models is that they do not generate zeroes often enough. This can be alleviated by using a mixture model that inflates the number of zeroes, which PyMC already has an implementation of - the Zero Inflated Negative Binomial likelihood.
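
To make the mixture concrete, here is a small sketch of the zero-inflated pmf using scipy (assuming, as before, that PyMC’s (μ, α) parameterization maps to scipy’s nbinom via n=α, p=α/(α+μ)): with probability 1 - ψ the outcome is a structural zero, otherwise it is drawn from the Negative Binomial.

import numpy as np
from scipy import stats


def zinb_pmf(k, psi, mu, alpha):
    # Negative Binomial component, weighted by the probability psi of a non-zero draw
    nb = stats.nbinom(n=alpha, p=alpha / (alpha + mu))
    pmf = psi * nb.pmf(k)
    # Extra point mass at zero from the (1 - psi) mixture component
    return np.where(k == 0, (1 - psi) + pmf, pmf)


print(zinb_pmf(np.arange(200), psi=0.9, mu=15, alpha=5).sum())  # ~1.0, as expected for a pmf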

To test whether extra zeroes result in a better fit, we define a new model with the same priors as before, but using the mixture likelihood.

5.1 Choosing priors

One additional parameter, ψ, is needed, so we have to set a prior over it. ψ is the probability of a non-zero value, between 0 and 1. We will use a Beta prior; given the ruleset and the reliability of modern F1 cars, our prior belief is that 0-point finishes aren’t all that likely. To express that, we take the maximum entropy Beta with 75% of its mass between 0.7 and 0.99:

psi_dist = pz.Beta()
pz.maxent(psi_dist, 0.7, 0.99, 0.75);
Figure 17: Prior distribution for the ψ parameter (probability of a non-zero value)

5.2 Model definition

The new model is trivial to define:

with pm.Model(coords={"driver": labels, "idx": idx}) as zi_unpooled_negb:
    μ = pm.Normal("μ", mu_dist.mu, mu_dist.sigma, dims="driver")
    μ_ = pm.Deterministic("μ_", pm.math.exp(μ), dims="driver")

    α = pm.Gamma("α", alpha_dist.alpha, alpha_dist.beta, dims="driver")

    ψ = pm.Beta("ψ", psi_dist.alpha, psi_dist.beta, dims="driver")

    y = pm.ZeroInflatedNegativeBinomial(
        "y",
        psi=ψ[idx],
        alpha=α[idx],
        mu=μ_[idx],
        observed=driver_points,
        dims="idx",
    )

5.3 Prior evaluation

We perform the same prior predictive check:

with zi_unpooled_negb:
    zi_unpooled_prior_samples = pm.sample_prior_predictive(samples=1000)
Sampling: [y, α, μ, ψ]
az.plot_ppc(zi_unpooled_prior_samples, group="prior");
Figure 18: Prior predictive distribution for the zero-inflated model

5.4 Inference and diagnostics

Sample:

with zi_unpooled_negb:
    zi_unpooled_idata = pm.sample(
        target_accept=0.9, idata_kwargs={"log_likelihood": True}
    )
    pm.sample_posterior_predictive(zi_unpooled_idata, extend_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [μ, α, ψ]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 10 seconds.
Sampling: [y]
100.00% [8000/8000 00:09<00:00 Sampling 4 chains, 0 divergences]
100.00% [4000/4000 00:05<00:00]

The traceplots look about as good as before:

az.plot_trace(zi_unpooled_idata, var_names=["μ"], compact=False);
Figure 19: Traceplots of the μ parameter for the zero-inflated model

BFMI and energy are reasonable:

Code
ax = az.plot_energy(zi_unpooled_idata)
ax.set_title("Energy Plot");
Figure 20: Energy plot for the zero-inflated model

r_hat and ESS look good as well:

az.summary(zi_unpooled_idata, var_names=["μ"])
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
μ[ALB] 2.558 0.111 2.343 2.762 0.001 0.001 6450.0 3026.0 1.0
μ[ALO] 3.123 0.100 2.942 3.322 0.001 0.001 6609.0 2903.0 1.0
μ[BOT] 2.655 0.128 2.417 2.900 0.002 0.001 6154.0 2859.0 1.0
μ[DEV] 1.907 0.183 1.581 2.273 0.002 0.001 7928.0 2908.0 1.0
μ[GAS] 2.499 0.208 2.115 2.907 0.003 0.002 6758.0 2526.0 1.0
μ[HAM] 3.082 0.097 2.898 3.264 0.001 0.001 7940.0 2852.0 1.0
μ[HUL] 2.222 0.139 1.965 2.484 0.002 0.001 6692.0 2804.0 1.0
μ[LEC] 2.876 0.143 2.590 3.128 0.002 0.001 6002.0 2722.0 1.0
μ[MAG] 2.237 0.149 1.947 2.507 0.002 0.001 6121.0 2629.0 1.0
μ[NOR] 2.700 0.144 2.436 2.974 0.002 0.001 7691.0 3154.0 1.0
μ[OCO] 2.511 0.166 2.197 2.832 0.002 0.002 6137.0 2785.0 1.0
μ[PER] 3.282 0.105 3.080 3.478 0.001 0.001 6747.0 2638.0 1.0
μ[PIA] 2.078 0.183 1.698 2.395 0.002 0.002 5952.0 2692.0 1.0
μ[RIC] 2.335 0.216 1.906 2.717 0.003 0.002 7119.0 2722.0 1.0
μ[RUS] 2.850 0.135 2.612 3.122 0.002 0.001 5916.0 2873.0 1.0
μ[SAI] 2.728 0.154 2.448 3.019 0.002 0.001 5981.0 2686.0 1.0
μ[SAR] 2.036 0.213 1.672 2.461 0.003 0.002 4655.0 2695.0 1.0
μ[STR] 2.420 0.135 2.167 2.681 0.002 0.001 6232.0 2915.0 1.0
μ[TSU] 2.417 0.123 2.177 2.641 0.001 0.001 7217.0 2763.0 1.0
μ[VER] 3.742 0.074 3.609 3.890 0.001 0.001 6489.0 2464.0 1.0
μ[ZHO] 2.424 0.161 2.105 2.713 0.002 0.001 6419.0 2791.0 1.0

5.5 Posterior predictive checks

Posterior predictive checks look somewhat better:

  • the posterior predictive fits about as well, with a slightly thinner tail
  • p-value is still shifted
  • u-value is much closer to uniform
  • LOO-PIT still suggests bias, but it looks slightly better overall; the right tail of the posterior predictive still looks too fat and goes well above the maximum points possible in the ruleset, likely causing the bias and shifted predictions.
plot_posterior_predictive_checks(zi_unpooled_idata);
Figure 21: Posterior predictive checks, p/u-value, and LOO for the zero-inflated model

5.6 Posterior evaluation

We plot just the forest of transformed means:

az.plot_forest(zi_unpooled_idata, var_names=["μ_"], combined=True, figsize=(12, 6));
Figure 22: Forest plot of the posterior distribution of the exponentiated μ parameter for the zero-inflated model

Compared to the basic model, the estimated means have shifted up slightly - the zero scores are now absorbed by the mixture rather than dragging the Negative Binomial component down - and the HDIs have shrunk.

5.7 Model comparison

Now that we have two models it’s worth running a comparison to see which performs better 12:

Code
comparison = az.compare(
    {"unpooled": unpooled_idata, "zero inflated unpooled": zi_unpooled_idata}
)
comparison
rank elpd_loo p_loo elpd_diff weight se dse warning scale
zero inflated unpooled 0 -922.394375 40.951957 0.000000 0.787666 17.265214 0.000000 False log
unpooled 1 -954.778337 44.522890 32.383962 0.212334 14.990548 11.918236 True log
az.plot_compare(comparison);
Figure 23: Comparison of the performance of the unpooled and zero-inflated models

The zero-inflated model clearly seems to perform better 13.

6 Driver estimates

We have a reasonable-looking, if somewhat biased, model for driver points, which we’ll now use to inform fantasy team construction.

When picking drivers, we consider the following:

  • how many points per race is the driver expected to bring and how reliably
  • how likely is it that a given driver will outperform another driver, and by what magnitude
  • how underpriced or overpriced a driver is, given their performance

While price analysis is out of the scope of this notebook, the points estimation model can help us with the first two questions.

Below, we compare each driver to their rivals using the difference in predicted points per race. To do so, we take the predictions generated from the posterior predictive distribution for each driver, subtract the two vectors, and take the mean. This reduces the posterior distributions to a point estimate, but still incorporates the uncertainty of the model, as we work directly with values generated from the posterior. The value can be read both as confidence that driver x is better than driver y (higher is better) and as the number of points we expect driver x to earn over driver y per race, on average.

Caution

The plot below isn’t mirrored across the diagonal - each cell strictly compares the driver on the X axis to the driver on the Y axis.

def make_comparison(idata, idx, groups, size=1000):
    columns = {}
    for idx_a, name_a in enumerate(groups):
        for idx_b, name_b in enumerate(groups):
            pp_a = idata.posterior_predictive["y"][:, :, idx == idx_a].values.flatten()
            pp_b = idata.posterior_predictive["y"][:, :, idx == idx_b].values.flatten()

            samples_a = np.random.choice(pp_a, size=size)
            samples_b = np.random.choice(pp_b, size=size)

            if idx_a != idx_b:
                points_over = (samples_a - samples_b).mean()
            else:
                points_over = 0

            columns.setdefault(name_a, []).append(points_over)

    comparison_df = pd.DataFrame.from_dict(columns).set_index(groups)
    mask = np.triu(np.ones_like(comparison_df, dtype=bool))

    return comparison_df, mask
comparison_df, mask = make_comparison(zi_unpooled_idata, idx, labels)
Code
sns.set_theme(style="white")

cmap = sns.diverging_palette(230, 20, as_cmap=True)

f, ax = plt.subplots(figsize=(11, 9))

ax = sns.heatmap(
    comparison_df,
    cmap=cmap,
    mask=mask,
    annot=True,
    center=0,
    square=True,
    linewidths=0.5,
    cbar_kws={"shrink": 0.5},
    annot_kws={"fontsize": 8},
).set(title="Driver points expected over rival");
Figure 24: Expected points earned by driver x over driver y using the zero-inflated model

7 Constructor estimates

We’ll run the same loop for the constructor comparison - first we define a slightly modified zero-inflated model, then we sample, validate convergence, and finally use the posterior and posterior predictive to evaluate constructor performance.

Data preparation is identical - we factorize the constructors and extract the points:

const_idx, const_labels = pd.factorize(const_data_df["constructor"], sort=True)
const_points = const_data_df["constructor_points"]

We adjust the priors slightly, to account for the differences in the constructors’ scoring system:

mu_const_dist = pz.Normal()
pz.maxent(mu_const_dist, np.log(5), np.log(30), 0.9);
Figure 25: Prior for the μ parameter for fitting the zero inflated model to the constructor data
alpha_const_dist = pz.Gamma()
pz.maxent(alpha_const_dist, 1, 10, .9);
Figure 26: Prior for the α parameter for fitting the zero inflated model to the constructor data
psi_const_dist = pz.Beta()
pz.maxent(psi_const_dist, .9, .99, .9);
Figure 27: Prior for the ψ parameter for fitting the zero inflated model to the constructor data
with pm.Model(
    coords={"constructor": const_labels, "idx": const_idx}
) as const_zi_unpooled_model:
    μ = pm.Normal("μ", mu_const_dist.mu, mu_const_dist.sigma, dims="constructor")
    μ_ = pm.Deterministic("μ_", pm.math.exp(μ), dims="constructor")

    α = pm.Gamma("α", alpha_const_dist.alpha, alpha_const_dist.beta, dims="constructor")

    ψ = pm.Beta("ψ", psi_const_dist.alpha, psi_const_dist.beta, dims="constructor")

    y = pm.ZeroInflatedNegativeBinomial(
        "y",
        psi=ψ[const_idx],
        alpha=α[const_idx],
        mu=μ_[const_idx],
        observed=const_points,
        dims="idx",
    )
with const_zi_unpooled_model:
    const_prior_pred = pm.sample_prior_predictive(samples=1000)
Sampling: [y, α, μ, ψ]
az.plot_ppc(const_prior_pred, group="prior", figsize=(12, 6));
Figure 28: Prior predictive distribution for the zero-inflated model when fitted to the constructors data
with const_zi_unpooled_model:
    const_idata = pm.sample(target_accept=0.9, idata_kwargs={"log_likelihood": True})
    pm.sample_posterior_predictive(const_idata, extend_inferencedata=True)
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Multiprocess sampling (4 chains in 4 jobs)
NUTS: [μ, α, ψ]
Sampling 4 chains for 1_000 tune and 1_000 draw iterations (4_000 + 4_000 draws total) took 6 seconds.
Sampling: [y]
100.00% [8000/8000 00:05<00:00 Sampling 4 chains, 0 divergences]
100.00% [4000/4000 00:03<00:00]
az.plot_trace(const_idata, var_names=["μ"], compact=False);
Figure 29: Traceplots of the μ parameter for the zero-inflated model when fitted to the constructors data
plot_posterior_predictive_checks(const_idata);
Figure 30: Posterior predictive checks, p/u-value, and LOO for the zero-inflated model when fitted to the constructors data
az.loo(const_idata)
Computed from 4000 posterior samples and 140 observations log-likelihood matrix.

         Estimate       SE
elpd_loo  -472.93     9.48
p_loo       12.07        -
------

Pareto k diagnostic values:
                         Count   Pct.
(-Inf, 0.5]   (good)      139   99.3%
 (0.5, 0.7]   (ok)          1    0.7%
   (0.7, 1]   (bad)         0    0.0%
   (1, Inf)   (very bad)    0    0.0%
az.plot_forest(const_idata, var_names=["μ_"], combined=True, figsize=(12, 6));
Figure 31: Forest of the posterior distribution of the exponentiated μ parameter for the zero-inflated model when fitted to the constructors data
Code
sns.set_theme(style="white")

cmap = sns.diverging_palette(230, 20, as_cmap=True)

f, ax = plt.subplots(figsize=(11, 9))

comparison_df, mask = make_comparison(const_idata, const_idx, const_labels)

ax = sns.heatmap(
    comparison_df,
    cmap=cmap,
    mask=mask,
    annot=True,
    center=0,
    square=True,
    linewidths=0.5,
    cbar_kws={"shrink": 0.5},
    annot_kws={"fontsize": 8},
).set(title="Constructor points expected over rival");
Figure 32: Expected points earned by constructor x over constructor y using the zero-inflated model

8 Conclusion

None of the results here are that surprising, but it’s nonetheless useful to confirm our intuition for who’s a good pick and who isn’t. That said, the estimates should be taken with more than a grain of salt - the model we used is quite rudimentary (for one, it doesn’t account for performance trends as the season progresses), and modeling the summed fantasy points directly leads to a lot of detail getting obscured and left out.

%load_ext watermark
%watermark -n -u -v -iv
Last updated: Tue Sep 05 2023

Python implementation: CPython
Python version       : 3.11.4
IPython version      : 8.14.0

numpy     : 1.25.2
tqdm      : 4.66.1
json      : 2.0.9
preliz    : 0.3.2
scipy     : 1.10.1
arviz     : 0.16.1
matplotlib: 3.7.2
pymc      : 5.7.2
pandas    : 2.1.0
sys       : 3.11.4 (main, Jun 28 2023, 19:51:46) [GCC]
requests  : 2.31.0
seaborn   : 0.12.2

Footnotes

  1. see the intro to this for more details: https://www.pymc.io/projects/examples/en/latest/generalized_linear_models/GLM-negative-binomial-regression.html↩︎

  2. see this for a quick summary of model checking techniques: https://www.pymc.io/projects/docs/en/stable/learn/core_notebooks/pymc_overview.html#model-checking↩︎

  3. see: https://bayesiancomputationbook.com/markdown/chp_02.html#effective-sample-size↩︎

  4. see: https://bayesiancomputationbook.com/markdown/chp_02.html#potential-scale-reduction-factor-hat-r↩︎

  5. see: https://bayesiancomputationbook.com/markdown/chp_02.html#monte-carlo-standard-error↩︎

  6. see: https://bayesiancomputationbook.com/markdown/chp_02.html#understanding-your-predictions↩︎

  7. see: https://bayesiancomputationbook.com/markdown/chp_02.html#fig-posterior-predictive-check-pu-values↩︎

  8. see: https://oriolabrilpla.cat/en/blog/posts/2019/loo-pit-tutorial.html↩︎

  9. for a detailed guide on interpreting the bpv plots see: https://bayesiancomputationbook.com/markdown/chp_02.html#fig-posterior-predictive-many-examples↩︎

  10. see: https://sjster.github.io/introduction_to_computational_statistics/docs/Production/PyMC3.html#hdi↩︎

  11. see: https://bayesiancomputationbook.com/markdown/chp_02.html#cross-validation-and-loo↩︎

  12. see: https://bayesiancomputationbook.com/markdown/chp_02.html#model-comparison↩︎

  13. for much more information on cross validation see: https://avehtari.github.io/modelselection/CV-FAQ.html#12_What_is_the_interpretation_of_ELPD__elpd_loo__elpd_diff↩︎