Understanding Boxplots through the Lens of Formula 1: 2023 Italy Grand Prix Analysis
FastF1 Tutorials Series

Boxplots, also called box-and-whisker plots, are a robust tool in statistical analysis that offer a five-number summary of a dataset: the minimum, first quartile, median, third quartile, and maximum. They provide a quick glance at the distribution, variability, and central tendency of data. Today, we will delve into Formula 1, using team pace data from the 2023 Italy Grand Prix as our guide.

Introduction to the World of Boxplots
In the realm of statistical visualization, the boxplot, also known as a box-and-whisker plot, stands out as an invaluable tool for researchers, analysts, and enthusiasts alike. At its core, a boxplot is a standardized way of displaying the distribution of data based on a five-number summary: the minimum, first quartile (Q1), median, third quartile (Q3), and the maximum. This graphical representation paints a clear picture of how values in a dataset are spread out, where the center of the data lies, and whether there are any potential outliers.
The simple design of a boxplot is deceptive, for it encapsulates a wealth of information. The central box represents the interquartile range (IQR), the middle 50% of the data. This range, stretching between Q1 and Q3, offers insights into data variability and the potential skewness of the distribution. The line inside the box marks the median, providing a quick reference to the dataset’s central tendency. The whiskers extending from the box delineate the spread of the data; in the default convention used by matplotlib and seaborn, they reach the most extreme data points within 1.5 times the IQR of the box edges. Individual points outside the whiskers often indicate outliers, values that deviate significantly from the rest.
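To make the five-number summary concrete, here is a minimal sketch that computes it with Python's standard library, along with the 1.5 × IQR limits commonly used to flag outliers. The function name five_number_summary and the lap times are hypothetical, invented purely for illustration.

```python
import statistics


def five_number_summary(values):
    """Return min, Q1, median, Q3, max, the IQR, and 1.5*IQR outliers."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between the data points
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    return {
        "min": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": data[-1],
        "iqr": iqr,
        # Values beyond the fences would be drawn as individual points
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Hypothetical lap times in seconds (not real race data)
lap_times = [86.1, 86.3, 86.4, 86.6, 86.8, 87.0, 87.1, 87.3, 91.5]
summary = five_number_summary(lap_times)
print(summary)
```

In this invented sample, the single 91.5-second lap lies beyond the upper fence, so a boxplot would show it as an isolated point rather than stretching the whisker to reach it.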
But why is the boxplot so essential? In an age where data drives decisions, the boxplot’s ability to summarize complex datasets into an easily digestible format is invaluable. It offers a quick visual comparison between groups or categories, aiding in hypothesis testing, data exploration, and outlier detection. Moreover, its compact design makes it ideal for presentations, reports, and articles, providing readers with an immediate understanding of the underlying data distribution.
Creating Boxplots Using Python and the FastF1 Library
Google Colab Notebook
Python Code
# Install FastF1 (Colab/Jupyter shell command)
!pip install fastf1

# Mount Google Drive and work inside the notebooks folder (Colab only)
from google.colab import drive
drive.mount('/content/drive/')
%cd '/content/drive/MyDrive/Colab Notebooks/'

import os

import fastf1 as ff1
from fastf1 import plotting
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Create the cache directory first; enable_cache expects it to exist
os.makedirs('Cache', exist_ok=True)
ff1.Cache.enable_cache('Cache')
plotting.setup_mpl()

# Select the session: 2023 Italy Grand Prix, race ('R')
year = 2023
gp = 'Italy'
event = 'R'
session_race = ff1.get_session(year, gp, event)
session_race.load()

# Keep only representative laps (by default, within 107% of the fastest lap)
quick_laps = session_race.laps.pick_quicklaps()
quick_laps  # shows the table when this is the last line of a notebook cell

# Convert lap times from timedelta to float seconds for plotting
transformed_laps = quick_laps.copy()
transformed_laps.loc[:, "LapTime (s)"] = quick_laps["LapTime"].dt.total_seconds()

# Order teams from fastest to slowest median lap time
team_order = (
    transformed_laps[["Team", "LapTime (s)"]]
    .groupby("Team")
    .median()["LapTime (s)"]
    .sort_values()
    .index
)
print(team_order)

# Map each team to its color. Newer FastF1 releases deprecate team_color
# in favor of plotting.get_team_color(team, session=session_race).
team_palette = {team: plotting.team_color(team) for team in team_order}

fig, ax = plt.subplots(figsize=(20, 8))
title = f"Team Pace Comparison {year} {gp} Grand Prix"
sns.boxplot(
    data=transformed_laps,
    x="Team",
    y="LapTime (s)",
    order=team_order,
    palette=team_palette,
    whiskerprops=dict(color="white"),
    boxprops=dict(edgecolor="white"),
    medianprops=dict(color="grey"),
    capprops=dict(color="white"),
)
plt.title(title)
plt.grid(visible=False)
ax.set(xlabel=None)
plt.tight_layout()
plt.show()
This Python code is structured to run in a Google Colab environment and uses the fastf1 library to analyze and visualize lap time data from the 2023 Italy Grand Prix. Here's a breakdown of the code:

1. Library Installation and Importation:
- !pip install fastf1 installs the fastf1 library.
- Various other libraries such as numpy, matplotlib, pandas, and seaborn are imported for data manipulation and visualization.
2. Google Drive Mounting:
- drive.mount('/content/drive/') mounts the user's Google Drive to the Colab environment for file access.
- %cd '/content/drive/MyDrive/Colab Notebooks/' changes the working directory to a specific folder within the Google Drive.
3. Configuration and Data Retrieval:
- ff1.Cache.enable_cache('Cache') enables caching to speed up data loading in future runs.
- session_race = ff1.get_session(year, gp, event) and session_race.load() retrieve and load the race session data.
4. Data Filtering and Processing:
- quick_laps = session_race.laps.pick_quicklaps() filters the data for quick laps (by default, laps within 107% of the fastest lap).
- transformed_laps is a copy of quick_laps, where the LapTime column is converted to seconds using quick_laps["LapTime"].dt.total_seconds().
- team_order computes the median lap time for each team, sorts the teams in ascending order, and stores the resulting team order.
5. Palette Creation:
- team_palette is a dictionary mapping each team to a color using fastf1.plotting.team_color(team).
6. Visualization:
- A new figure and axes are created with fig, ax = plt.subplots(figsize=(20, 8)).
- sns.boxplot(...) creates a box plot showing the distribution of lap times for each team, ordered by their median lap times:
- x="Team" and y="LapTime (s)" set the x and y axes to display the team names and lap times in seconds, respectively.
- order=team_order specifies the order of the teams on the x-axis.
- palette=team_palette sets the color palette using the team colors.
- Various properties like whiskerprops, boxprops, medianprops, and capprops are set for aesthetic adjustments.
- plt.title(title) adds a title to the plot.
- plt.grid(visible=False) hides the grid.
- ax.set(xlabel=None) removes the x-axis label.
- plt.tight_layout() adjusts the layout to ensure everything fits nicely.
- plt.show() displays the plot.
This code essentially creates a box plot to visually compare the pace of different Formula 1 teams during the 2023 Italy Grand Prix, based on the distribution of their lap times.
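The data preparation steps in the breakdown (converting lap times to seconds and ordering teams by median) can be tried without downloading any session data. The following is a minimal sketch on a tiny invented table: the team names and lap times are hypothetical, and only the column names Team, LapTime, and LapTime (s) mirror the code above.

```python
import pandas as pd

# Hypothetical stand-in for session laps; teams and times are invented
laps = pd.DataFrame({
    "Team": ["Team A", "Team A", "Team B", "Team B", "Team C", "Team C"],
    "LapTime": pd.to_timedelta([87.2, 87.6, 86.1, 86.5, 88.0, 88.4], unit="s"),
})

# Convert timedeltas to float seconds so they fit on a numeric axis
laps["LapTime (s)"] = laps["LapTime"].dt.total_seconds()

# Median lap time per team, fastest first; the index is the x-axis order
team_order = (
    laps.groupby("Team")["LapTime (s)"]
    .median()
    .sort_values()
    .index
)
print(list(team_order))
```

Sorting by the median rather than the mean keeps the ordering robust to an occasional slow lap, which is the same reason the boxplot itself centers on the median.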

Let’s analyze the boxplots of the teams’ performances at the 2023 Italy Grand Prix:
1. Red Bull Racing: Beginning with the pacesetters, Red Bull’s box showcases a median lap time in the lower 86-second region. The height of the box is the interquartile range (IQR), a measure of lap-time variability: the taller the box, the less consistent the laps.
2. Ferrari: The home favorites, Ferrari, display a slightly higher median than Red Bull, positioned just above the 86-second mark. Their compact IQR, compared to Red Bull, showcases a consistent pace among their drivers.
3. Mercedes: The Silver Arrows present a median near the upper 86-second domain. The whiskers, representing the range of the data, indicate occasional slower laps, possibly due to traffic or on-track events (laps slowed by pit stops are mostly removed by the quick-lap filter).
4. McLaren: Their median hovers around 87 seconds, placing them competitively. The spread of their box reveals some variability in pace, indicating differing strategies or car performances across laps.
5. AlphaTauri: Positioned around the middle, their boxplot displays a median performance near the 87-second bracket. The extended IQR suggests the team had varied lap times, potentially due to different tire strategies or in-race incidents.
6. Aston Martin: With a snug IQR, Aston Martin’s performance was consistent around the 88-second median. Their boxplot reveals a balanced pace, crucial for consistent race results.
7. Alfa Romeo: Their median touches the higher 88 seconds, but with a more substantial IQR. This data hints at a broader range of lap times, suggesting a mixed performance from their drivers.
8. Williams: Their median lap time brushes the 89-second barrier. The box’s spread indicates variability, possibly hinting at different strategies employed during the race or differing car setups.
9. Haas F1 Team: A median lap time close to Williams, but with a more compressed IQR, indicating a steadier pace amongst their drivers throughout the Grand Prix.
10. Alpine: Their median surpasses the 89-second range, but the box’s compact nature reveals consistent lap times. This consistency is crucial for strategizing pit stops and race position battles.
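Readings such as "compact IQR" or "higher median" can also be checked numerically instead of by eye. Below is a minimal sketch, assuming a table with the same Team and LapTime (s) columns as in the code earlier; the two teams and their lap times are hypothetical and do not come from the race.

```python
import pandas as pd

# Hypothetical lap times in seconds for two invented teams
laps = pd.DataFrame({
    "Team": ["Team A"] * 5 + ["Team B"] * 5,
    "LapTime (s)": [86.0, 86.2, 86.4, 86.6, 86.8,
                    86.0, 86.5, 87.0, 87.5, 88.0],
})

grouped = laps.groupby("Team")["LapTime (s)"]

# One row per team: the median line and the box edges of the boxplot
stats = pd.DataFrame({
    "median": grouped.median(),
    "q1": grouped.quantile(0.25),
    "q3": grouped.quantile(0.75),
})
# The IQR is the height of the box; smaller means more consistent laps
stats["iqr"] = stats["q3"] - stats["q1"]

print(stats.sort_values("median"))
```

Applied to transformed_laps, the same grouping would give each team's median and IQR as numbers, which makes comparisons between close teams less dependent on reading the chart.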

Conclusion:
Boxplots are an invaluable tool for researchers, analysts, and enthusiasts alike. They compress vast amounts of data into a clear, understandable format. Through this 2023 Italy Grand Prix analysis, we have seen how teams’ performance can be quickly gauged, helping strategists, pundits, and fans derive insights into the race dynamics.

Thanks for reading this far. If you have a specific question or would like a specific analysis, please feel free to contact me.
For more information, please visit the RacingDataLab website.
Disclaimer:
- This article is unofficial and is not associated in any way with the Formula 1 companies. F1, FORMULA ONE, FORMULA 1, FIA FORMULA ONE WORLD CHAMPIONSHIP, GRAND PRIX and related marks are trade marks of Formula One Licensing B.V.
- The comments expressed in this article and in the analyses are personal and do not represent the position of any company.
- This article is for fan use, dedicated to the FIA FORMULA ONE WORLD CHAMPIONSHIP, to report on and provide information about the FORMULA 1 events.
Credits:
FIA.com | the official website of the Federation Internationale de l’Automobile. | https://www.fia.com/
Formula1.com | the official website of the F1. | https://www.formula1.com/
RedBullContentPool.com | Editorial Use / Getty Images / Red Bull Content Pool | https://www.redbullcontentpool.com/