Raúl García

Summary

The article provides a statistical analysis of the 2023 Italy Grand Prix using boxplots to compare Formula 1 team performances based on lap times, utilizing the fastf1 Python library within a Google Colab notebook.

Abstract

The article titled "Understanding Boxplots through the Lens of Formula 1: 2023 Italy Grand Prix Analysis" is part of the FastF1 Tutorials Series and demonstrates the application of boxplots in the context of Formula 1 racing. It explains the significance of boxplots in offering a five-number summary of a dataset, illustrating the distribution, variability, and central tendency of data. The article guides readers through the process of creating boxplots using Python and the FastF1 library in a Google Colab environment to analyze the team pace data from the 2023 Italy Grand Prix. It details the steps for data retrieval, filtering, and visualization, culminating in a graphical comparison of lap times across different teams. The analysis reveals insights into the performance consistency and strategies of each team, providing a comprehensive understanding of race dynamics from a data-driven perspective.

Opinions

  • The author emphasizes the value of boxplots as a robust tool for statistical analysis, particularly for their ability to summarize complex datasets into an easily digestible format.
  • The article suggests that the FastF1 library and Python are powerful tools for analyzing Formula 1 data, making it accessible for researchers, analysts, and enthusiasts.
  • The author implies that data variability and consistency, as shown by the boxplots, are critical factors in understanding team performance and race strategy.
  • The use of team colors in the boxplot visualization is appreciated for making the data more intuitive and engaging for readers and viewers.
  • The author's analysis indicates that while some teams like Ferrari showed consistent lap times, others like Red Bull Racing had more variability, which could be indicative of different racing strategies or on-track incidents.
  • The article concludes with a nod to the broader application of data science in Formula 1, suggesting that such analyses can provide strategic insights for teams, pundits, and fans.

Understanding Boxplots through the Lens of Formula 1: 2023 Italy Grand Prix Analysis

FastF1 Tutorials Series

https://www.redbullcontentpool.com/

Boxplots, often referred to as box-and-whisker plots, are a robust tool in statistical analysis that offer a five-number summary of a dataset: the minimum, first quartile, median, third quartile, and maximum. They provide a quick glance at the distribution, variability, and central tendency of data. Today, we will delve into the fascinating realm of Formula 1, using the 2023 Italy Grand Prix’s team pace data as our guide.

Team Pace Comparison 2023 Italy GP

Introduction to the World of Boxplots

In the realm of statistical visualization, the boxplot, also known as a box-and-whisker plot, stands out as an invaluable tool for researchers, analysts, and enthusiasts alike. At its core, a boxplot is a standardized way of displaying the distribution of data based on a five-number summary: the minimum, first quartile (Q1), median, third quartile (Q3), and the maximum. This graphical representation paints a clear picture of how values in a dataset are spread out, where the center of the data lies, and whether there are any potential outliers.

The simple design of a boxplot is deceptive, for it encapsulates a wealth of information. The central box represents the interquartile range (IQR), the middle 50% of the data. This range, stretching between Q1 and Q3, offers insights into data variability and the potential skewness of the distribution. The line inside the box marks the median, providing a quick reference to the dataset’s central tendency. The whiskers extending from the box delineate the spread of the remaining data. In matplotlib and seaborn, they reach by default to the most extreme values within 1.5 × IQR of the box, so they coincide with the minimum and maximum only when no point lies beyond that limit. Individual points outside the whiskers are drawn separately and often indicate outliers: values that deviate significantly from the rest.
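To make the anatomy described above concrete, here is a minimal sketch that computes the numbers a boxplot draws. The helper name boxplot_stats and the lap times are hypothetical, chosen only for illustration, and the 1.5 × IQR whisker rule is assumed because it is the matplotlib default.

```python
import numpy as np


def boxplot_stats(values, whis=1.5):
    """Return the quantities a box-and-whisker plot draws for `values`."""
    data = np.sort(np.asarray(values, dtype=float))
    q1, median, q3 = np.percentile(data, [25, 50, 75])
    iqr = q3 - q1

    # Fences: points beyond them are treated as outliers.
    lower_fence = q1 - whis * iqr
    upper_fence = q3 + whis * iqr

    inside = data[(data >= lower_fence) & (data <= upper_fence)]
    outliers = data[(data < lower_fence) | (data > upper_fence)]

    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers end at the most extreme points still inside the fences.
        "whisker_low": inside.min(),
        "whisker_high": inside.max(),
        "outliers": outliers.tolist(),
    }


# Hypothetical lap times in seconds (not real race data).
laps = [86.0, 86.2, 86.4, 86.6, 86.8, 87.0, 87.2, 87.4, 90.0]
stats = boxplot_stats(laps)
print(stats)
```

In this made-up sample, the single 90-second lap falls outside the upper fence, so it would be drawn as an individual point while the upper whisker stops at the slowest lap inside the fence.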

But why is the boxplot so essential? In an age where data drives decisions, the boxplot’s ability to summarize complex datasets into an easily digestible format is invaluable. It offers a quick visual comparison between groups or categories, aiding in hypothesis testing, data exploration, and outlier detection. Moreover, its compact design makes it ideal for presentations, reports, and articles, providing readers with an immediate understanding of the underlying data distribution.

Creating Boxplots Using Python and the FastF1 Library

Google Colab Notebook

Python Code

!pip install fastf1

from google.colab import drive
drive.mount('/content/drive/')

%cd '/content/drive/MyDrive/Colab Notebooks/'

import os

# The code below uses both the full module name and the ff1 alias.
import fastf1
import fastf1 as ff1
from fastf1 import plotting
import numpy as np
import pandas as pd
import seaborn as sns
from matplotlib import pyplot as plt

# Create the cache folder first; enable_cache may fail if it is missing.
os.makedirs('Cache', exist_ok=True)
ff1.Cache.enable_cache('Cache')

# Apply FastF1's matplotlib defaults.
plotting.setup_mpl()

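year = 2023
gp = 'Italy'
event = 'R'  # 'R' selects the race session

# Fetch the session object, then load its data (from the cache when available).
session_race = ff1.get_session(year, gp, event)
session_race.load()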

quick_laps = session_race.laps.pick_quicklaps()
quick_laps

transformed_laps = quick_laps.copy()
transformed_laps.loc[:, "LapTime (s)"] = quick_laps["LapTime"].dt.total_seconds()
team_order = (
    transformed_laps[["Team", "LapTime (s)"]]
    .groupby("Team")
    .median()["LapTime (s)"]
    .sort_values()
    .index
)
print(team_order)

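# Map each team to its color. Newer FastF1 releases deprecate team_color()
# in favor of plotting.get_team_color(team, session=session_race).
team_palette = {team: plotting.team_color(team) for team in team_order}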


fig, ax = plt.subplots(figsize=(20, 8))
title = f"Team Pace Comparison {year} {gp} Grand Prix"
sns.boxplot(
    data=transformed_laps,
    x="Team",
    y="LapTime (s)",
    order=team_order,
    palette=team_palette,
    # White outlines assume a dark plot background.
    whiskerprops=dict(color="white"),
    boxprops=dict(edgecolor="white"),
    medianprops=dict(color="grey"),
    capprops=dict(color="white"),
    ax=ax,  # draw on the axes created above
)

plt.title(title)
plt.grid(visible=False)

ax.set(xlabel=None)
plt.tight_layout()
plt.show()

This Python code is structured to run in a Google Colab environment and utilizes the fastf1 library to analyze and visualize lap time data from the 2023 Italy Grand Prix. Here's a breakdown of the code:

https://racingdatalab.com/r15.php
1. Library Installation and Importation:

  • !pip install fastf1 installs the fastf1 library.
  • Various other libraries such as numpy, matplotlib, pandas, and seaborn are imported for data manipulation and visualization.

2. Google Drive Mounting:

  • drive.mount('/content/drive/') mounts the user's Google Drive to the Colab environment for file access.
  • %cd '/content/drive/MyDrive/Colab Notebooks/' changes the working directory to a specific folder within the Google Drive.

3. Configuration and Data Retrieval:

  • ff1.Cache.enable_cache('Cache') enables caching to speed up data loading in future runs.
  • session_race = ff1.get_session(year, gp, event) and session_race.load() retrieve and load the race session data.

4. Data Filtering and Processing:

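  • quick_laps = session_race.laps.pick_quicklaps() keeps only representative racing laps. By default, FastF1 retains laps faster than 107% of the fastest lap in the session, which typically discards much slower laps such as those involving a pit stop.
  • transformed_laps is a copy of quick_laps with a new column, LapTime (s), that holds the LapTime values converted to seconds using quick_laps["LapTime"].dt.total_seconds().
  • team_order computes the median lap time for each team, sorts the teams from fastest to slowest median, and stores the resulting order.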

5. Palette Creation:

  • team_palette creates a dictionary mapping each team to a color using fastf1.plotting.team_color(team).

6. Visualization:

  • A new figure and axes are created with fig, ax = plt.subplots(figsize=(20, 8)).
  • sns.boxplot(...) creates a box plot showing the distribution of lap times for each team, ordered by their median lap times:
      • x="Team" and y="LapTime (s)" set the x and y-axes to display the team names and lap times in seconds, respectively.
      • order=team_order specifies the order of the teams on the x-axis.
      • palette=team_palette sets the color palette using the team colors.
      • Various properties like whiskerprops, boxprops, medianprops, and capprops are set for aesthetic adjustments.

  • plt.title(title) adds a title to the plot.
  • plt.grid(visible=False) hides the grid.
  • ax.set(xlabel=None) removes the x-axis label.
  • plt.tight_layout() adjusts the layout to ensure everything fits nicely.
  • plt.show() displays the plot.

This code essentially creates a box plot to visually compare the pace of different Formula 1 teams during the 2023 Italy Grand Prix, based on the distribution of their lap times.
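The filtering and ordering steps can be tried without downloading any session data. The sketch below is a rough stand-in and not FastF1's own implementation: the lap times are invented, and the 107% cutoff is applied by hand on the assumption that it mirrors the default behavior of pick_quicklaps().

```python
import pandas as pd

# Hypothetical lap data (not real timing data); times are in milliseconds.
laps = pd.DataFrame({
    "Team": ["Ferrari"] * 3 + ["Red Bull Racing"] * 3 + ["Williams"] * 3,
    "LapTime": pd.to_timedelta(
        [86400, 86600, 99000, 86000, 86200, 86900, 88800, 89000, 89300],
        unit="ms",
    ),
})

# Convert the timedelta column to plain seconds, as the article's code does.
laps["LapTime (s)"] = laps["LapTime"].dt.total_seconds()

# Hand-written stand-in for pick_quicklaps():
# keep laps faster than 107% of the fastest lap.
cutoff = 1.07 * laps["LapTime (s)"].min()
quick = laps[laps["LapTime (s)"] < cutoff].copy()

# Order the teams from fastest to slowest median lap time.
team_order = (
    quick.groupby("Team")["LapTime (s)"]
    .median()
    .sort_values()
    .index
)
print(list(team_order))
```

Here the 99-second lap is dropped by the cutoff, so it no longer drags that team's box upward, and the remaining laps determine the left-to-right order of the teams on the x-axis.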

Let’s analyze the boxplots of the teams’ performances at the 2023 Italy Grand Prix:

1. Red Bull Racing: Beginning with the pacesetters, Red Bull’s box showcases a median lap time in the lower 86-second region. The box’s size indicates the interquartile range (IQR), suggesting variability in their lap times. A broader spread implies inconsistent lap performances.

2. Ferrari: The home favorites, Ferrari, display a slightly higher median than Red Bull, positioned just above the 86-second mark. Their compact IQR, compared to Red Bull, showcases a consistent pace among their drivers.

3. Mercedes: The Silver Arrows present a median near the upper 86-second domain. The whiskers, representing the range of the data, indicate an occasional slower lap, possibly due to strategic pit stops or on-track events.

4. McLaren: Their median hovers around 87 seconds, placing them competitively. The spread of their box reveals some variability in pace, indicating differing strategies or car performances across laps.

5. AlphaTauri: Positioned around the middle, their boxplot displays a median performance near the 87-second bracket. The extended IQR suggests the team had varied lap times, potentially due to different tire strategies or in-race incidents.

6. Aston Martin: With a snug IQR, Aston Martin’s performance was consistent around the 88-second median. Their boxplot reveals a balanced pace, crucial for consistent race results.

7. Alfa Romeo: Their median touches the higher 88 seconds, but with a more substantial IQR. This data hints at a broader range of lap times, suggesting a mixed performance from their drivers.

8. Williams: Their median lap time brushes the 89-second barrier. The box’s spread indicates variability, possibly hinting at different strategies employed during the race or differing car setups.

9. Haas F1 Team: Their median lap time is close to that of Williams, but a more compressed IQR indicates a steadier pace amongst their drivers throughout the Grand Prix.

10. Alpine: Their median surpasses the 89-second range, but the box’s compact nature reveals consistent lap times. This consistency is crucial for strategizing pit stops and race position battles.
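Readings such as "compact IQR" or "median near 87 seconds" are estimated by eye from the plot. One way to back them with numbers is a per-team summary table. The following is a minimal sketch: the function name pace_summary and the sample values are hypothetical, and it assumes a table with Team and LapTime (s) columns like the transformed_laps DataFrame built earlier.

```python
import pandas as pd


def pace_summary(laps, team_col="Team", time_col="LapTime (s)"):
    """Median, quartiles, IQR, and lap count per team, fastest median first."""
    grouped = laps.groupby(team_col)[time_col]
    summary = pd.DataFrame({
        "median": grouped.median(),
        "q1": grouped.quantile(0.25),
        "q3": grouped.quantile(0.75),
        "laps": grouped.count(),
    })
    # The IQR is the height of each team's box in the plot.
    summary["iqr"] = summary["q3"] - summary["q1"]
    return summary.sort_values("median")


# Hypothetical data (not real lap times): one steady team, one erratic team.
sample = pd.DataFrame({
    "Team": ["Team A"] * 5 + ["Team B"] * 5,
    "LapTime (s)": [86.0, 86.2, 86.4, 86.6, 86.8,
                    86.0, 86.5, 87.0, 87.5, 88.0],
})
summary = pace_summary(sample)
print(summary)
```

Calling pace_summary(transformed_laps) on the real session data would give the exact medians and IQRs behind each box, which makes it easier to check the team-by-team observations above.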

Conclusion:

Boxplots are an invaluable tool for researchers, analysts, and enthusiasts alike. They compress vast amounts of data into a clear, understandable format. Through this 2023 Italy Grand Prix analysis, we have seen how teams’ performance can be quickly gauged, helping strategists, pundits, and fans derive insights into the race dynamics.

Thanks for reading this far. If you have a specific question or want to perform a specific analysis, please feel free to contact me.

For more information, please visit RacingDataLab Website:

Disclaimer:

- This article is unofficial and is not associated in any way with the Formula 1 companies. F1, FORMULA ONE, FORMULA 1, FIA FORMULA ONE WORLD CHAMPIONSHIP, GRAND PRIX and related marks are trade marks of Formula One Licensing B.V.
- The comments expressed in this article and in the analyses are personal and do not represent the position of any company.
- This article is for fan use, dedicated to the FIA FORMULA ONE WORLD CHAMPIONSHIP, to report on and provide information about the FORMULA 1 events.

Credits:

FIA.com | The official website of the Fédération Internationale de l’Automobile | https://www.fia.com/

Formula1.com | The official Formula 1 website | https://www.formula1.com/

RedBullContentPool.com | Editorial Use / Getty Images / Red Bull Content Pool | https://www.redbullcontentpool.com/
