The article provides a detailed guide on how to analyze Formula 1 data from the 2021 Italian Grand Prix using Python, focusing on the battle between Ricciardo and Verstappen.
Abstract
This tutorial, authored by a data-enthusiast and Formula 1 fan, demonstrates the use of Python and the fastf1 library to analyze telemetry data from the 2021 Italian Grand Prix. It addresses the question of whether Verstappen could have overtaken Ricciardo on merit by examining their lap times and the distance between them during the race. The article guides readers through setting up the analysis environment, collecting and processing the data, and plotting it to draw insights. It concludes that Ricciardo's performance was strong, making it difficult for Verstappen to overtake, and suggests that McLaren had a good chance of winning regardless of the collision incident between Hamilton and Verstappen.
Opinions
The author believes that Ricciardo's performance was meritorious and that the McLaren team was competitive enough to potentially win the Italian Grand Prix without the intervention of the collision between Hamilton and Verstappen.
The author expresses that Verstappen was pushing hard to close the gap to Ricciardo, particularly evident in laps 4, 5, and 6, but faced challenges due to dirty air and tire overheating.
The author suggests that the straight-line speed advantage of the McLaren's Mercedes engine was a significant factor in the race dynamics, contributing to the difficulty Verstappen faced in overtaking.
The author encourages readers to engage with the data themselves to gain a deeper understanding of the race and the intricacies of Formula 1 telemetry analysis.
The author highlights the importance of the fastf1 Python package for conducting such analyses and points readers to a newer tutorial for updated methods.
The author concludes with a note of uncertainty regarding hypothetical scenarios, such as the impact of a faster pit stop for Verstappen or different tire strategies, acknowledging that the outcome could have been different.
How I Analyze Formula 1 Data With Python: 2021 Italian GP
This tutorial is outdated! There is a newer version of the fastf1 library, about which I have created a new tutorial.
As a data fanatic and a Formula 1 fan, I find the amount of data coming from Formula 1 weekends simply amazing to play around with. How cool is it to create insights that you haven’t even seen on TV during the weekend?!
I have therefore started a series (have a look at Part 1 about the Dutch GP) that provides you with in-depth analyses of specific events during a Formula 1 weekend. But, more importantly, this series will show you how these Formula 1 analyses are made in Python.
If you are only interested in the analysis, I suggest you scroll down to Step 3: Plot the data.
The 2021 Italian Grand Prix
“Would Ricciardo Have Beaten Verstappen on Merit During the Italian GP?”
That is a question I have come across a lot. Unfortunately, we never got to find out the answer because, as we all know, Hamilton and Verstappen collided. What we can do, however, is have a look at the data of some laps from before the crash, and ask ourselves whether Verstappen would have been able to pass Ricciardo during later stints.
This tutorial will, just like the previous tutorial, dive into telemetry data. It will also show you how to use and process the available distance variables between Ricciardo and Verstappen, which requires a different approach and some data transformations.
Step 1: Set up the basics
First, we include all required libraries. These analyses will rely heavily on the data provided by the fastf1 Python package.
Also, we want to enable the built-in plotting functionality from the fastf1 library and enable the cache to avoid repeated long waiting times. In addition, we quickly hide a type of warning that is not relevant for us but can be quite annoying.
Step 2: Collect the data
We start by loading the full session data by passing the year as the first parameter, the location as the second parameter and the session as the third parameter (this one could, for example, also be ‘Q’).
Next, we load all laps including telemetry…
… and we only select the drivers we’re interested in.
Since Verstappen and Hamilton crashed right after the pitstops and the close battle between Verstappen and Ricciardo was mainly happening during their first stint, we only select the first stint:
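As a rough illustration of these two selection steps, here is a minimal sketch on a small made-up laps table. The column names Driver, LapNumber and Stint are meant to mirror the ones fastf1 provides, but the values and the variable names (laps_ric_ver, first_stint) are assumptions for this example only.

```python
import pandas as pd

# Made-up stand-in for the laps table; only the column names are meant to
# resemble the real data, the values are invented for illustration.
laps = pd.DataFrame({
    "Driver":    ["RIC", "VER", "HAM", "RIC", "VER", "HAM"],
    "LapNumber": [2, 2, 2, 27, 27, 27],
    "Stint":     [1, 1, 1, 2, 2, 2],
})

# Keep only the two drivers we are interested in
laps_ric_ver = laps.loc[laps["Driver"].isin(["RIC", "VER"])]

# Keep only the laps of the opening stint
first_stint = laps_ric_ver.loc[laps_ric_ver["Stint"] == 1]

print(first_stint)
```

Filtering in two separate steps keeps each condition easy to read and to change, for example if you want to look at a different pair of drivers.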
Expanding the data
Now that we have loaded all our data, we want to run some calculations and transformations on it to prepare for our analyses.
First, we quickly create a variable for the race lap number. The difference between the race lap and the “actual” lap is that the latter also includes the warm-up lap (by the way, I highly recommend you play around with the data yourself so you find out things like this). Since lap 1 was the warm-up lap and the race started on lap 2, we subtract 1 from the lap number.
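The subtraction itself could look like the sketch below. The table is made up, and the column name RaceLapNumber follows the name used later in this tutorial.

```python
import pandas as pd

# Made-up laps table: lap 2 in the data is the first racing lap
laps = pd.DataFrame({
    "Driver":    ["VER", "VER", "VER"],
    "LapNumber": [2, 3, 4],
})

# The race lap is the recorded lap number minus the warm-up lap
laps["RaceLapNumber"] = laps["LapNumber"] - 1

print(laps)
```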
This was easy. Now, however, we are going to do some more complicated stuff. What we want to achieve is to get an idea of how close Verstappen was during each lap. To do so, the fastf1 telemetry data can provide us with the distance to the driver ahead.
We are therefore only interested in the telemetry of Verstappen, since he was driving behind Ricciardo. So, let’s create two DataFrames: one for all the distances between Verstappen and Ricciardo at any moment during any lap, and one for the average (summarized) distance per lap.
Next, we need to loop through all the laps one by one. If we don’t, the ‘Distance’ variable in the telemetry data will range from 0 meters up to the entire length of the first stint (which is about 23 laps, so over 130 km). We don’t want that, since we want to compare lap by lap. In other words, the telemetry data needs to be requested separately for every lap so that the distance variable starts at 0 each time. If you don’t understand what is going on, I highly suggest you experiment with this yourself (or you can also ask your question in the comments of course).
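To make the problem more concrete, the sketch below builds a made-up stint in which the distance keeps growing across laps, and then derives a distance that restarts at 0 every lap. Note that this uses a pandas groupby on invented data, which is a different technique from requesting the telemetry again per lap as described above; it only illustrates what “resetting the distance” means. The names stint, LapDistance and LAP_LENGTH_M are assumptions for this example.

```python
import pandas as pd

LAP_LENGTH_M = 5793  # approximate length of one lap of Monza, in meters

# Made-up stint-wide telemetry: three samples per lap, and the distance
# keeps increasing from one lap to the next.
stint = pd.DataFrame({
    "Lap":      [lap + 1 for lap in range(3) for _ in range(3)],
    "Distance": [lap * LAP_LENGTH_M + d
                 for lap in range(3) for d in (0, 2000, 4000)],
})

# Subtract the distance at the start of each lap, so every lap starts at 0
lap_start = stint.groupby("Lap")["Distance"].transform("min")
stint["LapDistance"] = stint["Distance"] - lap_start

print(stint)
```

With the stint-wide Distance column, lap 3 starts at more than 11 km, so it cannot be overlaid on lap 1. With LapDistance, all laps share the same 0 to roughly 5.8 km axis.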
So, this is a big chunk of code. Let me explain what’s going on.
[Line 1] First, as mentioned above, we loop through all Verstappen’s laps one by one. Fastf1 has a built-in function to iterate through laps, which is called iterlaps(). It is basically similar to Pandas’ iterrows().
[Line 2] Then, we create the telemetry variable. We select the lap data, tell fastf1 to load the car data (so the telemetry), tell fastf1 to include the distance that has been driven in the lap, and lastly, tell fastf1 to add the driver ahead. That last step does two things: it adds a column DriverAhead and a column DistanceToDriverAhead to the telemetry data of Verstappen.
[Line 5] We then only select telemetry from when the DriverAhead was number 3, since we’re only interested in Verstappen’s battle with Ricciardo (car number 3).
[Line 7] We quickly check if the telemetry DataFrame is not empty. If it is empty, this means that the driver ahead is not Ricciardo (e.g. during the pitstops) so we don’t want to include that telemetry data.
[Lines 8–12] We then select the columns we want for the full distance DataFrame, and include the lap number with it so we can compare all the laps and all the distances later on.
[Lines 14–22] We then calculate the summary statistics for the distances per lap. We want the average and the median distance for the laps, so we can compare those later as well. The function np.nanmean() automatically ignores all NaN values, which is convenient for us, because sometimes there is no driver in front (like during pitstops), which results in a NaN value.
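The steps above can be sketched on made-up data as follows. This is not the original code of this tutorial: the per-lap telemetry is invented so the example runs on its own, the driver number is assumed to be stored as the string "3", and the names telemetry_per_lap and summarized_distance_ver_ric are hypothetical. The rows are collected in lists and combined at the end with pd.concat.

```python
import numpy as np
import pandas as pd

# Made-up per-lap telemetry for Verstappen (assumption: in the real data
# this comes from fastf1 and DriverAhead holds the car number as a string).
telemetry_per_lap = {
    1: pd.DataFrame({
        "Distance": [0.0, 100.0, 200.0],
        "DriverAhead": ["3", "3", "3"],
        "DistanceToDriverAhead": [10.0, np.nan, 20.0],
    }),
    2: pd.DataFrame({
        "Distance": [0.0, 100.0, 200.0],
        "DriverAhead": ["44", "44", "44"],  # another car ahead: lap is skipped
        "DistanceToDriverAhead": [50.0, 55.0, 60.0],
    }),
    3: pd.DataFrame({
        "Distance": [0.0, 100.0, 200.0],
        "DriverAhead": ["3", "3", "4"],
        "DistanceToDriverAhead": [8.0, 12.0, 90.0],
    }),
}

full_rows = []
summary_rows = []

for lap_number, telemetry in telemetry_per_lap.items():
    # Keep only the samples where Ricciardo (car number 3) is directly ahead
    telemetry = telemetry.loc[telemetry["DriverAhead"] == "3"]

    # Skip laps in which Ricciardo was never the driver ahead
    if telemetry.empty:
        continue

    # Full distances, tagged with the lap number for later comparison
    lap_telemetry = telemetry[["Distance", "DistanceToDriverAhead"]].copy()
    lap_telemetry["Lap"] = lap_number
    full_rows.append(lap_telemetry)

    # Summary statistics per lap; the nan-aware functions ignore NaN values
    summary_rows.append({
        "Lap": lap_number,
        "Mean": np.nanmean(telemetry["DistanceToDriverAhead"]),
        "Median": np.nanmedian(telemetry["DistanceToDriverAhead"]),
    })

full_distance_ver_ric = pd.concat(full_rows, ignore_index=True)
summarized_distance_ver_ric = pd.DataFrame(summary_rows)

print(summarized_distance_ver_ric)
```

In this made-up example lap 2 disappears from both DataFrames, and the NaN sample in lap 1 does not affect the mean of that lap.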
Step 3: Plot the data
Now, we finally have everything in place. To evaluate the battle between Verstappen and Ricciardo, let’s first start with comparing their lap times and the average distance between them (which we just calculated!).
Laptimes and distance
To do so, we use the subplots functionality from matplotlib. This allows us to create multiple plots in one single figure. The first subplot consists of the lap times during the opening stint, while the second subplot will show us the average distance between Verstappen and Ricciardo.
I (again) highly suggest that you play around with this. It can come across as quite confusing (I felt the same in the beginning), but it actually is really straightforward. On line 3, we pass “2” as an argument to the subplots method. This generates 2 subplots, which can be accessed through ax[0] and ax[1]. After that, it is just basic matplotlib plotting.
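If you want to try the subplots mechanism in isolation, here is a minimal sketch with made-up numbers. The lap times and gaps are invented and do not come from the race data, and the Agg backend is only selected so that the script also runs on a machine without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Made-up numbers, purely to have something to plot
race_laps = [1, 2, 3, 4, 5]
lap_times_ric = [88.9, 87.4, 87.2, 87.1, 87.0]
lap_times_ver = [89.3, 87.5, 87.1, 87.0, 87.1]
mean_gap_m = [45.0, 40.0, 30.0, 22.0, 25.0]

# Passing 2 creates two stacked subplots: ax[0] and ax[1]
fig, ax = plt.subplots(2)
fig.suptitle("RIC vs VER: opening stint (made-up data)")

# First subplot: lap times of both drivers
ax[0].plot(race_laps, lap_times_ric, label="RIC")
ax[0].plot(race_laps, lap_times_ver, label="VER")
ax[0].set_ylabel("Lap time (s)")
ax[0].legend()

# Second subplot: average distance between the two cars per lap
ax[1].plot(race_laps, mean_gap_m, label="Mean gap")
ax[1].set_xlabel("Race lap")
ax[1].set_ylabel("Distance (m)")
```

From here you can call fig.savefig() with a file name of your choice, or remove the backend line and use plt.show() when working interactively.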
The result looks as follows:
As you can see, Verstappen was really close during laps 4, 5 and 6. Their lap times were almost identical, meaning he was really pushing to get close and probably had some overspeed. However, as we all know, the closer you get, the more difficult it is to follow. If you are chasing another driver, you suffer from dirty air and your tires will start overheating.
Lap 4 telemetry
Since Verstappen was closest during lap 4, let’s analyse the telemetry of that lap. To provide some context, we will also include the distance between Verstappen and Ricciardo across laps 3, 4, 5 and 6.
So, to begin, we select the telemetry from both Ricciardo and Verstappen of lap 4 (and we use the previously created variable RaceLapNumber). As before, we run get_car_data() and add_distance() to include the telemetry and the distance that has been driven during the lap. Next to that, we select the distances between the drivers from lap 3 until 6 from the variable full_distance_ver_ric. This variable contains all the distances at any given moment within a lap (play around with this yourself to fully understand what the data looks like).
After that, we can start plotting. This plot is a bit more advanced, but also quite similar to the one from the previous tutorial.
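Selecting laps 3 until 6 could be done as in the sketch below. The contents of full_distance_ver_ric are made up here, and the column name Lap and the variable name distance_lap3_to_6 are assumptions for this example.

```python
import pandas as pd

# Made-up stand-in for the DataFrame holding all distances per lap
full_distance_ver_ric = pd.DataFrame({
    "Lap":                   [2, 3, 4, 5, 6, 7],
    "Distance":              [100.0, 100.0, 100.0, 100.0, 100.0, 100.0],
    "DistanceToDriverAhead": [40.0, 30.0, 12.0, 15.0, 18.0, 35.0],
})

# between() includes both end points, so this keeps laps 3, 4, 5 and 6
in_range = full_distance_ver_ric["Lap"].between(3, 6)
distance_lap3_to_6 = full_distance_ver_ric.loc[in_range]

# One DataFrame per lap makes it easy to draw one line per lap later on
distance_by_lap = {lap: group for lap, group in distance_lap3_to_6.groupby("Lap")}

print(distance_lap3_to_6)
```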
The first subplot shows us the distance between the drivers during lap 3–6, with the main focus on lap 4 where Verstappen was closest. Then, we show all the telemetry of lap 4, where we can see what actually causes the distance between Verstappen and Ricciardo to increase and decrease.
So, now things are becoming interesting. First of all, we can see in the first subplot that Verstappen was really close for a few laps, and actually very close to overtaking Ricciardo into T1 and T3. If you compare the distance during lap 3 with the distance during lap 4, you can see that Verstappen left much more room to Ricciardo during lap 3 in, for example, T2 (Curva Grande). This shows that Verstappen was really pushing during laps 4, 5 and 6.
When analyzing the telemetry of lap 4, we can see a few interesting things happen. All of the circles indicate that Verstappen was really struggling in Ricciardo’s dirty air, forcing him to make corrections. For example, when exiting T1, Verstappen had to correct his throttle application and apply the brakes for a brief moment, probably leaving a few tenths on the table. Also, towards the end of the lap, Verstappen had to brake earlier than Ricciardo, and he also had to brake for a longer period of time than Ricciardo. It is also clearly visible from the throttle input that Verstappen struggled when going through the high-speed corner Ascari.
Conclusions
Let’s go back to the question that started this article.
“Would Ricciardo Have Beaten Verstappen on Merit During the Italian GP?”
I would say: yes. When looking at the data we just analyzed, it was really hard to follow and stay close to Ricciardo. Apparently, Verstappen was really pushing at some points during the opening stint. The McLaren’s Mercedes engine, however, was just too strong for the Red Bull to pass on the straights. I guess Verstappen had a bit more overspeed and would have pulled away if he managed to pass Ricciardo, but the nature of the Monza circuit made it just really difficult to get by.
However… I am not sure. What if Verstappen’s pitstop had been really fast? What if the Red Bull was stronger on the Hards than on the Mediums? What if Hamilton had joined the fight for the race win? We don’t know. All I can say is that the McLarens looked really strong, and I think that even without Verstappen’s crash they would have had a good chance of getting the race win.
Thanks for paying attention to this tutorial. Feel free to ask any questions you have and to leave feedback if you have any! Also, if you appreciate this post, please show some love on Medium ❤️