Brian Rock

Summary

The article compares the Boston, NYC, and Chicago Marathons in terms of runner speed, age demographics, and weather conditions, revealing insights into the fastest runners, the impact of age, and the influence of weather on race outcomes.

Abstract

The comparative analysis of the Boston, NYC, and Chicago Marathons indicates that Boston consistently has the fastest runners, particularly evident at the front of the pack. The data, spanning from 1996 to 2022, shows that Boston's strict qualifying times contribute to a faster overall field. Additionally, the article highlights a trend towards older runners across all marathons, with Boston having the oldest field, especially among men. Weather analysis reveals that Chicago experiences the most extreme weather variations, with a higher incidence of hot race days compared to Boston and NYC, which generally have cooler conditions. The article concludes that weather significantly affects finishing times, with hotter temperatures leading to slower races.

Opinions

  • The author suggests that Boston's qualifying standards are a key factor in the race having the fastest runners.
  • There is an opinion that the fields at NYC and Chicago are more diverse in terms of runner ability due to a larger proportion of non-qualifier entrants.
  • The article implies that the shift towards older runners is a significant trend in marathon running.
  • The author expresses surprise at the severity of weather conditions in Chicago, given its reputation for favorable early fall weather.
  • The analysis indicates that the author believes that weather plays a crucial role in marathon performance, potentially affecting the outcomes more than previously acknowledged.
  • The author hints at the possibility of Boston Marathon qualifying times being revisited due to the high number of applicants and the trend of faster finishing times.
  • There is a suggestion that masters runners' qualifying standards might be softer, contributing to the increased participation of older runners in marathons.
  • The author is curious about the impact of long-term temperature changes on both training and racing conditions for marathon runners.

Comparing the American Marathon Majors: Boston, NYC, and Chicago

Which race has the fastest runners? The oldest runners? The worst weather? Let’s find out.

Photo by Miguel A Amutio on Unsplash

I’ve been working on a series of articles exploring data on finishers and finishing times at the large marathons in the United States. So far, I’ve looked individually at the Boston, NYC, and Chicago marathons.

Today, we’re going to take a step back and see how they stack up against each other.

For some more context about what I’ve been looking at and why I started this project, check out the introductory post here. It also has links to the other stories in the series.

For our purposes today, you don’t need to know that much about the context, although two previous findings will be relevant. First, all of these races are seeing greater participation from older runners. Second, all of these races have had warm (or hot) days that caused outliers in the finishing times.

Note: This article is part of a larger series that is behind Medium’s paywall. If you’re not a member of Medium, you can use this form to get access to this series. I’ll e-mail you a special link to one article each week.

Today, we’re going to focus on three main questions. Of the three American Marathon Majors…

  1. Which of them has the fastest runners?
  2. Which of them has the oldest runners?
  3. Which of them has the worst weather?

Data Sources and Methodology

Before we jump into the analysis, let me briefly explain where the data comes from and some of the techniques and terms I’ll use in this analysis.

Where the Race Data Came From

My dataset includes the age, gender, finishing time, and overall place for each runner.

For the Boston Marathon, I started with the data from the Boston Marathon Data Project. It only went up to 2019, so I downloaded more recent results directly from the BAA website. There was an issue with the file for the 2003 race, and many runners were missing their ages. I scraped the results for that race from Marathon Guide.

Neither the New York City Marathon nor the Chicago Marathon had results that could be exported and downloaded as a CSV file. So I scraped the results directly from the NYRR website and the Chicago Marathon website.

The data from Chicago begins in 1996 and so far only Boston has been run in 2023. So I’m limiting the analysis for today’s article to the years 1996 to 2022. Note that the NYC marathon was not run in 2012 due to Superstorm Sandy, and none of the races were run in 2020 due to the COVID-19 pandemic.
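As a quick sanity check on that time window, here is a minimal Python sketch of the race years described above. The names `ALL_YEARS`, `CANCELLED`, and `race_years` are made up for illustration and are not taken from the original analysis code.

```python
# Years covered by the comparison, per the text: 1996 through 2022.
ALL_YEARS = range(1996, 2023)

# Years with no race to analyze, as described in the article.
# (Hypothetical structure, used only for this illustration.)
CANCELLED = {
    "Boston": {2020},
    "Chicago": {2020},
    "NYC": {2012, 2020},
}


def race_years(race):
    """Return the years in which `race` has results in the 1996-2022 window."""
    return [year for year in ALL_YEARS if year not in CANCELLED[race]]


counts = {race: len(race_years(race)) for race in CANCELLED}
print(counts)  # {'Boston': 26, 'Chicago': 26, 'NYC': 25}
```

So Boston and Chicago each contribute 26 race days to the comparison, and New York contributes 25.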

I used Python to read the data into a Pandas DataFrame, and then I used Plotly to create the visuals you’ll see throughout the article.

In order to make apples-to-apples comparisons, I’m not focusing on particular placements in a race. Instead, I’m focusing on runners at given percentiles. For example, the runner at the 98th percentile finished ahead of 98% of runners — in other words, they finished in the top 2%. We’ll look at the runner at the 98th (top 2%), 90th (top 10%), and 75th percentiles (top 25%).
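To make the percentile idea concrete, here is a rough sketch in plain Python. It is not the code behind the charts (those were built with Pandas and Plotly). The helper names and the sample field are invented, and I am assuming that "the runner at the 98th percentile" means the finisher whose place is the top 2% of the field, rounded up.

```python
def to_seconds(hms):
    """Convert an 'H:MM:SS' finishing time to total seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hms(total_seconds):
    """Convert total seconds back to an 'H:MM:SS' string."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"


def time_at_percentile(times, pct):
    """Finishing time of the runner who beat roughly `pct` percent of the field.

    Assumption: the place is ceil(n * (100 - pct) / 100), computed with
    integer arithmetic to avoid floating-point rounding surprises.
    """
    ordered = sorted(times)
    n = len(ordered)
    place = max(1, -(-n * (100 - pct) // 100))
    return ordered[place - 1]


# Made-up field: 100 finishers, one per minute starting at 2:30:00.
sample_field = [to_seconds("2:30:00") + 60 * i for i in range(100)]

for pct in (98, 90, 75):
    print(pct, to_hms(time_at_percentile(sample_field, pct)))
# 98 2:31:00
# 90 2:39:00
# 75 2:54:00
```

Using percentiles rather than fixed places means a race with 10,000 finishers and a race with 50,000 finishers can be compared at the same relative position in the field.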

Where the Weather Data Came From

For the weather data, I used data from the National Weather Service. I looked up the historical data for each race date and found the recorded high temperature for that date. The measurements were taken at the nearest weather station to the race finish — Central Park (NYC), Midway Airport (Chicago), and Logan Airport (Boston).

For a precise analysis of the weather, it would be important to consider some additional factors: the actual temperature at the start and finish, the humidity, the wind speed, and the cloud cover, among other things. But for simplicity’s sake, I’m using the daily high temperature (in degrees Fahrenheit) to determine which race days were hot (80+), warm (70s), or normal (60s or below).

An actual runner may have finished the race before the temperature reached the daily high, but on a day that reaches 88 degrees…it’s hot.
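The three buckets above are simple enough to write down as a small function. This is only a sketch of the rule as stated; the function name `classify_race_day` is made up, and temperatures are assumed to be daily highs in degrees Fahrenheit.

```python
def classify_race_day(high_f):
    """Bucket a race day by its daily high temperature (degrees Fahrenheit).

    Thresholds follow the article's rule of thumb:
    hot = 80 and up, warm = 70s, normal = 60s or below.
    """
    if high_f >= 80:
        return "hot"
    if high_f >= 70:
        return "warm"
    return "normal"


for high in (55, 68, 72, 88):
    print(high, classify_race_day(high))
# 55 normal
# 68 normal
# 72 warm
# 88 hot
```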

Image by modi74 from Pixabay

Which Race Has the Fastest Runners?

With that out of the way, let’s get down to the analysis, starting with the first question. Are the runners faster at Boston, Chicago, or New York?

All three races are fast races. As Marathon Majors, they attract world-class elite talent as well as large fields of sub-elite and serious amateur runners. In most cases, the winners are finishing in less than 2:10 for the men and 2:25 for the women.

Each race also uses a time qualifier as a form of entry. For Boston, this is the means of entry for the majority of runners. For NYC and Chicago, though, a small fraction of the field makes its way in through time qualifiers and a majority of runners enter through a lottery or through a charity partner.

That being said, there may be a difference between the front of the pack (the 98th percentile) and those runners further back (90th and 75th percentiles). For NYC and Chicago, it’s quite possible that the runners at the 90th and 75th percentiles will not be time qualifiers — and will be slower. For Boston, those runners will still have had to meet qualifying times for entry.

It’s worth keeping in mind that Chicago is traditionally viewed as the “faster” course — due to its elevation profile. Boston is a mixed bag, with its net downhill course and its later series of punishing hills. And NYC is typically considered a challenging course with rolling hills.

Finally, it’s worth pointing out that a few years saw huge outliers in specific races due to warm weather: Chicago in 2007 and 2008; NYC in 2003, 2004, 2005, and 2022; and Boston in 2004 and 2012. Boston was also slower than usual in 1996 due to the surge in field size for its 100th anniversary.

Comparing Finishing Times Between the Three Races

The chart above shows the finishing time for the male finisher at the 98th percentile — the man who finished in the top 2%. Each color represents a different race.

While there’s not a huge distance between them, Boston seems to have faster finishers than the other two. Other than the two outlier years (2004 and 2012), the other races hardly came close. The other two races are closer — although overall the men at Chicago are a little faster than the men in NYC.

At the front of the field, I’m a little surprised that Boston is so much faster. The qualifying standard for NYC is much tougher, and Chicago is generally seen as a more favorable course. But neither seems to make a difference to the men finishing towards the front of the pack at Boston.

The chart above is the same as the previous one, but it focuses on the women finishing at the 98th percentile. The pattern is similar.

Again, Boston is clearly faster than the other races — with the exception of the two warm years (2004, 2012). Chicago and New York are more of a mixed bag, here. There are years where Chicago is faster, and others where New York is. Weather explains some of that (Chicago, 2007; New York, 2003–05), but probably not all of it.

With this chart, we’re moving back in the pack a little to the finisher at the 90th percentile. This is the point where the fields diverge — and the runners at NYC and Chicago may well have earned entry through the lottery or a charity partner.

And sure enough, the graphs have gotten further apart. 2004 and 2012 are the only years when it’s even close. Most of the other years, the gap is 10 to 20 minutes.

At the top of the graph, things are also starting to sort out a little bit. 2007, 2008, 2010, 2011, and 2021 were all hot years in Chicago — and those are the only years where the runners at New York finished faster (or close to it in 2021). There’s not a huge gap, but Chicago is on average a bit faster. It’s quite possible that the course is making a difference here — and that the slower runners are more impacted by the elevation changes.

Here, we’ve got the woman finishing at the 90th percentile. And the story is pretty much the same.

There was an even greater increase in the number of women running at Boston in 1996 than men, which explains why that year was slower here than above. Again, other than the two hot years, it’s not even close. In recent years, the margin is 20 minutes or more between Boston and Chicago.

And again, Chicago is a little more clearly faster than New York here. In most years, it’s not a huge margin, but on average the women at Chicago are finishing earlier than the women in New York.

Finally, we’ll take a step further back in the field to the 75th percentile. For Chicago and New York, these are now well behind the time qualifiers. Again, that’s not the case for Boston.

The gap here has widened significantly. In more recent years, the finishers at Boston are 20 to 30 minutes faster than the finishers at Chicago. At the 90th percentile, that gap was closer to 15 to 20 minutes.

Not much has changed between New York and Chicago, though. Both lines moved up — slowing by 20 to 30 minutes compared to the fast group — but they moved more or less in tandem. The general comparison between them hasn’t changed much.

And again, the women tell the same story.

In some cases, the women at Boston are finishing more than 30 minutes faster than comparable runners at New York and Chicago. Even in 2004, when it was over 80 degrees at the start in Hopkinton, these women did significantly better than those in New York and Chicago.

So what can we conclude?

If the question is simply, “Which race has the fastest runners?” I think the answer is Boston, hands down.

The actual winners are often a little bit faster at Chicago, but by the time you get back to the finisher at the 98th percentile, things tilt towards Boston.

That being said, it’s closer at the front of the pack and things move further apart as you move back. And this supports another general conclusion that the vast majority of runners at Boston are faster than average — due to the strict qualifying standards — while New York and Chicago attract a smaller field of fast runners and a larger field of more average runners.

I suspect if you looked at the median, things would get even further apart. But I think we’ve seen enough of these graphs.

Let’s move on to another topic: age.

Photo by Vincent Peters on Pexels

Which Race Has Shifted the Hardest Towards Masters Runners?

When I first started looking at the data, I was interested in finishing times. It wasn’t until I’d had some discussions about the Boston Marathon that it occurred to me that the age distribution of runners could be changing — and that could in turn be impacting the average finishing time.

As it turns out, there is a consistent trend across the board towards more older runners and fewer younger runners. Whether you look at men or women, at Boston, New York, or Chicago, the shift is clear.

But which race has shifted the hardest — and which one has the oldest field? Let’s take a look.

A quick note on data here. We’re only looking at 2000 to 2022 because I don’t have age data for Boston from 1996 to 1999.

This first chart shows the percentage of the men’s field made up of younger runners — those under 40 — in each race. Across the board, you can see a general decline.
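For anyone who wants to reproduce this kind of breakdown, here is a minimal sketch of how the share of a field in each age group could be computed: the under-40 group shown here, plus the 40s and 50-and-over groups that come up later. The function name and the sample ages are invented for illustration; the real charts compute this per race, year, and gender.

```python
from collections import Counter


def age_group(age):
    """Assign a runner to one of the three age groups used in this article."""
    if age < 40:
        return "under 40"
    if age < 50:
        return "40s"
    return "50 and over"


def age_group_shares(ages):
    """Return each age group's share of the field as a fraction of finishers."""
    counts = Counter(age_group(age) for age in ages)
    total = len(ages)
    return {group: counts[group] / total
            for group in ("under 40", "40s", "50 and over")}


# Made-up field of ten finishers.
sample_ages = [25, 33, 38, 41, 45, 47, 49, 52, 58, 63]
shares = age_group_shares(sample_ages)
print(shares)  # {'under 40': 0.3, '40s': 0.4, '50 and over': 0.3}
```

Because the three shares always add up to 100%, a drop in one group has to show up as growth in one or both of the others.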

However, the trend is a little bit different depending on which race you look at. There are two things to note here.

First, Chicago starts out with a much larger proportion of runners under 40 — and that number continues to drop consistently throughout the time period. In Boston and New York, the shift more or less comes to an end between 2005 and 2010.

Second, there’s a divergence in 2021. In this year, all three races had much smaller fields due to COVID. Interestingly, Boston skewed older, with a significant drop in the proportion of runners under 40, while NYC and Chicago skewed younger, with a significant increase in the proportion of runners under 40.

Overall, Boston consistently has the smallest share of its men’s field under 40 — suggesting it has the oldest overall field of men.

What’s happening on the women’s side?

The overall trend is the same. We know that the fields are shifting towards older runners — and all three races show a significant decline in the proportion of runners under 40.

Here, though, the shift continues for longer than it did with the men. It appears to start flattening out around 2010 for Boston and New York, but it never completely flatlines.

The same divergence appears in 2021 — with Boston skewing much older and Chicago and NYC skewing slightly younger.

It’s also interesting to note that overall, the women’s field is much younger than the men’s field. At the beginning of the time period, a whopping 75% of the field at Chicago was under 40. But even after it decreased, all three races seem to be converging around 45 to 50% — compared to 35 to 45% for the men.

There’s one other outlier here — Boston 2013. This was the year of the bombing, and the race was called early. Some runners were diverted and not included in the finisher data — and that group probably included more women and more older runners. So that would explain why the women’s field of finishers bumped up a little that year — with a greater proportion of finishers under 40.

Again, Boston seems to have the oldest field. However, it’s much closer to New York on the women’s side than it is on the men’s side.

What happens if you look at runners in their 40s?

In the men’s field, the three races started at three different places. In 2000, Chicago had the smallest proportion of men in their 40s — around 27%. At New York City, it was 31%, and at Boston, it was around 37%.

From 2000 to 2015, the trends are very different, but all three races converge on roughly the same proportion around 2017. There is a slight downward shift in the last few years, but all three races declined together. They seem to have settled at around 30%.

On the women’s side, all three races again seem to be converging around the same point for runners in their 40s.

Chicago starts out with a much smaller group of women in their 40s — less than 20%. But it increased fairly rapidly, and by 2017 the races were pretty close. Boston and NYC both increased slightly, but there’s not a huge shift from 2000 to 2022.

At the end of the time period, about 30% of the women’s field is women in their 40s — similar to the men.

The final piece of the equation is the true masters runners — those in their 50s or above. If one part of the field is shrinking, it only stands to reason that the other part must be growing.

And on the men’s side, all three races see a steady increase in runners that are 50 or older. For a brief moment (2000 to 2002), New York’s field skews slightly older than Boston’s, but that changes in 2003. That year, the proportion of Boston’s field over 50 jumps 5%, and it continues to grow.

Coincidentally, this is also the year that Boston significantly changed the qualifying standards for masters runners. Qualifying times stayed the same for runners under 45, but everyone else had their qualifying time increased (that is, relaxed), and new categories were created for runners who were 75–79 or 80 and over.

Although all three races see an increase in the proportion of runners over 50, it seems clear that the change in Boston’s qualifying standards played a role in causing a different shift there.

Finally, what’s going on with the women?

In some ways it’s similar. New York and Chicago both increase, although the shift isn’t that pronounced. The change at Chicago is modest early on, but it jumps a bit after 2015.

What’s a little different here is Boston. In 2000, the overall picture looks very different than for the men. Just over 5% of women are over 50, compared to just under 20% for the men. As a result, although Boston shifts at a greater pace than the other two races, it’s not far out of sync in terms of the overall percentage of the field that is 50 or over.

Note that there is, again, a significant bump in 2003 in Boston. Between that, and the fact that the increase is much larger at Boston than New York and Chicago, the qualifying standards must have something to do with what’s going on.

So what’s the bottom line?

All three races have tilted older, and all three have converged on a proportion of runners in their 40s — about 30%.

But there is a difference at the extremes — with Boston having significantly fewer runners under 40 and significantly more runners 50 and over. The men’s field also tends to be older than the women’s field, in all three races, so the men at Boston are the oldest group, overall.

This would be a good place to note that in 2023, over 100 men over 70 ran the Boston Marathon. And an 80-year-old guy finished in 4:12:55.

I can only hope I’m in that kind of shape in 30 to 40 years.

Photo by RUN 4 FFWPU on Pexels

Which Race Has the Worst Weather?

In looking through the race results for these three races, it was clear that weather had a role to play. In each case, there was at least one year where the weather was unseasonably warm — and the runners were uncharacteristically slow.

So how bad is it, and which of these three races has the worst weather?

New York City, which is run in early November, likely has the coolest overall weather. But even New York is not immune from heat and humidity, as we learned last year.

First, let’s take a look at the conditions on race day.

The graph above plots the maximum recorded daily temperature on race day for each race.

Chicago starts on the early side — with the first wave around 7:30 and the last wave around 8:30 — so the faster runners are likely off the course earlier in the day. But Boston and New York both start a little later in the day, and even runners in the first wave may still be on the course until the early afternoon.

We’ll just accept as a caveat that actual runners may not experience these exact temperatures — but you can still assume that on a day that the high will reach 80 degrees or hotter…it’s going to be hot.

One thing that should be apparent is that there is a ton of annual variation. Although there are some general trends, it’s not uncommon for temps to vary by 10 or 20 degrees between successive years.

If you look at New York City, it does appear to be the coolest race overall. In one race (2022), the temps got into the 70s. Other than that, there were three warm days (2003, 2004, and 2005) in the high 60s, and 2015 reached the mid-60s. In most of the other years, the temps were in the 50s, or at worst just over 60. If you run New York, chances are pretty good that you’ll see good weather, and last year really was an outlier.

Boston is a little bit worse. Two years were extremely hot (2004 and 2012), and 2017 was also a hot one. Five other races had temps in the mid to high 60s, which I would consider warm (especially for a late morning start). Out of the 26 years we’re looking at, 7 of them were unseasonably warm. That’s a little over a quarter — you’ve still got good odds of good weather, but you never know.

And then there’s Chicago. In my mind, I had always assumed the weather would be nice there on Lake Michigan in early fall. Boy, was I wrong. I remember how hot it was in 2021, but I thought that was an outlier.

But there were 7 years — over a quarter — that got into the 80s. Then there are another 4 years around 70 degrees. The five years in the mid-60s are starting to seem mild by comparison. Only 8 days — less than a third — had temperatures that I would consider “good” racing weather (in the 50s or below).

It’s also worth noting that Chicago is the only race that actually canceled the end of a race due to weather. In 2007, with temps soaring, they eventually called the race a few hours after the start. The headline in the New York Times the next day was, “Death, Havoc, and Heat Mar Chicago Race.” Yowza.

So if you’re worried about the weather — New York is your best bet, Boston has good odds, and Chicago is a real gamble.

And How Does That Weather Impact Finish Times?

But how bad is this weather, anyway? Is it just a little uncomfortable, or is it hot enough to have an impact on finish times?

Let’s take a look at a different graph to explore that question.

The chart above is a scatterplot showing the finishing times for the male runner at the 98th percentile (the front of the pack) plotted against the high temperature on race day. The color and shape of the markings indicate the race.

Three trends jump out at me here.

First, the races themselves are clustered around specific finishing times. And they look very similar to the graphs at the beginning of this article — showing that Boston had a much faster field, followed by Chicago, followed by New York.

Second, there’s a lot more red (Chicago) on the far right of the graph than any other race. This confirms what we saw above, that there were more race days at Chicago with hot or warm weather by far.

Third, hotter race days tend to have slower finishing times. The fastest finishing times at Chicago all occurred on race days that were 70 degrees or below. And the slowest times, for all three races, were towards the right. Even though New York had better weather overall, the five races clustered in the top right of the graph show that on warm days, runners at New York tend to finish five to ten minutes slower than on nice days.

The chart above is set up the same way, but it shows the data for the female runner at the 98th percentile. And the trends are the same.

Again, you can see the three races sorted fairly well based on finishing times into three clusters. There is some overlap between Chicago and New York, but Chicago is still on average faster. And Boston is far and away the fastest.

Again, you can see the slowest times clustered in the top right of the graph, and the fastest times clustered on the bottom left of the graph. Clearly, weather matters.

If we move back a little bit in the pack — the male at the 90th percentile — we see a similar graph. But here, the overlap between the races is starting to sort itself out.

On nice weather days, there is very little overlap between the three races. It’s only when the temps get into the 80s that the times at Boston slow enough to overlap with Chicago’s, and the times at Chicago slow enough to overlap with New York’s.

Finally, here’s the graph for the female runner at the 90th percentile. Again, the three races have sorted themselves out neatly into different clusters. NYC is slower than Chicago, which is slower than Boston.

There’s one significant outlier here — the lone blue dot all the way to the left and towards the middle of the graph. Despite being a cool day in Boston (the coldest in our time period), the finishing time was uncharacteristically high. I checked, and this was 1996 — the same year that saw a massive increase in the number of female finishers, and that difference in the field is resulting in a slower finishing time.

Otherwise, there’s a pretty clear positive relationship for each race cluster. As temperatures increase, so do finishing times.
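One way to put a number on that relationship is a simple least-squares slope: minutes of finishing time gained per extra degree of daily high temperature, fit separately for each race. The sketch below is not how the charts above were produced, and the data points are made up purely to show the calculation.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    return covariance / variance


# Hypothetical race days: daily high (degrees F) and the finishing time
# in minutes for the runner at a given percentile. Not real race data.
high_temps = [50, 60, 70, 80]
finish_minutes = [180, 182, 186, 192]

slope = least_squares_slope(high_temps, finish_minutes)
print(round(slope, 2))  # 0.4 (minutes slower per extra degree)
```

A positive slope for each race would match what the scatterplots suggest, although with only 25 or 26 race days per race, and with a few very hot years carrying much of the trend, any single number should be read loosely.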

Looking back through these graphs, I also think it’s interesting that New York has fairly mild weather overall — but it still has a temperature effect. The races in the 60s tend to be slower, and the fastest races were all when the temp was below 60. In Boston and Chicago, those moderate days in the 60s don’t seem to be affected much — but the warmer days in the 70s and 80s clearly are.

Bottom Line — What Have We Learned?

That was a lot of graphs and a lot of words. So if you’re still with me, let me restate some key findings and conclusions.

  1. Runners finish the fastest at Boston, followed by Chicago, followed by New York.
  2. As you move further back in the pack, the difference in finishing times between Boston and the other two races increases.
  3. The field at Boston skews more heavily towards older runners. New York and Chicago are similar in terms of age breakdown, but the field at Chicago is slightly younger.
  4. In all three races, a similar percentage of runners are in their 40s. The big differences are in the under 40 and 50 and over age groups.
  5. In all three races, the women’s field skews younger than the men’s field.
  6. Since 1996, Chicago has had the worst and most volatile weather. Boston had some warm and hot days, but far fewer than Chicago. New York typically had the coolest weather, with a couple minor exceptions.
  7. As temperatures increase, the finishing times for runners at each race increase as well — especially when the temperatures get into the 70s or 80s.
  8. When you plot each race's finishing times by race day weather, it supports the conclusion that finishers at Boston are faster than finishers at Chicago, who are in turn faster than finishers at New York.

What’s Next?

I hope that was an interesting detour back through the data for Boston, New York, and Chicago. I know I learned a few things that I hadn’t otherwise expected.

But it’s time to forge ahead with my initial project — to investigate whether the fastest (non-elite) runners at American marathons are getting faster, slower, or staying about the same. And to answer that question, we’ll next be looking at data from the Philly, LA, and Marine Corps Marathons.

That being said, I do have a few other questions percolating in my head relating to what we just saw above.

First, with regard to speed at Boston. Last week, the BAA announced that a record 33,000 people applied to run Boston this year. This raises the question of whether the qualifying times will or should be lowered again. So I want to dig into the data a little bit to see what could be going on here.

Second, and on a related note, the qualifying times for masters runners could play a role in this. Some have argued that the standards are softer than for open runners, and the field has continued to shift towards older runners. So it would be interesting to look at just how soft those qualifying standards are for runners over 40 or 50.

Third, I want to take a look at how temperatures have changed in the last 40 years. To be clear, I’m not worried about whether global warming is or isn’t a thing (it is). But I wonder exactly how much impact that trend is having on summer training weather (which has seemed horrendous this year) and fall and spring racing weather. Is it really getting that much worse, or does it seem worse because we’re in the middle of it?

If any of that sounds interesting to you, make sure to follow me here on Medium. I will continue to dig into the data around running and marathons until I’ve run out of questions worth answering. You can also bookmark the original article in this series, to which I will continue to add links to each subsequent article.

And if there’s any particular question that you’ve been wondering about — let me know. I’m often inspired by responses, comments, and feedback to look at the data in new and different ways.

I’m an avid runner, and I recently ran the Erie Marathon. I finished in 3:09:47, which is a Boston qualifying time … but I highly doubt my 13-second buffer is going to count for much. You can read the race report on my blog, “Running with Rock.” You can also follow me on Strava.
