Brian Rock


Can We Use Percentiles to Compare Running Performances Between Age Groups?

A first look at an alternative to the current age grading system

Photo by Kampus Production on Pexels

How do you compare race results between runners of different age groups?

That’s the question I’ve been looking at in an ongoing series of articles.

Age affects individuals differently, but as you get older it’s inevitable that you eventually get slower. Lest masters runners be left out of the competitive side of running completely, this calls for a way to make apples-to-apples comparisons between the finish times of different runners.

The current system is called age grading. It begins with the assumption that there is a best possible time for a given age group. Then, it measures your individual performance against that best possible time to create a score.
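As a rough illustration, an age grade is commonly expressed as the best possible time divided by your own time, scaled to 100. The sketch below assumes that form; the helper names and the 2:10:00 "best possible" time are made up for the example and are not taken from any official table.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def age_grade(finish: str, best_possible: str) -> float:
    """Score a finish time against the assumed best possible time for the group."""
    return 100 * to_seconds(best_possible) / to_seconds(finish)


# Hypothetical standard of 2:10:00, used only to show the arithmetic.
score = age_grade("2:23:00", "2:10:00")
```

Under this form, matching the best possible time scores exactly 100, and slower times score proportionally lower.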

The system is useful, and it’s certainly better than nothing. But there’s a challenge in determining what that best possible time actually is — and the result may be a system that advantages certain age groups over others.

You can read more about age grading and my initial thoughts in the first article in the series.

Today, I want to focus on a potential alternative: grading performances with percentiles.

What Are Percentiles and How Do They Apply Here?

A percentile is a measurement of where a given number sits in a larger set of numbers.

You may remember these from when you were young and you took a standardized test in school. Instead of an actual grade, your results would say something like, “You scored ahead of 95% of other students your age.”

In this case, let’s say we take 100 runners. We record their finish times and then we put them in order.

The winner would be the 100th percentile — the best time and the one that is better than or equal to 100% of all times.

As you move further back in the field, you get to the 99th, then the 98th, then the 97th percentile … you get the idea.

Now instead of 100 runners, imagine you had a set of 100,000 runners. Similarly, you can line up all of their results and look at what time is better than 99% of other times. That would give you a benchmark against which to measure future performances.
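To make the definition concrete, here is a minimal sketch of a percentile score as the share of results that a time is better than or equal to. The function name is my own, and counting ties in the runner's favor is an assumption based on the "better than or equal to" wording.

```python
def percentile_rank(time, all_times):
    """Percent of results in all_times that `time` is better than or equal to.

    Lower times are better, so a result is beaten or tied when it is
    at least as slow as `time`.
    """
    beaten_or_tied = sum(1 for other in all_times if time <= other)
    return 100 * beaten_or_tied / len(all_times)


# 100 runners with distinct finish times (in minutes, purely illustrative).
times = list(range(180, 280))
winner_score = percentile_rank(180, times)      # the fastest time
runner_up_score = percentile_rank(181, times)   # the second-fastest time
```

With 100 distinct times, the winner lands at 100 and each place further back drops the score by one point.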

If you divided all of the runners into groups by age and gender, and each group had a sufficient number of runners, you should be able to determine a set of benchmarks for any type of runner.

Whether you’re a 35-year-old man or a 65-year-old woman, finishing ahead of 99% of your peers is a sign of excellence.

As long as we can compile a set of results for each age group that is large enough to be competitive and representative, we should be able to calculate a percentile score for any given finish time and use that to compare results across age groups.

Gathering the Data and Calculating the Percentiles

To use this method, the first step is to collect a sufficiently large dataset to work with.

In a previous article, I laid out how I identified a sample to work with. The sample includes all marathons with over 500 finishers that took place in the United States in September, October, or November, from 2010 to 2019.
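The selection rule can be written as a simple filter. This is only a sketch of the criteria described above; the function name and its parameters are hypothetical and do not reflect how the race list was actually stored.

```python
from datetime import date


def in_sample(country: str, race_date: date, finishers: int) -> bool:
    """Apply the sample criteria: US marathons, fall months, 2010 to 2019, over 500 finishers."""
    return (
        country == "US"
        and 2010 <= race_date.year <= 2019
        and race_date.month in (9, 10, 11)   # September, October, November
        and finishers > 500
    )
```

A spring race, or a fall race with exactly 500 finishers, would be left out under this rule.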

Once I identified the races to include in the sample, I scraped the results from a variety of sources — including Marathon Guide, Athlinks, and some individual race websites.

In my last article, I explored the data in the sample. It includes a little over 2,000,000 individual finishes. For the younger age groups, there were well over 100,000 finishes. The older age groups were smaller, but most still had well over 10,000 finishes.

The group of runners over 80 was extremely small, and the results were out of line with the remainder of the sample, so I’m not including results for athletes beyond the 75–79 age group.
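Here is a minimal sketch of how a runner's age could be mapped to the groups used in this analysis. It assumes five-year bands from 35–39 through 75–79 and treats everyone under 35 as the open group; that cutoff and the function name are my assumptions for the example.

```python
def age_group(age: int):
    """Return the age group label, or None for ages left out of the tables."""
    if age >= 80:
        return None          # excluded: too few results to be reliable
    if age < 35:
        return "OPEN"        # assumed cutoff for the open category
    low = (age // 5) * 5     # start of the five-year band, e.g. 44 -> 40
    return f"{low}-{low + 4}"
```

For example, a 44-year-old lands in "40-44" and a 79-year-old in "75-79".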

Once I collected the data, the final step was to create tables with an actual finish time that corresponds to each percentile. Using the Pandas package in Python, I was able to group the data by gender and age group, calculate each percentile, and save it to a CSV file.
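A stripped-down version of that step might look like the sketch below. The column names (`gender`, `age_group`, `seconds`) and the function name are placeholders I chose for illustration, and the interpolation is simply the pandas default. Because lower times are better, the finish time at percentile p is taken as the (1 - p/100) quantile of the group's times.

```python
import pandas as pd


def build_percentile_table(results: pd.DataFrame, percentiles) -> pd.DataFrame:
    """One row per (gender, age_group), one column per percentile.

    Each cell holds the finish time, in seconds, needed to reach that percentile.
    """
    grouped = results.groupby(["gender", "age_group"])["seconds"]
    # Percentile p means "better than p% of times", i.e. the low end of the times.
    return pd.DataFrame({p: grouped.quantile(1 - p / 100) for p in percentiles})
```

The resulting table can then be written out with `table.to_csv("percentiles.csv")`, where the file name is again just an example.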

Initially, I calculated this to the tenth of a percent — so 90.1%, 90.2%, etc. However, when I started to analyze actual results, I found that there were a number of runners who scored in the 99.9th percentile. So for those runners, I increased the precision to the hundredth of a percent and calculated percentiles from 99.91% to 99.99%.
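The two levels of precision can be combined into a single list of percentile levels. This sketch assumes the tenths run from 0.0 to 100.0 inclusive, which is my guess rather than something stated above; under that assumption the list comes to 1,010 levels per group.

```python
def percentile_levels():
    """Tenths from 0.0 to 100.0, plus hundredths from 99.91 to 99.99."""
    tenths = [i / 10 for i in range(0, 1001)]            # 0.0, 0.1, ..., 100.0
    hundredths = [(9990 + i) / 100 for i in range(1, 10)]  # 99.91, ..., 99.99
    return sorted(tenths + hundredths)


levels = percentile_levels()
```

Building each level from an integer numerator avoids the drift that creeps in when 0.1 is added repeatedly.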

The table below includes a small sample of the results. The full table includes 1,010 values for each gender and age group.

Group  | 99.99 | 99.9  |  99   |  98   |  95   |  90   |  80
---------------------------------------------------------------
M OPEN  2:07:43 2:16:35 2:36:15 2:45:07 2:58:09 3:11:04 3:30:27
M 35-39 2:13:45 2:25:57 2:45:59 2:53:14 3:03:27 3:15:00 3:32:49
M 40-44 2:22:55 2:32:52 2:50:49 2:57:08 3:07:45 3:18:43 3:35:18
M 45-49 2:29:33 2:39:48 2:55:41 3:00:34 3:12:28 3:22:47 3:38:24
M 50-54 2:35:55 2:45:09 2:59:32 3:07:00 3:18:17 3:28:03 3:43:36
M 55-59 2:43:42 2:52:14 3:08:13 3:15:41 3:26:47 3:36:38 3:52:06
M 60-64 2:47:31 2:58:41 3:17:35 3:24:42 3:36:30 3:47:34 4:02:21
M 65-69 2:56:09 3:09:02 3:28:17 3:35:54 3:48:45 3:59:25 4:17:31
M 70-74 3:06:17 3:17:53 3:38:52 3:48:26 4:02:24 4:16:31 4:39:10
M 75-79 3:08:41 3:24:47 3:42:40 3:54:16 4:23:29 4:40:36 5:06:34
W OPEN  2:25:32 2:40:04 3:05:07 3:14:51 3:28:24 3:40:48 3:57:54
W 35-39 2:27:52 2:46:42 3:09:42 3:18:48 3:31:15 3:42:27 3:59:03
W 40-44 2:35:25 2:52:22 3:15:30 3:23:32 3:35:19 3:46:01 4:01:57
W 45-49 2:48:48 3:01:55 3:21:17 3:28:18 3:39:41 3:50:38 4:06:16
W 50-54 2:53:17 3:07:27 3:28:29 3:35:08 3:46:50 3:57:11 4:14:08
W 55-59 2:56:25 3:12:56 3:34:21 3:42:02 3:54:01 4:04:19 4:22:01
W 60-64 3:12:17 3:21:37 3:41:47 3:50:05 4:03:00 4:16:19 4:34:18
W 65-69 3:25:11 3:31:03 3:53:34 4:02:48 4:17:32 4:32:05 4:53:04
W 70-74 3:29:27 3:41:25 4:05:06 4:13:54 4:37:53 4:51:36 5:18:49
W 75-79 4:16:38 4:16:38 4:18:30 4:28:05 4:45:29 5:10:00 5:39:32

Using this table, I was able to assign a score to each individual result in the sample set.
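Scoring a result comes down to finding the highest percentile whose benchmark time the runner matched or beat. The sketch below uses only the seven columns shown above for men 40–44, so it is coarser than the full table; the function names and the choice to count an exact match as reaching the percentile are assumptions.

```python
def to_seconds(hms: str) -> int:
    """Convert an h:mm:ss string into a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# (percentile, benchmark time) pairs for M 40-44, copied from the sample table.
M_40_44 = [
    (99.99, "2:22:55"), (99.9, "2:32:52"), (99.0, "2:50:49"), (98.0, "2:57:08"),
    (95.0, "3:07:45"), (90.0, "3:18:43"), (80.0, "3:35:18"),
]


def percentile_score(finish: str, benchmarks):
    """Highest percentile whose benchmark is at least as slow as the finish time."""
    finish_s = to_seconds(finish)
    for percentile, cutoff in sorted(benchmarks, reverse=True):
        if finish_s <= to_seconds(cutoff):
            return percentile
    return None  # slower than every benchmark in this excerpt


score = percentile_score("02:23:00", M_40_44)
```

With this excerpt, a 2:23:00 by a man in the 40–44 group misses the 99.99 benchmark by five seconds and scores 99.9. The full table, with its extra columns between 99.9 and 99.99, can place the same time more precisely.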

For comparison purposes, I also used the age grading tables from 2020 to calculate an age grade for each performance. New age grade factors were released in 2023, but I’m using the 2020 tables because the data I’m looking at is from before 2020.

An Example of Percentile Grading In Action

So what does this look like with actual race results?

As an example, I applied the percentile tables to the results from the 2019 Columbus Marathon. Of the 3,594 finishers, here are the top ten by percentiles:

Gender    Age       Time       Percentile
     M    44      02:23:00       99.98
     F    39      02:35:50       99.97
     F    46      02:58:00       99.94
     F    28      02:34:29       99.94
     M    27      02:15:05       99.92
     M    30      02:15:05       99.92
     F    23      02:37:51       99.92
     M    35      02:24:54       99.91
     F    54      03:05:47       99.90
     M    55      02:52:04       99.90

Meanwhile, here are the top ten results if you relied on age grading.

Gender    Age       Time      Age Grade
     M    44      02:23:00      90.61
     M    27      02:15:05      90.06
     M    30      02:15:05      90.06
     M    29      02:18:22      87.92
     F    39      02:35:50      87.65
     F    28      02:34:29      86.78
     F    54      03:05:47      85.62
     F    23      02:37:51      84.93
     M    35      02:24:54      84.42
     M    52      02:45:29      83.86

The overall winner remained the same — a 44-year-old man who ran 2:23.

But the next three on the age graded results moved up from lower on the list. The #2 and #3 are men (27 and 30) who ran 2:15:05 — notching an age grade of 90.06, while #4 is a 29-year-old man who ran 2:18:22.

They leapfrogged ahead of a 39-year-old woman who ran 2:35:50 and a 28-year-old woman who ran 2:34:29. Meanwhile, the #3 on the percentile list is a 46-year-old woman who got knocked off the top ten list altogether.

For one other example, here are the results from the 2019 Twin Cities Marathon.

First, by percentile:

Gender  Age        Time      Percentile
     F   40      02:34:07       99.99
     M   54      02:38:38       99.97
     F   27      02:31:29       99.96
     M   31      02:12:23       99.95
     F   24      02:32:49       99.95
     F   37      02:40:24       99.95
     M   29      02:13:50       99.94
     M   45      02:36:44       99.94
     F   30      02:35:50       99.93
     F   33      02:36:34       99.93

Now, by age grade:

Gender  Age        Time      Age Grade
     M   31      02:12:23      91.89
     M   29      02:13:50      90.90
     M   29      02:15:55      89.50
     M   25      02:16:30      89.12
     F   40      02:34:07      89.12
     M   54      02:38:38      89.07
     F   27      02:31:29      88.50
     M   24      02:17:44      88.32
     F   24      02:32:49      87.73
     M   27      02:18:53      87.59

In the age graded results, the top four slots are taken up by young men. The #5 is a 40-year-old woman who finished in 2:34 — who got bumped down from #1 on the percentile results. She finished ahead of 99.99% of women in the 40–44 age group, but age grading only scores her 89.12.

The #2 on the percentile list — a 54-year-old man who ran 2:38:38 — got bumped down to #6 on the age graded list. He finished ahead of 99.97% of men 50–54, but age grading only scores him 89.07.

A few things to note here.

First, there’s a significant amount of overlap. That’s a good thing. The age graded tables may have some limitations, but they’re not horrible. So it would be surprising if a new system didn’t look similar.
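The overlap can be counted directly from the two Columbus top-ten lists above. This is a small sketch that identifies each runner by gender, age, and finish time, which is enough to tell them apart in these lists.

```python
# (gender, age, time) rows copied from the Columbus tables above.
by_percentile = [
    ("M", 44, "02:23:00"), ("F", 39, "02:35:50"), ("F", 46, "02:58:00"),
    ("F", 28, "02:34:29"), ("M", 27, "02:15:05"), ("M", 30, "02:15:05"),
    ("F", 23, "02:37:51"), ("M", 35, "02:24:54"), ("F", 54, "03:05:47"),
    ("M", 55, "02:52:04"),
]
by_age_grade = [
    ("M", 44, "02:23:00"), ("M", 27, "02:15:05"), ("M", 30, "02:15:05"),
    ("M", 29, "02:18:22"), ("F", 39, "02:35:50"), ("F", 28, "02:34:29"),
    ("F", 54, "03:05:47"), ("F", 23, "02:37:51"), ("M", 35, "02:24:54"),
    ("M", 52, "02:45:29"),
]

shared = set(by_percentile) & set(by_age_grade)
only_percentile = [r for r in by_percentile if r not in shared]
only_age_grade = [r for r in by_age_grade if r not in shared]

print(len(shared), "runners appear on both lists")
print("Percentile only:", only_percentile)
print("Age grade only: ", only_age_grade)
```

Eight of the ten runners appear on both lists. The percentile list adds a 46-year-old woman and a 55-year-old man, while the age graded list adds a 29-year-old man and a 52-year-old man.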

Second, the difference in terms of percentiles among the top athletes is very small. It’s definitely important to have tables calculated to the 100th of a percent to have enough precision to differentiate these runners. This also means that having a large sample is important.

Finally, in both cases, the age graded tables seemed to benefit men while the percentile tables seemed to benefit women. This may indicate that the age graded tables aren’t calibrated correctly — but it also may indicate that the percentile tables are swinging too far in the other direction.
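A quick tally of the four top-ten lists above shows the size of that shift. The strings below are just the Gender column of each table, read top to bottom.

```python
# Gender column of each top-ten table, in rank order.
top_ten_genders = {
    ("Columbus", "percentile"): "MFFFMMFMFM",
    ("Columbus", "age grade"): "MMMMFFFFMM",
    ("Twin Cities", "percentile"): "FMFMFFMMFF",
    ("Twin Cities", "age grade"): "MMMMFMFMFM",
}

women_in_top_ten = {key: genders.count("F") for key, genders in top_ten_genders.items()}

for (race, method), count in women_in_top_ten.items():
    print(f"{race:12s} {method:10s} women in top ten: {count}")
```

Women hold five of the ten Columbus spots by percentile and four by age grade. At Twin Cities the gap is wider: six by percentile and three by age grade.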

How Do Percentiles Compare to Age Grades?

Now that we can see the percentiles in action, how do these results compare to age graded results?

Let’s take a look at two individual races.

Analyzing the Results from Philly 2019

The results below represent the top finishers at the Philly Marathon in 2019. The orange dots are women and the green dots are men. Using the dropdown at the top, you can filter the results to only include particular age groups — but by default, it shows all runners.

From the 60th to the 95th percentile, notice how the graph increases very slowly. That large swath of runners — results spanning about 35% of the running population — only covers 20 age grade points. These runners — from slightly above average to serious — are differentiated much better by the percentile system than by age grading.

At the top end, however, the results cluster in the high 90s. The final 5% of results span age grades from 70 to 90. The tradeoff is that the percentile system differentiates more clearly in the middle of the pack, while the age grade system differentiates more clearly among the top runners.

But there’s something else interesting going on here. At the front of the pack, there’s a clear separation between the men’s results and the women’s. Notice how from 80 to 95%, the orange line of women droops below the green line of men.

In other words, a woman finishing at the 95th percentile earned a significantly lower age grade than a man who also finished in the 95th percentile.

If you filter through the age groups, you’ll see that this difference is clearest at the youngest age groups (under 35, 35–39, 40–44), disappears in the middle (50–54, 55–59), and reverses at the older end of things (60–64, 65–69).

This suggests to me that the current age grading system may undervalue performances by young women while overvaluing performances by older women.

Analyzing the Results from Marine Corps Marathon 2019

The visual below is similar to the previous one. It graphs the results from the 2019 Marine Corps Marathon along the same two axes.

However, it’s zoomed in to focus on the top athletes — finishing with a time in the 90th percentile or above.

If you hover over an individual dot, you can see the actual age of that athlete along with their finish time.

Again, there’s a clear separation between the green and the orange lines. Younger women, at similar percentiles, are receiving lower age grades.

But there is a smattering of orange dots situated well above the mass of green dots. These are women 60 and above.

One particular outlier represents a woman aged 79. She finished in 5:01 — which is in the 91st percentile of women 75–79 and earned her an age grade of 79. But other runners scoring around the 91st percentile got age grades between 60 and 70.

Now it’s possible that the sample of women 75–79 is skewed towards faster runners since it is so small.

But something else may be going on here. The age grade factors grant increasingly higher scores with every year — so a 79-year-old woman would get a much higher score than a 75-year-old woman running the same time.

One example in my dataset was a 75-year-old woman who ran 4:37:40, earning an age grade of 79.12. A 79-year-old woman ran slightly slower (4:42:50) and earned an age grade of 85.19. Those four years bumped up the score significantly, even though she ran five minutes slower.
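One way to see how much those four years are worth is to back out the age standard each score implies. This sketch assumes the usual age grading convention (age grade = age standard / finish time), so the implied standard is the finish time multiplied by the age grade. The function names are illustrative, and the results are derived from the two scores above rather than read from the 2020 tables.

```python
def to_seconds(hms: str) -> int:
    """Convert an 'H:MM:SS' finish time to seconds."""
    h, m, s = (int(part) for part in hms.split(":"))
    return h * 3600 + m * 60 + s

def implied_age_standard(time_s: float, age_grade_pct: float) -> float:
    """Age standard (seconds) implied by a finish time and its age grade."""
    return time_s * age_grade_pct / 100

std_75 = implied_age_standard(to_seconds("4:37:40"), 79.12)
std_79 = implied_age_standard(to_seconds("4:42:50"), 85.19)

print(f"Implied standard at 75: {std_75:.0f} s")
print(f"Implied standard at 79: {std_79:.0f} s")
print(f"Gap: {(std_79 - std_75) / 60:.1f} minutes, ratio {std_79 / std_75:.3f}")
```

Under that assumption, the standard for a 79-year-old woman is about 21 minutes slower than for a 75-year-old, a gap of roughly 10%.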

But is there actually a huge difference between 75-year-old and 79-year-old women?

I looked at my data to see who scored the best performances, by percentile, among this age group. Of the top 50 women, 18 were 75 years old and 14 were 79 years old. So it’s not the case that the age group is dominated by 75-year-old women — with 79-year-old women running significantly slower.

Are the Age Grading Tables Fair to Women?

In analyzing these results, it’s becoming clear to me that the 2020 age graded tables are miscalibrated when it comes to women.

The graphs above suggest that young women are under-scored and older women are over-scored.

But I was curious to look at the big picture and see — if we draw a cutoff point, what percentage of runners in each age and gender group exceed that cutoff?

I decided to draw the line at an age graded score of 70. Then, I calculated — for every result in 2019 in my sample set — what percentage of runners in each age and gender group exceed that age grade score.
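The calculation itself is a simple group-by. Here is a minimal sketch using only the standard library. The `share_above` helper is a hypothetical name, and the six rows are toy values made up to show the mechanics, not results from my dataset. It treats a score of exactly 70 as meeting the cutoff.

```python
from collections import defaultdict

def share_above(results, cutoff=70.0):
    """Percent of results in each (gender, age_group) with age grade >= cutoff."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for gender, age_group, grade in results:
        key = (gender, age_group)
        totals[key] += 1
        if grade >= cutoff:
            hits[key] += 1
    return {key: round(100 * hits[key] / totals[key], 2) for key in totals}

# Toy rows: (gender, age group, age grade). Illustrative only.
sample = [
    ("M", "55-59", 72.4), ("M", "55-59", 61.0),
    ("M", "55-59", 58.3), ("M", "55-59", 66.9),
    ("W", "40-44", 69.9), ("W", "40-44", 55.2),
]

rates = share_above(sample)
for group, pct in rates.items():
    print(group, pct)
```

In the toy data, one of four men 55-59 clears the cutoff (25.0%) and neither of the two women 40-44 does (0.0%).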

The green bars are men.

For the most part, it’s fairly consistent. The 55–59 age group has the highest rate (8%) of runners earning an age grade of 70+. The 70–74 age group has the lowest rate (5.17%). The rest hover in between.

The women are represented by the red bars.

Unlike the men, these bars are wildly inconsistent.

At the younger end (under 35 to 45–49), the rate is far lower than the men’s: around 2–3%.

For women 50–54, the rate is about 5%. So more or less in line with the men.

Beyond that, the older women exceed 70 age grade at much higher rates.

I thought maybe this was the result of a smaller sample among older women, producing unreliable results. But the number of women in the 55–59, 60–64, 65–69, and 70–74 age groups is roughly the same as the number of men in the next oldest age group.

So if you compare women 55–59 with men 60–64, the samples are a similar size, but the women receive much higher age grade scores. The same goes for women 60–64 and men 65–69, and so on.

Between the consistency among the men of different ages and the huge disparities among women, I think this is clear evidence that the 2020 age grade tables weren’t calibrated well when it comes to women.

Potential Drawbacks of Percentiles

While I think this approach has some potential, there are some limitations to consider. Some of these may be fixable, some are inherent to the system.

First, distinguishing between two athletes at the top of the list of performances requires precision to the hundredth of a percent. This requires at least 10,000 results in a sample, and some of the older age groups aren’t that big. It also may introduce some inconsistencies among the 99.91 to 99.99% in a small sample with some freak outliers.
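The 10,000 figure follows from simple arithmetic: if a percentile is a rank divided by the sample size, the smallest possible gap between two scores is 100 divided by the number of results. A short sketch, with `percentile_step` as an illustrative name:

```python
def percentile_step(n_results: int) -> float:
    """Smallest possible gap (in percentile points) between two distinct scores."""
    return 100 / n_results

for n in (500, 2_000, 10_000):
    print(f"{n:>6} results -> step of {percentile_step(n)} percentile points")
```

With 500 results, each place in the ranking is worth 0.2 percentile points, so two adjacent top finishers can never be closer than that. Only at 10,000 results does the step shrink to 0.01.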

To some extent this is unavoidable. But, one potential solution is to graph the table and fit it to a curve. This is essentially what they did with the age grade tables. They first looked at the best results at each age, and then they fit them to a curve to mathematically calculate the factors. A similar approach could allow you to interpolate results at any percentile and smooth out the outliers.
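As a first step in that direction, here is a sketch of interpolating between table entries. Note that this is plain piecewise-linear interpolation, not a fitted curve: it fills in the gaps between known percentiles but does nothing to smooth out outliers. The anchor points are hypothetical values chosen for easy arithmetic, not rows from my tables, and `interpolate_time` is an illustrative name.

```python
def interpolate_time(percentile, anchors):
    """Linearly interpolate a finish time (seconds) at a given percentile.

    anchors: list of (percentile, seconds) pairs, sorted by percentile.
    """
    for (p_lo, t_lo), (p_hi, t_hi) in zip(anchors, anchors[1:]):
        if p_lo <= percentile <= p_hi:
            frac = (percentile - p_lo) / (p_hi - p_lo)
            return t_lo + frac * (t_hi - t_lo)
    raise ValueError("percentile is outside the anchored range")

# Hypothetical anchors: (percentile, finish time in seconds).
anchors = [(50, 14400), (75, 12600), (90, 11400)]

print(interpolate_time(82.5, anchors))
```

Halfway between the 75th and 90th percentile anchors, the interpolated time is 12000.0 seconds (3:20:00). A real fit would replace the straight segments with a smooth function estimated from all the points at once.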

Second, the results for the open men and open women may be influenced too heavily by professional runners.

For example, in 2019 there were 32,408 men in the under 35 category. Between NYC, Chicago, and a few other major races, there were likely several dozen professional athletes in this group. For easy math, let’s say 32. That would mean the top 0.1% of athletes are elites, making it harder for young men (or women) to score 99.90% or higher.
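The arithmetic behind that estimate is short enough to check. The sketch below uses the 32,408 figure and the round number of 32 professionals from the paragraph above. It also makes a simplifying assumption that every professional finishes ahead of every amateur, and it treats a percentile as the share of the field a runner finishes ahead of.

```python
men_under_35 = 32_408
assumed_pros = 32  # round number from the text, not a measured count

# Share of the group made up of professionals.
pro_share_pct = 100 * assumed_pros / men_under_35

# Best case for the fastest amateur: only the pros are ahead of him.
runners_behind = men_under_35 - assumed_pros - 1
best_amateur_pct = 100 * runners_behind / men_under_35

print(f"Professionals: {pro_share_pct:.3f}% of the group")
print(f"Best possible amateur percentile: {best_amateur_pct:.2f}")
```

Professionals make up just under 0.1% of the group, and under these assumptions the fastest amateur tops out at roughly 99.90.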

This isn’t the case in older age groups. To be fair, though, the same problem impacts the current age grading tables.

Finally, I calculated these tables based on age groups — not individual ages. Potentially, this disadvantages athletes at the top end of an age group and advantages those at the bottom end.

I chose this approach to ensure there was a large enough sample. There are many more men in the 65–69 age group than men who are exactly 65 years old.

This could be another place where mathematical modeling could help fill in the gaps that exist in the data. But for now, I’m going to put a pin in it and ignore it.

What’s Next?

This has already gone on longer than I originally expected, so I’m going to pause here.

I’ve got a few ideas simmering for next steps.

Before I move on to the next approach, I think I want to take a look at mathematically modeling the tables and smoothing them out a bit. That will likely be the topic of the next article.

I also want to update the data and take a second look at this based on results from 2023. My previous research suggests that times at American marathons are getting faster, so these percentiles may already be outdated. But this would require additional data collection, so I’ll probably save this for later.

Then, of course, there’s my second idea for an alternative approach — calculating z-scores for each performance to see how far above or below the mean it is. If the mathematical modeling of the percentiles turns out to be too time-consuming, I may just move straight on to this.
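For anyone curious what that would look like, here is a minimal sketch of a z-score for a finish time. The sign is flipped so that faster than average is positive, which is a choice made here for readability. The five finish times are toy values, and a real version would need to deal with the fact that marathon times are not normally distributed.

```python
from statistics import mean, pstdev

def z_score(time_s, group_times_s):
    """Standard deviations faster (+) or slower (-) than the group mean."""
    return (mean(group_times_s) - time_s) / pstdev(group_times_s)

# Toy group of finish times in seconds (illustrative only).
group = [12000, 13200, 14400, 15600, 16800]

print(round(z_score(12000, group), 2))
print(round(z_score(14400, group), 2))
```

The fastest runner in the toy group lands about 1.41 standard deviations above the mean, and the runner at the mean scores 0.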

If you’re interested to see how this project turns out, make sure you subscribe for email updates. I’ll be sure to let you know when the next article is published.

And if you have any feedback or ideas that will help with this analysis — please leave a response. It always helps to have a second (or third or fourth) opinion!

Finally, I plan to share this dataset on Kaggle once I’m done with this series. So if you’re interested in doing your own analysis, be sure to follow along.

I’m an avid runner and a data nerd. I turn 40 this week, so comparing results across age groups is of particular interest to me.
