Aging Gracefully: Approaching Aging Curves and Advanced Stats- Part II
Fantasy baseball, generally, cares little for age beyond a minor concern over whether an older superstar is going to fall off his cliff this season or the next. Dynasty baseball, on the other hand, cares very much about age, be it how soon the hot young prospect will reach The Show or… ok, yes, we still care about whether old superstars will fall off the cliff. The point is, those who understand how age affects performance gain a significant advantage over their league-mates. This series aims to evaluate aging curves and how those curves relate to stats that can help us win our leagues.
I previously discussed the background of aging curve research and my approach. Now I’d like to apply that approach to some of the important stats of baseball analysis: weighted On-Base Average, isolated power, strikeout rate, batting average on balls in play, walk rate, and the triple slash (batting average, on-base percentage, and slugging percentage).
The New Methodology
In what follows, I present aging curves for eight component statistics based on the delta method with estimated missing season twos. Recall that estimated missing season twos are based on players who missed season two and returned in season three. All data comes from 1985 to 2018 unless otherwise noted. All data is league-adjusted and park-neutral. I only used players who played two successive seasons in the same park (most players do). The data is potentially slightly skewed by changes within the same park, but these changes are rare enough that they are quite unlikely to affect things much.
I followed past research norms, weighting each season pair by the harmonic mean of the plate appearances in the two seasons. The harmonic mean behaves much like using the lower plate appearance total of the two adjacent seasons (another common method in past research), but it is not quite so extreme. I found it did not make a substantive difference whether I used the harmonic mean or the lesser season’s plate appearances. I set the plate appearance minimum at 90 to minimize zero values (zeroes would have made the math a bit more tedious).
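To make the weighting concrete, here is a minimal Python sketch of harmonic-mean weighting for season pairs. The function names, the tuple layout, and the sample numbers are illustrative assumptions, not the code or data behind this article, and averaging season-two/season-one ratios is likewise an assumption about how the deltas are combined.

```python
def harmonic_mean(pa1, pa2):
    """Harmonic mean of two plate-appearance totals: 2ab / (a + b)."""
    if pa1 <= 0 or pa2 <= 0:
        return 0.0  # a season with no plate appearances carries no weight
    return 2.0 * pa1 * pa2 / (pa1 + pa2)


def weighted_ratio(season_pairs):
    """Weighted average of season-two / season-one ratios.

    season_pairs: iterable of (stat1, pa1, stat2, pa2) tuples (hypothetical layout).
    Each pair is weighted by the harmonic mean of its plate appearances.
    """
    num = 0.0
    den = 0.0
    for stat1, pa1, stat2, pa2 in season_pairs:
        weight = harmonic_mean(pa1, pa2)
        num += weight * (stat2 / stat1)
        den += weight
    return num / den


# Made-up example: a 600 PA season followed by a 200 PA season.
print(harmonic_mean(600, 200))  # 300.0
```

For that made-up pair, the weight of 300 sits between the lesser season's 200 plate appearances and the arithmetic mean of 400, which is why it acts like a softer version of the "lower of the two seasons" rule.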
The main results are below, league-adjusted, park-neutral, and corrected for selection bias. Percentage of previous season wOBA is displayed for each age. Numbers highlighted in green represent peak ages. For example, age 20 wOBA grows 11.3% higher than age 19 wOBA. The sample size is over 100 for each age from 22 to 40. From 24 to 35, the sample size is over 500 for each age. The sample size at age 19 is 11 players, and at age 20 it is 40 players. I went back to 1950 to expand the sample for 19-year-olds to 55 and the wOBA results were similar, within 1 percentage point (these findings will be discussed more in a follow-up article).
Main results with forecasted missing season twos to correct for selection bias
wOBA rapidly rises from 19 to 26 before gradually flattening out, reaching a peak at 29. Changes between 27 and 29 are very gradual. Starting at 30, wOBA steadily declines, first gradually, and then more steeply after 34. Past research typically finds peak age anywhere from 26 to 29—pretty similar to these findings where most significant development happens before 27.
Isolated power grows second most of any component (walks grow the most), and the gains are especially large, around 10 percent a year on average from 19 to 26. Isolated power keeps growing all the way until 34—a surprising finding, but other studies have also found power peaks later than other skills. John Charles Bradbury found that power (measured by home runs) peaks at 30. This is not too different from the results here, where isolated power only changes gradually after 31. Bradbury theorizes power is less reliant on athleticism relative to other skills, and more reliant on hand-eye coordination and strategic thinking at the plate. Perhaps players get better at shifting their launch angle, batted ball profile, and swing to maximize their power at older ages.
Walk-rate grows a lot over a player’s career and does not peak until 37. Not swinging does not require as much athleticism as other skills; it appears to be the ultimate old-player skill. Older players appear to partially compensate for the decline of other skills by becoming more patient. Becoming more patient only slows their decline, though; it does not prevent it. Part of the reason walk-rate grows so much at younger ages is that the starting point is very low: 19-year-olds in the sample walked around 6.5% of the time on average.
Skills requiring the most athleticism, like BABIP and, consequently, batting average and strikeout rate, peak the earliest at 26. This also aligns with past research. Sprint speed, swing speed, agility, and perhaps reaction time, each rely heavily on athleticism—these are the first skills to decline, driving down BABIP and batting average, and driving up strikeouts.
I will dive into the findings as we progress here, and present smoothed-over regression curves for each statistic to remove random fluctuations and better account for nonlinearity in aging (bigger gains at younger ages).
Weighted On-Base Average (wOBA)
I will now present graphs of aging curves smoothed over with regression (chained, smoothed-over aging curves, based on regression results from log-transformed variables, for my fellow nerds). I will present the career wOBA growth for an archetypal 19-year-old and 23-year-old (average performance for each age). These ages are chosen pretty arbitrarily—23 is a common debut age for a top 100 prospect, 19 is a common debut age for a future superstar.
A note: presenting smoothed-over, chained aging curves assumes player improvement is uniform, with the same percentage improvement at each age regardless of how good the player is. This is an incorrect assumption. Better players tend to improve less than average players, and worse players tend to improve more, mostly because of regression to the mean.
I will nonetheless present curves this way for a few reasons. First, presenting findings this way is intuitive, and initial evidence suggests players tend to adhere to general aging curves reasonably closely, with qualitatively small divergences except for extreme outliers. Second, regression removes random fluctuations stemming from smaller sample sizes, smoothing over the results. Third, a logarithmic regression model better captures nonlinearity in aging, a tendency for bigger gains at young ages and bigger declines at older ages.
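One way such log-based smoothing could work is sketched below in Python. It fits a straight line to the logarithm of the year-over-year ratios as a function of age, then exponentiates the fitted values to get smoothed ratios. This is only a sketch under stated assumptions: the linear form, the function names, and the synthetic ratios are illustrative, and they are not the regression specification actually used for the curves in this article.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def smooth_ratios(ages, raw_ratios):
    """Regress log(ratio) on age and return smoothed ratios keyed by age."""
    slope, intercept = fit_line(ages, [math.log(r) for r in raw_ratios])
    return {age: math.exp(intercept + slope * age) for age in ages}


# Synthetic ratios (NOT the article's data): gains shrink with age and turn
# into declines, with small alternating noise added on top.
ages = list(range(20, 40))
raw = [math.exp(0.09 - 0.01 * (a - 20)) * (1.01 if a % 2 else 0.99) for a in ages]
smoothed = smooth_ratios(ages, raw)
```

Working in logs keeps every smoothed ratio positive and lets the year-over-year change shrink steadily with age, which is the "bigger gains at young ages, bigger declines at older ages" shape described above.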
wOBA aging curves for the average 19- and 23-year-old are below. Players rapidly improve from 19 to 26, then gradually improve until 29, then decline until the end of their careers, first gradually and then more rapidly. Notice the average 19-year-old in the sample peaks with a .392 wOBA at age 28, 27% higher than his starting wOBA at age 19. This is slightly larger growth than we saw in the raw results earlier because the log-transformed variables capture larger gains at younger ages.
To repeat an important caveat, this is a general aging curve for the average player at each age in the sample. For example, the average 19-year-old player starts with a .308 wOBA and improves by 11 percent to a .342 wOBA in their age-20 season. The average 29-year-old starts with a .329 wOBA and declines by 1% in their age-30 season. Starting wOBAs for other ages fall somewhere in between .310 and .330, near the MLB average wOBA (.320 in 2018). Other statistics also generally start around league-average. The delta method crudely chains growth rates at different ages together to come up with an overall aging curve. It finds 19-year-olds peak 27% higher than their starting wOBA.
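The chaining step can be illustrated with a short Python sketch. Only the starting wOBA (.308), the 11 percent age-19-to-20 gain, and the .392 peak come from the text; the function names and the remaining year-over-year ratios in the example are placeholders chosen for illustration, not results from this study.

```python
def chain_curve(start_value, start_age, yearly_ratios):
    """Chain year-over-year ratios into a curve of levels keyed by age.

    yearly_ratios[i] is the ratio of the (start_age + i + 1) season to the
    (start_age + i) season.
    """
    curve = {start_age: start_value}
    value = start_value
    for offset, ratio in enumerate(yearly_ratios, start=1):
        value *= ratio
        curve[start_age + offset] = value
    return curve


def percent_growth(start, end):
    """Percentage change from start to end."""
    return (end / start - 1.0) * 100.0


# The first ratio (1.11) is from the text; the later ratios are placeholders.
curve = chain_curve(0.308, 19, [1.11, 1.05, 1.03])
print(round(curve[20], 3))                   # 0.342
print(round(percent_growth(0.308, 0.392)))   # 27
```

Because each level is the previous level times a ratio, any error in an early ratio carries through every later age, which is one reason the chained curve is best read as a general shape rather than a forecast.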
In reality, players with an above average wOBA for their age tend to see a smaller improvement, percentage-wise, while players with a below-average wOBA for their age tend to see a larger improvement. This is partially because of regression to the mean, but it is also because it is likely easier to go from horrible to below-average than it is to go from amazing to even more amazing.
Consider what would happen if you, dear reader, devoted yourself to baseball full time starting at age 19. With extensive training and a killer diet, you might be able to go from batting .001 in 600 plate appearances to batting .010. This is a 900 percent increase! You would still be the worst player in the history of baseball by far. Alternatively, take Ken Griffey Jr., who was already an above-average player at age 19, with a .330 wOBA. His best season was a .440 wOBA, a 33% increase over his age-19 wOBA. This is much smaller than your hypothetical percentage increase, but it is also obviously a lot more difficult, not to take away from your notable accomplishment (way to go!).
Despite this tendency for good players to improve less, even very good players tend to reasonably closely follow an average player’s aging pattern. An elite 19-year-old is likely to see smaller improvement than a terrible 19-year-old, but both 19-year-olds still improve significantly on average as their careers progress. Divergences from average aging patterns tend to be on the smaller side, except for extreme outlier performances (think Adam Dunn’s .300+ isolated power at age 21).
The delta method aging curves are thus predictive but not a prediction: they offer useful predictive value on the overall direction and magnitude of aging impacts across players, but a precise forecast needs to be tailored to each player’s starting performance (among other things, e.g. body type).
With the methodology now clear and caveats out of the way, the rest of this article can proceed more quickly. I will now present smoothed-over aging curves for an archetypal 19-year-old for each statistic.
Isolated Power (ISO)
A smoothed-over aging curve for a 19-year-old with a .165 (league average) peak isolated power is featured below. Since the aging curve shown below is for a 19-year-old with below-average power at most ages (starting around .075 ISO at 19), it understates aging growth. This caveat is true for all aging curves that follow. They are for players with a league average peak performance, so performance growth is understated (to err on the side of caution in presenting these curves).
The curve below shows isolated power grows rapidly, more so than most other components, until age 29, and then plateaus from 29 to 33 before declining. Most of the gains happen at younger ages, with age 19 isolated power doubling by age 26.
Analysts often argue power grows with age, and in this case, a closer look at the data reveals conventional wisdom to be grounded in truth. Isolated power grows rapidly (and nonlinearly) at young ages, with 20-year-olds improving more than 10% over their age-19 season. General aging curves project a 19-year-old with a ~.075 isolated power to improve to an average power hitter by his peak. It’s not easy to put up a ~.075 isolated power at 19. Not many players are capable of this, but the players who are tend to grow into average-to-above-average power hitters (since the curve above understates their improvement).
To cherry pick one example: Elvis Andrus put up a .106 isolated power as a 20-year-old rookie. Back then, he was viewed as a slick-fielding shortstop with little offensive potential; his glove was his calling card. However, in 2017, at age 28, his power reached its peak (so far) at .174, a 64% raw improvement over his rookie isolated power. Jose Ramirez and Jose Altuve illustrate this point even more fully (there is a reason people often refer to them as archetypal players for power growth). Both have seen their isolated power double since their rookie seasons.
As a simple validation check (again, a different validation method that I will discuss more in depth in part III), I looked at isolated power development for the ten 19-year-olds in the sample: Alex Rodriguez, Ivan Rodriguez, Ken Griffey Jr., Manny Machado, Bryce Harper, Andruw Jones, Edgar Renteria, Mike Trout, Adrian Beltre, and Justin Upton. These players had slightly above-average isolated power at 19, a bit higher than .175. Their peak isolated power was 74 percent higher, around .302.
This is lower growth than the delta method’s ~130 percent growth. These players already had above-average power at 19, and above-average performers tend to grow less than average performers. Nonetheless, 74% isolated power growth over a precocious 19-year-old’s career is substantial. Even 19-year-olds with slightly above-average power grow more powerful.
Strikeout Rate
An aging curve for a 19-year-old with a league-average peak strikeout rate (21.7%) is shown below. Contact ability improves until age 27 and then declines. Declines are steep after 32. The biggest improvements come at younger ages, with most substantial improvement happening before 25. For instance, these aging curves predict a 19-year-old with a 27 percent strikeout rate to improve to a roughly 22% strikeout rate by age 27. Most of this improvement, over 80 percent, happens before age 25, with the rate already close to 22% by 24.
Batting Average on Balls in Play (BABIP)
BABIP improves until age 28 before declining. Improvements are modest overall: a 23-year-old with a ~.290 BABIP peaks at ~.300 BABIP and then declines, again reaching a .290 BABIP by age 32. A 19-year-old with a ~.260 BABIP peaks around .300. Along with contact ability, BABIP peaks at the earliest age. This is likely driven by speed and athleticism changes. Hitters appear to lose speed and agility before anything else.
Walk Rate
Walk-rate is the last component to decline. It improves all the way up until 34 before declining. As players age, they lose their speed and contact ability first, and then power. By age 34, every component besides walk rate is in decline. To try and prevent the inevitable decay that awaits us all, old players become more patient, drawing more walks. This offsets their overall decline somewhat but does not prevent it; wOBA starts declining at age 30 despite increased patience.
The smoothed-over curve below shows walk rate development for a 19-year-old with a league average peak walk rate (8.6%).
As a point of comparison, I looked at walk-rate growth over the careers of the ten 19-year-olds in the sample. Each of these players substantially increased his walk rate over his career. They experienced ~150% growth in walk-rate from starting age to peak age (compared to 260% growth found in the delta method). The sample averaged about a 7% walk-rate in their age-19 seasons. Their peak walk rate was around 16%. The delta method predicts a 24 percent peak for this group. The delta method overstates improvement for these players, though, because they already had above average plate discipline at a young age. In any case, walk rate grows substantially as players age.
Batting Average, On-Base Percentage, Slugging Percentage
The next three curves, batting average, on-base percentage, and slugging percentage, are all functions of the earlier statistics so I will not spend much time discussing them.
Batting average follows a similar pattern to BABIP and wOBA, with players improving until age 29 and then declining. Like with other components, the biggest gains happen before age 25. The decline likely stems from a combination of factors: diminishing speed and consequently BABIP, and increased strikeouts. After 32, diminished power also drives batting average lower.
Slugging percentage also peaks at 29, similar to batting average and wOBA. It is the sum of batting average and isolated power, after all, so this is unsurprising. Slugging percentage declines more quickly than isolated power, though, because it does not just capture power; it also captures batting average and, indirectly, the factors behind it (strikeout rate, BABIP). The slugging percentage declines that begin at 29 come more from falling batting average, driven by lost speed, lower BABIP, and increased strikeouts, than from power. Only at age 32 does isolated power decline start driving down slugging percentage. After this point, slugging percentage declines much more rapidly.
On-base percentage peaks at 30, well before walk rate, because it is in large part a function of batting average. It does peak a bit later than batting average, though, because it also captures walk rate, and walk rate does not peak until 34.
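The relationships among these rate stats follow directly from their standard definitions, which the short Python sketch below spells out. The function names and the sample stat line are made up for illustration.

```python
def batting_average(hits, at_bats):
    """Hits per at-bat."""
    return hits / at_bats


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def isolated_power(doubles, triples, home_runs, at_bats):
    """Extra bases per at-bat, which equals SLG minus AVG."""
    return (doubles + 2 * triples + 3 * home_runs) / at_bats


def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base per plate appearance counted in the OBP denominator."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


# Hypothetical season: 100 singles, 30 doubles, 5 triples, 15 home runs in 550 at-bats.
singles, doubles, triples, homers, ab = 100, 30, 5, 15, 550
hits = singles + doubles + triples + homers
avg = batting_average(hits, ab)
slg = slugging(singles, doubles, triples, homers, ab)
iso = isolated_power(doubles, triples, homers, ab)
```

Since slugging is batting average plus isolated power, a drop in batting average pulls slugging down even while isolated power holds steady, and on-base percentage moves with both hits and walks.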
These aging curves are based on the average improvement for an average performer at each age. The smoothed-over curves shown are therefore increasingly misleading as a player deviates further from average performance. Better players see smaller improvements, worse players see larger improvements. This is because of regression to the mean, and because it is likely easier to go from bad to less bad than to go from amazing to even more amazing. Players with different skillsets also age differently. Only a complex projection system, like any of the various publicly available models, is capable of providing a precise aging forecast for particular players.
Notwithstanding, these results give an idea of the magnitude of change for different players. Both good and bad players tend to follow general aging patterns reasonably closely. Good and bad players improve at different rates, but even very good players experience substantial age-related improvements. This idea will be explored more fully in a follow-up article featuring a few wOBA validation tests.
These findings also give an idea of the general peak age for each statistic. I found players improve until 29, but performance changes between 27 and 31 are very gradual. Most improvements happen early in a player’s career, at age 25 or younger, and especially large improvements happen from 19 to 23. There is no data on players younger than 19, as virtually all of these players have historically been kept in the minor leagues. Gains for players younger than 19 are surely larger still, and increasingly so at younger and younger ages, an important idea for judging prospects.
This was a long and at times tedious piece to prepare (and probably to read also). Thanks very much for staying with me!
Check out next week for an Appendix on Validation Tests and application of these curves to some real-life baseball players!
PS: A Quick Note on Findings Without Selection Bias Correction
Onwards, the table below shows the main findings without the selection bias correction. Missing seasons were simply left out of the sample. I expected aging growth to be attenuated and aging decline to be magnified. This is because unlucky players are removed from the sample before they can positively regress in season two, but lucky players do negatively regress in season two.
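The mechanism described above can be illustrated with a small simulation. The Python sketch below gives every simulated player a fixed true talent with no aging at all, adds random noise to each season, and drops players whose season one falls below a cutoff. All parameter values (talent level, noise size, cutoff) and the function name are assumptions picked for illustration, not estimates from this study's data.

```python
import random


def simulate_survivor_bias(n_players=20000, cutoff=0.300, seed=0):
    """Return (mean delta for all players, mean delta for survivors only).

    Each player's two seasons are the same true talent plus independent noise,
    so the true year-over-year change is zero by construction.
    """
    rng = random.Random(seed)
    all_deltas = []
    survivor_deltas = []
    for _ in range(n_players):
        talent = rng.gauss(0.320, 0.020)            # hypothetical true wOBA
        season_one = talent + rng.gauss(0, 0.030)   # talent plus luck
        season_two = talent + rng.gauss(0, 0.030)
        delta = season_two - season_one
        all_deltas.append(delta)
        if season_one >= cutoff:                    # unlucky players get no season two
            survivor_deltas.append(delta)
    mean_all = sum(all_deltas) / len(all_deltas)
    mean_survivors = sum(survivor_deltas) / len(survivor_deltas)
    return mean_all, mean_survivors


mean_all, mean_survivors = simulate_survivor_bias()
```

With these assumed settings, the full population shows essentially no change between seasons, while the survivors alone show an average decline. That is the attenuated-growth, magnified-decline pattern the uncorrected table is expected to have.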
Main results without selection bias correction
Aging gains were indeed attenuated, with slightly lower overall growth from age 19 to peak for every component outside of walk rate. Aging declines were also magnified for most statistics, with slightly steeper declines after peak at most ages.
The peak age for wOBA is younger, 26, without correcting for selection bias. This is likely because aging gains are understated and aging declines are overstated, without correction. In any case, wOBA is mostly stable from 27 to 29 regardless of correcting for selection bias—a reassuring fact.
There has been much debate in the past over whether players peak at 26 or 29. I think the growth rates from 27 to 29 are small enough to render this debate mildly uninteresting—at least for practical purposes. I would not be surprised if true peak age was anywhere in the 26 to 29 range (or even 25 to 31).
Besides wOBA, overall growth patterns and peak ages are generally comparable for both methods, despite the differences already highlighted—an indication the selection bias correction did not do anything too radical.