Aging Gracefully: Approaching Aging Curves and Advanced Stats, Part I

Fantasy baseball, generally, cares little for age beyond the minor concern of whether an older superstar is going to fall off the cliff this season or the next. Dynasty baseball, on the other hand, cares very much about age, be it how soon the hot young prospect will reach The Show or… ok, yes, we still care about whether old superstars will fall off the cliff. The point is, owners who understand how age affects performance gain a significant advantage over their league-mates. This two-part series aims to evaluate aging curves and how those curves relate to stats that can help us win our leagues.
Introduction
Past aging research* tends to focus on how overall player value changes with age, whether that value is captured by WAR, weighted on-base average (wOBA), or linear weights. Research is much scarcer on how aging affects component statistics: isolated power, batting average on balls in play (BABIP), strikeout percentage, walk percentage, batting average, on-base percentage, and slugging percentage. This series is an attempt to better understand how these different component statistics change with age.
This work leverages a new spin on the “delta method” typically featured in aging studies (we will get into this later), and finds that players grow substantially over their careers, with the majority of improvement happening before age 25 for most statistics. Improvement before 23 is especially rapid. Players see wOBA improve until 29, but changes between 27 and 31 are slight. Certain statistics grow much more substantially than others, especially isolated power and walk rate. Statistics most reliant on athleticism–BABIP, batting average, and strikeout percentage–peak earliest.
The Old Methodology
Typically, aging studies use the delta method. This method groups together same-aged players, capturing their performance change from one season (season one) to the next (season two). Performance changes for different age groups are “chained” together to form one long aging curve. For example, one study might find that 25-year-olds see a league-adjusted wOBA improvement of 1% in their age-26 season.
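The chaining step can be sketched in a few lines of Python. This is a minimal illustration, not the code behind any published study: the `(player_id, age, woba, pa)` tuple layout, the function name, and the choice to weight each pair of seasons by the smaller of the two plate-appearance totals are all assumptions, and league adjustment is left out for brevity.

```python
from collections import defaultdict


def delta_method_curve(seasons, start_age, end_age):
    """Chain average year-over-year wOBA changes into one aging curve.

    seasons: iterable of (player_id, age, woba, pa) tuples (hypothetical layout).
    Returns {age: cumulative wOBA change relative to start_age}.
    """
    # Index every player's seasons by age so consecutive ages can be paired.
    by_player = defaultdict(dict)
    for player_id, age, woba, pa in seasons:
        by_player[player_id][age] = (woba, pa)

    # Accumulate the weighted change for each age -> age + 1 pair.
    weighted_change = defaultdict(float)
    total_weight = defaultdict(float)
    for ages in by_player.values():
        for age, (woba_one, pa_one) in ages.items():
            if age + 1 in ages:
                woba_two, pa_two = ages[age + 1]
                weight = min(pa_one, pa_two)  # assumption: weight by the smaller PA total
                weighted_change[age] += weight * (woba_two - woba_one)
                total_weight[age] += weight

    # Chain the average deltas together, starting from zero at start_age.
    curve = {start_age: 0.0}
    running = 0.0
    for age in range(start_age, end_age):
        if total_weight[age] > 0:
            running += weighted_change[age] / total_weight[age]
        curve[age + 1] = running
    return curve
```

Fed every pair of consecutive seasons in a data set, the function returns cumulative wOBA change by age relative to the starting age, and the peak age is simply the age with the largest value.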
So this is all well and good, but there is an issue with this method: selection bias.
Second Season Selection Bias
Delta method studies share an awareness of one issue: selection bias. Only certain players are selected to play a second season, and these players generally have to be good enough in season one to play in season two. Lucky season one players will negatively regress in season two (on average), but unlucky players receive fewer chances to positively regress in season two—they are often kept out of the major leagues entirely. This skews aging curves, understating age-related improvements at young ages and overstating age-related declines at old ages.
MGL on Baseball explains this problem in his aging curves article with an example I’ll paraphrase here: as a product of random chance alone, 5% of players will perform two standard deviations worse than their true talent level, and 5% of players will overperform by two standard deviations in a given season. If a group of .280 wOBA true-talent hitters each get 300 plate appearances, 5% of them will put up a .330 wOBA and 5% will put up a .230 wOBA. In the following season, the .330 hitters would regress toward their true .280 wOBA talent level. Many of the .230 hitters will not receive MLB playing time in the following season, and therefore will not get the chance to positively regress and offset the declines of the .330 wOBA group. This biases the group’s overall season-to-season change in performance. So how do we correct for this bias?
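A quick simulation makes the bias concrete. The sketch below is an illustration rather than MGL’s code: it assumes normally distributed luck with a standard deviation of .025 wOBA (implied by the example’s .050 swing at two standard deviations), no aging at all, and a hypothetical rule that anyone who posts a wOBA under .260 in season one gets no season two.

```python
import random


def simulate_survivor_delta(n_players=20000, true_woba=0.280, noise_sd=0.025,
                            cutoff=0.260, seed=0):
    """Compare the season-to-season change for everyone vs. survivors only.

    Every hitter has the same true talent in both seasons, so any average
    change among survivors is produced by selection, not by aging.
    """
    rng = random.Random(seed)
    all_deltas = []
    survivor_deltas = []
    for _ in range(n_players):
        season_one = rng.gauss(true_woba, noise_sd)
        season_two = rng.gauss(true_woba, noise_sd)
        delta = season_two - season_one
        all_deltas.append(delta)
        # Assumption: hitters below the cutoff in season one never play season two.
        if season_one >= cutoff:
            survivor_deltas.append(delta)

    def mean(values):
        return sum(values) / len(values)

    return mean(all_deltas), mean(survivor_deltas)


everyone, survivors = simulate_survivor_delta()
print(f"all players: {everyone:+.4f}   survivors only: {survivors:+.4f}")
```

Under these assumptions the full group’s average change sits near zero, while the survivors’ average change is clearly negative even though nobody aged. That spurious decline is what an unadjusted delta method picks up.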
Past Selection Bias Solutions
There have been a couple of proposed solutions to this issue. In the past, Tom Tango has removed final player seasons (the last season of a player’s career), as he finds that performance decline in final seasons far exceeds performance decline in any other season, likely a product of bad luck more than anything. MGL leverages Marcel, a simple projection system, to forecast players who did not play in season two (and players who only played a partial season in season two). There are issues with each of these solutions, however.
A problem with removing final seasons for every player is that some of the decline in final seasons is surely genuine. Removing final seasons entirely may over-correct for the selection bias issue and understate aging decline. I did indeed find more gradual aging declines at older ages when I tried Tango’s method, though I found a similar peak age and peak performance level.
Marcel, for its part, already builds in a small aging adjustment: players are forecast to improve until 29 and then decline afterward. Consequently, using Marcel biases the aging curves toward Marcel’s own aging curve. Alternatively, if one ignores Marcel’s age adjustment and uses an unadjusted Marcel projection, as MGL may have done, then the aging curves will be biased toward not showing aging effects. By definition, unadjusted-for-age Marcel forecasts do not forecast age-related improvements or declines; they merely extrapolate past performance, weighting recent seasons more heavily. Nonetheless, Marcel forecasts have been validated many times and hold up well relative to other projection systems, so MGL’s is a worthwhile approach.
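To see why an unadjusted projection cannot show aging, it helps to look at what such a forecast boils down to. The sketch below is a simplified stand-in in the spirit of Marcel, not Marcel itself and not MGL’s implementation: the 5/4/3 season weights, the 1,200 plate appearances of league-average regression, and the function name are assumptions chosen for illustration.

```python
def unadjusted_projection(seasons, league_woba, weights=(5, 4, 3), regress_pa=1200):
    """Recency-weighted wOBA forecast with no age term.

    seasons: list of (woba, pa) tuples, most recent season first.
    weights and regress_pa are illustrative assumptions, not sourced values.
    """
    weighted_woba = 0.0
    weighted_pa = 0.0
    for (woba, pa), weight in zip(seasons, weights):
        weighted_woba += weight * pa * woba
        weighted_pa += weight * pa
    # Regress toward the league by mixing in league-average plate appearances.
    return (weighted_woba + regress_pa * league_woba) / (weighted_pa + regress_pa)
```

Age never enters the calculation, so a 23-year-old and a 36-year-old with identical recent lines receive identical forecasts. Any aging curve built on such forecasts is pulled toward flat.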
No aging curve study is problem-free, and Tango’s and MGL’s methods bring essential insights on aging that are generally comparable to my results (see a comparable study from Tango here). My results are also comparable to J.C. Bradbury’s excellent work leveraging a fixed effects regression model (we find the same peak age). There are many variations of aging curves, and each helps triangulate how players really age. As both Tango and MGL have explained in the past, to ideally understand aging curves every player would receive thousands of at-bats every year regardless of performance. This is the only way to truly avoid selection bias issues and understand true aging curves. This method is impossible, however, unless teams start to care about science more than winning, which is a weird future to picture.
Since no aging curve study is problem-free, let’s read on to see what I came up with to address the aforementioned issues.
Novel Solution to Selection Bias
Following in the footsteps of past research, and especially influenced by MGL’s Marcel method, I devised a novel solution to the selection bias problem. Again, the main problem with aging curves is that unlucky players in season one often do not get a chance to positively regress in season two, while lucky players usually do get a chance to negatively regress. Since 1985, four-fifths of players who miss season two never return to the major leagues (about 2,000 guys). Fortunately, about one-fifth of unlucky players who miss season two do eventually return to the major leagues in season three (about 500 guys). So I’m focusing on those season one players who were out of the majors in season two but made it back in season three.
The bad players who did get another chance in season three are similar to the bad players who did not: both groups are, shocker, very bad in season one. Their season one wOBAs are within .010 of each other: the 2,000 who never return average a .277 wOBA, and the 500 who return in season three average a .286 wOBA (all in-text statistics are scaled to 2018 unless otherwise noted). These groups are reasonable proxies for each other. The players who never got a second chance would likely improve slightly more with a second chance than the players who returned in season three, because they are slightly worse in season one and so would likely regress more toward the mean. Considering their very similar wOBAs, though, this difference in improvement is likely tiny.
The performance of players who return to the majors in season three is thus a simple, theoretically sound solution to approximately forecast missing season two performances. This approach minimizes the selection bias problem, potentially more accurately capturing true aging curves. The only issue is dealing with that missing season two, but we will get to that …right now.
Forecasting Missing Season Twos
*WARNING- NERDERY AHEAD*
For players who missed season two and returned in season three, I captured the average improvement in season three over season one at each age. This provided the average two-season performance change for each age. I then smoothed these age-by-age averages by regressing them on log-transformed variables, which accounts for nonlinearity (e.g., larger gains at younger ages) and damps the random fluctuations stemming from small sample sizes. I used the halved, smoothed, nonlinear growth rates generated in the regression to forecast season twos for players with missing data. I used halved growth rates because the season three growth rates represent performance change after two seasons, and I was estimating missing season two performance changes after only one season.
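For readers who want to see the mechanics, here is a minimal sketch of that smooth-then-halve step. It rests on assumptions the text does not spell out: that the log-transformed variable is simply the natural log of age, that performance change is an additive wOBA difference, and that each age can be weighted by its sample size. The function names are hypothetical.

```python
import math


def fit_log_age(ages, two_year_changes, weights=None):
    """Weighted least-squares fit of: two_year_change = a + b * ln(age).

    ages: ages in season one.
    two_year_changes: average (season three - season one) change at each age.
    weights: optional per-age weights, e.g. number of players (assumption).
    """
    if weights is None:
        weights = [1.0] * len(ages)
    log_ages = [math.log(age) for age in ages]
    total = sum(weights)
    mean_x = sum(w * x for w, x in zip(weights, log_ages)) / total
    mean_y = sum(w * y for w, y in zip(weights, two_year_changes)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(weights, log_ages))
    sxy = sum(w * (x - mean_x) * (y - mean_y)
              for w, x, y in zip(weights, log_ages, two_year_changes))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def forecast_missing_season_two(season_one_woba, age, a, b):
    """Add half of the smoothed two-season change to the season one wOBA."""
    smoothed_two_year_change = a + b * math.log(age)
    return season_one_woba + 0.5 * smoothed_two_year_change
```

Fitting on the returnees and then calling `forecast_missing_season_two` for each player with no season two produces the filled-in performances that go back into the delta method.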
I estimated plate appearances for missing season twos based on plate appearances for players who skipped season two but returned in season three. There is a very strong correlation between plate appearances and wOBA in every season—managers tend to give better players more plate appearances (nice job, managers). For players who played season three and missed season two, I regressed plate appearances on wOBA and used the results to estimate plate appearances for missing season twos.
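The plate appearance estimate is an ordinary one-variable regression, sketched below. Which season’s wOBA and plate appearances go into the fit is not stated above, so treat the inputs as an assumption (for instance, season three wOBA and season three plate appearances for the returnees). The function names and the floor on the estimate are also hypothetical additions.

```python
def fit_pa_on_woba(wobas, plate_appearances):
    """Ordinary least-squares fit of: PA = intercept + slope * wOBA."""
    n = len(wobas)
    mean_x = sum(wobas) / n
    mean_y = sum(plate_appearances) / n
    sxx = sum((x - mean_x) ** 2 for x in wobas)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(wobas, plate_appearances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_pa(woba, intercept, slope, floor=1.0):
    """Predict PA for a forecast wOBA; the floor (an assumption) keeps it positive."""
    return max(floor, intercept + slope * woba)
```

The estimated plate appearances then serve as the weight for each filled-in season two, so a forecast season for a poor hitter counts for less than a full season from a regular.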
A more precise and complex forecasting technique is possible than the one I used here, and I hope to see that research sooner rather than later. Notwithstanding my self-deprecation, however, the technique I used is simple and likely a good enough approximation for missing season two performances. This, in turn, will hopefully account for the selection bias inherent in a less-adjusted data set, or one that accounts for that selection bias differently.
Stay tuned for Part II, where I apply my findings to weighted on-base average, isolated power, strikeout rate, batting average on balls in play, walk rate, and the triple slash (batting average, on-base percentage, and slugging percentage).
*For an overview of past influential aging studies, check out Neil Weinberg’s FanGraphs introduction (with links to additional articles from Jeff Zimmerman), Mitchel Lichtman’s Hardball Times article (with links to additional articles from Tom Tango), J.C. Bradbury’s Baseball Prospectus article, MGL on Baseball’s article, and Jeff Zimmerman’s FanGraphs article. Each of these studies informed my research, from methodology choices, to generating hypotheses, to interpreting findings.