Potentially Ill-Advised Adventures in xwOBA (and a New xwOBA Metric!)
Let’s play a game: what do Brian Dozier, Rougned Odor, Andrelton Simmons, Jonathan Schoop, Jose Ramirez, Paul DeJong, Max Kepler, Travis Shaw, Randal Grichuk, and Sal Perez have in common?
Time’s up. These are the top 10 pulled fly ball hitters in the league over the last three calendar years. Each pulls over 33% of their fly balls, way above league average (around 24%). This group has also been moderately underrated by Baseball Savant’s xwOBA* since it was unveiled in 2015. Their wOBAs are 2% higher than their xwOBAs on average.
*mlb.com’s explanation of xwOBA: “Expected Weighted On-base Average (xwOBA) is formulated using exit velocity, launch angle and, on certain types of batted balls, Sprint Speed.”
Earlier this year, Tom Tango, MLBAM (MLB Advanced Media) Senior Data Architect of Stats, shared the number one question people ask him: why doesn’t he include spray angle (pull tendency) in his xwOBA model?
Tango’s response features a graph of pull tendency and wOBA minus xwOBA, similar to the one below. Like Tango’s, my graph shows no relationship between pull percentage and wOBA minus xwOBA in 2019 (all 2019 hitters, minimum 1,000 pitches seen, or around 300 plate appearances).
When considering pulled fly ball percentage rather than pulled percentage alone, however, a positive relationship with wOBA minus xwOBA emerges (the correlation is .22).
Perhaps there is in fact a small bias in xwOBA against hitters who pull a lot of fly balls. The graph below tests this hypothesis further, looking at 2018 pulled fly ball percentage and 2019 wOBA minus xwOBA (minimum 100 fly balls in 2018, 1,000 pitches seen in 2019). The relationship is weaker but still positive (the correlation drops to .13). The weaker correlation suggests a lot of random noise in pulled fly ball percentage, but still some year-to-year stickiness in the metric.
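For readers who want to run this kind of check themselves, here is a minimal sketch of the correlation calculation. The function name pearson_r and the five data points are my own placeholders for illustration, not real player data and not the code behind the graphs; real inputs would be each hitter's pulled fly ball percentage and wOBA minus xwOBA.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations around the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / sqrt(var_x * var_y)


# Made-up illustration values, NOT real player data.
pulled_fb_pct = [18.0, 22.5, 24.0, 27.0, 33.5]            # pulled fly ball %
woba_minus_xwoba = [-0.012, 0.004, -0.003, 0.010, 0.008]  # wOBA minus xwOBA

r = pearson_r(pulled_fb_pct, woba_minus_xwoba)
print(f"correlation: {r:.2f}")
```

With a full season of hitters in place of the toy lists, the same function would give the kind of correlation figures quoted here.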
Is this likely bias in xwOBA undesirable? Not necessarily.
As Tango explains, xwOBA is a descriptive metric, like FIP, rather than a predictive one. It’s a simplification of the past, capturing the two most important batted ball indicators (exit velocity and launch angle) and ignoring noisier and less useful evidence.
Depending on how it’s done, incorporating pulled fly ball percentage could make xwOBA noisier and potentially less informative, rewarding hitters too much for luck-based, random fluctuations in pulled fly percentage. Done well, incorporating pulled fly ball percentage in some modest way could make the metric more informative and less biased. In any case, future updates to xwOBA will likely incorporate pull tendency in some form. Tango and MLBAM are constantly improving their metrics, and already incorporated sprint speed on certain batted balls in 2019, so we fans are in good hands!
In the meantime, on the grounds that it’s interesting to see the differences that emerge, I’ve generated a quick and dirty updated xwOBA metric that incorporates pulled fly ball percentage along with some other metrics potentially underrated by the current version of xwOBA (other metrics correlated with wOBA minus xwOBA). To generate the new xwOBA, I regressed 2019 wOBA on Baseball Savant xwOBA, pulled fly ball percentage, and a few other theoretically relevant variables discussed below (2019 sample, minimum 1,000 pitches seen).
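For the curious, here is a rough sketch of what "regressing wOBA on other variables" means mechanically. It is an ordinary least squares fit written in plain Python, not the actual code or data behind my model. The function names (solve_linear_system, fit_ols) are hypothetical, the six rows are made-up numbers, and only two predictors are shown to keep it short. The real regression uses more predictors and every qualifying 2019 hitter.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, targets):
    """Return [intercept, coef_1, ..., coef_k] minimizing squared error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Made-up rows, NOT real players: (Baseball Savant xwOBA, pulled fly ball %).
players = [(0.300, 20.0), (0.320, 30.0), (0.350, 22.0),
           (0.280, 26.0), (0.400, 18.0), (0.330, 35.0)]
woba = [0.295, 0.331, 0.348, 0.284, 0.392, 0.345]  # made-up actual wOBA

intercept, b_xwoba, b_pulled = fit_ols(players, woba)
print(f"intercept={intercept:.4f}, xwOBA coef={b_xwoba:.4f}, "
      f"pulled FB% coef={b_pulled:.5f}")
```

Each fitted coefficient reads as the expected change in wOBA for a one-unit change in that input, holding the others fixed. In practice a statistics package would do the same job and report standard errors as well.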
Keep in mind that my version of xwOBA is potentially noisier than Tango’s: it captures more variables, but those variables carry more random variation than launch angle and exit velocity. For example, if a player increased their pulled fly ball percentage a lot this year over career norms, a related step forward in new xwOBA might be purely luck-driven. Readers should consider whether increases in a given player’s pulled fly ball percentage are sustainable or not. FanGraphs’ Al Melchior’s recent pulled fly ball research is helpful in this light: he found year-to-year correlations in pulled fly ball percentage ranging from .55 to .71 (over each of the past four years). A player should sustain a good portion of their pulled fly ball percentage from season to season.
Pop the hood on my xwOBA (i.e. see what I’ve put into it other than pulled fly ball percentage) at the end of this piece. This Google Sheet shows the new xwOBA for all hitters (updated through last week’s games). It also has a calculator that generates a new xwOBA for a given set of shift percentage, Baseball Savant xwOBA, sprint speed, pulled fly ball percentage, and opposite-field fly ball percentage. Download the spreadsheet to use the calculator.
Here are the biggest gainers in the new xwOBA metric compared to the old one: lots of speedsters (Mondesi, Buxton, Tatis Jr.) and also classic pulled fly ball guys (Kepler, Grichuk, DeJong).
Here are the biggest losers in the new xwOBA metric compared to the old one: lots of older, slower players who are shifted often.
What I’ve Added to xwOBA aside from fly ball data
Here are the other theoretically relevant metrics correlated with wOBA minus xwOBA, and their 2019 correlations in parentheses: percentage of plate appearances where a batter is shifted (-.07); opposite-field fly ball percentage (-.10); sprint speed (.28).
The coefficients in the new xwOBA model are as follows: a one percentage point increase in plate appearances where a batter is shifted decreases new xwOBA by .0008; a one percentage point increase in pulled fly ball percentage increases new xwOBA by .001; a one percentage point increase in opposite-field fly ball percentage increases new xwOBA by .0004; a one foot per second increase in sprint speed increases new xwOBA by .004; and a .001 increase in Baseball Savant xwOBA increases new xwOBA by .00092. Baseball Savant xwOBA is very highly correlated with the new xwOBA (.96), so launch angle and exit velocity are clearly the most important variables in these models. Even so, some interesting differences did emerge after adding the new variables.
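To make those coefficients concrete, here is a small sketch that turns them into a "what if" calculator. Since the model's intercept isn't listed above, it only computes the change in new xwOBA implied by a change in the inputs, not a hitter's absolute new xwOBA. The dictionary keys and the function name new_xwoba_change are my own labels for illustration, and this is not the spreadsheet calculator itself.

```python
# Coefficients as listed in the text above (the intercept is not given).
COEFFICIENTS = {
    "shift_pct": -0.0008,    # per percentage point of plate appearances shifted
    "pulled_fb_pct": 0.001,  # per percentage point of pulled fly balls
    "oppo_fb_pct": 0.0004,   # per percentage point of opposite-field fly balls
    "sprint_speed": 0.004,   # per foot per second
    "savant_xwoba": 0.92,    # .00092 per .001 of Baseball Savant xwOBA
}


def new_xwoba_change(**input_changes):
    """Change in new xwOBA implied by changes in one or more inputs.

    Each keyword is an input name from COEFFICIENTS; each value is how much
    that input moves (percentage points, ft/s, or xwOBA points as a decimal).
    """
    unknown = set(input_changes) - set(COEFFICIENTS)
    if unknown:
        raise KeyError(f"unknown inputs: {sorted(unknown)}")
    return sum(COEFFICIENTS[name] * delta for name, delta in input_changes.items())


# Hypothetical hitter: pulls 5 more percentage points of fly balls but is
# shifted in 10 more percentage points of plate appearances.
change = new_xwoba_change(pulled_fb_pct=5, shift_pct=10)
print(f"{change:+.3f}")  # prints -0.003
```

In this hypothetical, the extra shifts cost more (.008) than the extra pulled fly balls add (.005), so the hitter's new xwOBA drops by three points.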