Wednesday, June 29, 2011

ERA vs. xFIP

by Mike Z

Since I throw around a bunch of these obscure stats, I've often considered writing posts that explain the major ones in a little more detail. This was a little tougher than I thought because I had to find some way to make formulas sound interesting. The ERA formula is self-explanatory (earned runs / innings pitched x 9), but the xFIP one is insane [xFIP = ((FB*.11)*13+(BB+HBP-IBB)*3-K*2)/IP], and breaking down all the variables is miserably dry. Still, I'll try to briefly go over what xFIP is before I go any further.

xFIP, or expected fielding independent pitching, is a metric devised to determine what a pitcher's ERA should be, taking into account only the variables that he is theoretically responsible for. Those variables are walks, strikeouts, hit batters, home runs and innings pitched. Walks, strikeouts, and hit batters are self-explanatory. Dividing by innings pitched turns the totals into a per-inning rate; the published version of the stat also adds a league-wide constant (roughly 3.1 to 3.2, left out of the formula above) so the result lands on the ERA scale and you can compare the two stats directly.

The one major thing I want you to focus on in the formula is the (FB*.11) term. This is fly balls times .11, with .11 being the league-wide average home run to fly ball ratio for pitchers. Swap that term for the pitcher's actual home run total and you get the calculation for FIP instead of xFIP. Home run rates for a single pitcher fluctuate wildly from year to year, so (FB*.11) provides a more accurate calculation.
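To make the arithmetic concrete, here is a rough Python sketch of the two formulas exactly as written in this post, plus the FIP-style variant that uses actual home runs. The function names and the stat line at the bottom are made up for illustration; they are not real pitcher data.

```python
def era(earned_runs, innings_pitched):
    """Earned run average: earned runs allowed per nine innings."""
    return earned_runs / innings_pitched * 9


def xfip_core(fly_balls, walks, hit_batters, intentional_walks,
              strikeouts, innings_pitched, hr_per_fb=0.11):
    """xFIP as written in the post: fly balls stand in for home runs.

    hr_per_fb is the league-average home run to fly ball ratio (.11 in the post).
    """
    expected_home_runs = fly_balls * hr_per_fb
    numerator = (expected_home_runs * 13
                 + (walks + hit_batters - intentional_walks) * 3
                 - strikeouts * 2)
    return numerator / innings_pitched


def fip_core(home_runs, walks, hit_batters, intentional_walks,
             strikeouts, innings_pitched):
    """Same formula, but with the pitcher's actual home runs (FIP-style)."""
    numerator = (home_runs * 13
                 + (walks + hit_batters - intentional_walks) * 3
                 - strikeouts * 2)
    return numerator / innings_pitched


# Hypothetical season line, invented for this example only.
print(f"ERA:       {era(80, 200):.2f}")
print(f"xFIP core: {xfip_core(200, 60, 5, 5, 180, 200):.2f}")
print(f"FIP core:  {fip_core(30, 60, 5, 5, 180, 200):.2f}")
```

In this invented line the pitcher gave up 30 home runs on 200 fly balls, well above the 22 that a .11 rate would predict, so the fly ball version comes out lower than the home run version. That gap is the "correction" described above.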

So now that the crash course is completed, what's a good way to compare the statistics? I decided on a correlation study, as it can compare multiple data points at once and is fairly simple to follow. I plotted starting pitcher win/loss percentage against ERA, and then starting pitcher win/loss percentage against xFIP, on a graph made in Excel, fit a best-fit line through the points, and used the correlation value of that fit as the readout. A correlation value is on a 0 to 1 scale (ignoring the sign): 0.1 is a weak correlation and 0.9 is a strong one. To protect myself from small sample sizes skewing my results, I took statistics for only the starters that qualified for statistical awards (about 25 starts each year) and took all data from 2008-2010, which gave me 251 data points to work with (Excel can only calculate 255 points, otherwise I had data going back to 2006).
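For anyone who wants to try this outside of Excel, here is a minimal Python sketch of the two numbers a best-fit line gives you: the slope and the correlation. It assumes a plain Pearson correlation and an ordinary least-squares line, which may not be exactly what Excel reported for the graphs in this post. The four data points are made up and are not from my data set.

```python
from math import sqrt


def _sums_of_squares(xs, ys):
    """Return mean_x, mean_y, Sxy, Sxx, Syy for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return mean_x, mean_y, sxy, sxx, syy


def pearson_r(xs, ys):
    """Pearson correlation coefficient, between -1 and 1."""
    _, _, sxy, sxx, syy = _sums_of_squares(xs, ys)
    return sxy / sqrt(sxx * syy)


def best_fit_line(xs, ys):
    """Least-squares slope and intercept for y = slope * x + intercept."""
    mean_x, mean_y, sxy, sxx, _ = _sums_of_squares(xs, ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up win/loss percentages and ERAs, for illustration only.
win_pct = [0.40, 0.50, 0.60, 0.70]
era_values = [4.5, 4.0, 3.8, 3.1]

r = pearson_r(win_pct, era_values)
slope, intercept = best_fit_line(win_pct, era_values)
print(f"correlation r = {r:.3f}, strength |r| = {abs(r):.3f}, r^2 = {r * r:.3f}")
print(f"best-fit line: ERA = {slope:.2f} * win_pct + {intercept:.2f}")
```

Note that the slope and the correlation are different numbers. The slope depends on the units of the two axes, while the correlation is the one that always lands between 0 and 1 once you ignore the sign. The sign is negative here because a lower ERA goes with a higher win percentage.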

First, here's the correlation graph for W/L (X-axis) vs. ERA (Y-axis):

The correlation for this graph is 0.44, so not that strong. The greatest density of points runs right along the 3-4 range of ERA, but there are still points all over the place, which is what the correlation value indicates.

Here's the same thing using W/L vs. xFIP instead:


This graph looks similar to the ERA comparison at first glance, but notice how the data points are clustered much more tightly around the best-fit line than before. The correlation for this graph is 0.62: not strong, but definitely an improvement over ERA.

So what does all this mean? This is a real quick study with plenty of factors that make it less than ideal, with the use of wins and losses being my biggest gripe. If anybody has an idea for a better correlation study, let me know, because I saved all my data from this so that I can revisit it if I decide to do anything more in-depth. This was pretty crude, but I hope it illustrates why a lot of the sabermetrics types use xFIP rather than ERA to gauge starting pitchers, especially when attempting to project how they'll perform for the remainder of the season.

If anybody has any suggestions for other stats they would like me to look at, leave them in the comments and I'll look at them the next time I do one of these.

4 comments:

  1. Do you actually enjoy doing all that math?

  2. Yeah I love it...how weird am I? You should've seen me when I got to take a Biostatistics course in undergrad.

  3. So that's why you and others are always on cells at games. You're all using your calculators to figure out these stats. Or you're youtubing kittens.

  4. http://www.youtube.com/watch?v=Tlbdk3nvt7I
