Thursday, July 17, 2008

Pitcher Savvy

Using Pitch F/X Data to Anticipate Peripherals
by Paul Calluzzo

A few months ago Sal Baxamusa talked about the building blocks of sabermetrics at The Hardball Times. A main point he made was that peripheral pitching statistics are the building blocks for ERA, which in turn is the building block of wins. Along these lines he observed that the development of Pitch F/X data would provide the building blocks for the peripheral pitching statistics BB%, K% and GB/FB rate. Thanks to the information readily available on the Internet, specifically through Fangraphs' Export function and Josh Kalk’s “Bornbybits” blog, I was able to acquire pitch data that allowed me to quantify pitchers’ “stuff.” This data included fastball velocity, vertical fastball movement, horizontal fastball movement, number of pitches in the repertoire, and % of pitches which were fastballs. Using regression analysis I was then able to quantify how these skills affected a pitcher’s peripherals (BB%, K%, GB/FB and FIP). Through this analysis I was then able to assign an expected outcome for each pitcher given their “stuff.”

By looking at the difference between expected outcomes and real life outcomes, we can begin to estimate a “pitcher savvy” skill which may explain the difference between the two outcomes (along with random variance and a lack of explanatory power of the model which is due to the obvious flaws in my econometric ability, my assumptions and my data). Below are my findings.

GB/FB Ratio
After toying with the available Pitch F/X data I determined that the inputs which affected a pitcher’s GB/FB ratio were fastball velocity and vertical fastball movement. The analysis showed that 10 mph of fastball velocity increases the GB/FB ratio by 0.26. For each inch a pitcher’s fastball drops, the GB/FB ratio increases by 0.17. The GB/FB ratio prediction formula can be expressed as follows:
GB/FB Ratio = 1.4778 + (Fastball Velocity*0.02622) + (ln(Vertical Movement)*-1.1658)
The formula is significant at a level of p<0.01, with an adjusted R-squared of 0.517. The R-squared essentially says that 51.7% of the variance in GB/FB ratios can be explained by this model. By plugging all 97 pitchers in my sample into this formula, we can see, based on “stuff,” which pitchers have the highest and lowest Expected (ex) GB/FB rate. This is listed below:

League Leaders in Expected GB/FB          League Losers in Expected GB/FB
Name              Real GB/FB  exGB/FB     Name              Real GB/FB  exGB/FB
Brandon Webb      3.38        3.38        Scott Olsen       0.75        0.84
Roy Halladay      2.25        2.38        Jered Weaver      0.78        0.86
Derek Lowe        2.44        2.12        Barry Zito        1.08        0.88
Greg Maddux       1.77        2.01        Ted Lilly         0.73        0.89
Tim Hudson        2.67        1.88        Shaun Marcum      1.17        0.99
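
For readers who want to try a fit like this themselves, here is a minimal sketch of a multivariable least-squares regression with an adjusted R-squared, written in Python with NumPy. It is not the spreadsheet that produced the formula above: the function name `fit_ols`, the synthetic inputs, and the "true" coefficients are all made up for illustration, and the data is randomly generated rather than taken from the 97-pitcher sample.

```python
import numpy as np

def fit_ols(X, y):
    """Ordinary least squares with an intercept.

    Returns (coefficients, R-squared, adjusted R-squared), where the first
    coefficient is the intercept.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape  # n observations, p predictors (intercept not counted)
    design = np.column_stack([np.ones(n), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    residuals = y - design @ coef
    ss_res = float(residuals @ residuals)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    # Adjusted R-squared penalizes the fit for the number of predictors.
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    return coef, r2, adj_r2

# Synthetic stand-in data (NOT the article's sample).
rng = np.random.default_rng(0)
n_pitchers = 97
velocity = rng.normal(90.0, 2.5, n_pitchers)          # mph, hypothetical
ln_vert = np.log(rng.uniform(4.0, 13.0, n_pitchers))  # ln(inches), hypothetical
true_coef = np.array([1.0, 0.03, -1.0])               # made-up values
gb_fb = (true_coef[0] + true_coef[1] * velocity + true_coef[2] * ln_vert
         + rng.normal(0.0, 0.05, n_pitchers))

coef, r2, adj_r2 = fit_ols(np.column_stack([velocity, ln_vert]), gb_fb)
print("intercept, velocity, ln(vertical):", np.round(coef, 4))
print("R-squared:", round(r2, 3), "adjusted:", round(adj_r2, 3))
```

With real data, the two input columns would be each pitcher's fastball velocity and the log of his vertical movement, and `gb_fb` would be his actual GB/FB ratio.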

By subtracting exGB/FB from Real GB/FB we can begin to try to isolate the “pitcher savvy” skill. As I stated before, we would need an econometrician with skills far exceeding mine before we could feel confident in the “GB/FB ratio pitcher savvy” metric. This doesn’t mean it’s not fun:

League Leaders in GB Savvy    League Losers in GB Savvy
Name              GB Savvy    Name              GB Savvy
Tim Hudson        0.79        Micah Owings      -0.77
Andy Pettitte     0.76        David Bush        -0.56
Jair Jurrjens     0.59        Paul Byrd         -0.56
John Lannan       0.58        Randy Wolf        -0.51
Jon Garland       0.52        Brandon Backe     -0.48
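
The Savvy arithmetic is simple enough to check by hand, but a short Python sketch makes the sign convention explicit. It takes Savvy as real minus expected, which is the convention that reproduces Tim Hudson's listed 0.79. The rows are copied from the Expected GB/FB leaders table above; the names `rows`, `savvy`, and `gb_savvy` are just illustrative.

```python
# (name, real GB/FB, expected GB/FB) from the Expected GB/FB leaders table.
rows = [
    ("Brandon Webb", 3.38, 3.38),
    ("Roy Halladay", 2.25, 2.38),
    ("Derek Lowe", 2.44, 2.12),
    ("Greg Maddux", 1.77, 2.01),
    ("Tim Hudson", 2.67, 1.88),
]

def savvy(real, expected):
    """Savvy taken as real minus expected (assumed sign convention)."""
    return round(real - expected, 2)

gb_savvy = {name: savvy(real, ex) for name, real, ex in rows}

# Rank from most to least savvy among these five pitchers only.
ranked = sorted(gb_savvy.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(f"{name:<14}{value:+.2f}")
```

Only these five pitchers have both numbers printed in the article, so the ranking here is within that small group, not the full sample.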

K/9
I determined that the inputs which affected a pitcher’s K/9 were fastball velocity, horizontal fastball movement and % of pitches thrown which are fastballs (Fastball %). The analysis showed that 10 mph of fastball velocity increases K/9 by 3.6. For 10 inches of horizontal movement, K/9 increases by 0.6. For a 10% decrease in Fastball %, K/9 increases by 0.3. The K/9 prediction formula can be expressed as follows:
K/9 = -23.6625 + (Fastball Velocity*0.3559) + (Horizontal Movement*0.0599) + (Fastball %*-0.0313)
The formula is significant at a level of p<0.03, with an adjusted R-squared of 0.286, meaning 28.6% of the variance in K/9 can be explained by this model. An obvious flaw of this analysis is that to determine “stuff” I am only looking at fastball data. It is outside of my abilities to incorporate multiple pitches into the analysis. I figured a proxy for non-fastball stuff would be Fastball % and the number of pitches in a pitcher’s repertoire (defined as pitches a pitcher throws more than 10% of the time). Running these two input variables against the output data showed that Fastball %, but not Pitch Repertoire, had a significant relationship with the output data. This was the logic for including it in the formula.
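
As a quick way to play with the K/9 formula, here is a small Python sketch that plugs in the printed coefficients. The function name `expected_k9` and the example pitcher (92 mph, 6 inches of horizontal movement, 60% fastballs) are hypothetical. It assumes velocity is in mph, movement is in inches, and Fastball % is entered in percentage points (60 rather than 0.60), which is what the per-10-unit effects quoted above imply.

```python
def expected_k9(fastball_mph, horizontal_in, fastball_pct):
    """Expected K/9 from the article's printed regression coefficients.

    Units are assumptions: mph, inches, and percentage points (0-100).
    """
    return (-23.6625
            + 0.3559 * fastball_mph
            + 0.0599 * horizontal_in
            - 0.0313 * fastball_pct)

# Hypothetical pitcher, not taken from the sample.
example = expected_k9(92.0, 6.0, 60.0)
print(round(example, 2))  # about 7.56

# Effect of each input, holding the other two fixed.
gain_10mph = expected_k9(100.0, 6.0, 60.0) - expected_k9(90.0, 6.0, 60.0)
gain_10in = expected_k9(92.0, 16.0, 60.0) - expected_k9(92.0, 6.0, 60.0)
gain_10pct_fewer = expected_k9(92.0, 6.0, 50.0) - expected_k9(92.0, 6.0, 60.0)
print(round(gain_10mph, 2), round(gain_10in, 2), round(gain_10pct_fewer, 2))
```

The three differences come out to roughly 3.6, 0.6 and 0.3, matching the effects described in the text.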

As with GB/FB ratio, I will now give a list of league leaders and losers in expected K/9 and Strikeout Savvy:

League Leaders in Expected K/9            League Losers in Expected K/9
Name              Real K/9    exK/9       Name              Real K/9    exK/9
CC Sabathia       8.98        8.25        Greg Maddux       4.43        3.45
Dustin McGowan    6.87        7.84        Livan Hernandez   3.36        3.51
Randy Johnson     8.72        7.61        Jamie Moyer       5.53        4.32
Tim Lincecum      9.38        7.60        Paul Byrd         3.88        4.76
Felix Hernandez   8.04        7.60        Mike Mussina      5.70        4.76

League Leaders in K/9 Savvy   League Losers in K/9 Savvy
Name              K/9 Savvy   Name              K/9 Savvy
Chad Billingsley  3.10        Zach Duke         -2.73
Jonathan Sanchez  2.65        Jesse Litsch      -2.66
Brandon Webb      2.64        Joe Saunders      -2.41
Ted Lilly         2.31        Jon Garland       -2.28
Edinson Volquez   2.21        Nick Blackburn    -2.19

BB%
Analysis showed that both horizontal fastball movement and Fastball % had a statistically significant (p<.03) effect on BB%. However, for two reasons I will not delve further into the analysis. Firstly, the adjusted R-squared showed that the model only explained 10.3% of the variance in BB%. Secondly and more importantly, I believe that this minimal amount of explained variance is due to a self-selecting sample error. A pitcher who is able to mix in non-fastball pitches and throws high-movement fastballs can get away with more walks because he is striking out more batters. A pitcher who is unable to mix up his pitches and throws low-movement fastballs, and also walks a lot of batters, is probably not in the Major Leagues. A self-selecting sample may undermine much of the other data above, but logic dictates (I do not know of a way to test for the error) that it is particularly vicious in this part of the analysis.

I also wanted to look at how my “Savvy” stats were related to age, but the same self-selecting sample error would arise. Instead it would be necessary to look at the change in the Savvy metric as an individual pitcher ages. I think that would be a lot of fun.

FIP
My method of analysis to investigate the relationship between “stuff” and FIP was draining. For a time I thought the best method would be to use the discussed “expected” peripherals as building blocks to construct FIP via the traditional FIP formula. Unfortunately I could not find a formula which used these variables in this exact form. Furthermore, I did not have a big enough sample of pitchers to make my own FIP metric. I tried but could only get a model with an adjusted R-squared of 0.25, which I thought was too low to move forward in this way. In the end I found it most effective to work straight from the Pitch F/X data. Perhaps this was another area in which my econometric skills failed me. With this behind us, I think the derived “Complete Savvy” metric is the most telling, if not from an adjusted R-squared perspective, then from a qualitative who-is-doing-the-most-with-the-least and the-least-with-the-most sort of perspective.

I determined that the inputs that affected a pitcher’s FIP were fastball velocity and vertical fastball movement. The analysis showed that 10 mph of fastball velocity decreased FIP by 1.06. 10 inches of fastball drop decreased FIP by 0.46. The FIP prediction formula can be expressed as follows:
FIP = 11.8224 + (Fastball Velocity*-0.0964) + (ln(Vertical Movement)*0.44397)
The formula is significant at a level of p<0.06, with an adjusted R-squared of 0.145, meaning 14.5% of the variance in FIP can be explained by this model. Below are the league leaders and losers in Expected FIP and Complete Savvy:

League Leaders in Expected FIP            League Losers in Expected FIP
Name              Real FIP    exFIP       Name              Real FIP    exFIP
Roy Halladay      2.88        3.46        Jamie Moyer       4.35        4.85
Brandon Webb      3.03        3.50        Barry Zito        4.84        4.75
Felix Hernandez   3.28        3.62        Livan Hernandez   4.48        4.69
Ubaldo Jimenez    3.88        3.63        Kenny Rogers      4.85        4.58
Josh Beckett      3.45        3.68        Mike Mussina      3.81        4.57

League Leaders in Complete Savvy    League Losers in Complete Savvy
Name                Savvy           Name                Savvy
Cliff Lee           1.85            Brett Myers         -1.54
Dan Haren           1.23            Vicente Padilla     -1.44
Justin Duchscherer  1.19            Brandon Backe       -1.36
Tim Lincecum        1.07            Oliver Perez        -1.34
John Danks          1.04            Daniel Cabrera      -1.26
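
The FIP formula can be sketched the same way. This is a rough Python illustration under several assumptions: that the vertical-movement term is the natural log of a positive number of inches, that the coefficients are exactly as printed, and that Complete Savvy is expected FIP minus real FIP (so beating your expected FIP gives a positive number, since lower FIP is better). The article does not spell out that last sign convention, and the names `expected_fip` and `complete_savvy` are hypothetical.

```python
import math

def expected_fip(fastball_mph, vertical_in):
    """Expected FIP from the printed coefficients.

    Assumes vertical_in is a positive movement value in inches and that
    the movement term is a natural log.
    """
    return 11.8224 - 0.0964 * fastball_mph + 0.44397 * math.log(vertical_in)

def complete_savvy(real_fip, ex_fip):
    """Assumed convention: expected minus real, so positive is good."""
    return round(ex_fip - real_fip, 2)

# Hypothetical pitcher: 90 mph fastball, 9 inches of vertical movement.
print(round(expected_fip(90.0, 9.0), 2))

# (name, real FIP, expected FIP) copied from the Expected FIP table.
table_rows = [
    ("Mike Mussina", 3.81, 4.57),
    ("Jamie Moyer", 4.35, 4.85),
    ("Barry Zito", 4.84, 4.75),
]
fip_savvy = {name: complete_savvy(real, ex) for name, real, ex in table_rows}
for name, value in fip_savvy.items():
    print(f"{name:<14}{value:+.2f}")
```

Under that convention Mussina comes out at +0.76, which is below the 1.04 needed to crack the top five and so at least does not contradict the leaderboard.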

Applications
Assuming that “Savvy” is a measure of the gap between talent and performance, it may be useful as a tool to identify underperforming players. This is not so much in the vein of FIP vs. ERA, where a larger sample size would fix the discrepancy. I think it would be more along the lines of needing an expert (e.g. sabermetrics-minded pitching coach Rick Peterson) to adjust a pitcher’s approach to the game in order to improve their Savvy and bring their performance in line with their talent. Perhaps the next step for this model (in addition to shoring up its robustness with a larger sample size) would be to keep “talent” inputs such as the discussed fastball velocity and movement, but also incorporate “approach” inputs which may also be components of savvy. I wonder if it is true that only by quantitatively identifying the inputs of “savvy” can it be taught.

Perhaps Savvy could be used in ways similar to BABIP for batters. Outliers in Pitcher Savvy (both compared to previous years for themselves and compared to all pitchers) may be due for a regression to the mean. On the subject of individual pitchers’ Savvy over the years, I would be curious to see if this is a repeatable skill, or simply random variance. I’m hoping it is a repeatable skill, otherwise I think that means that Savvy falls into the same bin “clutch” stats are in.

Shortcomings
All this data needs to be taken with a large grain of salt for two reasons. The first problem is the small sample size the model deals with. I am not sure how many years back the Pitch F/X data goes; I only took 97 starting pitchers from 2008 as chosen by Fangraphs. There is no reason (other than me not wanting to spend days compiling the data) that this sample could not be in the several hundreds if not the thousands. This would instantly make the data more robust.

The second major shortcoming is a dearth of input variables. There are many more metrics which are available but were not included due to inconvenience and my own inabilities. Off the top of my head I think there would be value in looking at metrics such as “release point variability” and “release point distance from home plate.” I have already discussed how this model does not take into account secondary stuff (think Mr. Hamels’ changeup). Furthermore, there is no input metric to gauge control (Strike % or Ball % may suffice) in the model. Some may argue that control is not a component of “stuff” and thus should be excluded from a model that looks at the relationship between “stuff,” “savvy” and peripherals. Where exactly does control fall? Is it a physical skill or more an issue of a pitcher’s approach to the game? I am not exactly sure, but there would be no harm in plugging control variables into the model and seeing what the computer gods spit out.

Because of these problems with the model it is difficult to say how much of the variance between the expected and real life outcomes is due to “Savvy,” how much is due to missing inputs which would prove statistically significant, and how much is due to natural random variance. Because of this it would be presumptuous to call the various “Savvy” metrics and regression models I have presented anything more than pilots. They show the possibilities of analysis that Pitch F/X makes available, but fall far short of the finish line. Well, I also think they are kind of fun to play with. I have only presented the top and bottom five pitchers in each category in this article. It is fun to look up players, and see how your expectations match the results of the model. An example of this is Jamie Moyer’s last-in-league expected FIP matching his beer league “stuff.”

I have one semester of Econometrics under my belt as my statistical training. I’m sure I overlooked a lot. If nothing else, I hope that someone more skilled and with more time than I have will pick up this article and improve on its content. (Well, that’s if I do not have a message in my inbox from Mr. Wilpon.) Not to knock The Hardball Times but I do not find their Anatomy of Pitcher series to be very interesting. Instead I think this is the future of the Pitch F/X data: to roll the building blocks back a level from FIP, and look at what factors determine pitching peripherals.

5 comments:

Sal said...

Perfect? Goodness no. On the right track? Most definitely. This is a great, great step in the right direction. I'm happy that someone took off on my suggestion.

Keep up the good work!

Sal Baxamusa

Paul Calluzzo said...

Thanks Sal. I am going to incorporate some recommendations from tango to try to improve the model.

Triumph said...

can you give a simple account of how these formulas were derived?

interesting stuff here, great work. i do miss hearing about willie randolph sucking though.

Paul Calluzzo said...

Multivariable regression using the Regression feature in the Data Analysis menu of Excel. Essentially you make a spreadsheet with a column for each stat, click the button, and the computer does it all for you.

Anonymous said...

I would like to suggest ditching excel and using more powerful statistical software if you're interested in coming up with the most accurate regression equations. There are documented problems with the way excel handles regression (http://www.cs.uiowa.edu/~jcryer/JSMTalk2001.pdf)

R is a very powerful open source tool. It has a rather steep learning curve, but it should produce more accurate results.