Mar 18, 2014; Fort Myers, FL, USA; A view of the stadium before the game between the Minnesota Twins and the Tampa Bay Rays at Hammond Stadium. Mandatory Credit: Jerome Miron-USA TODAY Sports

Using Aggregate Statistics to Perceive What Matters in Spring Training


At this point, we know the drill. It is inevitable that we are going to look at the spring training statistics of the players on our favorite teams, but we realize in the back of our heads that those numbers make little difference. First off, they come from a small sample. Looking at how a player does in a few games rarely tells us how good he will be in the long term. Secondly, pitchers take the mound not looking to retire hitters but to work on particular pitches, and that makes their numbers suffer while giving hitters an easier time. How much can we really take from numbers that pitchers compile almost incidentally as they get ready for the season? Let’s attempt to figure that out. There are a lot of things in spring training that are meaningless, but hidden within them are some things that could truly make a difference. To understand spring training, we have to understand the numbers in context.

The past couple of years, Baseball-Reference has compiled all the spring training statistics on one handy page. I pasted this year’s spring training data (entering Saturday’s games) into Excel and went to work comparing the aggregate numbers of spring training to the 2013 regular season. The results were quite interesting.

In the major leagues in 2013, the league average was a .253/.318/.396 line. In this year’s spring training, that has jumped to .265/.330/.406. Do those numbers really tell us something? Using a two-proportion Z-test, the differences in all three slash stats are statistically significant, with batting average and OBP being exceedingly unlikely to fluctuate that much by chance. The probability of such a difference in batting average occurring by chance alone is just .0000769 (think 13,000 to 1 odds), but the probability for OBP was actually even lower at .0000561 (18,000 to 1). We will get to why that is in a minute. The difference in slugging percentage was a little more plausible, but .002 (500 to 1) still means that we are over 99% confident that we did not get that result by luck. It is clear immediately that spring training is a more hitter-friendly environment than the regular season. To view it from the opposite perspective, pitchers had a 3.87 ERA and a 4.18 run average in 2013, but those marks go up to 4.43 and 4.96 this spring. I have not seen the differences quantified like this before, but the results certainly make sense. Pitchers are working on their arsenals while hitters are hitting like usual. But we can go into much more depth than that.
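For anyone who wants to try the same kind of comparison, here is a minimal sketch of a two-proportion Z-test in Python. The function names are my own, the counts in the example are made up purely for illustration (they are not the league totals), and I am assuming a two-sided test, which the analysis above does not specify.

```python
import math


def two_proportion_z_test(successes_a, trials_a, successes_b, trials_b):
    """Return (z, two-sided p-value) for the null hypothesis that
    both groups share the same underlying rate."""
    rate_a = successes_a / trials_a
    rate_b = successes_b / trials_b
    # Pooled rate under the null hypothesis of no real difference.
    pooled = (successes_a + successes_b) / (trials_a + trials_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / trials_a + 1 / trials_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


def one_in(p_value):
    """Express a probability as 'about 1 in N' (the '13,000 to 1' style)."""
    return 1 / p_value


# Hypothetical counts: 45 hits in 100 at-bats versus 55 hits in 100 at-bats.
z, p = two_proportion_z_test(45, 100, 55, 100)
print(f"z = {z:.3f}, p = {p:.4f}, roughly 1 in {one_in(p):.0f}")
```

With real data, the first pair of arguments would be something like hits and at-bats for the regular season and the second pair the same totals for the spring.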

This spring training, batters have walked just a touch more than they did last season (8.1% versus 7.9%), but have struck out quite a bit less, just 18.6% compared to 19.9%. The difference in walk rate is not even statistically significant. As pitchers have worked on things like their fastball command, they have sacrificed some whiffs but done a solid job staying around the zone. But of course pitchers are also working on secondary pitches that are often thrown out of the zone. How could their walk rate be so similar with that in mind? The answer is that the walk rate presented above is quite misleading. In the regular season, there are intentional walks plus plenty of hitters who are pitched around. In spring training, the situations do not matter nearly as much, and there have been just 6 intentional passes given out in total between the Grapefruit and Cactus Leagues. The difference in unintentional walk rate is more staggering: 8.1% this spring versus just 7.4% in 2013, and that is a significant result (p=.000034). That helps explain where our unlikely result for OBP came from. With that in mind, a great walk rate in spring training is not nearly as impressive as it might appear.
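The adjustment for intentional walks is simple to sketch. The helper below is hypothetical, the counts are invented for illustration, and dividing by total plate appearances is my assumption about the denominator rather than something spelled out above.

```python
def walk_rates(walks, intentional_walks, plate_appearances):
    """Return (overall walk rate, unintentional walk rate) per plate appearance."""
    overall = walks / plate_appearances
    # Remove intentional passes so only walks the pitcher "earned" remain.
    unintentional = (walks - intentional_walks) / plate_appearances
    return overall, unintentional


# Hypothetical counts: 80 walks, 5 of them intentional, in 1,000 plate appearances.
overall, unintentional = walk_rates(walks=80, intentional_walks=5, plate_appearances=1000)
print(f"{overall:.1%} overall, {unintentional:.1%} unintentional")
```

When intentional walks are nearly absent, as in spring training, the two rates are almost identical, which is why the overall rates look closer than the unintentional ones.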

Aside from the control factor, there is the matter of defense. The quick summary: it is quite bad. Hitters’ batting average on balls in play jumps from .297 last season to .310 this spring, another statistically significant result that explains much of the difference in batting average. If that wasn’t bad enough, we get to doubles and triples. In 2013, 4.4% of plate appearances ended in a double. This spring, that has gone up to 5.0%. For triples, it goes from .4% of PAs all the way to .7%, nearly doubling. The difference in doubles was significant, but the probability of that big a difference in triples occurring by chance is just .000000000543 (nearly two billion to 1). The bottom line is that the defense in spring training is far inferior to regular season standards. Home runs, though, are where things get interesting.
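For reference, batting average on balls in play is usually computed as (H − HR) / (AB − K − HR + SF). Here is a small sketch of that standard formula; the function name and the sample counts are hypothetical and are not taken from the spring data.

```python
def babip(hits, home_runs, at_bats, strikeouts, sacrifice_flies):
    """Batting average on balls in play: hits that stayed in the park
    divided by at-bats that ended with a ball put in play."""
    balls_in_play = at_bats - strikeouts - home_runs + sacrifice_flies
    return (hits - home_runs) / balls_in_play


# Hypothetical line: 150 hits (20 homers) in 550 at-bats, 100 strikeouts, 5 sac flies.
example_babip = babip(hits=150, home_runs=20, at_bats=550, strikeouts=100, sacrifice_flies=5)
print(f"BABIP: {example_babip:.3f}")
```

Because home runs and strikeouts are removed, the stat isolates what happens once the fielders are involved, which is why it works as a rough gauge of defense here.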

We mentioned before how slugging percentage did not differ as significantly as batting average and OBP. What barely differed at all, though, was isolated power, which was actually slightly higher in the regular season (.143) than it has been this spring (.141). The reason for that is the fly balls that the outfielders don’t have a chance at: the home runs. After pitchers allowed 1.0 homers per 9 innings last season, that has dropped to just 0.82 this spring. The home run rate was actually significantly higher in the regular season, providing us with a stat that has a better chance of meaning something. For instance, pitchers can be expected to be hit around to an extent, but if they are allowing home runs in an environment that depresses them, that is not a good sign.
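The isolated power figures follow directly from the slash lines quoted earlier, since isolated power is slugging percentage minus batting average. A quick check (the function name is mine):

```python
def isolated_power(slugging, batting_average):
    """Isolated power: extra bases per at-bat, i.e. SLG minus AVG."""
    # Round to three places, the precision at which slash stats are quoted.
    return round(slugging - batting_average, 3)


regular_season_iso = isolated_power(0.396, 0.253)  # 2013 regular season line
spring_iso = isolated_power(0.406, 0.265)          # this spring's line
print(regular_season_iso, spring_iso)
```

Batting average rose by more than slugging did, so the gap between them shrank slightly even though hitters were doing better overall.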

One last thing to examine is stolen bases. The major league success rate on stolen base attempts was 72.8% in 2013. This spring, it is just 70.2%. That is not quite statistically significant (.084 probability), but it appears that runners are being less selective on the basepaths as they try to show their teams that they can swipe the occasional bag. To support that more strongly, I looked at stolen bases divided by singles, doubles, walks, and hit-by-pitches to get a rough idea of how often runners elected to steal. That percentage went up from 7.0% in 2013 to 8.9% this spring, yielding a .0000441 probability (nearly 23,000 to 1) of occurring by chance. Bottom line, it might be a little more impressive for a runner to steal successfully, but don’t read too much into a non-basestealer who makes a couple of attempts or a speedy prospect who gets caught a few times.
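Both stolen base measures can be sketched in a few lines. The function and the counts below are hypothetical. I am also assuming that the numerator of the "how often runners ran" measure counts every attempt (steals plus caught stealing), which fits the idea of runners electing to steal but is not stated outright above.

```python
def steal_rates(stolen_bases, caught_stealing, singles, doubles, walks, hit_by_pitch):
    """Return (success rate on attempts, attempt rate per rough opportunity)."""
    attempts = stolen_bases + caught_stealing
    success_rate = stolen_bases / attempts
    # Rough count of times a runner reached first or second base.
    opportunities = singles + doubles + walks + hit_by_pitch
    attempt_rate = attempts / opportunities
    return success_rate, attempt_rate


# Hypothetical totals, chosen only to make the arithmetic easy to follow.
success_rate, attempt_rate = steal_rates(
    stolen_bases=70, caught_stealing=30,
    singles=600, doubles=150, walks=200, hit_by_pitch=50,
)
print(f"success {success_rate:.1%}, attempts per opportunity {attempt_rate:.1%}")
```

The denominator is only a proxy: it ignores runners who reach on errors or fielder's choices and counts runners who had no open base ahead of them.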

From the statistics, we can see that the major differences between the regular season and spring training are worse defense, more walks, fewer strikeouts, and fewer home runs. To find the players whose spring numbers are most likely to mean something, we should be looking for the hitters who show home run power at the plate to go along with a solid walk rate. The home runs stand out in the spring, but if those hitters fail to walk in a league where pitchers are walking quite a few more batters than normal, then your excitement has to wane. On the pitcher side, meanwhile, allowing home runs is a major issue, but doubles and triples can be mostly ignored. While we talked about unintentional walk rate increasing in spring training, pitchers competing for spots as opposed to working on their repertoires may have an advantage in the walk department because they are locked in but do not need to pitch around anyone. With that in mind, don’t put too much weight on a pitcher walking fewer batters; it is really the strikeouts that matter more. A combination of strikeouts, few home runs, and not too many walks could be the right combination to turn a spring training sensation into an impact major league pitcher.

It is not profound to say that the hitters with power and plate discipline and the pitchers with great stuff and command are the most likely to succeed–the statistics are directing us to the mindset we should have had all along. With an understanding of how spring training works, look at the stats but home in on the trends most indicative of future success as you try to sort through what it all means.


Tags: Statistical Analysis Tampa Bay Rays

  • Baltar

    Robbie, I have seen somewhere on the web in previous years that there are at least three spring training statistics that correlate somewhat to the following season’s performances: team record, pitcher K’s, pitcher BB’s.
    Do you have any knowledge of that?

    • Robbie_Knopf

      For team record, the distribution isn’t linear–the poor records and the great records might mean something, but .500 means little. In any event, the correlation is not very strong.

      For K’s and BB’s, that is not the data I was looking at here–rather than going player by player, I was looking at the totals for spring training and the regular season. That being said, this data would tell us that strikeouts are especially notable while walks depend more on the situation. For a pitcher working on his arsenal, we can give him more leeway with regard to walks, but if we are talking about a player competing for a spot, low walk totals may not be indicative of much and high walk totals are more concerning. If I had the time, it would be very interesting to separate pitchers into those two categories and look at strikeout rates, walk rates, and everything else from there.

      A good example for strikeouts and walks is Cesar Ramos. His 7.2 K/9 is nice but not extremely impressive (that’s both his career number and the spring training average), while his 0.7 BB/9 is misleading because he does not have to work around hitters.