Last week we examined the Batting Average on Balls in Play (BABIP) of Astros' starting pitchers. Now we will evaluate the BABIP of Astros' hitters. In both instances, we use BABIP to tell us how much a player's results-oriented stats this year (like ERA and batting average) reflect good or bad "luck" on batted balls that turn into hits.
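For readers who want the definition in concrete terms, here is a minimal sketch of the standard BABIP calculation, (H - HR) / (AB - K - HR + SF). The function name and the stat line in the example are hypothetical and chosen only for illustration; they do not describe any Astros player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    # Balls in play exclude strikeouts and home runs but include sacrifice flies.
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line (not a real player): 150 H, 20 HR, 500 AB, 100 K, 5 SF
example = babip(hits=150, home_runs=20, at_bats=500, strikeouts=100, sac_flies=5)
print(f"{example:.3f}")
```

Home runs are removed from both the numerator and the denominator because they are never fielded, which is why BABIP isolates what happens to balls the defense has a chance to play.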
Evaluating hitters' BABIP is not as straightforward as evaluating pitchers' BABIP. According to accepted sabermetric principles, batters have more control over BABIP than pitchers. We assume that a pitcher's BABIP will regress to a value that falls within a relatively narrow range around the league average. The range of hitters' expected BABIP is much larger because BABIP is driven by the diverse range of hitting skills at the major league level.
Like the evaluation of pitchers in last week's sabermetric article, we can estimate an "expected BABIP" (xBABIP) for each hitter's performance so far this season. The xBABIP method relies upon batted ball data to provide a benchmark for evaluating whether the batter's hits and outs on balls in play appear to be "lucky" or "unlucky." Comparing actual BABIP to the xBABIP benchmark lets us judge whether the hitter under or over performed during the period in question. An overperformance can be a red flag that the player's batting may regress downward in the future. Conversely, an underperformance tells us, at the least, that the hitter has a good chance of improving his batting average with normal regression to the mean.
Hitters' BABIP can be extremely volatile from month-to-month and year-to-year. For that reason, xBABIP is helpful in giving us an indication whether a hitter's slumping or skyrocketing batting stats are "real" or not. Obviously the stats are real in that they happened and helped or hurt the team win games. But BABIP "luck" can result in a mirage, leading us to believe that a hitter is better or worse than his actual skill level.
The early major league experiences of Chris Johnson and Jimmy Paredes provide Astros' fans with examples of putting too much trust in a player's BABIP. In Johnson's first extended call up, 2010, he hit for a .308 batting average and a 119 wRC+, leading General Manager Ed Wade to pencil him in as a future franchise player for the Astros.
Oh, but wait, Chris Johnson's BABIP was .387 in 2010, and we can't expect that performance to be the norm.
In the next year, Johnson's BABIP declined 70 points, and his batting average fell to .251 with an 81 wRC+. Ed Wade was disappointed, but should we have been surprised? (No.) You can make the same comparisons of Paredes' 2011 (.383 BABIP) and 2012 (.255 BABIP) seasons. These were the same players, with the same skills, in both seasons. The xBABIP concept assists us in separating the real from the mirage.
For estimating xBABIP I have used the most recent modification of the formula used by Jeff Zimmerman (with the help of Robert Boden) at Fangraphs. This formula uses ground balls, fly balls, infield hits, infield pop ups, line drives, bunt hits, and home runs as data inputs for estimating a player's xBABIP. I rely upon the average weightings applicable to those inputs for the period 2009-2012 to estimate xBABIP for the 2013 performance. Because the selection of an xBABIP formula is somewhat subjective, I have also used the older Hardball Times (HT) xBABIP formula, which followed up on the groundbreaking HT article on BABIP by Chris Dutton and Peter Bendix. Utilizing both formulae provides us a range of expectations.
In the table below, the Astros hitters' BABIP so far this season is compared to the xBABIP for the same period. The Fangraphs formula is xBABIP(1) and the Hardball Times formula is xBABIP(2). The difference between BABIP and xBABIP is shown as a negative percentage for an under performance (xBABIP higher than BABIP) and a positive percentage for an over performance (xBABIP lower than BABIP). Note that I have excluded hitters who have a small number of at bats (such as Paredes and Maxwell).
| Player | BABIP | xBABIP(1) | xBABIP(2) | Over / (Under) Performance(1) | Over / (Under) Performance(2) |
|---|---|---|---|---|---|
| Altuve | 0.326 | 0.342 | 0.338 | -4.7% | -3.6% |
| Dominguez | 0.245 | 0.274 | 0.295 | -11.7% | -20.2% |
| Castro | 0.333 | 0.348 | 0.336 | -4.3% | -0.8% |
| Barnes | 0.364 | 0.348 | 0.328 | 4.5% | 10.0% |
| Pena | 0.285 | 0.318 | 0.338 | -11.8% | -18.7% |
| Carter | 0.326 | 0.307 | 0.313 | 5.7% | 3.9% |
| Martinez | 0.313 | 0.336 | 0.330 | -7.0% | -5.3% |
| Gonzalez | 0.261 | 0.304 | 0.319 | -16.4% | -22.3% |
| Cedeno | 0.329 | 0.352 | 0.317 | -6.9% | 3.7% |
| Corporan | 0.379 | 0.376 | 0.349 | 0.9% | 8.0% |
Notes: (1)= Fangraphs xBABIP (2)= Hardball Times xBABIP
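The over/under percentages can be approximated from the BABIP and xBABIP columns. The sketch below is illustrative only: the function name is hypothetical, and dividing the gap by xBABIP is an assumption, since the article does not state which denominator was used. Because the table values are rounded to three decimals, some rows will not reproduce the published percentages exactly.

```python
def over_under_pct(babip, xbabip):
    """Percent gap between actual and expected BABIP.

    Negative means under performance (xBABIP higher than BABIP).
    Assumption: the gap is divided by xBABIP; the article does not
    state the denominator it used.
    """
    return (babip - xbabip) / xbabip * 100.0


# Altuve's row from the table: BABIP .326, xBABIP(1) .342, xBABIP(2) .338
altuve_fg = over_under_pct(0.326, 0.342)
altuve_ht = over_under_pct(0.326, 0.338)
print(f"Fangraphs: {altuve_fg:+.1f}%  Hardball Times: {altuve_ht:+.1f}%")
```

The sign alone is the robust part of the calculation: it depends only on which of the two numbers is larger, not on the choice of denominator.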
Notice that the two xBABIP formulae show the same direction of performance (i.e., under or over performance) in each case except for Ronny Cedeno. Cedeno under performed xBABIP according to the Fangraphs formula, but over performed xBABIP based on the HT formula. Excluding Cedeno, six of the remaining nine Astros' hitters under performed xBABIP. Corporan, Carter, and Barnes have over performed xBABIP so far this year.
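The agreement check described above can be sketched in a few lines. The BABIP and xBABIP figures are copied from the table; the function and variable names are hypothetical, and the code only compares directions, not magnitudes.

```python
# (BABIP, xBABIP(1) Fangraphs, xBABIP(2) Hardball Times), copied from the table
HITTERS = {
    "Altuve": (0.326, 0.342, 0.338),
    "Dominguez": (0.245, 0.274, 0.295),
    "Castro": (0.333, 0.348, 0.336),
    "Barnes": (0.364, 0.348, 0.328),
    "Pena": (0.285, 0.318, 0.338),
    "Carter": (0.326, 0.307, 0.313),
    "Martinez": (0.313, 0.336, 0.330),
    "Gonzalez": (0.261, 0.304, 0.319),
    "Cedeno": (0.329, 0.352, 0.317),
    "Corporan": (0.379, 0.376, 0.349),
}


def direction(babip, xbabip):
    """Label a hitter as over or under performing a given xBABIP."""
    if babip > xbabip:
        return "over"
    if babip < xbabip:
        return "under"
    return "even"


def split_by_agreement(hitters):
    """Separate hitters whose two xBABIP estimates agree on direction."""
    agree, disagree = {}, []
    for name, (actual, x_fg, x_ht) in hitters.items():
        d_fg, d_ht = direction(actual, x_fg), direction(actual, x_ht)
        if d_fg == d_ht:
            agree[name] = d_fg
        else:
            disagree.append(name)
    return agree, disagree


agree, disagree = split_by_agreement(HITTERS)
under = sorted(name for name, d in agree.items() if d == "under")
over = sorted(name for name, d in agree.items() if d == "over")
print("Formulas disagree:", disagree)
print("Under performers:", under)
print("Over performers:", over)
```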
Dominguez, Pena, and Gonzalez are the biggest under performers of xBABIP. Gonzalez has suffered through a miserable offensive season, with an OPS below .600. So, it's not surprising that he has been a major under performer of xBABIP.
Pena has been somewhat productive, but his OBP and power have been lower than expected. Both BABIP models estimate a much higher xBABIP (.318 to .338) than his .285 actual BABIP. This may provide some hope that Pena's batting will improve in the future. However, Pena's vulnerability to defensive shifts may be one of the reasons for under performing xBABIP. Defensive shifts typically reduce a player's BABIP by about .013. In Pena's case, the xBABIP differential is more than three times the average effect of defensive shifts; thus, bad luck remains a possible explanation for his BABIP under performance.
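To put Pena's gap in perspective, the sketch below expresses it as a multiple of the typical shift effect cited above. The names are hypothetical, and the inputs are the rounded table values, so the result is approximate: the two formulas bracket the "more than three times" figure, coming out at roughly 2.5 times for xBABIP(1) and roughly 4 times for xBABIP(2).

```python
SHIFT_EFFECT = 0.013  # typical BABIP reduction from defensive shifts, as cited in the article


def gap_in_shift_units(babip, xbabip, shift_effect=SHIFT_EFFECT):
    """Express the xBABIP shortfall as a multiple of the typical shift effect."""
    return (xbabip - babip) / shift_effect


pena_babip = 0.285  # from the table
for label, xbabip in (("xBABIP(1)", 0.318), ("xBABIP(2)", 0.338)):
    multiple = gap_in_shift_units(pena_babip, xbabip)
    print(f"{label}: gap is {multiple:.1f}x the typical shift effect")
```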
Dominguez has been productive on occasion at the plate, but his overall offensive performance has been poor, in part due to a very low BABIP. Although Dominguez has other parts of his offensive game which need work, such as his walk rate, there is a lot of room to increase his BABIP and thereby improve his batting average in the future. Perhaps an improvement in Dominguez's BABIP will allow him to push his OPS over the .700 threshold.
J.D. Martinez has experienced a moderate-to-high level of under performance. Martinez's current wRC+ (83) is disappointing, and any improvement in his BABIP would be welcome. A .325 BABIP supported Martinez's notable production (wRC+ 103) when he was first called up in 2011. His current xBABIP is higher than his BABIP during that 2011 ML campaign; perhaps this is an encouraging sign for an upturn in his offense.
Barnes has the highest over performance and largest potential for downward regression in BABIP. This isn't surprising; his BABIP has been regressing ever since he posted an unsustainable .483 BABIP in March/April. His monthly OPS has dipped below the .600 mark as his BABIP has regressed. Barnes' xBABIP indicates that he can sustain a relatively high BABIP, but his current 2013 BABIP remains higher than even the high xBABIP level.
Although Corporan has a high BABIP, his xBABIP is also surprisingly high. The potential for regression based purely on BABIP is fairly moderate. xBABIP is not a prediction of future performance, but instead focuses on whether the BABIP during a past time period is abnormal based on the batted ball distribution. Thus, xBABIP is not a prediction of the player's true talent level. In the case of Corporan, his xBABIP is quite high (over .346), but this is largely based on a high line drive rate (28%). Corporan's hitting skill may not sustain a line drive percentage that high, and if that's true, his performance may regress more than xBABIP would indicate.
Carter has moderately over performed his xBABIP. The extent of over performance is small enough that normally it might not be viewed as a concern. However, Carter's .231 batting average is already relatively low, and any additional decline in his BABIP might be a concern. A major reason for the apparent over performance is an above-average .794 BABIP on line drives. Typically line drive BABIP will regress toward the low .700s. However, there is evidence that power hitters can sustain a somewhat higher BABIP on line drives, probably because their liners are hit harder. And Carter does hit the ball hard. The fact that Carter also had an .833 BABIP on line drives in 2012 may indicate that he can normally sustain above-average batting averages on liners. At this point, we can't reach any firm conclusions.
Any surprises here?