
Judgmental Stats: ERA

A few years ago — well, yikes, it’s actually more than 15 years now — a guy named Voros McCracken developed a fascinating and counterintuitive theory. He couldn’t help but think that while pitchers have firm control on some parts of pitching (walks, strikeouts and home runs) they have much less control, if any, on balls actually hit into play.

As he worked on it, he found that the numbers backing his theory were even starker than he expected.  Understand that at that point, everyone believed — as many still believe — that great pitchers must give up fewer hits on balls in play than merely good or average or below average pitchers. It was more than belief, it was obvious fact. It went without saying that Greg Maddux or Randy Johnson were much less likely to give up hits on balls in play than, say, Marco Estrada or Jeremy Hellickson.

Of course, I didn’t just come up with Marco Estrada and Jeremy Hellickson by accident. Those two have the lowest Batting Average on Balls in Play for a season since 1990.

Lowest BABIPs among qualifiers since 1990:

Marco Estrada, 2015, .217

Jeremy Hellickson, 2011, .224

Chris Young, 2006, .230

Curt Schilling, 1992, .230

Zack Greinke, 2015, .232

Estrada, by the way, also has the seventh lowest BABIP season, just last year.
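For anyone who wants to check numbers like these, BABIP is conventionally figured as hits minus home runs, divided by at-bats minus strikeouts minus home runs plus sacrifice flies. Here is a minimal sketch of that arithmetic. The function name and the stat line are made up for illustration; the numbers do not belong to any pitcher listed above.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season allowed by a pitcher (illustrative numbers only).
example = babip(hits=150, home_runs=20, at_bats=700, strikeouts=180, sac_flies=5)
print(f"{example:.3f}")
```

Notice that strikeouts and home runs drop out of both the top and the bottom, which is exactly why the stat isolates what happens once the ball is in the field of play.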

As for Maddux, in 1999 his BABIP was .331, one of the highest marks in the last quarter century. But his BABIP one year earlier was .267, well below the league average. Who would have thought that Pedro Martinez in 1999, when basically unhittable, actually gave up a .325 average on balls in play? Numbers like those led McCracken to his controversial but fascinating conclusion that starting pitchers don’t have control of balls in play. Those balls are the stuff of luck and the Gods and the weather and defense.

As you no doubt know, this theory has remained controversial … and it has been extremely influential in various baseball circles. The incomparable Fangraphs began publishing a stat called Fielding Independent Pitching (FIP) and built its pitcher WAR statistic on the FIP framework; both work off the premise that a pitcher’s contribution to baseball comes down to strikeouts, walks and home runs. Fangraphs has made some adjustments to WAR (counting infield flies as strikeouts, for instance, and also treating relief pitchers a bit differently) but even now Fangraphs WAR does not take into account how many hits or how many runs the pitcher allows.
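To see how little goes into it, here is a rough sketch of the commonly published FIP formula. Treat the details as assumptions: the function name and the stat line are hypothetical, and the constant (set to 3.10 here) is recalculated each season so that league FIP lines up with league ERA.

```python
def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=3.10):
    """Fielding Independent Pitching, in its commonly published form:
    (13*HR + 3*(BB + HBP) - 2*K) / IP + constant.

    The constant is an assumed placeholder; it varies by season.
    """
    if innings <= 0:
        raise ValueError("innings must be positive")
    weighted = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return weighted / innings + constant


# Hypothetical pitcher season (illustrative numbers only).
example_fip = fip(home_runs=20, walks=50, hit_by_pitch=5, strikeouts=200, innings=200)
print(f"{example_fip:.2f}")
```

The function takes no argument for hits or runs allowed. That omission is the whole point, and the whole controversy.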

This leads some people to discount Fangraphs’ WAR … and others to embrace it wholeheartedly. I have a friend who loathes the very idea of FIP and Fangraphs WAR; at any given moment he will just start ranting: “You cannot separate what a pitcher does from what a defense does,” he shouts. “It all works together. You can’t just count strikeouts, walks and homers. You can’t separate a pitcher and defense!”

“Do you like ERA as a statistic?” I ask him.

“Yeah, it’s fine.”

“Well,” I say. “ERA tries to do exactly the same thing.”

Yes, ERA is the original FIP, the original effort to isolate what a pitcher does from what a defense does. Of course, ERA is quite a bit less elegant. I have told the story many times, but I’ll tell it again — the first baseball story I ever wrote in a newspaper, a four-paragraph story about a high school game, was read by my baseball-averse mother. “Good story,” she said. “But I have one question. Who are you to say that the team scored an unearned run? Aren’t you supposed to be unbiased?”

It’s one of my favorite stories (my mother, who has suffered through this blog forever and now knows WAY more about baseball than she would care to know, isn’t crazy about it), but really she was right. Calling any run “unearned” is just a silly judgment. Unearned how? Did someone round the bases and touch home plate? Yes? Then wasn’t the run earned? This is another one of those judgmental baseball stats that would look like madness in any other sport. Empty net goals count as goals. Of course they do. Catches made after the defensive back fell down count as catches. Of course they do. Breakaway dunks after stupid turnovers count as baskets. Of course they do.

And runs should count as runs. No adjustment would be easier than turning ERA into a true counting stat: Just make it Run Average. That’s it. End of story.

Even if you want to live in the fantasy world of ERA, there are two specific points that make ERA kind of ridiculous. Actually, there are two and a half points.

Point 1. If you want to play this imaginary game where pitchers do not deserve to be blamed for defensive miscues, well, how do you not also consider the opposite? Let’s say that the bases are loaded, and the pitcher grooves a fastball that gets hammered. The centerfielder rushes back, leaps at the wall and takes away the home run.

Shouldn’t the pitcher be charged with four runs? I mean, in this fantasy world we have created, the pitcher EARNED those runs.

Point 2. It really is a fantasy world that ERA creates, an alternate universe, sort of like in the movie Source Code, where you try to go back and imagine what the world would look like if the bomber were caught before he blew up the train. Think about this: If a fielder makes an error with two outs, every run for the rest of that half inning is considered unearned. The pitcher could give up six home runs, and none of them would count against his record because, of course, there’s an alternate universe where he is already out of the inning.

In football, this would be a bit like saying that every touchdown scored on a drive where a defensive penalty gave the team a first down is an unearned touchdown. Hey, if not for that penalty, they would have punted back on their own 35.
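The alternate-universe bookkeeping in Point 2 can be sketched in a few lines. This is a deliberately simplified model, not the full official scoring rule: it assumes only three kinds of events (outs, errors on plays that should have been outs, and home runs), and the function and variable names are invented for illustration.

```python
def score_half_inning(events):
    """Return (runs, earned_runs) for a simplified half inning.

    events: a sequence of "out", "error", or "homer".
    "error" means a misplay on what should have been an out,
    with the batter reaching base safely.
    """
    outs = 0            # outs actually recorded
    phantom_outs = 0    # outs in the error-free reconstruction
    runners = []        # one flag per runner: True if he reached without an error
    runs = earned = 0
    for event in events:
        if outs == 3:
            break
        if event == "out":
            outs += 1
            phantom_outs += 1
        elif event == "error":
            phantom_outs += 1       # in the alternate universe, this was an out
            runners.append(False)   # this runner can never score an earned run
        elif event == "homer":
            scoring = runners + [True]   # everyone on base, plus the batter
            runs += len(scoring)
            if phantom_outs < 3:         # inning not yet "over" in the reconstruction
                earned += sum(scoring)
            runners = []
        else:
            raise ValueError(f"unknown event: {event}")
    return runs, earned


# Two outs, an error, then six straight home runs before the third out.
nightmare = ["out", "out", "error"] + ["homer"] * 6 + ["out"]
print(score_half_inning(nightmare))
```

In this model the six-homer inning comes out as seven runs allowed and zero earned, because the reconstructed inning ended on the error.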

Point 2.5. If the pitcher is the one who makes the error that leads to the run, it’s STILL considered an unearned run. This has led to numerous Abbott and Costello routines:

Abbott: It’s an unearned run.

Costello: How is that an unearned run?

Abbott: Because somebody made an error. You can’t blame the pitcher for that.

Costello: But it was the pitcher who made the error.

Abbott: No. He was not a pitcher when he made that error.

Costello: What was he?

Abbott: Why, he was a fielder, of course.

Costello: Wait a minute, we’re talking about the same guy, right?

Abbott: Yes.

Costello: Same guy on the mound. Same guy who caught the ball.

Abbott: Yes.

Costello: So … why …

Abbott: No, why is in left field.

I’m only giving this half a point, though, because I’ve heard from various statisticians that if you are going to have unearned runs, you have to stay consistent, you have to count it as an unearned run even if it’s the pitcher’s error. I have no idea if this is true or not. It sounds to me like saying, “Well, if you are going to believe in unicorns, you have to make them purple.” But so it goes.

Now it should be said: We have grown so used to ERA that it has taken up residence in our minds. We get ERA. We are attached to it. Bob Gibson’s 1.12 ERA in 1968 looks right; his 1.45 RA does not. Nolan Ryan’s 3.19 career ERA feels right in a way that his 3.64 RA does not. We think of the quality start as six innings, three EARNED runs, when it certainly makes a lot more sense as six innings, three runs whether earned or unearned. ERA has been around so long that it’s hard to imagine life without it. I suspect we won’t ever change to RA, and I don’t think that’s any great tragedy.
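The gap between the two numbers is nothing more than which runs go on top of the fraction. Here is a small sketch that reproduces the figures above. The run and inning totals are not in this post; they are recalled from the public record (Gibson 1968: 304 2/3 innings, 49 runs, 38 earned; Ryan career: 5,386 innings, 2,178 runs, 1,911 earned) and are worth double-checking. The function name is invented.

```python
def per_nine(runs, outs_recorded):
    """Runs per nine innings, with innings expressed as outs recorded / 3."""
    if outs_recorded <= 0:
        raise ValueError("no outs recorded")
    innings = outs_recorded / 3
    return 9 * runs / innings


# Totals recalled from the public record (assumptions, not from the post).
gibson_outs = 304 * 3 + 2   # 304 2/3 innings in 1968
ryan_outs = 5386 * 3        # 5,386 career innings

gibson_era = per_nine(38, gibson_outs)   # earned runs only
gibson_ra = per_nine(49, gibson_outs)    # every run that crossed the plate
ryan_era = per_nine(1911, ryan_outs)
ryan_ra = per_nine(2178, ryan_outs)

print(f"Gibson 1968: ERA {gibson_era:.2f}, RA {gibson_ra:.2f}")
print(f"Ryan career: ERA {ryan_era:.2f}, RA {ryan_ra:.2f}")
```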

Still, after all these years I have to say: My mother was right. Who are we to judge if a run was earned or unearned?





66 Responses to Judgmental Stats: ERA

  1. Kevin says:

    I’m probably missing something obvious, but isn’t the logical conclusion of the points made here to eliminate errors as a statistic too? If we say that pitchers shouldn’t be statistically “protected” from fielding mistakes, is there any reason hitters should be statistically punished? I don’t see any points here about pitchers and unearned runs that can’t clearly be equally applied to hitters and errors.

    • Andy says:

      Yes, I would say that this series of articles boils down to: “eliminate errors.” But I think, besides being entertaining, what Joe is pointing out is how fully the concept of an error has permeated traditional stats.

      • invitro says:

        ‘Yes, I would say that this series of articles boils down to: “eliminate errors.”’ — Just to be clear, we’re not talking about eliminating errors as their own stat, only eliminating them from the formulae for BA, RA, etc.

      • Simon says:

        I think that this series also wants to make it clear that Traditional Stats are complicated, but we are so used to them that we don’t care. And that is partly why it is even more ridiculous when old timers rail against new-world stats like OBP, WAR, and FIP that are too darn complicated, not like good salt of the earth stats like Wins, ERA, and AVG – that each have elements that are strangely complex and arbitrary.
        What the 6 stats have in COMMON is that they are their era’s attempted measurement of: How Good is This Player? Is Player A better than Player B?
        Without computers or video replay – or even audio replay – errors are a pretty good rough-accounting to try to get at the truth of who is better. The thought process is not so different than what Voros McCracken did.

        Start with wins:
        Player A’s last season of Base Ball:
        Player B:

        Player A is better, right? But look. In that last year, Player A’s team finished 91-63 and scored 715 runs. Player B’s team went 66-85 and scored only 501 runs. Maybe Player B has a worse record because his team stinks? Maybe we should not care about factors outside of the player’s control?

        Let’s look at how many runs they gave up. Because if you switched their teams, and they gave up the same runs, the win loss record would be “corrected”, right?
        Player A gave up 100 runs
        Player B gave up 92 runs, in more innings pitched.
        So Player B is better, now? What if Player A pitched in front of a bad defense, so those runs were not his fault?

        We determined that 33 of Player A’s runs were not his fault, but Player B had even more, 36 of those. Player A earned 1.89 runs per 9 innings. Player B earned only 1.36.

        Now old chap, let’s consider their Wins Above Replacement…

        I’ll take Walter Johnson’s 1910 over Christy Mathewson’s.

    • Donald A. Coffin says:

      In fact, that is precisely part of his discussion of batting average.

  2. Charlie B says:

    I understand the point about FIP, but after reading George Will’s Men at Work and the section with Cal Ripken Jr talking about defensive positioning I also understand that there are more nuances involved here.

    I think the big problem is that we try to put individual statistics into what is really a team attempt to get batters out.

    Couldn’t really see how to do this online, but I’d love to see a breakdown of fielding percentage based on the pitcher at the time and how different that might be for different starters.

  3. Benjamin Wildner says:

    Oh come on. You can not do this series and then refer to quality starts without taking a shot at them.

  4. Hamster Huey says:

    I’ve always been curious about SlgBIP. Are some pitchers better than others at not allowing doubles (and triples, though those will be rare)? Is this more reproducible by pitcher than BABIP? It seems like that’s a better test of the pitchers’ control over hard-hit balls in play than BABIP. I mean, we know BA is a flawed stat. Alas, I’m not one to be able to go through and calculate SlgBIP, so I resort to asking the BRs here. (There are also now some attempts to actually measure the speed of the batted ball, which might be an even more direct measure of a pitcher’s skill in non-K/BB at-bats, though it’s way too early yet to know if pitchers have control over this, let alone reproducible control year-to-year.)

    • Donald A. Coffin says:

      You can actually do this fairly easily now, since BBRef has changed its formatting of some of the data. They now have for pitchers a table which shows essentially the same data as in the standard presentation of a hitter’s data. Click on Finders & Advanced Stats and scroll down. For example, Kershaw, in his career, gives up 2B+3B in 8% of his (PA-1B-HR-K-BB-HB). It varies from 6.8% (2011, 2013) to 10.8% (2008, his rookie season) and 9.3% (2014).

    • Go Indians says:

      SlgBIP probably has a strong correlation to FB/GB ratio since fly balls result in extra base hits far more often than ground balls. Separating any additional difference between pitchers might be difficult due to SSS issues.

    • Rob Smith says:

      Good point. There’s a big difference between an infield single and a base clearing double off the wall. But if the ball is in play, your outfielders have a part in the outcome. If you have Andruw Jones in centerfield, circa 2000, maybe he catches it. If you have Garret Anderson out there, it might be a triple.

      • Marc Schneider says:

        Everyone should know how random much of baseball is anyway. A pitcher hangs a slider and the hitter just misses it and hits a long fly ball. It would be interesting if one could figure out how many mistakes a pitcher makes that become outs; ie, sort of a luck factor, either because of having Andruw Jones instead of Garret Anderson or just being plain lucky by having the hitter miss the pitch. I remember when I was a kid you would read about these pitchers in Spring Training who were putting up great numbers, but when you actually saw them, you realized people were hitting shots all over the place that were being caught.

        • Richard says:

          World Series have turned on chance factors…. A ground ball hits a lump in the dirt, and bounces away from a fielder. A gust of wind from a squall line passing through the area keeps a fly ball from just going over the wall.

          A cloud of insects upsets the pitcher….

        • Rob Smith says:

          I think random is the wrong word. In MLB, the infields and outfields are normally pristine. So, while bad hops occur occasionally, they don’t impact anything significant. And it’s not random having Andruw Jones in the outfield. That’s a decision by a team with the knowledge that he’s going to track down a lot of balls. As far as missing hanging sliders, that happens, of course. But if you hang a lot of sliders, MLB hitters won’t continue to miss them indefinitely. In a small sample size, like one game, a pitcher can get pretty lucky or unlucky on their pitches getting hit or not hit. Over the course of a season, and certainly a career, these things even out to the point of irrelevance. Your Spring Training example of someone putting up “great numbers” is the ultimate in small sample size. A front line starting pitcher might log 15-20 actual innings, or roughly 3 regular season starts, at the most. If the pitcher continued to have shots hit off him all over the place, his ERA would reflect that over a larger inning count.

          • invitro says:

            “Over the course of a season, and certainly a career, these things even out to the point of irrelevance.” — I recall reading many studies by lots of different people that say they definitely do not even out over a season, and often not over a career.

            Bad hops is not the issue. The issue is that if you swing a hundredth of a second earlier or later, or a millimeter higher or lower, the ball location is changed by several feet. Hitting a baseball is an unstable event, or a chaotic event if you like that term: tiny differences in input have huge differences in output. This is the general way that randomness is produced in nature at the macroscopic level, and it’s no surprise that it produces a lot of randomness in baseball.

      • Hamster Huey says:

        To Rob Smith: Sure, I’m not wondering if SlgBIP will be completely defense-independent, only if it will become pitcher-dependent (in addition to defense-dependent). You could imagine that pitchers can control (i.e. pitcher ability will play a deterministic role in) how hard a ball in play is hit, and that how hard a ball is hit will impact how likely a defense is to be able to turn a ball in play into an out. This would have been my default assumption, but the BABIP / Voros McCracken (aside: what an improbably amazing name) realization seems to be that no, pitchers have no control over balls in play becoming hits. (I don’t follow this area closely, but BR comments elsewhere seem to suggest that hitters don’t either? i.e. it’s almost all defense? or randomness?) This would in turn mean that either how hard a ball is hit does NOT affect its likelihood of turning into an out, or that pitchers have NO effect on how hard a ball (in play) is hit, or that randomness / noise or defensive ability have so much more effect than the other two parameters that no reproducible trends can be seen in those two. (And if it was defensive ability, shouldn’t pitcher BABIPs trend by team? Do they?) Anyway, my question about SlgBIP is really tied to my inability to let go of those expectations, and the thought that Slg should correlate better with hard-hit balls than BA. In other words, just as SABR sorts were embracing alternatives to BA, they were also embracing this counterintuitive statistical realization that was based on BA. Shouldn’t we be able to do better?
        To Donald A. Coffin (above): thanks for the tips! If I find some time this might be a good incentive to expand my data wrangling skills, currently in their infancy.

  5. invitro says:

    “I suspect we won’t ever change to RA” — I changed to RA a couple of years ago. Anyone can, even without express written approval of Major League Baseball. 🙂

    • SDG says:

      ERA was always the best of the traditional stats. It measures what pitchers are actually supposed to do – prevent runs. It has flaws (reliance on defense, albeit not as much as wins, doesn’t handle relief pitchers with runners on, etc) but not nearly as much as other stats which only get at pitcher value by proxy.

      Besides, they’re comparative. I agree that earned runs specifically are dumb, for all the reasons we’ve talked about. But if every pitcher is judged by the same standards over the course of a career and era it should balance out. Pitch enough innings and you’ll have both Ozzie Smith and Bill Buckner behind you.

      • invitro says:

        Fun fact: Bill Buckner was actually an above-average fielder. At least bb-ref’s Rfield says so, which has him at +12 runs for his career. Now, almost all of the above-average comes from his 1970’s LAD years… he was a hair under average when with the Cubs and Boston.

  6. Bryan says:

    RA isn’t sufficient:
    If the starting pitcher leaves with the bases loaded and nobody out he should be charged with 2.37 runs. The reliever gets -2.37 and then is charged for runs that score as well as the run expectancy when he leaves if he doesn’t finish the inning.
    Starter: 2.37 runs
    Reliever1: -2.37 runs. He allows a sac fly while the runner on 2nd is held, so he is charged 1 run that scored and 0.92 expected runs, net -0.45 runs charged
    Reliever2: -0.92 runs. He allows a single that scores a run and then walks the bases full, so he is charged 1 run that scored and 1.57 expected runs, net 1.65 runs charged
    Reliever3: -1.57 runs, induces a double play, net -1.57 runs charged
    Currently the starter is charged with 2 runs and each of the relievers with 0. If you use RA to evaluate the pitcher you’re also evaluating the effectiveness of his relievers handling his inherited runners. Assigning runs this way also serves well to evaluate “holds” or “fireman saves.” Currently, if the reliever enters with the bases loaded and no outs, he is charged with 0 runs whether he allows 3 runs (barring a fielder’s choice or similar) or 0 runs. Now when Goose comes in and puts out the fire he gets the far more appropriate credit of -2.37 runs, while allowing all of them to score is still mostly blamed on the starter but 0.63 runs are charged to Goose.
    It provides little value to try to get people to stop using deeply entrenched ERA if you’re just going to replace it with RA since that still falls well short of even a basic evaluation of reliever performance although RA is a pretty clear upgrade for evaluating starters over ERA.
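Bryan's bookkeeping can be sketched as one subtraction per pitcher: runs that scored on his watch, plus the run expectancy he left behind, minus the run expectancy he inherited. The run-expectancy values below are the ones quoted in the comment. The function name is invented, and treating the starter's entry value as 0 is an assumption made only to match the comment's arithmetic.

```python
def runs_charged(re_on_entry, runs_scored, re_on_exit):
    """Runs charged = runs scored + expectancy left behind - expectancy inherited."""
    return runs_scored + re_on_exit - re_on_entry


# (name, run expectancy on entry, runs scored while pitching, run expectancy on exit)
stints = [
    ("Starter",   0.00, 0, 2.37),  # leaves bases loaded, nobody out
    ("Reliever1", 2.37, 1, 0.92),  # sac fly, leaves runners on 1st and 2nd
    ("Reliever2", 0.92, 1, 1.57),  # run-scoring single, then walks the bases full
    ("Reliever3", 1.57, 0, 0.00),  # double play ends the inning
]

charges = {name: round(runs_charged(entry, scored, exit_), 2)
           for name, entry, scored, exit_ in stints}
total = round(sum(charges.values()), 2)

for name, charge in charges.items():
    print(f"{name}: {charge:+.2f}")
print(f"Total: {total:.2f}")

# Goose enters with the bases loaded and nobody out.
goose_escapes = round(runs_charged(2.37, 0, 0.0), 2)      # strands all three
goose_allows_all = round(runs_charged(2.37, 3, 0.0), 2)   # all three score
```

The individual charges always add up to the runs that actually scored in the inning (two, in this example), so the scheme moves blame around without inventing or losing any runs.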

    • invitro says:

      In the spirit of keeping things non-judgemental, I think it’d be better to assign an inherited run evenly across all pitchers involved, so .50-.50, or .333-.333-.333, etc.

      • Player to be Named Later says:

        I’ve often thought along these lines, and that each run could be easily broken down into quarters. (1 run = 4 x 90 feet)

        So if a starter walks a guy and gets pulled for a reliever, and that runner then scores while the reliever is pitching, then the starter is responsible for .25 of that run and the reliever .75. (Each 90 feet = .25 of a run.)

        • SDG says:

          That wouldn’t work. All quarters of the run are not equal. Getting the first 90 ft is harder and more important than the rest of the 270 feet. I’m not sure what the probability is for the likelihood of moving to second, or third, or home but I expect they aren’t equally likely.

  7. Tangotiger says:

    For those who want to include inherited runners, there’s a stat for that, called RE24. It’s on both Fangraphs and Baseball Reference.

    To turn it into an RA stat (or we call it RA9 to distinguish it from the counting stat of runs or runs allowed), just divide by IP, multiply by 9, and subtract the result from the league-average RA9.

    Of course, it’s easier when someone ELSE does it for you, just like it’s easier when someone else decides what counts as an AB and an ER, so the calculation ends up looking easy for the consumer. “Here, let me give you all the pieces, and all you have to do is divide these two numbers. See? That’s an easy stat.”
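The conversion Tangotiger describes can be sketched as follows. This assumes the sign convention in which a positive RE24 means runs saved relative to an average pitcher; the function name and every number in the example are hypothetical.

```python
def ra9_from_re24(re24, innings, league_ra9):
    """Convert RE24 (runs saved versus average; positive is good, by assumption)
    into an RA9-style rate: league RA9 minus runs saved per nine innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return league_ra9 - 9 * re24 / innings


# Hypothetical pitcher: 20 runs better than average over 200 innings,
# in a league that allows 4.50 runs per nine innings.
example_ra9 = ra9_from_re24(re24=20, innings=200, league_ra9=4.50)
print(f"{example_ra9:.2f}")
```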

  8. evanecurb says:

    Unicorns are lavender, not purple. Everyone knows that. I agree with everything else Joe said.

  9. John Autin says:

    Another bad aspect of ERA is how it distorts historical comparisons. For example:
    – Last year’s scoring average was about the same as in 1903 (R/G 4.48-4.44), but the league ERA was a full run higher (4.19-3.17).
    – That’s because the 1903 share of unearned runs was more than 4 times last year’s, 31.4% to 7.3%.
    – So, while the 1903 league ERA champs look much better than last year’s pair (combined 1.91 to 2.57 ERA), they were actually worse at run prevention: 3.15 RA/9 for 1903, 2.87 for 2016.

    And while Greg Maddux’s 1.63 ERA in 1995 misses the top 50 modern marks, it’s 5th-best by RA/9, since he yielded just 1 unearned run.

    • SDG says:

      I think that’s intentional. Errors WERE much higher in the no gloves, rocky fields, no lights era. That’s WHY errors (and ERA) were a thing to begin with. All the modern advances were designed to make fielding easier.

      • Scott says:

        I agree. ERA made a lot more sense when a large proportion of baserunners (and runs) were due to poor fielding. It served as a way to separate between fielding and pitching. But as errors have gone down, the distinction between ERA and RA is increasingly irrelevant.

      • John Autin says:

        No doubt that was the intention. That doesn’t mean it ever made sense. The bad fielding conditions were shared pretty evenly, and the pitcher’s task was always to prevent ALL RUNS, and run prevention always had a large non-error fielding component.

        If we’re comparing a pitcher from 1903 with one from 2016, and each gave up a typical share of UER for that season, and each has a 3.00 RA/9, the 1903 guy IS NOT BETTER just because he has a lower ERA.

  10. A possibly irrelevant story.

    After the 1987 season, Orel Hershiser went to Fred Claire, then the Dodgers GM, and noted that his ERA had been higher the past couple of years than in the previous two years. He said that was because he was trying to strike out more batters because then they wouldn’t hit the ball and put it in play, where the Dodgers defense couldn’t field it. While Kirk Gibson gets a lot of credit for the Dodgers’ success in 1988, part of it, too, was that Claire traded for Alfredo Griffin, who improved the defense considerably at SS. Hershiser’s ERA did drop close to a run per nine innings.

    So, might ERA also tell us at times that the pitcher is afraid of his own defense?

  11. DavidJ says:

    Another problem with ERA is that it’s biased in favor of low-strikeout pitchers over high-strikeout pitchers, and it’s biased in favor of groundball pitchers over flyball pitchers.

    Errors, obviously, are more likely to be committed when a ball is put in play to begin with. So, everything else being equal, a pitch-to-contact pitcher is likely to have more errors committed behind him than a pitcher who allows less contact, and thus is likely to have a higher percentage of the runs he allows classified as unearned. Errors are also more likely to occur on groundballs than on flyballs. So, again, all else being equal, the groundball pitcher is likely to have a higher percentage of the runs he allows classified as unearned.

    The point is, every approach to pitching comes with tradeoffs. There are advantages to pitching to contact and there are advantages to trying to get groundballs, but an increased likelihood of errors being committed is simply a known risk that comes with the territory. ERA arbitrarily absolves certain kinds of pitchers of one of the consequences of their approach.

    • SDG says:

      All those problems would still be in place if we converted ERA to simply, RA. There’s no one stat that tells us everything about a player.

      RA is going to favour contact hitters whether we include errors or not. It will be higher if the fielders commit errors, but that’s like saying it will be higher if the fielders are worse. Anything that isn’t a K, BB or HR is going to run into the problem of baseball not being played in a lab but on a field with people.

      • DavidJ says:

        I’m not claiming that RA tells us everything about a pitcher, just that it tells us a little bit more about the pitcher than ERA does (especially over a very large sample), because ERA introduces some biases that are not present in RA. The point is simply that deducting “unearned runs” from a pitcher’s runs-allowed average makes an already imperfect stat even more imperfect.

  12. Bpdelia says:

    Except, of course, RA and ERA for most pitchers in most seasons are exceptionally close to one another. FIP and ERA are often miles apart. Look at Mike Pineda.

    BABIP may be higher for the greats because most of their best pitches are missed. Balls not in play don’t count. For a sinkerballer his best pitches are squibbed into the ground decreasing his babip.

    Anyone who watched Mariano Rivera spend 20 years creating broken bat slow rollers can’t possibly argue that his pitching produced more easily fielded balls than did Scott Proctor.

    Is the pitcher responsible for ALL of his outs on balls in play? Of course not.

    But since it’s impossible to fairly decide which he should and should not get credit for, ERA works fine. RA is better but they’re so close it’s not worth arguing over.

    • invitro says:

      “Anyone who watched Mariano Rivera spend 20 years creating broken bat slow rollers can’t possibly argue that his pitching produced more easily fielded balls than did Scott Proctor.” — Why not? And where can you find the number of broken bat slow rollers induced by Rivera and Proctor?

      • Marc Schneider says:

        I assume he meant to say “can’t possibly argue that his pitching DID NOT produce more easily fielded balls than did Scott Proctor.”

    • Joe Posnanski says:

      From 2004-2007, when they were teammates, Rivera’s BABIP (.277) was higher than Scott Proctor’s (.270).

      • Rob Smith says:

        Stats like this are really irksome. I hate FIP and BABIP instinctively. But there is something to them, I hate to admit. Really it shows, to me, how important defense is in baseball. We fully embrace and understand the importance of defense in basketball and football, but for some reason we put runs scored largely on the pitching. I love how Earl Weaver understood that back in the day. Having Mark Belanger, Brooks Robinson and Paul Blair on the team sure didn’t hurt. And yet Jim Palmer and the other starters get the lion’s share of the credit for the run prevention.

        • invitro says:

          “I hate FIP and BABIP instinctively.” — Why?

          “but for some reason we put runs scored largely on the pitching.” — Well, the reason is because runs scored *are* largely due to the pitching. Has any sane person suggested otherwise?

          • Rob Smith says:

            I realize you love to parse words very closely, but the point was that defense has a very large impact on the results a pitcher gets. Of course, pitching is a lot of it. But if you buy into FIP and compare it to ERA, the defensive impact can be up to a full run per 9 innings. That’s the difference between an ERA of 3.00, 3.50 or even 4.00. A half a run per game is everything when judging a pitcher & it might be related to the quality of defense.

            And I hate FIP because I instinctively believe pitchers have more impact on balls in play than maybe they actually do. It contradicts what I think I see with my own eyes.

          • invitro says:

            “But if you buy into FIP and compare it to ERA, the defensive impact can be up to a full run per 9 innings.” — You’re assuming the difference between FIP and ERA is defense, and that’s just not true. It’s *luck*. Defense may be a small part of it. Half a run per nine innings isn’t a lot, defense may be a small part of it, but it’s mostly luck, and it should make plenty of sense to baseball-watchers that at least half a run a game is due to luck.

          • invitro says:

            “I realize you love to parse words very closely” — I apologize for doing this. I very often get excited about a subject and don’t make the effort to be nice.

  13. Brian says:

    I will say this, and Joe alludes to it: it seems good pitchers make for balls in play that are easier for the defense to handle, because batters don’t hit the ball as squarely. It makes for good, endless debate.

  14. Mike says:

    I’m disappointed Joe didn’t mention passed balls and wild pitches and balks and catchers interference and fielder’s interference and….

  15. Tangotiger says:

    And yes, UER is biased based on GB and FB pitchers. Check out Schilling and Johan Santana among FB pitchers and Webb among GB pitchers.

  16. Jesse says:

    pretty sure empty net goals are charged to the team, not the goalie so it’s kind of the same as an unearned goal.

  17. Yazmon says:

    I do have a quibble about “point 1”. You are equating an error with a great play, trying to make them complete opposites. I would contend that a great play, an extra effort play that succeeds, is more the opposite of a lesser effort play that is not necessarily an error where an effort is made but botched. The high effort, great play is the opposite of the low effort, not so great play. They may tend to even out. Maybe. Who knows? Comparing to an error though is apples and oranges.

  18. Richard says:

    Is there ANY stat that isn’t flawed in some way?

    • Atom says:

      It’s not that they’re flawed, it’s that they’re judgmental.

      He’s highlighting stats that are, by their very nature, pretending to NOT be flawed, instead of doing what stats should do, which is counting things.

    • Atom says:


      Battings Average
      We think it means: How often then get a hit
      Really: Judges what a hit should be. Removes errors, doesn’t punish you for sac hits or sac flies, etc

      Wins
      We think it means: how often they win!
      Actually: Complex set of rules regarding it, have to work x innings, terribly team dependent. If you throw 8 shutout innings and leave in a tie game, you could not get a win even if your replacement gives up three runs and then your offense scores 4.

      ERA: See above

      The point isn’t flawed or not flawed. It’s that a statistic itself shouldn’t make a judgement. It should be a concrete thing we count that then WE can analyze and make judgements off of.

  19. Mark Daniel says:

    Regarding “point 1”, I wouldn’t worry because that will probably happen if someone hasn’t already created that stat.
    It will go like this. Every ball in play given up by a pitcher will be given a run value based on the likelihood of that play being made and the base-out run value.
    For the sake of argument, let’s say the play you described is a difficult one and converted into an out only 10% of the time. In this new metric, the pitcher will have a certain run value added to his tally for that play, which will be 0.90 (90% of the time that ball is a HR) times the base out run value. It would be similar to the defensive runs saved credited to the fielder in that situation.
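The arithmetic in the comment above can be sketched in a few lines. This is only an illustration of the proposed idea, not an existing metric: the function name `pitcher_run_charge` and the run value of 1.0 for the non-out outcome are assumptions made for the example.

```python
def pitcher_run_charge(out_probability: float, run_value_if_not_out: float) -> float:
    """Expected runs charged to the pitcher for one ball in play.

    The charge ignores what actually happened on the field and uses only
    how often a similar batted ball is NOT converted into an out.
    """
    if not 0.0 <= out_probability <= 1.0:
        raise ValueError("out_probability must be between 0 and 1")
    return (1.0 - out_probability) * run_value_if_not_out


# Hypothetical numbers: the play is made 10% of the time, and otherwise the
# ball is a solo home run, valued at exactly 1 run here for simplicity.
charge = pitcher_run_charge(out_probability=0.10, run_value_if_not_out=1.0)
print(round(charge, 2))
```

The pitcher is charged the same amount whether or not the fielder makes the catch, which is the point of the proposal: the spectacular play no longer changes the pitcher's line.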

  20. Chris says:

    If I’m not mistaken, early baseball publications (I’m talking 1870s-era baseball) actually published some stat that was like RA/9. Except, of course, in the 1870s most pitchers were going the distance so the stat was basically RA/G (i.e. Runs Allowed divided by Games Pitched).

    Of course, they also used to post a Run Average for batters (Runs Scored divided by Times At Bat), so whatever. This is a whole other can of worms…

  21. Mark Daniel says:

    I’m not sure why we give hitters full credit for hits to begin with. Because we have the ability to track the trajectory, speed and location of batted balls, we know how often those balls are converted into outs. So, why isn’t every batted ball converted into an offensive value based on how often a similar ball is converted into an out combined with the base-out situation?
    This is particularly true today with the shift so many teams use. What used to be a line drive single 100% of the time is today an out probably 25% of the time or more. Isn’t it just bad luck that some guy roped a line drive to right, but the defense had a guy shifted right into that exact spot?

    So for every batted ball, the % of the time a similar ball is converted to an out is known. The base-out situation is also known. The fielder already gets credit or debit for making or not making the play, respectively. Why not do the same for the pitcher and the hitter? That’s the only way it seems fair to me. This would eliminate the error judgment, because we don’t actually care what really happened on the field, we care about what would have happened on average.

    • invitro says:

      This is an interesting idea, but does someone really know how to correctly assign the values? I’ve been looking for the latest evidence on what a batter can control, and I remember some discussion on Bill James’ site, and I don’t know if there’s a consensus on it.

      I have a feeling that you can accurately estimate a hitter’s true skill using just strikeout, home run, and walk rates, and nothing else. And in fact anything else adds more noise than signal. It’s just a feeling, though! 🙂

      What I’m looking for is a summary of current knowledge on what a batter can control, with more specifics than “not their BABIP”. I assume they can control their HR, BB, and SO rates. It looks like they can control their line-drive rates to some degree, and the speed the ball comes off the bat. How about the left-to-right angle of the ball? Maybe they can control where the ball crosses the basepaths within 50 feet, but no better than that? I have no idea, and I couldn’t find an answer, that’s just my guess. Can a hitter purposely hit the ball over the infielders, but in front of the outfielders? How often can a hitter hit a fly ball if he wants to, or a ground ball if he wants to?

      Here’s one article, from 2008: . I don’t have a good understanding of the article yet, but they regress BABIP against a bunch of factors like batting eye, number of pitches per PA, and line drive %age. I think the goal is just to develop a model of BABIP that’s not just “it’s completely random”. A spreadsheet of the data is linked to; I’m going to get it if only for the hope that I can finally figure out how to do multiple regression in LibreOffice 🙂 (probably not 🙁 ).

      • invitro says:

        “How about the left-to-right angle of the ball? …” — I just want to quickly add that I want to know the answers to these questions badly enough to pay for a website/blog subscription, if necessary :). (Maybe I need to subscribe to Bill James again…)

    • Tangotiger says:

      Billy Hamilton and Prince Fielder hitting a ball with the same minus 5 degree angle and the same 75mph would result in different outcomes.

      • Mark Daniel says:

        Ah, I didn’t think of speed. So, perhaps do it like defense and take a ball in play that, for example, is a hit 50% of the time, and give the player 0.50 times (run value of hit in that base-out situation).
        A concern is that it seems like we’d be calculating offensive value (the most reliable stat, I think) in a similar manner to defense (the most unreliable stat).
        The major issue is that hits like infield singles by speedy guys will be given more points than a regular old line drive to center field.
        Okay, I’m convinced, this is a bad idea.

  22. James says:

    Regarding point 2.5, my understanding has always been the question is toward whom earned runs are being “judgmental.” The very fact it’s called “earned runs” implies that we’re primarily talking about the offense. Bob Gibson didn’t earn those 1.12 runs. He tried to prevent them. Evaluating pitchers by ERA is sort of a derivative use of the concept of earned runs*. So if we’re primarily talking about what the offense earned, it makes just as much sense to discount runs via pitcher error as it does via fielder error.

    *Of course, that’s now the *primary* way that we talk about earned runs, so maybe it just reveals a flaw in the entire project.

  23. invitro says:

    Here’s a short article on luck in baseball: . It’s not quantitative, and it’s really just a very, very basic introduction, I suppose for fans who think there is no luck in baseball. But it may be of some interest, and I guess if it’s Fangraphs’ first word on “luck”, it’s important. 🙂

  24. invitro says:

    I was curious to see which of ERA and RA correlated more from year to year. I expected RA would. I looked at all pitchers from 1980 to 2015 who had at least 150 IP in two consecutive years, and correlated their ERA and RA between the first and second years. I got a correlation of .373 for ERA, and .366 for RA. Since I wasn’t expecting that, I must be thinking about ERA/RA in the wrong way somehow.

    I correlated a bunch of other pitching stats in the same way. FIP has a high correlation of .583. Its components have: HR-.477, BB-.659, SO-.809. Also: H-.474. I did expect SO and BB to have high year-to-year correlation as they did. I expected HR to have a higher correlation than H, but it didn’t.

    Here’s everything I got. After the correlation is the mean +/- stddev for year 1, then the same for year 2. All stats except FIP are per nine innings. (I used 3.2 as the additive constant for FIP for every year, which is going to be off a bit, but shouldn’t affect the correlations much, doesn’t affect the stddev’s, it only affects the mean.)

    # n = 2052 | year 1: mean +/- stddev | year 2: mean +/- stddev |
    # ERA corr= .373 | 3.82 +/- 0.79 | 3.92 +/- 0.84 |
    # RA corr= .366 | 4.20 +/- 0.84 | 4.29 +/- 0.90 |
    # FIP corr= .583 | 4.04 +/- 0.64 | 4.10 +/- 0.68 |
    # SO corr= .809 | 6.38 +/- 1.65 | 6.33 +/- 1.66 |
    # BB corr= .659 | 2.86 +/- 0.82 | 2.84 +/- 0.83 |
    # HR corr= .477 | 0.90 +/- 0.30 | 0.94 +/- 0.31 |
    # H corr= .474 | 8.75 +/- 1.06 | 8.87 +/- 1.12 |
    # W corr= .136 | 0.57 +/- 0.13 | 0.56 +/- 0.13 |
    # L corr= .168 | 0.46 +/- 0.14 | 0.48 +/- 0.15 |
    # G corr= .359 | 1.44 +/- 0.19 | 1.43 +/- 0.17 |
    # GS corr= .464 | 1.38 +/- 0.12 | 1.39 +/- 0.11 |
    # CG corr= .655 | 0.14 +/- 0.14 | 0.14 +/- 0.13 |
    # SHO corr= .217 | 0.04 +/- 0.05 | 0.04 +/- 0.05 |
    # IBB corr= .348 | 0.16 +/- 0.12 | 0.16 +/- 0.12 |
    # WP corr= .495 | 0.25 +/- 0.17 | 0.25 +/- 0.17 |
    # HBP corr= .471 | 0.25 +/- 0.16 | 0.26 +/- 0.16 |
    # BK corr= .341 | 0.04 +/- 0.06 | 0.04 +/- 0.06 |
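A minimal sketch of the kind of calculation described above, assuming the commonly published form of FIP, (13·HR + 3·(BB + HBP) − 2·SO) / IP + constant. The comment does not say whether HBP was included, so that part is an assumption, and the pitcher lines below are made up purely for illustration. The sketch also shows why a fixed constant of 3.2 cannot change the correlation: shifting every value by the same amount moves the mean but leaves the Pearson correlation untouched.

```python
import math
from statistics import mean


def fip(hr, bb, hbp, so, ip, constant=3.2):
    """Fielding Independent Pitching with a fixed additive constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Made-up (HR, BB, HBP, SO, IP) lines for the same four pitchers in two
# consecutive seasons. These are NOT real data.
year1 = [(20, 50, 5, 180, 200.0), (25, 70, 8, 120, 180.0),
         (15, 40, 3, 200, 210.0), (30, 60, 6, 100, 160.0)]
year2 = [(22, 55, 4, 170, 195.0), (21, 65, 7, 130, 185.0),
         (18, 45, 5, 190, 205.0), (26, 66, 6, 110, 170.0)]


def year_to_year_corr(constant):
    """Correlate each pitcher's year-1 FIP with his year-2 FIP."""
    first = [fip(*line, constant=constant) for line in year1]
    second = [fip(*line, constant=constant) for line in year2]
    return pearson(first, second)


r_with_constant = year_to_year_corr(3.2)
r_without_constant = year_to_year_corr(0.0)
print(abs(r_with_constant - r_without_constant) < 1e-9)  # same correlation either way
```

With real data, each pair would be one pitcher with at least 150 IP in both seasons, as in the comment; the per-nine-innings stats in the table would be correlated the same way.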

  25. Contrarian says:

    Since there’s no Azocars Poscast post up yet, I’ll (irritatingly) comment here:

    The Yankees used to play Liza Minnelli after losses, but now it’s Frank singing “New York, New York” after wins or losses. Because the organization is utterly without class or grace, it hasn’t occurred to anyone in that franchise that “That’s Life” would be an infinitely cooler pick for games they lose.

    I’ve made peace with the fact that everyone disagrees and that I’m the only one who thinks both Maddon and Francona managed terribly in the World Series. Francona worked some magic getting his team that far, but I think both teams managed to lose Game 7 until the Cubs finally won it.

    Nolan Arenado hits .308/.355/.581 at Coors (119 OPS+) and .261/.305/.457 away (80). He might blow the league away with another team, but it’s far from a sure thing, and maybe not even a good bet.

    Joe and Mike didn’t mention this, but although I would pick Sale over Porcello on the Red Sox, I also expect David Price to have a better season and possibly the best of the three.

    As to whether anyone will be interested enough to participate in the Azocars next year… well, I think I just answered that.

  26. MikeN says:

    Does this mean the stories are fake where Greg Maddux plans to get groundouts?

  27. Brent says:

    “FIP is an attempt to isolate the performance of the pitcher by using only those outcomes we know do not involve luck on balls in play or defense; strikeouts, walks, hit batters, and home runs allowed. ” Quoted from explanation of FIP on Fangraphs website.

    My biggest problem with FIP has always been that there are some balls “in play” that are not a K, BB, HBP or HR that also are not affected by luck or defense. Joe has said in his article that Fangraphs now does treat Infield Fly rule outs the same as Ks, which is an improvement. But until they treat balls that hit 12 feet up or higher on any tall fence (like the Green Monster) the same as HRs, they really are being inaccurate. No defender is catching that ball, no matter what, therefore, it has to be treated like a HR, right?
