I think it goes without saying that I am not a sabermetrician. I’m not that smart. I see some of the more intensive baseball analyses out there (the ones that have charts and graphs and the formula on the board from Good Will Hunting) and I’m entirely lost, not unlike the first day of chemistry class when the teacher said, “Well, obviously we don’t want to mix THIS chemical with THIS chemical” and everyone else kind of laughed and shook their heads and said, “NOOOOO!” and I realized I really needed to get into something like journalism.

That said, of course, I like cool baseball statistics. I like how they can open up the game. And so I do make a moderate effort to keep up with what’s going on. I read my Baseball Prospectus. I go online to read articles about promising new and better stats. I play around with Excel spreadsheets. But it’s all amateur stuff. I have never attended a SABR Convention, not because of the cliche Star Trek syndrome but because of scheduling and because I don’t like going to conventions where I’m the dumbest person in the room. I went to a Shakespeare Club meeting once because I have a passing interest in the Bard, and as I walked in the door two people, completely off the cuff, started doing the entire third act of “Measure for Measure” from memory, and I just kind of slunk out.

In any case, all of this is to explain why until a few days ago I had never heard of Base Runs. It somehow just slipped right by me. Base Runs are not new; I think this is probably like when I “discovered” the Evolution of Dance video on YouTube after 90 million other people saw it.

Anyway, as far as I know, by now, Base Runs may have been completely debunked in the SABR community. I’m always way late to this sort of stuff. But I ran across Base Runs and it sure seems like this is a pretty awesome way to measure baseball players.

I’ll give you a quick and probably inaccurate summary of Base Runs and then we’ll get to the fun stuff: I guess the statistic was invented by David Smyth, at least that’s what the Wikipedia page says, and if you can’t trust Wikipedia, who can you trust? And Base Runs at its simplest is a formula that figures how many runs you SHOULD have scored based on five very basic stats: At bats, hits, walks, total bases, home runs.

These are more or less the same basic stats that Bill James’ revolutionary “Runs Created” formula used too, but Base Runs is a more complicated formula and, according to people who are smarter than me, much more accurate. I’m not going to get into the formula because I don’t really understand it. I’m just proud of myself for having the meager math skills to put the formula into a spreadsheet* … and I punched in some numbers from some teams, and WOW it is accurate.

*I suspect there are probably sites out there that do the figuring for you on Base Runs … but I couldn’t find them and anyway I didn’t try hard to find them, I like playing around with spreadsheets. That’s weird, isn’t it?
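For anyone who would rather skip the spreadsheet, here is a minimal sketch of the simple version of the formula in Python. The coefficients are the ones commonly published for the basic version (they are an assumption here, not necessarily what my spreadsheet used), and the team line in the example is made up purely for illustration.

```python
def base_runs(ab, h, bb, tb, hr):
    """Simple Base Runs estimate from five counting stats.

    Coefficients follow the commonly published basic version;
    treat them as an assumption, not the definitive formula.
    """
    a = h + bb - hr                                      # baserunners, not counting home runs
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02  # advancement factor
    c = ab - h                                           # outs (at bats minus hits)
    d = hr                                               # home runs always score
    return a * b / (b + c) + d


# Hypothetical team line, invented for this example.
estimate = base_runs(ab=5500, h=1450, bb=500, tb=2300, hr=150)
print(f"Estimated runs: {estimate:.0f}")
```

The shape of it is baserunners times the share of baserunners who score, plus home runs. Everything else is just coefficients.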

For instance, the Base Runs formula says the Royals should score 385 runs this year. They’ve scored 386. Dead on. The Dodgers should score 373 runs. They’ve scored 377. The Orioles should score 433 — they’ve scored 430. And so on. Every club I’ve figured except the Red Sox (who have scored quite a bit below the Base Runs expectation*) has scored within 3% of their Base Runs. Some teams scored right on the nose. It’s a pretty amazing statistic from what I can tell.

*I don’t know why the Red Sox don’t quite match up to their base runs. The formula says they should score 504 runs this year, and they have scored just 478 (which is still second in the league). The Sox have hit into a lot of double plays, but even when you do the more complicated formula they are still way underperforming their Base Runs expectation. Maybe it’s because they’re hitting .226 with two outs and runners in scoring position, I don’t know. I blame Varitek. That seems trendy these days.
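The "within 3%" check is easy to reproduce. This small sketch uses only the four expected and actual run totals quoted above; the function name is just something I made up for the example.

```python
# (expected Base Runs, actual runs scored) as quoted in the post
teams = {
    "Royals": (385, 386),
    "Dodgers": (373, 377),
    "Orioles": (433, 430),
    "Red Sox": (504, 478),
}


def pct_gap(expected, actual):
    """Percent by which actual runs differ from the Base Runs estimate."""
    return (actual - expected) / expected * 100


for team, (expected, actual) in teams.items():
    gap = pct_gap(expected, actual)
    flag = "" if abs(gap) <= 3 else "  <-- outside 3%"
    print(f"{team}: {gap:+.1f}%{flag}")
```

The Red Sox come out a little more than 5% under their estimate, and the other three are all within about 1%.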

So if you are one of the people who also never heard of Base Runs, you now ask: Well who cares? Well, this is the cool part that Bill James showed a long time ago … if you can come up with a deadly accurate run-scoring formula, you can use it to figure out EXACTLY how valuable each offensive player is, you can give them a value that goes way beyond batting average and RBIs, perhaps beyond even more advanced stats like OPS+ and EqA and whatever (Editor’s note: I’m not saying that Base Runs is more ACCURATE than those numbers, I’m saying they could be valuable … I’ll explain at the end). With Base Runs, all you have to do is subtract a player’s numbers from the team’s numbers and find out just how many runs that person is responsible for … I mean, that’s pretty cool stuff, no? Who’s with me?
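Here is a rough sketch of that subtraction idea: compute the team's Base Runs, compute it again with one player's numbers removed, and take the difference. The formula coefficients are the commonly published simple version (an assumption), and both stat lines are hypothetical, not real Royals numbers.

```python
def base_runs(ab, h, bb, tb, hr):
    """Simple Base Runs (commonly published coefficients, assumed here)."""
    a = h + bb - hr
    b = (1.4 * tb - 0.6 * h - 3 * hr + 0.1 * bb) * 1.02
    c = ab - h
    return a * b / (b + c) + hr


def player_base_runs(team, player):
    """Team Base Runs minus team Base Runs without the player's numbers."""
    team_without = {stat: team[stat] - player[stat] for stat in team}
    return base_runs(**team) - base_runs(**team_without)


# Hypothetical stat lines, invented for illustration only.
team = {"ab": 3100, "h": 820, "bb": 250, "tb": 1230, "hr": 60}
player = {"ab": 330, "h": 91, "bb": 30, "tb": 135, "hr": 8}

print(f"Player Base Runs: {player_base_runs(team, player):.1f}")
```

Doing it inside the team context, rather than running one player's line through the formula alone, keeps the player's total tied to the lineup he actually hits in.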

For instance … I figured out the Royals’ Base Runs on Thursday, before the Royals’ thrilling 4-1 victory over the White Sox*.

*In that game, in case you missed it, Mark Teahen hit his SECOND inside-the-park home run of the season. It was in large part due to the White Sox defensive incompetence (in fact the Royals’ victory can be pretty much chalked up to White Sox defensive incompetence, since Paul Konerko dropped a double play relay, which led to a game-tying double, which led to starter Mark Buehrle being pulled, which led to the Royals’ four-run outburst) but it also confirmed Teahen’s place as perhaps the most daring base runner in the American League. I’ve been talking about this for two years now. He pretty routinely stretches singles into doubles (he did earlier in the game), he has scored from first base on a single, and he has those two inside-the-parkers. I know this is my silly dream, but I just think the guy has the soul of a leadoff hitter … his on-base percentage is low now, but I just think if put in that leadoff role, if asked to get on base and make things happen, he could really emerge. I have no basis for this thought, this is my baseball fan side talking.

The Royals had scored 382 runs going into Thursday’s game. Their Base Runs were also exactly 382. So the stat nails the Royals perfectly. And the team’s individual Base Runs break down like this:

Jose Guillen, 47.
David DeJesus, 47.
Alex Gordon, 47.
Mark Teahen, 43.
Mark Grudzielanek, 39.
John Buck, 27.
Miguel Olivo, 26.
Billy Butler, 25.
Ross Gload, 24.
Mike Aviles, 18.
Tony Pena Jr., 3.

I find this pretty enlightening. Guillen has the huge RBI totals this year (65 of them now) and he’s among the league leaders in doubles, and so you would sense that he’s really head and shoulders above the other guys on the club when it comes to producing runs. But he’s actually tied for the team lead with DeJesus and Gordon, mostly because he doesn’t get on base, meaning he makes a lot of outs. DeJesus has a 120 OPS+, the best on the club, but he doesn’t break away either, perhaps because he only has 13 doubles among his 91 hits. A big part of Gordon’s value is his team-leading 39 walks.

Grud is hitting .314, but his Base Runs are quite low. I have to admit being a bit surprised by this — the guy is hitting .314 with 20 doubles, but he’s also missed quite a few games, and he doesn’t hit for power.*

*Grud is an interesting player … he’s a .290 lifetime hitter, and he’s about to get his 2,000th hit, and he’s won a Gold Glove at second base, and yet the advanced stats hate his guts. In 1996, for instance, he hit .306, had 200 hits and made the All-Star Team. His OPS+ was a lousy 93. In 1997, he did something even more remarkable: He led the National League with 54 doubles, and his OPS+ was even worse, a lot worse, it was 81 which is dreadful. A lot of this, maybe all of it, is because Grud doesn’t walk and never has. Walking is a skill, I’ve written this before, and you can’t just DECIDE that you will walk more any more than you can decide you are going to add 30 points to your batting average. Still, he’s the kind of player who, if he had managed 75-80 walks a year, would have made a whole lot of money. I mean, he HAS made a whole lot of money in this game, but he could have made even more.

None of these Royals numbers, as you might imagine, are especially impressive. Well, the Royals have had trouble scoring runs. The best players seem to have Base Runs in the high 50s now. The very best players are in the 60s. And Texas’ Ian Kinsler is all alone with 74 Base Runs, an amazing number. Here are a few players I figured … it’s not a complete list, these are just players who interested me:

– Ian Kinsler, 74
– Josh Hamilton, 68
– Grady Sizemore, 67
– Milton Bradley, 65
– Nick Markakis, 65
– Kevin Youkilis, 63
– Justin Morneau, 62
– Brian Roberts, 61
– J.D. Drew, 60
– MannyBManny, 59
– Carlos Quentin, 59
– Dustin Pedroia, 58
– Jermaine Dye, 58
– Miggy Cabrera, 58
– A-Rod, 57
– Joe Mauer, 56
– B.J. Upton, 55
– Johnny Damon, 54
– Raul Ibanez, 53
– Evan Longoria, 53
– Jason Giambi, 50
– Derek Jeter, 43

Jeter is not next on the list, but I wanted to put his number up there anyway because, as you know, I’m a big fan.

Anyway, like I say, I don’t really know how Base Runs are viewed … I’m sure people will let me know in the comments. But it seems to me that if it really is as accurate as it seems — even more accurate than runs created — then Base Runs could and should become really mainstream, kind of like the passer rating has in football (even though passer rating is incredibly flawed as a statistic). It’s a bit complicated — kids might not be able to figure it on their napkin at breakfast — but in today’s world nobody really cares how you figure it, as long as the number at the end is simple to understand and useful.

That’s the key. People always ask me why I don’t include more advanced statistics in my newspaper columns, and the reason is that even something seemingly as simple as OPS+ would take at least two hefty paragraphs to explain in a column. And Eqa, which I think is a terrific stat, would take even more than that. And with limited space and limited time, you can’t really afford space to explain these more advanced statistics. Heck, even a mention of on-base percentage will usually draw a few puzzled emails.

But if Base Runs are for real, they are very easy to explain: You can just say this is a statistic that shows how many runs this player produced. For example, you could say: “Tony Pena Jr. has produced three runs all year.” People would get that right away, I think.

This entry was posted on Friday, July 11th, 2008 at 6:56 am.
Categories: Baseball.

58 Comments

  1. Andy

    I like Base Runs…but don’t tell me they describe a player better than Eqa. There’s no park factor in these numbers.

  2. Incoming Message from Dr. Light

    If it’s based on runs, I’d imagine that implementing park factors would be fairly easy.

  3. Drew

    BaseRuns doesn’t purport to describe a player. It tells you how many runs teams will score given their component stats.

  4. So tom Cruise acts weird and gay. Guess what? The 10 best actors at my school were all weird and gay, too! Who cares? People continue to bash the guy, but he has a ton of good movies that any one of you would watch if it was on. Top Gun, Days of Thunder, a Few Good Men, Jerry Maguire, The Color of Money, Rain Man, The Firm, Collateral, War of the Worlds, Minority Report, Last Samurai, Vanilla Sky. I don’t like Magnolia, but he stole the movie. There’s probably others, too. I’m not saying these are the best movies of all time or anything, but that’s a list that any actor in Hollywood would trade their filmography for.

    I’m not saying the guy is the greatest, and i’m not saying all those movies are my favorites, but just b/c he jumped on a couch and proved he’s gay and weird… so what? He makes a hell of a movie…

    Kevin Costner on the other hand… he’s noticeably bad even when he’s in a good movie. (see : Dances With Wolves, Robin Hood)

  5. Dan

    EqA and OPS+, I think, are more valuable than BR, RC, WARP, VORP, etc simply because I like rate stats better than counting stats. The latter category places emphasis on playing time which, while helpful, sometimes doesn’t tell the whole story. Sure, Player A has a low BaseRuns total (or RC, or WARP, or VORP), but the stat doesn’t mention that his manager is playing him only part-time behind a guy who’s essentially blocking him while producing league-average or below numbers. EqA can show that, hey look, this guy is contributing much more, and perhaps he should be elevated to the starting lineup.

  6. Mike Williams

    Keep in mind, Joe, that base runs are, more or less, a “counting” stat; therefore DeJesus is tied with Guillen only because he missed two + weeks of games early in the year.

  7. Baseball Tonight and Tim Kurkjian (sp?) talked about Mark Teahen last night after his all-leg HR. Kurkjian called him one of the best base-runners in the game. People are starting to notice.

  8. Steve

    Aviles has 6 times as many base runs as TPJ in, what, 20 games played all season?

    TPJ could combine Ozzie Smith’s peak defensive value with a bionic arm and a pair of rocket boots and he still wouldn’t be worth trotting out in the lineup every day. My God is he terrible.

  9. Aaron M.

    I used Base Runs on the freshmen team I help coach because I felt like we weren’t scoring runs like we should be. Base Runs said we were under by about 6 runs. Not as bad as I thought, but those 6 runs at the right time would have won us about 2-3 more games. On the other hand, expected win percentage said we outperformed by half a game. So I guess we really were that bad.

    Base runs are useful, but not really on a per player basis. You’d have to divide by games played and get a per game number. Then it would make a little more sense.

  10. And as a bunch have already mentioned, Base Runs doesn’t take into account games missed because of injuries. A-Rod’s score, for example, goes from 57 to 72 if you extrapolate his numbers to include the 20 games he missed.

  11. AK

    Did I miss something? Tom Cruise? What?

  12. Jhohnny

    >>>”So tom Cruise acts weird and gay…<<<”

    Is that one of those computer generated hijacker things?

    They’re getting pretty realistic.

  13. Bart

    I’d divide by ABs-Sacrifices, instead of games to get a per-appearance comparable number, but I don’t know much.

  14. I’m sorry, I love Jeter and all, die hard Yankee fan, I still have (and occasionally wear) the Youth XXL jersey I bought of him back in 96… but he’s terrible this year. If I was Girardi, Derek would be at the bottom of the lineup. He’s batting 30 points under his career average, hasn’t hit a homerun since June 13th, has one stolen base in the past two months, only two extra base hits in the past three weeks, and has 8 errors already this year.

    His value as a SS has been declining for the past 3 years, and his prowess at the plate is declining as well. He’s fouling off pitches he used to drive to right field and I just don’t see him staying at the top of the lineup too much longer.

    And if he does, shame on Girardi, he doesn’t deserve that spot.

  15. Goetzo

    How depressing that I’m later to Base Runs than JoPo is.

    After having read the Wiki entry on Base Run it seems that the only possible reason the Red Sox are scoring significantly below their BR estimate is that they are not scoring their base-runners as effectively as the (Scoring Base Runners/Total Base Runners) estimates. Assuming you’re using the simple formula, I’d think that indicates the Sox are below-average base-runners. (Go ahead and blame Varitek, and probably MBM and Papi, for that.)

    As far as using rate stats vs. counting stats as Dan mentions, can’t we just normalize it by calculating BR/PA?

  16. Scott P

    You can’t beejo Teahen’s base running anymore especially since Tim “That’s just incredible” Kurkjian praised him as one of the best base runners in the game last nite.

  17. Jon Schmidt

    I studied several different run estimators a few years ago. BaseRuns really intrigued me, because it is grounded in the intuitively obvious idea that runs scored = baserunners x percentage of baserunners who score. However, the second term ends up being not much different from a variation on linear weights, which has statistical value, but the arbitrary constants have always bothered me.

    In the end, I could not find a whole lot of difference in accuracy between the various popular tools. My conclusion was that we might as well stick with the original, basic version of Runs Created, mainly because it is so simple and elegant. The underlying format is similar to BaseRuns–baserunners (OBP) x advancement (TB). The real beauty of it is that if you want a rate stat, rather than a counting stat, you can use Runs Created per Out = OBP x SLG / (1-AVG).

    Notice that this equation includes all three numbers that are commonly used to describe a hitter’s performance, and nothing else. It obviously excludes baserunning, but so does the increasingly popular OPS. What bothers me about OPS is that you are adding two fractions (OBP and SLG) that have different denominators–always a mathematical no-no.

    Keep up the great work, Joe!

  18. Drew

    I did a quick little study on Tom Cruise.
    I first took all of his measurables - maybe asexualness, weirdoness, makes me uncomfortableness, might be a robotness, sort of married to Katie Holmes but probably not reallyness and same character in every movieness and tried to come up with a stat that really made sense of Tom Cruise.
    The final figure I came up with was 0.
    Draw your own conclusions I guess.

  19. Andrew

    75 BB/yr Grudz (average numbers): 96 - 8 - 58 - 12 - .290/.370/.396, 104 OPS+

    I don’t know that he’s making a lot more money than the $35M+ he’s already made with a 104 OPS+. Tough to say though, I’m having trouble thinking of a good 2B comparison.

    Thoughts anyone? He doesn’t have the power or speed to be Biggio or Knoblauch…

  20. Jon Schmidt

    Just for comparison, here are the current top 20 in the AL in simple Runs Created, along with their Runs Created per Out and the ratio of that number to the league average, which is .186, expressed as a percentage:

    1. Ian Kinsler, 82, .321, 173
    2. Josh Hamilton, 74, .294, 158
    3. Grady Sizemore, 71, .279, 150
    4. Justin Morneau, 71, .295, 159
    5. Nick Markakis, 68, .284, 153
    6. Milton Bradley, 68, .381, 205
    7. Kevin Youkilis, 68, .315, 169
    8. Jermaine Dye, 64, .279, 150
    9. Aubrey Huff, 64, .265, 142
    10. Brian Roberts, 73, .252, 136
    11. Dustin Pedroia, 63, .238, 128
    12. Carlos Quentin, 62, .263, 141
    13. Manny Ramirez, 62, .270, 145
    14. JD Drew, 62, .323, 173
    15. Alex Rodriguez, 61, .335, 180
    16. Miguel Cabrera, 59, .247, 133
    17. Johnny Damon, 57, .267, 144
    18. Michael Young, 57, .215, 116
    19. Joe Mauer, 55, .284, 153
    20. Magglio Ordonez, 55, .266, 143

    And here are the same numbers for the Royals, including a few that Joe left out:

    1. David DeJesus, 51, .245, 132
    2. Jose Guillen, 50, .193, 103
    3. Alex Gordon, 48, .185, 99
    4. Mark Teahen, 45, .182, 98
    5. Mark Grudzielanek, 43, .224, 121
    6. John Buck, 28, .170, 91
    7. Miguel Olivo, 27, .186, 100
    8. Ross Gload, 25, .156, 84
    9. Billy Butler, 24, .145, 78
    10. Mike Aviles, 20, .220, 118
    11. Joey Gathright, 19, .108, 58
    12. Alberto Callaspo, 12, .162, 87
    13. Tony Pena, 6, .042, 23
    14. Esteban German, 5, .078, 42

  21. ChuckO

    Base Runs are available on the Fangraphs web site, though I look at the Base Runs Above Average. That stat gives one a clearer picture of how a player’s performing relative to others. Of course, these stats often just confirm what you already know. Chipper Jones is carrying my Braves, and Jeff Francoeur has been a black hole that has almost completely swallowed Teixeira’s offensive contributions.

  22. dan

    BaseRuns is the best run estimator available. Period, end of sentence. It models reality, and can work in any environment. Literally, it will work just as well for 6th grade girls softball as it will for Major League Baseball. Now, a slight variation of BaseRuns works slightly better for individual batters, and that is called Linear Weights (LWTS). BaseRuns will be plenty accurate for anything that an amateur sabermetrician will do, and is generally even accurate enough for the hardcore guys to use.

    EqA does not model reality. It is easy to understand, yes. And it is generally pretty accurate, but it does not have prescribed values for individual events. It forces certain events to match a final number, and doesn’t always make sense.

    You will generally do well using EqA, but BaseRuns will always be more accurate in every single way, both logical and practical. If you want to read more about why EqA isn’t so great, see the link in my name (where it says website)

  23. dan

    ChuckO–

    That’s not BaseRuns. That’s Batting Runs Above Average. BRAA uses run expectancy charts in estimating how many runs a player contributed above average IN CONTEXT. So two players could have the exact same batting line, but if one has more opportunities than the other to succeed (batting with RISP, men on base, etc.), he will have a higher BRAA.

  24. Eric J

    Umm… people are criticizing BaseRuns for being a counting stat? Why? Counting stats and rate stats both have their uses; a combination is generally better than either in isolation. But shouldn’t players be penalized for missing time? They weren’t helping their teams in those games.

    BaseRuns are a good representation for how many runs each player has actually helped his team score. If I understand correctly, it’s best to calculate them within a team context (just like it is with Runs Created, or whatever you use), so strictly individual calculations may not be the best. But it’s a good estimator on the team level, so if you apply it correctly, it’ll be a good estimator on the individual level. No, a list of players by “most BaseRuns” won’t necessarily tell you everything you need to know - but it’ll tell you more than most lists by one thing.

  25. Eric J

    Jon Schmidt, I think the denominator for Runs Created per out (or, as it’s often used, per 27 outs) is supposed to be 1-OBP, not 1-AVG.

  26. Marty Winn

    Someone tell me why they put limits on QB ratings. If you have ever looked at the formula you get points for things like TD % and have them taken away for INT% but once you get to a threshold value the formula does not add or take away any more points. So 20 interceptions is no worse than 5.

  27. FredCDobbs

    I really cannot be the first person to mention that a certain large, jolly DH’s lack of wristal health has resulted in the shortening of the Red Sox lineup, and thus a lesser amount of runs scored, can I?

  28. Thanks for introducing me to another stat I’d never heard of before. I still don’t see what it does for me, though. Like most other stats it shows what a player (or team) did yesterday. Please invent one that shows what a player will do tomorrow. And PLEASE don’t normalize anything. I have no use for knowing what someone might have done in some imaginary parallel universe. If a guy hit .347 in 1986 in Fenway Park there is no way in hell you can figure out what he would hit in 2009 in Comerica Park. Heck, a couple years ago Brandon Inge was SURE he could hit better if he didn’t have to catch. If someone had been able to normalize his hitting from catching to third base, they could have saved him a lot of disappointment.

  29. Jon Schmidt

    Eric J, the denominator is 1-AVG.

    RC = OBP x TB = OBP x SLG x AB
    1-AVG = 1 - H / AB = (AB - H) / AB
    RCO = RC / (AB - H) = OBP x SLG x AB / (AB - H) = OBP x SLG / (1-AVG)

    RC/27 is intended to give you a value that can be compared with the runs per game scored by a team. However, it assumes that you are accounting for all of the outs. As Bill James himself noted, when using simple RC, (AB-H) ignores double plays, caught stealing, pickoffs, guys gunned down on the bases, etc. The exact number of (AB-H) per game varies somewhat from year to year; so far this season in the AL, it is 25.2. In keeping with the goal of simplicity, 25 is probably close enough.
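The algebra in the derivation above can be spot-checked numerically. This is only a sketch: the batting line is hypothetical, and OBP is simplified to (H + BB) / (AB + BB), ignoring hit-by-pitch and sacrifice flies.

```python
import math

# Hypothetical batting line, invented for the check.
ab, h, bb, tb = 330, 91, 30, 135

avg = h / ab
obp = (h + bb) / (ab + bb)   # simplified OBP (no HBP or SF), an assumption
slg = tb / ab

rc = obp * tb                          # simple Runs Created
rco_direct = rc / (ab - h)             # RC divided by outs (AB - H)
rco_rate = obp * slg / (1 - avg)       # the rate-stat form

# Both routes should give the same Runs Created per Out.
assert math.isclose(rco_direct, rco_rate)
print(f"RCO: {rco_direct:.3f}")
```

The identity holds however OBP is computed, because OBP appears unchanged on both sides; only SLG and AVG need to share the at-bat denominator.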

  30. McKingford

    I guess if you want to turn Base Runs from a counting stat into a rate stat (so as to account for differences in playing time), you can just divide by games played:

    Base Runs/Games

  31. Richard Aronson

    I was going to look at Tony Pena in context of Robin Yount, who was rushed to the majors way before he knew how to hit because the Brewers had nobody else in their organization who could field at all well. But OPS+ of 79 first seasons versus Pena’s? What are the Royals thinking? Heck, I’d have given Berroa another shot instead of trotting Pena out there.

  32. Jon Schmidt

    Just to finish the thought, Milton Bradley currently leads the AL in RCO at .381. Assuming 25 outs (AB-H) per game, this translates to about .381 x 25 = 9.53 Runs Created per game, which I like to call Runs Created Average (RCA), kind of like a pitcher’s Earned Run Average in reverse.

    The idea is that a team with nine hitters all performing at a level consistent with what Milton Bradley has done so far this year would be expected to score about 9.53 runs per game. When you compare this with the current AL average of 4.62 RPG, you get a ratio of 206, almost identical to the ratio of his RCO to the league’s (205). David DeJesus leads the Royals with a 6.13 RCA, for a ratio of 133 (vs. 132 based on RCO).

    Looking just at the overall AL numbers, you get 6,027 simple Runs Created vs. 5,948 actual runs scored, a difference of 1.3%. RCA is 4.65 vs. 4.62 RPG, a difference of 0.65%. Both of these seem close enough to me!
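The arithmetic in this comment is simple enough to sketch in a few lines. All of the inputs are the figures quoted above (25 outs per game, 4.62 AL runs per game, the RCO values, and the league totals); the function names are hypothetical.

```python
OUTS_PER_GAME = 25   # approximate (AB - H) per game, as quoted above
AL_RPG = 4.62        # AL runs per game, as quoted above


def rca(rco):
    """Runs Created Average: Runs Created per Out scaled to a game."""
    return rco * OUTS_PER_GAME


def ratio_to_league(rco):
    """RCA as a percentage of league runs per game."""
    return rca(rco) / AL_RPG * 100


for name, rco in [("Milton Bradley", 0.381), ("David DeJesus", 0.245)]:
    print(f"{name}: ratio {ratio_to_league(rco):.0f}")

# League-wide check: simple Runs Created vs. actual runs scored.
league_gap = (6027 - 5948) / 5948 * 100
print(f"League RC vs. runs: {league_gap:.1f}% high")
```

The ratios work out to 206 and 133, and the league-wide gap to about 1.3%, matching the numbers in the comment.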

  33. John McCann

    Base Runs is the current state of the art at the moment. On the hardballtimes.com site they even replaced Bill James’ RC with base runs when they figure the in season Win Shares.

    I have actually been partial to the three versions of extrapolated runs for awhile now myself. It used to be hard to find the up to date coefficients to Base Runs online, and extrapolated runs is the same formula every year.

    Base Runs, RC, and XR are of course analogous to Runs Scored, so it helps to have lots of PA, and they need to be put in context of the run environment and outs used.

    Anyway, if you really want to be accurate, why not compute Runs Created 3 or 4 different ways, and use the average. That should be the most accurate number based on the jellybean jar theory.

  34. Bill C.

    Since a couple of people have chimed in about Cruise v. Costner (AK and Jhohnny…if you were serious it was not a hijack, it was just that people wanted to comment on the poll Joe put up) I couldn’t resist defending, after a fashion, both of them.

    Kevin Costner is a perfectly fine actor in certain limited roles. Not unlike Elton John as a singer, who doesn’t really have much of a vocal range, but is quite good when he stays within his range. Maybe it’s more damning to say an actor has no range, but Kevin Costner is pretty damn good in Bull Durham and Tin Cup. He’s excellent in the wildly underrated Wyatt Earp and A Perfect World, he’s ok in The Untouchables, and Field of Dreams, and he’s pretty entertaining in Silverado.

    That’s 7 good movies in which he is also good, which is more than a lot of actors. Basically, he’s pretty good as long as he doesn’t have to be overly romantic (For Love of the Game), an action hero (Waterworld), do an accent (JFK), or all 3 of those (Robin Hood).

    As for Tom Cruise, he’s a weird dude in many, many ways, but the notion that he’s not a good actor is sort of silly. He’s been at least good, and often better than that in many, many movies.

    Even being less than generous, he’s been good or better in at least 10 movies. Risky Business, The Color of Money, Rain Man, A Few Good Men, Interview with the Vampire, Jerry Maguire, Magnolia, Minority Report, Collateral, he’s good in all of those, which I happen to think are his 9 best, and I’m leaving out plenty of others that people respect such as Born on the 4th of July and Vanilla Sky. Seriously, other than a lousy Irish accent in Far & Away when has he been bad?

    His weird public persona has been overly conflated with his acting ability. He’s not Laurence Olivier or Daniel Day-Lewis, but he’s really quite good in many, many films.

  35. Snowman

    If you want to figure a team that has this season underperformed their “should have” numbers, plug my Bravos into the spreadsheet. Betcha they’re off by more than 3%, and betcha it’s strictly because of their awful numbers with RISP.

  36. Eric J

    Jon Schmidt - fair enough; AB-H does give you a pretty good “outs made” total, and dividing by AB would of course give 1-AVG, for “outs per AB.” I think what most people are looking for is “outs per PA,” which would basically be (PA-times on base)/PA, or 1-OBP.

    Also, the reverse of a pitcher’s ERA would be normalized to 27 outs, regardless of how many actually occur in an average game; of course, the leaders wouldn’t change at all, but the numbers will be bigger.

  37. Pokey Joe

    Gosh, Joe, there wasn’t a single National League player that interested you? I think you’ve spent too much time in Kansas City.

    And by the way, as much as I love baseball and these here newfangled stats, I don’t understand them. So there are others passionate about the game who are behind you on the learning curve…

  38. Shelby

    Why no love for Open Range? Probably the best thing Costner’s been involved with, by my estimation.

    And what about Fandango? Also one of his better ones.

  39. Jon Schmidt

    Eric J: The denominator of RCO is outs–not outs per at bat or outs per plate appearance, just outs. 1-AVG is correct in this case, rather than 1-OBP, because the denominator of slugging percentage is at bats, not plate appearances. 25 outs is the right number to use for RCA with the simple RC formula that only takes AB-H into account. For more sophisticated run estimators that include most or all of the other events (positive and negative), it is proper to use the full 27 outs.

    For pitchers, the number of outs is always exactly IP x 3; no adjustment is necessary. I have played around some with a reverse RC for pitchers, which I call Runs Enabled (RE) and translates to a Runs Enabled Average (REA) when you multiply by 9/IP. It is obviously straightforward to calculate if you have the AVG, OBP, and SLG against numbers, but that is not generally the case when looking at historical data. I have gotten pretty decent results approximating OBP = WHIP / (3+WHIP) and TB = 4/3 x (H + 2xHR).

    One last observation–simple Runs Created can be viewed as a version of BaseRuns in which the percentage of baserunners who score is assumed to be equal to total bases per plate appearance. It is rather remarkable that this turns out to be as accurate as it is.
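A minimal sketch of the pitcher-side approximations described in this comment. It assumes Runs Enabled mirrors simple Runs Created (OBP against times total bases against); the function names and the pitching line are hypothetical.

```python
def approx_obp_against(whip):
    """Approximate OBP allowed: baserunners / (outs + baserunners) per inning."""
    return whip / (3 + whip)


def approx_tb_against(h, hr):
    """Approximate total bases allowed from hits and home runs."""
    return 4 / 3 * (h + 2 * hr)


def runs_enabled_average(ip, h, bb, hr):
    """Runs Enabled scaled to nine innings, like ERA."""
    whip = (h + bb) / ip
    # Assumption: Runs Enabled = OBP against x TB against, mirroring simple RC.
    runs_enabled = approx_obp_against(whip) * approx_tb_against(h, hr)
    return runs_enabled * 9 / ip


# Hypothetical pitching line, invented for illustration.
print(f"REA: {runs_enabled_average(ip=200, h=180, bb=60, hr=20):.2f}")
```

The WHIP shortcut works because each inning has exactly three outs, so WHIP baserunners per inning become WHIP / (3 + WHIP) baserunners per batter faced, as long as you ignore hit batters and runners erased on the bases.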

  40. Jon Schmidt wrote:
    >What bothers me about OPS is that you are adding two fractions
    >(OBP and SLG) that have different denominators–always a
    >mathematical no-no.

    While there are lots of things that could rightly bother you about OPS, this is NOT one of them.

    1) OBP and SLG are decimals, not fractions. You know that.

    2) You can add fractions quite easily. Look, I’ll do it for you right here: 1/2 + 1/3 = 5/6

    3) One way to add fractions is to convert them to decimals. I’ll do it right here: 1/2 + 1/3 = .500 + .333 = .8333

    So, no, Jon, it has never been a mathematical no-no to add fractions with different denominators.

    Of course, there is also the possibility that you added that comment to be funny. In which case, I clearly have spoiled the humor by overanalyzing.

  41. My problem with using individual base runs and/or base runs per out as Joe suggests is that I think that it is liable to be misused or misunderstood.

    For example, the Yankees did not score 43 more runs with Jeter than they would have without him. There would have been another player in his place, perhaps Tony Pena Junior. :)

    The concept of quantifying “Above Average” or “Above Replacement” (the latter an idea that Football Outsiders has implemented wildly incorrectly) is one of the major contributions of sabermetrics, right up there with Park Effects and adjusting for the quality of the opposition. While I prefer “Above Replacement” stats, “Above Average” is more understandable to a lay audience.

    And so, for a mainstream columnist to use this kind of stat, s/he ought to use IBRAA (Individual Base Runs Above Average), “a statistic that shows how many runs this player produced above the average starter at his position.” Give it a catchier name, like JoPos, and you’re all set.

  42. Aaron B.

    http://www.insidethebook.com/woba.shtml

    Read that. And let’s bury plain ol’ OPS

  43. Bart

    ALEX….you’ve gotta be kidding with the ‘different denominators’ post. He means that OBP’s denominator is AB+BB+HBP+SF and SLG’s is AB.

    It’s a valid complaint.

  44. Jeff Wright

    Base Runs really adds nothing to the game of baseball, only more self-serving statistical blather.

  45. Josh Cookson

    If average with runners in scoring position affects base runs, this would explain why the Red Sox are below their expected outcome. Crisp, Varitek and Lugo (the 7-9 hitters with Ortiz out) have combined to hit .173 (35-202) with runners in scoring position. This is truly dreadful. The terrible 3 managed to botch a bases-loaded, no-out situation the other night without grounding into a double play. Truly amazing.

  46. Brian

    Re: BART

    It’s a valid complaint only if you’re trying to find some exact mathematical value. What you’re really doing is adding two percentages with different denominators, which IS mathematically nonsensical. But that doesn’t really matter here. What they are trying to do is create a stat where bigger is better (i.e., a player with a higher OPS is ALWAYS more valuable to his team). AVG doesn’t do that. SLG doesn’t do that. OBP doesn’t do that. OPS does, and it doesn’t have to follow mathematical rules to make sense.

  47. Jon Schmidt

    Brian: It is also a valid complaint if elegance is a desirable attribute, which is the case for me, but not for everyone. Besides, RCO or RCA meets your criterion of always being higher for a more valuable player.

  48. Welcome Joe. We’re already on our fourth bottle of champagne each, but you can double up until you catch up(*).

    (*) By the way, I need kid jokes. My kid is telling me all his jokes, and then he says “your turn”. I’ve got nothing for him. His favorite joke, which he likes to say all the time, which I have to laugh at all the time is: Why was the mustard last in the race? Because it couldn’t… catch up (catsup… ketchup). Joe, please work your magic, and help us out.

  49. By the way, Hardball Times calculates BaseRuns for you. You can see it for the Royals here:
    http://www.hardballtimes.com/thtstats/main/index.php?view=batting&linesToDisplay=50&orderBy=rc&direction=DESC&qual_filter=1&season_filter=2008&league_filter=1&team_filter=KC&pos_filter=All&Submit=Submit

    It’s a counting stat, and therefore, if you have a problem with it, you have a problem with walks, HR, and runs. You can convert it to a rate stat by dividing by outs made.

    And the above site calculates the Base Runs by figuring the team with and without the player, and giving him the difference.
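
    A sketch of that with-and-without idea, assuming one commonly cited basic version of the BaseRuns formula built from the five stats Joe lists (AB, H, BB, TB, HR). The coefficients, class name, function names, and stat lines are assumptions for illustration, and this is not necessarily how the site does it:

```python
from dataclasses import dataclass


@dataclass
class Line:
    ab: int
    h: int
    bb: int
    tb: int
    hr: int


def base_runs(s):
    """Basic BaseRuns: A * B / (B + C) + D (one commonly cited version)."""
    a = s.h + s.bb - s.hr                                      # baserunners
    b = (1.4 * s.tb - 0.6 * s.h - 3 * s.hr + 0.1 * s.bb) * 1.02  # advancement
    c = s.ab - s.h                                             # outs
    d = s.hr                                                   # automatic runs
    return a * b / (b + c) + d


def player_base_runs(team, player):
    """Team BaseRuns with the player minus team BaseRuns without him."""
    rest = Line(team.ab - player.ab, team.h - player.h, team.bb - player.bb,
                team.tb - player.tb, team.hr - player.hr)
    return base_runs(team) - base_runs(rest)


# Hypothetical team and player totals, not real data.
team = Line(ab=5500, h=1450, bb=550, tb=2300, hr=170)
player = Line(ab=600, h=180, bb=80, tb=330, hr=35)

runs = player_base_runs(team, player)
per_out = runs / (player.ab - player.h)  # counting stat turned into a rate
print(f"player BsR={runs:.1f}  per out={per_out:.3f}")
```

    Subtracting the player from his own team keeps the interaction between his baserunners and his teammates' hitting, which is the reason to do it this way instead of running the formula on his line alone.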

    EqA goes to the trouble of park factors, which is a plus in its favor. That doesn’t take anything away from the basic version of BaseRuns. There’s also an EqA version without park factors. The point is that given the same stats, BsR fits them together better than EqA does. Anyone who has studied *both* of these metrics doesn’t dispute it.

    The problem with the other metrics being discussed in this thread is that we can show (and have shown) that the weights are not correct: they undervalue walks and overvalue HR. Basically, you need to get those right, among all the other items. BsR does get it right. Other than Linear Weights, all the other metrics do not get it right.

  50. Sky

    I have a lot of respect for you, Joe.

    There are some small issues with your implementation of BaseRuns (haven’t accounted for outs used, need to compare to a baseline like league-average or replacement level, etc.), but just the fact that you’re measuring player production in runs is awesome. This is linear weights, but without the scary name and explained in a fun way.

    For people interested, you can find runs produced compared to the average player on a hitter’s Baseball-Reference page. It’s in the Special Batting section on the way right — BtRuns. It’s already park adjusted, too!

    Joe, your next step is to find some of the good fielding stats measured in runs.

  51. Joe, you should definitely write for the Hardball Times Annual. I’d bet they may even take part of a chapter of your upcoming book (if you happen to be writing one) as an excerpt in their annual.

  52. Creston

    I will readily admit that I had no idea of the concept of Base Runs. It just seems like it’s more useful for a team than for a single hitter. Also, how many BaseRuns is considered to be “good” at some particular time? Ian Kinsler has 74, which is apparently impressive because it leads the league. But is A-Rod’s 57 still any good? He’s OPS’ing about 1.000 so I’m guessing yeah.

    Thanks for the linkie Tangotiger!

  53. Jon Schmidt

    Most (if not all) run estimators work better at the team level because, except in the case of a solo home run, scoring runs requires some people to get on base and others to advance them to home. However, the impetus for their invention and refinement is to assign credit for runs to individuals as a way of measuring their overall offensive performance.

    Personally, I think that the best way to do this would be to attribute bases gained and lost to hitters and runners within each inning, add up the net results, and then divide the total by 4 to calculate runs. This is not as clever as any of the formulas that use traditional statistics for input, but it has the indisputable advantage of always being exactly right at the team and league level. The big problem for any kind of historical application is that you must have complete play-by-play data; a box score is unlikely to provide enough information.

    With any of the well-known run estimators, including both Runs Created and BaseRuns, I think that ending up with more than 100 for an entire season is pretty good. Last year, there were 51 players who reached that mark (in “sophisticated” RC), and only 19 exceeded 120; A-Rod led the way at 166. However, his simple RCA (9.92) put him behind both David Ortiz (10.34) and Magglio Ordonez (10.13) because he made more outs than they did.

  54. If the issue is about “personal opinions”, then there is no right answer, and the debate becomes just a yap-fest. Feel free to prefer RC or any other half-a$$ed metric out there for whatever reason one chooses.

    If the issue is one about logic and illumination, then that’s what BaseRuns provides and Runs Created does not. Kudos to Joe for taking the steps in the right direction.

  55. Jon Schmidt

    TangoTiger: I have great respect and appreciation for the work that you and others have done with BaseRuns and various tools of baseball analysis. However, I do not believe that you help your cause by disparaging those who see things a bit differently than you do. To suggest that Runs Created is “half-a***d” and provides no “logic and illumination” is silly.

    Frankly, I suspect that BaseRuns will have trouble catching on with most fans because they will never be able to calculate it for themselves. That may be a flaw of the fans, rather than BaseRuns, but it is a reality. RCO (or RCA) combines three statistics that are already quite familiar into a single number that characterizes a hitter’s overall performance much more meaningfully than any of its components in isolation. I think that makes it very valuable, especially if it can be popularized (Joe?).

  56. tangotiger

    Thanks for the kind words.

    I didn’t disparage anyone.

    RC is half-a$$ed. It doesn’t properly value things. And if BaseRuns existed, RC would never have been invented. OBP*SLG*whatever is a quick and dirty approach… half-a$$ed.

    As for the fans, I don’t particularly care if BsR catches on more than RC or not. I’m happy to lay out the case for BsR for those interested in learning about the two. Obviously, the basic version of RC is easier, but it’s also easier because it’s not as good.

    Half-a$$ed things are valuable… that doesn’t make them better. And that’s the argument. Do you want logic and illumination, or do you want something quick and dirty that sacrifices logic and some illumination to get you something “good enough”?

    To each his own…
