A Ridiculous New Statistic
Posted: August 14th, 2009 | Filed under: Baseball
Well, it’s been a while since I’ve come up with a new baseball statistic.
So here you go: This is hardly new. In fact, I’m sure someone has already come up with a much better version of this statistic. But I can’t find it … so I’ll claim it as my own for now. This effort is actually based on a Sabermetric effort I ran across during my 09/09/09 research. There was this guy back in 1975 who had invented his own statistic (if I remember correctly — I can’t find the clipping itself right now) that went like so:
(Total bases + Times on base) / Plate Appearances.
I don’t remember the guy’s explanation precisely … but I do remember thinking that it sounded quite lucid, especially since this was almost 35 years ago. And I do remember that he was saying the most important part of the game was getting on base and that the second-most important part was advancing on the bases, and so he combined the two into this one statistic, and this definitely seemed to presage the rise of on base percentage and OPS and so on. I have no earthly idea if the formula itself makes any mathematical sense at all … but it was a fun thing, especially for 1975 when batting average was pretty much the ONLY statistic going.
So I decided to figure this year’s numbers with that stat. Of course, I had to put in a couple of modifications. Basically, I figured in stolen bases, caught stealing, grounded into double plays and, yes, sacrifices. I didn’t want to put sacrifices in there but then I thought I should at least pay some homage to those who give themselves up in the fight. Or something like that.
So my statistic looks like this:
((Total bases + Times on Base + Stolen Bases + Sacrifice hits and flies) – (caught stealing + grounded into double plays)) / (Plate Appearances + Caught Stealing)
I’m sure that I’m committing crimes against mathematics — I expect to see Pythagoras in my nightmares tonight. I double count caught stealing and I probably am not doing the fair thing with GIDP. I’m sure there’s a better way to get at what I’m trying to get at here … and I’m sure you will help.
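For anyone who wants to play along, here is a minimal Python sketch of the formula as written above. The function and argument names are made up for illustration, and the sample stat line is hypothetical, not any real player's numbers.

```python
def new_stat(tb, tob, sb, sac, cs, gidp, pa):
    """Rough sketch of the statistic as written in the post.

    tb   = total bases
    tob  = times on base
    sb   = stolen bases
    sac  = sacrifice hits plus sacrifice flies
    cs   = caught stealing
    gidp = grounded into double plays
    pa   = plate appearances
    """
    numerator = (tb + tob + sb + sac) - (cs + gidp)
    denominator = pa + cs  # caught stealing appears a second time here
    return numerator / denominator


# Hypothetical stat line, purely for illustration.
example = new_stat(tb=250, tob=200, sb=10, sac=5, cs=5, gidp=10, pa=500)
print(f"{example:.3f}")
```

Keeping each input as a named argument makes it easy to drop a term (sacrifices, say) or move caught stealing around and see how much the number changes.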
But for now, hey, I do like the list.
Top 10 Players in 2009 (250-plus PAs):
1. Albert Pujols, .990
2. Joe Mauer, .969
3. Jason Bartlett (!!), .937
4. Hanley Ramirez, .932
5. MannyBManny, .908
6. Mark Reynolds, .900
7. Prince Fielder, .895
8. Raul Ibanez, .889
9. Carlos Beltran, .886
10. Chase Utley, .882
And the Bottom 10 Players in 2009:
10. Willy Taveras, .588
9. Edgar Renteria, .586
8. Yuni!, .580
7. Adam Everett, .578
6. Dioner Navarro .558
5. Jason Kendall, .553
4. Nick Punto, .544
3. Alex Gonzalez, your newest Boston Red Sox star, .539
2. Ronny Cedeno, .538
1. Brian Giles, .502
We’ll try to have some more fun with this statistic over the weekend. And, of course, comment your suggestions and thoughts to make the statistic better. And you can also try and come up with a name for it. Right now, I’m leaning toward: “The Kuiper.”
Wow at least the Reds got rid of 1 of the bottom 10. Now if they can do something about Taveras….
Of course that 1975 formula makes sense. (Total bases + Times on base) / Plate Appearances = (TB / PA) + (ToB / PA) = (something similar to SLG) + OBP. So it’s essentially OPS. The only difference is that it uses TB/PA instead of TB/AB.
And so your formula is, more or less, OPS with GIDP, SB, and such added in. Which explains why your lists make so much sense.
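To make the comparison concrete, here is a small Python sketch. The stat line is hypothetical and the function names are only for illustration. OBP uses the usual (H + BB + HBP) / (AB + BB + HBP + SF) definition, SLG is TB / AB, and plate appearances are simplified to AB + BB + HBP + SF (no sacrifice bunts).

```python
def stat_1975(tb, tob, pa):
    """(Total bases + Times on base) / Plate appearances."""
    return (tb + tob) / pa


def ops(hits, bb, hbp, ab, sf, tb):
    """Standard OPS: on-base percentage plus slugging percentage."""
    obp = (hits + bb + hbp) / (ab + bb + hbp + sf)
    slg = tb / ab
    return obp + slg


# Hypothetical season, for illustration only.
hits, bb, hbp, sf, ab, tb = 150, 60, 5, 5, 500, 260
pa = ab + bb + hbp + sf      # simplification: no sacrifice bunts
tob = hits + bb + hbp

old_stat = stat_1975(tb, tob, pa)
split = tb / pa + tob / pa   # the same quantity written as two fractions
season_ops = ops(hits, bb, hbp, ab, sf, tb)
print(f"1975 stat: {old_stat:.3f}  OPS: {season_ops:.3f}")
```

Because PA is never smaller than AB, the TB/PA term can never exceed SLG, so under these assumptions the 1975 number comes out at or below OPS for the same stat line.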
Why wouldn’t you count caught stealing twice? A runner caught stealing does two things: adds an out and removes a baserunner.
Poz, PLEASE read this first:
http://www.hardballtimes.com/main/article/bases-and-outs-ad-nauseum/
“How many times can we reinvent the wheel? ”
Don’t add your name to the list…
This must be named after Yuni…
Yawn-inducing
Unimportant
New
Informational
‘
Statistic
=YUNI’S
It looks like it needs a little tweaking to the formula, but I like where this is going.
The 1975 stat is better than OPS because it uses the same denominator. People will use OPS forever though, because they can do it by adding two already published numbers together.
I have been using a version of your stat since around 1979. I do not penalize caught stealing twice because plate appearances are the denominator, not outs. I used, as a companion stat, another one I called bases moved per out, but there was a writer about that time doing that and calling it total average. Very cool for 1979, a little out of date now. There is nothing new under the sun.
I would suggest taking the caught stealing out of the denominator, but otherwise, if it resonates for you, use it. I have been using a bastardized version of runs created for years, with a version for pitchers, (so I can compare them directly with each other independent of fielding-and with hitters) and additional base running for the hitters.
It is good for me, but I don’t need everyone to do it. I do it because it works for me, and I’m a stat geek. Use what works for you!
Actually, after reading the link in #4, my stat was more like BOP, not total average. The original total average was more like yours. Around the same time but logical in the days when we could not get every stat updated daily on the internet.
You can make valid comparisons with many of these stats as well. Simple and accessible but reasonably effective.
@ #4:
this article also states:
“The amazing thing about this to me is that, as far as I can tell, all of these were presented as new ideas and new statistics. There were no disclaimers along the lines of “Statistic X is similar to Barry Codell’s Base-Out Percentage and Tom Boswell’s Total Average. However, I have decided to give it a different name because I excluded stolen base attempts and hit by pitch from the equation.” Quite the opposite: Some of the metrics were introduced breathlessly by their respective authors as a completely new way to look at the game.”
joe nailed the disclaimer.
Khazad, it’s the difference in denominators that gives OPS its advantage over most bases per outs (or worse, bases per PAs) stats invented. You end up underweighting the walk and overweighting the HR in OPS, but that’s better than stats that treat the walk and single as identical.
I have an idea for a ridiculous stat of my own but I don’t know how to compile the data.
You might call it Runner Advancement Percentage. The idea is to gauge a hitter’s overall effectiveness at moving runners along. For every PA the total number of bases that runners advance would be added into the numerator, while the greatest possible number of bases they could have advanced goes into the denominator.
Example: Sac fly with runners on 1st and 3rd. That’s one base advanced of a possible four.
Draw a walk with the bases loaded. That’s 3 of a possible six. Single with men on 2nd and 3rd would be 3 of a possible 3, and so on.
A variation on this would be to include the batter himself as a baserunner as well, adding 4 to the denominator on every plate appearance.
I like that it’s a percentage stat and therefore you’re not penalized for hitting behind guys who don’t get on a lot, and I like that it accounts for things like moving a guy from 2nd to 3rd by hitting to the right side. A player’s number could be inflated by hitting behind an exceptionally good baserunner, although intuitively I think that would be a pretty small impact.
Anyway, I’ve been thinking about this off and on this season and I think this could be an interesting stat but I don’t know how to compile using BR.com. Can’t think of a way to pull the numbers without literally going through play-by-play charts of every game, which I just can’t do.
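Compiling the data is the hard part, but the bookkeeping itself is simple. Here is a minimal Python sketch of the idea, run on the three examples above. The function name and the way a play is represented are assumptions for illustration, and the single is assumed to score both runners, as the "3 of a possible 3" implies.

```python
def advancement(runners, bases_gained):
    """Return (bases advanced, bases possible) for one plate appearance.

    runners      -- starting base of each runner, e.g. [1, 3]
    bases_gained -- how many bases each of those runners moved up
    """
    advanced = sum(bases_gained)
    possible = sum(4 - base for base in runners)  # home plate counts as base 4
    return advanced, possible


# The three examples from the comment.
plays = [
    advancement([1, 3], [0, 1]),        # sac fly, runners on 1st and 3rd
    advancement([1, 2, 3], [1, 1, 1]),  # bases-loaded walk
    advancement([2, 3], [2, 1]),        # single, both runners score
]

total_advanced = sum(a for a, _ in plays)
total_possible = sum(p for _, p in plays)
rap = total_advanced / total_possible
print(plays, f"{rap:.3f}")
```

The variation that counts the batter as a runner would add the batter's own bases to the first total and 4 to the second on every plate appearance.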
Say it ain’t so, Joe!? (TangoTiger in #4 beat me to the obvious)
This is my favorite one of hundreds:
http://www.tangotiger.net/wiki/index.php?title=Base_Runs
#4 – Joe was very clear in his second paragraph that this was hardly new. Don’t be a pedant.
Joe,
Please don’t go all Tom Boswell on us!!!!
http://en.wikipedia.org/wiki/Total_average
I am shocked that it is even possible to devise a stat that doesn’t say Willy Taveras is the worst everyday player in baseball.
Joe,
Why add in sacrifice hits and flies? Ultimately, these are outs — shouldn’t you subtract them?
I actually think your stat sounds a lot like EqA (Equivalent Average), a Baseball Prospectus stat.
So, let me see if I have this right….We’re going to count two outs against any batter who hits a ground ball that results in a double play, despite the fact that the runner on first made one of those outs, the manager didn’t have that runner moving with the pitch, and the batter had absolutely nothing to do with the runner being on base in the first place, or that runner’s speed, ability to break up the double play, etc. We’re just going to count the raw number of GIDPs, independent of whether that batter faced 200 GIDP chances or 50, and we’re only going to do this in the case of ground ball double plays, completely ignoring double plays (or triple plays for that matter) where the batter is actually more culpable for the second out (a whiff that doesn’t protect a runner who is then thrown out, a fly ball that isn’t deep enough causing a tagging runner to be thrown out, a line drive right at someone that doesn’t give a runner time to get back to the bag, etc.). The reason for accounting for this kind of DP and no other is apparently because someone back in the ’30s decided to keep track of that kind of DP and chose to ignore every other kind, giving full blame to the hitter and only the hitter, and we’re going to accept that logic 70+ years later because, well, I guess all 30’s-era logic is still sound, like having your star 20-year old pitcher (a.k.a. Bob Feller) throw 300 innings each year, and barring black players from playing.
Makes perfect sense.
Back when I was a successful Strat-o-matic manager, my formula for evaluating hitters was (each number multiplied by the chances of rolling that number) 3 points for each walk or HBP, 4 points for each single, 6 for doubles, 8 for triples, 10 for homers. I did this against RHP and LHP separately, natch. This number was effectively normalized for the 108 chances on the batter’s card. I didn’t cover stealing ratings because it was an afterthought. I mean, I *liked* having good base stealers, but I only ran with an estimated 80% chance of success or higher except in late and close, because on a good offensive team the caught stealing hurts more than on a bad one. My rule of thumb was I wanted at least six players in my lineup to have 200 or more points on this scale, because six in a row gives really good chances of scoring runs in several innings per game. My last year I could get 8 batters at 200+ versus LHP and 6-7 versus RHP, with a couple of guys close. And yes, I did have some weird platoons, with four outfielders in my regular rotation.
You could take the same measurements and then divide by PA to normalize it. So using your list above, Albert Pujols is at 2.18. Manny Ramirez is at 2.11. Joe Mauer is at 2.09. Andre Ethier (a good but not great player with an OPS of .870 and OPS+ of 127) is at 1.73. Last season, when Ethier’s OPS was .885 with more times on base but fewer homers, his score was 1.75. And last on your list this year, Brian Giles, is at 1.14, half of Pujols. The best offensive team in the NL (by runs scored) is the Phillies, with a team average of 1.543. The Royals grade out at 1.334. The Yankees, the best offense in baseball, are at 1.664. I think we can assume that 1.50 is decent, and make whatever adjustments we want to make for speed and defense.
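A minimal Python sketch of the points-per-PA version, using the point values given above. The function name and the stat line are hypothetical, and this is only my reading of "take the same measurements and then divide by PA", not a reconstruction of how the numbers quoted above were actually computed.

```python
# Point values from the comment above.
WEIGHTS = {"bb_hbp": 3, "single": 4, "double": 6, "triple": 8, "hr": 10}


def points_per_pa(bb_hbp, singles, doubles, triples, hr, pa):
    """Weighted points divided by plate appearances."""
    points = (
        WEIGHTS["bb_hbp"] * bb_hbp
        + WEIGHTS["single"] * singles
        + WEIGHTS["double"] * doubles
        + WEIGHTS["triple"] * triples
        + WEIGHTS["hr"] * hr
    )
    return points / pa


# Hypothetical stat line, not a real player's season.
score = points_per_pa(bb_hbp=60, singles=100, doubles=30, triples=2, hr=25, pa=600)
print(f"{score:.2f}")
```

Keeping the weights in one dictionary is what makes the system easy to tweak: change a value there and every score is recomputed on the new scale.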
The beauty of this kind of system is that you can easily weight it based on your preferences AND upon seen results based on the team. For example, to me there’s no real way the Angels should be so close to the Yankees in runs scored. The Yankees as a team have an OPS of .837. The Angels are at .804. That should be a hell of a lot bigger difference than only five runs. In my rankings, the Angels are at 1.596, which is a lot closer to the Yankees than OPS. But it still isn’t 5 runs difference. However, it does suggest that this weighting system is more accurate than just OPS.
What that says to me is that the Angels have the best offensive manager and base coaches in baseball. Yes, they steal more than the Yankees, but they also steal at a less effective rate, and it’s not like comparing the fastest team in baseball to the slowest. The divergence means either that the way we rank is wrong or (and this is far more likely) that the Angels manufacture more runs through their well-documented team philosophy of always pushing for more on the basepaths. And that dovetails nicely with the Royals’ underperformance by not pushing at all on the basepaths. Note that the Rays are a fairly close third in OPS, have stolen a LOT more bases than the Angels with FEWER times caught stealing, which if anything should move them closer to the top two, and have a ranking almost tied with the Angels at 1.592, but are not close to the two leaders in runs scored. So it’s not steals that are driving the extra run production.
Setting aside the anomalous Angels, one advantage of this kind of method is that it lends itself to easy manipulation based on actual results. Excel could calculate all the numbers per team per year, and then the weightings could be tweaked to come up with something more accurate. One thing that almost scares me is that the Angels have a huge lead in batting average, even though they are only second in OBP. So maybe singles should count for more. And maybe Batting Average does count for more, like the old timers thought.
The Angels also do a terrific job of putting the ball in play. Very few strikeouts, not many walks, very few hit by pitches. So perhaps there is synergy; batting average plus defensive outs (many of which advance runners) leads to more runs than predicted. I don’t know. It’s just another method. But I think Mike Scioscia deserves a raise. And Hillman should look at how the Angels attack on the basepaths all the time. I bet the Angels have more unearned runs than most teams.
Could you find a way to include HBP? I don’t like seeing Brian Giles and Edgar Renteria at the bottom. I actually devised a stat a while back which had Yuni near the top: strikeouts/(runs + RBI), so I never used it again.
Mathematically, the caught stealing would simply cancel out, so neither would count in the overall equation if you’re going to divide it. So I think it needs a little tweaking: for instance, simply count the caught stealing twice in the numerator, and that should help reach your goal. But good idea.
For those being upset at #4, let’s not get too angry. His stat that compiles these things (and the link he gives) is called wOBA, and does so more scientifically than Joe’s does.
Now Joe doesn’t presume to claim his stat is accurate or not screwing up somewhere…he fiddles around and was just posting something that came to mind. But if Tango and others agree that wOBA does more or less what Poz is trying to do spur of the moment, is there any problem with showing Joe the stat that does it more accurately?
I mean, yeah, the “don’t reinvent the wheel” bit is a bit snarkier than necessary, but well…it’s beside the point.
@ Joe Posnanski
“((Total bases + Times on Base + Stolen Bases + Sacrifice hits and flies) – (caught stealing + grounded into double plays)) / (Plate Appearances + Caught Stealing)”
You’re giving double weight to all hits in this calculation.
Break TB and TOB down into their components to see what I mean
Singles + 2* Doubles + 3* Triples + 4* HR = TB
Singles + Doubles + Triples + HR + BB + HBP = TOB
An example of the problem: (using Justin Morneau)
A) 246 TB, 198 TOB, 0 SB, 6 SF, 0 CS, 11 DP, 510 PA = .861
(132 Hits, 63 Walks, 3 HBP)
B) 229 TB, 198 TOB, 0 SB, 6 SF, 0 CS, 11 DP, 510 PA = .827
(115 Hits, 80 Walks, 3 HBP)
There is technically no difference in the two Justins. “A” merely has 17 more singles but 17 fewer walks; both end up at first base the same number of times. Your calculation incorrectly creates a rather large .034 variance between the two though.
To correct it, just subtract Hits after Total Bases.
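Here is a minimal Python sketch of that correction, using the two Morneau lines above. The function name and the optional `hits` argument are just for illustration; leaving `hits` at 0 gives the formula as originally posted.

```python
def joe_stat(tb, tob, sb, sac, cs, gidp, pa, hits=0):
    """Joe's formula; pass hits to apply the 'subtract Hits' correction."""
    numerator = (tb - hits + tob + sb + sac) - (cs + gidp)
    return numerator / (pa + cs)


# The two Morneau lines quoted above (B swaps 17 singles for 17 walks).
common = dict(tob=198, sb=0, sac=6, cs=0, gidp=11, pa=510)
a_raw = joe_stat(tb=246, **common)
b_raw = joe_stat(tb=229, **common)
a_fixed = joe_stat(tb=246, hits=132, **common)
b_fixed = joe_stat(tb=229, hits=115, **common)

print(f"raw:   A={a_raw:.3f}  B={b_raw:.3f}")
print(f"fixed: A={a_fixed:.3f}  B={b_fixed:.3f}")
```

With hits subtracted, a single and a walk each count once, so the two lines come out identical.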
@ #20: Brad Templeman
“Could you find a way to include HBP?”
It is included in TOB, as are walks.
Also, using #4 (Tom Tango)’s Stat, the top 10 hitters are:
1. Pujols
2. Mauer
3. Hanley Ramirez
4. Youkilis
5. Prince Fielder
6. Mark Reynolds
7. Chase Utley
8. Adam Dunn
9. Jason Bartlett
10. Ryan Braun
11. Raul Ibanez
Pretty close, right? Mind you, two players on Poz’s list don’t have enough ABs to qualify for this list (Beltran and Manny) so it’s more or less identical there.
Bottom 10 (230 PAs minimum):
1. Alex Gonzalez
2. Ronny Cedeno
3. Brian Giles
4. Dioner Navarro
5. Willy Taveras
6. Nick Punto
7. Yuni
8. Adam Everett
9. Bill Hall
10. Jason Kendall
Once again, essentially identical (I chose 200 PAs for the cutoff here, because Qualified is like 400 PAs, and players this bad don’t get that many PAs usually, and Yuni and like 4 others on Poz’s list don’t qualify).
So yeah, the stats show Poz is on the right track, but he could be more advanced and find the stuff easier by just looking up wOBA on fangraphs.com.
Ok, finished what I was working on so thought I would help before I sleep. I exported the Baseball-Reference numbers for the league (as of 8/16) into Excel and quickly ran the calculations.
When making the Hit correction I outlined previously (post #23), we see this as the Top and Bottom 15. (I used a min 200 PA as well)
Top
.723 Pujols
.659 Mark Reynolds
.659 Joe Mauer
.644 Manny
.641 Fielder
.635 Utley
.633 Dunn
.623 Hanley
.622 Ibanez
.621 Josh Willingham
.620 Youkilis
.620 Zobrist
.616 Bartlett
.615 Torii Hunter
.611 (3 tied – Bay, Cruz and Teixeira)
.472 MLB Average
Bottom
.367 Ivan Rodriguez
.366 Melvin Mora
.366 Adam Everett
.361 Edgar Renteria
.359 Yuni
.359 Bill Hall
.357 Dioner Navarro
.356 Brian Anderson
.355 Jason Kendall
.348 Ronny Cedeno
.342 Alex Gonzalez
.340 Alexi Casilla
.339 Emmanuel Burriss
.333 Delmon Young
.332 Giles
Of real note, 3 of the Bottom-5 do not make the Bottom-10 wOBA list garik16 provided above, while Taveras (#22) and Punto (#20) escape infamy this time around.
Also, I truly can’t believe how much of a lead Pujols has over all the others. The difference between #2 and #15 is merely .048 points. Meanwhile, Pujols has a .064 lead on second place.
Also, calculated Aaron Miles (148 PA with .478 OPS) for the fun of it. He scored a .297
Taveras and Punto escape probably due to SBs, which aren’t included in wOBA.
Pudge comes in at 26th worst, while Mora comes in at somewhere around 40th.
Probably penalizing a lack of speed here, and overcounting other stats.
I want a new stat called “Excitement Factor.” Triples count more…late-inning clutch counts more….grand slams count more….stealing home is off the charts….stealing third is nice too…unbelievable defensive plays get factored in….I have no time to think this through right now, but seriously, I want an “excitement factor.” We can empirically determine what is “excitement,” then weight things appropriately.
Who would be tops?
(okay, no kidding, i just thought of that, and before I hit “submit” I checked google and found this:)
http://www.associatedcontent.com/article/1507571/batting_excitement_factor_bef_explained_pg2_pg2.html?cat=14
there is officially a stat for everything.