Hitting Average
Posted: September 5th, 2009 | Filed under: Baseball | 134 Comments »
It’s obvious that I do not know enough about math to be fully disgusted by OPS the way many are. But I don’t like the stat. I don’t like it because it seems to overstate the importance of slugging percentage while understating the importance of on-base percentage. I don’t like it because on-base percentage is based on the number of plate appearances a hitter gets while slugging percentage is based on the number of at-bats a hitter gets. I don’t like it because I think it should be spelled out (as in Albert Pujols has a 1.114 oh-pee-ess*) while others sound it out so it sounds like something they do in the military.
*Remember O-Pee-Chee sports cards? Those were the Canadian baseball and hockey cards — and the baseball cards were precisely like the Topps cards except they had a little O-Pee-Chee logo on them. As I recall this O-Pee-Chee logo immediately made the card worth about 60% less, though I never really knew why.
Mostly, though, I don’t like OPS because I have no idea why THAT turned out to be the semi-advanced stat that went mainstream. I may have written this before … but I feel about OPS the way I felt about Hootie and the Blowfish. I have no beef with the band — I’ve been told by several people that they are really nice guys, and to me their music is perfectly harmless. But for a short while, they were GIGANTIC and I never understood that. Seems to me there were roughly 1.4 million bands in garages and bars across the United States that sounded just like Hootie and the Blowfish — so what was it that made Hootie click with the world? Was it the name? Was it the sports connection? Was it that they just seemed likable?
And so — what is it about OPS? I know people say that it made it because it’s simple. But it really isn’t all that simple. I mean, it’s not complicated but it still means adding on-base percentage (Times on base / plate appearances) and slugging percentage (Total bases / at-bats). Until recently, nobody cared about times on base or total bases. Until recently, on-base-percentage and slugging percentage were hardly mainstream. It isn’t THAT simple.
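For anyone who wants to see the two pieces side by side, here is a minimal sketch of OPS using the simplified definitions in the paragraph above (times on base over plate appearances, total bases over at-bats). The function name and the season line are made up for illustration; official on-base percentage uses a slightly narrower denominator (at-bats, walks, hit by pitch and sacrifice flies).

```python
def ops(times_on_base, plate_appearances, total_bases, at_bats):
    """OPS as described in the post: OBP (per plate appearance) plus SLG (per at-bat)."""
    obp = times_on_base / plate_appearances   # simplified on-base percentage
    slg = total_bases / at_bats               # slugging percentage
    return obp + slg


# Hypothetical season line, invented for illustration only.
example = ops(times_on_base=240, plate_appearances=600, total_bases=300, at_bats=500)
print(round(example, 3))
```

Note that the two fractions have different denominators, which is one of the complaints raised about the stat.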
Maybe it’s because OPS has a clear line of excellence. A 1.000 OPS is sensational. So in that way it works like our old school grading system. Anything above 1.000 was A+. Anything between .900 and 1.000 was an A. and so on.
Then again, maybe it’s just a mystery … maybe OPS just happened to hit Gladwell’s Tipping Point when other statistics that probably tell a much clearer story, such as EqA, wOBA, runs created, Win Shares and so on, just watched on from their garages, left only to play for their cult audiences.
Well, I put out a call to the brilliant Tom Tango and his folks to try and come up with our own statistic, the stat that we will try to make the official stat of this blog. I asked for a handful of things:
1. It has to be simple. Very simple. I do appreciate that the NFL has passer rating, which is obscenely complicated. But really nobody cares about NFL stats except when it comes to their fantasy leagues anyway. The stat has to be simple … and by simple I think it needs to be something that can be easily explained and have people on the other end nodding.
2. It should correlate directly to runs. I will get 100 to 200 emails or letters every year from people who want to know why I don’t like batting average. The point of the game is to hit the ball! And batting average measures that! It was good enough for our parents it’s good enough for us! And go move to Russia!
The problem with batting average is precisely that the point of the game is NOT to hit the ball. The point of the game is to score runs. And batting average does not correlate all that well to scoring runs. The Baltimore Orioles have a better batting average than the Boston Red Sox. But the Red Sox have scored 92 more runs. The Texas Rangers are hitting .260 to Seattle’s .258, but the Rangers have scored MORE THAN 100 more runs than the Mariners. The New York Mets have a MUCH better batting average than Philadelphia (.269 to .257) but the Phillies have averaged almost one run more per game. It’s always like this.
So, what I want here is a simple statistic that can tell us something that matters. And I’d like that statistic to be somewhat independent of the team context. Sure, I do understand why so many people love runs and RBIs and will never let go of them. But runs and RBIs (minus home runs) rely on what your teammates do. Players on good teams get more run and RBI chances. Leadoff men come up more often with nobody out, and men in the middle of the lineup come up more often with men on base.
If you go 4-for-4, all doubles, but there is no one on base and no one drives you in, then by the run/RBI measure another guy who goes 0-for-5 with three double play groundouts but one run-scoring fly ball and one run scored after reaching on a fielder’s choice had a MUCH better day than you.
3. It should be catchy. I’m not sure what makes a statistic catchy, but we’ll try.
So, Tom Tango and his group went back into their statistical bag and came up with a stat that might fit the bill. He called it “Linear Weights Ratio” which — and I mean no disrespect — is a non-starter. Can’t have “Linear” AND “Weights” AND “Ratio” in our stat title. Frankly, any of those three words would be enough to send the masses screaming running into the arms of Tim McCarver. I’m trying “Hitting Average,” in homage to Tom Boswell’s “Total Average,” which never quite took off.
The Hitting Average stat really is simple though. You get a Baseball Point for every single you hit, a little more for every double, a little more for every triple, and a little more for every homer. You get a little less for a hit-by-pitch or walk, and a little less than that if it was an intentional walk — that should please all those who get irritated whenever someone talks about walks. You get a little bit for a stolen base and you cost yourself a little bit if you get caught stealing.
And that’s it. If you want the full formula so you can play with it at home it’s like so:
(1 * singles) + (1.6 * doubles) + (2.2 * triples) + (3.0 * homers) + (.7 * unintentional walks) + (.7 * hit by pitch) + (.4 * intentional walks) + (.4 * stolen bases). Then you subtract (.4 * caught stealing).
That gives you the plus side — the positives that you have contributed to the team. And you should know that baseball points has a HUGE correlation to runs scored.
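If you would rather let a computer do the adding, here is a minimal sketch of the plus side. The weights are copied from the formula above; the dictionary keys, the function name and the sample season line are assumptions made up for illustration.

```python
# Weights copied from the formula in the post. Caught stealing is negative
# because the post subtracts it from the total.
WEIGHTS = {
    "singles": 1.0,
    "doubles": 1.6,
    "triples": 2.2,
    "homers": 3.0,
    "unintentional_walks": 0.7,
    "hit_by_pitch": 0.7,
    "intentional_walks": 0.4,
    "stolen_bases": 0.4,
    "caught_stealing": -0.4,
}


def baseball_points(line):
    """Sum the weighted events in a season line (missing events count as 0)."""
    return sum(weight * line.get(event, 0) for event, weight in WEIGHTS.items())


# Hypothetical season line, not a real player.
sample_line = {
    "singles": 100, "doubles": 30, "triples": 5, "homers": 25,
    "unintentional_walks": 60, "hit_by_pitch": 5, "intentional_walks": 10,
    "stolen_bases": 10, "caught_stealing": 5,
}
print(round(baseball_points(sample_line), 1))
```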
2009 leaders in Baseball Points
1. Albert Pujols, 327
2. Prince Fielder, 301
3. Chase Utley, 289
4. Ryan Braun, 289
5. Mark Teixeira, 286
6. Adam Dunn, 284
7. Mark Reynolds, 282
8. Ryan Howard, 281
9. Miggy Cabrera, 280
10. Hanley Ramirez, 277
Then you need the negative side — the outs that you make. For some reason, Tom Tango includes sacrifice flies but NOT sacrifice hits in this. He also does not put double plays in the formula … I suppose for the same reason that he doesn’t include RBIs or runs. You can’t hit into a double play unless someone is on base. Although Yuniesky Betancourt has tried. Anyway, I don’t question Tom when it comes to the numbers.
So the formula is:
(At bats – hits) + sacrifice flies + caught stealing (not sacrifice hits like I originally had).
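The outs side is a one-liner. This sketch follows the formula as written (sacrifice flies and caught stealing count, sacrifice hits and double plays do not); the function name and the sample numbers are hypothetical.

```python
def negative_baseball_points(at_bats, hits, sacrifice_flies, caught_stealing):
    """Outs charged to the hitter: (AB - H) + SF + CS, per the post's formula."""
    return (at_bats - hits) + sacrifice_flies + caught_stealing


# Hypothetical season line, not a real player.
print(negative_baseball_points(at_bats=550, hits=160, sacrifice_flies=5, caught_stealing=5))
```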
2009 leaders in Negative Baseball Points
1. Jimmy Rollins, 427
2. Aaron Hill, 417
3. Vernon Wells, 400
4. Orlando Cabrera, 400
5. Brian Roberts, 396
6. Rafael Furcal, 394
7. Nick Markakis, 392
8. Curtis Granderson, 392
9. B.J. Upton, 389
10. Alex Rios, 388
Then, all you need to do to find the most effective hitters is … divide the plus points by the negative points.
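As a sketch, the ratio is just the plus points over the outs. The function name and the input numbers are made up for illustration; the guard against zero outs is an addition of this sketch, not something the post discusses.

```python
def hitting_average(plus_points, outs):
    """Hitting Average: Baseball Points divided by Negative Baseball Points (outs)."""
    if outs <= 0:
        raise ValueError("outs must be positive to compute a ratio")
    return plus_points / outs


# Hypothetical hitter with 300 plus points and 400 outs.
print(hitting_average(300, 400))
```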
Baseball Leaders in Hitting Average (League average is just about .600 if you include pitchers)
1. Albert Pujols, .990
2. Joe Mauer, .941
3. Kevin Youkilis, .876
4. Chase Utley, .869
5. Hanley Ramirez, .860
6. Adam Dunn, .856
7. Prince Fielder, .854
8. Miggy Cabrera, .840
9. MannyBManny, .826
10. Adrian Gonzalez, .808
Or if you prefer Hitting Average Plus — where the league average is exactly 100 — it would go like this:
Top 10
1. Albert Pujols, 165
2. Joe Mauer, 157
3. Kevin Youkilis, 146
4. Chase Utley, 145
5. Hanley Ramirez, 143
6. Adam Dunn, 143
7. Prince Fielder, 142
8. Miggy, 140
9. MannyBManny, 138
10. Adrian Gonzalez, 135
Bottom 10 (minimum 300 at-bats)
10. Gerald Laird, 79
9. Cesar Izturis, 77
8. Adam Everett, 77
7. Emilio Bonifacio, 75
6. Nick Punto, 75
5. Ronny Cedeno, 72
4. Yuni, The Musical!, 71
3. Dioner Navarro, 70
2. Alex Gonzalez, 69
1. Willy Taveras, 68
So there you go … hitting average using nothing but the basics — hits, walks, hit by pitches, stolen bases, and caught stealing. There is also a fairly simple way you can convert this information into estimated runs created, but let’s leave it where it is for now.
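The "Plus" version appears to be the Hitting Average divided by the league average and scaled so that average equals 100. The sketch below assumes that reading and uses the rounded .600 league figure quoted above; the post does not spell out the exact league number, so treat both as assumptions. The function name is hypothetical.

```python
def hitting_average_plus(hitting_avg, league_avg=0.600):
    """Scale a Hitting Average so the league average becomes 100 (rounded to an integer)."""
    return round(100 * hitting_avg / league_avg)


# Hitting Averages taken from the top-10 list in the post.
for name, avg in [("Pujols", 0.990), ("Mauer", 0.941), ("Youkilis", 0.876)]:
    print(name, hitting_average_plus(avg))
```

With the rounded .600 league figure, this reproduces the Plus values listed for the top ten.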
Math scares me. This terrifies me.
Sounds good. Any chance we’ll see the Hitting Average Against for pitchers?
This guy has a crazy work ethic for pointless stats. In other news the KC Royals have scored one more run than the Angels after four innings.
I like it with one exception, but I’ll get to that in a second.
Just to explain the OPS thing (and I never liked that stat, I’ll get to that in a second, too). Here’s why it caught on:
You can do it in your head.
The only math involved is adding two numbers that appear on the main table on Baseball-Reference or any other stat site (ESPN, SI.com, BP or Fangraphs without digging into the advanced stuff, etc.).
This is not that.
That’s not my exception though. My exception is that a CS should deduct twice as many points as a SB adds. Tango should know this. There’s a reason a base stealer has to be above 66% success to be helping his team. I’ll spare the number crunching here, but the simple way to say it is that a CS not only results in an out, it eliminates a baserunner; that’s two negatives, not just one (though that’s not really the mathematical reason, but blah blah).
At any rate, I’m still ticked off that GPA never caught on as an easy OBP-weighted OPS replacement.
I believe Aaron Gleeman came up with it back in the early days of baseball blogging. I’m half tempted to believe that his original name “Gleeman Production Average” has something to do with why it didn’t stick. Now called “Gross Production Average” by The Hardball Times, it’s as good as ever and as ignored as ever.
It very simply gives proper weight to OBP (1.8:1 vs. SLG), adds the weighted OBP to slugging, just like OPS, then adjusts it to the batting average scale, so it’s in a familiar language (.265 is average, .300 is excellent, etc.). It even correlates fairly well to EqA, despite the fact that EqA (also on the batting average scale) has a ton more math going on, including adjustments for league, etc.
GPA = (1.8*OPS+SLG)/4
You can calculate with a hand-held calculator in five seconds. (OPS*1.8, enter, +SLG, enter, /4=GPA!)
Right now:
Pujols: .367 GPA
Yuni!: .203 GPA
Joe,
One thing I think you should think about since you like E=MC squared so much. I’m not smart enough to know what it is, but I’m sure Einstein had an objective in figuring this equation out. Figuring out these pointless stats achieves nothing.
Watch the beautiful game of baseball and enjoy it; all that matters is who scores more runs. They have an old saying, “that’s baseball”; really, it’s that simple.
The object of the game is neither to hit the ball nor to score runs. The object is to win the game . No INDIVIDUAL stat measures that directly, though Win Shares is (I guess) an attempt to do so. I’m not really sure because though I’ve read the explanation several times I don’t understand it at all. I think the reason for that is that Bill James doesn’t understand it either and he wrote the explanation. If you’re going to INSIST that scoring runs is the object, why not use Chadwick’s favorite (and only) stat RUNS SCORED? If you list all the players in the league in order by the number of runs they scored you’ve really got them in the order of the offensive contribution they made to their teams. (Pretty much) Finding a stat that CORRELATES to Runs seems kind of pointless to me. All the attempts to find a more MEANINGFUL stat seem to require modifying the run to eliminate the contributions of others, but that doesn’t measure the players contribution, but rather his POTENTIAL contribution. He WOULD have scored a run if only… I would think we should concern ourselves with what the player actually did—not what he might have done if things had been different. Ted Williams may have been a better hitter than Joe Dimaggio because he played in Fenway instead of Yankee Stadium, but that doesn’t change the fact. For whatever reason, he was the better hitter and made more of an offensive contribution to his team than Dimaggio did. The fact that you may not LIKE that doesn’t alter anything. Actually, the main reason Williams was a better hitter was that he hit lefthanded. All lefthanders have a natural advantage that is completely unfair, but that doesn’t change anything.
“All the attempts to find a more MEANINGFUL stat seem to require modifying the run to eliminate the contributions of others, but that doesn’t measure the players contribution, but rather his POTENTIAL contribution. He WOULD have scored a run if only… I would think we should concern ourselves with what the player actually did—not what he might have done if things had been different. ”
I’m not smart enough to understand this sentence.
The reason why you (try to) eliminate the player’s teammates is because you are interested in the player, not his teammates. I don’t want DiMaggio to get any benefit from the fact that he played with Charlie Keller and Johnny Mize, and Williams didn’t. I also don’t want Williams to get any benefit from the fact that DiMaggio needed a driver and a three iron to hit the ball out of Yankee Stadium in left field.
I don’t see that as looking at his potential contribution so much as looking at his performance in isolation, so you can compare him against Ty Cobb and Jackie Robinson and Albert Pujols.
Mike/6: WPA (or WPA/LI) might measure what you want
Cliff/4: Joe misstated the “outs” portion.
http://tangotiger.net/lwr.html
It reads as follows:
AB-H + SF + CS
Joe said “SH” when he should have said “CS”.
Joe/0: the reason I don’t include SH is that it’s a pretty neutral play (as well as being elective on the batter’s manager’s side). I’m surprised I kept the IBB in there, though. I go both ways on that… sometimes I put it in, and sometimes I don’t. The SH, however, I never include.
I’m not entirely sure why it’s good/bad instead of good-bad (or possibly (good-bad)/pa). But still, seems interesting.
The stats proposed here may be more accurate as an overall assessment of hitting value, but there are two fatal weaknesses.
First, as mentioned, they are too difficult to compute.
Second, they seem arbitrary – why give doubles 1.6 times the value of a single? Why have an unintentional walk count for more than an intentional walk? It’s too subjective.
For OPS, adding together OBP and SLG is arbitrary, yes, but the underlying components (OBP and SLG) are not. Total bases and times on base are concepts grasped intuitively, without any subjective multipliers added in.
This is essentially Base-Out Percentage or Total Average with a bit more weight given to getting on base (a homer counts 3x singles instead of four). It may correlate better with runs (it’d be nice to see the numbers if you ran them), but the additional complexity outweighs the benefits, IMHO.
Why do you give a 1.6 weight for doubles and 2.2 for triples? I don’t understand the reason for these, and if the math doesn’t have a reason then it’s likely not really describing what you’re trying to describe.
The problem that I have with OPS is that you are deriving a statistic that involves the addition of 2 different percentages. This does not make any mathematical sense. I like Joe’s formula, simply because I like math, but the average baseball fan won’t like to do the calculations.
Go to baseballreference.com. Pick any team and any year of a team that you never watched. For example, the 1931 Reds. Ok. I took my advice. Looking at the stats with no crazy formula (except the ones provided), I have concluded in less than 1 minute that Tony Cuccinello was the best player on the ‘31 Reds. He also probably batted clean up or 5th.
In your 0 for 5 example, a sac fly does not count against batting average.
First off, props on using average. I’m sick of seeing everyone continue to use percentage with on-base and slugging when both are clearly averages.
Second of all, this has no chance to catch on. Way too complicated a formula and the weighting seems pretty arbitrary. It makes sense for the most part, but there’s no obvious reasoning behind the weighting.
I agree that OPS is not the greatest stat. It has many problems. It just seems odd that two stats that measure two completely different attributes of a hitter are just arbitrarily added together.
I also find it interesting that the creator of OPS was obviously trying to come up with something that is a better measure of batting average and wanted to reward the players who would walk and/or hit a lot of extra bases that wouldn’t show up in batting average. However, OPS ends up counting batting average twice, since batting average is a component of both on-base average and slugging average.
So, how about taking out the extra batting average component by turning the slugging into IsoP, which is just slugging minus batting average? What you end up with is just OBA + Slugging – BA or (OPS)MB or you could shorten it to OMB.
The current AL average for OMB is .498, so that works really well and would make it pretty easy to get a rough idea of OMB+ since you just take the difference and divide by 5.
Ichiro is a great example of a guy who is rewarded too much in OPS because he has a very high average, but rarely walks and doesn’t hit for much power. His OPS+ is 129, but his OMB+ is 100.
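A rough sketch of the OMB idea, for anyone following along. The first function is the formula as stated (OBA plus slugging minus batting average). The second is only one reading of "take the difference and divide by 5": it assumes the league average is treated as roughly .500 and that the difference is measured in points (thousandths). The function names and the sample slash line are hypothetical, and real plus-stats such as OPS+ also adjust for park, which this does not.

```python
def omb(oba, slg, ba):
    """OMB: on-base average plus slugging minus batting average."""
    return oba + slg - ba


def omb_plus_shortcut(omb_value, league_omb=0.500):
    """Rough OMB+: 100 plus one-fifth of the gap to league average, in points."""
    points_above_league = (omb_value - league_omb) * 1000
    return 100 + points_above_league / 5


# Hypothetical slash line: .300 BA / .400 OBA / .500 SLG.
value = omb(0.400, 0.500, 0.300)
print(round(value, 3), round(omb_plus_shortcut(value)))
```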
I think tangotiger addressed this, but I am not about to break out the scratch paper and do the math right now but…
If we are disregarding sacrifice hits, why do you have it as part of the Negative Baseball Points equation?
Seems like that could gum up yr formula a bit.
I remember there was a lot of talk about developing a metric that weighed on-base percentage at a higher level than slugging percentage at the time, and then Aaron (who I like) came along and wrote a blog post and slapped his name on it and called it GPA. Which isn’t a good way to earn much appreciation.
Joe, as to why OPS became the most popular, for me it was entirely Rob Neyer. I started reading Rob’s ‘Chin Muzak’ when it started at ESPN.com around ‘95/96 and that was the first place I ever encountered the concept of ‘advanced stats’. OPS was the one that Rob pushed the hardest, largely, I assume, because it was so simple to calculate. After all, ESPN.com took many years to add it to the stats page, so for ~5 years I would auto sort hitters by OBP and then scan the list in real time, adding SLG to get their OPS, and re-rank them in my head. Rob continually pointed out that something like ~1.5OBP+SLG was more accurate, but said that it was not enough so to justify the extra math.
So to extrapolate from my available data pool, I’d say 100% of people default to OPS because of Rob.
I’ve always preferred GPA (Gleeman Production Average a.k.a. Gross Production Average) — the stat Cliff Corcoran mentioned in comment #4 — to OPS, but the easy math of OPS all but ensures it will remain in wide use.
For those unfamiliar with GPA, Cliff described it correctly above but made a typo in the actual equation (inserted OPS instead of OBP). This is the correct version:
GPA = [(OBP*1.8) + SLG] / 4
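The corrected equation translates directly into code. This is a minimal sketch; the function name and the sample OBP and SLG values are made up for illustration.

```python
def gross_production_average(obp, slg):
    """GPA as given above: on-base weighted 1.8 to 1 against slugging, divided by 4."""
    return (1.8 * obp + slg) / 4


# Hypothetical hitter with a .400 OBP and a .600 SLG.
print(round(gross_production_average(0.400, 0.600), 3))
```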
As a Reds fan, it just thrills me to see the bottom two guys in hitting average + are two players that Dusty frequently felt compelled to use 1-2 in the batting order.
AGon is now in Boston. Willy T. is on the DL. Any coincidence that the Reds have started to win a few of late?
Dusty is a moron.
For (10) and others; this is essentially just linear weights with the value for a single changed from its actual average run value, to 1 and the other values changed in proportion to that. A double is given 1.6 because on the average of all situations, a double is worth about 60% more than a single.
The other biggest change from the way linear weights are usually used is in the outs. Usually you just subtract run values for the outs from the run values for the positive events like hits, walks and stolen bases. We could do that here; a batting out would be worth about -.6 of a single, and a caught stealing costs about as much as a single adds. Instead the choice was made to take positive linear values minus something for caught stealing, divided by total outs used. Which I think is a simplification but defensible.
As to whether or not we need a new statistic, and can readily influence its popularity; that’s a different discussion, and I’m dubious.
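To make the contrast concrete, here is a sketch of the "subtract the outs" version described in this comment. The two cost constants are the commenter's approximate figures (in units of a single), not published linear weights, and the function name and inputs are hypothetical.

```python
BATTING_OUT_COST = 0.6       # approximate cost of a batting out, in singles (per the comment)
CAUGHT_STEALING_COST = 1.0   # approximate cost of a caught stealing, in singles (per the comment)


def net_points(positive_points, batting_outs, caught_stealing):
    """Subtractive form: positive events minus the cost of outs, instead of a ratio."""
    return (positive_points
            - BATTING_OUT_COST * batting_outs
            - CAUGHT_STEALING_COST * caught_stealing)


# Hypothetical hitter: 287.5 points from hits, walks and steals, 395 batting outs, 5 CS.
print(round(net_points(287.5, 395, 5), 1))
```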
I agree that OPS is far from the perfect stat. But the top 10 players in Hitting Average just so happen to be the same 10 players currently leading in OPS (if you pretend Manny has played enough to qualify). The order is only slightly different.
Just sayin’.
Damn typos. Thanks, JK [20]
Tango, [8] I see what you did there. A SB = 0.4 and a CS negates that 0.4 on the numerator and adds -1 (outs) in the denominator. That’s fair. I didn’t think it was really possible that you’d have a straight SB=CS valuation in there.
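A small sketch of the mechanics described here: a stolen base adds 0.4 to the numerator, while a caught stealing both removes 0.4 from the numerator and adds an out to the denominator. The function name and the baseline numbers are made up for illustration.

```python
SB_VALUE = 0.4     # added to the plus points per stolen base
CS_PENALTY = 0.4   # subtracted from the plus points per caught stealing


def hitting_average_with_steals(base_points, base_outs, stolen_bases, caught_stealing):
    """Apply steal attempts to a hitter's points and outs, then take the ratio."""
    points = base_points + SB_VALUE * stolen_bases - CS_PENALTY * caught_stealing
    outs = base_outs + caught_stealing   # each caught stealing is also an out
    return points / outs


# Hypothetical hitter at 240 points and 400 outs before any steal attempts.
baseline = hitting_average_with_steals(240, 400, 0, 0)
two_for_three = hitting_average_with_steals(240, 400, 2, 1)
three_for_four = hitting_average_with_steals(240, 400, 3, 1)
print(baseline, two_for_three, three_for_four)
```

In this made-up case, going 2-for-3 on steals nudges the ratio slightly below where it started, while 3-for-4 nudges it slightly above, so the caught stealing does cost more than the stolen base adds.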
That said, this all points back to why this stat, while a good one, is not the sort of simple intuitive stat that has much chance of going mainstream, at least not until other “gateway” stats get there first. I always felt OPS was worth supporting because it would be a gateway stat to GPA, which in turn would be a gateway stat to things like EqA, but we’re still standing in the first gateway without having walked through it.
Bill [18], so it seems my theory that Gleeman killed GPA by naming it after himself has some other adherents.
[pouts, shakes fist]
the reason OPS is also attractive is because it seems, at some abstract level, to capture the concept of “scoring runs”…as in “getting on base helps score runs,” and “slugging the crap out the ball helps score runs.” It’s a theoretically muddled stat, but it seems to reflect what “the game is about” from a hitting POV.
in order for your stat (HA..the stat, I’m not laughing at you–perhaps a better name IS in order!) to be of inherent value, one has to believe that every at-bat is of equal worth. and plain and simple, they don’t all matter the same.
it seems to me that there has to be a way to take this no-nonsense straightforward kind of thing, and add an optional “layer” of added-value beyond the basic layer. while still keeping the playing field even (using percentages, versus absolutes), reward people who do well when it matters more and penalize those who suck.
to show the problem with your stat:
if two people come to the plate 600 times, and both have the opportunity to knock in the same amount of people, and they hit the same amount of singles, doubles, etc…yet one person drives in 120 runs and the other drives in 80, your stat will say they have had an equivalent season, that they have contributed the same “positives to the team.” But they haven’t, have they?–one “guaranteed” his team 40 more runs, the other *at best* offered his team a percentage chance at those runs, which is going to end up lower than the absolute.
So yeah, RBIs are BS…but only to a point. No, you can’t penalize a guy from the Royals because his teammates are unable to get on base…and you can’t reward the Red Sox hitter who has teammates crawling all over the basepaths. The Sox hitter may end up with 50 more RBIs, but if the Royals player has the better “efficiency” (production as a function of opportunity) that’s “better production”.
But your stat completely ignores this aspect of the game, and says all at-bats are created equal. Look, if a hitter sucks at knocking in runs when given the opportunity, he is probably of less value than someone who doesn’t.
Consider a situation with someone at second base and one out. Everytime you “don’t” knock in the run, you’ve either lowered the odds of that run being knocked in (by making an out), or kept the odds of that run roughly equal or increased a bit (by walking, or singling him to third without making an out, etc.) while putting forth new opportunity of you scoring later (which of course is subject to others and thus problematic). But someone who more frequently knocks in that run is superior on the “guaranteeing runs” front. So somebody good at knocking in runs when given the opportunity (a percentage weighted knocking in runs of people when they’re on first, then factor in when they’re on second, then when they’re on third, etc.) is obviously more valuable to the team than somebody who is a bases-empty hero but sucks with runners on.
And of course your stat will have high correlation to runs scored (it’s built of “things” that are of increasing value the closer they are to scoring a “run”; e.g., a homer counts the most because it is a run). But correlation is not necessarily causation. Ice cream sales are correlated to sunburns, but don’t cause them.
That stat also only captures “baserunning” as steals and caught stealing…which is *completely wrong*. okay, now i’m joking. you did want the K-I-S-S method. it just seems like the K-I-S-S method could incorporate some additional percentage-based measure incorporating the reality of opportunities to score…or be the “individual-based” stat, while having a cousin stat to satisfy people like me (and your KC broadcasting buddy) that think these stats that say “nothing can ever matter except what the individual does, regardless of situation” are kind of silly, having played this game, and watched the pros play it, for years and years.
At the end of the day, I think an RBI stat that is made “fair” would appeal to people. And probably already exists, unbeknownst to me.
I’m not sure much of what I’ve written makes sense, but I hope there’s something there that a superior-math-geek can use. That’s usually my relationship with stats/math–lots of spider-sense tingling, zero clarity.
“That gives you the plus side — the positives that you have contributed to the team. And you should know that baseball points has a HUGE correlation to runs scored.”
Did you actually do the correlation or are you just assuming?
i like it joe. now, take a nap or go to the park with the kids!
Sorry, but Hootie and the Fish that Blow are to OPS as OPS is to E=MC(2)…
~
#10 & 22: Weighting a single as 1 and a double, triple and HR less than 1/base, you accomplish what Joe set out to do – reward OBP over SLG.
I always thought that reaching on error should be credited. It would only make sense not to credit it if it were entirely random, but it is not, as speed appears to correlate considerably with reaching on error.
I get a chuckle out of all these comments that say the weighting is arbitrary – these are about the only possible weightings that really, truly mean something.
Lemme try to explain this in the best way I can. What we want to know about is runs, because runs are our fundamental unit. Now, remember:
1 run = 4 bases
You have to advance four bases (first, second, third, home) in order to score one run.
So, how many bases is a single worth? Someone out there is going to answer “One,” and wonder how I can be so stupid.
And, yes, the single typically advances the batter by one base. (Not always – sometimes a runner is thrown out at second attempting to stretch it into a double, sometimes he advances to second on an error.) But it also advances the other baserunners.
So let’s say that a single advances a total of roughly 1.9 bases on average – the fractional number of bases is a reflection of how often a single is hit with the bases empty, versus having runners on.
Now, how many bases is a double worth? Let’s say a double is typically worth roughly three bases – two for the batter, and one for the runners on ahead of him (again, this is tempered by the fact that most doubles occur with no runners onboard).
So, what’s 3 divided by 1.9? Just about 1.6.
And that’s why the numbers are what they are – they’re empirically derived from the relative value of the various events and how they contribute towards team run scoring.
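The arithmetic in this comment can be sketched in a couple of lines. The bases-advanced figures are the commenter's rough numbers ("let's say"), not measured values, and the names are hypothetical.

```python
# Rough average total bases advanced (batter plus runners), per the comment.
AVG_BASES_ADVANCED = {"single": 1.9, "double": 3.0}


def relative_weight(event, baseline="single"):
    """Value of an event relative to the baseline event."""
    return AVG_BASES_ADVANCED[event] / AVG_BASES_ADVANCED[baseline]


print(round(relative_weight("double"), 2))  # 1.58, which rounds to the 1.6 used in the formula
```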
OPS, of course, places a different emphasis on the double compared to the single – about 1.8 (this number of course flexes based upon things like a player’s walk rate – which of course makes no sense). The problem is that it’s buried under the false simplicity of “OBP plus SLG” – which is simpler because you’re IGNORING what it’s doing under the hood. OPS is doing the same sort of relative weighting process as what’s going on here, and its weights truly are arbitrary. You just don’t have to look at them.
The weights are from:
http://www.tangotiger.net/lwr.html
The link to fanhome doesn’t work anymore sadly.
The reason to not use OBP or SLG is because you can’t combine them while fixing their flaws, as The Book says when it introduces wOBA. Also, LWR is more flexible in that you can recast it so 100 is league average and compare the ratios, a la OPS+.
And yes, the extra complexity is needed. Baseball is a complicated game and throwing away results because it’s “complicated” is silly. Because really, do you calculate OBP every time or do you just look it up? Ditto for wOBA, FIP, et cetera. The point in giving the equation is for critiques about its methodology, not complication.
What exactly does this have to do with runs? I’m not seeing it.
And Mike, the “baseball points” are pretty much linear weights multiplied by two. So the correlation with team run scoring should be very good indeed. (And if you just divide the baseball points in half, you should see pretty good average error as well, probably less than 20.)
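The rule of thumb in this comment is easy to sketch. It is the commenter's approximation, not something verified here, and the function name and the team total are hypothetical.

```python
def estimated_team_runs(team_baseball_points):
    """Commenter's rule of thumb: team runs are roughly half the team's Baseball Points."""
    return team_baseball_points / 2


# Hypothetical team total, for illustration only.
print(estimated_team_runs(1500))
```

Comparing this estimate with actual team run totals would be one way to check the "HUGE correlation" claim rather than assume it.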
As a child, I always went to the big baseball card shows at the O.P. Merchandise Mart. My brother and I would spend hour upon rapturous hour canvassing the rows of vendors looking for the cards we just had to have.
I loved Frank Thomas and Thurman Munson (my Dad said I played catcher like him, and it was OK to like him even if he was a Yankee). My brother scavenged for Kirby Puckett.
I’ll never forget the sinking feeling of finding the perfect card, forking over hard-earned lawn mowing money, and looking closer at my prize only to see “O-Pee-Chee” where the “Topps” should’ve been. Like Jacob pulling back the wedding veil to find the plain face of Leah instead of his darling Rachel, I had been bamboozled.
I was barely old enough to multiply for crying out loud. I didn’t even know O-Pee-Chee cards existed. I’m sure similar stories have played out at card shops and shows across the nation. Shoot, I bet even Canadian kids would be disappointed if they were hoodwinked by the O-Pee-Chee gambit. They’d probably know better, though.
Dealers that intersperse O-Pee-Chee’s with their Topps cards should be condemned to 7 years hard labor.
So Cal Twins fan,
They’re all percentages: on-base, slugging, and batting average. “Average” is really a misnomer; it’s a percentage, as in you go 3-for-10, you hit .300, or you’re 30% successful in getting a hit.
I like OPS+ because the numbers are fairly low and it’s based on 100 being average. Plus it takes into account league averages and park factors. You have a player with a 150 OPS+ and you know he had a good season.
You can compare Albert Pujols fairly with Mickey Mantle, something you can’t do with just OPS.
The problem I have with OPS+ is that it doesn’t take into account stolen bases or the position a player plays like VORP does.
VORP is very good at showing offensive values but I think giving it a goofy name kind of backfired on the Baseball Prospectus people. It’s an easy target for the anti-saber crowd.
WARP seems like a good stat but the defensive replacement numbers give you some wacky results.
Robin Ventura’s 1992, in WARP 3, is among the best seasons in baseball history because of his defensive numbers?
WAR seems like a good unit by the people over at CHONE projections.
Win Shares is very good because it’s based on 0 instead of replacement level. But I think it’s suffered since Bill James went to be employed full time by the Red Sox.
Why do we need one stat that tells us everything about a player? People know that a guy who steals bases (at a high success rate) is more valuable than a guy who does not steal any bases. People know that a catcher is more valuable to his team than a DH with the exact same stats. These things are just baseball common sense and not something that needs statistical expression.
I do like OPS+ better than straight OPS but OPS+ gives me a good enough picture of what a player is doing the same way as ERA+ does for a pitcher. If I want more details I can get them. You’ll never find one stat that will combine everything important about all players in a way that will satisfy everyone.
@ 25: jay
“Consider a situation with someone at second base and one out. Everytime you “don’t” knock in the run, you’ve either lowered the odds of that run being knocked in (by making an out), or kept the odds of that run roughly equal or increased a bit (by walking, or singling him to third without making an out, etc.) while putting forth new opportunity of you scoring later (which of course is subject to others and thus problematic). But someone who more frequently knocks in that run is superior on the “guaranteeing runs” front. So somebody good at knocking in runs when given the opportunity (a percentage weighted knocking in runs of people when they’re on first, then factor in when they’re on second, then when they’re on third, etc.) is obviously more valuable to the team than somebody who is a bases-empty hero but sucks with runners on.”
Is it not rather well established that players, if given enough AB, will generally end up with a slash line in “clutch” situations similar to their normal one?
But even getting past that, your attempt to factor in how many runners a player actually drives in from each base opens up another huge can of worms for you – who those actual players on base are.
Say, for instance, you are the leadoff hitter for the Phillies. You would come to the plate with a possibility of Pedro Feliz, Carlos Ruiz and the pitcher on base ahead of you. That would, I believe, be the slowest combination of players you will find in a normal lineup this season. Now, compare that against the leadoff hitter for Cleveland (pre Lee/Francisco trade) with Jhonny Peralta (extremely slow), Luis Valbuena (very fast) and Ben Francisco (extremely fast) on base. A leadoff guy in Philadelphia is much less likely to drive in those runners than the leadoff guy in Cleveland – the baserunning ability of the specific players is an extreme advantage for the Indian.
@ 36: Michael_Q
“I do like OPS+ better than straight OPS but OPS+ gives me a good enough picture of what a player is doing the same way as ERA+ does for a pitcher.”
OPS+ and ERA+ are really no different than straight OPS/ERA – it’s just those numbers set against the league average with an attempted park factor added in. OPS+ still weighs BA way too heavily, where ERA+ still tells you nothing about a pitcher’s ability – just how many runners happened to cross the plate.
@38: JoeyO “ERA+ still tells you nothing about a pitcher’s ability – just how many runners happened to cross the plate.”
Considering the pitcher’s job is to stop runners from crossing the plate, that does tell me a lot about the pitcher’s effectiveness.
Of course defense plays a role but I’m not convinced people really understand enough about defense to completely separate defense from pitching yet. Looking at FIP as a secondary stat has some value but I think ERA+ will give the best indication of the pitcher’s effectiveness at preventing runs (which he always does in conjunction with his defense and not in a vacuum).
I know some people will disagree and perhaps when FIP stats advance more I will change my mind and start putting more stock in them but for now I’m dubious of their accuracy.
“Of course defense plays a role but I’m not convinced people really understand enough about defense to completely separate defense from pitching yet. Looking at FIP as a secondary stat has some value but I think ERA+ will give the best indication of the pitcher’s effectiveness at preventing runs (which he always does in conjunction with his defense and not in a vacuum).”
ERA (and in turn, ERA+) doesn’t even tell you how many runners crossed the plate when the specific player was on the mound. A pitcher could leave with a man he intentionally walked (which would be the manager’s decision) on first, two outs and two strikes on the guy at the plate, yet still get charged with an additional Earned Run if the catcher can’t hold onto the ball in a play at the plate up to two hitters later.
Defense, Managers, Bullpens, Umpires, Ballparks, Base-runners, even the current score – it all affects ERA quite a bit and none of it is in the pitcher’s control. With so many variables in play, it really becomes more of a team or situational stat than an individual measure. FIP is a much better measure of a pitcher’s performance in a neutral setting. It isn’t quite perfect, but it is light years closer to it than ERA.
Kevin-
You seem to fit the bill of a classic numbers-phobe who has never read an actual Joe Posnanski article.
Joe loves numbers BECAUSE he loves the game, and watching the game. He has written countless articles that are strictly game observations. Read a few. They are great.
Second- These stats aren’t pointless. They help answer cross-league questions (who’s the best player? Is so-and-so overrated? etc.) that can’t always be determined empirically. Calling them pointless doesn’t make them so. Likewise, Joe points out why the commonly accepted stats don’t cut it.
Open your mind, Kevin, and please don’t make such ignorant observations. The stat-lovers love the game of baseball as much as anyone else.
“Until recently, nobody cared about times on base or total bases. ”
Are you kidding? Rice just got elected to the Hall of Fame because he cracked 400 total bases 30 years ago.
@Mike Bagnall: While I think it’s reasonable to question the merit of assigning values to individual players, that train left the station long ago.
Now, Runs Scored is not a bad choice. But it does have one significant flaw, which is that there are a number of common circumstances where RS doesn’t provide an answer that meets the common sense standard.
Two examples that leap immediately to mind. Both start from a tie game in the 9th inning. In our first example Abel walks, and Baker hits a double; Abel scores the winning run. In our second example, Abel, Baker, Charles and Daniel all walk; Abel scores the winning run.
Does it make sense that Abel gets “all” of the credit for the run? If all you are doing is giving out prizes, then the answer is a matter of taste. If you are hoping to make accurate predictions based on credits previously earned, you’re likely to discover this evaluation is objectively inadequate.
Of course, when we say “accurate”, if we mean anything more than two digits of precision we are fooling ourselves – Tango would get behind you on that wagon, I’m sure.
I learned my “advanced stats” back in the days of Usenet on trn…
Addressing Joe’s point that OBP should play a larger part in OPS, I recall that back in the early to mid 90s, when I was following rec.sport.baseball, the consensus was that OPS should be calculated by multiplying OBP x 1.2 before adding it to SLG. Thus:
OPS = (OBP x 1.2) + SLG
I guess along the way it just became easier to add the two together.
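For anyone who wants to see what that 1.2 multiplier does in practice, here is a minimal sketch. The function name `weighted_ops` and its `obp_weight` parameter are made up for illustration; the two career lines are the Dunn and Teixeira figures quoted elsewhere in this thread, and nothing here claims to be the exact rec.sport.baseball method.

```python
def weighted_ops(obp, slg, obp_weight=1.2):
    """Return obp_weight * OBP + SLG. With obp_weight=1.0 this is plain OPS."""
    return obp_weight * obp + slg


# Career (OBP, SLG) pairs as quoted in this comment thread.
hitters = {"Dunn": (0.385, 0.523), "Teixeira": (0.378, 0.541)}

for name, (obp, slg) in hitters.items():
    plain = weighted_ops(obp, slg, obp_weight=1.0)
    weighted = weighted_ops(obp, slg)
    print(f"{name}: OPS {plain:.3f}, OBP-weighted {weighted:.3f}")
```

The extra weight nudges the high-OBP hitter up relative to the high-SLG hitter, which is the whole point of the adjustment.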
Hi Joe and everybody
I’m a relatively recent baseball convert (i’m irish, and became hooked when living in toronto a few years ago). i’m also a mathematician, and so am interested in (and often frustrated by) baseball statistics. while i understand that ops can be a good indication of runs scored etc, mathematically it doesn’t make sense. it isn’t an average, and you’re adding two things with different denominators. (i also don’t think that multiplying them makes any sense either)
I’ve always thought that the following would be a good measure. It probably already exists (because every possible combination seems to have been done at some stage!)
If we want to see how a batter performs, completely independent of his teammates or game situation, then what we should look at is how many bases they advance, on average. So it would be what i’d call ‘bases earned’ (i think i recall seeing it called complete bases somewhere) per plate appearance. so
BE/PA = (TB + BB + HBP + SB – CS)/PA
It measures how many bases you’d expect a batter to advance, under his own steam, each time he comes to the plate.
if you want to compare it to ‘baseball points’
out= 0
single, walk, hbp = 1
(single, walk, hbp) + cs = 0
(single, walk, hbp) + sb = 2
double = 2
etc.
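The BE/PA formula above can be sketched in a few lines. The function name and the season totals below are hypothetical, chosen only to show the arithmetic; they are not any real player's line.

```python
def bases_earned_per_pa(tb, bb, hbp, sb, cs, pa):
    """Bases earned per plate appearance: (TB + BB + HBP + SB - CS) / PA."""
    if pa <= 0:
        raise ValueError("plate appearances must be positive")
    return (tb + bb + hbp + sb - cs) / pa


# Hypothetical season: 300 total bases, 80 walks, 5 HBP, 10 SB, 5 CS in 650 PA.
rate = bases_earned_per_pa(tb=300, bb=80, hbp=5, sb=10, cs=5, pa=650)
print(f"BE/PA = {rate:.3f}")

# The point values in the list: a walk followed by a caught stealing nets zero bases.
print(bases_earned_per_pa(tb=0, bb=1, hbp=0, sb=0, cs=1, pa=1))
```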
I really think it should be measured per plate appearance, not as per out. It is a much more natural way to view it in my opinion, and more mathematically meaningful too. It also removes worries about caught stealing – if you walk and get caught, you might as well have not walked at all.
the problems with it include: singles and walks are counted as equal. clearly a single is worth more – but only if a man is on base.
similarly a gidp perhaps should be worth less than an out (maybe it should be counted as -1 in the formula – you don’t just fail to get a base, you erase a base your team had earned) – but again, that’s only if a man is on base.
a sacrifice counts as an out – maybe it should count as 1, as you’ve earned your team a base, but i don’t think so. maybe if they weren’t counted at all as plate appearances (but only if it’s a sac bunt, where you’re really not trying to get on base at all)
so i prefer to look at this stat as if every time you come to bat, the bases are empty – now how far are you going to advance? an average of .500 would mean that on average, you will get to first base once every two at bats.
while this stat will inevitably be less perfectly matched to runs scored by a team (because in, for example, GPA, they use a seemingly arbitrary combination (the 1.8 multiplier), calculated after the fact to match run production), i think it is much easier to understand.
Let me know what you all think!
PS in a previous post you had an almost perfect ‘runs scored’ predicting stat, and wondered about where the small error may come from. i think perhaps outfield assists against may have a bearing. for example, if you’re thrown out trying to stretch a single into a double, it gets scored as a single. perhaps it should be included in the stat as a ‘caught stealing’ – you hit a single, which could score a man from 2nd or 3rd, but you ran into an out. just as if you’d stopped at 1st and been caught stealing. just a thought.
Looks the same as the OPS top 10 list in a slightly different order, except that K. Morales is in the OPS top 10 instead of Manny (ESPN sortable stats). Seems like OPS works just fine to me. Why not accept OPS as the first step in the right direction? It is so easy…
To joeyo, and Joe Poz, and Tangotiger, and everyone else posting these wonderful statistics that treat every at-bat as equal:
I think you’re right…but wrong. And mostly wrong. But that’s probably because my brain is flawed. Allow me to explain.
Okay, you’re right:
I get the fact that *any* time another player is involved some bias enters the equation. So you take others out, and the only way to do that is say every at bat is equal, every situation is equal, and only what the player does *himself* counts.
But….point from my flawed brain #1:
I’m not a believer in the assumption that over enough at bats there is no such thing as “clutch.” I understand regression to the mean, and the issue of sample sizes, etc. But I also understand that we are humans and macro-based assumptions are often faulty (for an interesting non-sports example, see prospect theory by Tversky and Kahneman, which is a psych theory that was worthy of a Nobel prize…in economics).
All I’m ultimately sayin’ is that any stat that completely, absolutely, and positively states that “all at-bats are created equal” is flawed. It may be the best we can do (although I don’t believe that), but it’s inherently flawed.
Point from my flawed brain #2:
Again, returning to the example of two players, same amount of at-bats in same situations, same amount of singles, doubles, etc….but one knocks in 120 runs, the other 80. HA would say they are equal. I would rather have the guy who can hit with runners on base.
Real life empirics for P.F.M.F.B. #2:
Adam Dunn, for his career, has a .251 BA, .385 OBP and a .523 SLG, for a .908 OPS. With runners in scoring position, he’s hit .231, with a .421 OBP and .486 SLG for a .907 OPS. Mark Teixeira has similar career stats. He has a .288 BA, .378 OBP, and a .541 SLG, for a .918 OPS. But with runners in scoring position, he’s hit .317, with a .438 OBP, .595 SLG, and a 1.033 OPS. So if all at bats are created equal, especially if one favors OBP over SLG as more important, it’s easy to say that Adam Dunn has had AT LEAST an equal career to Teixeira. And a .421 OBP with runners on is really, really good. But the guy’s batting average in that situation is .231, so obviously he’s drawing tons and tons of walks. Tex, on the other hand, also steps up OBP, but also knocks the crap out of the ball more (more than +50 pts in SLG). All in all, Teixeira actually steps up his game with runners on base, and Dunn is just himself. He draws a lot of walks with runners on…Teixeira gets on base as well, and knocks people in. I get it, they played for different teams, pitchers probably had to pitch them differently with runners on, etc etc etc etc etc. No stat is perfect. But the HA methodology would probably say their careers are very, very similar. But there’s a reason why Mark Teixeira is making twice as much money and why the Sox and Yankees were both drooling over him. He’s been a very very good hitter overall, and a GREAT hitter with runners on base. Those are stats for their CAREERS. It’s a big enough sample size to say “overall, Tex has been a hell of a lot better with runners on base than Dunn.” (Which, by the way, makes 2009 remarkable–because they’ve completely switched. With RISP–Dunn, over a 1.000 OPS, Tex, below .900. Which is one reason why Dunn’s season is so underrated, and Tex’s overrated.)
Point from my flawed brain #3:
It’s distinctions like these–can a guy actually knock people in–that “every bat is equal” stats ignore. And that’s why they are cold and unappealing to some people, while RBIs are not. I think both kinds of stats are inherently flawed, for different reasons.
My flawed brain’s conclusion:
Somewhere in the middle lies the answer, the “magic stat” that will say how good a hitter REALLY is.
And a question for all of you stat-heads:
Honestly, if you were the GM of a team, and thinking about trading for a guy, would you not be CURIOUS how well a guy hits with RISP? Or in close-and-late situations? Because, if so, I think you’re lyin’.
did someone just claim that indians’ hitters have, “an extreme advantage” for driving in runners over phillies’ hitters because there are slow phillies’ runners on all of the bases?
dusty, welcome to the blog, i figured you were too busy mismanaging the reds’ to comment here.
Joe’s formula values an unintentional walk more than an intentional walk (actually, a lot more). I assume he didn’t do this arbitrarily, but I don’t see the reasoning behind it. The one advantage of a UBB (?) over an IBB is that presumably you have made the pitcher work harder in that at-bat. Is there something else I’m missing?
yeep – I like math and this hurts my head.
I most agree with jay (#46) – there are a lot of logical fallacies in all “true worth” stats. I get really bothered by the fact that (almost) everything seems reverse-engineered – i.e. find the combination of multipliers that gives you the least variation from the actual results, over a million games and a million at-bats and a million innings pitched. Reasoning from the conclusion to the premises ain’t right.
And too, yes, a team wins or loses. Drives me nuts when PECOTA says team x shoulda won 5 more or less games, so we’ll just project the season as if they did. It’s over, the games are in the books, sure, PECOTA or whatever might be just fine going forward, but what happened happened. Okay, player X hit 50 points better in the “clutch” than at other times. Okay, that’s not “repeatable”. But it did happen, you can’t change that.
The totality of degree of error makes things unreliable. I don’t have any better suggestion, though. I think there is some validity in win expectancy, and too, why can’t one just calculate an OPS-type number the same way as OBP, except with total bases replacing hits? I really doubt that everyone sits down and uses pencil and paper to calculate these things – with them things called computers, any damn stat can be produced.
OPS really really bugs me.
This is going to be long, and I’m sorry.
Jay @25:
“if two people come to the plate 600 times, and both have the opportunity to knock in the same amount of people, and they hit the same amount of singles, doubles, etc…yet one person drives in 120 runs and the other drives in 80…”
…then the “other” guy must be batting behind some slow-ass dudes, because outside of complete random chance that’s the only way for such a discrepancy to happen.
Look, I realize people desperately wish to ascribe relevance to a player’s ability to “produce” runs all on his own, but the fact of the matter is that if you go look up a bunch of good* players you’ll discover that not only over the course of their careers, but on a season-by-season basis, anywhere from 15-19% of the runners on base at the time they come to the plate will score as a result of their plate appearance. In fact, 15% and 19% are outliers; most decent players I spot-checked were at 16-18%.
I don’t want to get too much into a big long thing here (someone like Tango would be much better suited to analyze this more carefully), but in terms of the raw numbers of runners scoring while they’re at the plate, the difference between even 16-18% is pretty small. We’re talking on the level of about 11 runs (not all of which can be presumed to result in an RBI, mind you), assuming a similar number of opportunities.
But even more importantly, a lot of the difference between a guy at 16% and a guy at 19% can be explained by the fact that the guy at 19% simply had a better season anyway (i.e., better stats rather than “exactly the same”). I think a diligent researcher would be hard-pressed to find two players who had remarkably similar seasons and remarkably similar numbers of RBI opportunities to have RBI totals which were notably dissimilar. And having them separated by 40 RBI as you suggest? Absolutely not happening.
(*-Obviously, Tony Pena Jr (12%) is going to be less effective at causing runners to score from his plate appearances than Magglio Ordonez. The major league average is 14%; the point, however, is that players of somewhat similar talent will fall within the same basic range. We aren’t discussing whether Tony Pena Jr is a better clutch hitter than Magglio, after all; we’d be comparing Magglio and his 103 2008 RBI (19%) to A-Rod and his 103 2008 RBI (16%), for example. No player I consider to be “good” turned up under 15%; no player I checked, period, exceeded 19%.)
“mathematically it doesn’t make sense. it isn’t an average, and you’re adding two things with different denominators.”
@btc: So what? You can get similar denominators by multiplying each term by a different constant to get the units right.
The formula you are looking at is essentially TotalAverage ( http://en.wikipedia.org/wiki/Total_average ).
My feeling is that Total Average (and its brethren) have no audience – it’s too complex for an audience that needs simplicity, but insufficient for an audience willing to work for accuracy.
And more for Jay (@46 this time).
Hey, I don’t disagree with you. One fallacy that has been perpetrated despite his efforts to point out he never actually said it is that Bill James claimed there’s no such thing as clutch hitting. What he actually claimed was that there was no evidence to prove that it exists as an actual ability; that is, the variations between a player’s BARISP from season to season are greater than his simple BA variations. It’s small sample size.
But clutch hitting does occur, and even James will tell you that. RISP is not “phony”; it’s real, and it’s relevant to the extent that it measures something which did, in fact, happen. The problem is that it isn’t a predictor of things to come — or, rather, it isn’t a predictor of things to come which you can’t predict just by looking at what you’d normally look at anyway. I mean, George Brett (19%, referencing my last post) was a great clutch hitter. Well, duh. He was a great hitter, period.
In answer to your question… seriously, I would want to study what percentage of runners scored while the player was at the plate after removing his walks from the equation, and compare that to other players. A guy who’s consistently good at that? Gimme.
It’s not that the ability to drive in runs is irrelevant; it’s that simply looking at RBI as a stat is irrelevant. I mean, if Babe Ruth played for the 2009 Royals and hit 60 homers, he’d be hard-pressed to get 100 RBI…
Marc @49: the point of PECOTA isn’t to say “this should have happened, so let’s pretend it did”; it’s to say “this is what should have happened, so let’s base our decision-making going forward on that premise.” If a player goes .300/.400/.500 for three seasons in a row, and then suddenly goes .250/.350/.450… well, unless we know there’s a reason for the decline and there’s absolutely no reason to suspect the player will bounce back, then why shouldn’t we assume he’s a .300/.400/.500 hitter who just had a bad season? Similarly, if a mediocre player has a great season all of a sudden, unless we have evidence that he’s developed some skill previously unseen… why should we pretend he’s suddenly a good player?
On the team level, it also means that if this team should have won 92 games and only won 76, maybe we shouldn’t start tearing things apart too quickly… and if a team should have won 70 and wins 81, maybe printing next season’s playoff tickets is a bit premature. It is, primarily, a sanity check.
Jeez – in such a rush I can’t be bothered to spell my full handle?
@marc “I get really bothered by the fact that (almost) everything seems reverse-engineered”
There have been a few efforts to go forward, but fundamentally we’re talking about theory vs experiment here – if the theory doesn’t match the experiment, then the theory is wrong. So we might as well make sure that we understand the experiment.
That said, most people who take their curve fitting seriously will make an arbitrary separation of the data – get the least variation of half the data, and then see how well that predicts the other half. (Seasons in odd and even years is the divide that makes the most sense to me).
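A rough sketch of that odd/even split, for the curious. Everything here is illustrative: the rows are made-up (season, team OPS, runs per game) triples, not real data, and a one-variable least-squares line stands in for whatever multi-weight formula a serious analyst would actually fit.

```python
def fit_line(points):
    """Ordinary least squares for y = a + b*x over a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def rmse(points, a, b):
    """Root-mean-square error of the line a + b*x on the given points."""
    return (sum((y - (a + b * x)) ** 2 for x, y in points) / len(points)) ** 0.5


def split_by_year_parity(rows):
    """Split (year, x, y) rows into odd-year and even-year (x, y) lists."""
    odd = [(x, y) for year, x, y in rows if year % 2 == 1]
    even = [(x, y) for year, x, y in rows if year % 2 == 0]
    return odd, even


# Made-up rows: (season, team OPS, runs per game). Not real data.
rows = [
    (2001, 0.720, 4.3), (2002, 0.735, 4.5), (2003, 0.760, 4.9), (2004, 0.780, 5.1),
    (2005, 0.745, 4.6), (2006, 0.770, 5.0), (2007, 0.790, 5.3), (2008, 0.730, 4.4),
]

odd, even = split_by_year_parity(rows)
a, b = fit_line(odd)  # fit on odd years only
print(f"fit on odd years: runs/game = {a:.2f} + {b:.2f} * OPS")
print(f"error on odd years (seen):    {rmse(odd, a, b):.3f}")
print(f"error on even years (unseen): {rmse(even, a, b):.3f}")
```

If the error on the unseen half is much worse than on the fitted half, the weights were probably tuned to noise rather than to anything real.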
“Drives me nuts when PECOTA says team x shoulda won 5 more or less games…”
We’re talking about a sport where the “2008 Champions” lost 6 games to the Nationals. From this, I conclude there has to be room for variance in the discussion.
This stat (Linear weighted) is almost exactly what my fantasy league is run on, and it produces a startlingly clear and very different valuation system from a) most fantasy leagues and b) the “popular consensus” on player values.
Except for one small element that remains the same in every system.
Albert Pujols is #1.
Ah c’mon, my 120 vs 80 RBI examples were just simplistic exercises. But they could be explained by “slow-ass guys” or by Joe Clutch hitting .350 with guys on base and Joe Not-So-Clutch hitting .250 with guys on base. In a case like that, an “objective stat” could easily say they had the same seasons. But a stat that incorporated situations (or any fan who watches the team) would say HOLY CRAP did Joe Clutch have a “better” season…I’m buying his jersey / naming my son after him / etc.
It’s Tex vs. Dunn. (Except for this year.)
But overall, I’m more with ya than not. Look, I’m a stats guy at heart (I’m in the middle of a ph.d. program and have taken stats courses up the proverbial tushy at this point). The traditional BA, HR, RBI trinity are all severely flawed stats.
But the point is that things wash out in the big analysis. But that’s why we look at microeconomics (bottom-up) as well as macroeconomics (top-down) to determine the overall economic picture.
And anyone who watches baseball knows that situations matter, a lot–of course they do! Prima facie, any stat that says they do not–that differences will wash out–is theoretically unappealing to me. More appealing than BA/HR/RBI? Yeah. Like driving a 2005 Toyota would be more appealing than a same-sized 2000 Kia. But I WANT a 2009 Lexus baseball stat. One that combines the logic of objective thought with the deep-down-we-all-know-it’s-true acknowledgment of “of course situations matter.”
Another Empirical Example
This year’s Angels–
.287 .351 .446 .796
149 HR
132 SB, 52 CS
454 BB
The Rays–
.265 .347 .449 .795
170 HR
170 SB, 51 CS
549 BB
These two teams have virtually identical OPS. Sure, the Angels hit for a higher BA, but as discussed ad nauseam, BA shouldn’t matter that much vs. OBP. The Rays also crack more HRs…steal bases much more efficiently (and often)…exhibit better plate discipline in walking more. They are clearly, by objective stats, a superior offensive team to the Angels.
BUT…..
Rays “When it Matters Most”:
RISP .264 .368 .425 .793
2- outs, RISP .228 .360 .371 .731
Late and Close .257 .357 .427 .784
Angels “When it Matters Most”:
RISP .301 .378 .471 .849
2-outs RISP .294 .392 .493 .885
Late and Close .296 .381 .451 .832
The Rays, simplistically, don’t get it done “when it matters”–especially “when it REALLY matters.” The Angels do.
Guess what? Despite objective stats saying that they are an inferior team, the Angels have scored 751 runs, the Rays 698.
I’m extremely wary of claiming causality, so if I’m missing something here, please let me know. But it seems like a case of the Rays simply not being as “clutch” as the Angels–despite getting on base as much, hitting with more power, speed, plate discipline–has been worth over 50 runs. That’s nearly half a run a game, and that’s a big deal.
As another interesting thought exercise, go look at 2004 NL MVP candidate stats (ignoring the dude with the giant head–too much nerve tonic, probably–on top). Candidates 2-6 have years that, from, say, an OPS perspective, look incredibly similar. But dig into each guy a little bit. With RISP, some were “their normal selves” (Berkman). Some were “better than themselves” (Pujols). And some were “worse than themselves” (Edmonds…who was Babe Ruth with the bases empty). I know, I know…they had different people following them in the lineups, yadda yadda yadda. But they all objectively look somewhat equal–but in reality, they really probably weren’t. Pujols was, as usual, the gold standard (besides The Skull That Ate San Francisco).
And honestly, I don’t give a hoot about the “it’s not useful for predicting” logic. I don’t believe it (maybe wrongly)–but even if it’s true, we still use stats to explain what did happen (as you acknowledged). And to compare seasons as they are happening (as we love to do here). And if we’re going to do that, any stat that says “every at bat is equal” is, to me, inherently limited. Still useful, but limited.
@ Jay
Re: “Clutch”
You used Teixeira as your example of a hitter who is better with RISP. Herein lie a couple of problems
.317/.438/.595 with 67 IBB and .334 BAbip = RISP
.282/.355/.527 with 0 IBB and .307 BAbip = No Men On
First, correct for free passes. Removing the IBB, we see this
.317/.408/.595 RISP
.282/.355/.527 None On
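For anyone wanting to check the IBB step, here is a sketch of the arithmetic. The raw totals (579 times on base in 1,322 OBP-eligible plate appearances) are hypothetical, picked only because they reproduce the quoted .438 and .408; they are not Teixeira's actual counts. The function name is also made up.

```python
def obp_without_ibb(times_on_base, obp_denominator, ibb):
    """Strip intentional walks from both the numerator and denominator of OBP."""
    return (times_on_base - ibb) / (obp_denominator - ibb)


# Hypothetical RISP totals, chosen so the arithmetic lands on the quoted figures.
times_on_base = 579      # H + BB + HBP, including the intentional walks
obp_denominator = 1322   # AB + BB + HBP + SF
ibb = 67                 # intentional walks, the one figure given in the comment

print(f"with IBB:    {times_on_base / obp_denominator:.3f}")
print(f"without IBB: {obp_without_ibb(times_on_base, obp_denominator, ibb):.3f}")
```

Batting average and slugging are untouched by this step because walks never count as at-bats.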
Now, correct for BAbip difference. (I will put them both to .307)
.297/.391/.575 RISP (21 Hits were removed)
.282/.355/.527 None On
Now try to account for XBH rates to see how those extra 21 hits would have likely been spread. He originally had a hit spread of 330 Hits, 182 Singles, 74 2B, 6 3B, 68 HR. Take the HR from the Hits (as these are not Balls in Play) and you have 262 hits in play, 72 2B, 6 3B. .702 1B/HIP, .275 2B/HIP, .023 3B/HIP. Spread it over 241 Hits in play (again, H minus the 68 HR) and you get 169 Singles, 66 2B, 6 3B and those 68 HR (our 309 Hits) and it brings our total comparison to
.297/.391/.567 RISP
.282/.355/.527 None On
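The hit-spreading step can be sketched as a simple proportional rescale. One assumption to flag: the counts below (184 singles, 72 doubles, 6 triples out of 262 hits in play) are the ones that match the quoted .702 and .275 rates, and 6/262 works out to about .023 for triples. The function name is hypothetical.

```python
def spread_hits(counts, new_total):
    """Rescale hit-type counts to a new total, keeping the same proportions.

    Each type is rounded on its own, so the parts are not guaranteed to sum
    to new_total in general; they happen to here.
    """
    old_total = sum(counts.values())
    return {kind: round(new_total * n / old_total) for kind, n in counts.items()}


# Hits in play (home runs excluded), per the assumption stated above.
in_play = {"1B": 184, "2B": 72, "3B": 6}
home_runs = 68

spread = spread_hits(in_play, new_total=241)  # 262 hits in play minus the 21 removed
print(spread)
print("total hits:", sum(spread.values()) + home_runs)
```

Home runs stay fixed at 68 because they are not balls in play, so the BAbip correction never touches them.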
You have a BA within .015 points and a Slugging within .040 – easily a normal fluctuation rate. The OBP is still a bit out of whack, but it is understandable when you consider 1B is often open and a pitcher would be semi-pitching around him in quite a few situations.
Specifically, we end up with these rates
12.4 BB%, 20.5 K%, 29.7 H%, 13.4 XBH%, 6.5 HR% – RISP
9.1 BB%, 20.8 K%, 28.2 H%, 12.8 XBH%, 5.7 HR% – None On
Overall, he ends up a very similar hitter with RISP.
Also, if you were to do this calculation prior to this season, you would see an even stronger RISP performance difference than you do right now. This would of course be a problem if you relied on it when signing his contract, as he has hit .272/.415/.456/.871 with RISP this season versus .282/.366/.606/.973 when empty. Another season like the one he is currently seeing and the minor differences we saw above would all but vanish – making him a rather dead even hitter (sans BB%) with RISP or Empty.
@ 47: Lukehart80
“did someone just claim that indians’ hitters have, “an extreme advantage” for driving in runners over phillies’ hitters because there are slow phillies’ runners on all of the bases?
dusty, welcome to the blog, i figured you were too busy mismanaging the reds’ to comment here.”
You sure didn’t follow the conversation very well, did you? I will further explain the situation guidelines originally outlined by Jay:
Indians Leadoff hitter comes up with 100 players on base a season. We will say there are 50 at 1B, 30 at 2B and 20 at 3B – those players are Peralta, Valbuena and Francisco.
Phillies Leadoff hitter comes up with 100 players on base a season. We will say there are 50 at 1B, 30 at 2B and 20 at 3B – those players are Feliz, Ruiz and a Pitcher.
Those are the guidelines as given, and under those circumstances the Indians hitter is at a distinct advantage as the runners he has on base are drastically superior base runners. Feliz, Ruiz and the average Pitcher are horrendous baserunners (possibly the worst combination you can find in the game), where Valbuena and Francisco are well above average.
So in closing yes, the Indians hitter quite clearly has “an extreme advantage” with regard to driving in runs over those situations. Even Dan Quayle could recognize that.
Congratulations! All that verbiage and all you did was put fresh lipstick on Boz’s Total Average from 1977. Yawn.
I love your analysis JoeyO. I’m not being sarcastic.
But I think, given my argument that situational hitting should be taken into account, that I’d pass on deflating stats for BABIP when comparing the two situations. I get the argument for BABIP, but I’d like to think that someone who is “a better hitter in the clutch” might actually do a little more than make more contact–they’d get more hits! So deflating it to the non-clutch levels would be atheoretical. Same, for that matter, with IBB. Good god, the guy is feared enough–in these situations–to get 67 IBBs. Most people aren’t. Let’s leave those in.
I like the stats as they are, thank ye very much. I think your correcting may not be correct…but I did enjoy the analysis.
And apologies to all for my too-frequent, too-long posts. I’m writing my dissertation right now, and am literally on my laptop all day long…and I’m really, really sick of my dissertation. But not baseball.
Very interesting indeed. I wonder if anyone has noticed the high percentage of NL players on the “best” lists.
Perhaps there is some other reason than a paucity of NL talent to explain why pitchers from the AL seem to do so well recently in the NL?
And yes there should be a sarcasm font
John Q,
“They’re all percentages, on base, slugging, and batting average is really a misnomer, it’s really a percentage as in you go 3-10 you hit .300 or you’re 30% successful in getting a hit.”
No, the misnomer is the use of percentage. .300 is an average. 30 is a percent. Percent means “per 100” so you take the average (.300) and multiply by 100 to get 30.0. Voila.
If someone’s batting “percentage” was .300, that would mean it would take them 1,000 at-bats to get three hits. Not even Yuniesky is that bad.
Average is per at-bat. Percentage is per 100 at-bats, and no one writes it out that way. An on-base average of .400 would be the same as an on-base percentage of 40.
@ 59: jay
Fair enough, we all have our own thoughts.
Did want to mention one or two more quick things though.
“Same, for that matter, with IBB. Good god, the guy is feared enough–in these situations–to get 67 IBBs”
Of course he would be, he is a good hitter after all. No one would ever question that. But that 67 IBB isn’t that high of a number. Dunn (the other guy you mentioned) has 95 which is good enough for a 6.4 IBB%, where Tex sits at 5.1 IBB% in those RISP situations – and that is despite Dunn supposedly not being as “clutch”, right?
But most importantly, this whole “clutch” thing based off stats with RISP is rather silly to begin with – that isn’t the mark of clutch at all. Coming up with a runner at 2nd in the second or third inning isn’t exactly a real pressure situation, just like runners on 2nd and 3rd when you are up or down by 10 isn’t. They are places you would like to do well, but it is hardly necessary. If you were to truly try to find the situational mark for “clutch” to show itself, it would have to be the “close and late” category – those are the times when a player has the most pressure to do well. The game is actually on the line then.
In those situations we see this
.282/.376/.559/.935 with .295 Babip (Late&Close for Tex)
.288/.378/.541/.918 with .306 BAbip (overall for Tex career)
You need a magnifying glass to see the difference in those two lines, and it takes all the steam out of the “Teixeira is clutch” argument, doesn’t it?
One minor quibble, Jay; if you had one guy hitting .350 with runners on and another hitting .250, you would have one guy probably reaching the “20% of runners scored during his PA” threshold*, and one guy lounging around near 13-14%. But even then… you’re probably not talking about two guys who, on the whole, are .300 hitters. That was the sort of thing I was getting at; most of these differences in RISP, etc., can be explained by first looking at the guy’s overall performance.
*-I said nobody I spot-checked reached 20% for their career, but it does happen seasonally; for instance, the Best Royals Team Ever (1977) had two guys reach 20% — Amos Otis at 20 and Al Cowens at 21. The most-feared clutch hitter in Royals history? He “only” managed 19, as did his BFF Hal Mac. (The team as a whole managed 17%, which I’d say had a lot to do with winning 102 games.)
As for the BABIP discussion… seriously, the variances are essentially random. Very, very few players are able to consistently sport a “very good” BABIP, and when you start looking at the list of culprits you get one thing in common. They’re guys like Gwynn, Brett, Williams, etc; low-strikeout, high-contact guys who are renowned for being able to see the ball well enough to be able to hit it where they want to a lot of the time. But even then… in 1977, Brett hit .312.
He had a .296 BABIP; his higher average was a result of only striking out 24 times while hitting 22 HR (which, most people don’t catch onto this, is basically the same as starting with a .296 average, and then going on a 22-46 tear to finish the season).
Socaltwinsfan,
“Batting Average” is a percentage, it’s just written in its decimal equivalent form over three places from the decimal point.
A .500 Average really means a 50% hit success percentage.
The formula for “Batting Average” is (Hits/Plate Appearances, 250/500). So lets say Player A gets 250 hits in 500 Plate Appearances. That would be a .500 “Batting Average” or really a 50% hit success percentage.
An “Average” would be (Plate Appearances/Hits, 500/250), which would give you “2″. So a “real batting average” would be something like: “Player A had a hitting average of [(1)hit per (2)plate appearances] or (1 hit for every 2 plate appearances). In a true average the lower the number the better the hitter.
Joey/Jon–enjoyed the thoughts, thanks.
I get that usually a guy who’s a great hitter is going to be great consistently, and someone who sucks won’t turn into Roy Hobbs with RISP…but the point is that unlike *maybe* BABIP, it’s (to me) not enough of a no-brainer to just say “okay, everybody’s going to fall into a range, so let’s move on.” I suppose I need to do more research there.
I also suppose that kicking tush with RISP doesn’t make Tex clutch (I muddled those up, per the usual). It does, however, make him an RBI hound, which makes him $20million per, right or wrong.
If Tex does as well in close & late as his normal All-Star self, doesn’t that say he’s pretty impervious to pressure? And therefore clutch? Just not “superclutch” or whatever else we could call him if he did even better? It’s all in the definitions I guess.
This is all sh*ts and giggles. Ya gotta love baseball stats–and seriously, someday soon I’m going to lock myself in my computer room for a month with baseball-reference.com, SPSS, and 50 pots of coffee, and emerge with a stat that embraces the “objective” and somehow weights it for “situational performance.” Because (allow me to kick the horse again) I just can kind of, but not completely, buy into a world where all trips to the plate are created equal, where a guy who hits .231 with RISP is doing his job pretty well, and where the Rays are “better offensively” than the Angels yet score 50 fewer runs to date, and where (insert another 100 idiosyncratic examples)….
And then y’all can tear it to pieces…but it will be fun for us all!
To the people who keep thinking I treat each at bat equally:
I provide MANY different metrics. Each metric has its own assumptions. Some are very context-dependent, and others are context-neutral. Just because you find a “fault” with one metric, just realize that this is not a bug but a design feature.
YOU tell me what kind of considerations you want, and I’ll point you to a (likely existing) metric that you need.
Otherwise, you can make the argument that OBP doesn’t work because it treats a walk and a HR equally. And it does so, regardless of the game situation. This is a feature, not a bug, of OBP.
“An “Average” would be (Plate Appearances/Hits, 500/250), which would give you “2”. So a “real batting average” would be something like: “Player A had a hitting average of [(1)hit per (2)plate appearances] or (1 hit for every 2 plate appearances). In a true average the lower the number the better the hitter.”
Nope, the batting average is an average. It is not a percentage, because per cent means per 100. 0.350 per 100 would not put you in contention for a batting title (it would, however, be about what I’d hit against big league pitching).
A .300 hitter *averages* 0.3 hits per at bat. Hence, average.
Tangotiger–not saying you don’t have a full collection of stats. I’m admittedly glib when it comes to some, and I’ll gladly take your offer.
I’d appreciate your suggestion (and/or thoughts) on a stat that is:
MOSTLY
(1) an objective metric of individual performance on a “every at bat is equal basis”
but is
AT LEAST SLIGHTLY
(2) weighted for situational baseball…the conceptualization I have is “the more important the situation is, the more weight it is given.” However that is best handled.
And is not completely convoluted to the point it will make heads explode.
Anything in the cupboard like that? Thanks!
JoeyO: Actually I would like to see a stat which was like ERA except also including unearned runs. This would give a better snapshot of a team’s ability to prevent runs when a given pitcher was on the mound. FIP are good PREDICTIVE stats. They can say “This pitcher is really getting unlucky” ERA+ is a better DESCRIPTIVE stat. It tells you how effective the pitcher along with his defense is at preventing runs.
Theoretical peripherals are nice but run prevention is the name of the game for pitchers and no one ever said everything in baseball was fair or that players performances always reflect their true talent and ability.
ERA+ is not perfect. Like I said before I believe no catch-all stat is perfect and people should look at multiple metrics when assessing players but ERA+ and OPS+ do give you valuable information about scoring and preventing runs. They aren’t “bad” stats.
Interesting that the Tigers have two everyday players in the bottom 10… and yet they’re leading the AL Central by 7 games.
“A .500 Average really mean’s a 50% hit success percentage.”
John Q,
You keep using that word. I don’t think it means what you think it means.
However, in this case, you are exactly right! Percentage is just the average multiplied by 100 (per cent means “for every 100″). 50% means that for every 100 at-bats the player “averages” 50 hits. 50 hits in 100 at-bats would be a .500 batting “average.” However .500% (look where the decimal point is, this is key) means the player “averages” .5 hits for every 100 at-bats, which would be a .005 batting average, the worst player in the history of the game.
It’s always about average in baseball. That’s the whole point. People want to know how this player does “on average.” If you want to use the term percentage to refer to everything, that’s fine. Just use proper notation.
Try this: Joe Mauer leads the AL with a 36.7 batting percentage, a 43.5 on-base percentage and a 60.7 slugging percentage.
Just a guess, but I’m sure those who first started to use “on-base percentage” really were using percentages. They’re a little easier to understand for the “average” person. However, since batting averages had been used forever, they switched to averages, but it was too late to rename the stat.
Jay – check out RE24 on Fangraphs. It should do what you’re looking for.
Bill James would hate this.
Hitting Average is a cool name, so you have that working for you. But wasn’t one of James’ biggest contributions to baseball observation the point that hitting environments differ?
Youkilis benefits greatly from hitting in the batting paradise that is Fenway, and A Gonz has to play half his games in freaking Petco! How can you take a stat seriously that neglects to include this obvious fact?
One of my favorite articles you ever wrote was the one where you said that Dave Kingman would be a Hall-of-Famer had he played for the Red Sox. Even OPS+ uses this concept.
Other than that, cool blog topic.
WPA’s useful too; its one major drawback is that it undercredits the guys who start chipping away at a huge deficit; not much credit is given to the guy who doubles in two runs with the team down by 7 in the 7th inning, yet I think we’d all agree that’s a pretty damned valuable outcome if it starts the team on a comeback.
That drawback is, of course, by design, as it “penalizes” guys who add useless stat-padding in blowout wins in the exact same fashion (even if those runs end up being relevant by the 9th inning). But I think if any stat which is accepted as useful and relevant by the sabermetric community truly measures “clutch” in any usable fashion, WPA is probably it.
Jay:
As Colin said, you may go for RE24, or WPA/LI, both on Fangraphs.
Others may prefer WPA.
***
Really, all one has to do is decide what the user requirements are, and we can create a stat for it. It’s really that simple.
Socaltwinsfan,
Baseball is just very weird in the way terms have evolved.
I mean no one “bats” 300. They bat 30.0. as in 30% hit success.
Your Mauer analogy is a good one. Mauer really is hitting 36.7. As in he’s successful in getting a hit 36.7% of the time. I don’t know how something like that evolved into “hitting 367″. In real life nobody would refer to .300 as “300″. They would refer to that number as the decimal equivalent of 30% or 3/10th . Maybe the old time writers put it that way to make it seem like a larger number in order for it to sound more impressive.
It would have been interesting if a “HPAA” Hits Per Plate Appearance Average would have evolved. Kind of like a “ERA” for hitters. A batter who gets 150 hits in 500 would be a .300 hitter but in “HPAA” (500/150) plate appearances would be hitting roughly “3”. As in about 1 hit for every 3 plate appearances. But I guess this is counter intuitive because the lower the number the better the hitter. A .250 hitter would be hitting “4”.
John Q,
I’m going to try to explain it one more time and then I’m done.
In any average, you are generally trying to measure how often you are successful in something per attempt. The successes are always first, which means they go on top. So it would be hits over plate appearances, or in your example 150/500, which when divided out comes to .300 or if you want to keep it in a percentage, 30.
It’s easy to remember when you say: “hits per plate appearance” you just replace “hits” and “plate appearances” with those numbers and “per” with “/” and you get: 150/500.
SoCalTwinsFan, please stop. You’re confusing the definition of average with the definition of several other things (most notably rates.) On base percentage is a percentage. Slugging is an average.
(OBP is the percentage – yes, percentage – of times a player reaches base safely. Slugging is the average number of total bases per at bat. The max possible OBP is 1, but the max SLG is 4.)
And no, a percent is NOT an average times 100. No no no. No no no no no no no no. No.
G Hawk – One simply cannot win. Half the thread says that this is way too complicated and will never catch on compared to OPS. The other half is saying that this is useless because it doesn’t automatically adjust for park and league context and handle situational hitting properly and so on and so forth.
Can we park adjust hitting average? In the words of our president, “Yes we can.” We can do all sorts of things with it – we can retune the individual weights to a variety of different contexts (the relative value of events in the late sixties is probably different than in the late nineties, ferinstance.) We can adjust for park. We can do all kinds of things once we agree upon a basic framework.
I think what some people are missing is that there are some stats that are best used for describing what just happened and some that are best used for predicting what will happen in the future.
RBIs are awesome for describing what just happened. If a player drives in 130 runs, you know that he contributed in a lot of winning games for his team. However, it doesn’t tell you a whole lot about what he is likely to do next year when he gets traded to Oakland.
Stats like IsoP or HR/FB%, which measure a hitter’s fundamental ability to hit for power, are a lot more stable, and thus a lot more predictive, than a stat like Batting Average.
I don’t think that anybody is denying that the Angels are having a better offensive year than the Rays. They’ve scored more runs and scoring runs is the object of baseball offense. However, I don’t think that any sane person would stake their lives on the Angels being nearly .150 OPS points better with 2 outs and RISP than the Rays next season. The idea is that, going forward, the difference in 2-out RISP OPS between the Angels and Rays will be a lot closer to 0 than to .150.
Is there something truly, fundamentally different about the collection of hitters on the Angels (the same Angels that folded in the playoffs last season while the Rays went to the World Series!) as compared to the Rays? Does this fundamental difference allow the Angels to hit better with 2 outs and RISP than the Rays? Perhaps, but I lean toward no.
Moreover, if you had to make a $1000 wager on the difference in 2-out OPS between the Angels and Rays being closer to .150 or 0 for the rest of this season, which would you choose? (Pos-esque sidenote: actually, for the rest of this season, you might be justified in choosing .150, because there are so few games left that the variance is likely to be high but pretend I asked this question at the beginning of August)
If you really want to debate semantics, technically, OBP could be called both an average and a probability but not a percent. OBP is traditionally expressed (.400) as a probability, which expresses the likelihood of a given event happening out of the total sample space of 1. “Per cent” literally means “through 100” so to change a probability (sample space of 1) to a percent (sample space of 100), you do need to multiply by 100.
A .400 OBP is a probability which could also be said as somebody gets on base 40% of the time.
Technically speaking, you also could call OBP an average. Average is typically colloquially exchanged for arithmetic mean. Since OBP is a binomial trial with a result of 0 (did not get on base) or 1 (did get on base), the arithmetic mean will be equal to number of times the player reached base divided by plate appearances, which is exactly OBP.
I generally like to think that the creators of OBP meant it to say “on base probability” and thus I read it like that. It satisfies my math-geek brain. Alright, that was way too much geeking out, I need to go change a tire or beat a guy up or something.
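The 0/1 reading of OBP in the comment above can be checked with a few lines of Python. This is only a sketch: the list of outcomes is made up for illustration and does not come from any real player.

```python
import math
from statistics import mean

# Hypothetical results of 10 plate appearances (illustration only):
# 1 = reached base, 0 = did not reach base.
outcomes = [1, 0, 0, 1, 0, 1, 0, 0, 1, 0]

# Arithmetic mean of the 0/1 outcomes.
obp_as_mean = mean(outcomes)

# Times on base divided by plate appearances.
obp_as_ratio = sum(outcomes) / len(outcomes)

# The same quantity written "per hundred".
obp_as_percent = 100 * obp_as_ratio

print(f"mean={obp_as_mean:.3f} ratio={obp_as_ratio:.3f} percent={obp_as_percent:.1f}%")
```

The mean and the ratio are the same number, and the percent is that number rescaled, which is why "average", "probability" and "percentage" all end up describing one value here.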
1/4 = 25% = 0.25
The difference between all of those is the notation used to write it, not what the notation is trying to represent. When you multiply 100% by 100%, what do you get? Certainly not 10,000%.
If you want to do any actual math with your percentages you have to convert them to their decimal equivalents, and in doing so you don’t change what’s actually being expressed. At some point hyperpedantry gets in the way of understanding.
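For anyone who wants to see the notation point worked out exactly, here is a small Python sketch using exact fractions. The helper name `percent` is made up for this example; it simply reads "x%" as "x multiplied by 1/100".

```python
from fractions import Fraction

def percent(x):
    """Read 'x%' as x multiplied by 1/100 (hypothetical helper for illustration)."""
    return Fraction(x) / 100

# 1/4, 25% and 0.25 are three spellings of one number.
quarter = Fraction(1, 4)
same_number = quarter == percent(25) == Fraction("0.25")

# 100% times 100% is 1 times 1, which is 100% again, not 10,000%.
product = percent(100) * percent(100)

print(same_number, product == percent(100))
```

Working in fractions rather than floats keeps the comparison exact, so the equality is about the numbers themselves and not about rounding.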
As far as RBIs reflecting what actually happened, I disagree. An RBI treats driving in a runner from third the same as driving in a runner from first. Or, as I like to put it:
The first batter in an inning gets hit by a pitch.
The second batter in the inning slaps a double down the left field line, runner holds at third.
The third batter makes a lazy flyout to the warning track, one run scores.
The first batter receives credit for a run scored. The third batter receives credit for an RBI. The second batter – who was not only essential to scoring that run, but probably the most important batter in scoring that run – gets credited for nothing.
As I said in an above message OPS+ and ERA+ are the best descriptive stats. They give a pretty good snapshot of how the player has contributed to scoring or preventing runs.
FIP, BABIP, LD% are good predictive stats. they try to tell you when a player is getting particularly lucky or particularly unlucky.
RBI aren’t very descriptive at all it is a very flawed and almost useless stat.
@ John Q
I’m sorry, but what the heck are you talking about?
Batting Average is not a “percentage”, it is the number of hits a player produces per AB. If the creators wanted it to be a percentage, then they would have outlined the calculation as H/AB*100. They didn’t do that, did they? So, they obviously weren’t intending the statistic to be quoted as a percentage. Do we have any other evidence to this? Oh yes, they didn’t name the stat “Batting Percentage”, did they?
But let us break it down and see where the problems come in. Hits is an “outcome” where AB is an “opportunity” giving us an Outcome/Opportunity scale. Now, substitute other outcomes and opportunities. Lets say I make Widgets at my job, I work 8 hours a day and I get 3 done over that time. We will use both average and percentage
A) On average I create .375 Widgets per Hour
B) I get 37.5% of a Widget done an hour
One of those statements is very true, one of them is not. I do average .375 Widgets per hour, there is no arguing that. But I do not necessarily get 37.5% of a Widget done each hour, that statement would be incorrect. I would merely “average” 37.5% of a Widget done every hour, or get 37.5% of a Widget done on “average”.
Now, lets go back to baseball and the Hit/AB for our Outcome/Opportunity scale. Lets use Derek Jeter here. He has 183 Hits in 548 AB this season, for a .334 Batting Average. We will give our statements
A) Derek Jeter averages .334 hits per AB
B) Derek Jeter gets a Hit in 33.4% of his AB
Which of those is true? A definitely is, no one can argue that. But if B was true, then he would expect never to go 0-4. See, B once again implies he gets a hit every 3rd AB which is not true, he merely “averages” a hit every third AB.
BTW,
“And no, a percent is NOT an average times 100. No no no. No no no no no no no no. No.”
“Percentage” specifically means “For every hundred”. If we used a “for every hundred” method on Jeter, we would see this
First Hundred = 27.5%
Second Hundred = 33.0%
Third Hundred = 31.6%
Fourth Hundred = 37.0%
Fifth Hundred = 37.0%
Sixth Hundred = 35.4% (incomplete, but would be on that pace)
33.4% would merely be the “average” of all the “every hundred” marks.
@ 65: jay
“If Tex does as well in close & late as his normal All-Star self, doesn’t that say he’s pretty impervious to pressure? And therefore clutch?”
Nah, it just means he is like almost everyone else
@ 83: Michael
“As I said in an above message OPS+ and ERA+ are the best descriptive stats. They give a pretty good snapshot of how the player has contributed to scoring or preventing runs.”
This just isnt true though.
OPS+ is only a good snapshot if you judge OBP and SLG equally, and give twice as much value to every hit. Besides, the simple fact that OPS+ completely ignores the SB shows that it does not measure a player’s contribution to scoring runs.
ERA+ is only a good snapshot of a team effort, or results based off very specific situations encountered.
Neither gives a very good snapshot of the value of a player. That’s the problem. If using stats to attempt to figure out a player’s value, these two are horribly skewed and do not show much at all. The predictive stats are much better because they are “predictive” because they tell you what “ability over the average situation” should be, neutralizing everything else. The simple fact they quite often predict the future should tell you they are the better gauge as to what the past should have looked like with regard to any specific individual player’s contribution. They wouldn’t be predictive if they didn’t show ability, and ability is a reflection of a player’s specific performance to this point. If his results don’t match the performance, then his ability isn’t the one that is out of whack.
Please, please stop this. This is embarrassing.
Each decimal place has a name – tenths, hundredths, thousandths. So 0.1 is one tenth. 0.01 is one hundredth. A hundredth is one out of a hundred or one percent.
There is no difference between .3 and 30% – they are different ways of writing the exact same number! A player with a .300 batting average gets a hit in 30% of his at-bats.
Or to put this another way – the percent sign means “multiply by 1/100.”
So:
30% = 30 * 1/100 = 30/100 = .3
As far as this:
“Which of those is true? A definitely is, no one can argue that. But if B was true, then he would expect never to go 0-4. See, B once again implies he gets a hit every 3rd AB which is not true, he merely “averages” a hit every third AB.”
Stop. Using. Words. You do not know what you’re talking about.
Do you really think that if, say, 54 percent of people polled say they approve of the job the president is doing, that if you line up 100 people the first 54 will say “I approve” and the next 46 will say “I disapprove?” And that if you line up the next 100 people, the exact same pattern will repeat? Because I’m sorry, but that’s not how it works. At all.
This is going to sound like I’m a snob, or like I’m trying to stake my claim to being the original Poz fan, but…
what happened to the level of comments around here? For a long time we’ve been discussing what’s really valuable in baseball… now we have a sudden influx of the “I only like BA” crowd.
I would prefer this blog not to turn into all WARP/VORP/EqA/FIP all the time, but if you’re not interested in thinking about how baseball really works, what on earth are you doing here?
Tangotiger–thanks for the message, I am admittedly glib when it comes to sabermetrics. After reading this whole thread, I have settled on three things:
1. Life is indeed about preferences. I prefer a world where hitting stats, baserunning stats, and fielding stats are all separate. I don’t really like SB and CS being included in my pure hitting stats–like this HA one we’re discussing.
2. I appreciate the value of many different stats, from BA to OPS+ to the more advanced. I do think there’s room for an appealing mainstream stat (WPA and the others you mentioned TangoTiger are awesome…but almost like grasping how a nuclear bomb works versus a gun…maybe too awesome for the average mind to understand) that incorporates the purity of “every at-bat is equal” adjusted *slightly* for “situations matter” and is somewhat intuitive. Just no idea what that is…yet.
3. I believe in God, ghosts, and “clutch.” Although it’s hard to support them empirically.
4. Even though I’m a Sox fan and should be grateful, I have never really liked Josh Beckett. (okay, that’s nothing to do with this thread but rather the game I’m watching right now).
And TangoTiger, fangraphs.com is….wonderful. Like Oz (the Marvelous Land, not the Over-the-Top Prison Show). It’s a magical place that I plan on spending many many hours at.
(Oh yeah, the arguments about batting average are probably making Joe question the collective sanity of his hardcore readers. But maybe he does that already.)
“if you’re not interested in thinking about how baseball really works, what on earth are you doing here?”
Um, reading and commenting on whatever Poz writes about?
I didn’t check the rest of the comments so I apologize if I’m being repetitive, but I had to rush to point out that by this measurement, Dusty Baker hit the 2 worst hitters in the Major Leagues #1 and #2. For half the season.
@ 86: Colin Wyers
Look, the only thing embarrassing around here are statements like this:
“Or to put this another way – the percent sign means “multiply by 1/100.””
Moments after you give us this statement:
“And no, a percent is NOT an average times 100. No no no. No no no no no no no no. No.”
when questioning an equation designed to provide an Average.
And the original “And no, a percent is NOT an average times 100. No no no. No no no no no no no no. No.” statement is just flat out pitiful, which is well beyond embarrassing.
But why is it pitiful? Well, if you were to remove the negative in that sentence, you come up with
“And yes, a percent is a NUMBER times 100. Yes yes yes. Yes yes yes yes yes yes yes yes. Yes.”
The problem is, it does not distinguish specifically what that number is, and if the number in question is an “average” then this is true:
“And yes, a percent is THIS “AVERAGE” times 100. Yes yes yes. Yes yes yes yes yes yes yes yes. Yes.”
Because we all know a “Percent” is nothing more than “for every hundred”, which is solely “a way of expressing a number as a fraction of 100”. If your original number is random, say 14, then your percent is 1400%. But if your original number is an “average” (as is the case with Batting Average), then your percent is “Average”*100.
And really, that is it! Period!
.
But I realize you will try to argue that ( :/ ), so we will do this for you,
“There is no difference between .3 and 30%”
A) 183/610 = .300 yes
B) 183/610*100 = .300*100, yes
but there IS a difference, there is nothing saying the *100 you insist is necessary should be there – you added it when it is not in the original calculation in question (BA is solely H/AB, not H/AB*100). But even besides that point and the most important aspect of this conversation is this: just because you do put the *100 there, it doesn’t make it any less an “average”.
That is why there are two major flaws with initially claiming BA is a percentage:
1) BA is not a percentage; it is a fraction which is calculated into decimal form.
2) You still have an “average”, even if you did decide to convert your decimal into a percentage to show the “for every hundred” outcome.
John B.
I like Joe Posnanski, that’s why I’m here.
I have recommended him to every baseball fan I know. I even predicted that he would be writing full time for Sports Illustrated when I saw him on SI.com. Sorry, still think this is overboard. That’s just my opinion. I know that Joe loves baseball. Again try my exercise.
I like the statistic overall and Joe, if anything, you shouldn’t be worried about how complicated it is. Batting Average is a ridiculously complicated stat when you think about it. What we really need is for one stat to be pushed instead of OPS and just rename it so it sounds cooler/better/more appealing.
Yuni, The Musical!
Anybody else catch that?? I just peed a little…
Hootie was one of those zeitgeist things. America was just ready for a band where the black guy was the frontman and the backing musicians were white, a reversal of the usual pattern. And it helped that the black guy didn’t really sound black, and he looked unthreatening (no dreads). He’s taken the not-very-black thing too far by playing golf and recording country music in Nashville.
A suggestion. Call it “Production.”
“Let’s take a look at Albert Pujols Production this year.”
“Well Bob, he’s producing at a .990 clip, that’s more than 120 points higher than Chase Utley.”
“Man, Phil, Albert’s having another season for the ages.”
It might seem like an oxymoron, but I actually think that what is lacking in all of the advanced metrics I’ve seen is a way of tempering averages with counting stats.
Example: If you look at Albert Pujols numbers you have the high end of both counting stats and average stats.
If you compare those numbers to Adam Laroche’s numbers when he was traded back to the Braves, his OPS was incredible. He was really destroying the ball for a bit. Adam Laroche’s small sample size showed him as almost equal to Pujols and that clearly isn’t quite right, but if you look at his counting stats it doesn’t do justice to how well he was smacking the ball around in that stretch.
Obviously averages allow you to extrapolate some kind of production based on rate but that isn’t as valuable as long term actual production. I’m not a mathematician but I would have expected that someone would come up with a scale that skews for sample size as well. Am I crazy?
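One common way to temper a rate stat by sample size is to regress it toward the league rate by mixing in some "phantom" league-average plate appearances. Nobody in this thread proposed the sketch below; it is just an illustration of that general idea, and every number in it (the league rate, the 200 phantom trials, both stat lines) is a made-up assumption.

```python
def regressed_rate(successes, trials, league_rate, prior_trials):
    """Blend an observed rate with a league rate.

    prior_trials is the number of phantom league-average trials added to
    the sample; it is a tuning assumption, not an established constant.
    """
    return (successes + league_rate * prior_trials) / (trials + prior_trials)

LEAGUE_RATE = 0.260    # hypothetical league rate
PRIOR_TRIALS = 200     # hypothetical weight given to the league rate

# Hypothetical hot stretch: 40 successes in 100 trials (raw rate .400).
hot_streak = regressed_rate(40, 100, LEAGUE_RATE, PRIOR_TRIALS)

# Hypothetical full season: 200 successes in 600 trials (raw rate .333).
full_season = regressed_rate(200, 600, LEAGUE_RATE, PRIOR_TRIALS)

print(f"hot streak: {hot_streak:.3f}, full season: {full_season:.3f}")
```

With these made-up inputs the short hot stretch gets pulled much closer to the league rate than the full season does, so the long-term producer ends up rated higher even though his raw rate is lower.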
Joey O.
Just to clear things up, I never said “a percent is an average times 100.”
That being said, well done in your explanation, I see my error.
What I was trying to point out is the confusing way .300 is referred to as hitting “300”.
So is On base percentage and Slugging percentage really an average as well?
I did something similar to this a few years ago (2003 & 2004) when I couldn’t find a good fantasy baseball site I liked, so I made my own for my friends. I wanted to be able to watch my players in ballgames and be able to say, for example, “Woohoo! 8 points!” for a certain play. These are the weights I came up with – no scientific basis whatsoever, I just tweaked them until I found a player ranking I liked.
AB: -0.1
R: 2
H: 4
2B: 1
3B: 2
HR: 3
RBI: 5
SB: 3
CS: -1
BB: 2
IBB: 1
SO: -1
SF/SH: 1
HBP: 0
GIDP: -2
So, for a solo HR, a player would get -0.1+2+4+3+5 = 13.9 points.
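The weights listed above are easy to turn into a small scoring function. This is a sketch of how such a calculator might look in Python, not the code from the original site; the function name `points` and the dictionary layout are assumptions made for the example.

```python
# Weights copied from the list above.
WEIGHTS = {
    "AB": -0.1, "R": 2, "H": 4, "2B": 1, "3B": 2, "HR": 3, "RBI": 5,
    "SB": 3, "CS": -1, "BB": 2, "IBB": 1, "SO": -1, "SF/SH": 1,
    "HBP": 0, "GIDP": -2,
}

def points(events):
    """Total points for a dict mapping stat name to how many times it occurred."""
    return sum(WEIGHTS[stat] * count for stat, count in events.items())

# A solo home run counts as an at-bat, a run, a hit, a home run and an RBI.
solo_hr = {"AB": 1, "R": 1, "H": 1, "HR": 1, "RBI": 1}
solo_hr_points = round(points(solo_hr), 1)

# A strikeout counts as an at-bat and a strikeout.
strikeout_points = round(points({"AB": 1, "SO": 1}), 1)

print(solo_hr_points, strikeout_points)
```

The rounding is only there to hide floating-point noise from the -0.1 weight on at-bats.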
Here’s the current MLB ranking on my formula:
1 Albert Pujols 1788.8
2 Prince Fielder 1551.1
3 Hanley Ramirez 1477
4 Mark Teixeira 1451.2
5 Ryan Braun 1437
6 Ryan Howard 1401
7 Matt Holliday 1399.8
8 Ryan Zimmerman 1381.6
9 Bobby Abreu 1364.1
10 Derek Jeter 1362.4
Bottom 10 (>300AB):
1 Chris Davis 488.7
2 Alex Gonzalez 501.8
3 Cesar Izturis 511.2
4 Dioner Navarro 517.5
5 Tony Gwynn 525.1
6 Gerald Laird 528.2
7 Ivan Rodriguez 530.3
8 Jeff Francoeur 532.6
9 Anderson Hernandez 552.7
10 Delwyn Young 558.8
Mr #52 – I just wanted to clarify what I meant re: Pecota –
I was intending to refer to team level stats and predicting what a team’s record will be at the end of this season (2009). Baseball Prospectus uses (I assume) Pecota going forward, but also “replays the season”, thus taking away or adding wins to a team’s real record to date.
My point was that the disparity points to the logical fallacies in Pythagenport calculations (however adjusted), and that taking the position that the past should have happened differently (or actually did) is absurd.
I do certainly agree that on a predicting-the-future basis it is an excellent (or among the best we might have) tool. But in assessing the past, it’s like win shares – making the parts equal the sum.
I love this forum… LOL! I hadn’t read all the posts yet.
Nowhere else can you find 20 posts on what “average” and “percentage” mean. Oy gevalt.
Classic. This is a great place.
Greg,
that’s a pretty good system, assigning a point value to everything.
Although 5 points for an RBI is way too much.
-1 for a strikeout is too much of a deduction.
@ John Q
I am truly sorry, I mistakenly combined your posts with that of Colin Wyers – as if they were coming from the same person. I even replied to you in an aggressive way because of Colin’s post at #78. I do apologize.
To your question:
“So is On base percentage and Slugging percentage really an average as well?”
Yes, they are both averages. That is actually why quite a few people refer to OBP as “OBA” or “On Base Average” and Slugging Percentage as “Slugging Average”, or solely “Slugging”.
No clue why someone would make such a blatant mistake when naming the stats, especially when Branch Rickey was involved with its introduction. Rickey had a law degree, you would think he would either know better and ensure it was correct, or at least find out the proper terminology before coining a term which makes no sense.
@ Colin Wyers (or anyone else not clear on what an Average is)
Once I separated the two people in question, it dawned on me that your issues are solely with terminology, and possibly with not understanding the terms very well.
“Average” is the Arithmetic Mean
Factoring the Arithmetic Mean (call it A) for this series of numbers (3 8 9 4 2 6 7) is
3 + 8 + 9 + 4 + 2 + 6 + 7 = A + A + A + A + A + A + A
which is (3+8+9+4+2+6+7)/7 = 7A/7 = A
which is 39/7
which is 5.57/1
which is 5.57
Arithmetic Mean (or “Average”) of 3 8 9 4 2 6 7 = 5.57
.
Now, let us relate it to Baseball, and specifically “Hits” to give us a “Batting Average”
Hits are factored off a very specific rate, that being Hit per Opportunity (or H/AB) and this rate occurs for each and every AB a player has. If the player successfully produces a Hit, his rate over that AB is 1/1 and his hit value then becomes “1”. If the player does not produce a Hit, his rate would then be 0/1 or a hit value of “0”.
Now, if your hit values look like this
1 0 1 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0
then your Arithmetic Mean or “Average” is 9/25 which is .360. The easiest way of factoring this of course, is to use the rate equation of “X = Total-Hit/Total-AB”.
You may then present this question – Why is BA an “average” and not a “rate”? Well, it is both – kind of.
Technically, Total-H/Total-AB is a formula which would be used for “average rate”, and that is the formula we use to get our end results. You will notice “average” is included with the word rate because it is quite specifically the average of a series of rates. That is because rate means “per fixed part”, a fixed part in baseball being a single AB. Taking a series of rates (our Total-AB) and finding the “arithmetic mean” of those gives you your “average rate”.
There are two other main forms of rates; “instantaneous” and “constant”. “Instantaneous rate” would be looking at a specific moment in time for your rate (ie, “I am one for one”, or a more common “I am driving 35 MPH”). A “constant rate” would be a continual cycle of the same results (like a clock which chimes 17 times every single hour or a player if he could get a hit in every single AB).
Back to our example.
1 0 1 0 0 1 0 1 1 1 0 0 0 0 0 0 1 0 1 1 0 0 0 0 0 = .360 average Hit Value
9/25 Total-Hit per Total-AB rate = .360/1 average rate = .360 average Hit
.360 is the hit average on an H/AB rate. If you were to express the number as “.360/1” or “.360 Per AB” then you are showing “average rate”. Either way you do it, it is an “arithmetic mean” or “average”
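As a quick check of the example above, here is a small sketch showing that the arithmetic mean of the per-AB hit values and the Total-Hit/Total-AB ratio come out to the same number. The variable names are made up for illustration.

```python
from statistics import mean

# One entry per AB: 1 for a hit, 0 for no hit (the series from the example).
hit_values = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 0,
              0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0]

hits = sum(hit_values)            # Total-Hit
at_bats = len(hit_values)         # Total-AB
mean_of_values = mean(hit_values) # arithmetic mean of the 0/1 values
ratio = hits / at_bats            # H/AB

print(hits, at_bats, mean_of_values, ratio)
```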
.
I also wanted to say to you that above (post #78) you stated:
“On base percentage is a percentage. Slugging is an average.”
This too is horribly incorrect. The Slugging part is correct; Slugging is the “average” number of bases on a Hits per AB rate. Unfortunately it is called “Slugging Percentage”, but that is a terminology mistake on MLB’s part. OBP is NOT a percentage though, and that is quite clear.
OBP is a straight average. It is the player’s Average Times On Base per Plate Appearance (TOB/PA), much in the way H/AB is the Average Hits per AB. If OBP was an actual percentage, the mathematical problem would need to look like this:
(TOB/PA)=OBA*100
or
(TOB/PA)/100 = OBA
If this was done for Albert Pujols, we would see:
267 TOB / 596 PA / 100 = an OBA of 0.00448 on a TOB/PA rate
Everyone is smart enough to realize it does not take Pujols 223 PA to reach base one single time.
But, I would also think everyone was smart enough not to say “his OBP is 0.448%” or “his OBP is .448/100”. And unfortunately, that is exactly what you said when stating “On base percentage is a percentage”.
Average is any number of measures of central tendency, only one of which is the arithmetic mean. But yes, thank you for that.
And, again:
.36 = 36%
One means 36 parts out of a hundred; the other means 36 parts out of a hundred. They are two ways of writing the exact same thing. In the case of batting average, it is correct to say for a player with a .360 BA that 36 percent of his AB result in base hits. I do not know why you are still hung up on this.
“I do not know why you are still hung up on this.”
Because you were, and still are, blatantly incorrect – and even worse, attempting to use your incorrect statements to imply others (including myself) are the ones in the wrong.
First,
.360 does equal 36%, as does 360/1000 and 9/25 and (1846.8/6)/855 and countless other mathematical equations.
The problem is, the mathematical problem we are presented is not asking for an end result of a mathematical problem – it is asking for the “average” of a series of rates to give a hit (or Batting) average.
But to explain that in the simplest terms for you once more: the Equation is H/AB not H/AB*100. The end results are intended to be H/AB not H/AB*100. BA is defined as H/AB not H/AB*100. how can you not comprehend this?
Or, if you want to think of it this way:
“One means 36 parts out of a hundred”
this is true only if you MAKE it mean that. It can also mean “360 parts of one thousand” or “3.60 parts of ten” or most importantly since it is what we are specifically factoring here, “.360 parts of 1”. You know, a math problem of .360/1 which (in this case) is better known as the H/AB we were told to use!
So if you want to take the end results and make a percentage out of it, feel free – but what it does is add mathematical steps which are not present in the problem first given. Does BA = BA*100, no! BA*100 = BA*100, but we weren’t asked to find BA*100 were we?
Also,
“it is correct to say for a player with a .360 BA that 36 percent of his AB result in base hits”
Once more, this is just not true. Giving no occurrence notation with regard to rate gives a result of “constant”. H/AB being our rate and H/AB*100 being your adjusted rate percentage, you are wrong as soon as the player does not go 36-100. It is correct to say “an AVERAGE of 36% of his AB result in base hits”, but without distinguishing it as an “average” or specifying your base AB number you are claiming that any given 100 AB mark will provide a 36 Hit total. Doing so is a blatantly incorrect statement. It is why most polls start with “of so&so many polled”.
Which is also why you were blatantly incorrect in your previous example of “54 percent of people polled say they approve of the job the president is doing” once you started talking about multiple hundred voting.
Then once more back to why I am so hung up on this – remember why this conversation is even taking place? You were hung up on explaining your very wrong interpretations of the way things were, implying people (including myself) were the ones that were incorrect. If you didn’t want things explained to you, then maybe you shouldn’t have incorrectly explained them to others.
“A) Derek Jeter averages .334 hits per AB
B) Derek Jeter gets a Hit in 33.4% of his AB
Which of those is true? A definitely is, no one can argue that. But if B was true, then he would expect never to go 0-4. See, B once again implies he gets a hit every 3rd AB which is not true, he merely “averages” a hit every third AB.”
B IS true, and nobody would expect him to never go 0-4. For someone hitting .334, there is a 19.7% chance that they will, in fact, go 0-4. Which amazingly enough is also the case for someone who gets a hit in 33.4% of their AB. I’m really at a loss as to what sort of pre-pubescent mathematical knowledge might induce someone to believe that doing X in Y% of attempts somehow implies that one will succeed every Y/10 attempts. This nickel in my pocket comes up heads 50% of the time when I flip it, but everyone and their 10-year-old sister knows that this doesn’t mean my coin goes “head, tail, head, tail, head, tail…”
In fact, since pedantry and nonsense have apparently taken over the discussion, I may as well assert that A is actually closer to not being true since there is no such thing as a partial hit. Which, of course, is why you’ve never seen a .333 hitter with a number of AB less than 500 and not divisible by 3.
So, levity aside, to put this to bed: The numbers we are familiar with seeing are all percentages, period, with the exception of SLG*. .333 IS 33.3% of 1, and in no reasonable manner does the number “.333” have any meaning outside of that context. The only and absolute intent of the number, by itself, is to indicate “33.3% of the time.”
They can also be considered to be “averages” in a mathematical sense, but only as a sum of all numbers in a set (the numbers being either “1” or “0”, and the set being the results of each AB or PA) divided by the quantity of numbers in the set (i.e., arithmetic mean). Insofar as we do not view a hitter’s statistics as a series of ones and zeros, but as a whole number divided by another whole number… well, the argument that “average” is the wrong term has merit in mathematical terms. Linguistically, what’s actually being said is that “Joe averages a hit 33.3% of the time,” but that meaning of “average” is subtly different from the mathematical connotation.
*-SLG actually IS an average. The values in the set can range from 0 to 4, and the result of the equation is the arithmetic mean. It is NOT a percentage, even though the net result is to indicate that the player amasses x total bases per AB.
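The 19.7% figure quoted above can be reproduced with a short sketch. It assumes every at-bat is an independent trial with the same hit probability, which is a simplification of real baseball; the function name is made up for illustration.

```python
def prob_hitless(hit_prob, at_bats):
    """Chance of zero hits in `at_bats` independent tries."""
    return (1 - hit_prob) ** at_bats


# A .334 hitter going 0-for-4 under the independence assumption.
p_0_for_4 = prob_hitless(0.334, 4)
print(round(p_0_for_4, 3))
```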
@ 108: Jon Morse
Ok, well we will break that bullcrap into sections:
“B IS true, and nobody would expect him to never go 0-4. For someone hitting .334, there is a 19.7% chance that they will, in fact, go 0-4. Which amazingly enough is also the case for someone who gets a hit in 33.4% of their AB. I’m really at a loss as to what sort of pre-pubescent mathematical knowledge might induce someone to believe that doing X in Y% of attempts somehow implies that one will succeed every Y/10 attempts. This nickel in my pocket comes up heads 50% of the time when I flip it, but everyone and their 10-year-old sister knows that this doesn’t mean my coin goes “head, tail, head, tail, head, tail…””
Wrong
1) B is not true. Jeter does not get a hit 33.4% of the time. 33.4% means 33.4 for every 100 – when is the last time he got 33.4 hits in 100 AB? What he does is average 33.4%, and he has done that over 584 AB.
2) “This nickel in my pocket comes up heads 50% of the time I flip it” – By definition, this means 50 in 100 times. It doesn’t mean Head/Tail/Head/Tail, and no one ever said it did, so not sure where you are getting that crap. But, what was said is % = “for every hundred” and that means FOR EVERY HUNDRED unless specified otherwise. If you flip that coin 100 times and don’t get 50 heads/50 tails, then saying “I get heads 50% of the time” is blatantly incorrect. Your Odds might be 50%, but not necessarily your results. And I will bet you 100 dollars you don’t get 50% heads.
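For anyone weighing that bet, the chance of a fair coin landing exactly 50 heads in 100 flips can be computed directly from the binomial formula. This is a sketch assuming independent flips of a fair coin; the function name is hypothetical.

```python
from math import comb


def prob_exact_heads(flips, heads, p=0.5):
    """Binomial probability of exactly `heads` heads in `flips` flips."""
    return comb(flips, heads) * p ** heads * (1 - p) ** (flips - heads)


p_50_of_100 = prob_exact_heads(100, 50)
print(round(p_50_of_100, 4))
```

The result is roughly 0.08, so a coin whose odds are 50% lands on exactly 50 heads in about one run of 100 out of twelve, which is the gap between odds and results that both sides are arguing over.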
.
“In fact, since pedantry and nonsense have apparently taken over the discussion, I may as well assert that A is actually closer to not being true since there is no such thing as a partial hit. Which, of course, is why you’ve never seen a .333 hitter with a number of AB less than 500 and not divisible by 3.”
You obviously don’t understand the concept of “average” or “arithmetic mean” very well, do you? Which makes it funny when you try to act like a jerk.
.
“So, levity aside, to put this to bed: The numbers we are familiar with seeing are all percentages, period,”
Oh, so players are graded over every hundred AB then? Really? Even the players that don’t even get 100 AB are judged on what they did “for every hundred”? Yeah, that makes much more sense than the PER AB rate we were told to account for actually meaning PER AB. I mean, yeah, what kind of moron takes PER AB to mean PER AB when it so clearly means PER EVERY HUNDRED AB. You fool.
.
“The only and absolute intent of the number, by itself, is to indicate “33.3% of the time.””
Wrong. It is presented .333 because it is intended to be .333. Why? Because it is PER AB!!! And how do we know this? The equation is H/AB ***not H/AB*100***.
.
“They can also be considered to be “averages” in a mathematical sense, but only as a sum of all numbers in a set (the numbers being either “1” or “0”, and the set being the results of each AB or PA) divided by the quantity of numbers in the set (i.e., arithmetic mean). Insofar as we do not view a hitter’s statistics as a series of ones and zeros, but as a whole number divided by another whole number… well, the argument that “average” is the wrong term has merit in mathematical terms.”
Wrong. Here, try this
1/1, 1/1, 0/1, 0/1, 0/1, 0/1, 0/1, 1/1, 0/1, 1/1
Find the arithmetic mean. And I’ll give you a hint, it ISNT the X/100 you are trying to claim.
H/AB is set up to simulate an equation like the one above. It also does this, perfectly. That’s why its used! Go figure, huh.
Linguistically, what’s actually being said is that “Joe averages a hit 33.3% of the time,” but that meaning of “average” is subtly different from the mathematical connotation.
Wrong. If that were “actually” the case, the equation would very simply “actually” state H/AB*100, wouldn’t it? It doesn’t though, does it? Wonder why that is since they “actually” want the equation to be H/AB*100. I mean, if I want something, I say what I want – not say part of it then leave it up to the other person to decide what I am actually asking for.
And let us delve into that even a little bit deeper. Why is H/AB “actually” saying “H/AB*100”? What is the significance of 100 AB with relation to baseball? Wouldn’t it make more sense to assume something which relates to the game? Like, say, 600? 600 is in the ballpark of the expected AB a player should receive over a season. Are you sure it isn’t “actually” saying the implied equation should be H/AB*600? Wouldn’t this make sense to someone who follows a team? He would know how many hits to expect over a season – so he would know roughly how many are left at any given point in the season.
Or a player generally gets 4 AB per Game, are you sure it isnt “actually” implying H/AB*4? Wouldn’t this make more sense to a person who happens to, say, go to a game? He would know the number of hits the player should receive while he is watching him! That would make sense. And making this even more intriguing of an idea – why is it indicated that pitchers should be judged by the game? A pitcher ideally goes 9 innings, it was that way when the stats were introduced. And the pitchers stat is clearly defined to show ER per 9 innings pitched. So why would a pitcher so clearly be judged by the game but a batter be judged “for every 100” AB, when 100 AB has no meaning in the game?
And then you get back to the original problem – why is it so clearly stated that pitchers should be judged off 9 innings, but only implied hitters should “actually” be judged off every 100? It was so simple to give a ER/IP*9, why is it so hard to give H/AB*100?
.
But since you so clearly are telling us the “actual” equation for BA is H/AB*100, how about you show us where this is “actually” stated? Where can we find proof of this “BA = H/AB means BA = H/AB%” theory you are presenting? I mean, you say it is what is “actually” being said, so WHERE IS IT SAID???
Oh, and just so you know, right here by you doesn’t count. Believe it or not, you saying something doesn’t make it true – which is clearly evident by your entire post.
@ 108: Jon Morse
Oh, and because this one kind of made me chuckle:
“*-SLG actually IS an average. The values in the set can range from 0 to 4, and the result of the equation is the arithmetic mean. It is NOT a percentage, even though the net result is to indicate that the player amasses x total bases per AB.”
Ok, so a .333 BA is “actually” supposed to mean 33.3 Hits for every 100 AB (despite it no where saying this) and a .416 OBA is “actually” supposed to mean 41.6 TOB for every 100 PA (despite it no where saying this) BUT when it comes to the .567 SLG, it is NOT intended to be 56.7 TB for every 100 AB. And the reason for this is because a slugging average “can range from 0 to 4” but never surpasses the 0 to 1 range?
And what are the guidelines here? Because SLG is always 0 to 1, it is an average? Where BA is 0 to 1 so it is a percentage, and OBA is 0 to 1 so it is a percentage? Guidelines being(?)
Is 0 to 1 = average
Is 0 to 1 = percentage
Is 0 to 1 = percentage
Am I understanding that right? That sure seems to be your guidelines based off what you said. And honestly, when it is said like that, I have to say it sure makes perfect sense! :/
@John Q (102):
Agreed on the RBI. This was really just a very quick formula to make fantasy baseball more fun to follow. I’m sure I could base it in The Real World much better if I tried.
Joey:
Strawman much?
No matter how you slice it, if a guy has produced 120 hits in 500 AB, he has gotten a hit 24% of the time. It does not, never has, and never will mean that he will get a hit 24% of the time; it does not mean that “he’ll get 12 hits per 50 AB” or “he’ll get 60 hits per 250 AB” or “he’ll get 6 hits per 25 AB.” Your conflation of the two situations is the root of your entire inability to grasp the difference, and your — to be quite honest — absolutely ludicrous assertion that somehow “33.4%” implies that someone can’t go 0-4 is the reason I jumped in here to begin with. You said you weren’t sure where I was getting that crap, and you quoted yourself spewing it right before you said it.
“Percent” does not mean “X per every selection of 100.” The fact that you believe that to be true pretty much explains this whole fiasco. “48% of the states in the United States of America lie fully or partially west of the Mississippi River.” If “percent” is a strict “per 100,” that statement would have no meaning; indeed, it would be invalid, because last time I checked there weren’t 100 states. Yet it is not invalid, and in fact it’s an incontrovertibly true statement unless you want to argue about the definition of “fully or partially west of the Mississippi River”. I dunno, maybe someone will want to complain that parts of Wisconsin are west of Saint Louis, which would be about as logical and rational as your response was. Likewise, a guy who got 120 hits in 500 AB did in fact get a hit 24% of the time, just as a guy with 12 hits in 50 AB did. Period.
As to your objection to my assertion that SLG is an average and not a percentage, I guess maybe you’ve just managed to give 110% of your effort in making your point. A hitter with a SLG of 1.200 cannot be said to have gained a TB in 120% of his AB, now can he? No, a hitter with a SLG of 1.200 has gained an average of 1.2 TB per AB. (Before you start ranting, I’ll get back to this in a moment.)
Of course, since you threw out some nonsense about SLG never exceeding the 0 to 1 range, I’m probably talking to a wall here. SLG is not always 0 to 1; it is 0 to 4. Guy leads off the season with a homer, his SLG is 4.000, not 1.000. The fact that nobody has broken the .900 barrier for a full season is, for this purpose, completely irrelevant.
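The 0-to-4 range is easy to see if SLG is written out as the arithmetic mean of total bases per AB. A minimal sketch follows; the function name and the five-AB sample line are made up for illustration.

```python
from statistics import mean


def slugging(bases_per_ab):
    """Mean total bases per AB.

    Each entry is 0 (no hit), 1 (single), 2 (double), 3 (triple)
    or 4 (home run).
    """
    return mean(bases_per_ab)


# A hitter whose only AB of the season is a home run.
first_ab_homer = slugging([4])

# Hypothetical five-AB line: homer, out, single, out, double.
sample_line = slugging([4, 0, 1, 0, 2])

print(first_ab_homer, sample_line)
```

The second line comes out above 1, which is why the result reads naturally as bases per AB and awkwardly as a share of at-bats.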
Lastly… “But since you so clearly are telling us the “actual” equation for BA is H/AB*100”
Would you care to point out where I so clearly told anyone that? The equation for BA is H/AB, and I don’t think I stated, implied, or even hinted at anything different. I think it’s only “clear” to you because you’ve gotten yourself stuck on this idea of percent meaning “in every sample of 100.” You seem to have forgotten that I actually agreed with you that BA is an average, in addition to being a percentage. Just as a player with a 1.200 SLG has gained an average of 1.2 TB/AB, a player with a .333 BA has gained an average of .333 H/AB. I just think that calling it a percentage is more sensible, since, you know, that’s what it’s expressed as whereas SLG is not.
Oh, and ERA? You actually bring up a valid point; the catch is that you simply cannot do the same thing for hitters in any meaningful way. If you tried, every hitter would either look like a .250 hitter, or like a .200 or .400 hitter, depending on how many AB you presume equals a game. And that, sir, is why BA is expressed as a percentage, while ERA is not.
Cheers.
And here we go again with the asinine crap…
“No matter how you slice it, if a guy has produced 120 hits in 500 AB, he has gotten a hit 24% of the time”
Correct – kind of. “He has gotten a hit 24% of the time ON AVERAGE or OVER 500 AB” is the correct statement. Saying “he has gotten a hit 24% of the time” specifically means 24/100 – and he hasn’t done that.
But more to the point – that isn’t what BA calculates. BA can be altered to say that, but that isn’t what BA is. Nowhere, ever, does BA give a calculation for a 1/100 scale.
.
““Percent” does not mean “X per every selection of 100.” The fact that you believe that to be true pretty much explains this whole fiasco. “48% of the states in the United States of America lie fully or partially west of the Mississippi River.” If “percent” is a strict “per 100,” that statement would have no meaning; indeed, it would be invalid, because last time I checked there weren’t 100 states.”
WRONG
Percent is, by its definition, “for every hundred”. In saying “48% of the states are west of the Mississippi” you are saying “48 states for every hundred are west of the Mississippi”. The fact there are fewer than 100 states means that fewer than 48 states are west of the river, but there being fewer than 100 states in no way makes 48% any less 48/100.
.
“A hitter with a SLG of 1.200 cannot be said to have gained a TB in 120% of his AB, now can he? No, a hitter with a SLG of 1.200 has gained an average of 1.2 TB per AB.”
WRONG
Saying a “SLG of 120%” is saying the same exact thing as calculating a BA into a percent – but I will get to that later.
.
“Of course, since you threw out some nonsense about SLG never exceeding the 0 to 1 range, I’m probably talking to a wall here. SLG is not always 0 to 1; it is 0 to 4. Guy leads off the season with a homer, his SLG is 4.000, not 1.000. The fact that nobody has broken the .900 barrier for a full season is, for this purpose, completely irrelevant.”
Yes, I said SLG can be 0-4, but it is almost always 0-1. And it’s based off a rate of 4, so if your theory of ‘BA “actually” is supposed to be made a percentage’ is true, why not make SLG a percentage as well? That would be a calculation of (TB/4)/AB*100 = SLG. OR, why not leave it just how it is and just randomly call SLG a percentage in the way you are calling BA a percentage? Here, this is what I mean:
In insisting that BA is a percentage, you are ending up with an end result of a % of a single Hit per AB per the definitions in question. Case in point, player has averaged .367/AB = player gets 36.7% of a hit per AB. It coincides with the number of hits he gets per 100 AB because of the fact we are based off a 0-1/1 scale, but it does NOT say “hit 36.7% of the time” when the calculation in question is quite clearly PER AB
So, lets put that in work, making everything a percentage for you since you are so adamant about it:
(using a slash line of .367/.416/.536)
BA = PER AB he averages 36.7% of a Hit
OBP = PER PA he averages 41.6% of a Base
SLG = PER AB he averages 53.6% of a Base solely with his bat
And to clear up SLG, we will get back to it. Say SLG is 1.200. We get:
SLG = PER AB he averages 120% of a Base through Hit
Unless you know nothing about the game, you realize this puts him between 1st and 2nd on average PER AB
That is what is “actually” being said when you claim BA and OBP are “actually” percentages. And why does it end up like that? Because it is based on a PER AB (or PA) scale, not “Per every Hundred” like percentage defines as.
So a 36.7% BA means 36.7% of H / AB or (36.7/100)/1 where H are factored to a 1/100 scale to get your “for every hundred” you insist is implied while PER AB stays exactly where the definition of the statistic says it should be.
But the problem? Why is saying “36.7% H PER AB” any better than saying “.367 H PER AB”? It isn’t. And BA (as well as OBP and SLG) are obviously not Percentages unless you make them into percentages to tell something other than what the stat itself tells.
And that’s the thing; what are you saying by BA = % = “He has gotten a hit 24% of the time”? What is “a time” and why do we now gauge BA off of it? If we substitute AB (what the stat calls for) in “time” we get: “He has gotten a hit 24% of the AB” – which is what BA, if considered a percentage, tells us.
.
“Would you care to point out where I so clearly told anyone that?”
You serious? The entire time you have insisted BA is a percentage. If BA is in fact a percentage, then the calculation would have to be H/AB*100 (which you are continually saying it is) or H*100/AB (which it would actually become per the definition of the stat). Since you say it specifically means .334 is intended to be written 33.4/100, which is (H/AB)*100, then it should “actually” be written like that in the definition. Why expect a return you don’t ask for, after all? If the expected return was a percent, it must be asked for in the calculation. Where is H/AB*100 “actually” asked for if that is what you are saying they are “actually” asking for?
And this is what it all accumulates to:
“Oh, and ERA? You actually bring up a valid point; the catch is that you simply cannot do the same thing for hitters in any meaningful way. If you tried, every hitter would either look like a .250 hitter, or like a .200 or .400 hitter, depending on how many AB you presume equals a game. And that, sir, is why BA is expressed as a percentage, while ERA is not.”
Excuse me? IF you convert BA into a percentage you are either saying it the way the stat was intended and altering the H portion to reflect a 1/100 scale (that being %H Per 1 AB) or completely altering the definition from its current “H per AB” to “Hits if over 100 AB” (H/AB*100). Now, if you did the first, it isn’t completely wrong – just pointless, and not intended (since nowhere does it say “average percent (or 1/100) of a hit the player receives per every AB”).
But we know which of those you are trying to claim; that is, BA really means #H if given 100AB. And this is meaningful because in baseball the 100 AB marker has significance how? Herein lies your entire problem, as I have been relating it to you the whole time. Really, what is the significance of 100?
Pitchers are based off 9 innings, but this isn’t really significant to them either – they get pulled before the end of the game quite often. And there have always been relief pitchers in some form, yet they always got judged off the same “per 9” as well. Everyone gets the “Earned Run Average per 9 innings pitched” outlined in the actual stat. Yes, that is correct, ERA clearly states that the pitcher’s ERA should be factored to a particular number. BA doesn’t have an outline in the stat, leading to you assuming that it should be calculated to a 100 AB rate. 100 has no significance, but based off the pitchers who are judged per full expected game regardless if they are to pitch a full expected game, then you would only be able to assume the same “expected game” for the hitters. And actually, a player would receive a minimum of 3 AB per game if playing the entire thing, so that is probably the number that should be used.
So if we were to take your assumption that “Hits per AB” is actually intended to be “Hits per AB factored to another number of AB” then 3 is much, much, much more meaningful to the game than 100. And the end would give you H/AB*3 for BA, TOB/PA*3 for OBP and TB/AB*3 for SLG for every single player just like ERA is ER/IP*9 for every single pitcher to denote a game’s worth of performance based on his rate.
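For anyone following the arithmetic in this exchange, here is a minimal Python sketch of the scaling being argued over. The function name `per_n` and the sample stat lines (50 hits in 150 AB, 20 earned runs in 60 innings) are made up for illustration; nothing here is an official definition. It only shows that the underlying rate is the same whatever number it gets multiplied out to.

```python
from fractions import Fraction


def per_n(outcomes, opportunities, n=1):
    """Scale a raw outcomes-per-opportunity rate to 'per n opportunities'."""
    return Fraction(outcomes, opportunities) * n


# Hypothetical stat lines, chosen only to keep the arithmetic clean.
hits, at_bats = 50, 150
earned_runs, innings = 20, 60

ba = per_n(hits, at_bats)                # plain H/AB
ba_per_3 = per_n(hits, at_bats, 3)       # H/AB*3, the "per game" idea
ba_per_100 = per_n(hits, at_bats, 100)   # H/AB*100, the "percent" idea
era = per_n(earned_runs, innings, 9)     # ER/IP*9

print(float(ba), float(ba_per_3), float(ba_per_100), float(era))
```

Every scaled figure divides back to the same H/AB once the multiplier is removed, so the choice of 3, 9 or 100 changes how the number is written, not the rate it comes from.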
Yawn.
Get back to me when you learn to actually read what people post, rather than making stuff up.
For everyone else’s benefit, if they’ve been following this:
The following is the definition of “percentage” as published by M-W:
“a part of a whole expressed in hundredths” (emphasis mine). Nowhere does it reference any actual connection to “units per hundred.”
The following is a definition of “percentage” as published by Oxford:
“any proportion or share in relation to a whole”.
I don’t think I can make the root of this entire stupid pointless pedantic waste of time any clearer than that.
“Yawn.
Get back to me when you learn to actually read what people post, rather than making stuff up.”
I guess I can say I’m sorry you’re wrong, but I haven’t made up a single thing.
“a part of a whole expressed in hundredths” (emphasis mine). Nowhere does it reference any actual connection to “units per hundred.”
“hundredths” = “One part in a hundred. Equivalent to 1/100.”
.
“any proportion or share in relation to a whole”.
That is not the mathematical definition and does not pertain to what you are saying when you give a basis for your argument as “The only and absolute intent of the number, by itself, is to indicate “33.3% of the time.” That Oxford definition you just gave is used to distinguish a certain portion of a complete. ie. percentage of his hits to left field – Hits being the whole or complete while “to left field” is the part. This is Oxfords specific use of that definition “only a small percentage of black Americans have Caribbean roots” – Black Americans being the whole, those with Caribbean roots being the part.
And of course this is the specific Oxford definition as it relates to our situation:
“a rate, number, or amount in each hundred”
ie. x/100
That is exactly what you claimed you made the basis of your argument. And that is where your complete downfall lies. You specifically used the %, denoting a mathematical equation of 1/100 – and you have continued to use it over and over since. Once you did so you brought in the proper definition of “percentage” or “percent” or “per century” or “for every hundred”. If you were to have made your entire argument solely off the vague use of the word and state it gives no mathematical adjustment (always leaving us with the intended “hits per AB” to provide the “H/AB = .xxx” equation) then you wouldn’t have been pitifully incorrect and could get away with referring to it as a percentage. It would never make BA or OBP or SLG or ERA, etc, any less the Average they are, but you would have a verbal usage which could loosely be interpreted as passable. Heck, the creators of OBP and SLG did when they named both of them percentages. Intelligent people (like our boy Tango there) of course knew them to be averages though, and ensured when they gave mathematical weights the name was reverted back into the proper On Base Average. That is the reason why we have wOBA or “weighted On Base Average” today.
But ya didn’t call it the average it is and didn’t use the extremely vague use of the word percentage when defending it possibly being called that. No, you wanted to insist it secretly implied an unsaid alteration to the equation which would give you an end result of x%. That is where you ended up completely incorrect and solely your preference where the next guy may prefer ‰ (that is, permille) where the guy after him possibly wants /3 (that is roughly per game) where the guy after him may like /600 (roughly per season) and everyone would only be showing their personal chosen way to make an adjustment to a very specific calculation given to form an average.
Sorry, tiger, but he called it “wOBA” so that the name would sound like a song from Sesame Street.
@ 118: Colin Wyers
Partially true. The “wOBA” itself is because it happens to coincide with a song from the show. But it was merely renamed from “lwtsOBA” to “wOBA” as homage to Grover’s vocal performance and the point doesn’t change.
Joey… get this through your skull.
I never said anything about “per 100”. You are confusing me with Colin, and every single time you’ve dredged up that part of the argument, you’ve been referencing something which you initially disputed with Colin. Do you see now why I find this whole thing laughable, and really have no interest in continuing to debate with you as if you actually know anything?
You have now accused three different people in this discussion of trying to use “H/AB *100”, and not ONE of them has been trying to say that. Just give it up, already.
“I never said anything about “per 100”. You are confusing me with Colin”
I can’t believe how clueless you really are. Here:
108: Jon Morse said at 5:57 pm on September 8th, 2009:
“So, levity aside, to put this to bed: The numbers we are familiar with seeing are all percentages, period, with the exception of SLG*. .333 IS 33.3% of 1, and in no reasonable manner does the number “.333” have any meaning outside of that context. The only and absolute intent of the number, by itself, is to indicate “33.3% of the time.”
I like the idea of creating a simpler stat for people to follow, but I was wondering where some of the numbers for doubles, triples, and sb’s came from. To me, they should all be interrelated. Now, I’m not exactly sure how you came up with the numbers for doubles and triples as opposed to singles (although I’m sure there is a reason for it that I’m missing), but doesn’t a single plus a stolen base equal what the net result of a double is? The only difference is if there are runners on base.
Because of this, I was wondering whether some of the numbers could be more developed. For example, I would make a stolen base equal to the difference in values between a single and a double and a double and a triple (which according to your numbers is .6). Furthermore, I think it should separate runners caught stealing according to the base the runner was thrown out at. If a runner was picked off at first or caught at second, he loses the value of reaching first base (which is why I think walks and singles in this should account for the same value). If he was picked off second or thrown out at third, he should lose the value of the 2 bases that he had accrued. And if the runner is picked off at third or caught stealing home (however rare it is), he loses the value of a triple. In fact, I would have them lose the value of the base plus add on the negative points, since that basically what a cs does anyway.
Although it would complicate the formula, I think these changes could make “hitting average” a more accurate indicator of a player’s value. I think this doesn’t have any holes, but I could be wrong.
Oh, so you mean to tell me that you think 33.3% of 1 relates to the number 100 somehow, rather than relating to… 1. See, nowhere in there do I make any reference to anything being “per 100.” And you call me clueless, when you’re functionally illiterate?
It’s a part of a whole expressed as hundredths. The key word there is not “hundredths.” The key words are “part of a whole” and “expressed as”. Expressed as, Joey. 1/4 does not mean “one out of every four” any more than 30% means “30 out of every 100.” Especially not in the mind-bogglingly idiotic sense that you argued that stating a player gets a hit 33% of the time implies he’d never go 0-4.
You know what? In fact, that shoots your entire argument to smithereens in and of itself — because your argument is based on your asinine insistence that 33% means 33 out of every 100, yet last time I checked, 4 is not 100. You’re babbling on and on like you know what you’re talking about, yet you fail to grasp the simple concept that the percentage is a part of the whole, not a part of a sample of the whole. It is, of course, entirely possible to get a hit in 33.3 percent of your AB over the course of the season while still having multiple 0-4 games, because the 33.3% doesn’t refer to every 3 AB, but to the whole.
Which is also precisely, indisputably, why it does not refer to every 100 either. A hitter can go 10-100, then go 35-100, 35-100, 35-100, and 35-100… and they’ve gotten a hit 30% of the time, without ever going 30-100. The only extent to which a percentage refers to “X per 100” is in the manner by which the number is expressed, which is the entire point I’ve been making since this entire ridiculous conversation began.
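The five stretches in that example can be checked directly. A small Python sketch, using only the numbers from the comment (the variable names are just for illustration):

```python
# (hits, at-bats) for each 100-AB stretch described in the comment.
stretches = [(10, 100), (35, 100), (35, 100), (35, 100), (35, 100)]

total_hits = sum(h for h, _ in stretches)
total_ab = sum(ab for _, ab in stretches)
overall_rate = total_hits / total_ab

# Did any single stretch actually come out at exactly 30-for-100?
any_exact_30 = any(h == 30 and ab == 100 for h, ab in stretches)

print(total_hits, total_ab, overall_rate, any_exact_30)
```

The totals come to 150 hits in 500 AB, a .300 rate over the whole, even though no individual 100-AB stretch is 30-for-100.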
Anyone else reading this completely understands at this point, and I frankly don’t care whether you do or not, so I’m done here. You can continue your barrage of personal insults which began with your very first reply to me at your leisure, and you have loads of fun with that.
I feel bad about this, but I am completely incapable of seeing Tom Tango’s name now without thinking of Clay Dreslough’s douchiness. Completely unfair to Tango, I know.
::sigh::
(re: BA) “The only and absolute intent of the number, by itself, is to indicate “33.3% of the time.””
“%”
Definition: for every hundred
“the time”
Definition: when something happens
Without a numeral given for “the time”, “33.3% of the time” specifically means “33.3 for every hundred”.
.
“It’s a part of a whole expressed as hundredths. The key word there is not “hundredths.” The key words are “part of a whole” and “expressed as”. Expressed as,”
“part of a whole” = our hits
“expressed as” = shown
“as hundredths” = for every hundred
.
“Joey. 1/4 does not mean “one out of every four” any more than 30% means “30 out of every 100.””
Correct if you are giving 1/4 as a fraction. We weren’t talking about fractions – we were talking about a “percent” or “per cent” or “per century” or “per 100”. 30% specifically means “30 for every hundred”.
“Especially not in the mind-bogglingly idiotic sense that you argued that stating a player gets a hit 33% of the time implies he’d never go 0-4.”
You are taking a statement out of context. Putting it back into context: If you do not distinguish the “33%” as an “average” or state what “the time” represents, then 33% on a scale of 4 means Jeter must get 1.32 Hit in the 4 AB.
.
“because the 33.3% doesn’t refer to every 3 AB, but to the whole.”
and without defining the whole or giving the correct “average” distinction you are stating “33.3 in 100”. That was always my entire point – the “average”. Making something a percentage at a later time doesn’t remove the “average”. And even if a person chooses to make something into a percentage, it doesn’t alter the definition.
End result:
Batting Average is an Average even if you convert it to a percentage at a later time. It is the mathematical problem used to represent a series of outcome/opportunity rates to the arithmetic mean.
OBP is also an Average, even if you convert it to a percentage. It is the mathematical problem used to represent a series of outcome/opportunity rates to the arithmetic mean.
Slugging is an Average, even if you alter it into a percentage. It is the mathematical problem used to represent a series of outcome/opportunity rates to the arithmetic mean.
And that is what everyone should be able to grasp. (well, other than yourself for some strange reason)
“Correct if you are giving 1/4 as a fraction. We weren’t talking about fractions – we were talking about a “percent” or “per cent” or “per century” or “per 100”. 30% specifically means “30 for every hundred”.”
(since you will jump on it, this paragraph experienced an editing problem. Please read the “fractions” as “an independent fraction or outcome”)
Joe, the thing is, despite your stated druthers, the guy who goes 4 for 4 with four bases empty doubles is worth a LOT more than the guy who goes 0 for 5 with a run and RBI. And to me, that’s how it should be. Not to go all quantum on you (sorry, spent last weekend at the North American Discworld Convention) but there are actual runs (those which cross the plate) and potential runs (those situations which make it easier for runs to cross the plate even if in this instance they don’t). Of course, your two hitters were an extreme small sample, but in the long run, batters who make it easier for their teammates to get RBI are worth a lot more than batters who don’t, even if their teammates happen not to cash them in.
And there is definite value to potential runs. It is more stress on the pitcher, which is likely to lead to more innings by the bullpen, which is a good thing for the offense. It is almost always more pitches thrown, since 4 for 4 used up zero outs. Having men on base generally increases the chances of batters getting hits. That man on second needs only a useful out to reach third, and then often the infield will have to play in, which increases batting averages (else the infield will always play in).
In short, I’m perfectly happy with the new stat, but it’s more work than I’m willing to do. When I was playing Strat-O-Matic, I used something similar: every card was ranked with 3 points for a walk or HBP, 4 for a single, 6 for a double, 8 for a triple, and 10 for a homer. Since each card was trued to 216 chances (rolling 3 dice) I could compare the total points of all cards. I could do this in my head. I can also calculate OPS in my head from all web sites, as listed above. I cannot do this new stat in my head. So I won’t use it much.
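The card-scoring shortcut described above might look something like this in Python. The point weights and the 216-chance card size come straight from the comment; the sample card and the names `POINTS` and `card_points` are hypothetical and not taken from any real Strat-O-Matic set.

```python
# Point weights as described in the comment above.
POINTS = {"walk_hbp": 3, "single": 4, "double": 6, "triple": 8, "homer": 10}
CARD_CHANCES = 216  # 6 * 6 * 6 combinations from rolling three dice


def card_points(card):
    """Total points for a card, given chances (out of 216) per result."""
    used = sum(card.values())
    if used > CARD_CHANCES:
        raise ValueError("a card cannot have more than 216 chances")
    # Chances not listed on the card are treated as outs, worth zero points.
    return sum(POINTS[result] * chances for result, chances in card.items())


# Hypothetical card, invented for this example.
sample_card = {"walk_hbp": 20, "single": 30, "double": 8, "triple": 2, "homer": 6}
total = card_points(sample_card)

print(total, total / CARD_CHANCES)
```

Because every card is trued to the same 216 chances, the raw totals can be compared card to card without dividing by anything, which is presumably why it could be done in one's head.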
Jon Morse postulated comparing two batters with equal numbers and chances but differing RBI, and said the base runners would have to be drastically different to accommodate that result. I will cite an example in opposition.
Juan Pierre is batting .318 this season. He has no power, so the outfield plays shallow against him. But he has great speed, so he gets a lot of leg doubles and triples.
A slower more normal player might have comparable total numbers, but would achieve them with an outfield playing deeper. Thus, runners would be likelier to grab the extra base (and the run scored) just because the outfielder would have to make a longer throw.
The beauty of baseball, IMO, is that it is impossible to generalize. I’m a big fan of many modern stats. But I also recognize that as long as they get on base enough, there is more value to speedsters than OPS+ measures. The fast slap hitters force infielders to hold them on, opening holes for the batters. They force the pitcher to devote some attention to them, which is unlikely to help them pitch better. They are more likely to force errors, outrun a force play, take the extra base. And I don’t think modern slugging based stats measure that.
One more thing about pitcher comparisons. In 2004* the Dodgers had three gold glove caliber infielders in Adrian Beltre (who has won GGs), Cesar Izturis (ditto) and Alex Cora (who really isn’t quite good enough a hitter to play regularly enough to win a GG, but is one fine fielder).
By the way, whoever was looking at NL MVP above for 2004 without considering Bonds did the #2 MVP guy, Beltre, a MAJOR disservice by leaving him off the list. Especially in context of Dodger Stadium, Beltre led the league and set a franchise record for single season homers without a great supporting cast. Rolen, Pujols, and Edmonds right there reinforce each other, making it easier for pitchers to pitch under stress and allow one of the three to do something good. I won’t argue that all three Cardinals didn’t have great seasons, but I think Beltre should have been included in the list.
In part because of their defense, a marginal ground ball pitcher like Jeff Weaver became more effective. The Dodgers made the playoffs. Weaver pitched less well the next season without Beltre, far worse the next three seasons with other teams, and is now a long reliever/spot starter. But in the right ballpark, with the right defense behind, I think Weaver could be a useful #4 or #5 pitcher.
In 2005, the Dodgers had a supposed major upgrade at second base, with Kent replacing Cora. Yes, Kent provided a lot more offense, but he had to dive at balls Cora fielded standing up. Plus the Dodgers had picked up another ground ball specialist (Derek Lowe) meaning every run Kent earned on offense he gave back on defense for two pitchers, not one. And the team worsened by over 20 wins.
So yes, I’m a big fan of what the Mariners are doing. Defense can help make teams.
Lol. Ok, seeing it now for the first time since, please just disregard that entire problem paragraph up there. Was the last thing I was rewriting when interrupted by company. Train of thought was lost, the entire post wasn’t altered and I assumed it still held its overall weight where it doesn’t actually make sense at all and contained errors upon further reading. Didn’t notice it before the correction post because it was actually provided mere seconds after the first, but it asked for verification which I only noticed 12 minutes later obviously. (there is your clarification on that to deny the possibly oncoming attack)
Now if anyone wants clarification of where it was intended to go and ultimately try to explain, then here. (and I will take this all the way to explaining completely why BA, OBP, SLG and others are always averages)
If saying 1 for that specific 4, then 1/4 does not mean 1 for every 4. If you say “25%” regarding that specific 4, then one would take you as saying a rate of 25 for every hundred calculated to that specific series of 4, or 1 for that 4. Could be taken as something else though and will explain in the next.
If saying he gets a hit for every 4 AB because of an overall 5/20, then you are technically saying 5 series of 1 for 4. Saying 25% because of those 20 would generally be taken as saying the overall 5 for that 20 but can be taken as anything between that 5 for 20 down to 1 for 4 and technically even a hit value of .250 for 1 (or a complex fraction of ¼ /1) if you didn’t specify.
If saying he gets 1 hit for every 4 AB with regards to an overall 100/400, then you are technically saying that 1 for every 4 rate shows up 100 times. Just like saying 25 for every hundred for an overall 100/400 generally means exactly what it says – 25 for every hundred. Now without specification it too can even be open to the same interpretation as above though (100/400 down to .250/1). As you said, “33.3% of 1” – which means you are giving a %/1 or a complex fraction of X/100 /1
Also, if saying “25% of the time” without specifying what “the time” is at all, your “25 for every 100” technically has to hold up under all possible outcomes of “the time”. Meaning the first time he goes 0 for 4 your “25% of the time” is shot to hell – or if a person wanted to take it to that extreme you left an opening for, the player should be getting a hit value of .250 every time he comes to the plate or your statement is proved invalid.
It’s the problems you are going to run into by not specifying when creating a percentage off an original calculation presenting the arithmetic mean. And you get an arithmetic mean every time you calculate an outcome per opportunity for outcome. That’s why BA and others will always be averages no matter what is eventually done to them. Even BB% is an average expressed as a percentage.
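To make the "average of outcomes" idea concrete, here is a rough Python sketch. The 16 at-bats are invented (1 = hit, 0 = no hit) and grouped into four hypothetical 4-AB games; the variable names are only for illustration.

```python
from statistics import mean

# Hypothetical 16 AB, grouped as four 4-AB games; 1 = hit, 0 = no hit.
games = [
    [0, 0, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
]
outcomes = [ab for game in games for ab in game]

hits = sum(outcomes)
batting_average = hits / len(outcomes)   # H/AB
mean_of_outcomes = mean(outcomes)        # arithmetic mean of the 0/1 results
as_percentage = batting_average * 100    # the same rate written per hundred
hitless_games = sum(1 for game in games if sum(game) == 0)

print(batting_average, mean_of_outcomes, as_percentage, hitless_games)
```

H/AB and the arithmetic mean of the individual results are the same number (.250 here), and multiplying by 100 only changes how it is written. The sample also contains an 0-for-4 game, so the overall rate describes the whole set of at-bats rather than any particular four.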
Joey: I’ll go ahead and respond, since I think we’re getting toward the point where we’re at least on the same chapter, if not the same page.
“That was always my entire point – the ‘average’. Making something a percentage at a later time doesn’t remove the ‘average’.”
Well, jeez louise, man, I said in my first comment to you that was correct! We haven’t been arguing about whether they’re averages or not, but about whether they’re percentages or not. So what exactly is it that I don’t get?
My point, lost amidst all this froofroo: they ARE averages and they ARE percentages. Really, it’s just a semantic argument, because one answer presumes that you’re trying to determine the average number of hits a batter gets per at-bat (average), whereas the other answer presumes that you’re trying to express the rate at which the batter gets a hit (percentage). My main reason for preferring percentage here is that for all intents and purposes, knowing “how many hits a batter gets per AB” is useless since he either gets one or none in each individual AB. We’re interested in how often he gets a hit (or gets on), even though they’re functionally the same thing.
Does my thought process at least make sense there?
Anyway.
Richard: Hang on there; I didn’t mean that differences in runners were the only reason. Your example is a pretty viable exception, although there’s one respect in which it’s probably not all that workable: a hitter of Pierre’s profile with similar stats to someone who gets their XBH via power are very, very unlikely to have similar HR totals. (Not saying it can’t happen, and I’d even support you in noting that it’s fairly common for younger players with power potential to hit a lot of doubles and hardly any homers until their power develops a little more.)
But then, that’s a very specific kind of exception, and if there’s one thing I’ll argue tooth-and-nail with even stat guys about is that you can’t figure anything out solely with stats; it’s a complex game with nuances which require a knowledge of the actual bags of meat playing the game. You have to know what kind of hitter a Juan Pierre is, divorced from the actual results of his hitting, to truly understand what those results are telling you.
@ 131: Jon Morse
Oh my God what are you talking about now?
“Well, jeez louise, man, I said in my first comment to you that was correct! We haven’t been arguing about whether they’re averages or not, but about whether they’re percentages or not. So what exactly is it that I don’t get?”
Wrong.
This is what you first said to me:
“They can also be considered to be “averages” in a mathematical sense, but only as a sum of all numbers in a set (the numbers being either “1” or “0”, and the set being the results of each AB or PA) divided by the quantity of numbers in the set (i.e., arithmetic mean). Insofar as we do not view a hitter’s statistics as a series of ones and zeros, but as a whole number divided by another whole number… well, the argument that “average” is the wrong term has merit in mathematical terms. Linguistically, what’s actually being said is that “Joe averages a hit 33.3% of the time,” but that meaning of “average” is subtly different from the mathematical connotation.”
In saying Hits per AB, Mathematically and Linguistically the stats are calling for an average. Linguistically it is asking for the number of Hits “for each” AB (as is the definition of “per”) – the number of hits will vary from 0 to 1 for each AB, and is easiest to calculate as the H/AB we are told to use. Mathematically they are averages which could eventually be altered into percentages or per-milles or parts-per-billion or whatever else a person may choose. But mathematically and linguistically they are not percentages or per-milles or parts-per-billion or whatever because neither the definition nor calculation specifically calls for that; the definition and calculation solely call for the average.
And here:
“My main reason for preferring percentage here is that for all intents and purposes, knowing “how many hits a batter gets per AB” is useless since he either gets one or none in each individual AB.”
Here you admit what we have all known the entire time – that it is your preference but not necessarily the “intended” or “called for” or all the other things you have claimed BA implied. And there is the problem, and the reason we have been arguing – as if your preference for something somehow makes it a fact. Yes, we realize you personally want to say BA is a “percentage” and it is that imaginary intent of the stat you claim exists (“The only and absolute intent of the number, by itself, is to indicate “33.3% of the time.”) because you like to think of it as showing how many hits he gets over 100 AB or whatever. But that is just you and it doesn’t make it the case any more than my thinking it should be factored to “per every 3” or “per every 600” or “per-mille” or whatever else I could come up with. That is why we had the entire conversation. You were arguing that your personal preference is fact and makes the definition of the stats something they just are not at all.
.
“Does my thought process at least make sense there?”
But yes, I read you perfectly clear now. You know you have been wrong the entire time so you are trying to backtrack to cover your bases. Unfortunately you obviously end up contradicting yourself in doing this; but whatever – you already proved my being correct the whole time so we can leave it at that…
[...] In his article, Posnanski also talks about the “negative hitting” stats, or “negative production” formula that Tango created. Honestly, I don’t completely understand it. Here’s the formula: (At bats – hits) + sacrifice flies + caught stealing [...]
[...] Posnanski wrote about his desire to adopt a baseball stat for his blog. He hinted at reasons for disliking OPS [...]