View Full Version : Trouble with Park Factor
cubbieinexile
02-13-2005, 10:46 PM
Park Factors are a flawed mechanism when used to compare individual players from different teams. PF is a measurement of total runs at home compared to total runs away. Yet this number is then used to adjust individual players' batting lines: stats like OPS, AVG, home runs, runs and RBIs, as well as doubles, triples, hits, and walks. Of course the trouble with doing this is that it assumes that all these different events are affected by the park the same way as total runs, and that individual players all basically accumulate stats the same way. In other words, if a park increases scoring by 10% then it also increases home runs by 10%, as well as hits, doubles, and so on, and individual players will hit 10% more home runs at home than on the road.
By looking at the components we see that this isn't even close to the truth. In the chart below, the first stat listed is the traditional Park Factor, the one that is gathered by comparing runs at home against runs away. The next is for home runs done the same way, then hits, doubles, triples, and walks.
Park Name    ParkFac  HRFact  HitFac   2BFac   3BFac   BBFac
Rockies         121%    112%    112%    116%    133%    109%
Rangers         111%    104%    105%    104%    133%    100%
White Sox       107%    120%    105%     99%     94%    101%
Blue Jays       106%    106%    103%    104%    110%    105%
Cubs            106%    116%    102%    100%     96%     99%
Red Sox         106%     99%    105%    117%     80%     99%
Orioles         104%    103%    101%    102%     85%     98%
Giants          103%     95%    104%    105%    141%     97%
DBacks          103%    115%    103%    106%    115%    101%
Twins           102%     96%    101%     98%     87%     97%
Brewers         102%     99%     99%    108%    112%    106%
Phillies        101%    107%     99%     93%    118%    100%
Athletics       101%    104%     99%     99%     84%    103%
Astros          100%    104%    100%     95%    122%     96%
Braves          100%    106%     99%     97%     87%    101%
Mets             99%     90%    102%     99%     73%     99%
Angels           99%    103%    101%     93%     79%     98%
Indians          98%     87%     98%    106%     89%    106%
Cardinals        97%     90%    100%    102%    113%    103%
Tigers           96%     94%    100%     91%    142%     99%
Yankees          96%    101%     98%     94%     77%     97%
Pirates          96%     94%     99%    104%     90%     95%
DRays            96%    100%     97%     95%    102%    101%
Royals           96%     85%     99%     97%    121%     99%
Dodgers          95%    101%     98%     88%     80%     92%
Expos            95%     93%     97%    107%     94%    100%
Marlins          95%     99%     98%     97%    101%    103%
Reds             92%    102%     95%     93%     77%     94%
Padres           92%     85%     95%     95%    126%    102%
Mariners         92%    102%     93%    102%     74%    102%
Take the Red Sox as an example. Last year they had a PF of 106. Most people would then use that number to adjust a player's OPS and home run totals, saying something like: Player A batted .320 with an OPS of .950 and 40 home runs, but once we adjust it he is now batting .302 with an OPS of .896 and 38 home runs. But by looking at the components we see that the player's HR total should not have been adjusted downward (99 HR factor), that his hits were not increased by 106 but by 105, and that his walk total was not increased by 106 but decreased (99 walk factor). Last year before park factoring this player would have an OPS roughly 24% better than the league; after PF it would be 17% better than the league. But looking at his components we would see that his OPS would be at least 20% better than the league if we were to adjust each individual stat based on the park factor for that stat.
Of course all this ignores the other obvious flaws in PF, which are that it ignores what side of the plate you bat from, and whether the individual batter actually played the same ratio of home and away games that his team did and played every game. For example, if you have a player who played 130 games and missed 3 games at Coors Field, 3 games at Arlington, and 6 games at the Cell, his park factor adjusted numbers are going to be radically different from those of his teammates who did play those games.
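To make the playing-time point concrete, here is a rough sketch of a per-player park factor computed as a games-weighted average of park run factors. This is only one possible way to do it and is not something the standard PF does; the function name and both game logs are hypothetical, and the run factors are taken from the table above and treated as plain per-park multipliers.

```python
def player_park_factor(games_by_park, park_factors):
    """Games-weighted average of park run factors for one player (illustrative only)."""
    total_games = sum(games_by_park.values())
    if total_games == 0:
        raise ValueError("player has no games")
    weighted = sum(park_factors[park] * games for park, games in games_by_park.items())
    return weighted / total_games

# Run factors from the table above, written as ratios (1.21 == 121%).
PARK_FACTORS = {"Cubs": 1.06, "Rockies": 1.21, "Rangers": 1.11, "White Sox": 1.07, "Reds": 0.92}

# Hypothetical, shortened game logs: one player appears in every park,
# the other misses the road games in the three best hitters' parks.
played_everything = {"Cubs": 81, "Rockies": 3, "Rangers": 3, "White Sox": 6, "Reds": 9}
missed_road_trips = {"Cubs": 81, "Reds": 9}

pf_full = player_park_factor(played_everything, PARK_FACTORS)
pf_missed = player_park_factor(missed_road_trips, PARK_FACTORS)
print(f"played everything: {pf_full:.3f}, missed road trips: {pf_missed:.3f}")
```

The two teammates end up with different effective run environments even though a team-level PF would treat them identically.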
Northernclan
02-14-2005, 07:31 AM
Well,
Personally I'd like to see deeper center fields. Why? More triples and inside-the-park home runs. Stretching doubles into triples is exciting to watch, and with today's smaller parks, as compared to the deeper ones of earlier eras (case in point Yankee Stadium), you just don't see the three bagger enough. A stand-up triple! Wow! A bases clearing triple... will he go home or stay at third? This adds to the excitement of baseball.
ElHalo
02-14-2005, 09:05 AM
Cubbie,
So what would you rather do? Just leave it unadjusted and call a season at Coors the same as one at Fenway, and the same as one at Dodger Stadium?
Like anything else, park factor isn't perfect. But it's much, much better than nothing.
antihipster
02-14-2005, 09:53 AM
Cubbie,
So what would you rather do? Just leave it unadjusted and call a season at Coors the same as one at Fenway, and the same as one at Dodger Stadium?
Like anything else, park factor isn't perfect. But it's much, much better than nothing.
:clapping
Yes, ballpark factors are not perfect, just as any other stat.
Without a ballpark factor, Todd Helton and Larry Walker would be ranked close to Bonds, Ruth, Gehrig. Without a ballpark factor, all time lists would be skewed and have tons of asterisks and symbols for explanations.
cubbieinexile
02-14-2005, 10:29 AM
Cubbie,
So what would you rather do? Just leave it unadjusted and call a season at Coors the same as one at Fenway, and the same as one at Dodger Stadium?
Like anything else, park factor isn't perfect. But it's much, much better than nothing.
Is it much better than nothing?
Take a look at the Rockies PF. It is 121. Yet home runs and hits are 112. So reducing SLG by 121 would be wrong. In fact it would be way off. How is a number that has an air of authenticity but is in actuality horribly wrong better than nothing? At least with nothing the viewer knows the number is not honest.
What I expect people to do is the same thing I expect people to do whenever they want to analyze the players and the game, which is do the work. If you are going to say that player A is better than player B, don't just look at some ink scores and OPS+. Look at what type of player each one is, how that style of play is affected by his home park, and so on.
Whenever something like OPS+ is debated people always say they know about the limitations of Park Factor but it is the best out there, or something along those lines. Yet when it comes time to talk about players they always say things like Player A has an OPS+ of 117 and Player B has an OPS+ of 112, so Player A is better. They will spout that off and often they won't even know what the park factors were for each player. They are just spouting something they read on BRef. They don't know if the park factor is even accurate for the individual players. They ignore it all and blindly follow what is written on the page and just go by the larger number. It could be that the OPS+ 117 is unfairly getting a bonus while the OPS+ 112 is unfairly getting a penalty. Or it could be the opposite and the difference is really much larger than that.
In the end Park Factor measures game runs at home versus game runs away. Why somebody would use that to adjust individual players individual stats is beyond me. To me that is like trying to use a Canadian dollar to buy something in America. Yes the Canadian dollar has value but not in that environment.
ElHalo
02-14-2005, 10:34 AM
Is it much better than nothing?
Take a look at the Rockies PF. It is 121. Yet home runs and hits are 112. So reducing SLG by 121 would be wrong. In fact it would be way off. How is a number that has an air of authenticity but is in actuality horribly wrong better than nothing? At least with nothing the viewer knows the number is not honest.
You're not understanding how park factors work.
Yes, the PF is 121 for Coors. No, nobody (should) adjust SLG by 121. The PF is a measure of relative runs scored. Relative runs scored tends to be roughly proportionate to OBP * SLG, and thus you can use a PF for OPS with some degree of accuracy. If you're just adjusting SLG or OBP, then you wouldn't use the regular park factor, you'd use its square root. In this case, the square root of a 121 park factor is 110, so you'd use a 110 factor to adjust SLG or OBP.
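A quick sketch of the square-root step described above, assuming (as the post does) that runs scale roughly with OBP * SLG and that the park moves OBP and SLG by about the same proportion. The function name is made up for illustration.

```python
import math

def component_factor(run_park_factor):
    """Approximate factor for OBP or SLG alone, given a run-based park factor.

    Rests on the rough assumption that runs ~ OBP * SLG and that the park
    inflates both halves equally, so each half gets the square root.
    """
    return math.sqrt(run_park_factor)

coors_component = component_factor(1.21)   # run PF of 121 written as a ratio
print(f"{coors_component:.2f}")            # about 1.10, i.e. a 110 factor
```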
cubbieinexile
02-14-2005, 10:49 AM
:clapping
Yes, ballpark factors are not perfect, just as any other stat.
Without a ballpark factor, Todd Helton and Larry Walker would be ranked close to Bonds, Ruth, Gehrig. Without a ballpark factor, all time lists would be skewed and have tons of asterisks and symbols for explanations.
All time lists of what?
The only list where PF is used is OPS+. The rest are not park factored, and yet there are no asterisks and no need for explanations. People are not idiots; they understand environments. People know when they see Dante Bichette on a list that he achieved it with much help from Coors. People know that when they see Ed Delahanty on a list he achieved it because he got to play baseball right when they moved the mound back. People know when they see Bob Gibson on a list that he was helped by one of the greatest pitchers' eras since the deadball era. Same for Sandy Koufax. There is no need to put an asterisk on a stat accrued in 1930 or in 1893. If you know what is going on you know why these numbers were achieved.
Let's look at Barry Bonds. Barry Bonds from 2000 to 2003 actually got a bonus, and quite a substantial one, because he supposedly played in a park that was a pitchers' park. The PF was usually 91, which means his OBP and SLG were inflated by 9%. Here is his line:
Barry Bonds 2003 2002 2001 2000
Home +369/569/805 +351/564/750 +335/516/915 +321/449/741
Away +313/485/692 +386/596/842 +321/514/817 +291/431/633
Does it look like Barry Bonds' home stats were suppressed by 18% every year? Because that is what a PF of 91 is claiming. The 9% is spread over a schedule that is half road games, so it is claiming that offensive stats are suppressed by 18% at home.
cubbieinexile
02-14-2005, 10:55 AM
You're not understanding how park factors work.
Yes, the PF is 121 for Coors. No, nobody (should) adjust SLG by 121. The PF is a measure of relative runs scored. Relative runs scored tends to be roughly proportionate to OBP * SLG, and thus you can use a PF for OPS with some degree of accuracy. If you're just adjusting SLG or OBP, then you wouldn't use the regular park factor, you'd use its square root. In this case, the square root of a 121 park factor is 110, so you'd use a 110 factor to adjust SLG or OBP.
No you wouldn't. You are not understanding how OPS+ works. OPS+ does OBP and SLG separately and it uses basic Park Factor scores.
Adjusted OPS+
This value is calculated differently from the Total Baseball PRO+ statistic. I chose OPS+ to make this difference more clear. PRO+ as best I can tell is
PRO+ = 100 * ( OBP/lgOBP + SLG/lgSLG - 1)/BPF
Where lgOBP and lgSLG are the slugging and on-base percentage of a league-average player, and BPF is the batting park factor. This takes into account the difference in runs scored in a team's home and road games, so it doesn't depend on how good an offense or defense a team has.
My method is slightly more complicated, but I think it is more correct. The BPF is set up for runs and the way it is implemented in PRO+ applies it to something other than runs.
My method
1. Compute the runs created for the league with pitchers removed (basic form): RC = (H + BB + HBP)*(TB)/(AB + BB + HBP + SF)
2. Adjust this by the park factor: RC' = RC*BPF
3. Assume that if hits increase in a park, BB, HBP, and TB increase at the same proportion.
4. Assume that Outs = AB - H (more or less) do not change at all, as outs are finite.
5. Compute the number of H, BB, HBP, TB needed to produce RC'; this involves the quadratic formula. The idea for this came from the Willie Davis player comment in the Bill James New Historical Baseball Abstract. I think some others, including Clay Davenport, have done some similar things.
6. Using these adjusted values, compute what the league average player would have hit (lgOBP*, lgSLG*) in a park.
7. Take OPS+ = 100 * (OBP/lgOBP* + SLG/lgSLG* - 1)
Note, in my database, I don't store lgSLG, but store lgTB and similarly for lgOBP and lg(Times on Base), this makes calculation of career OPS+ much easier.
That was from the glossary of BRef. At the end it credits Total Baseball park factors, and it claims that they (BRef) park factor the run environment and then assume that everything else happens at the same rate. Which obviously it doesn't.
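Here is a minimal sketch of how the glossary steps could be worked through. It is my reading of the description, not BRef's actual code: a single scale factor k is applied to H, BB, HBP and TB, outs (AB - H) are held fixed, and SF is also held fixed (an assumption, since the glossary doesn't say). The league totals are hypothetical.

```python
import math

def park_adjusted_league(h, bb, hbp, tb, ab, sf, bpf):
    """Return (k, lgOBP*, lgSLG*) for a park with run factor bpf (1.21 == 121)."""
    on_base = h + bb + hbp          # times on base
    fixed = ab - h + sf             # outs plus SF, assumed not to change with the park
    rc = on_base * tb / (on_base + fixed)   # basic runs created
    rc_target = rc * bpf                    # RC' = RC * BPF
    # Scale H, BB, HBP and TB by a common k and require the scaled line to produce RC':
    #   (k*on_base) * (k*tb) / (k*on_base + fixed) = rc_target
    #   => (on_base*tb)*k^2 - (rc_target*on_base)*k - rc_target*fixed = 0
    a = on_base * tb
    b = -rc_target * on_base
    c = -rc_target * fixed
    k = (-b + math.sqrt(b * b - 4 * a * c)) / (2 * a)   # positive root
    lg_obp = k * on_base / (k * on_base + fixed)
    lg_slg = k * tb / (ab - h + k * h)   # AB grows only by the extra hits
    return k, lg_obp, lg_slg

# Hypothetical league totals, for illustration only.
LEAGUE = dict(h=1450, bb=550, hbp=60, tb=2300, ab=5500, sf=45)
k, lg_obp_adj, lg_slg_adj = park_adjusted_league(bpf=1.21, **LEAGUE)
print(f"k={k:.3f}  lgOBP*={lg_obp_adj:.3f}  lgSLG*={lg_slg_adj:.3f}")
```

Note that k comes out smaller than the run factor itself, because runs created grows faster than the individual events do.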
ElHalo
02-14-2005, 11:06 AM
No you wouldn't. You are not understanding how OPS+ works. OPS+ does OBP and SLG separately and it uses basic Park Factor scores.
That was from the glossary of BRef. At the end it credits Total Baseball park factors, and it claims that they (BRef) park factor the run environment and then assume that everything else happens at the same rate. Which obviously it doesn't.
Yes, it does assume that everything happens at the same rate, but for much simpler purposes than you're claiming. It assumes that so that a park factor for OBP and SLG can be used independently of having to do one for BB, H, TB, etc. While this isn't precisely true, it's not going to be horrendously off either.
And what I said was indeed true. Look at the formula for OPS+ again.
100 * (OBP/lgOBP + SLG/lgSLG - 1) / PF.
Note that it uses relative OBP + relative SLG. Roughly speaking (again, this is all rough work, no exacts used here), OBP + SLG will be distinctly proportionate to OBP * SLG. I.e., as one goes up, the other goes up, pretty constantly.
Now. We know that park factors will be fairly accurate at determining OBP * SLG differences for a given park (not exact, again, but fairly accurate). We know that OBP + SLG will be roughly proportionate to OBP * SLG. Therefore, we can say that PF will be fairly accurate at determining OBP + SLG.
However, it's not accurate at determining OBP or SLG independently of each other. Since the park factor works with runs, and runs are roughly proportional to OBP * SLG, then for PF`, the park factor of just OBP, say, we'd need to divide PF by the park's effect on SLG to get an accurate measure. Since the effects on OBP and SLG will tend to be roughly equal (again, this is all in approximates, but a .350 OBP is numerically a rough equivalent of a .450 SLG), we take the square root of the PF to get PF`.
You follow? So yes, it assumes, for one small part of the math, that hits and walks will change at the same rate, which they won't. However, we NEVER use that 121 park factor for Coors' to determine an SLG factor. Note where he talks about using the quadratic formula to compute lgOBP* and lgSLG*. This is not the same as the formula used for lgOPS*.
cubbieinexile
02-14-2005, 11:13 AM
Actually BRef no longer claims that the other events happen at the same rate, like they used to. It appears that BRef changed its OPS+ within the last year or so. They used to go by the Total Baseball method, and they used to park factor OBP and SLG separately. I know because I took part in the discussion on Baseball Primer when he was setting up OPS+ for the first time for his site.
What they do now is park factor the runs created of the league average. They then use this difference to create a stat line. It cannot go up or down by the same rate because there is not a one to one relationship between hits and runs created. So they use a quadratic formula to find out how many offensive events must occur to make the new runs created. Once they got that they can figure out LgOBP and LgSLg.
cubbieinexile
02-14-2005, 11:44 AM
Yes, it does assume that everything happens at the same rate, but for much simpler purposes than you're claiming. It assumes that so that a park factor for OBP and SLG can be used independently of having to do one for BB, H, TB, etc. While this isn't precisely true, it's not going to be horrendously off either.
How so? Coors park factor is 121. Coors triples factor is 133. Coors home run factor is 112.
The data is now out there, so why shouldn't we look at each stat separately? Why should we assume that a park affects lefties and righties equally?
Look at Dodger Stadium: it hurts everything except home runs. Don't you think that a power fly ball hitter will be affected by the park differently than a contact line drive hitter? The power hitter's slugging is not going to be affected much if all he does is walk and hit homers (McGwire type), while a player like Tony Gwynn is going to see his stats affected much more. In fact it is entirely possible for McGwire to hit more home runs at Dodger Stadium than he would have hit at Busch Stadium. McGwire and his style of play are least likely to be affected by Dodger Stadium, yet he would get the same bonus as Gwynn.
cubbieinexile
02-14-2005, 12:16 PM
The more I look at BRef's new OPS+ the more concerned I get. They take a park factor based on actual runs then use that on hypothetical runs. Runs Created is generally around 5% or more off on actual runs. It would be interesting to see what the Park Factors would be if we used runs created instead of actual runs. How much of a difference there would be or if there is any difference at all.
cubbieinexile
02-15-2005, 01:42 AM
I just crunched some numbers for Coors Field last year and found some things out. Coors Field had an OBP factor of 107, a SLG factor of 110, and a run factor of 121.
So for instance if we were to figure OPS+ using the most common method out there, which is the Total Baseball method (100 * (OBP/lgOBP + SLG/lgSLG - 1)/BPF), then Helton's OPS+ would be 156.
It would be 156 because the park factor is based on runs and not the components. If we were to use the component PFs, Todd Helton's OPS+ would be 166.
To me that is a big difference, one that should not be ignored. By taking the easy way out the numbers are artificially lowered by 10 points.
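The gap between the two approaches can be sketched like this. The first function is the Total Baseball formula quoted in the post. The second is only a guess at what "using the component PFs" means (scaling league OBP and SLG by their own factors), since the post does not spell out the arithmetic. The player and league lines are hypothetical, not Helton's actual numbers; the factors are the Coors figures from the post.

```python
def ops_plus_run_factor(obp, slg, lg_obp, lg_slg, bpf):
    """Total Baseball style: divide the whole thing by the run-based park factor."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1) / bpf

def ops_plus_component(obp, slg, lg_obp, lg_slg, obp_factor, slg_factor):
    """Assumed component version: park-adjust league OBP and SLG separately."""
    return 100 * (obp / (lg_obp * obp_factor) + slg / (lg_slg * slg_factor) - 1)

# Hypothetical hitter and league averages, for illustration only.
player = dict(obp=0.450, slg=0.600)
league = dict(lg_obp=0.330, lg_slg=0.420)

run_based = ops_plus_run_factor(**player, **league, bpf=1.21)
component = ops_plus_component(**player, **league, obp_factor=1.07, slg_factor=1.10)
print(f"run factor: {run_based:.1f}, component factors: {component:.1f}")
```

With these inputs the run-factor version comes out lower, which is the same direction as the 156 versus 166 gap described above.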
cubbieinexile
02-15-2005, 01:52 AM
I took a look at what the park factor would be for Coors if we only used a basic runs created formula. The Park Factor would be 122 compared to 121. Not a big deal, though I have no idea if this is consistent for every team. So far I have only run numbers for the 2004 Rockies.
BRef's new way of doing OPS+ has intrigued me. I had no idea they switched around their formula. They must have done it only a few months ago. I like that they are now only comparing players' OPS against other hitters and not pitchers. I think that is a plus. For instance, Todd Helton using the old method has an OPS+ of 156. Only comparing Todd to other hitters, he has an OPS+ of 148.
I was skeptical of the runs created part at first, but it seems to me that it is more accurate than the Total Baseball way. Using runs created to come up with a park adjusted OBP and SLG gets you this.
OBP: .368
SLG: .475
Using the real data accrued during the season gets you a park adjusted league average of this:
OBP: .365
SLG: .481
Not bad. Using the runs created method gives Todd Helton an OPS+ of 158, and using real data gives him an OPS+ of (drumroll please) 158. The same, but that is of course luck. Or at least I think it is. It just so happens that one went up enough and the other went down enough to cancel each other out. Normally there is going to be a couple points' difference between the two numbers. Possibly more: it depends on whether runs created always overvalues OBP and undervalues SLG. If it doesn't, meaning it is possible for RC to undervalue both or overvalue both at the same time, the difference can get even higher. I haven't done enough numbers yet to see what exactly happens. Anyway, if we were to do this with Vinny Castilla his RC OPS+ would be 103 and his real data OPS+ would be 102. Total Baseball's OPS+ would be 105.
Does this mean I like OPS+ now? No it doesn't. What I am showing is that PF is highly subjective. Using Total Baseball's method we get a number that is much different than the other methods. Even if we adjust TB's method by not including the pitchers we get a number that is off by a few points from both methods as well. TB's OPS+ for Vinny would be 99, Helton's would be 148. One off by 3 to 4 points, the other off by 10 points. Using the newer quadratic formula method is better but it still comes up with different numbers. Finally, park factors, even the ones I am using, still ignore the platoon factor and the playing time factor. Last year Vinny did not play in 14 games; 57% of the games he missed were road games. What about somebody who played even less, like Matt Holliday, who missed 41 games and got to play a higher share of home games than his team? What about players who miss the first month of the season versus players who miss the last part of the season?
vinay
03-01-2005, 11:13 PM
The data is now out there, so why shouldn't we look at each stat separately? Why should we assume that a park affects lefties and righties equally?
When you use an overall run factor, you are not assuming that the park affects everybody equally. OPS+ is not designed to measure what a player would do if magically moved to a neutral park. Instead, it compares what the player did to what an average player would do in his own park. It measures the value of his performance; using an overall park factor adjusts the value of the runs the player created.
If you are attempting to make projections for how players will do after changing ballparks, then I agree that component factors are better (especially those separating lefties and righties). But it is not inherently wrong to use overall park factors.
cubbieinexile
03-02-2005, 10:57 AM
OPS+ is attempting to compare the average to what an individual did in his environment. By using PF you are not actually comparing real environments but theoretical environments. A PF of 110 means that the stadium had a run environment 20% higher than average. Now then, what happens when we take a player from that stadium who didn't play in every game, or had a playing style that was not as conducive to that stadium's strengths? Let's say that after looking into it we find that he had a PF of 103, or a run environment that was 6% higher than average. By using the PF of the stadium as a whole we are ignoring what actually happened to the individual player. In fact we are penalizing him for no real reason. Using PF was supposed to remove the bias of players' home stadiums, but by using the general PF we are in fact making it possible that an even bigger bias can occur.
PopTop
03-03-2005, 03:57 PM
What I am showing is that PF is highly subjective.
I'm not sure that's the right word, subjective. Deficient and still wet behind the ears would be how I look at it, a work in progress. Several SABR formulas changed in their infancy, and we even see different criteria used by different people today with some calculations. And it's only one number like any other column, none of which are all-knowing by themselves. If you don't look at home-road splits and have a working knowledge of all the parks a player performs in, it's useless. If you didn't see games in the Astrodome in both 1965 and 1999, and points in between, you'd really never know how it went from probably the toughest hitters' park in the game to not far off average by the end.
antihipster
03-03-2005, 07:58 PM
I have my own variation to OPS+, which I call Era Adjusted Value. I add in a Ballpark adjustment.
Here is an example : Dick Allen
381 OB% + 534 SLG%= .915 OPS
The league average during his career was
356 OB% + 346 SLG%=.702
Ballpark adjustment=100.133
Dividing Allen's career stats by the league average stats during his career documents the relative value of the ballplayer (100% and over means above average, 120%+ means great player). This is my own system, which I call Era Adjusted Value, or EAV for short. The difference between the OPS+ system and mine is that I add up the individual's OPS and the league OPS and divide one by the other. Then I divide the result by the ballpark factor. As a result, the calculation is lower than the OPS+ method, because I skip subtracting one from the calculation. I think this is a more accurate measurement of value.
OB EAV=107.022 [106.880 w/ballpark adjustment]
SLG EAV=154.335 [154.130 w/ballpark adjustment]
OPS EAV=130.342 [130.169 w/ballpark adjustment]
cubbieinexile
03-03-2005, 09:53 PM
Where did you get your numbers? You have Dick Allen's OBP wrong and his league average numbers wrong. I have him at .912 OPS against a .707 OPS.
Your EAV will always be lower because OPS is a combination of two different formulas that use two very different ranges. OBP is a measurement that goes from 0 to 1, while SLG is a measurement that goes from 0 to 4. EAV really doesn't tell you anything more than OPS+ does.
antihipster
03-03-2005, 10:24 PM
Where did you get your numbers? You have Dick Allen's OBP wrong and his league average numbers wrong. I have him at .912 OPS against a .707 OPS.
Your EAV will always be lower because OPS is a combination of two different formulas that use two very different ranges. OBP is a measurement that goes from 0 to 1, while SLG is a measurement that goes from 0 to 4. EAV really doesn't tell you anything more than OPS+ does.
The TOTAL BASEBALL reference book has him listed at .914 OPS, and for the league I add up the league average OPS over the years and divide. That is my data source. I just double checked through Baseball Reference and CNN/SI and you are correct about Allen's OPS.
While you say OPS+ is no different from EAV OPS, here are the differences between the top twenty.
Out of curiosity, what is your reference?
Top 20 OPS+
1. Babe Ruth+ 207 L
2. Ted Williams+ 190 L
3. Barry Bonds (39) 184 L
4. Lou Gehrig+ 179 L
5. Rogers Hornsby+ 175 R
6. Mickey Mantle+ 172 B
7. Dan Brouthers+ 170 L
Joe Jackson 170 L
9. Ty Cobb+ 167 L
10. Jimmie Foxx+ 163 R
Mark McGwire 163 R
12. Pete Browning 162 R
Frank Thomas (36) 162 R
14. Dave Orr 161 R
15. Stan Musial+ 159 L
16. Hank Greenberg+ 158 R
Johnny Mize+ 158 L
Tris Speaker+ 158 L
19. Dick Allen 156 R
Willie Mays+
Eav OPS
1)166.910 [Babe Ruth] 1.169 [1914-35]10,461
2)149.020 [Lou Gehrig] 1.079 [1923-39]9,509
3)148.799 [Barry Bonds] 1.054 [1986-*]11,400
4)145.534 [Ted Williams] 1.117 [1939-1960]9,727
5)144.067 [Mickey Mantle] .980 [1951-68]9,835
6)140.559 [Joe Jackson] .940 [1908-20]5,500
7)139.997 [Rogers Hornsby] 1.011 [1915-37]9,211
8)139.622 [Dan Brouthers] .942 [1879-96]7,622
9)137.754 [Jimmie Foxx] 1.037 [1925-45]9,586
10)137.055 [Ty Cobb] .945 [1905-28]12,683
11)136.620 [Mark McGwire] .986 [1986-2001]7,504
12)134.826 [Johnny Mize] .956 [1936-53]7,299
13)134.734 [Pete Browning] .870 [1882-94]5,341
14)134.654 [Willie Mays] .944 [1951-73]12,345
15)134.583 [Joe DiMaggio] .977 [1936-51]7,611
16)133.899 [Hank Aaron] .932 [1954-76]13,748
17)133.345 [Mike Piazza] .947 [1992-*]6,471
18)133.023 [Chuck Klein] .922 [1928-44]7,087
19)133.020 [Frank Robinson] .929 [1956-76]11,426
20)132.742 [Mel Ott] .798 [1926-47]11,164
cubbieinexile
03-03-2005, 10:34 PM
Baseball Encyclopedia. If you were to use your league averages then Dick Allen would have to be playing in the 1890s to have a league OBP that high, and then also be playing in the deadball era to have a SLG average that low.
How did you get Lou Gehrig that high? If I just compare his OPS to his league OPS I get 138, and even when I park factor it I only get it up to 141.
But more importantly what exactly is EAV telling you that OPS+ is not?
antihipster
03-03-2005, 11:00 PM
Baseball Encyclopedia. If you were to use your league averages then Dick Allen would have to be playing in the 1890s to have a league OBP that high, and then also be playing in the deadball era to have a SLG average that low. :noidea
How did you get Lou Gehrig that high? If I just compare his OPS to his league OPS I get 138, and even when I park factor it I only get it up to 141. :noidea
I found some data on the internet about league average. [baseball reference]
antihipster
03-03-2005, 11:15 PM
But more importantly what exactly is EAV telling you that OPS+ is not?
I believe EAV OPS uses better math as opposed to a shortcut method. When adding up the SLG% and the OB% and then subtracting one from the formula, you are not using the correct margin, which in my opinion should be the league average.
Eventually, I plan on adding a baserunning component to this stat.
cubbieinexile
03-04-2005, 01:52 AM
I believe EAV OPS uses better math as opposed to a shortcut method. When adding up the SLG% and the OB% and then subtracting one from the formula, you are not using the correct margin, which in my opinion should be the league average.
Eventually, I plan on adding a baserunning component to this stat.
Actually EAV would be the shortcut method. EAV just looks at OPS in general and not the components. Your method shortchanges OBP because OBP is on a different scale. For instance, take two players: Player A has an OBP of .500 and a SLG of .500, Player B has an OBP of .400 and a SLG of .600. Both have an OPS of 1.000. We'll say the league average is .300 for OBP and .400 for SLG, for an OPS of .700. Your method would give both players the identical 143 EAV. Using OPS+, Player A would have an OPS+ of 191 and Player B would have a 183. Why? Because it is a lot harder to be .200 points over league average in OBP than it is in SLG. Because again, SLG is on a scale 4 times greater than OBP. All you are really doing is splitting the difference between OBP and SLG. For instance, in the two players above, Player A is 66% better than the league in OBP and 25% in SLG; Player B is 33% and 50%. The difference between the first two numbers is 41 and between the second two is 17. Split them in half and subtract them from the larger number or add them to the smaller number and you have a close approximation of EAV. It works on real players as well. For instance the EAV of Sammy Sosa is 117. His OBP is 103 and his SLG is 129. Difference of 26, split it and you get 13, which comes out to 116. So basically what you are doing is splitting the difference while OPS+ adds them.
Also I used B-Ref's league average park factored OPS and it still comes out to 141. What exactly are you doing because I have yet to find a number that is consistent with anything I can find?
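The Player A / Player B comparison can be checked with a few lines. The EAV function follows the description given earlier in the thread (player OPS divided by league OPS, park adjustment left out); the function names are just for illustration.

```python
def eav(obp, slg, lg_obp, lg_slg):
    """EAV as described in the thread: player OPS over league OPS, times 100."""
    return 100 * (obp + slg) / (lg_obp + lg_slg)

def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ without a park adjustment: relative OBP plus relative SLG, minus one."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)

LEAGUE = dict(lg_obp=0.300, lg_slg=0.400)
player_a = dict(obp=0.500, slg=0.500)
player_b = dict(obp=0.400, slg=0.600)

for name, line in (("A", player_a), ("B", player_b)):
    print(name, f"EAV={eav(**line, **LEAGUE):.1f}", f"OPS+={ops_plus(**line, **LEAGUE):.1f}")
# EAV is about 142.9 for both; OPS+ is about 191.7 for A and 183.3 for B.
```

EAV cannot tell the two hitters apart because it only sees the combined OPS, while OPS+ gives more credit to the OBP-heavy line.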
antihipster
03-04-2005, 08:08 PM
Actually EAV would be the shortcut method. EAV just looks at OPS in general and not the components. Your method shortchanges OBP because OBP is on a different scale. For instance, take two players: Player A has an OBP of .500 and a SLG of .500, Player B has an OBP of .400 and a SLG of .600. Both have an OPS of 1.000. We'll say the league average is .300 for OBP and .400 for SLG, for an OPS of .700. Your method would give both players the identical 143 EAV. Using OPS+, Player A would have an OPS+ of 191 and Player B would have a 183. Why? Because it is a lot harder to be .200 points over league average in OBP than it is in SLG. Because again, SLG is on a scale 4 times greater than OBP. All you are really doing is splitting the difference between OBP and SLG. For instance, in the two players above, Player A is 66% better than the league in OBP and 25% in SLG; Player B is 33% and 50%. The difference between the first two numbers is 41 and between the second two is 17. Split them in half and subtract them from the larger number or add them to the smaller number and you have a close approximation of EAV. It works on real players as well. For instance the EAV of Sammy Sosa is 117. His OBP is 103 and his SLG is 129. Difference of 26, split it and you get 13, which comes out to 116. So basically what you are doing is splitting the difference while OPS+ adds them.
Also I used B-Ref's league average park factored OPS and it still comes out to 141. What exactly are you doing because I have yet to find a number that is consistent with anything I can find?
I get your point and I will use the OPS+ formula, but with a decimal point and three more numerals (i.e. Aaron OPS+ 155/EAV 154.879), as I am putting lots of players through the wringer and then rating them. Then I will add a baserunning element to this formula. I suspect I will eventually abandon the OPS+ formula once I logically understand another system. This will be an ongoing project.
I was using a calculator and doing my own math. I think "Total Baseball" has a lot of mistakes in the percentages, plus some categories are left out [HBP, GDP].
I am pretty new at this, as I was introduced to sabermetrics last spring.
four tool
03-08-2005, 05:46 AM
Unless these are adjusted for left- and right-handed hitters, they are even more flawed. Right-handed hitters are normally hurt by Yankee Stadium, for instance.
Here are 5-consecutive-year peak OPS+ and single-season best. In parentheses is the number of years in which Aaron surpassed each player's career-best OPS+. Aaron tops all but Joe Jackson in 5-year peak and all of them in single-season best, and he had at least 3 years above every other player's career best except for Kiner, Jackson, Albert Belle and George Sisler.
For example, Kiner's 5 best consecutive year average was 169 (versus 172 for Aaron) and his career best was 184 (which Aaron surpassed once).
Aaron basically at least matched the peaks of these players, but maintained that level for at least 3 times longer (19 straight years of 140+, 15 at 150+, 10 at 160+, 6 at 170+). He was also a good defensive outfielder and a good base-runner. The fact that he basically matches these "shooting stars" in peak value is quite a testament.
Hank Aaron: 172/194
Ralph Kiner: 169/184 (1)
Nomar Garciaparra: 141*/158 (10)
Joe Jackson: 176/193 (1)
Albert Belle: 159/191 (1)
Larry Walker: 161*/177 (3 + tie)
Duke Snider: 161/172 (5)
Chuck Klein: 160/175 (5)
Johnny Mize: 171/178 (3 + tie)
Hank Greenberg: 169/172 (4)
Hack Wilson: 160 CB/178 (3 + tie)
George Sisler: 160/181 (1 + tie)
Elmer Flick: 149/172 (5)
Gavvy Cravath: 160/172 (5)
Pete Reiser: 131/165 (7)
O'Doul: 147/163 (8)
Brett: The official major league encyclopedia for 2001 has Aaron's career-best OPS+ at 190, for 1971. However, I note that Baseball-Reference.com lists Aaron's OPS+ for the 1971 season at 194. Is that where you got the 194 figure from? Wonder why there's such a difference between Baseball Reference and the official encyclopedia? Aaron was playing in 1971 (at age 37) in what was referred to at the time as the "launching pad" in Atlanta. Davy Johnson had been the second baseman for the Orioles for many years, and his two best home run seasons with them were 18 and 10. After Johnson was traded to Atlanta, his very first season in the "launching pad" resulted in his hitting 43 home runs. That's an example of how much Aaron's park helped hitters in those days. Aaron was a great hitter and great player; however, it should be realized that Aaron's ability to continue to post big numbers late in his career was aided a great deal by playing in the "launching pad".
c JRB
538280
11-19-2006, 11:15 AM
I've noticed that difference in OPS+ between BBRef and the Encyclopedia as well, JRB, not only for Aaron but for many players. That is because it is calculated differently. The OPS+ in the Encyclopedia is the same as Total Baseball's PRO+, referenced here:
http://www.baseball-reference.com/about/bat_glossary.shtml
Scroll down to the "Adjusted OPS+" heading and you'll see the explanation.
538280
11-19-2006, 12:05 PM
The OPS+ used in Total Baseball is also adjusted for park factors. That link that I gave explains exactly why they are different.
On page 2498 of the 2001 edition of the Official Encyclopedia they make it clear that they are adjusting for home park and normalizing to league average to come up with their OPS+ number. According to the link that Chris posted, Baseball Reference is apparently using a different formula to come up with their numbers. They claim their method is more "complicated", which immediately gives one pause. Aaron's lifetime OPS+ number is actually better in the Official Encyclopedia (156) than it is on BB-Ref (155). Incidentally, Cobb's lifetime OPS+ (167) is the same in both. Without further information, I would be inclined to use the numbers from the Official Encyclopedia, though I can see where the other source is more convenient to access.
EvanAparra
11-19-2006, 12:17 PM
Any noticeable differences in player totals from bbref to Encyclo?
538280
11-19-2006, 12:19 PM
JRB, although BBRef's method is more complicated, it is also almost indisputably the better way to go about it. The Encyclopedia's method basically applies the park factor to something it isn't really a factor for: OBP and SLG. It is actually a factor for runs, and BBRef's method applies it to actual runs. BBRef's method is a slight improvement on the Encyclopedia's.
538280
11-19-2006, 12:20 PM
There are VERY few cases where it is much different at all. Most of the time it's one point. I've only come across one player whose OPS+ is two points different: Gene Tenace, whom the Encyclopedia has at 137 but BBRef has at 135.
AstrosFan
11-19-2006, 12:21 PM
Baseball Reference argues that because the park factors used are run-based, an OPS+ formula needs to be derived from runs. So Sean looks at how the park factors affect run scoring, extrapolates the elements in basic runs created (H, BB, TB, AB) using the quadratic formula, and after making the adjustments, calculates OPS+. If you do it the Encyclopedia way, you're assuming the park factor for OPS is the same as for runs.
It's based on the Willie Davis comment in the New Historical Baseball Abstract. IPod showed how to do it in a post here at Fever, but I can't seem to find it.
brett
11-19-2006, 12:37 PM
OK, after reading that a few times, here's the deal.
Total Baseball and others adjust for park factors after computing relative SLG% and OB% and adding them together. In effect, this divides the -1 by the park factor.
Baseball Reference computes relative SLG% and OB%, and then adjusts for park effects (and does each one separately) and THEN subtracts 1. Therefore, they are subtracting 1, while the other sources are subtracting something slightly more or less than 1 because it already got divided by park effects. This would give hitters in poor parks a fractional boost, and those in better parks a fractional cut.
Clearly, Baseball Reference's method is more proper. (although both stats are pretty arbitrary).
The OPS+ used in Total Baseball is also adjusted for park factors, Bill. That link that I gave explains exactly why they are different. That is not why.
brett
11-19-2006, 12:43 PM
Yes, Baseball Reference converts park effects back from runs to extrapolated bases. This also makes sense because the park effect on OPS is not equal to the park effect on runs. Still, he does not divide the -1 by the park effects, which is also an improvement.
brett
11-19-2006, 12:51 PM
So basically there are two problems with the baseball encyclopedia method.
First, they use a run-based park adjustment rather than a SLG- and OB%-based park adjustment.
Second (I gather) they divide the -1 by the park factor, which is even worse (in principle, but less in effect).
Brett: Among reference works, is baseball reference in the minority in their approach? What is the importance of adjusting SLG% and OB% for park effects separately? When you say "poor parks" are you referring to "poor hitters' parks? Thank you.
brett
11-19-2006, 02:40 PM
Yes, because they actually understand the statistical flaws, but yes. These stats are loosely defined though (like total average). OPS+ is the combined relative margin of league/park-adjusted on-base and slugging percentage (plus 1), or combined relative league/park-adjusted on-base + slugging percentage.
How you figure park and league effects is open to research.
Dividing the -1 by the park effect is just plain wrong though, and Baseball Reference does not do that, but granted, its effect would be maybe .05 for the most extreme examples.
brett
11-19-2006, 02:45 PM
I was wrong. A good hitter's park would produce a slight improper edge to hitters. Let's say relative OPS gets multiplied by a factor of .95 to make up for a good hitter's ballpark; the -1 would also get multiplied by .95 (to -.95), so the hitter would end up +.05 in the final score.
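The .05 edge can be reproduced with a small sketch. It only mirrors the description in this post, where the park correction is a single multiplier; it is not meant as either site's exact formula, and the function names and the 2.6 relative OPS are made up for illustration.

```python
def adjust_everything(rel_ops, factor):
    """Park factor applied after the 1 is subtracted, so the -1 gets scaled too."""
    return (rel_ops - 1) * factor


def adjust_then_subtract(rel_ops, factor):
    """Park factor applied to relative OPS only; the 1 is subtracted afterwards."""
    return rel_ops * factor - 1


# Hypothetical hitter: relative OBP + relative SLG = 2.6, in a park whose
# correction multiplier is 0.95 (a good hitter's park).
rel_ops, factor = 2.6, 0.95

scaled = adjust_everything(rel_ops, factor)
clean = adjust_then_subtract(rel_ops, factor)
print(round(scaled, 3), round(clean, 3), round(scaled - clean, 3))
# 1.52 1.47 0.05
```

The gap is always 1 - factor, whatever the hitter's numbers are, which is why it only matters for the most extreme parks.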
Bench 5
11-19-2006, 08:02 PM
Comparing the two methods of OPS+, I am inclined to go with the Encyclopedia's version. In the big picture the differences are minimal, but at least PRO+ uses actual runs rather than an approximation (RC).
BBR estimates runs created for the league using the RC formula. Runs created is a good way to approximate the number of runs that a player, team or league would score based upon AB, H, BB, and TB. At the league level runs created will usually be within 1% of actual runs, so that's not a big deal. BBR states in number 2 that it adjusts the RC by the BPF. BPF is based upon runs scored, so he is using the same element that PRO+ uses in its formula. In order to calculate the Slg and Oba for the park, BBR backs into the number of park-adjusted hits, walks, AB, total bases etc. by using the quadratic formula. I think that by adding formulas that are estimates, they increase the likelihood of inaccuracy. BBR's numbers should always be close to PRO+ because runs created is a close approximation of runs scored.
BBR states that it is more accurate because they are comparing Slg and Oba to the park adjusted Slg and Oba. But in order to arrive at the park adjusted Slg and Oba, they use runs created as the basis for the approximation of park adjusted Slg and Oba. PRO+ applies the park factor directly to the league Slg and Oba. It's cleaner in my opinion. BBR makes an attempt to back into the numbers by using formulas that are intended to approximate runs, hits, etc.
Based upon the BBR formula stated below I see no reason why this is superior to PRO+.
"My method is slightly more complicated, but I think it is more correct. The BPF is set up for runs and the way it is implemented in PRO+ applies it to something other than runs.
My method
1. Compute the runs created for the league with pitchers removed (basic form) RC = (H + BB + HBP)*(TB)/(AB + BB + HBP + SF)
2. Adjust this by the park factor RC' = RC*BPF
3. Assume that if hits increase in a park, that BB, HBP, TB increase at the same proportion.
4. Assume that Outs = AB - H (more or less) do not change at all as outs are finite.
5. Compute the number of H, BB, HBP, TB needed to produce RC', involves the quadratic formula. The idea for this came from the Willie Davis player comment in the Bill James New Historical Baseball Abstract. I think some others, including Clay Davenport have done some similar things.
6. Using these adjusted values compute what the league average player would have hit lgOBP*, lgSLG* in a park.
7. Take OPS+ = 100 * (OBP/lgOBP* + SLG/lgSLG* - 1)
Note, in my database, I don't store lgSLG, but store lgTB and similarly for lgOBP and lg(Times on Base), this makes calculation of career OPS+ much easier. "
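The quoted method says the adjustment "involves the quadratic formula" without writing it out. One way to read it, assuming H, BB, HBP and TB all scale by a common factor k while outs stay fixed, is sketched below. Holding SF fixed as well is my own assumption, since the quoted text doesn't say how SF is treated, and the symbols A, O and k are shorthand introduced here.

```latex
\begin{aligned}
A &= H + BB + HBP, \qquad O = AB - H \\
RC' &= \frac{(kA)(k\,TB)}{O + kA + SF} \\
0 &= A\,TB\,k^{2} - RC'\,A\,k - RC'\,(O + SF) \\
k &= \frac{RC'\,A + \sqrt{(RC'\,A)^{2} + 4\,A\,TB\,RC'\,(O + SF)}}{2\,A\,TB}
\end{aligned}
```

As a sanity check, with BPF = 1 we have RC' = RC and k = 1 satisfies the third line, so a neutral park leaves the league totals unchanged.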
brett
11-19-2006, 09:26 PM
Except that slugging percentage and on-base% statistically correlate directly to "runs created", and only closely correlate to actual runs. Also, there is still the problem that PRO+ multiplies the -1 by the park effect. In other words, they take relative SLG plus relative OB%, then subtract 1, then multiply by the park effect, whereas Baseball Reference takes relative SLG + OB% and factors in park effects at that point (granted, based on RC rather than actual runs) and then subtracts the "1" to put it on a 1 +/- scale.
brett
11-25-2006, 05:14 PM
Park Factors are a flawed mechanism
I agree. For a stretch of several years in Colorado, it was basically the case that there were nearly 50% more runs scored there, but BA was only about 20% higher and HRs were about 40% higher.
538280
11-25-2006, 05:43 PM
If you want actual value though, it's not really misleading, because more runs in the park dilutes the value of each run contributed. It doesn't matter how or why those runs are scored, just that they are, if you are going strictly on value.
AstrosFan
11-25-2006, 06:50 PM
For those who are interested, here's the computation of OPS+. I'll use David Ortiz 2006 as an example.
Baseball-Reference takes pitchers out of the equation. That would require me to do some Access queries, so I won't. They also use HBP and SF in their RC equation. I am taking those out and using the very basic form, ((H+BB)*TB)/(AB+BB)
In 2006, there were 21572 H, 7247 BB, 34293 TB, 78497 AB, for 11526.05 RC. Multiplying this by a park factor in Boston of 1.02, we get 11756.57.
Relating everything to H, the equation is now:
((H+.34H)*1.59H)/(56925+H+.34H)=11757
or
(1.34H*1.59H)/(56925+1.34H)=11757
or
1.34*1.59*H^2=11757*56925+1.34*11757H
or
2.12H^2-15706.13H-669243032=0
Note that when I solve for everything, the figures will not be rounded. Note also that the equation above is in standard quadratic form, and can be solved using the quadratic formula.
(15706.13 +/- sqrt((-15706.13)^2-4*2.12*(-669243032)))/(2*2.12)
The +/- has to be a plus, since the minus sign would give a negative H.
That gets me 21830.478 H. Since BB is about .34H, and TB is about 1.59H, that nets 7333.834 BB and 34703.9 TB. AB is the constant outs, 56925, + H, or 78755.48.
So the league OBP for Ortiz is (simple version), (H+BB)/(AB+BB), or .339. SLG is TB/AB, or .441. Compute OPS+ as you normally would (100*(OBP/lgOBP+SLG/lgSLG-1)).
Because I made a couple of changes in the equations, if you were to follow this all the way through, you would get a different OPS+, 166, than the Baseball Reference figure, 164. But the basic concept is sound.
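For anyone who wants to rerun the arithmetic, here is a rough Python sketch of the simplified version worked through above (no HBP or SF, pitchers left in). The function names are my own. League hits are taken as 21572, which is the figure consistent with the 56925 outs used in the equations (78497 - 56925).

```python
import math


def park_adjusted_league(h, bb, tb, ab, bpf):
    """Scale H, BB and TB together so basic RC rises by bpf, with outs held fixed."""
    outs = ab - h
    rc_target = bpf * (h + bb) * tb / (ab + bb)
    # Write BB = b*H and TB = t*H, then solve for the adjusted H in
    #   (1 + b)*t*H^2 - rc_target*(1 + b)*H - rc_target*outs = 0
    b, t = bb / h, tb / h
    qa = (1 + b) * t
    qb = -rc_target * (1 + b)
    qc = -rc_target * outs
    h_adj = (-qb + math.sqrt(qb * qb - 4 * qa * qc)) / (2 * qa)  # positive root
    bb_adj, tb_adj = b * h_adj, t * h_adj
    ab_adj = outs + h_adj
    lg_obp = (h_adj + bb_adj) / (ab_adj + bb_adj)
    lg_slg = tb_adj / ab_adj
    return h_adj, lg_obp, lg_slg


def ops_plus(obp, slg, lg_obp, lg_slg):
    """OPS+ against the park-adjusted league rates."""
    return 100 * (obp / lg_obp + slg / lg_slg - 1)


# League totals and Boston park factor as given in the post.
h_adj, lg_obp, lg_slg = park_adjusted_league(
    h=21572, bb=7247, tb=34293, ab=78497, bpf=1.02
)
print(round(h_adj, 1), round(lg_obp, 3), round(lg_slg, 3))
# 21830.5 0.339 0.441
```

Feeding a hitter's own OBP and SLG into `ops_plus` with these league rates gives the simplified OPS+; the real Baseball Reference figure will differ a bit because of the HBP, SF and pitcher adjustments skipped here.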