View Full Version : What is the average and replacement level RA for the past few years?
Mariano_Rivera
09-16-2006, 06:08 AM
What is the average and replacement level RA for the past few years?
538280
09-16-2006, 09:29 AM
RA would just be runs per game (run average, runs per nine innings)...so here you go:
AL
2005: 4.76
2004: 5.01
2003: 4.86
NL
2005: 4.45
2004: 4.64
2003: 4.61
I can't give you replacement level, though. Replacement level is really a theoretical construct that you can define however you want, so what replacement level is comes down to your own decision (or BP's, or Bill James', or anyone else's). There is no definitive replacement level.
SABR Matt
09-16-2006, 11:13 AM
The run-allowing MARGIN can be defined...the replacement level cannot. The replacement level is an empirical thing that is different depending on how you define it. The margin is a mathematical truth...it is the threshold at which winning games becomes impossible against league-average competition. That occurs at a Pythagorean .250 W% for team offense and team defense (a team will win zero games, or darned close to zero games, if its offensive W% and its defensive W% add up to .500 or less).
In 2005 (AL): 4.76 * (.75 / .25) ^ (1/X)
.75 because we want the offense facing the pitching staff to have a .750 W%
X is the Pythagorean exponent, which in this case would be 9.52^.285, or 1.90 (9.52 being 2 * 4.76, the runs per game scored by both teams combined)
The above expression equals 8.49...so the margin for team pitching would be roughly 8.5 R/G
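The arithmetic in this post can be checked with a short sketch. The function name `zero_win_margin` and its parameter names are made up for illustration; the .285 power and the .750 target come from the post, and treating 9.52 as twice the league RA is an inference from the numbers given.

```python
def zero_win_margin(league_ra, target_wpct=0.75, pat_power=0.285):
    """Runs allowed per game at which an average offense reaches target_wpct.

    league_ra   : league-average runs per game for one team (4.76 in the post)
    target_wpct : W% the opposing offense should reach (.750 in the post)
    pat_power   : power applied to total runs per game to get the exponent X
    """
    rpg = 2.0 * league_ra          # both teams combined, 9.52 in the post
    x = rpg ** pat_power           # Pythagorean exponent, about 1.90
    odds = target_wpct / (1.0 - target_wpct)   # .75 / .25 = 3
    return league_ra * odds ** (1.0 / x)


margin = zero_win_margin(4.76)
print(f"margin = {margin:.2f} R/G")   # roughly 8.5 R/G
```

Carrying X at full precision instead of the rounded 1.90 moves the second decimal slightly, so the result is best read as "about 8.5".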
Mariano_Rivera
09-16-2006, 12:14 PM
That was actually a typo. I don't know how I typed that.
SABR Matt
09-16-2006, 01:22 PM
What was a typo?
Tango Tiger
09-16-2006, 02:21 PM
Matt, that is not a mathematical truth! I can see why you are not a fan of the Odds Ratio method.
The .250 off and .250 def will win .000 games, if their effect is linear. However, that's not how it works. The Odds Ratio method would say such a team will win .100 times per game. If you want to say that's "darned close to zero", fine. But, it's not zero. It's around .100.
As for replacement level, I like to define it as .300 on the team level, which would be .380 for non-pitchers and .410 for pitchers. Since it's easier to relieve than start, I break down the pitcher's replacement level into .380 for starters and .470 for relievers.
But, as others have said, you choose what you mean by replacement level.
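To make the disagreement concrete, here is a minimal sketch comparing the linear combination with an odds-ratio combination. The function names are hypothetical, and multiplying the two odds together is one reading of the Odds Ratio method, chosen because it reproduces the .100 figure quoted above.

```python
def odds(p):
    """Convert a winning percentage into odds (wins per loss)."""
    return p / (1.0 - p)


def linear_wpct(off_wpct, def_wpct):
    """Linear combination: reaches zero when the two W% sum to .500."""
    return off_wpct + def_wpct - 0.5


def odds_ratio_wpct(off_wpct, def_wpct):
    """Assumed odds-ratio combination: multiply the odds, convert back."""
    combined = odds(off_wpct) * odds(def_wpct)
    return combined / (1.0 + combined)


print(linear_wpct(0.25, 0.25))                 # 0.0
print(round(odds_ratio_wpct(0.25, 0.25), 3))   # 0.1
```

The odds-ratio version can approach zero but never reaches it, which is the property being argued over in this thread.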
Tango Tiger
09-16-2006, 02:27 PM
Further to the .250 thing, we'd expect a .250 winning percentage if the offense scores 2.69 runs and the def allows 5.0
We expect a .250 winning percentage if the defense allows 8.5 runs and off scores 5.
Then, it's simply a matter of figuring out how often a team wins if it scores 2.69 and allows 8.5. Using pythagenpat, that answer is .094.
(Odds Ratio Method would have said .100)
If I use the Tango Distribution, my guess is that it'd be around .100 as well.
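The three steps above can be sketched with a small PythagenPat function. The thread does not state which power is applied to runs per game; 0.28 is an assumption here, picked because it reproduces the quoted figures.

```python
def pythagenpat(rs, ra, power=0.28):
    """Expected W% from runs scored and allowed per game.

    The exponent is (rs + ra) ** power; power=0.28 is an assumed constant.
    """
    x = (rs + ra) ** power
    return rs ** x / (rs ** x + ra ** x)


weak_offense = pythagenpat(2.69, 5.0)   # bad offense, average defense
weak_defense = pythagenpat(5.0, 8.5)    # average offense, bad defense
both = pythagenpat(2.69, 8.5)           # bad offense and bad defense

print(f"{weak_offense:.3f} {weak_defense:.3f} {both:.3f}")
```

The first two values come out at about .250 each, and the combined team lands near .094 rather than at zero.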
SABR Matt
09-16-2006, 02:31 PM
Tango...I've regressed (I know you hate that...but hear me out here) actual W% to the formula I gave...I understand the linear comment, but the regression did not show any systematic tendencies toward underpredicting team W% at the low extreme....the example of this being the Cleveland Spiders, who won at a .130 clip with an OW% and DW% that predicted by my approximation a .149 W%...
I know it ain't perfect, but it's a much more logically based solution to what the 0-win margin is than anything else out there I'm aware of. The OR method would never predict a 0-win team even if it were a little league team vs. the 1927 Yankees. Probabilistically there is always a tiny chance any team can beat another...the linear assumption stops right about where diminishing returns suggests that any probability of victory is so small it can be ignored.
Tango Tiger
09-16-2006, 02:40 PM
The Tango Distribution (think Poisson version of baseball) approximates run scoring distribution in baseball. It is an extremely simple process to take any two Poisson distributions, and figure out which will win.
I understand all about linear regressions, etc. They are completely unnecessary if we can actually model run scoring. That's what the Tango Distribution does. It's extremely powerful, and I have the program right on my site ready for anyone to use.
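As a rough illustration of comparing two run distributions, here is a sketch that uses plain Poisson distributions. This is not the Tango Distribution, whose shape is adjustable and is not described in this thread. The function names are hypothetical, and handling ties by dropping them and renormalizing is an assumption standing in for extra innings.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k runs when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def poisson_win_prob(rs, ra, max_runs=60):
    """P(team outscores opponent) with independent Poisson run totals.

    Ties are dropped and the remainder renormalized (an assumption).
    max_runs truncates the distributions; the tail beyond it is negligible
    for baseball-sized scoring rates.
    """
    scored = [poisson_pmf(k, rs) for k in range(max_runs + 1)]
    allowed = [poisson_pmf(k, ra) for k in range(max_runs + 1)]
    win = sum(scored[a] * allowed[b]
              for a in range(max_runs + 1) for b in range(a))
    tie = sum(scored[k] * allowed[k] for k in range(max_runs + 1))
    return win / (1.0 - tie)


print(round(poisson_win_prob(5.0, 5.0), 3))   # 0.5 by symmetry
print(round(poisson_win_prob(2.69, 8.5), 3))  # small, but not zero
```

A pure Poisson has less spread than real run scoring, so its answer for a weak team will tend to be lower than what a better-shaped distribution gives.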
As for LL v 1927, or say if you expect one team to score .01 runs per game, while the other will score 1000 runs, pythagenpat gives the chances of the LL team winning as 39,052,190,748,495,700,000,000,000,000,000,000 to 1. Which is as close to zero as you'll get in probability circles.
Tango Tiger
09-16-2006, 02:41 PM
The mathematical certainty is that a linear regression will fail at the extremes for something that is bounded at 0 and 1.
Mariano_Rivera
09-16-2006, 02:47 PM
Originally posted by SABR Matt:
"What was a typo?"
The replacement level part of the title. I must have been thinking of something else.
Tango Tiger
09-16-2006, 02:50 PM
Running the Tango Distribution against a team that scores 3.435 and allows 8.13 runs ('99 Spiders, who ended up at .130), I get a winning percentage of .152 if they scored runs the way they do today. If I change the shape slightly, the winning percentage can be brought down to .136 rather easily.
Pythagenpat says .181, and so, that equation is probably limited at a certain point (say .250 to .750).
SABR Matt
09-16-2006, 02:52 PM
Well, too bad, Rickey...'cause you've started another of Tango's and my discussions. :D
I love the idea of using Poisson to predict run scoring, Tango...the only problem is that true probability distributions (rather than mere regression, whose flaws, yes, I know) never give you a zero...for my approach to player rating to work, I need to know where the threshold is in performance where wins are not being created...because I believe it is possible for players to produce so poorly that they cost their team wins rather than adding wins.
SABR Matt
09-16-2006, 02:54 PM
Originally posted by Tango Tiger:
"Running the Tango Distribution against a team that scores 3.435 and allows 8.13 runs ('99 Spiders, who ended up at .130), I get a winning percentage of .152 if they scored runs the way they do today. If I change the shape slightly, the winning percentage can be brought down to .136 rather easily.
Pythagenpat says .181, and so, that equation is probably limited at a certain point (say .250 to .750)."
What do you mean by "change the shape"? I thought the Poisson distribution had only one parameter...the lambda (the rate at which events occur per unit area or time interval).
SABR Matt
09-16-2006, 03:02 PM
BTW Tango...I found a series of charts involving the Tango distribution, but I don't see anything that describes exactly what it is and how it works nor do I see the program. Perhaps you could link me? Sorry if it's really obvious and I am just blind...searched all of the index links at tangotiger.net
Tango Tiger
09-16-2006, 03:26 PM
It's the last link on my home page:
http://www.tangotiger.net
The Tango Distribution allows you to change the shape. I only invoked the name Poisson to give you a hint as to what to think about. You can ignore anything Poisson I said, otherwise.
That's fine that you want to set a "zero" baseline, just like James does in Win Shares. If you want to call that a "mathematical certainty", so be it.
SABR Matt
09-16-2006, 03:36 PM
Notice I used qualifiers like "or darned close" though...I was saying that there should be a mathematical way to define the 0-margin.
I'll take a look at the Tango distribution and see if I can extend it...
I was just speaking to another guy who thinks about sabermetrics a little in his rare free time, and I remembered the last time I looked at Poisson for baseball. I was trying to explain the shape of a league's RS/G curve...I rejected Poisson because a single Poisson does not explain the right skew or the specific shape...however, I have realized that this is because the underlying assumption with Poisson is that there's ONE RS/G rate that doesn't change aside from random variance. But a league is made up of teams with various strengths of offense and defense going at each other...the expected RS/G changes with each match-up...it becomes necessary, then, to model the expected RS/G based on the strengths of each team's offense and defense against average competition...sort of like a "double Poisson".
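The point about a league being a blend of different scoring rates can be illustrated with a small sketch. The rates below are hypothetical, the function names are made up, and this is only a plain mixture of Poissons, not the Tango Distribution or any model from this thread.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k runs when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def mixture_pmf(k, rates):
    """Equal-weight blend of Poisson distributions, one per match-up rate."""
    return sum(poisson_pmf(k, lam) for lam in rates) / len(rates)


def mean_and_variance(rates, max_runs=60):
    """Mean and variance of runs per game under the blended distribution."""
    probs = [mixture_pmf(k, rates) for k in range(max_runs + 1)]
    mean = sum(k * p for k, p in enumerate(probs))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, var


single = mean_and_variance([4.5])                 # one fixed rate
mixed = mean_and_variance([3.0, 4.0, 5.0, 6.0])   # hypothetical match-up rates

print(f"single: mean {single[0]:.2f}, variance {single[1]:.2f}")
print(f"mixed:  mean {mixed[0]:.2f}, variance {mixed[1]:.2f}")
```

Both cases average 4.5 runs, but the blended version has a larger variance than its mean, which is the kind of extra spread a single Poisson cannot produce.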
Tango Tiger
09-16-2006, 07:02 PM
I suggest you just look at the Tango Distribution. It's been tested against empirical data back to 1900.
SABR Matt
09-16-2006, 08:14 PM
Can the Tango Distribution be used to predict the RS and RA in a game or is it just a W% estimator?
Tango Tiger
09-16-2006, 09:00 PM
I implore you to download the file, and check out the readme file. The last line in that file says:
rundistrOutput.html - the actual runs and win distribution for the 2 teams
SABR Matt
09-16-2006, 09:10 PM
Sorry...I DLed the file I just hadn't read the readme yet..LOL
I guess then the question would be...could the program (which currently projects the result of one team vs. team match-up) be extended to project the results of ALL of the team vs. team match-ups in a league?
Tango Tiger
09-17-2006, 04:55 AM
The program simply takes the expected RS and RA, and gives you a win%. So, if you do the hard work of figuring that out, then it'll give you any matchup you want.
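A league-wide loop over match-ups might look like the sketch below. Several things here are assumptions and not from the thread: the three teams and their numbers are hypothetical, the multiplicative adjustment in `expected_runs` is just one simple way to do the "hard work", and PythagenPat (with an assumed 0.28 power) stands in for Tango's program.

```python
from itertools import permutations

LEAGUE_AVG = 4.76  # 2005 AL runs per game, from earlier in the thread

# Hypothetical teams: (runs scored, runs allowed) per game vs. average opposition
TEAMS = {"A": (5.4, 4.3), "B": (4.8, 4.8), "C": (4.1, 5.2)}


def pythagenpat(rs, ra, power=0.28):
    """Expected W% from runs scored and allowed; power=0.28 is assumed."""
    x = (rs + ra) ** power
    return rs ** x / (rs ** x + ra ** x)


def expected_runs(offense_rs, opponent_ra, league_avg=LEAGUE_AVG):
    """ASSUMPTION: scale the offense by how leaky the opposing defense is."""
    return offense_rs * opponent_ra / league_avg


def matchup_table(teams):
    """W% of the first team against the second, for every ordered pair."""
    table = {}
    for a, b in permutations(teams, 2):
        rs = expected_runs(teams[a][0], teams[b][1])
        ra = expected_runs(teams[b][0], teams[a][1])
        table[(a, b)] = pythagenpat(rs, ra)
    return table


table = matchup_table(TEAMS)
for (a, b), wpct in sorted(table.items()):
    print(f"{a} vs {b}: {wpct:.3f}")
```

Any other win estimator could be dropped in for `pythagenpat` without changing the loop.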
Tango Tiger
09-18-2006, 05:12 AM
Originally posted by Tango Tiger:
"Then, it's simply a matter of figuring out how often a team wins if it scores 2.69 and allows 8.5. Using pythagenpat, that answer is .094.
(Odds Ratio Method would have said .100)
If I use the Tango Distribution, my guess is that it'd be around .100 as well."
Following up: the Tango Distribution gives .092, depending on the scoring distribution of each team.
SABR Matt
09-18-2006, 08:55 AM
Hmm...
I'm going to need to do something about where to place the margin if I'm going to use the Tango Distribution or something similar to calculate expectations and margins.
Tango Tiger
09-18-2006, 09:36 AM
Well, if your system is based on linearity, then whatever you choose will be rather arbitrary.
This file might be useful:
http://www.tangotiger.net/wincomps.html
You can estimate the runs per win (RPW) as: .75*RPG + 3. So, if there are 10 runs scored in a game, the RPW is around 10.5. (Though the higher the score differential, the higher the RPW.)
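The runs-per-win rule of thumb is simple enough to write down directly. The function name is made up for illustration; the formula and the 10-run example are from the post, and 9.52 is the 2005 AL figure used earlier in the thread.

```python
def runs_per_win(rpg):
    """Rule-of-thumb runs per win from total runs per game (both teams)."""
    return 0.75 * rpg + 3


print(runs_per_win(10))              # 10.5
print(round(runs_per_win(9.52), 2))  # 10.14 for the 2005 AL run environment
```

As the post notes, this is an approximation that drifts upward as the score differential grows.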
SABR Matt
09-18-2006, 09:48 AM
My system WAS based on linearity...it won't necessarily be based on linearity when all is said and done...still mulling over my options. I don't want to leave my analysis in the form of rate stats like wOBA because (and I still firmly believe this) that doesn't have much meaning to a baseball exec or to a common fan. It is important to see how a player's performance will actually translate into the success of a team so converting to something concrete (runs or wins) is a must...
Tango Tiger
09-18-2006, 09:59 AM
You sure like that word "must"! I like both the wOBA and win-based numbers (like WPA = Wins Advancement minus Loss Advancement).
The conversion into wins should be a rather mundane step, once you have the wOBA, the playing time (PA,Innings,BFP), and the league average. Teams much prefer seeing the individual steps, rather than the "final number".
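One way that conversion could look is sketched below. The 1.15 wOBA-to-runs scale is an assumption not given anywhere in this thread, the hitter's numbers are hypothetical, and the runs-per-win estimate reuses the .75*RPG + 3 rule of thumb from the earlier post.

```python
def wins_above_average(woba, league_woba, pa, rpg, woba_scale=1.15):
    """Convert a wOBA edge over league average into wins.

    woba_scale : ASSUMED divisor turning wOBA points into runs per PA
    rpg        : total runs per game (both teams) in the run environment
    """
    runs = (woba - league_woba) / woba_scale * pa   # runs above average
    rpw = 0.75 * rpg + 3                            # runs per win estimate
    return runs / rpw


# Hypothetical hitter: .380 wOBA in a .330 league over 600 PA, 9.52 RPG
waa = wins_above_average(0.380, 0.330, 600, 9.52)
print(f"{waa:.2f} wins above average")
```

Keeping the runs figure and the runs-per-win figure as separate lines shows the individual steps, not just the final number.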
SABR Matt
09-18-2006, 10:26 AM
I agree the intermediate steps are important, Tango, and teams are looking for people who have a new approach to those intermediate steps, one that is original and makes an important step forward in our understanding.
Tango Tiger
09-18-2006, 10:35 AM
Originally posted by Tango Tiger:
"Running the Tango Distribution against a team that scores 3.435 and allows 8.13 runs ('99 Spiders, who ended up at .130), I get a winning percentage of .152 if they scored runs the way they do today. If I change the shape slightly, the winning percentage can be brought down to .136 rather easily.
Pythagenpat says .181, and so, that equation is probably limited at a certain point (say .250 to .750)."
Oops. Patriot rightly pointed out a mistake I made. Pythagenpat says .153, which exactly matches the Tango Distribution!
(It was actually a ratio of .181 wins to 1 loss, which is .153.)
The incredible simplicity and accuracy of PythagenPat continues to astound.
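The mix-up between a win/loss ratio and a winning percentage is easy to reproduce. In this sketch the function name is hypothetical and the 0.28 power is an assumption, chosen because it matches the figures quoted in this post.

```python
def pythagenpat_ratio(rs, ra, power=0.28):
    """Expected wins per loss; power=0.28 is an assumed constant."""
    x = (rs + ra) ** power
    return (rs / ra) ** x


ratio = pythagenpat_ratio(3.435, 8.13)   # 1899 Spiders
wpct = ratio / (1.0 + ratio)             # wins / (wins + losses)

print(f"ratio {ratio:.3f} wins per loss, W% {wpct:.3f}")
```

The ratio comes out near .181 and the winning percentage near .153, matching the correction above.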
SABR Matt
09-18-2006, 10:39 AM
PythagenPat is the method used in PCA...I feel a little better now. :)
Tango Tiger
09-18-2006, 10:49 AM
I ran a few tests:
RS     RA    PythPat  TangoD
0.50   5.00  0.024    0.029
1.00   5.00  0.065    0.070
1.50   5.00  0.116    0.120
2.00   5.00  0.171    0.174
2.50   5.00  0.228    0.230
3.00   5.00  0.286    0.288
3.50   5.00  0.343    0.344
4.00   5.00  0.398    0.399
4.50   5.00  0.451    0.451
5.00   5.00  0.500    0.500
5.50   5.00  0.546    0.546
6.00   5.00  0.588    0.588
6.50   5.00  0.627    0.627
7.00   5.00  0.663    0.662
7.50   5.00  0.695    0.694
8.00   5.00  0.724    0.723
8.50   5.00  0.750    0.749
9.00   5.00  0.774    0.772
9.50   5.00  0.795    0.793
10.00  5.00  0.815    0.812
It really is quite astounding.
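The PythagenPat column of that test can be regenerated with a short sweep. This is a sketch under one assumption: the 0.28 power is not stated in the thread and was chosen because it agrees with the posted values. The TangoD column needs Tango's own program and is not reproduced here.

```python
def pythagenpat(rs, ra, power=0.28):
    """Expected W% from runs scored and allowed; power=0.28 is assumed."""
    x = (rs + ra) ** power
    return rs ** x / (rs ** x + ra ** x)


def sweep(ra=5.0, start=0.5, stop=10.0, step=0.5):
    """PythagenPat W% for each RS from start to stop against a fixed RA."""
    count = int(round((stop - start) / step)) + 1
    return [(start + i * step, pythagenpat(start + i * step, ra))
            for i in range(count)]


rows = sweep()
print("RS     RA    PythPat")
for rs, wpct in rows:
    print(f"{rs:<6.2f} {5.0:<5.2f} {wpct:.3f}")
```

Changing `ra` or the step size makes it easy to probe other run environments in the same way.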