Introducing The Fiato-Souders Intrinsic Analysis Matrix



SABR Matt
01-20-2006, 01:21 PM
OVERVIEW

One of the first missions of almost every sabermetrician is to determine a preferred strategy for rating the performance of baseball teams and players while keeping in mind the many complicating factors that distort statistics like wins and losses and run differentials.

There is a host of available data today that makes analysis of teams possible, but some understanding of the dynamic way in which those statistics combine to produce wins and losses is required, and this is not a simple matter. Empirical analysis has for years centered on the idea that averages tell enough of the story to be used as the backbone of any system designed to adjust raw statistics to account for the context in which they occurred. This document will explore the problems with empirical sabermetrics and introduce a new tool designed to bridge the gap between the intrinsic skill of the players and the real-world statistics that define them.

REVIEW OF EMPIRICAL METHODS (empirical or traditional analysis includes my own work in the field...proto-PCA for example)

Up until this moment, all documented analyses of player and team value have proceeded in a straightforward, logical fashion, going from point A to point B to point C in order.

A) Rate the offensive context of the league.

This has commonly been done through some variant of looking directly at the league average run scoring rate. If 10,000 runs scored in 2500 games, then the assumption was made that it was a 4 R/G league...that the other factors would essentially cancel out and that the average scoring rate would be fully explanatory.

B) Rate the offensive context of the park as it relates to the league.

Even in the most sophisticated of modern traditional park adjustments, this boils down to a direct comparison of the scoring rate in a given park (the home team and its opponents combined) and the scoring rates of that team and its opponents on the road. The best of methods is iterative...adjusting and readjusting to account for the effect each park has on the net park effect of "the road"...but these are not commonly used or published. What is typically available at ESPN.com or any source for baseball statistics is a simple ratio between the run scoring rate of the home park and the run scoring rate of everyone else.

C) Combine the league and park contexts to come up with an average expectation to produce runs for each player and team.

Because of the way traditional park factors are calculated (a ratio...the most natural thing to do with two sets of data that the statistician is trying to compare) the normal method for blending league and park statistics into one number (a R/G or R/O or R/PA type statistic) is to multiply the league scoring rate by the park adjustment for each team and use that as a basis for comparison.
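To make steps A through C concrete, here is a minimal Python sketch of the traditional ratio approach. The league totals are the ones quoted above; the park totals (900 and 750 combined runs over 81 games each) are hypothetical numbers chosen for illustration, and the function names are not from any published system.

```python
def runs_per_game(runs, games):
    """Scoring rate as runs per game."""
    return runs / games


def ratio_park_factor(home_runs, home_games, road_runs, road_games):
    """Simple ratio park factor on a 100 scale.

    Runs are the team's and its opponents' combined, at home and on the road.
    """
    home_rate = runs_per_game(home_runs, home_games)
    road_rate = runs_per_game(road_runs, road_games)
    return 100.0 * home_rate / road_rate


def park_adjusted_expectation(league_rpg, park_factor):
    """Traditional step C: multiply the league rate by the park factor."""
    return league_rpg * park_factor / 100.0


# Step A, league example from the text: 10,000 runs in 2,500 games.
league_rpg = runs_per_game(10_000, 2_500)

# Step B, hypothetical park: 900 combined runs at home, 750 on the road.
park_factor = ratio_park_factor(900, 81, 750, 81)

# Step C, the blended expectation.
expected = park_adjusted_expectation(league_rpg, park_factor)
print(round(league_rpg, 2), round(park_factor, 1), round(expected, 2))
```

Note that nothing in this calculation knows who the home team or its opponents were; every run is attributed to the league or the park.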

PROBLEMS WITH TRADITIONAL CONTEXTUAL ANALYSIS

1) Missing Elements

There has always been an assumption in empirical sabermetrics that the variation in run scoring at the league level can be entirely explained by the league. If the league scores five runs a game compared to an average league that scores 4.75 runs a game, there is an implicit assertion made that the league and only the league is responsible for that change. When you put the league context with the park adjustment, it is assumed that those two things combine to fully explain what we should expect from average talent in the same conditions.

There's a serious problem with that claim, however. It is fairly evident by just taking a quick glance at the rosters of the teams as the years pass that the balance of talent changes. Some years the pitching is a little better than others. Some years the hitting is a little better. It should be pretty clear looking at the rosters from 1999 that the hitting was better than the pitching.

Top Hitters from 1999 in no particular order

Alex Rodriguez
Sammy Sosa
Mark McGwire
Barry Bonds
Ken Griffey Jr
Edgar Martinez
Jason Giambi
Manny Ramirez
Vlad Guerrero
Mike Piazza...etc etc

Top Pitchers from 1999 in no particular order

Pedro Martinez
Roger Clemens
Greg Maddux
Randy Johnson
Curt Schilling
Kevin Brown
Mike Mussina
... ... ... ... uh ...

Obviously I'm leaving some names off and some of you reading this can fill in both lists with more detail, but it seems clear to me that there was a greater depth of hitting talent in 1999 than there was pitching talent.

To assume that 1968 was a great defensive year only because the league made it easier is to rip off the incredible depth of pitching and fielding talent and give too much credit to a mediocre crop of hitters by major league standards.

When park factors are calculated, there are two elements that are commonly forgotten and ignored.

A) The opponents a team faces are not necessarily neutral competition. The late 90s Cleveland Indians did not face a league average offense overall when they played their games at Jacobs Field. Those Indians were a cut above the rest with the bat, which means the rest were a cut below normal by definition. Traditional park factors make no effort to account for this.

B) Players make adjustments to the parks in which they play. Some players do this better than others, but the personnel that play in any given park have a direct impact on how that park APPEARS to play (how offensively friendly it is). Some front offices do a great job acquiring players that maximize their potential because they are good matches for the home park. The 1998 Yankees had a lot of left-handed hitters up and down the line-up turning a normally neutral park into a hitter's haven...for the Yankees. The 2001 Mariners filled their outfield with defensively gifted players and loaded up on flyball pitchers to take advantage of the dead air in center.

2) Runs are cumulative, not multiplicative.

Traditional analysis as we have covered above includes a step where the league context is multiplied by a park adjustment to come up with a new expectation to score runs. But contexts shouldn't be multiplied like that. If the park makes the offensive environment more conducive to run scoring, it does not do so by multiplying the danger...it does so by ADDING runs to the scoreboard. Additive adjustments are less prone to the vagaries of small sample sizes, generally more stable, and more intuitive. They also translate more logically to player level analysis. If the park is adding a run per game (27 outs) one can easily see how it affects the players that play there. A multiplicative park factor will affect higher run scoring contexts more severely than lower scoring periods. If the league average R/G is 4 and the park factor is 120, then we are claiming it increases run scoring by 20 percent (0.8 runs). If the run scoring environment is changed to 5 R/G, the park didn't change at all, but one of two things is true...either the multiplicative factor remains 120 (and the park therefore adds 1 R/G)...or the amount the park adds to run scoring doesn't change (and the multiplicative factor drops to 116).
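The arithmetic in that last example can be checked with a few lines of Python. This is only a sketch of the two readings of a park effect described above; the function names are made up for illustration.

```python
def runs_added_by_ratio(league_rpg, park_factor):
    """Runs per game that a ratio park factor (100 scale) implies at a given league rate."""
    return league_rpg * (park_factor - 100.0) / 100.0


def ratio_implied_by_runs(league_rpg, runs_added):
    """Ratio park factor (100 scale) implied by a fixed additive adjustment."""
    return 100.0 * (league_rpg + runs_added) / league_rpg


for league_rpg in (4.0, 5.0):
    print(
        league_rpg,
        round(runs_added_by_ratio(league_rpg, 120.0), 2),   # factor held at 120
        round(ratio_implied_by_runs(league_rpg, 0.8), 1),   # park held at +0.8 R/G
    )
```

At 4 R/G the two readings agree (0.8 runs, factor 120). At 5 R/G they split: holding the factor at 120 makes the park worth 1.0 R/G, while holding the park at 0.8 R/G makes the factor 116.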

It seems evident to us that the park effect should not depend on the offensive environment...the park has the impact it has...whether it's the deadball era or the rabbitball of 1930, if the park adds a run a game, it does it either way (unless of course it's doing that run adding by being homer friendly in which case it's not likely to help deadball hitters much!).

3) The Denominator is Wrong

Traditional contextual analysis includes adjustments that take the form of a series of fractions of the form (Runs per Run). Park factors are formed by a ratio of run scoring rate at home over run scoring rate on the road. League contexts are set in essence by the ratio of the league's run scoring rate to the all time average scoring rate. The contexts themselves are not attached to anything...they're unitless multipliers that blow up with small sample sizes. In reality, any context has an increasing impact on a player or team the longer they spend in that context. And as we all know, time in the game of baseball is measured in outs, or when outs are not available, games. The denominator of any contextual adjustment should take the form of R/Out or R/G. As soon as you change the denominator to that form, it becomes very easy to see that contexts add together to explain run scoring changes.

4) The elements that come together to explain runs are completely inter-related.

Traditional sabermetric analysis proceeds from A to B to C, without stopping to fully appreciate how dependent each step of the analysis is on the steps that come before and after it. In order to know how the league impacted scoring, we need to know how the parks, the teams, and the players impacted the league...in order to know how the parks impacted scoring, we need information about the league, the teams, and the players...etc. What is needed is some sort of system of equations where each variable is considered as it relates to the others.

THE LAW OF SUCCESSION

As noted earlier, sabermetricians fight a constant battle with small sample sizes. Even a full major league season includes match-ups that only recur 6-12 times between pairs of teams. Getting information from these match-ups requires a more useful method than simply taking the statistics at face value. There is a wing of statistical analysis known as Bayesian probability. The general idea behind Bayesian probability is that we cannot assume we have seen an entire distribution simply because we have all of the available data. Just because two teams face each other 10 times and one of the teams wins all ten doesn't mean there is a 100% probability that the successful team will win the next game or that if those ten games were replayed under identical conditions the results would be the same.

The Bayesian model starts with the assumption that every team, every park, every league is average and forces the statistics to prove or disprove this assumption, one run at a time. This methodology is the driving force in our analysis and the idea came to us (myself and Randy Fiato...a programmer of great skill and tenacity and a budding sabermetrician in his own right) by way of a Dr. Colley of Princeton University, who used Bayesian probability to mathematically explain the success of college football teams and rank them (his matrix is still used today as a part of the BCS ranking system). His system is somewhat simpler, because all football fields are the same dimension, he doesn't have to deal with a changing timeline, and his method deals only with ordinally ranking football teams so a certain level of precision is not necessary to achieve the desired accuracy in the rankings. But beyond cosmetic differences, our approach relies on the same central theory - the law of succession.

Rather than assume that without any data present, no conclusion can be drawn, we assume that in the absence of data, one conclusion MUST be drawn...that being that future events will occur at the average pace until proven otherwise.
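For readers who want to see the idea in numbers, here is a small Python sketch. The first function is Laplace's classic rule of succession for win/loss data. The second is an assumed generalization to scoring rates, where `prior_games` is a hypothetical weight on the "everything is average" starting point; the post does not state how the FSIA weights its prior, so treat that part as illustration only.

```python
def succession_estimate(successes, trials):
    """Laplace's rule of succession: begin with one imaginary success and one failure."""
    return (successes + 1) / (trials + 2)


def shrunk_rate(total, games, prior_mean, prior_games):
    """Pull an observed per-game rate toward prior_mean.

    prior_games is a hypothetical weight: how many imaginary average games
    are mixed in before any real data arrives.
    """
    return (total + prior_mean * prior_games) / (games + prior_games)


# With no data the estimate is exactly average.
print(succession_estimate(0, 0))

# Ten wins in ten games is strong evidence, but not certainty.
print(round(succession_estimate(10, 10), 3))

# 60 runs in 10 games, pulled toward the 4.76 all-time average by 2 imaginary games.
print(round(shrunk_rate(60, 10, 4.76, 2), 3))
```

The ten-for-ten team comes out at 11/12, about .917, rather than 1.000, which is the behavior described in the ten-game example above.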

THE MATRIX

The unifying idea behind the Fiato-Souders Intrinsic Analysis Matrix can be summed up in one equation.

For any team: (ARSPG + OIRAAPG - DIRAAPG + LIRAAPG + PIRAAPG + OPR - DPR) = Actual Runs Scored - Actual Runs Allowed (both of which could be accurately predicted using only the components that apply)

Holy Acronyms, Batman! I think we need a decoder ring!

ARSPG -> Alltime Runs Scored per Game (per side)...this turns out to be approximately 4.76 Runs/Game/Side excluding 1871-1875 which were not even slightly major league caliber baseball and would unnecessarily throw off the alltime scoring average. All additional contextual adjustments are relative to an "average" league.

OIRAAPG -> Offensive Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average this team's offense could be expected to score in an average league against average competition in a neutral park.

DIRAAPG -> Defensive Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average this team's defense could be expected to allow in an average league against average competition in a neutral park.

LIRAAPG -> League Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average this league would result in given neutral parks and average players and teams.

PIRAAPG -> Park Intrinsic Runs Above Average Per Game...this term represents how many runs per game above the alltime scoring average would score in this park given average teams and players, and an average league.

OPR -> Offensive Park Reactions...this term represents how the offensive players on each team did relative to what would be expected of them given the intrinsic strengths of the parks in which they played. This is a little harder to put in words, but to put it as simply as possible...if the park favors pitchers, and your team has found a way to score at an above average clip, you're doing something that is not statistically expected of you and that needs to be accounted for separately.

DPR -> Defensive Park Reactions...same as above only with defensive players (pitchers and fielders).

That sounds like a lot...but here's how it goes together. Each individual variable in the matrix is reported (these variables include the team's unique reactions to each and every park in which they played...one at a time...each park...the league as a whole...and the offenses and defenses of each team in the majors in a specific year and league) and placed in a linear equation where every other variable upon which it depends is set to an all time average (the Law of Succession...the average assertion). This is done for every variable in the history of the game...and those variables number greater than 1,000 in each year of the modern era.

Those equations are placed on the left hand side of a linear system of equations. They're set equal to the real world results on the right hand side (in the league row, the runs scored in that league would be recorded...in the intrinsic offense row, that team's runs scored would be recorded...etc). This system of equations can be solved using matrix algebra and what comes out the other end of that process is a set of results explaining each variable.
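The actual FSIA matrix is not reproduced in this post, so the following Python sketch is only a toy stand-in for the general idea: additive contexts, a pull toward the average, and one linear solve. It swaps in a plainly different technique (regularized least squares with a hypothetical `prior_weight`), uses made-up game data and variable names, and should not be read as the authors' system.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_additive(games, names, baseline=4.76, prior_weight=1.0):
    """Estimate additive runs-above-average terms for every named variable.

    games: list of (runs_scored, [names of the variables in play]).
    Model (an assumption): runs = baseline + sum of the variables in play.
    prior_weight pulls every variable toward zero, i.e. toward average.
    """
    index = {name: k for k, name in enumerate(names)}
    n = len(names)
    a = [[prior_weight if r == c else 0.0 for c in range(n)] for r in range(n)]
    b = [0.0] * n
    for runs, terms in games:
        cols = [index[t] for t in terms]
        for r in cols:
            b[r] += runs - baseline
            for c in cols:
                a[r][c] += 1.0
    return dict(zip(names, solve(a, b)))


# Hypothetical data: two teams, two parks. "def:" terms are runs ALLOWED above average.
names = ["off:A", "off:B", "def:A", "def:B", "park:X", "park:Y"]
games = [
    (6, ["off:A", "def:B", "park:X"]),
    (3, ["off:B", "def:A", "park:X"]),
    (5, ["off:A", "def:B", "park:Y"]),
    (4, ["off:B", "def:A", "park:Y"]),
]
estimates = fit_additive(games, names)
for name in names:
    print(name, round(estimates[name], 3))
```

With no games at all every estimate stays at zero (average), and each additional game moves the variables involved a little further from it, which is the spirit of the average assertion described above.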

RESULTS OF FSIA CALCULATIONS

1) What do the results look like?

Top Fifty Teams since 1900
(In terms of Intrinsic Run Differential per Game)

Year Team InRD/G
1939 NYA 2.265
1927 NYA 2.197
1902 PIT 2.000
1936 NYA 1.945
1931 NYA 1.869
2001 SEA 1.790
1998 NYA 1.772
1906 CHN 1.723
1937 NYA 1.702
1929 PHA 1.644
1905 NY1 1.592
1932 NYA 1.586
1942 NYA 1.567
1944 SLN 1.530
1942 SLN 1.525
1904 NY1 1.504
1935 CHN 1.492
1953 BRO 1.492
1931 PHA 1.490
1969 BAL 1.489
1901 PIT 1.475
1903 BOS 1.452
1911 PHA 1.448
1998 HOU 1.447
1934 DET 1.441
1948 CLE 1.439
2001 OAK 1.410
1975 CIN 1.401
1921 NYA 1.369
2002 ANA 1.365
1935 DET 1.359
1995 CLE 1.351
1938 NYA 1.339
1974 LAN 1.336
1912 BOS 1.336
1912 NY1 1.330
1949 BRO 1.330
1998 ATL 1.328
1910 PHA 1.325
1942 BRO 1.321
1932 PHA 1.315
1909 PIT 1.298
1999 ARI 1.295
1922 SLA 1.289
1905 CHN 1.288
1955 BRO 1.286
1950 NYA 1.268
1909 PHA 1.266
1953 NYA 1.262
1901 CHA 1.261

Bottom Fifty Teams since 1900

Year Team InRD/G
1909 WS1 -1.480
1904 BSN -1.486
1963 NYN -1.514
1940 PHI -1.522
1906 BSN -1.534
1951 SLA -1.536
1955 KC1 -1.540
1935 BSN -1.548
1974 SDN -1.559
1953 DET -1.565
1920 PHA -1.583
1901 CIN -1.586
1948 CHA -1.590
1910 SLA -1.604
1923 PHI -1.612
1908 SLN -1.618
1979 OAK -1.619
1937 SLA -1.628
1952 PIT -1.635
1924 BSN -1.636
1926 BOS -1.636
1909 BSN -1.640
1925 BOS -1.653
1956 WS1 -1.676
1969 SDN -1.682
1941 PHI -1.701
1942 PHI -1.702
1905 BRO -1.719
1928 PHI -1.726
1939 PHI -1.749
1919 PHA -1.764
1904 WS1 -1.767
1921 PHI -1.773
1945 PHI -1.775
1954 PIT -1.778
1962 NYN -1.783
2002 DET -1.784
1936 PHA -1.812
1916 PHA -1.834
1939 SLA -1.849
1938 PHI -1.873
1911 BSN -1.887
1903 SLN -1.901
1996 DET -1.925
1932 BOS -1.930
2004 ARI -1.940
1954 PHA -1.974
1939 PHA -2.009
1915 PHA -2.024
2003 DET -2.112

Prior to 1900, the intrinsic run differentials start to take off in magnitude owing largely to the wildly uneven distribution of talent, unstable franchises, shorter schedules, and higher run scoring environments that make up the 19th century game, but the 1899 Cleveland Spiders...widely recognized as the worst baseball team ever to complete a season...finish dead last among teams to play at least 80 games, with an abysmal -4.048 InRD/G (that's 624 runs they allowed more than they scored INTRINSICALLY...they were that bad all on their own!).

It should be noted that these intrinsic calculations included the intrinsic offenses and defenses of each team as well as the team's unique park reactions (because park reactions are a skill that should be accounted for when rating the merits of teams).

2) Benefits of the FSIA

A) This represents the first ever system that has made an attempt to credit the players at least in part for helping to create the changes in the run scoring environment.

Typically, the credit awarded to the offenses and defenses (one way or the other depending on the conditions in the league) is on the order of 50-400 Runs over the course of an entire season for an entire league, so the credit is relatively small, but certain extreme seasons like 1999 in the National League, or 1968 in the NL, or 1987 in the AL swing further (1999 for instance gives almost as much credit to the hitters as the leagues themselves for the huge spike in offensive production).

B) Park adjustments are significantly more conservative, and stable over time compared to ratio factors currently available. When you apply a ratio factor of 120 (the Coors Field effect) to a player season, you get a rather extreme result...when you apply a cumulative adjustment of one additional run expected every 27 batting outs to the same season (the park added roughly 160-180 runs each year to the scoring from both sides combined), the park's pull on the hitter's value will be somewhat muted (though still very real). FSIA park factors are significantly less prone to wild fluctuations from season to season and reflect our belief that most parks have a very minor effect on scoring and that it's only a few extreme parks at either end of the spectrum that can really be counted on from year to year to have a certain impact. Stable park factors were made possible by switching to cumulative math, and by factoring out the unexpected fluctuations in the reactions of players to the parks (and thus neutralizing the home-team bias problem mentioned earlier).

C) This represents the first complete effort to separate the intrinsic abilities of teams from their contexts, while being able to reproduce real-world statistics with a high degree of accuracy. One of the problems with Baseball Prospectus's EqR statistic is that while it is a fairly aggressive attempt to put all players on a level playing field, it does not in any way model actual run scoring (it's not intended to...it's a conceptualized ideal league environment based on the average EqA being .260), so it's not particularly useful for doing any kind of top-down win analysis (you can't use EqR to predict how many runs a team will score and allow). The FSIA not only places players on a level playing field...it models the real world too.

THE ACCURACY OF REAL-WORLD MODELLING WITH THE FSIA

Using run differential data totalled up for each league and season, we were able to determine a series of encouraging error-statistics that we hope will make it clear that the FSIA is a highly accurate intrinsic analysis tool for use in real-world modelling.

First we tested its ability to accurately reproduce league run scoring results from the components. The largest discrepancy we found when comparing real-world run scoring totals to the FSIA generated RS was 68 runs. The error range was -68 to +49. To put this in clearer terms, on a per game basis, the error range was -0.030 to +0.027 R/G. In the worst case scenario, we're talking about maybe a 1% error (more likely closer to half a percent). The root-mean-square-error (standard deviation of the error) was a mere 8.7 runs...in an average league which scores something like 8,000-11,000 runs!!

Next we tested its ability to accurately predict runs scored and allowed by teams. We expected a larger error here, because the fewer games you have in a sample, the more the Law of Succession will play a part in pulling that sample variable toward the mean. This model will tend to underestimate the spread of run differentials in the case of extreme teams, partially because it is a proper statistical question whether we have seen the entire distribution of outcomes when the sample is reduced in size to 162 or 154 games (in most cases), and partially because in the case of extreme teams, we begin to run into a new error source which we are working toward correcting and which will be discussed in our future research plans below.

In any event, we did get a larger error here, but it was far smaller than even I had expected. The error range was -47 runs to +52 runs...or in terms of runs per game...-0.315 to +0.326 runs/game. In the worst case scenario we're looking at something like a 6-8% error, but this wasn't all that common.

The RMSE for team offenses was 14.7 R and it was 20.4 R for team defenses. Given that the average team scores and allows about 770 runs over the course of major league history, the "typical" error is something more like 2-3%.
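For anyone who wants to run the same kind of check on their own numbers, here is a short Python sketch of the error statistics quoted above. The four team totals are hypothetical; only the 14.7, 20.4, and 770 figures come from the post.

```python
import math


def rmse(actual, predicted):
    """Root-mean-square error between two equal-length lists of run totals."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def percent_of_average(error_runs, average_runs):
    """Express an error in runs as a percentage of a typical season total."""
    return 100.0 * error_runs / average_runs


# Hypothetical actual and modelled team run totals.
actual = [770, 812, 695, 740]
modelled = [758, 820, 710, 735]
print(round(rmse(actual, modelled), 1))

# The quoted RMSEs against the roughly 770-run average team.
print(round(percent_of_average(14.7, 770), 1), round(percent_of_average(20.4, 770), 1))
```

The two quoted RMSEs work out to roughly 1.9% and 2.6% of 770 runs, consistent with the 2-3% figure. One caveat: RMSE equals the standard deviation of the error only when the average error is zero.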

That error shouldn't really even fully be called error, since, particularly in the case of teams with shorter schedules or extreme teams, all laws of probability suggest that a center-pull is wise (there is an increased probability that what we've seen out of a team with a shorter schedule or an extreme team is just a part of the distribution and that if those games were replayed under identical conditions, a somewhat less severe result would occur).

PRIMARY SOURCE OF REAL ERROR

Aside from random chance and the center-pull inherent to Bayesian probability, the primary problem with the FSIA is that there is one somewhat incorrect assumption required to make it work. The FSIA is a system of LINEAR equations. But we already know from research done by Bill James that teams do not combine LINEARLY to produce wins and losses...and they probably don't combine linearly to produce runs either. Teams and the contexts in which they play combine very NEARLY linearly when winning percentages of those contexts fall inside a range near .500 (.400 to .600 is considered the acceptable range of the linear assumption). The FSIA works very well for most of the variables it evaluates...but two kinds of variables are sometimes vulnerable to error: park reactions, which are very small samples prone to random fluctuations that make them appear extreme and therefore force them outside the range where the linear assumption holds, and extremely good and poor teams.

FUTURE RESEARCH

Randy and I have already planned out the concepts for the final advancement of our intrinsic analysis and are beginning work on a non-linear solver for systems of equations following a form pioneered by Bill James called "log5". More details on the log5 system when we are ready with new results, but as you have seen, the FSIA is already very accurate in just about every case, and ready for application to player evaluation models like PCA.
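As background on why the linear assumption holds near .500 and fails at the extremes, here is a Python sketch comparing Bill James's published log5 formula for head-to-head win probability with its straight-line approximation. How the planned non-linear solver will use log5 is not described in this post, so this shows only the formula itself.

```python
def log5(a, b):
    """log5 estimate that a team with winning percentage a beats one with winning percentage b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


def linear(a, b):
    """Linear approximation: start at .500 and add the difference in winning percentage."""
    return 0.5 + a - b


for a, b in [(0.550, 0.450), (0.600, 0.400), (0.750, 0.250)]:
    print(a, b, round(log5(a, b), 3), round(linear(a, b), 3))
```

For a .550 team against a .450 team the two estimates are nearly identical, at .600 against .400 they differ by less than a point of winning percentage, and at .750 against .250 the linear version claims a certain win while log5 gives .900.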

Adding to our work on log5, we are beginning to strategize on how to improve the accuracy of dynamic linear weights...more details on that at a later time.

It should also be noted that the FSIA makes no attempt to correct for the strength of a league...that's another project entirely. We're working on ways to try to quantify the competitiveness and depth of a league as well, but that'll take some time.

I think I've written quite enough for one day...anyone still reading this...I salute you for taking the LOOONG time necessary to digest it all and I thank you for reading.

Thoughts? Quibbles? General wonderings?

leecemark
01-20-2006, 01:41 PM
--I did have a major quibble until I got to the very end of your piece, where you say that league strength is not factored into your calculations. Exhibit A being your #3 team the 1902 Pirates. The National League had been decimated by raids from the AL with the exception of Pittsburgh, which returned its pennant winning roster from the year before virtually untouched.
--I would say your system (at first glance anyway, I haven't yet digested the whole concept) is as good as any at telling us how successful a team was. Whether it tells us how good it was is another story. Those are not always the same thing, even without considering league quality. The 2001 Mariners, for example, were wildly successful, but that success was fueled by some flukishly good seasons. I would not pick that roster as one of the best in history by any stretch of the imagination.

SABR Matt
01-20-2006, 08:25 PM
A perhaps more revealing look at "goodness"...at least relative to the league...(again...this is still without strength of league...which we're working on)...is through the use of intrinsic strengths only.

Using that measure and eliminating the "unique reactions to parks" the '01 Mariners not only drop out of the top ten...they drop behind the '01 ATHLETICS for tops in the 2001 AL.

We feel that while it is imperative that players be given credit for reacting well to the parks in which they play in the rating and ranking process...intrinsic strengths will prove to be more predictive of team performance in the future. Totally divorced from context...the A's were a better team than the Mariners in 2001...in fact Seattle drops from an RD of 1.77 to one closer to 1.4...and Oakland retains much of its 1.5-ish success.

That having been said...although it is true that Bret Boone had a fluke season in '01...he is the ONLY 2001 Mariner that I can think of that performed far afield from his career line in 2001. The main things that made those Mariners a success were enormous depth and a stellar team defense...things that are hard to see upon a visual inspection of a roster looking for "star power".

SABR Matt
01-21-2006, 04:38 AM
Great hitter's and pitcher's parks by the FSIA

Top 50 hitter's Parks in terms of NET Park Adjustment (the weighted and combined park factors for all parks played in by the team whose home park is listed here...this is done so you can see what kind of mathematical adjustment will actually be applied to team and player contexts)...1900 and beyond

Team Year NetPkA
COL 1996 0.794
COL 2000 0.702
COL 1995 0.692
COL 1999 0.665
KCA 2002 0.649
PHI 1925 0.601
TEX 2002 0.591
COL 1993 0.589
PHI 1923 0.570
KCA 2001 0.540
TEX 1998 0.519
PHI 1929 0.514
PHI 1930 0.513
COL 1998 0.472
PHI 1933 0.470
MIN 2000 0.461
CLE 1998 0.460
PHI 1935 0.458
COL 2004 0.442
PHI 1922 0.440
BOS 1950 0.430
BOS 1955 0.426
PHI 1936 0.409
CHA 2000 0.401
SEA 1999 0.401
COL 2001 0.400
BSN 1911 0.399
PHA 1932 0.397
CHA 2004 0.397
BOS 1977 0.394
COL 1997 0.391
PHI 1932 0.391
KCA 1998 0.390
TEX 2000 0.388
CHN 1970 0.384
OAK 2002 0.381
KCA 1997 0.379
DET 1937 0.374
COL 1994 0.371
MIN 1999 0.367
CIN 1903 0.367
PHA 1902 0.366
SLA 1930 0.358
TEX 2004 0.356
PIT 1951 0.356
TOR 2004 0.353
KCA 2000 0.352
TEX 1999 0.351
ATL 1977 0.351
CLE 2002 0.344

Fifty greatest pitcher's parks since 1900 by the FSIA

Team Year NetPkA
SDN 1998 -0.526
SFN 1999 -0.523
PHI 2002 -0.495
CHA 1903 -0.491
SLA 1903 -0.491
FLO 1999 -0.440
CLE 1903 -0.420
MON 1998 -0.418
SDN 2002 -0.417
DET 1903 -0.416
FLO 2002 -0.405
SFN 2001 -0.401
HOU 1995 -0.388
OAK 1973 -0.386
LAN 1964 -0.386
HOU 1999 -0.385
LAN 2001 -0.384
LAN 1970 -0.372
BSN 1938 -0.372
SDN 2001 -0.371
BSN 1934 -0.368
LAA 1964 -0.367
PHA 1903 -0.364
ATL 1999 -0.362
LAN 2002 -0.359
HOU 1976 -0.358
NYA 1903 -0.354
LAN 1998 -0.345
CHN 2000 -0.344
CIN 2004 -0.343
NYN 2002 -0.341
NYN 2000 -0.341
SDN 1972 -0.333
NYA 1951 -0.332
ML1 1958 -0.331
NYA 1939 -0.330
BSN 1950 -0.328
CLE 1952 -0.328
SDN 1999 -0.324
BAL 1962 -0.323
BSN 1926 -0.322
ARI 1999 -0.318
HOU 1981 -0.314
LAN 1967 -0.314
SFN 2002 -0.313
NYN 2001 -0.312
CHA 1932 -0.312
CHA 1965 -0.311
CAL 1972 -0.310
LAN 1997 -0.309

Parks that make appearances on either list tend to do so more than once most of the time...there are a lot of year-families (a single park will appear multiple times in a number of adjoining years...which is what we'd expect)...and the parks appearing on these lists are not at all unexpected as far as I can tell.

Reminder...these figures represent how many Runs per Game (per side) a park adds to scoring. When the 1996 Rockies played out their entire schedule, the net amalgam of parks in which they played (weighted by games obviously)...including Coors Field for 81 games...added 0.794 runs per game to their own scoring and to the scoring of their opponents. That league was about a 4.5 R/G league so that's something like the equivalent of claiming they had a weighted park adjustment of 117 (which is the equivalent of saying they had a park factor of about 134).

0.794 R/G is about 129 runs on a whole season for the whole team. For the average line-up spot...that's about 14 runs (1/9th the team total)...
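The conversions in the last two paragraphs can be reproduced with a few lines of Python. The 0.794 and 4.5 figures are from the post; the 162-game schedule and the "half the games at home, neutral road slate" rule for backing out a home park factor are assumptions made for this sketch.

```python
ADDED_RPG = 0.794    # net park adjustment for the 1996 Rockies, from the table above
LEAGUE_RPG = 4.5     # approximate league scoring rate quoted in the post
GAMES = 162          # assumed schedule length
LINEUP_SPOTS = 9


def additive_to_ratio(league_rpg, added_rpg):
    """Ratio-style (100 scale) equivalent of an additive adjustment at a given league rate."""
    return 100.0 * (league_rpg + added_rpg) / league_rpg


def home_factor_from_net(net_factor):
    """Home park factor implied by a net factor, assuming half the games
    are at home and the road parks average out to neutral."""
    return 100.0 + 2.0 * (net_factor - 100.0)


net_factor = additive_to_ratio(LEAGUE_RPG, ADDED_RPG)
season_runs = ADDED_RPG * GAMES
per_spot = season_runs / LINEUP_SPOTS

print(round(net_factor, 1))                 # weighted (net) ratio equivalent
print(round(home_factor_from_net(117), 0))  # home factor implied by a net 117
print(round(season_runs, 1), round(per_spot, 1))
```

Under those assumptions the net factor comes out a little under 118, a net 117 corresponds to a home factor of 134, and the season totals land at roughly 129 team runs and 14 runs per lineup spot.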

Your commonly available park adjustment from baseball-reference.com for the Rockies in 1996 is 131...mine is 117-ish. When I calculated park factors using three-year normally weighted averaging a la James I got a number about twice as aggressive as the number I'll be using now.

Just to give you an example of the more conservative nature of FSIA park factors.

And I was just looking at an extreme team...there are many more teams hovering far closer to neutrality in the FSIA model than there are using standard park factors.

538280
01-21-2006, 07:52 AM
I haven't quite gotten a chance to read the whole thing yet, but obviously the major problem is the lack of an LQ adjustment. Matt, earlier you introduced a way of quantifying league quality, couldn't you somehow incorporate that into your system?

Looking at the results, they don't seem horrible. The 1939 Yanks as the best team ever is a conclusion that has been reached by a lot of statisticians. You have also reached the same conclusion as others that the Yankees of the early 50s just weren't all that great either, despite their 5 World Series titles. The Brooklynites may suffer a group stroke seeing their '55 team that low (though their '53 team is very high).

About the FSIA results, what the hell are the Great American Ballpark and Wrigley Field doing so high up there on the pitcher's parks? And Kauffman Stadium? I've always thought of that as a neutral park, you have it as an extreme hitter's park.

I don't know if that helps, but those are just a few strange things I found in the results. I'll try to read through the system's details when I get time.

SABR Matt
01-21-2006, 08:29 AM
My first attempt at league quality was pretty good, but I was not convinced that it was really seeing league quality changes and league quality changes ONLY...nor was I convinced I had the right method for converting that league quality estimate into a percentage. I want to exhaust all possibilities for how to measure league quality including some rather difficult to calculate ideas like measuring interquartile ranges of PCA Wins Created ratings and attempting to quantify the idea that weak leagues cause dramatic shifts in the rating patterns of players (Zwilling goes from bench player to all star when he hits the Federal league...Ace Adams goes from great reliever to crappy last man when WWII ends...etc)...I believe looking at changes from season to season in rating patterns can reveal something about league quality.

That '55 WS team was not the best team Brooklyn produced...though being low in the top 50 isn't exactly a BAD thing (if you finish in the top 50...that's pretty impressive considering there've been 2100 teams since 1900).

Wrigley isn't a hitter's park. Not anymore. That myth needs to be put to rest. It got its reputation as a great hitter's park back in the 70s when it represented one of the smallest parks in baseball. Nowadays...it's playing as at best a near-neutral park in most seasons and at worst a pitcher's park.

As for Great American...in its opening season it did play strongly as a pitcher's park...I'm certainly not the only guy to reach that conclusion...one odd thing I've observed is that a lot of new parks play extremely in their first season or two...

Minute Maid played as an extreme hitter's park in 2000...since then it's been very mild as hitter's parks go. GAB played extreme in 2004...less so in 2005. There are a number of examples like this...it seems that a lot of new parks play extreme and then the entire league adjusts (something that wouldn't be seen in unique reactions to parks because if the whole league is doing it...it becomes expected).

Ubiquitous
01-21-2006, 08:30 AM
Over 100 years of baseball, over 2000 team seasons, and two of the 6 WS winning 50's Yankee teams show up in the top 50, or I should say two of the teams show up in the top 2% of history. Not bad for a great team.

Though I don't think anybody in a million years would have picked the 1935 Cubs as the 17th greatest team of all time.

SABR Matt
01-21-2006, 09:17 AM
The 1935 Cubs were VERY successful...they dominated a very weak National League...that would be why they ended up where they did.

And I agree...the 1950s Yankees were good...not "all-time awesome" but certainly a solid dynasty.

Ubiquitous
01-21-2006, 09:39 AM
They only won 100 games against that very weak league. Yet the 1954 Indians win 111 games and they don't show up.

SABR Matt
01-21-2006, 09:54 AM
The FSIA is based on runs, though...not wins...the 1954 Indians WAAY out-won what you'd expect from a team with their runs scored and allowed...and while we've had debates about this before I continue to believe that wins are too prone to random chance to use as a measure of statistical success.

Obviously there remains the possibility that some teams overperform their pythagorean for a reason (or in this case their intrinsic run differential)...perhaps performance in the late innings...but I am of the increasing belief that this is not likely to be a major factor...I couldn't find any obvious pattern in the group of teams who outperformed their RS/RA distributions...it seemed more like random chance than anything one could/should credit the players for.

Ubiquitous
01-21-2006, 11:29 AM
The 1954 Indians were expected to win 104 games. They won 111. The 1935 Cubs were expected to win 101 games; they won 100. According to the Indians' run differential, they should have won more games than the Cubs' run differential says the Cubs should have won. Yet it's the Cubs that get ranked 17th all time and the Indians on the outside looking in.
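The "expected" win totals above look like Pythagorean projections from runs scored and allowed. Here is a minimal sketch of that calculation, using the run totals quoted later in this thread. The 154-game schedule and the 1.83 exponent are assumptions (the post doesn't say which version of the formula was used; the classic version uses an exponent of 2).

```python
def pythagorean_wins(runs_scored, runs_allowed, games=154, exponent=1.83):
    """Expected wins from season run totals (Pythagorean expectation)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)

# Run totals as quoted in this thread; games and exponent are assumptions.
indians_1954 = pythagorean_wins(746, 504)
cubs_1935 = pythagorean_wins(847, 597)

print(f"1954 Indians: {indians_1954:.1f} expected wins")
print(f"1935 Cubs:    {cubs_1935:.1f} expected wins")
```

Under these assumptions both projections land within about a game of the 104 and 101 quoted above, and the Indians come out ahead of the Cubs, which is the point being argued.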

SABR Matt
01-21-2006, 11:49 AM
Complete breakdown of the two teams.

1954 Indians
Intrinsic Offense -> 0.215 RAA/G
Intrinsic Defense -> 0.767 RAA/G
1954 AL -> -0.197 RAA/G
Net Park Adjustment -> -0.037 RAA/G
Offensive Park Reactions -> 0.072 RAA/G
Defensive Park Reactions -> 0.104 RAA/G
All-time scoring average -> 4.76 R/G

Actual RS -> 746
FSIA projected RS -> 714.6

Actual RA -> 504
FSIA projected RA -> 533.8

What conclusion can we draw here? They scored significantly more than the FSIA thinks is likely from them if the games were repeated, and allowed significantly fewer than the FSIA thinks is likely if the games were repeated. There are two possible explanations...either their park reactions were flukey (and therefore prone to the linear error mentioned in the initial post here) or they beat the crap out of bad teams and took part in some lopsided series that the FSIA doesn't see as being likely to repeat if the games were replayed under identical conditions.

By comparison, the 1935 Cubs looked like this.

Intrinsic Offense -> 0.761
Intrinsic Defense -> 0.487
1935 NL -> 0.059
Net Park Adjustment -> -0.002
Offensive Park Reactions -> 0.099
Defensive Park Reactions -> 0.144

Actual RS -> 847
FSIA Projected -> 838.5
Actual RA -> 597
FSIA Projected -> 608.6

There is a center pull here too, but not nearly as severe as in the case of the '54 Indians.
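To put a number on the two center pulls, here is a rough sketch that just adds up the gap between actual and FSIA-projected totals on both sides of the ball. Measuring the pull as "runs trimmed from RS plus runs added back to RA" is my own shorthand for illustration, not necessarily how the FSIA reports it.

```python
# Actual and FSIA-projected season totals as posted above.
teams = {
    "1954 Indians": {"rs": 746, "proj_rs": 714.6, "ra": 504, "proj_ra": 533.8},
    "1935 Cubs": {"rs": 847, "proj_rs": 838.5, "ra": 597, "proj_ra": 608.6},
}

def center_pull(t):
    """Runs of differential removed by the projection (offense + defense)."""
    offense_pull = t["rs"] - t["proj_rs"]   # runs scored trimmed away
    defense_pull = t["proj_ra"] - t["ra"]   # runs allowed added back
    return offense_pull + defense_pull

for name, t in teams.items():
    print(f"{name}: {center_pull(t):.1f} runs pulled toward center")
```

By this measure the Indians lose roughly three times as many runs of differential as the Cubs do.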

Why the difference between the two teams? Perhaps the FSIA feels more confident that the rest of the 1935 NL was significantly weaker and therefore the Cubs' strength of schedule was bad enough that probabilities increased for better run differentials...whereas Cleveland had direct competition (the 1954 Yankees rated as a better team than the Indians...for instance...) and so the FSIA was less confident about the potential for a repeat?

SABR Matt
01-21-2006, 12:22 PM
Some breakdowns for you...

The 1935 Cubs were the model of sabermetric consistency.

They played each opponent almost exactly like you'd expect them to have played...the .248 W% Braves they beat by 75 runs in 22 games. The .411 Phillies they beat by 52 runs and the .441 Reds they beat by 31 runs...

In fact they had no sabermetric trouble beating anyone in the NL except the second place Cardinals...who they played a little behind with -18 runs...(and that was a good team).

The 1954 Indians, on the other hand, were outscored by the third place White Sox by ten runs (about half a run per game), played even with the Yankees, and against the .411 Senators (same W% as the second worst team in the 1935 NL) they managed only a +32 margin...

These were all negative drags on the FSIA's probabilistic modelling of their performance. It looks like the Indians had some trouble playing with the teams right behind them in the standings and didn't beat up on the dead weights the way you'd expect a 111 win team to beat up on them.

That's probably why the FSIA gives that Cubs team just about every run they earned in the real world but the Indians have a heavy center pull.

It's worth mentioning that the 1955 Indians collapsed rather badly...there may have been warning signs in the play of the '54 team that the FSIA sniffs out...they may have been a lot better in real-world record than they actually were in personnel.

SABR Matt
01-21-2006, 12:28 PM
One other point worth making...

The FSIA by its very nature heavily discounts blowouts. If you beat someone 22-0, it sees that as more probably representing 17-4 or even 15-5...in terms of what the score would be if that game were played again. So if one of these two teams had some blowouts throwing off certain match-ups, it could play a large (and justified) role in altering their final rank.

A run against a .600 team has a LOT more value than a run against a .300 team in this model...it goes game by game by game counting the intrinsic runs (based on Bayesian probability)...

The 1935 Cubs played +31 against .500+ teams in 66 games, whereas the 1954 Indians played -6 against the two .500+ teams...that's the story here...

The '54 Yankees played the other good teams WAY better than the '54 Indians.

leecemark
01-21-2006, 02:58 PM
--Matt, I'd agree that only Bret Boone had a season which exceeded his norm by a wide margin. What led to that great record was EVERYBODY playing above their norms by at least a little. At least half the roster had a season that was arguably their career best and most of the rest were above their career averages.

RuthMayBond
01-21-2006, 03:08 PM
The 1935 Cubs were VERY successful...they dominated a very weak National League...

But what's the big deal if the league was so weak?

RuthMayBond
01-21-2006, 05:16 PM
One other point worth making...

The FSIA by its very nature heavily discounts blowouts. If you beat someone 22-0, it sees that as more probably representing 17-4 or even 15-5...in terms of what the score would be if that game were played again. So if one of these two teams had some blowouts throwing off certain match-ups, it could play a large (and justified) role in altering their final rank.

A run against a .600 team has a LOT more value than a run against a .300 team in this model...it goes game by game by game counting the intrinsic runs (based on Bayesian probability)...

The 1935 Cubs played +31 against .500+ teams in 66 games, whereas the 1954 Indians played -6 against the two .500+ teams...that's the story here...

The '54 Yankees played the other good teams WAY better than the '54 Indians.

I like your ideas, and you're uncovering a lot of things the average Joe Blow wouldn't. Great job :clapping

SABR Matt
01-21-2006, 06:31 PM
But what's the big deal if the league was so weak?

Well, I agree...I wouldn't rank the '35 Cubs as the 17th best team all time...this model can't determine strength-of-league, though. I will have to figure out a way to do that separately (and I am working on it)...this model is intended to show probabilistically who was the most "successful"...given the competition they had. It's entirely possible that when strength-of-league is considered, even despite playing only even-money (-6 runs is not a huge amount) against the other good teams of 1954's AL, the '54 AL was significantly stronger than the '35 NL...and the '54 Indians might end up being called a "better" team than the '35 Cubs because of that.

SABR Matt
01-21-2006, 06:33 PM
--Matt, I'd agree that only Bret Boone had a season which exceeded his norm by a wide margin. What led to that great record was EVERYBODY playing above their norms by at least a little. At least half the roster had a season that was arguably their career best and most of the rest were above their career averages.

What caused 2001 was that everyone played at least to expectations, no one got injured aside from Guillen's TB thing, and we had GODLY depth at every position so everyone was always well rested.

538280
01-21-2006, 06:47 PM
Wrigley isn't a hitter's park. Not anymore. That myth needs to be put to rest. It got its reputation as a great hitter's park back in the 70s when it represented one of the smallest parks in baseball. Nowadays...it's playing as at best a near-neutral park in most seasons and at worst a pitcher's park.

And yet one year it jumps up and becomes one of the best pitchers' parks of all time? This is my main problem with trying to quantify park effects on a year to year basis. NOTHING has changed about Wrigley 1999 to 2000, and yet in 1999 it's about neutral and in 2000 it's one of the best pitcher's parks ever. I'm sorry, but I think the explanation for that is nothing but pure coincidence. I think park effects would be more accurate if you gave a park a set factor for a five year period (unless of course the fences were moved back or something like that).


As for Great American...in its opening season it did play strongly as a pitcher's park...I'm certainly not the only guy to reach that conclusion...one odd thing I've observed is that a lot of new parks play extremely in their first season or two...

Minute Maid played as an extreme hitter's park in 2000...since then it's been very mild as hitter's parks go. GAB played extreme in 2004...less so in 2005. There are a number of examples like this...it seems that a lot of new parks play extreme and then the entire league adjusts (something that wouldn't be seen in unique reactions to parks because if the whole league is doing it...it becomes expected).

But the problem is that the GAB has garnered a reputation as a hitter's park. Looking at park factors, I'm not quite sure why though.

SABR Matt
01-21-2006, 07:14 PM
The GAB looks kind of intimidating...it's not a huge park...but it has definitely played as a pitcher's park. Why this is...I'm not sure...it might be a glare problem...perhaps the weather wasn't favorable for good hitting in Cincy the last two years...who knows...all available evidence suggests that it is playing as a pitcher's park...even when you pull out the personnel.

As for Wrigley...I tend to agree that 2000 was a bit flukey...even my rating effort has a few of those...though they are far less common than with traditional park factors...and down the road I foresee doing a little more smoothing of the data...we may run the FSIA on multi-year blocks of time (take the statistics for each team and accumulate them over many years instead of just going season by season) and see what happens to the park factors...we're not done experimenting. :)

antihipster
01-21-2006, 07:25 PM
Very interesting read.

While I do not have the time or the amount of data that Matt has, I have been tinkering with using probability in my version of stats. Once again, they are not as advanced as Matt's.

SABR Matt
01-21-2006, 11:54 PM
I would definitely encourage anyone interested in rating the teams to consider who they played, how they played against them, and how it compares to what you would expect given the strengths and weaknesses of each team in a match-up.

It's very clear that the '54 Indians weren't really a 111 win team in strength...they managed to win 111 games, but they did so in a way that suggests if they tried it again, they would come up well short.

Ubiquitous
01-22-2006, 12:23 AM
The Indians went 22-22 against the next two teams. The Cubs went 22-22 against the next two teams. The Indians scored 185 runs and allowed 191. The Cubs scored 201 runs and allowed 201 runs. Almost all of the difference in RS/RA for the Indians came in a 4 game series in which they lost all four games by a combined score of 5 to 22. Besides the next top two teams, the Indians won at an insane rate: 82% against Philadelphia and Washington, 86% against Baltimore, and 91% against Boston! With a mere 64% against Detroit.

The Cubs only played one team to anything close to the dominance the Indians did, and that was against the worst team in the league, Boston.

SABR Matt
01-22-2006, 12:41 AM
You're still focusing on the wins...we focus on runs specifically because (and this has been proven by other sabermetricians) they're more predictive.

The Indians did in fact truly dominate the weaker teams in the AL in 1954...that, however, is the point I was making. Their record, and more to the point their RD, is entirely explained by pounding the crap out of bad teams. Winning at an 85% clip against loogies has far far less predictive value than beating good teams by much smaller amounts. Aside from losing 18 runs to the Cardinals, the Cubs have significant RS advantages against every team in the 1935 NL...including two other clubs who were above .500. The Indians barely held their own against the other good competition in the AL, and did surprisingly poorly against the third worst team in the league, managing to outscore the Senators by only 33 runs in 22 games.

In order to be a greatly successful team...you must beat the other successful teams against whom you are competing in the run scoring game. The Cubs did this...they were +32 against the top three teams in the 1935 NL (including -18 against the Cards and +50 against the other two)...the Indians did not do this (barely over the zero mark against the top three teams in the 1954 AL). Whereas for instance, the Yankees were significantly above zero in RD against the same competition.

TKD
01-22-2006, 04:12 AM
I got some requests on my site about how FSIA works on a lower level, so I'll post my explanation here as well. It's heavy on the linear algebra and Bayesian probability, but I'll try to explain it clearly nonetheless. Let me know if any of this is unclear.

FSIA basically boils down to the equation that, for any game,

OIRAAPG - DIRAAPG + LIRAAPG + PIRAAPG + OPR - DPR = Actual runs above all-time average

That is, run scoring is a combination of:

- The all-time average (currently 4.76-ish)
- One team's offensive strength
- The other team's defensive weakness
- The home team's league conditions
- The park conditions
- The first team's offensive reaction to the particular park
- The other team's defensive reaction to the particular park
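As a rough illustration of how those pieces stack up for a single offense-versus-defense pairing, here is a small sketch. The function and argument names are made up for this example, the input values are hypothetical, and the defensive terms are taken in "strength" units so they subtract, matching the signs in the equation above.

```python
ALL_TIME_AVG = 4.76  # R/G figure quoted in this post

def expected_runs(off, opp_def, league, park,
                  off_park_reaction, def_park_reaction,
                  all_time_avg=ALL_TIME_AVG):
    """Expected runs for one offense against one defense in one game.

    Mirrors: OIRAAPG - DIRAAPG + LIRAAPG + PIRAAPG + OPR - DPR,
    added on top of the all-time average.
    """
    return (all_time_avg + off - opp_def + league + park
            + off_park_reaction - def_park_reaction)

# Hypothetical inputs chosen only to show the arithmetic.
example = expected_runs(0.50, 0.25, -0.10, 0.05, 0.02, 0.01)
print(f"{example:.2f} runs expected")
```

With every component at zero the function simply returns the all-time average, which is the baseline everything else is measured against.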

Each of these, except for all-time average, is a variable. So, in the modern league, there are 30 team offensive strengths, 30 team defensive strengths, 2 league variables, 30 park variables, a maximum of 900 offensive park reactions (but about 1/3 of these are 0 and are consequently dropped altogether), and a maximum of 900 defensive park reactions.

So, we've got, in practice, about 1,180 (or so) variables for a single year. So FSIA constructs a matrix of linear equations, one for each variable. A different variable is made dependent for each equation, and the other variables become weighted sums of the other variables that affected this variable over the course of the season.

Now, the only problem is that, if two teams have the same schedule (which they did up until a few years ago), you won't be able to solve the system of equations because some equations will look exactly the same. This condition is known in linear algebra as singularity, and I'll give you a small example to illustrate:

2x + 2y + z = 1
2x + 2y + 3z = 3

Here, we can find out that z = 1. But then we're left with:

2x + 2y = 0
2x + 2y = 0

That's essentially one equation with two degrees of freedom; you don't know what x and y are, nor do you have any way of finding out, except that one is the negative of the other. That's obviously not helpful.
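A quick way to see the problem numerically is to check the determinant of the leftover x/y coefficients. This is only a toy check on the two-equation example above, not part of the FSIA code.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

# Coefficients of x and y left over once z = 1 has been substituted.
coeffs = [[2, 2],
          [2, 2]]
print("determinant:", det2(coeffs))  # zero means no unique solution

def satisfies(x, y, z=1):
    """True if (x, y, z) solves both of the original equations."""
    return 2 * x + 2 * y + z == 1 and 2 * x + 2 * y + 3 * z == 3

# Any pair where y is the negative of x works, so x and y are not pinned down.
print(all(satisfies(x, -x) for x in (-3, 0, 0.5, 7)))
```

A zero determinant is the numerical signature of a singular system: infinitely many (x, y) pairs fit, so a solver has nothing to choose between them.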

As such, this is where Bayesian probability comes in. This was one of the key points in Dr. Colley's paper describing his matrix that is now part of the BCS (the paper is available in its entirety from http://colleyrankings.com/matrate.pdf ). The rule of succession in essence concedes that there is a possibility that we have not seen a sample representative of the entire distribution of teams/parks/etc. -- past, present, and future.

In the case of teams, before we have seen a team play, and without any information about that team, we can only assume a priori that the team is somewhere between .000 and 1.000 with equal probability for any point between 0 and 1; the average of this uniform distribution is .500, and this is what we mean when we say that we are assuming the average a priori. As we process games, we realize a posteriori that the team is better or worse than .500, but the possibility still remains, albeit decreasingly so, that we have seen something not representative of the team's strength.

To cut a long story short, the gist is that, for all variables, we "add in" a single game where run-scoring/allowing is equal to the all-time average. This represents our a priori assumption of the average until data proves otherwise. Note that this is a fundamental difference between Bayesian probabilistic statistics and frequentist statistics: the latter would attempt to process only the actual data, perhaps using something like a t-stat to account for small sample sizes, but Bayesian probability addresses this by incorporating the initial fact (that all run scoring has averaged out to ~4.76 R/G) into the data itself.

This extra game is added ONLY to the dependent variable in each equation. Since we are assuming the average, the right-hand side, which is runs above average, remains the same. So the effect is to pull all teams closer to 0. Obviously, for teams that play a full schedule of 154 or 162 games, the pull will be small. But for teams that only play 3 or 4 games in a park, their park reactions will experience a significant center pull because there just simply isn't enough data to deviate too far from 0, without extreme cases like 22-0 blowouts (and even then, those tend to be discounted heavily).
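To get a feel for the size of that pull, here is a deliberately simplified one-variable sketch. In the real matrix every variable is tied to the others, so this isolates just the "one extra average game" effect; the +0.5 R/G figure is hypothetical.

```python
def shrunk_estimate(runs_above_avg_total, games, prior_games=1):
    """Per-game runs above average after adding `prior_games` average games.

    The added games contribute zero runs above average, so only the
    denominator grows and the estimate is pulled toward 0.
    """
    return runs_above_avg_total / (games + prior_games)

# Hypothetical: a variable that is +0.5 R/G above average in the raw data.
for games in (3, 22, 162):
    raw_total = 0.5 * games
    print(f"{games:3d} games: {shrunk_estimate(raw_total, games):.3f} R/G")
```

Over a 3-game sample the extra game takes away a quarter of the raw figure, while over 162 games it barely registers.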

Note that, in my original matrix, I have the signs of defensive variables reversed, so that positive always points toward increasing run scoring, not increasing strength.

Once you have this system of linear equations set up, you can just solve it with a good linear system solver. I use an LU solver in C++, although theoretically the matrix is symmetric positive definite, so you could use Cholesky factorization if you wanted. Don't attempt to do this by hand; even the smallest league (8 teams) will have on the order of 89 independent equations.

I do realize that, unfortunately, very few people actually have at their disposal an equation solver capable of handling 1,000+ variables efficiently. However, let's take a VERY simple example. Suppose that team A plays team B twice, beating them by a combined score of 12-6. Suppose that the all-time average is 5 R/G (just to make the math a bit easier). Let's also ignore park reactions for now, since those don't make sense for a single matchup. Let AO = A's offense, AD = A's defense, BO = B's offense, BD = B's defense, P = park, L = league.

We have that:

3AO + 2BD + 2P + 2L = (12 - 2 * 5)
3BO + 2AD + 2P + 2L = (6 - 2 * 5)
3AD + 2BO + 2P + 2L = (6 - 2 * 5)
3BD + 2AO + 2P + 2L = (12 - 2 * 5)
2AO + 2BO + 2AD + 2BD + 5P + 4L = (18 - 4 * 5)
2AO + 2BO + 2AD + 2BD + 4P + 5L = (18 - 4 * 5)

As you can see, a simple two-team matchup requires 6 equations, 10 if you were to try to do park reactions. Anyway, you should notice that the park and league have 4 games, not 2. That's because each offense-defense pairing is considered a "game", and what A's offense does against B's defense is completely independent of what B's offense does against A's defense.

In traditional analysis, we would say that team A has a +6 run differential, while team B has a -6 differential. But solving the FSIA system above, we find that:

AO = +.523
BO = -.677
AD = -.677 (or +.677 in terms of goodness and not run allowing)
BD = +.523
P = -.154
L = -.154

What this says is that A's offense would probably score, on average, 5.523 R/G against an average team in an average park under average league conditions, while B's offense would score 4.323 R/G in that same environment, etc. Note that the system can't differentiate between the park and league because, given the information that we have, both contributed to the exact same set of results under the exact same conditions, so it assigns the same number to both.
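For anyone who wants to reproduce the small example without a big solver, here is a minimal sketch using plain Gaussian elimination in Python. This is a stand-in for illustration only (the actual FSIA code uses an LU solver in C++), and the column order AO, BO, AD, BD, P, L is just a convention chosen here.

```python
def solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

# Column order: AO, BO, AD, BD, P, L (rows follow the six equations above).
A = [
    [3, 0, 0, 2, 2, 2],  # 3AO + 2BD + 2P + 2L
    [0, 3, 2, 0, 2, 2],  # 3BO + 2AD + 2P + 2L
    [0, 2, 3, 0, 2, 2],  # 3AD + 2BO + 2P + 2L
    [2, 0, 0, 3, 2, 2],  # 3BD + 2AO + 2P + 2L
    [2, 2, 2, 2, 5, 4],  # park equation
    [2, 2, 2, 2, 4, 5],  # league equation
]
rhs = [12 - 2 * 5, 6 - 2 * 5, 6 - 2 * 5, 12 - 2 * 5, 18 - 4 * 5, 18 - 4 * 5]

names = ["AO", "BO", "AD", "BD", "P", "L"]
solution = dict(zip(names, solve(A, rhs)))
for name in names:
    print(f"{name} = {solution[name]:+.3f}")
```

The output should agree with the values listed above to three decimal places. Note that the coefficient matrix is symmetric, which is what would let a Cholesky factorization be used instead.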

TKD
01-22-2006, 04:33 AM
Just FYI, I just checked my database, and I was off on the all-time R/G. I'm not sure why I thought 4.76 R/G, but the number that I'm getting is 4.53.

SABR Matt
01-22-2006, 06:50 AM
I knew that number you quoted me last week seemed high...LOL

Fortunately all of my calculations of all time average RS/G used the real RS and the real G. :)

Ubiquitous
01-22-2006, 08:48 AM
Matt, where are you getting your info for the top three teams? Because you are wrong about the Indians. The Indians scored 52 more runs than they allowed against the top three teams. The Cubs scored only 27 more runs against the top three teams.

SABR Matt
01-22-2006, 09:04 AM
I had only the numbers taken from queries run by TKD...he didn't tell me how Cleveland did against the 4th place team in 1954, they were in fact -6 against the top two...and the third was well below .500 and a much worse team than the one faced by Chicago...and BTW I have my figure right for the '35 Cubs...they were +32 against the top three...not +27...don't know where you got your number...

In any event, the same method that was used to rate 1935 CHN was used to rate 1954 CLE...the probabilistic modelling didn't come from nowhere...go game by game by game...the Cubs had more success overall in the RS/RA columns than Cleveland did against higher winning percentage teams. You have to look at the whole WEIGHT of the game by game distribution.

Ubiquitous
01-22-2006, 09:19 AM
The third team below Cleveland in 1954 was Boston, both in wins and run differential. The Indians outscored them by 58 runs.

Cubs RD:
STL: -18
NYG: +18
PIT: +27
Total: +27

The Cubs outscored their opponents by 8 runs better than the Indians, and they did those 8 extra runs in an era in which more runs were scored.

So even if we just say top two, then it's -6 for the Indians and 0 for the Cubs. The Indians get to -6 because of a few blowout losses against the White Sox; otherwise they are at 0 as well.

Ubiquitous
01-22-2006, 09:26 AM
You have to look at the whole WEIGHT of the game by game distribution.

Okay, against the second place team the Indians outscored them by 4 runs. The Cubs were outscored by 18 runs against the second place team. Advantage Indians. The Cubs outscore the third place team by 18 runs, the Indians get outscored by 10 runs. Advantage Cubs. From there on out it's lopsided RDs for both of them.

SABR Matt
01-22-2006, 09:59 AM
You're looking at ranks...take a look at winning percentages or intrinsic strengths.

Aside from the lowly Braves...the teams behind the Cubs in the '35 NL were much more evenly spread across the intrinsic strength spectrum...the Indians had two good rivals and then a whole bunch of crap below .500. '54 Boston is not the same as '35's 4th place finisher...you can't treat them as though they were.

When you add to that the fact that Cleveland did pretty unimpressive work against the Senators...a .411 team that they played to only +33 (the Cubs played their .411 rival to +52)...I know this seems like semantics...but it's a big deal exactly how those runs are spread around and how strong the teams are against which those runs are spread.

Ubiquitous
01-22-2006, 10:13 AM
The Cubs scored +31 against Cincinnati, which only had two more wins than the Senators.

SABR Matt
01-22-2006, 11:28 AM
The other question is (and I haven't looked at this yet so I don't know the answer yet)...did either of these two teams have an unusual spread of blowout games against certain teams that made their overall match-up RDs look different than they probably should have been? The blowouts are heavily discounted by the FSIA so that could be playing a role...

I don't have the game logs loaded and queryable on this machine presently (we're working on fixing my database, but TKD is steeped with some work right now so time is a problem)...but I want to look game by game and see what the explanation is here...I guarantee there is an explanation...whether it's an acceptable explanation or not is yet to be determined.

Rome Colonel
01-22-2006, 03:46 PM
Interesting thread and an interesting system.

Some thoughts on the Cubs-Indians controversy. These are more observational than analytical comments as I don't assume to be much of an analyst in any sabermetric sense.

1. Both teams blew out (5+ runs) the opposition at an impressive rate:

Cubs 34-8 253-159
Indians 30-9 285-140

2. But the Cubs were unimpressive when it came to winning one run games, while the Indians were outstanding. This may explain why the team beat its projection by 7 wins:

Cubs 25-23 189-187
Indians 32-13 162-143

3. The most distinguishing difference between the two teams is relief pitching. The Cubs didn't have a true reliever. The Indians had two of the best in Mossi and Narleski. The save statistics naturally reflect this and likely account for the Tribe's dominance in one-run games:

Cubs - 14 Saves
Indians - 36 Saves

4. One might expect that the Indians would have had far fewer complete games but that isn't the case. When you combine the Cleveland CG and save numbers you have a pretty good idea of just how dominant their pitching was in 1954:

Cubs - 81 [95]
Indians - 77 [113]

5. Although the Cubs were remarkably consistent when it came to dominating the other clubs (except the Cardinals), they were a very streaky team (the famous 21 game run in September and a 24-3 run in July). Outside of those stretches they were barely a .500 team (53-51).

6. The Indians, who were only .500 against NY and Chicago, tended to be a much steadier team over the course of the season, never winning more than 11 straight but playing over .667 every month but April.

The observations tended to point me to the following conclusions, admittedly not supported by any analysis but perhaps worthy of some further research.

First of all, as we all know it's hard to compare teams from different eras. The fact that relief pitching had become much more important by the 50s tends to suggest that run differentials might not be as significant in evaluating a team's overall strength in that era as they were in the 30s.

Second, when comparing and evaluating pennant winners (or near winners) consistency of success over the course of a season may be a better indication of a team's strength or dominance than their performance against any one team or group of 3 or 4 teams.

Third, if we look ahead to 1936 and 1955, we find that both clubs fell off significantly, but the Indians were on their way to another pennant until they lost 6 of their last 9. The Cubs, on the other hand, were only 29-31 over the last two months, gradually fell out of contention, and had to win their last 2 games in StL to finish tied for second. They'd won 15 straight in June but couldn't muster another late season run.

I'm not sure what to conclude from all this but I think that it may show that the Indians had the more consistent team and, in winning 204 games to the Cubs' 187, were the better team over the course of two seasons, regardless of the relative strengths of the two leagues over that time.

SABR Matt
01-22-2006, 05:29 PM
A fair analysis...

All I will say at this point is that your blowout RDs do show that Cleveland managed more blowout advantage than Chicago...interestingly...by about 30 runs...which is the difference between the center-pull of the Indians and the center-pull of the Cubs by the FSIA.

I'm not entirely sure, outside of the Indians' tendency for a bit more of a blowout edge what about the FSIA's probability scheme would cause the Indians to be so apparently underestimated relative to other teams with similar pythagorean wins.

I've heard the reliever theory advanced to explain the performance of teams above their pythagorean, and I do think it's possible having a good bullpen helps a team outperform their pythagorean, but I have not thus far seen proof that this is true...Bill James estimated that the innings thrown by the relief ace have about 1.7 times as much leverage on wins and losses as your average inning, so the theory is supported by some of the top pundits...I would like to see it confirmed with some research on the subject...the 2005 Mariners had an outstanding bullpen...and yet underperformed their pythagorean by 6 games...to give an example of a team that defies that theory. Eventually I'd like to research this question myself.

Ubiquitous
01-22-2006, 05:52 PM
Tangotiger has his own leverage index, with weights for various situations, with one being normal and bottom of the 9th, ahead by 1, men on second and third, and 1 out being worth 10 times more than that.

They may have underperformed, but that probably has more to do with the KCR and NYM than the bullpen. There are a lot of things that pull at a record; just because the Mariners have a good bullpen doesn't mean it's automatic that they should exceed what is predicted. But having said that, the Mariners' record in one run games was very good for a bad team.

SABR Matt
01-22-2006, 10:43 PM
I take it by the KCR and NYM you mean our RD against those teams was too high for a bad ballclub and that was throwing off the season's pythagorean? (a lot of blowouts against KC last year...and an outburst against the NYM that ended in a 13 run game...something we rarely see).

I think it's easy to get lost staring at one case study and forget the larger picture...I'm going to take a close look and try to figure out why the '54 Indians ended up so underrated...I have a sneaking suspicion that the Indians' blowouts came against bad teams and the Cubs' blowouts biased toward teams with somewhat higher W%s and that's the difference...but obviously I can't confirm that yet. But it shouldn't be lost on the casual observer that...had I had the data for the 2005 Mariners...they'd have rated well under their pythagorean by the FSIA specifically BECAUSE they had a lot of blowouts against other bad teams and it's throwing off their overall RD.

RuthMayBond
01-25-2006, 09:21 AM
OVERVIEW

RESULTS OF FSIA CALCULATIONS

1) What do the results look like?

Top Fifty Teams since 1900
(In terms of Intrinsic Run Differential per Game)

Year Team InRD/G
1939 NYA 2.265
1927 NYA 2.197
1902 PIT 2.000
1936 NYA 1.945
1931 NYA 1.869
2001 SEA 1.790
1998 NYA 1.772
1906 CHN 1.723
1937 NYA 1.702
1929 PHA 1.644
1905 NY1 1.592
1932 NYA 1.586
1942 NYA 1.567
1944 SLN 1.530
1942 SLN 1.525
1904 NY1 1.504
1935 CHN 1.492
1953 BRO 1.492
1931 PHA 1.490
1969 BAL 1.489
1901 PIT 1.475
1903 BOS 1.452
1911 PHA 1.448
1998 HOU 1.447
1934 DET 1.441
1948 CLE 1.439
2001 OAK 1.410
1975 CIN 1.401
1921 NYA 1.369
2002 ANA 1.365
1935 DET 1.359
1995 CLE 1.351
1938 NYA 1.339
1974 LAN 1.336
1912 BOS 1.336
1912 NY1 1.330
1949 BRO 1.330
1998 ATL 1.328
1910 PHA 1.325
1942 BRO 1.321
1932 PHA 1.315
1909 PIT 1.298
1999 ARI 1.295
1922 SLA 1.289
1905 CHN 1.288
1955 BRO 1.286
1950 NYA 1.268
1909 PHA 1.266
1953 NYA 1.262
1901 CHA 1.261

Thoughts? Quibbles? General wonderings?

Where do you have the '68 Tigers ranked? Their pitching may not have been that great but they had really good hitting and postseason success. I'd think they'd be better than the '05 Cubs

SABR Matt
01-25-2006, 01:47 PM
I don't have the data at my disposal at present...my laptop died a horrible horrible death so Randy has the only available copy...but as I recall, the '68 Tigers were in the 70s (that's not as bad as it sounds...the teams are VERY tightly clustered from about 40 on...you can see that clustering beginning as the numbers start getting closer and closer together). The Tigers were a pretty unbalanced team...lots of hitting...defense not much above average...good park reactions...certainly a championship caliber team...perhaps not an all-time great...but close.