SABRmetricians, Please Help!


Mr. Red
03-28-2006, 02:35 PM
Okay, today in my study hall I had 35 minutes of downtime. Recently, I've been making programs on my calculator for popular and SABR stats. Today, however, I figured that I would enter the realm of SABRmetrics by myself; in other words, try to come up with something new. Now, I am not familiar with a wide range of SABR pitching stats, so if what I am working on is similar to an already existing formula, just tell me and hook me up with the formula.

Anyways, my problem is twofold: 1) I can't figure out how to weight the different variables (i.e., I know a HR has more weight than a BB, but how much more), and 2) how do I set the stat on a "scale" and find what the average is? Thank you so much in advance.

And here it is:
{[((HR + BB - K)/IP)-GB%]+4}

Once again, any advice, help, criticism, etc. appreciated.

Tango Tiger
03-28-2006, 03:01 PM
Part of your equation looks like:

http://www.tangotiger.net/drspectrum.html

So, the 13*HR + 3*BB - 2*SO all divided by IP is what you want. The result is an unscaled earned run value per game. The mean is around 1.2.

It's an interesting idea to put GB% in there like that. IIRC, the difference in run value between a GB and FB is .10 runs per ball contacted. There are about 30 contacted balls per game. So GB% * .10 * 30, or GB% * 3, will give you an unscaled earned run value per game as well. The mean is around -1.5.

You add up the above two numbers. The mean number is close enough to zero, that I think this would make a great scale! Or, you can add around 5.0 to scale it to ERA.
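
For anyone who wants to try this on a calculator or computer, here is a minimal Python sketch of the two pieces added together. The function name `red_index`, the `era_offset` parameter, and the sample season line are made up for illustration. Subtracting the ground-ball term is an assumption based on the original formula and the negative mean quoted above.

```python
def red_index(hr, bb, so, ip, gb_pct, era_offset=5.0):
    """Combine the weighted event term and the ground-ball term from the thread.

    gb_pct is a fraction (0.45 means 45% ground balls).
    era_offset=5.0 is the rough "add around 5.0" shift toward an ERA scale;
    pass 0.0 to keep the unscaled, roughly zero-centered value.
    """
    if ip <= 0:
        raise ValueError("ip must be positive")
    # Weighted events per inning: 13*HR + 3*BB - 2*SO, all divided by IP.
    event_term = (13 * hr + 3 * bb - 2 * so) / ip
    # Assumption: ground balls lower the index, so the GB% * 3 term is subtracted
    # (0.10 runs per contacted ball * about 30 contacted balls per game).
    gb_term = -3.0 * gb_pct
    return event_term + gb_term + era_offset


# Hypothetical season line, not a real pitcher.
example = red_index(hr=20, bb=60, so=180, ip=200.0, gb_pct=0.45)
print(round(example, 2))
```

Keeping `era_offset` as a parameter makes it easy to compare the zero-centered version with the ERA-like version.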

Great job!

Mr. Red
03-28-2006, 03:19 PM
^Thank you so much. And also, where can I find your book?

Tango Tiger
03-28-2006, 03:41 PM
Please see my signature. Thanks...

SABR Matt
03-28-2006, 06:20 PM
Very interesting that you bring up GB% impacting earned runs allowed...I'd be curious to find out how you determined that a groundball was a little worse than a flyball in terms of allowing earned runs...I'm sure it has everything to do with run expectancies...but the exact method would be of some interest. GB/FB tendencies are scheduled to become at least a minor part of my work with seasonal pitching ratings in the next generation of PCA.

Tango Tiger
03-28-2006, 06:59 PM
It's easy enough. Get the run values of a GB single, double, triple, out, and the same for FB, plus the FB HR run value (which is 1.4).

As it turns out, if you remove the HR, the run value of the GB and FB are a virtual match. Adding in the HR makes it .10 runs worse per contacted ball for a FB.

Matt, I would highly recommend that you contact studes at http://www.hardballtimes.com . You would fit right into that group. Plus, you'll get access to the data you may want. You can tell him I sent you. They get great coverage.

SABR Matt
03-28-2006, 09:00 PM
Curious...why do you think I would "fit right in"?

Really just curious to see if I agree with your assessment. :)

Does he have PBP data for 1993-1998? I've been e-mailing everyone I can find who might have it, desperately trying to license it and fill the danged PBP hole so I can make use of it...no one is responding to me.

Tango Tiger
03-28-2006, 09:29 PM
The only two sources that have the data are STATS and Palmer/Gillette. What's your price budget, and what do you want to do with the data (personal, publish here, or ???)? I can ask Palmer for you.

THT will be able to offer you PBP data for the last few years, probably more granular than you are used to, though I'm not sure. You can ask Dave about it.

As to fitting in, you've got the skillset and outlook that they probably would go for.

misterdirt
03-29-2006, 05:15 AM
For 2003-2005 I have ground ball run value as -.093, non-HR FB as -.1051, and all FB as .084.

Tango Tiger
03-29-2006, 06:18 AM
If you assume 10% of FB are HR, you get this equation:

.10 * 1.4 + .90 * (-.1051) = x

x = .045
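
The same blend as a small Python sketch, in case anyone wants to plug in a different HR share. The function name `blended_fb_run_value` is just an illustrative label; the inputs are the 10% assumption, the 1.4 HR run value, and the -.1051 non-HR FB value from this thread.

```python
def blended_fb_run_value(hr_share, hr_value, non_hr_value):
    """Weighted average run value of a fly ball, given the share that are HR."""
    if not 0.0 <= hr_share <= 1.0:
        raise ValueError("hr_share must be between 0 and 1")
    return hr_share * hr_value + (1.0 - hr_share) * non_hr_value


x = blended_fb_run_value(hr_share=0.10, hr_value=1.4, non_hr_value=-0.1051)
print(f"{x:.3f}")  # prints 0.045
```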

How'd you get .084? Do you include line drives?

Tango Tiger
03-29-2006, 06:20 AM
Come to think of it, if you are going to include HR in Red's first half of the equation, your GB rate should then EXCLUDE the HR portion. If you include the HR as a FB, then you must exclude it from the first half of the equation.

You will also find that (K-BB)/PA*someMultiplier is a *great* proxy for ERA (after you add some constant).
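
The multiplier and constant are not given here, so they would have to be estimated from real pitcher seasons. A minimal sketch of one way to do that, using ordinary least squares (my assumption, not necessarily how the numbers were found); the names `fit_line` and `era_proxy` are hypothetical:

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("xs must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def era_proxy(k, bb, pa, multiplier, constant):
    """(K - BB) / PA * multiplier + constant, with both parameters supplied by the caller."""
    if pa <= 0:
        raise ValueError("pa must be positive")
    return (k - bb) / pa * multiplier + constant
```

To use it, pass each pitcher's (K-BB)/PA as `xs` and his ERA as `ys` to `fit_line`; the returned slope is the multiplier (expect it to be negative, since more strikeouts should mean a lower ERA) and the intercept is the constant.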

SABR Matt
03-29-2006, 11:59 AM
Tango...I've had a great deal of difficulty getting in touch with Gary Gillette...I've attempted to contact him a couple of times in the last few weeks during my latest push toward filling out my PBP Event database. If you could put me in touch with Pete Palmer (who I know has this data and also has a much more accurate overall player database than the baseball-databank folks do thanks to his continued research), I am exceedingly interested in discussing this with him.

As far as what my budget is, I am somewhat limited (being a college student with not a lot of financial flexibility) but presuming the price was even REMOTELY reasonable, I would find a way to make something happen...I am DEADLY serious about acquiring all of the data I need.

I intend to use the products of my research at least somewhat commercially (I'd like to do something similar to what is done at baseball-reference.com where they sell ad space on player pages, for example), but I would never freely distribute any PBP data I acquired, and would be willing to work with anyone to structure legal documentation that prevents me from profiting from the resale or redistribution of their data...

Please...anything at all you can do to help me here, I'd *really* appreciate it.

As for the Hardball Times...I'm not a huge fan of their choice of metric (primarily Win Shares), but I am more of a seasonal/predictive sabermetrician than a microsabermetrician (your work on game theory and strategic sabermetrics is interesting though, and I'll be ordering your book at the start of next month), so perhaps you are right.

Tango Tiger
03-29-2006, 01:05 PM
Win Shares is not necessarily the metric of choice at THT. Studes likes them, but it's not a requirement that you follow along. If you were to write an article saying why Win Shares sucks, I'd guess he'd run it. (As you may or may not know, I have 4 articles on my site saying why Win Shares is not a good framework, with the PDF file being the most important of the bunch.)

Send me your email address, and I'll forward it to Palmer for you. Just guessing, a personal-use cost would run several hundred bucks a year. And you wouldn't be able to do anything commercial with it, certainly not in the b-r.com spirit. But, maybe I'm wrong.

If you intern for BP, maybe you can get access to their DB.

SABR Matt
03-29-2006, 02:51 PM
I haven't the slightest clue how one would go about interviewing for an intern position for Baseball Prospectus, let alone where they are located...I should probably search their website for information. :)

BTW, I am well aware of your criticism of the Win-Shares framework...I have some comments on the use of win and run expectancies too. :) I think sabermetricians would benefit from talking to each other a little more because we all (myself included) have this tendency to get pet ideas and run with them.

Anyway, my e-mail address is m_souders@yahoo.com. I'm hoping Palmer is willing to negotiate a bit to help a young researcher...it would be a shame if I was stopped from proceeding with real work simply because he didn't make an effort to work with me.

Tango Tiger
03-29-2006, 03:06 PM
Matt - in-depth saber discussions can be found here:
http://mb3.scout.com/fbaseballfrm8

We don't pull punches. I used to run a blog at Baseball Primer, and there was a lot of work being done there. We're going to start our own board soon on our site, to complement The Book. So, feel free to fire away with whatever criticisms you have on the Scout site, or on our site eventually.

As for Pete, he's given more to the sabermetric community than just about anyone out there. I don't think you should put Pete in the position that he has to make an effort to work with you. There's probably thousands of people like us, and he can't possibly cater to each one of us, as much of a nice guy as he is. You should set your expectations that you should be thankful for anything he can do for you. I also don't speak for him, so I don't know what his position is. I'll send him your email, and we'll see what he does with it.

SABR Matt
03-29-2006, 03:59 PM
I'm not trying to seem rude, Tango...

EVERYONE I've talked to confirms that Pete is one of the nicest guys out there...to a fault sometimes (if he were a tad more assertive, then perhaps his database work wouldn't have been hijacked by Lahman without so much as a recognition of his efforts). I had no intention of approaching him in a way that would be unfair. I am simply stating my feelings on the subject here...I am willing to do whatever it takes to make him comfortable with an arrangement to gain access to the missing data and the overall improved quality of data he offers...I simply hope that we can come to some kind of feasible arrangement for both sides.

SABR Matt
03-29-2006, 04:02 PM
BTW...I'll bookmark this sabermetrics forum and recommend it to the folks I know at non-sabermetric sites who might have an interest in contributing...I'll probably just lurk for a bit and get a feel for the atmosphere...what I don't want to see is sabermetricians attacking each other and cutting down the work of others...I try never to do that, and it tends to prevent me from speaking much...so I'm hoping this site doesn't foster that kind of ugly competitiveness (not saying it does, but I've seen it elsewhere).

SABR Matt
03-29-2006, 04:44 PM
BTW...I can't find studes' e-mail address at THT...is he Dave Studerman?

Tango Tiger
03-29-2006, 09:53 PM
Yes, that's the one.

SABR Matt
03-29-2006, 10:26 PM
thanks Tango...

You know it occurs to me that perhaps I should be taking it as an insult that you described me as "fitting right in with that group"...LOL I don't know what your general opinion of THT is...for all I know you think they're the lowest common denominator...LOL *runs and hides*

Tango Tiger
03-30-2006, 05:17 AM
They do great work over there. If you haven't bought their annual, you should.

misterdirt
03-30-2006, 08:33 AM
Tango Tiger wrote:
"If you assume 10% of FB are HR, you get this equation:

.10 * 1.4 + .90 * (-.1051) = x

x = .045

How'd you get .084? Do you include line drives?"


I am glad that you got me to recheck my work. I did have a problem with Home Runs with nobody out that needed correcting. Corrected numbers are all FB = .081, non-HR FB = -.098, HR = 1.400. I didn't include line drives and I didn't include Pop Ups. I made those calculations directly from the 2003-2005 PBP data using F as the criterion in the batted ball type. 110941 total fly balls, 97799 non-HR fly balls and 13265 FB Home Runs, so a bit more than the 10% ratio that you used in the above calculation.
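
As a quick cross-check, the corrected all-FB figure can be rebuilt from the counts and run values above. This is only a sketch: the two component counts sum to 111064 rather than the 110941 total quoted, so the sketch uses the sum of the components as the denominator (an assumption), and the variable names are made up.

```python
# Counts and run values quoted in the post above (2003-2005 PBP data).
hr_fb = 13265          # fly-ball home runs
non_hr_fb = 97799      # fly balls that were not home runs
hr_value = 1.400       # run value of a HR
non_hr_value = -0.098  # run value of a non-HR fly ball

# Assumption: use the sum of the two components as the fly-ball total.
total_fb = hr_fb + non_hr_fb
hr_share = hr_fb / total_fb

# Weighted average of the two run values by their observed shares.
all_fb_value = hr_share * hr_value + (1.0 - hr_share) * non_hr_value

print(f"HR share of fly balls: {hr_share:.1%}")
print(f"All-FB run value: {all_fb_value:.3f}")
```

The blended value rounds to .081, matching the corrected figure, with the HR share coming out a little under 12%.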

Another question. When you calculate the run value for 1B, 2B, and 3B do you adjust for batters that took extra bases on errors or FC and for batters that were thrown out trying to advance?

redbuck
03-31-2006, 05:16 PM
Of balls in play, home runs appear about 3.5% of the time. Triples 2.9%, doubles 5.7% and singles 17.3%.