Tuesday, March 27, 2007

Baseball and a toss of the coin

It seems the new baseball season is upon us. I wouldn't have known but for a bunch of people hanging around watching spring training games in the arcade where my girlfriend and I regularly play air hockey. But never one to let a new sport slip by me, especially one as stats-friendly as baseball, I got my head down and did some work.

While considering the best way to implement a fuzzy Elo system (on which, possibly, more later), I wondered to myself: how much evidence is there that skill plays any part in MLB? I mean, how different are the results from what you'd get by tossing coins?

This is actually a pretty simple investigation. I took last year's win-loss record for all 30 MLB teams, stuck the wins into a spreadsheet, set the expectation that each team would win 81 of 162 coin tosses*, and asked my free Excel-alike to do a chi-square test. The answer: there's a 16.3% probability that results as extreme or more so would occur just by tossing coins.
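For anyone who would rather simulate than trust a spreadsheet function, here is a rough Python sketch of the same question. It is not the spreadsheet test described above: instead of looking the statistic up in a chi-square distribution, it tosses a coin for every game and counts how often the simulated league comes out at least as lopsided as the observed one. The win totals are made up for illustration, and the function names are hypothetical.

```python
import random

def chi_square_stat(wins, games, p=0.5):
    """Sum over teams of (observed wins - expected wins)^2 / expected wins."""
    return sum((w - g * p) ** 2 / (g * p) for w, g in zip(wins, games))

def coin_toss_p_value(wins, games, p=0.5, trials=1000, seed=0):
    """Estimate how often pure coin tossing is at least as extreme as the data."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    observed = chi_square_stat(wins, games, p)
    hits = 0
    for _ in range(trials):
        # One simulated season: every team tosses a coin for each of its games.
        sim = [sum(rng.random() < p for _ in range(g)) for g in games]
        if chi_square_stat(sim, games, p) >= observed:
            hits += 1
    return hits / trials

# Made-up win totals for a six-team league, NOT real MLB records.
wins = [97, 95, 88, 76, 66, 61]
games = [162, 162, 162, 162, 162, 161]  # one team a game short
print(coin_toss_p_value(wins, games))
```

Because the comparison statistic is computed the same way for the real and the simulated seasons, the estimate does not lean on the chi-square approximation at all. Like the spreadsheet version, though, it treats each team's games as independent tosses.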

However - something's not quite right here. In this account, every game is counted twice, once for the home team and once for the away team. How about we just look at the home records? That way, we only count each game once, as nature intended.

Things get even more incriminating for people who prefer baseball to watching slot-machines - tossing a coin for each game would return more extreme results more than two thirds of the time. And that's without taking into account the observed home-field advantage.

Let's do that now - assuming a 55% win rate for home teams**, we find the probability of more extreme results to be 99.32%. Extending that back to 2001***, the probability is lower - about 46% - but still way short of statistical significance. In short, given that data, there's no reason to suspect that skill variations between MLB teams have any influence on the season-by-season home win-loss records.
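A sketch of what a home-record test with a built-in home advantage could look like in plain Python. Two assumptions to flag loudly: the spreadsheet's exact inputs are not given here, so this uses one standard formulation (wins and losses both count, one degree of freedom per home record), and the records below are invented. The `chi2_sf` helper is a hand-rolled series for the chi-square tail, good enough for illustration but not for very small probabilities.

```python
import math

def chi2_sf(stat, dof):
    """Upper-tail probability of a chi-square variable with `dof` degrees of freedom."""
    a, x = dof / 2.0, stat / 2.0
    if x <= 0:
        return 1.0
    # Series expansion of the regularized lower incomplete gamma function.
    term = 1.0 / a
    total = term
    n = 0
    while term > total * 1e-15:
        n += 1
        term *= x / (a + n)
        total += term
    lower = total * math.exp(a * math.log(x) - x - math.lgamma(a))
    return max(0.0, 1.0 - lower)

def home_record_test(home_wins, home_games, home_rate=0.55):
    """Chi-square statistic and tail probability, assuming every home team wins at `home_rate`."""
    stat = 0.0
    for w, g in zip(home_wins, home_games):
        # (W - np)^2 / (np(1-p)) equals the win cell plus the loss cell.
        stat += (w - g * home_rate) ** 2 / (g * home_rate * (1 - home_rate))
    return stat, chi2_sf(stat, len(home_wins))

# Made-up home records, NOT real MLB data.
wins = [49, 41, 45, 38, 47, 44]
games = [81, 81, 81, 80, 81, 81]
stat, p = home_record_test(wins, games)
print(f"chi-square = {stat:.2f}, p = {p:.3f}")
```

Changing `home_rate` to 0.5 recovers the plain coin-toss version of the test.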

---

Of course, that's all a bit mischievous. While the stats are sound enough, there is a possible problem with the data: 80-odd home games per team in a season simply isn't a big enough sample to establish a statistically significant result. Indeed, when you group the records by team over the last six seasons, agglomerating as you go, you do get a statistically significant result (a probability around 10^-8, since you ask).
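Why does lumping the seasons together make such a difference? For a fixed gap between a team's home win rate and the assumed 55%, that team's contribution to the chi-square statistic grows in proportion to the number of games, while grouping by team also cuts 180 records down to 30. A minimal sketch, using a hypothetical team that wins 60% of its home games (the figure and the function name are invented for illustration):

```python
def chi_square_contribution(win_rate, games, expected_rate=0.55):
    """One record's (observed - expected)^2 / expected term, counting wins only."""
    expected = games * expected_rate
    observed = games * win_rate
    return (observed - expected) ** 2 / expected

one_season = chi_square_contribution(0.60, 81)       # a single 81-game home slate
six_seasons = chi_square_contribution(0.60, 6 * 81)  # six seasons pooled
print(one_season, six_seasons, six_seasons / one_season)
```

The same five-point edge that vanishes into the noise over one season counts for about six times as much once six seasons are pooled.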

It's possible to play around with this. If you only look at the middle 20 teams, the probability is well over 20%. If you decide that Yankee Stadium, Boston, Oakland, San Francisco and Minneapolis are intimidating places to go and crank up their win ratios to 60%, while saying that Tampa Bay, Cincinnati, Baltimore, Detroit and Kansas City are less intimidating and worth only 50% - all numbers plucked from guesswork - then it's back to insignificance at the 10% level. Caveat being, of course, that the stadia to alter were picked after looking at the data.

* Some teams only played 161 games. I took this into account.

** That's a guess inspired by the data. Over the last six seasons, the actual rate was 53.91%.

*** i.e. taking home W-L records for each team in each season, making 30 x 6 = 180 records.
