Thursday, March 15, 2012

How Wise is the Crowd In Basketball Predictions?

Bottom line: a crowd made up mostly of non-fans did better than two out of three major sports prediction sites, and didn't lose by much to the third. For more, read on.

Pundit Tracker, an awesome blog which holds "experts" accountable for their predictions after the results come in instead of letting them slink off to hide, has rounded up the March Madness picks so far.

Thanks to everyone who participated, including the folks I met over at McGee's in Alameda. By the way, some of your friends over there are a little too intense. One guy was about to run me out of the place. (I think he had trouble understanding that I wasn't trying to get people to bet on anything.) After taking out wise-guy outlier data, I had 38 responses - not as many as I would've liked, but it might still be interesting. I post this stuff on the internet for people to discuss so if you like it, hate it, think I'm full of it, please comment below!

Now that the game is played, here's how the crowd as a whole did. And there are some interesting differences between groups of people in the survey, not all what I predicted they would be.


1. Score Prediction.

1.1 All Respondents (Minus Outliers) Against Experts

Numbers for the crowd are given in the format mean/median. TE is Total Error: the sum, over both teams, of the absolute difference between the predicted and actual points.

Rank  Source              Cal    USF    TE
N/A   Actual              54     65     -
1.    sports-ratings.com  56     55     12
2.    Crowd (38)          69/69  67/63  16/16
3.    collegehoops.net    50     44     25
4.    CBS Sports          72     58     25


Of course, with N=1 the crowd's performance could be a fluke, and we should do this for multiple games - but I leave that to the statisticians and psychologists of the world who have the time and funding. Note, I put CBS Sports 4th because they had the same total error as collegehoops.net but one of the individual scores was more egregiously off.
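If you want to check the TE column yourself, here is a minimal Python sketch. The scores are copied from the table in 1.1; the names `ACTUAL`, `SITE_PREDICTIONS` and `total_error` are just ones I made up for illustration.

```python
# Final score and the three site predictions, copied from the table in 1.1.
ACTUAL = {"Cal": 54, "USF": 65}

SITE_PREDICTIONS = {
    "sports-ratings.com": {"Cal": 56, "USF": 55},
    "collegehoops.net": {"Cal": 50, "USF": 44},
    "CBS Sports": {"Cal": 72, "USF": 58},
}


def total_error(prediction, actual):
    """Sum of absolute point errors over both teams (the TE column)."""
    return sum(abs(prediction[team] - actual[team]) for team in actual)


if __name__ == "__main__":
    for source, prediction in SITE_PREDICTIONS.items():
        print(source, total_error(prediction, ACTUAL))
```

The crowd row is left out on purpose: only the rounded mean and median are published, and plugging in the rounded means (69 and 67) gives 17 rather than the 16 in the table, presumably a rounding effect.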





Outliers: I took out two sets of outlier scores, probably from the same wise guy since they came in sequentially: predictions of 500 and 1 for Cal, and 219 and 1 for USF. This left a scoring range of 40-116 for Cal and 30-114 for USF. (At first I thought the 30 was an outlier but left it in because a score that low actually has precedent in recent tournament history - Mississippi Valley State lost 70-29 to UCLA in the first round of the 2008 tournament. Yikes.)
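For anyone who wants to repeat the exercise, here is a rough sketch of the outlier step followed by the mean/median summary. To be clear about assumptions: I removed the outliers by eye, so the `low` and `high` cutoffs below are a hypothetical stand-in for that judgment call, and the sample list is made up for illustration (it is not the survey data).

```python
from statistics import mean, median


def drop_outliers(scores, low, high):
    """Keep only scores inside a plausibility window (cutoffs are assumed)."""
    return [s for s in scores if low <= s <= high]


def crowd_estimate(scores, low, high):
    """Return (mean, median) of the surviving scores, rounded to whole points."""
    kept = drop_outliers(scores, low, high)
    return round(mean(kept)), round(median(kept))


if __name__ == "__main__":
    # Made-up illustrative responses, NOT the actual survey answers.
    sample_cal_guesses = [500, 1, 40, 54, 69, 75, 116]
    print(crowd_estimate(sample_cal_guesses, low=20, high=150))
```

A fixed window like this is blunt; the point is only that the cutoff has to be loose enough to keep a low-but-plausible score like the 30 mentioned above.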


1.2 Score performance by level of basketball fandom:

Category          Cal    USF    TE
Actual score      54     65     -
Serious fan (7)   53/54  54/50  12/15
Casual fan (11)   64/66  61/62  14/15
Not a fan (20)    78/77  74/69  33/27


Note that the more serious a fan you were, the lower and more accurate your guess. Expertise made a difference. And when I started looking up predictions to compare to, I noticed there was a lot of talk about both teams' defensive play, especially USF's.


1.3 Score performance by method of prediction

Category            Cal    USF    TE
Actual              54     65     -
Own knowledge (12)  54/52  58/54  7/11
Ask friends (1)     (70)   (64)   (17)
Guessed (23)        75/75  70/68  26/24
Researched (2)      93/93  88/88  62/62


It looks like if you had some knowledge, or personally asked someone who did, you did better. It seems our two researchers were looking at the wrong numbers. Most of the guessers were not fans of basketball (17 of 23); of the rest, 5 were casual fans and 1 was serious. Of those that used their own knowledge, 6 of 12 were serious fans, 4 of 12 were casual, and 2 of 12 were not.


1.4 Score performance relative to caring about these particular teams

Category    Cal    USF    TE
Actual      54     65     -
Care        52/48  54/51  13/20
Don't Care  75/74  71/66  27/21


Again we see that the less important the game is to you, the higher your score estimate (as in 1.2 above, where the more serious a fan you are, the lower the score you predict). This is possibly because people who don't follow the teams didn't know about USF's defense.

I asked this because I wanted to see how fans of the teams stacked up against those who had no interest. My prediction was that they would do worse, but obviously they did not. (I entered an office pool in 2008 and won, based entirely on averaging points scored and allowed by each team during the regular season! Ironically that year Kansas State took it, and that's what I had predicted, using the same method I used for this little experiment to predict 72-62 Cal. Back in 2008 at the start of the tournament an otherwise loyal Kansas State fan told me I was nuts - the whole point is that I had no emotional attachment to any team, and I was just looking at numbers. While I was collecting my money I asked someone to explain what a 3-point shot was because I didn't know. The system didn't work as well when I tried again in 2009.)
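For the curious, here is one plausible way to write down that averaging method in Python. This is a sketch of my description above, not a record of the exact spreadsheet: it assumes a team's predicted points are the average of what it scores per game and what its opponent allows per game. The function names and the numbers in the example are placeholders, not real season averages.

```python
def predict_points(own_scored, opponent_allowed):
    """Average a team's points scored with its opponent's points allowed."""
    return (own_scored + opponent_allowed) / 2


def predict_game(team_a, team_b):
    """Return rounded predicted points for (team_a, team_b).

    Each team is a dict with per-game season averages under the
    hypothetical keys "scored" and "allowed".
    """
    a_points = predict_points(team_a["scored"], team_b["allowed"])
    b_points = predict_points(team_b["scored"], team_a["allowed"])
    return round(a_points), round(b_points)


if __name__ == "__main__":
    # Placeholder averages for illustration only.
    team_a = {"scored": 70, "allowed": 60}
    team_b = {"scored": 64, "allowed": 66}
    print(predict_game(team_a, team_b))
```

Note that nothing in this knows about matchups, injuries, or pace, which may be part of why it stopped working in 2009.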

As I was sorting data I most regretted not asking whether people were Cal OR USF fans, because such a bias could run in opposite directions depending on who your teams are, and it could cancel out. However, it's probably a safe assumption that more of the 9 people who cared about these teams cared about Cal than USF, considering I work about 5 miles from Cal's basketball court, and some of the people I initially sent this to actually work at the university. So most influence that we see here is probably the result of Cal fans.

As would be expected, everyone who said they cared about these teams was a casual or serious fan.


2. The Future of March Madness

The Atlantic just posted an article about the growing pile of dollars involved in the NCAA tournament. This blog is no stranger to posts about the distorting effect of money on college sports, but there's nothing special about college athletics: in 1997, a Kenyan runner accepted money to run under the banner of Oman. Why? Because they paid him, quite above board, that's why! Because there's prestige that comes from associating with a winning sports figure or team, and as long as that's the case, these things are going to happen. At the time this was considered shocking (I wish I could find a link), and though I've read no coverage of similar events in the international marathon world since then, I would be surprised if it hadn't happened more. And to be honest, is it any different than any other athletic endorsement? "Eat Wheaties and Qatar is great." And in the end, none of us has a legitimate complaint against a committed professional's making a move that will benefit them and their family.

What will be interesting is if all this money flowing into March Madness ends up making the championship process more drawn out and opaque - because what's being maximized is profit, not fair and exciting athletic competition. Fans would like to believe those two mostly overlap, but that's not how it works. Just ask anyone who watches college football. Previous posts about that particular conflict of interest here.

Of course the people making money in college athletics aren't evil athletic robber barons. The statistic that has been thrown around for Penn State football for years is that the first home game paid for the entire athletic program, and the rest was profit for the school at large. But these windfalls still have to be considered against the attention that college-branded entertainers divert from the true mission of their colleges. The heroes at these institutions are the researchers pushing knowledge forward. Find me a fan at one of these games who can name a single prominent scientist or scholar there. I bet you can, but I'd guess the number of such fans is in the single digits. That's another prediction that's probably more worthwhile to follow up.
