I am working on a computerized basketball rating system and I thought I'd throw it out there for my fellow 'Cloids to comment on. The system rates and ranks 1542 teams based on (currently) 13,347 game scores* from Peter Wolfe's UCLA website, which includes NCAA, NAIA, NCCAA, USCAA, CIS, CCAA, ACCA, "and a few others."
Each team is awarded a rating based on its average point spread, positive or negative, adjusted by strength of schedule. The rating is the presumed margin of victory or defeat versus a hypothetical average team (#706 Briar Cliff, with a rating of -0.01, comes close).
The Cardinals fare better than I would have expected, considering their offensive woes, currently coming in at #32 with a rating of 37.11, just behind Purdue (37.23) and just ahead of UConn (36.90). The other big surprise is that Murray State is 'way down at #73. The Racers' score differential is 15.95 ppg and their SOS is 16.33, yielding a rating of 32.28.
Here's my Top 10:
| Rank | Team (rating) |
|------|---------------|
| 1 | Ohio State (50.94) |
| 2 | Kansas (48.17) |
| 3 | Kentucky (47.49) |
| 4 | Syracuse (46.84) |
| 5 | North Carolina (46.42) |
| 6 | Missouri (45.65) |
| 7 | Michigan St (45.64) |
| 8 | Wisconsin (44.77) |
| 9 | Florida (44.11) |
| 10 | Indiana (44.02) |
The rating algorithm I use is simple:
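As a rough illustration only (Python rather than Fortran, and not the actual program), here is one iteration scheme consistent with the description in this post: a team's rating is its average score differential plus a factor times its strength of schedule, repeated until the ratings stop changing. Murray State's numbers fit that pattern (15.95 + 16.33 = 32.28). Treating SOS as the plain average of opponents' current ratings is an assumption, and the function name, the tuple layout, and the tolerance are all hypothetical.

```python
from collections import defaultdict


def rate_teams(games, sos_factor=0.9999, tol=1e-6, max_iter=100_000):
    """Iteratively rate teams from (team_a, score_a, team_b, score_b) tuples.

    Assumed model: rating = average margin + sos_factor * SOS, where SOS is
    the mean of the current ratings of a team's opponents (one per game).
    Returns (ratings, iterations_used).
    """
    margins = defaultdict(list)    # team -> point margins, one per game
    opponents = defaultdict(list)  # team -> opponents, one per game
    for a, score_a, b, score_b in games:
        margins[a].append(score_a - score_b)
        margins[b].append(score_b - score_a)
        opponents[a].append(b)
        opponents[b].append(a)

    avg_margin = {t: sum(m) / len(m) for t, m in margins.items()}
    ratings = dict(avg_margin)  # start from the raw average margin

    iteration = 0
    for iteration in range(1, max_iter + 1):
        new_ratings = {}
        for team in ratings:
            opps = opponents[team]
            sos = sum(ratings[o] for o in opps) / len(opps)
            new_ratings[team] = avg_margin[team] + sos_factor * sos
        # Largest change in any team's rating during this pass.
        change = max(abs(new_ratings[t] - ratings[t]) for t in ratings)
        ratings = new_ratings
        if change < tol:
            break
    return ratings, iteration


if __name__ == "__main__":
    # Made-up scores for three made-up teams, purely for illustration.
    sample = [("A", 80, "B", 70), ("B", 75, "C", 70), ("A", 90, "C", 70)]
    ratings, n = rate_teams(sample)
    for team, r in sorted(ratings.items(), key=lambda kv: -kv[1]):
        print(f"{team}: {r:+.2f}")
    print("iterations:", n)
```

Under this model the expected point spread between two teams is simply the difference of their ratings, which is how a #1 vs. #1542 matchup works out to 50.94 - (-111.84) = 162.78.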
The difference between two teams' ratings should give a fair idea of the expected point spread, but at the extremes it becomes nonsensical. Ohio State's rating is +50.94; last place Collège Édouard-Montpetit has a rating of -111.84, with an average score differential of -39.55 and a SOS of -72.31. OK, the Buckeyes are good and the Lynx are bad. But even a #1 vs. #1542 matchup doesn't seem likely to produce a point spread of 162.78.
The algorithm currently does not take home court advantage into consideration. A future version will. Rather than using a global home court advantage as Sagarin does, I will rate each team's home court advantage separately. I also plan to experiment with giving each team an offensive and defensive rating, which will allow prediction of actual scores rather than just the spread.
I may also try to do a rating based only on Division I game results. I'd have done so from the beginning, but Wolfe's scores page doesn't differentiate between divisions or sanctioning bodies. However, he has a separate page that lists each team and its affiliation, so with some table lookups I should be able to filter the scores to include only D1 games.
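A minimal sketch of that lookup, in Python rather than Fortran. The tuple layout, the `filter_division` name, and the "D1" label are all assumptions for illustration; Wolfe's pages may label affiliations differently.

```python
def filter_division(games, affiliation, division="D1"):
    """Keep only games in which both teams carry the given affiliation.

    games: list of (team_a, score_a, team_b, score_b) tuples (assumed layout).
    affiliation: dict mapping team name -> affiliation label (assumed layout).
    Teams missing from the lookup table are treated as not in the division.
    """
    return [
        game
        for game in games
        if affiliation.get(game[0]) == division
        and affiliation.get(game[2]) == division
    ]


if __name__ == "__main__":
    # Made-up scores and labels, purely for illustration.
    affiliation = {"Team X": "D1", "Team Y": "D1", "Team Z": "NAIA"}
    games = [
        ("Team X", 70, "Team Y", 65),
        ("Team X", 88, "Team Z", 50),
    ]
    print(filter_division(games, affiliation))
```

Treating unlisted teams as outside the division errs on the side of dropping games, which seems safer than letting a misspelled name leak a non-D1 result into the ratings.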
In order to keep things "well connected" between divisions, it was necessary to include inter-divisional (usually exhibition) games in the ratings. This may seem to invalidate the results, but without them, Carleton of Ontario comes out on top of the ratings despite having lost to La Salle, UCF, Akron and Albany, among others. With exhibitions counted, Carleton sinks to #102.
* Due to an apparent bug in GFortran, there are 28 games whose scores are omitted or incorrectly processed. They all involve visiting teams with accented characters in their names, such as San José St and La Cité. So really I should say the rankings are based on 13,319 of the 13,347 scores.
** To prevent rounding errors from causing the ratings to diverge forever, I actually adjust by 99.99% of the SOS. The ratings converge much faster with a lower SOS factor (28 iterations this week at 50% versus 750 iterations at 99.99%) but this would overvalue teams that pad their schedules with cupcakes.
Comments and constructive criticism are welcome.