rickmbari's comprehensive college basketball ratings
I am working on a computerized basketball rating system and I thought I'd throw it out there for my fellow 'Cloids to comment on. The system rates and ranks 1542 teams based on (currently) 13,347 game scores* from Peter Wolfe's UCLA website which includes NCAA, NAIA, NCCAA, USCAA, CIS, CCAA, ACCA, "and a few others."
Each team is awarded a rating based on its average point spread, positive or negative, adjusted by strength of schedule. The rating is the presumed margin of victory or defeat versus a hypothetical average team (#706 Briar Cliff, with a rating of -0.01, comes close).
The Cardinals fare better than I would have expected, considering their offensive woes, currently coming in at #32 with a rating of 37.11, just behind Purdue (37.23) and just ahead of UConn (36.90). The other big surprise is that Murray State is 'way down at #73. The Racers' score differential is 15.95 ppg and their SOS is 16.33, yielding a rating of 32.28.
Here's my Top 10:
| Rank | Team (rating) |
|---|---|
| 1 | Ohio State (50.94) |
| 2 | Kansas (48.17) |
| 3 | Kentucky (47.49) |
| 4 | Syracuse (46.84) |
| 5 | North Carolina (46.42) |
| 6 | Missouri (45.65) |
| 7 | Michigan St (45.64) |
| 8 | Wisconsin (44.77) |
| 9 | Florida (44.11) |
| 10 | Indiana (44.02) |
The rating algorithm I use is simple:
1. The initial rating for each team is its average point differential. (If on average you score 10 more points than your opponents, your initial rating is 10; if you average 7.3 points less than your opponents, your initial rating is -7.3.)
2. Strength of schedule for each team is computed by averaging the ratings of all opponents. (If your opponents average 5 points more than their opponents, your initial SOS is 5.)
3. Ratings are recalculated as point differential plus strength of schedule.** (If you average 10 points more than your opponents, and they averaged 5 points more than their opponents, your new rating is 15; but if you averaged 15 points more than your opponents and they averaged 13 points less than their opponents, your new rating is 2.)
4. Step 2 is repeated using the new ratings, and step 3 with the resulting strength of schedule numbers, until no team's rating changes by more than a defined "convergence limit" (currently 0.00001 point). This week's rankings took 750 iterations to converge; on the 750th time through step 3, the largest change in any team's rating was about 0.00000763.
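The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the author's actual program: the function name `rate_teams`, the game-tuple layout, and the toy scores for teams "A", "B" and "C" are all made up here, and `sos_factor` stands in for the 99.99% adjustment described in the second footnote.

```python
from collections import defaultdict


def rate_teams(games, sos_factor=0.9999, tol=1e-5, max_iter=100_000):
    """Iteratively rate teams from (team_a, score_a, team_b, score_b) records.

    Returns (ratings, strength_of_schedule, iterations_used).
    """
    diff_sum = defaultdict(float)
    opponents = defaultdict(list)
    for a, score_a, b, score_b in games:
        diff_sum[a] += score_a - score_b
        diff_sum[b] += score_b - score_a
        opponents[a].append(b)
        opponents[b].append(a)

    # Step 1: the initial rating is the average point differential.
    avg_diff = {t: diff_sum[t] / len(opponents[t]) for t in opponents}
    ratings = dict(avg_diff)

    for iteration in range(1, max_iter + 1):
        # Step 2: strength of schedule is the mean rating of all opponents
        # (an opponent played twice is counted twice).
        sos = {t: sum(ratings[o] for o in opps) / len(opps)
               for t, opps in opponents.items()}
        # Step 3: new rating = point differential + (damped) SOS.
        new_ratings = {t: avg_diff[t] + sos_factor * sos[t] for t in opponents}
        # Step 4: stop once no rating moved by more than the convergence limit.
        largest_change = max(abs(new_ratings[t] - ratings[t]) for t in opponents)
        ratings = new_ratings
        if largest_change <= tol:
            break
    return ratings, sos, iteration


# Hypothetical scores, purely for illustration.
toy_games = [("A", 80, "B", 70), ("B", 75, "C", 70), ("A", 90, "C", 70)]
ratings, sos, n_iter = rate_teams(toy_games)
for team in sorted(ratings, key=ratings.get, reverse=True):
    print(f"{team}: {ratings[team]:+.2f}")
```

Lowering `sos_factor` makes the loop finish in fewer passes but discounts schedule strength, which is the trade-off the footnote describes.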
The difference between two teams' ratings should give a fair idea of the expected point spread, but at the extremes it becomes nonsensical. Ohio State's rating is +50.94; last place Collège Édouard-Montpetit has a rating of -111.84, with an average score differential of -39.55 and a SOS of -72.31. OK, the Buckeyes are good and the Lynx are bad. But even a #1 vs. #1542 matchup doesn't seem likely to produce a point spread of 162.78.
The algorithm currently does not take home court advantage into consideration. A future version will. Rather than using a global home court advantage as Sagarin does, I will rate each team's home court advantage separately. I also plan to experiment with giving each team an offensive and defensive rating, which will allow prediction of actual scores rather than just the spread.
I may also try to do a rating based only on Division I game results. I'd have done so from the beginning, but Wolfe's scores page doesn't differentiate between divisions or sanctioning bodies. However, he has a separate page that lists each team and its affiliation, so with some table lookups I should be able to filter the scores to include only D1 games.
In order to keep things "well connected" between divisions, it was necessary to include inter-divisional (usually exhibition) games in the ratings. This may seem to invalidate the results, but without it, Carleton of Ontario comes out on top of the ratings despite having lost to LaSalle, UCF, Akron and Albany among others. With exhibitions counted, Carleton sinks to #102.
* Due to an apparent bug in GFortran, there are 28 games whose scores are omitted or incorrectly processed. They all involve visiting teams with accented characters in their names, such as San José St and La Cité. So really I should say the rankings are based on 13,319 of the 13,347 scores.
** To prevent rounding errors from causing the ratings to diverge forever, I actually adjust by 99.99% of the SOS. The ratings converge much faster with a lower SOS factor (28 iterations this week at 50% versus 750 iterations at 99.99%) but this would overvalue teams that pad their schedules with cupcakes.
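One way to see why a factor below 100% matters, as a sketch in notation that is not from the post: write the ratings as a vector $r$, the average point differentials as $d$, the SOS factor as $f$, and let $A$ be the averaging matrix whose entry $A_{ij}$ is the fraction of team $i$'s games played against team $j$. One pass through steps 2 and 3, and the fixed point $r^{*}$ it is heading toward, then satisfy:

```latex
\[
\begin{aligned}
r^{(k+1)} &= d + f\,A\,r^{(k)}, \qquad r^{*} = d + f\,A\,r^{*},\\
\lVert r^{(k+1)} - r^{*} \rVert_\infty
  &= f\,\lVert A\,(r^{(k)} - r^{*}) \rVert_\infty
  \le f\,\lVert r^{(k)} - r^{*} \rVert_\infty .
\end{aligned}
\]
```

Each row of $A$ is a set of non-negative weights summing to 1, so averaging can never enlarge the largest error; with $f < 1$ that error shrinks by at least the factor $f$ on every pass. With $f = 1$ the guarantee disappears: adding the same constant to every rating passes through $A$ unchanged, so any uniform drift introduced by rounding is never damped.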
Comments and constructive criticism are welcome.
Comments
If La Cite (accent aigu) has been left out, the rankings are worthless. Sorry.
by Carolina Cardinal on Jan 24, 2012 2:00 PM EST
I knew you'd understand
"I am willing to donate to the charity that is working on the prevention of whatever the hell Dick Vitale has." - noobmaster
Exactly. As a long time La Cité supporter, I just could not let this egregious oversight pass unchallenged.
by Carolina Cardinal on Jan 24, 2012 2:18 PM EST
I'm a For the City
two Foghats in one day. They must be primed for a revival tour,
You will be pleased to learn
that I have found the bug, which involves two-byte representations of accented characters, and I have worked around it by changing all such characters in the input file to asterisks. So now we will have San Jos* St and La Cit* because I was far too lazy to figure out what characters to substitute for all accented characters.
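For illustration, here is a small Python sketch of that workaround, alongside the substitution approach that was skipped. It assumes the score file is UTF-8 text, which the post does not state; the helper names `mask_non_ascii` and `strip_accents` are invented for this example.

```python
import unicodedata


def mask_non_ascii(name, mask="*"):
    """Replace every non-ASCII character with a mask (the asterisk workaround)."""
    return "".join(ch if ord(ch) < 128 else mask for ch in name)


def strip_accents(name):
    """Swap accented letters for their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "\u00e9" is the precomposed e-with-acute-accent character.
names = ["San Jos\u00e9 St", "La Cit\u00e9"]

# In UTF-8 the accented letter occupies two bytes, so a byte-oriented
# reader sees one more "character" than is really there.
print(len("\u00e9"), len("\u00e9".encode("utf-8")))  # 1 2

for name in names:
    print(mask_non_ascii(name), "|", strip_accents(name))
# San Jos* St | San Jose St
# La Cit* | La Cite
```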
Fans of Collège Édouard Montpetit will be similarly pleased to know that their rating is now more accurately calculated as -52.15 rather than -111.84.
true statement.
you'd need 10?
That'll be my first try, yes
If I do that, I have to also adjust their home scores downward by their home court advantage when calculating their real rating.
good call.
It may take longer, but it would make for a better rating. And you only do it weekly anyway right?
I only have to program it once.
It runs in about one second. But the scores are only updated on Sunday night, so there’s no point in running it more often.
I've rethought this
I’m going to rate the home Cardinals and the road Cardinals as if they were two separate teams. Each team will have two different ratings – one at home and one on the road. The average of the two will be used for an overall rating and ranking.
Aaaand that's done.
Each team now has a home rating and an away rating as well as an overall rating, which is simply the average of the other two.
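A hedged sketch of how that split might look in Python: each game is re-keyed so that a team's home and away appearances become two separate entries, and the overall rating is the plain average of the two. The function names and record layout are hypothetical, and falling back to a single rating when a team has only home or only away games is an assumption of this sketch, not something the post specifies.

```python
def split_home_away(games):
    """Re-key (home_team, home_score, away_team, away_score) records so each
    team appears as two entries: (team, "home") and (team, "away")."""
    return [((home, "home"), home_score, (away, "away"), away_score)
            for home, home_score, away, away_score in games]


def overall_ratings(split_ratings):
    """Average each team's home and away ratings into one overall rating."""
    teams = {team for team, _side in split_ratings}
    overall = {}
    for team in teams:
        parts = [split_ratings[(team, side)]
                 for side in ("home", "away")
                 if (team, side) in split_ratings]
        # Assumption: a team with only one side rated keeps that rating.
        overall[team] = sum(parts) / len(parts)
    return overall


# Hypothetical game and ratings, purely for illustration.
print(split_home_away([("Team A", 70, "Team B", 65)]))
print(overall_ratings({("Team A", "home"): 10.0, ("Team A", "away"): 4.0}))
```

The re-keyed records can be fed to any rating routine that accepts arbitrary team keys, since the routine never needs to know that two keys belong to the same school.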
You can handle the outrageous scoring margins when pairing a great team vs. a horrible team
by applying a decay (your choice Log or Exp) based on eventual outcome, then using a ceiling or round function. So if the margin is 4, the decay would take it to 3.92, ceiling back up to 4. But a 162 point margin would decay to 44.21, ceiling to 45 points.
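The comment does not say which decay function produces 3.92 and 44.21, so the sketch below does not try to reproduce those figures. It assumes one possible choice, an exponential saturation toward a ceiling `cap`, followed by rounding up; both the function name `soften_margin` and the value of `cap` are arbitrary picks for illustration.

```python
import math


def soften_margin(raw_margin, cap=45.0):
    """Compress a predicted margin so that it approaches `cap` but never
    exceeds it, then round the magnitude up to a whole number of points."""
    sign = 1 if raw_margin >= 0 else -1
    softened = cap * (1.0 - math.exp(-abs(raw_margin) / cap))
    return sign * math.ceil(softened)


print(soften_margin(4))       # 4  (small margins are left essentially alone)
print(soften_margin(162.78))  # 44 (the extreme spread is pulled under the cap)
```

Small margins pass through almost unchanged because the curve is nearly a straight line near zero, while very large ones flatten out just below `cap`.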
Finally, you including all the leagues reminds me of this.
by Remote Cardinal on Jan 24, 2012 2:19 PM EST
Nice link.
I hope our Cards can make it to the Fantastic Five Hundred and Twelve this year.
by sarasota-card on Jan 24, 2012 2:28 PM EST
Statistics
Had to take it my junior year. Came a skinny millimeter from transferring majors to get away from it. Seemed like three separate courses.
What was in the text.
What was covered in class.
What was on the tests.
The class average was a D+.
Rofl. My stats prof and the book we used were good at contradicting each other.
Had that issue in a few of my electrical engineering and programming classes though. I got a 17/100 on a final from my first electrical engineering class where I needed to get upwards of an 85-100 to get a D in the class. He taught us one thing, told us another would be on the final, and it was something completely different actually on the final.
Isn’t college fun?
by CardinalDude on Jan 24, 2012 2:57 PM EST
legend has it the Sadistics prof about five years before I got there was fired because...
……he gave a pop final and flunked the entire class.
Was that a Desoky class?
He gave my entire class a jolly good rodgering in one of our first EE courses.
Nope. No idea who it was anymore.
Although I think he was French. Could be wrong though.
by CardinalDude on Jan 24, 2012 7:10 PM EST
What is it about computer rankings and Wisconsin?
Very interesting though, Rick. I will be keeping watch on this project.
by James Sutherland on Jan 24, 2012 3:02 PM EST
Rick do you have any objectives in mind?
Road performance should probably get much heavier factoring to be predictive, for example. I love stats, but I usually burn too much time playing with them unless I decide what I’m trying to accomplish.
Well first and foremost, I like playing with algorithms
so it was fun to do this. Call it idle curiosity. I plan to use my ratings to fill out my tournament bracket this year, and see if it works any better than my previous methods (which generally involved dart throwing and coin flips).
Ultimately, I’d like to come up with a rating system that is unbiased (this one is, being blind to any loyalties I may have) and reasonably predictive. I’d like to be able to say (whether anyone listens or not) that my system fared better than, say, Sagarin’s at predicting the outcomes of games.
Something that has always bothered me about Sagarin’s ratings is that he assumes a single home court advantage applies to everyone. This is clearly not the case; there are teams that routinely fare much better at home than on the road, and others that do so barely if at all. The problem in trying to assess home court advantage is deciding how much of a team’s improved performance at home is due to their home court advantage and how much is due to their road performance being depressed by their opponents’ home court advantage. I’m still puzzling out how to assign that. Anyway, “home court advantage” is to some extent simply another way of looking at road performance.
It would be great to someday have the rickmbari ratings recognized as an amazingly good rating system, license it to ESPN or USA Today or somebody, and revel in the fame and fortune it brings me. But really I’m just dicking around.
Then there are teams that do better on the road
Didn’t the Cards do better on the road the last year at Freedom Hall?
if so, their "home court advantage" should compute as a negative number
but I think that would be hard to defend in predicting future scores.
This is why I left engineering for art school.
well it was actually a sort of mutual decision on the part of the university and myself…
Yeah me too,
I have tons of spreadsheet relics laying around from years of dicking. I once tried to use one of my formulas for brackets, and well.. obviously I don’t have a website selling it.
Anyway, predictive is much more interesting than qualitative, IMHO. I could dig out some of my old factors if you want. I recall trying to install not only a heavy road factor, but also recent trending of the team in question—as well as their opponents’ recent trending. I.e. Beating a ranked team isn’t as good if they are slumping. If you believe in the notion that it only matters how you’re playing in late season, not how you began, you want your algorithms to discern that in the data. (Which also suggests heavier weighting of more recent data.)
Yeah, thought about that but boy would that be a bear to calculate.
And the weighting is totally arbitrary. And streaks happen at random too – toss a coin enough times and you’ll get ten heads in a row, but that doesn’t mean you’re more likely to get heads on the next toss – which isn’t to say that teams don’t have periods where their morale impacts their play for good or ill (cf. Louisville Cardinals 2011-12).
