rickmbari's comprehensive college basketball ratings

I am working on a computerized basketball rating system and I thought I'd throw it out there for my fellow 'Cloids to comment on. The system rates and ranks 1542 teams based on (currently) 13,347 game scores* from Peter Wolfe's UCLA website, which includes NCAA, NAIA, NCCAA, USCAA, CIS, CCAA, ACCA, "and a few others."

Each team is awarded a rating based on its average point spread, positive or negative, adjusted by strength of schedule. The rating is the presumed margin of victory or defeat versus a hypothetical average team (#706 Briar Cliff, with a rating of -0.01, comes close).

The Cardinals fare better than I would have expected, considering their offensive woes, currently coming in at #32 with a rating of 37.11, just behind Purdue (37.23) and just ahead of UConn (36.90). The other big surprise is that Murray State is 'way down at #73. The Racers' score differential is 15.95 ppg and their SOS is 16.33, yielding a rating of 32.28.

Here's my Top 10:

  1. Ohio State (50.94)
  2. Kansas (48.17)
  3. Kentucky (47.49)
  4. Syracuse (46.84)
  5. North Carolina (46.42)
  6. Missouri (45.65)
  7. Michigan St (45.64)
  8. Wisconsin (44.77)
  9. Florida (44.11)
  10. Indiana (44.02)

The rating algorithm I use is simple:

  1. The initial rating for each team is its average point differential. (If on average you score 10 more points than your opponents, your initial rating is 10; if you average 7.3 points less than your opponents, your initial rating is -7.3.)
  2. Strength of schedule for each team is computed by averaging the ratings of all opponents. (If your opponents average 5 points more than their opponents, your initial SOS is 5.)
  3. Ratings are recalculated as point differential plus strength of schedule.** (If you average 10 points more than your opponents, and they averaged 5 points more than their opponents, your new rating is 15; but if you averaged 15 points more than your opponents and they averaged 13 points less than their opponents, your new rating is 2.)
  4. Step 2 is repeated using the new ratings, and step 3 with the resulting strength of schedule numbers, until no team's rating changes by more than a defined "convergence limit" (currently 0.00001 point). This week's rankings took 750 iterations to converge; on the 750th time through step 3, the largest change in any team's rating was about 0.00000763.
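
The four steps above can be sketched in a few lines of Python. This is a minimal illustration, not the program that produced the rankings: the function name `rate_teams`, the `(team_a, score_a, team_b, score_b)` game format, and the choice to count an opponent once per game played are all assumptions. The 0.9999 SOS factor and the 0.00001 convergence limit are the values quoted in this post.

```python
from collections import defaultdict


def rate_teams(games, sos_factor=0.9999, tol=1e-5, max_iter=100_000):
    """Iteratively rate teams from (team_a, score_a, team_b, score_b) tuples."""
    margins = defaultdict(list)    # team -> point margins, one per game
    opponents = defaultdict(list)  # team -> opponents, one entry per game
    for team_a, score_a, team_b, score_b in games:
        margins[team_a].append(score_a - score_b)
        margins[team_b].append(score_b - score_a)
        opponents[team_a].append(team_b)
        opponents[team_b].append(team_a)

    # Step 1: the initial rating is the average point differential.
    diff = {team: sum(m) / len(m) for team, m in margins.items()}
    ratings = dict(diff)
    if not ratings:
        return ratings, 0

    for iteration in range(1, max_iter + 1):
        # Step 2: strength of schedule is the average rating of all opponents.
        sos = {team: sum(ratings[o] for o in opps) / len(opps)
               for team, opps in opponents.items()}
        # Step 3: new rating = point differential + (almost all of) the SOS.
        updated = {team: diff[team] + sos_factor * sos[team] for team in diff}
        # Step 4: stop once no rating moved by more than the convergence limit.
        largest_change = max(abs(updated[t] - ratings[t]) for t in diff)
        ratings = updated
        if largest_change <= tol:
            break
    return ratings, iteration


if __name__ == "__main__":
    # Three made-up teams and scores, purely for illustration.
    demo = [("A", 80, "B", 70), ("B", 75, "C", 65), ("A", 90, "C", 70)]
    ratings, passes = rate_teams(demo)
    for team in sorted(ratings, key=ratings.get, reverse=True):
        print(f"{team}: {ratings[team]:+.2f}")
    print("passes:", passes)
```

Subtracting one team's rating from another's gives the implied point spread between them.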

The difference between two teams' ratings should give a fair idea of the expected point spread, but at the extremes it becomes nonsensical. Ohio State's rating is +50.94; last place Collège Édouard-Montpetit has a rating of -111.84, with an average score differential of -39.55 and a SOS of -72.31. OK, the Buckeyes are good and the Lynx are bad. But even a #1 vs. #1542 matchup doesn't seem likely to produce a point spread of 162.78.

The algorithm currently does not take home court advantage into consideration. A future version will. Rather than using a global home court advantage as Sagarin does, I will rate each team's home court advantage separately. I also plan to experiment with giving each team an offensive and defensive rating, which will allow prediction of actual scores rather than just the spread.

I may also try to do a rating based only on Division I game results. I'd have done so from the beginning, but Wolfe's scores page doesn't differentiate between divisions or sanctioning bodies. However, he has a separate page that lists each team and its affiliation, so with some table lookups I should be able to filter the scores to include only D1 games.
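
The table lookup described here might look something like the following Python sketch. The tuple layout, the `affiliation` dictionary, the label `"NCAA Division I"`, and the placeholder team names are all hypothetical; the actual layout of Wolfe's pages is not reproduced here.

```python
def filter_by_affiliation(games, affiliation, wanted="NCAA Division I"):
    """Keep only games in which both teams carry the wanted affiliation.

    games: list of (team_a, score_a, team_b, score_b) tuples.
    affiliation: dict mapping team name -> affiliation label.
    Teams missing from the lookup table are treated as not matching.
    """
    return [
        game for game in games
        if affiliation.get(game[0]) == wanted
        and affiliation.get(game[2]) == wanted
    ]


if __name__ == "__main__":
    # Placeholder teams and scores, purely for illustration.
    affiliation = {
        "Alpha": "NCAA Division I",
        "Beta": "NCAA Division I",
        "Gamma": "CIS",
    }
    games = [
        ("Alpha", 70, "Beta", 60),   # kept: both teams match
        ("Alpha", 90, "Gamma", 50),  # dropped: inter-divisional game
    ]
    print(filter_by_affiliation(games, affiliation))
```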

In order to keep things "well connected" between divisions, it was necessary to include inter-divisional (usually exhibition) games in the ratings. This may seem to invalidate the results, but without it, Carleton of Ontario comes out on top of the ratings despite having lost to LaSalle, UCF, Akron and Albany among others. With exhibitions counted, Carleton sinks to #102.

* Due to an apparent bug in GFortran, there are 28 games whose scores are omitted or incorrectly processed. They all involve visiting teams with accented characters in their names, such as San José St and La Cité. So really I should say the rankings are based on 13,319 of the 13,347 scores.

** To prevent rounding errors from causing the ratings to diverge forever, I actually adjust by 99.99% of the SOS. The ratings converge much faster with a lower SOS factor (28 iterations this week at 50% versus 750 iterations at 99.99%) but this would overvalue teams that pad their schedules with cupcakes.
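
One way to see why the factor matters, using notation that is not in the post: let $d$ be the vector of average point differentials, $r^{(k)}$ the ratings after $k$ passes, $\alpha$ the SOS factor, and $M$ the matrix whose entry $M_{ij}$ is the share of team $i$'s games played against team $j$, so that every row of $M$ sums to 1. Steps 2 and 3 together then read:

```latex
r^{(k+1)} = d + \alpha M r^{(k)}, \qquad
\lVert r^{(k+1)} - r^{(k)} \rVert_\infty
  \le \alpha \, \lVert r^{(k)} - r^{(k-1)} \rVert_\infty , \qquad
\lim_{k \to \infty} r^{(k)} = (I - \alpha M)^{-1} d
  \quad \text{for } 0 \le \alpha < 1 .
```

For any $\alpha$ below 1 the largest per-pass change is guaranteed to shrink by at least a factor of $\alpha$, so the iteration settles. At $\alpha = 1$ that guarantee disappears: adding the same constant $c$ to every rating passes through $M$ unchanged, since $M(c\mathbf{1}) = c\mathbf{1}$, so a uniform drift introduced by rounding is never damped. A smaller $\alpha$ gives a stronger guaranteed contraction, which is consistent with the 28 versus 750 iterations reported above, at the cost of discounting schedule strength.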

Comments and constructive criticism are welcome.

Comments

I knew you'd understand

"I am willing to donate to the charity that is working on the prevention of whatever the hell Dick Vitale has." - noobmaster

by rickmbari on Jan 24, 2012 2:09 PM EST

I'm a For the City

two Foghats in one day. They must be primed for a revival tour,

by cbcard on Jan 24, 2012 2:37 PM EST

You will be pleased to learn

that I have found the bug, which involves two-byte representations of accented characters, and I have worked around it by changing all such characters in the input file to asterisks. So now we will have San Jos* St and La Cit* because I was far too lazy to figure out what characters to substitute for all accented characters.

Fans of Collège Édouard Montpetit will be similarly pleased to know that their rating is now more accurately calculated as -52.15 rather than -111.84.
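
An alternative to the asterisk substitution is to map each accented letter to its unaccented base letter. The sketch below is in Python purely for illustration; `strip_accents` is a hypothetical helper, not part of the ratings program, and it assumes the input has already been decoded as Unicode text.

```python
import unicodedata


def strip_accents(name):
    """Return name with accented letters replaced by their base letters.

    NFKD normalization splits a character such as "é" into "e" plus a
    combining accent mark; the combining marks are then dropped.
    Characters with no such decomposition are left unchanged.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for team in ["San José St", "La Cité", "Collège Édouard-Montpetit"]:
        print(strip_accents(team))
```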

by rickmbari on Jan 24, 2012 3:26 PM EST

true statement.

by rickmbari on Jan 24, 2012 4:35 PM EST

you'd need 10?

by rickmbari on Jan 24, 2012 7:37 PM EST

Home court advantage

Are you going to do this based on how they perform above expectations at home?

by HendoCard on Jan 24, 2012 2:02 PM EST

That'll be my first try, yes

If I do that, I have to also adjust their home scores downward by their home court advantage when calculating their real rating.

by rickmbari on Jan 24, 2012 2:08 PM EST

good call.

It may take longer, but it would make for a better rating. And you only do it weekly anyway right?

by HendoCard on Jan 24, 2012 2:16 PM EST

I only have to program it once.

It runs in about one second. But the scores are only updated on Sunday night, so there’s no point in running it more often.

by rickmbari on Jan 24, 2012 2:18 PM EST

I've rethought this

I’m going to rate the home Cardinals and the road Cardinals as if they were two separate teams. Each team will have two different ratings – one at home and one on the road. The average of the two will be used for an overall rating and ranking.

by rickmbari on Jan 25, 2012 5:34 PM EST

Aaaand that's done.

Each team now has a home rating and an away rating as well as an overall rating, which is simply the average of the other two.
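
The two-teams-in-one idea can be sketched as a relabeling step before rating plus an averaging step after. This Python fragment is only an illustration under stated assumptions: the `(home, home_pts, away, away_pts)` layout, the function names, and the treatment of every game as having a home side (no neutral courts) are not from the post.

```python
def split_home_away(games):
    """Relabel each game so a team's home and road selves are separate entities.

    games: list of (home, home_pts, away, away_pts) tuples.
    Returns the same games with team names replaced by (name, venue) pairs,
    ready to be fed to whatever rating routine is in use.
    """
    return [
        ((home, "home"), home_pts, (away, "away"), away_pts)
        for home, home_pts, away, away_pts in games
    ]


def overall_ratings(split_ratings):
    """Average each team's home and away ratings into one overall rating.

    split_ratings: dict mapping (name, venue) -> rating.
    A team with only one of the two ratings keeps that rating as its overall.
    """
    by_team = {}
    for (team, _venue), rating in split_ratings.items():
        by_team.setdefault(team, []).append(rating)
    return {team: sum(vals) / len(vals) for team, vals in by_team.items()}


if __name__ == "__main__":
    # Placeholder ratings, purely for illustration.
    demo = {("Alpha", "home"): 12.0, ("Alpha", "away"): 4.0,
            ("Beta", "away"): -3.0}
    print(overall_ratings(demo))
```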

by rickmbari on Jan 26, 2012 9:19 AM EST

You can handle the outrageous scoring margins when pairing a great team vs. a horrible team

by applying a decay (your choice Log or Exp) based on eventual outcome, then using a ceiling or round function. So if the margin is 4, the decay would take it to 3.92, ceiling back up to 4. But a 162 point margin would decay to 44.21, ceiling to 45 points.
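
A decay-then-ceiling step along these lines could be sketched as follows. The comment does not give a formula, so the logarithmic form and the `scale` value of 20 are assumptions chosen for illustration. With that scale, margins of 4 and 162 come out as 4 and 45 after the ceiling, although the intermediate values (about 3.65 and 44.17) differ from the 3.92 and 44.21 quoted, so this is evidently not the commenter's exact calculation.

```python
import math


def compress_margin(margin, scale=20.0):
    """Log-compress a predicted margin, then round away from zero.

    Small margins pass through almost unchanged; large ones are pulled in.
    The sign of the margin is preserved.
    """
    compressed = scale * math.log1p(abs(margin) / scale)
    return int(math.copysign(math.ceil(compressed), margin))


if __name__ == "__main__":
    for raw in (4, 20, 162, -162):
        print(raw, "->", compress_margin(raw))
```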

Finally, you including all the leagues reminds me of this.

by Remote Cardinal on Jan 24, 2012 2:19 PM EST

Statistics

Had to take it my junior year. Came a skinny millimeter from transferring majors to get away from it. Seemed like three separate courses.
What was in the text.
What was covered in class.
What was on the tests.
The class average was a D+.

by cbcard on Jan 24, 2012 2:39 PM EST

Rofl. My stats prof and the book we used were good at contradicting each other.

Had that issue in a few of my electrical engineering and programming classes though. I got a 17/100 on a final from my first electrical engineering class where I needed to get upwards of an 85-100 to get a D in the class. He taught us one thing, told us another would be on the final, and it was something completely different actually on the final.

Isn’t college fun?

by CardinalDude on Jan 24, 2012 2:57 PM EST

Was that a Desoky class?

He gave my entire class a jolly good rodgering in one of our first EE courses.

by dlpfis79 on Jan 24, 2012 6:47 PM EST

Nope. No idea who it was anymore.

Although I think he was French. Could be wrong though.

by CardinalDude on Jan 24, 2012 7:10 PM EST

What is it about computer rankings and Wisconsin?

Very interesting though, Rick. I will be keeping watch on this project.

by James Sutherland on Jan 24, 2012 3:02 PM EST

Rick do you have any objectives in mind?

Road performance should probably get much heavier factoring to be predictive, for example. I love stats, but I usually burn too much time playing with them unless I decide what I’m trying to accomplish.

by 97E3LPL on Jan 25, 2012 9:03 AM EST

Well first and foremost, I like playing with algorithms

so it was fun to do this. Call it idle curiosity. I plan to use my ratings to fill out my tournament bracket this year, and see if it works any better than my previous methods (which generally involved dart throwing and coin flips).

Ultimately, I’d like to come up with a rating system that is unbiased (this one is, being blind to any loyalties I may have) and reasonably predictive. I’d like to be able to say (whether anyone listens or not) that my system fared better than, say, Sagarin’s at predicting the outcomes of games.

Something that has always bothered me about Sagarin’s ratings is that he assumes a single home court advantage applies to everyone. This is clearly not the case; there are teams that routinely fare much better at home than on the road, and others that do so barely if at all. The problem in trying to assess home court advantage is deciding how much of a team’s improved performance at home is due to their home court advantage and how much is due to their road performance being depressed by their opponents’ home court advantage. I’m still puzzling out how to assign that. Anyway, “home court advantage” is to some extent simply another way of looking at road performance.

It would be great to someday have the rickmbari ratings recognized as an amazingly good rating system, license it to ESPN or USA Today or somebody, and revel in the fame and fortune it brings me. But really I’m just dicking around.

by rickmbari on Jan 25, 2012 11:17 AM EST

Then there are teams that do better on the road

Didn’t the Cards do better on the road the last year at Freedom Hall?

by ptichenor1 on Jan 25, 2012 12:13 PM EST

if so, their "home court advantage" should compute as a negative number

but I think that would be hard to defend in predicting future scores.

by rickmbari on Jan 25, 2012 1:31 PM EST

This is why I left engineering for art school.

well it was actually a sort of mutual decision on the part of the university and myself…

by ptichenor1 on Jan 25, 2012 2:01 PM EST

Yeah me too,

I have tons of spreadsheet relics laying around from years of dicking. I once tried to use one of my formulas for brackets, and well.. obviously I don’t have a website selling it.

Anyway, predictive is much more interesting than qualitative, IMHO. I could dig out some of my old factors if you want. I recall trying to install not only a heavy road factor, but also recent trending of the team in question—as well as their opponents’ recent trending. I.e., beating a ranked team isn’t as good if they are slumping. If you believe in the notion that it only matters how you’re playing in late season, not how you began, you want your algorithms to discern that in the data. (Which also suggests heavier weighting of more recent data.)
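
Heavier weighting of recent data can be sketched as an exponentially weighted average of game margins. This is one possible scheme rather than the commenter's old formula: the function name, the `(margin, days_ago)` input, and the 30-day half-life are all assumptions picked for illustration.

```python
def recency_weighted_margin(results, half_life_days=30.0):
    """Average point margin with recent games counting more.

    results: iterable of (margin, days_ago) pairs.
    A game's weight halves for every half_life_days of age.
    Returns 0.0 when there are no games.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for margin, days_ago in results:
        weight = 0.5 ** (days_ago / half_life_days)
        weighted_sum += weight * margin
        total_weight += weight
    return weighted_sum / total_weight if total_weight else 0.0


if __name__ == "__main__":
    # A 10-point win today and a 10-point loss 30 days ago (made-up games).
    print(recency_weighted_margin([(10, 0), (-10, 30)]))
```

In the demo, the month-old loss carries half the weight of today's win, so the weighted margin stays positive even though the plain average would be zero.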

by 97E3LPL on Jan 25, 2012 5:20 PM EST

Yeah, thought about that but boy would that be a bear to calculate.

And the weighting is totally arbitrary. And streaks happen at random too – toss a coin enough times and you’ll get ten heads in a row, but that doesn’t mean you’re more likely to get heads on the next toss – which isn’t to say that teams don’t have periods where their morale impacts their play for good or ill (cf. Louisville Cardinals 2011-12).

by rickmbari on Jan 26, 2012 9:16 AM EST
