Using Graph Theory To Predict NCAA Tournament Outcomes

timothy posted more than 2 years ago | from the more-interesting-than-what-is-at-heart-portrayed dept.

Math 91

New submitter SocratesJedi writes "Like many technically-minded people, I don't have a lot of time to keep up with sports. Nevertheless, trying to predict the outcome of the NCAA men's basketball tournament is a fun activity to share with friends, family and colleagues. This year, I abandoned my usual strategy of quasi-randomly choosing teams and instead modeled the win-loss history of all Division I teams as a weighted network. The network included information from 5242 games played during the 2011-2012 season. From this, teams can be ranked using tools from graph theory and those rankings can be used to predict tournament outcomes. Without any a priori information, this method placed all the #1 seeds among its top 5 teams. It also predicts that at least one underdog, Belmont (#14 seed), will reach the Elite Eight. Although the ultimate test will be how well it predicts tournament outcomes, initial benchmarks suggest 70-80% accuracy would not be unreasonable."
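
The summary doesn't spell out the ranking method, so the following is only a minimal sketch of one common graph-theoretic approach, not the submitter's actual code. It treats each game as a directed edge from loser to winner and ranks teams with a damped power iteration (PageRank-style, a close cousin of eigenvector centrality). The function name `rank_teams`, the 0.85 damping value and the four-team toy season are all illustrative assumptions.

```python
from collections import defaultdict

def rank_teams(games, damping=0.85, iterations=100):
    """Rank teams from (winner, loser) pairs with a PageRank-style iteration.

    Each loss sends a share of the loser's score to the team that beat it,
    so beating highly rated teams is worth more than beating weak ones.
    """
    teams = sorted({t for game in games for t in game})
    # losses[loser][winner] = number of times loser lost to winner (edge weight)
    losses = defaultdict(lambda: defaultdict(int))
    for winner, loser in games:
        losses[loser][winner] += 1

    score = {t: 1.0 / len(teams) for t in teams}
    for _ in range(iterations):
        new = {t: (1.0 - damping) / len(teams) for t in teams}
        for loser in teams:
            total = sum(losses[loser].values())
            if total == 0:
                # Undefeated team: spread its score evenly so nothing leaks.
                for t in teams:
                    new[t] += damping * score[loser] / len(teams)
            else:
                for winner, count in losses[loser].items():
                    new[winner] += damping * score[loser] * count / total
        score = new
    return sorted(teams, key=lambda t: score[t], reverse=True)

# Hypothetical toy season: A beats everyone, B beats C and D, C beats D.
games = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
ranking = rank_teams(games)
```

On the toy season the ranking simply recovers the obvious order. With thousands of real games the interesting part is how the edge weights are chosen (margin of victory, recency, home court), which this sketch leaves out.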

91 comments

Jock shit (-1)

Anonymous Coward | more than 2 years ago | (#39337487)

*zzzzzzzzzzzzz* Get this lame jock shit off my Slashdot. This is a site for News for Nerds not News for Turds.

Re:Jock shit (0)

Anonymous Coward | more than 2 years ago | (#39340201)

Someone is still smarting about being stuffed into lockers?

Call me when it works for stocks. (-1)

Anonymous Coward | more than 2 years ago | (#39337499)

Then we'll talk.

Re:Call me when it works for stocks. (0)

Anonymous Coward | more than 2 years ago | (#39338095)

It can't.

No way to account for the levels of idiocy on Wall Street. See AAPL's PE vs growth rate for example.

Re:Call me when it works for stocks. (1)

baldass_newbie (136609) | more than 2 years ago | (#39338587)

See AAPL's PE vs growth rate for example.

You mean how low their PE is? By rights their stock should be up around $600/share.

Re:Call me when it works for stocks. (2)

NatasRevol (731260) | more than 2 years ago | (#39338855)

Yes. If fairly valued at a PE of say 25 or so (which is still low for their growth rate), their stock should be at $875 or so.

MOT, INTC, EMC, JNPR are all similarly valued. But have much lower growth rates.

BIDU is the only large tech company with a similar growth rate. Its PE is 46, which would put AAPL's stock price at $1615.
VMware has lower growth, but a PE of 60. AAPL would be at $2100 if similarly valued.

http://www.google.com/finance#stockscreener [google.com]

Wanna Bet.. (-1)

Anonymous Coward | more than 2 years ago | (#39337505)

...on it?

past history (5, Insightful)

Collin (41088) | more than 2 years ago | (#39337577)

Wouldn't running the algorithm against past years' records and testing against past tournament results be the best possible test to tune the algorithm?

Re:past history (5, Insightful)

PatDev (1344467) | more than 2 years ago | (#39337955)

I worked in a research group in college that worked on exactly this problem - predicting NCAA tournaments with a graph-theoretic approach. That is exactly how you test the algorithm. And the cited estimate of 70-80% accuracy seems made up. People who research the field know that there is far less certainty than that. At something like 20% confidence, your prediction should be something like 20%-90%.

The problem stems from the fact that we traditionally predict a team will win if it is a stronger or better team, and we use our graph theory to produce relative team ratings. And if each game of the tournament were played over and over again with the winner of the majority going to the next round, then our methods would work even better. As it stands though, we are trying to predict a single sampling from a probability distribution - which will necessarily have error. Informally, the real tournament has upsets (when a weaker team beats a stronger one). Our algorithms can't predict these, the best they can do is gain a better understanding than humans as to which team is better.

Add to that the fact that the tournament is structured hierarchically - a mis-prediction in the first round prevents you from even attempting to predict later games (and by NCAA bracket scoring, that counts the same as mis-predicting those later games). So early upsets can potentially have large negative outcomes on brackets.

Our algorithms can't predict these, (1)

ed1park (100777) | more than 2 years ago | (#39338899)

Yeah, like when someone intentionally throws a game. As long as people are gambling (somewhere) and money is to be made, there is an opportunity and incentive to cheat. Get your graph theory to account for that!

Or maybe regression analysis is better like Levitt used to find cheating with Sumo wrestling and US student test takers in his book Freakonomics. (Awesome book BTW) ;)

Re:Our algorithms can't predict these, (2)

a_fuzzyduck (979684) | more than 2 years ago | (#39339931)

Yeah, like when someone intentionally throws a game. As long as people are gambling (somewhere) and money is to be made, there is an opportunity and incentive to cheat.

and as long as the NCAA allows everyone *except* players to make any money directly from their game at all, there's loads of incentive for players to do so.

Re:past history (1)

Hatta (162192) | more than 2 years ago | (#39339095)

And the cited estimate of 70-80% accuracy seems made up. People who research the field know that there is far less certainty than that. At something like 20% confidence, your prediction should be something like 20%-90%.

If a coin flip is 50% accurate, then an extra 20% accuracy will give you 70%.

Re:past history (0)

Anonymous Coward | more than 2 years ago | (#39339305)

And if a die has 6 sides, an extra 20% will give you a 7 sided die. Seriously, what are you talking about?

Re:past history (1)

Hatta (162192) | more than 2 years ago | (#39339777)

There are only two teams per game, so modeling that with a coin flip makes a lot more sense than modeling it with a die. A random chance will give you 50% accuracy at picking the winner. You have to do better than 50% accuracy to have any claim at success at all. The real question is, what was the GP talking about when he claimed that success rates between 20% and 90% were more realistic? Why even try if your algorithms can't beat random chance?

Re:past history (1)

rich_hudds (1360617) | more than 2 years ago | (#39341115)

Definitely need to define what is 'success'. In your example 50% is as low as you can go since a success rate of 20% really implies 80% as you'd merely do the opposite.

Here in the UK betting is perfectly legal and Betfair (a betting exchange that allows people to take either side of a bet) has a nice API that lets you back or lay most sporting events. People use very sophisticated algorithms to work out the in play odds of football matches, adjusting them second by second as the game goes along.

As a hobby project, trying to beat Betfair is pretty fun, I'm not too bad at it but their 5% commission on winning bets (lower if you place enough bets as the big guys do) means I can't actually make money.
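
To put a number on how much that commission hurts, here is a small sketch of the break-even win rate. It assumes, as a simplification, that the 5% is taken from the net profit of each winning bet and that odds are quoted in decimal form; `break_even_probability` is a made-up helper, not part of any betting API.

```python
def break_even_probability(decimal_odds, commission=0.05):
    """Win probability needed to break even when backing at the given odds.

    A winning 1-unit bet returns (decimal_odds - 1) profit, reduced by the
    commission; a losing bet costs the 1-unit stake.
    """
    net_profit = (decimal_odds - 1.0) * (1.0 - commission)
    return 1.0 / (1.0 + net_profit)

# At even money (decimal odds 2.0) you need to win more than half the time.
p_even = break_even_probability(2.0)          # about 0.513
p_no_fee = break_even_probability(2.0, 0.0)   # exactly 0.5
```

So at even-money odds a model has to be right a bit over 51% of the time just to stand still, before worrying about whether the quoted odds were fair in the first place.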

Re:past history (1)

PatDev (1344467) | more than 2 years ago | (#39343567)

Good catch. I meant an alpha of 0.2 - which as you note is 80% confidence.

50% is not as low as you go, because of the way brackets are scored. You predict the outcome of *all* the games in the tournament before *any* games are played. Which means that errors in the first round mean that you haven't even properly predicted who is playing in the second round. If the team you picked as winning a game doesn't even play that game, then you automatically lose.

If we simplify the tournament, we can pretend there are 64 teams (there are really 68). Thus, if you flip a coin, you expect to average 50% in the first round. However, in the second round games fall into one of three categories:
  • games where both participants are who you predicted (1/4 of all games, and you get 50% of them right)
  • games with exactly one participant you predicted (1/2 of all games, and you get 25% of them right)
  • games with neither participant you predicted (1/4 of all games, and you get none of them right)

As you can see, this causes the proportion of games you properly predict to go down with each level of the competition. Now consider that the scoring is weighted by round - games in round 2 are worth twice as much as games in round one.

That's how coin-flip gets you worse than 50%.
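
A quick way to check the coin-flip argument is to compute it directly. This is a sketch under two assumptions: every pick is an independent fair coin flip, and each round is worth double the previous one (the doubling scheme is a guess at typical office-pool scoring, extrapolated from "round 2 is worth twice round one"). Under the first assumption a pick in round r is right only if the picked team actually won all r of its games, which has probability (1/2)^r.

```python
def coin_flip_bracket(rounds=6, points=None):
    """Expected results for a bracket filled in by independent coin flips.

    A pick in round r is right only if the picked team actually won all r
    of its games, which happens with probability (1/2) ** r.
    """
    if points is None:
        # Assumed scoring: each round is worth double the previous one.
        points = [2 ** r for r in range(rounds)]
    expected_correct = 0.0
    expected_points = 0.0
    total_games = 0
    max_points = 0
    for r in range(1, rounds + 1):
        games = 2 ** (rounds - r)      # 32, 16, 8, 4, 2, 1 for 64 teams
        p_right = 0.5 ** r
        expected_correct += games * p_right
        expected_points += games * p_right * points[r - 1]
        total_games += games
        max_points += games * points[r - 1]
    return expected_correct / total_games, expected_points / max_points

accuracy, score_share = coin_flip_bracket()
# accuracy is about 0.34 of games; score_share is about 0.16 of max points
```

So a coin-flip bracket expects to get roughly a third of the 63 games right, and only about a sixth of the available points once later rounds are weighted more heavily.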

Re:past history (1)

tehcyder (746570) | more than 2 years ago | (#39351453)

The house always wins.

Re:past history (1)

rich_hudds (1360617) | more than 2 years ago | (#39351915)

Indeed but at least on a betting exchange it's not actually rigged against you. You take either side of a bet, choose your own odds and just pay a commission on any winnings.

Re:past history (0)

Anonymous Coward | more than 2 years ago | (#39340299)

I did random one year. Got chucked out on the last portion. But made it 2nd and 3rd round beating most of my co-workers. They were not happy with my method for some reason. :)

My theory was I could do pretty good until then just by simple attrition. If I was truly random I would get about 50/50 first round then 1/4 on second 1/8th on 3rd. So I got eliminated in 3rd. But could have got lucky...

Re:past history (0)

Anonymous Coward | more than 2 years ago | (#39341663)

Last year had a lot of upsets that few predicted.

Re:past history (1)

multimed (189254) | more than 2 years ago | (#39349143)

The problem stems from the fact that we traditionally predict a team will win if it is a stronger or better team, and we use our graph theory to produce relative team ratings. And if each game of the tournament were played over and over again with the winner of the majority going to the next round, then our methods would work even better. As it stands though, we are trying to predict a single sampling from a probability distribution - which will necessarily have error. Informally, the real tournament has upsets (when a weaker team beats a stronger one). Our algorithms can't predict these, the best they can do is gain a better understanding than humans as to which team is better.

It's not just the single game problem - and even if you set aside upsets, the "stronger" team doesn't always win because as the coaches have been saying for years, it's about matchups. Teams have strengths & weaknesses - style of play, offensive & defensive skill sets of individual players, etc. A team with a tremendous front court but weak ball handlers is more likely to lose to an inferior team that has a high pressure trapping defense whereas it might beat a stronger team that doesn't use the same on-ball pressure. Rebounding, 3 point shooting, transition offense/defense, are all things that can turn games depending on the relative strengths & weaknesses of the teams and negate quality differences in the larger sense.

Re:past history (-1)

Anonymous Coward | more than 2 years ago | (#39337971)

No, teams change so drastically on a year to year basis in the NCAA that past tournament results are essentially useless. Style of play, how teams fare against opposition of various speed and size, how certain coaches match up against other coaches, the coaches' records in the tournament, proximity of the playing location to the team's home town, and the number of minutes played by each player in the regular season as well as the tournament would probably give you a much more accurate algorithm.

Re:past history (0)

Anonymous Coward | more than 2 years ago | (#39338137)

He's saying to run this algorithm on previous years' data to predict previous years' results. That would be a good test of the theory as a whole.

Re:past history, cannot predict what cant be seen (0)

Anonymous Coward | more than 2 years ago | (#39349127)

""The problem stems from the fact that we traditionally predict a team will win if it is a stronger or better team, and we use our graph theory to produce relative team ratings. And if each game of the tournament were played over and over again with the winner of the majority going to the next round, then our methods would work even better. As it stands though, we are trying to predict a single sampling from a probability distribution - which will necessarily have error. Informally, the real tournament has upsets (when a weaker team beats a stronger one). Our algorithms can't predict these, the best they can do is gain a better understanding than humans as to which team is better.""

You cannot predict this stuff because you have no way to factor in players' performances, no matter which team they are on. You could have players on top ranked teams just suck out, and players on the lower seeded teams have the game of their season. Injuries and fouls play a role in which teams go on and which go home. Then there is always the karma factor, which for some strange reason science cannot fully figure out, i.e. bounces going one way, foul calls that players are not getting to go their way, or bad foul calls.
A perfect example of this is ice hockey. Say a team has won 7 out of 10 games by one goal and lost 3 by one goal. The 11th game sees them get blown out by a score of 8 - 0, and 2-3 games later they do the same to another team. In the 11th game you can usually tell it is not going their way: sticks are breaking while trying to clear pucks, the pucks bounce over their sticks, players are falling to the ice with no one around. But no one can explain why this happens, it just does.

This is the same for any prediction; the weather is a prime example of how badly predictions can fail. Factors not fully known or unseen account for this.

I understand these predictions are for fun, or just something to mess with. You have a 30% chance of getting your predictions right, in one of those off the wall tourney years. And 70% if the Tourneys go the way they are usually expected.

Predicting the top is easy (4, Insightful)

elrous0 (869638) | more than 2 years ago | (#39337581)

Everyone knows who the big names are who are likely to make it to the final four. It's predicting how things will go at the middle and bottom, where teams are much more likely to be evenly matched, that's really hard.

70-80%? (2, Informative)

Anonymous Coward | more than 2 years ago | (#39337703)

Okay, you can get 50% accuracy just by flipping a coin.
If you go with "the higher seed wins", you get to 85% or so. Color me unimpressed.

Re:70-80%? (2)

MyLongNickName (822545) | more than 2 years ago | (#39337713)

Should be lower seed (I am the AC).

Re:70-80%? (4, Informative)

MyLongNickName (822545) | more than 2 years ago | (#39337755)

And my numbers are off. In 2011, 43 times out of 63, the lower seed won for about a 68% win rate.

Re:70-80%? (0)

Anonymous Coward | more than 2 years ago | (#39337815)

Since 1999, picking the higher seed, and then picking number ones by alphabetical order, nets you about 65% win rate.

Re:70-80%? (1)

Ctrl+Alt+De1337 (837964) | more than 2 years ago | (#39343439)

Going off of the 2011 tournament for any generalized method of picking games is a bad idea. It was a particularly chaotic tournament for a variety of reasons. Having a system that failed last year is potentially a good thing because last year didn't work like the majority of tournaments do.

Re:70-80%? (0)

Anonymous Coward | more than 2 years ago | (#39338175)

Actually you can't get much higher than 25% by flipping a coin. In a single elimination bracket tournament a miss in the first round automatically gives you misses in each of the following rounds.

Blogger Blogs! (0)

Anonymous Coward | more than 2 years ago | (#39337705)

Blogger blogs on his blog! News at 11!

Just take last years results (1)

CAPSLOCK2000 (27149) | more than 2 years ago | (#39337715)

You can get very reasonable results by just taking last year's results. This works for most sports.

Re:Just take last years results (5, Insightful)

JayBean (841258) | more than 2 years ago | (#39337777)

That may work for pro sports, but not for college sports. In fact, because teams usually lose their nucleus after winning it all (players declare for the draft), it is rare for a team to make it to the final game two or more years in a row.

Re:Just take last years results (1)

Anonymous Coward | more than 2 years ago | (#39337901)

I believe the point was that he could take last year's season and build his dataset around the regular season games, create a bracket, and then match his bracket against the winning results.

Re:Just take last years results (2)

MonsterTrimble (1205334) | more than 2 years ago | (#39337935)

I disagree - how good a team is can vary wildly year to year. Coaching changes, injuries, age, experience and so on can play huge roles in how a team performs, especially on a collegiate level where there is so much growth between juniors and seniors in terms of development. This is less so in professional sports but still relevant.

Re:Just take last years results (2)

UnknowingFool (672806) | more than 2 years ago | (#39338301)

Yes but last year's tournament had 2 small schools, Butler and VCU, in the Final Four. While Butler made it to the championship 2 years in a row, they were a surprise both times. VCU has never made it that far in the tournament and there were some TV pundits that said they should not have been selected for the tournament at all when the bracket was announced. VCU got to the Final Four after the same pundits predicted they would lose in the next game for every single game.

How is this news? (1)

sunking2 (521698) | more than 2 years ago | (#39337749)

People have been doing this, either knowingly or unknowingly, since the inception of sports gambling.

Re:How is this news? (2)

Lunix Nutcase (1092239) | more than 2 years ago | (#39337771)

It's not. This is just a puff piece trying to drive hits to their site by mentioning the NCAA tournament.

Re:How is this news? (0)

Anonymous Coward | more than 2 years ago | (#39341167)

It's just a blog, for Pete's sake. You have to admit the Wikipedia pages regarding eigenvector centrality are interesting, unless of course you are prepared to argue that reasoning is hardly if ever necessary.

I bought an iPad (-1)

Anonymous Coward | more than 2 years ago | (#39337759)

Did you hear? I bought an iPad! AN IPAD!!! Now all the guys at the gay bar will finally notice me! I've also made sure to buy a dozen pairs of skinny jeans and some emo glasses in preparation of receiving it on Friday so I can go straight to Starbucks and show it off! AN IPAD!!! WOOHOO!!

As a sports fan (3, Interesting)

jayhawk88 (160512) | more than 2 years ago | (#39337791)

Some problems I see. Disclaimer: I know there's a margin of error here as the author said, and I know my observations will be based largely on anecdotal evidence, making it inferior. But if sports were so easy to predict there would be no sports gambling.

- That's probably too far for Belmont; a #14 has only ever gotten as far as the Sweet 16, twice (Cleveland State '86, Chattanooga '97). Lowest seed to make an Elite 8 is Missouri in 2002 as a #12. Belmont is actually going to be one of the more popular upset picks, but they would have to upset two far superior teams in 3 days.

- It's a bit too "chalk". #1 seeds generally survive the first two games (undefeated against #16's, 55-14 v. #8's, 59-6 v. #9's), but the #2's have it worse (only four losses v. #15's, but 58-21 v. #7's and 29-21 v. #10's). I know two #12's, a #13 and a #14 doesn't seem like "chalk" but historically it's much more likely that we'll see more #5-7 or #10-11's. To have only one #2 not make the Elite 8 and all the #1's would be almost unheard of.

- A #12 always beats a #5, but three of them doing so in one year would seem unlikely, as they're only 39-89 overall.

- Some of the other first round matchups seem a bit improbable. It has every #6 and every #7 winning, for example.
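
For anyone who wants those records as percentages, here is a small sketch that uses only the win-loss numbers quoted in the list above. The binomial part is an added assumption: it treats the four #12 vs #5 games in a year as independent, each with the historical #12 win rate, which is a rough model at best.

```python
from math import comb

# Records as quoted in the comment above: (wins, losses) for the first-named seed.
records = {
    "#1 vs #8": (55, 14),
    "#1 vs #9": (59, 6),
    "#2 vs #7": (58, 21),
    "#2 vs #10": (29, 21),
    "#12 vs #5": (39, 89),
}

def win_rate(wins, losses):
    """Fraction of games won, given a win-loss record."""
    return wins / (wins + losses)

rates = {matchup: round(win_rate(w, l), 3) for matchup, (w, l) in records.items()}

def at_least(k, n, p):
    """Binomial chance of k or more successes in n independent tries."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p12 = win_rate(*records["#12 vs #5"])
three_or_more = at_least(3, 4, p12)   # roughly 0.09
one_or_more = at_least(1, 4, p12)     # roughly 0.77
```

Under that rough model, at least one #12 winning in a given year is likely, while three or more in the same year is a long shot, which matches the intuition in the list.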

Re:As a sports fan (2)

kenrblan (1388237) | more than 2 years ago | (#39338085)

I didn't read the article (yet), but I put together a game result predictor a couple of years ago that I ran against the tournament field with about an 83% success rate for the whole tournament. It was in the 93% range for the first two rounds. My algorithm utilized season long team statistics to get a team's baseline and then incorporated strength of schedule and seeding components. Just like you mentioned about how far a team has historically progressed from a specific seed, I used historical analysis of seed matchups as another component. Essentially those historical #12 beating #5 type of matchups included a slight scoring boost to the worse seed. In some pairings, that modifier kicked the scoring over the top, but in others it didn't. It turned out to be quite accurate and even predicted the Murray State win over Vanderbilt, among others.

I might make another run at tournament prediction this year using some different statistical metrics that are game pace independent rather than the raw scoring and defense that I used before. Game prediction simulators present unique challenges and are quite fun to work on, especially for nerds who also like sports.

Re:As a sports fan (2)

bjourne (1034822) | more than 2 years ago | (#39338979)

It is not hard to create a model that works perfectly on observed data. But then you run into the problem of overfitting [wikipedia.org] and your model loses any general predictability it had. To counter overfitting you need to have separate datasets for training and testing, otherwise the model will depend on random details in the data. The proof of the pudding is in the eating and if your model is good enough, you should be able to make money on sports betting on it.
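
As a toy illustration of the train/test point, the sketch below tunes a single parameter on one season and then scores it on a season it never saw. Everything in it is hypothetical: the ratings, the results and the "pick team A if its rating lead exceeds a margin" rule are synthetic placeholders, not real tournament data or anyone's actual model.

```python
def accuracy(games, margin):
    """Share of games where 'pick team A if rating gap > margin' is right.

    Each game is (rating_a, rating_b, a_won); ratings are hypothetical.
    """
    hits = sum((ra - rb > margin) == a_won for ra, rb, a_won in games)
    return hits / len(games)

def tune_and_test(train, test, candidates):
    """Pick the margin that scores best on train, then report it on test."""
    best = max(candidates, key=lambda m: accuracy(train, m))
    return best, accuracy(train, best), accuracy(test, best)

# Synthetic placeholder data, not real results: one season to tune on,
# a different season held back for the honest score.
train_season = [(90, 80, True), (70, 75, True), (60, 85, False), (88, 82, True)]
test_season = [(91, 70, True), (72, 78, False), (65, 60, False), (80, 79, True)]

margin, train_acc, test_acc = tune_and_test(train_season, test_season, [-10, 0, 10])
```

Here the tuned margin looks perfect on the season it was fitted to and is no better than a coin flip on the held-out one, which is the gap a backtest on separate years is meant to expose.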

Re:As a sports fan (1)

tragedy (27079) | more than 2 years ago | (#39341289)

if your model is good enough, you should be able to make money on sports betting on it.

Not against people whose model is just as good, and not (over the long term) against any professional gambling enterprise (legal casino or bookie) set up to profit whether you win or lose. A professional gambling outfit either takes a cut that negates all statistical advantages of having a good predictive method or they set up the odds so that they'll make back what they lost to you last time when you lose to them next time. The only people who make money reliably in gambling are those who have found a sucker or group of suckers to victimize.

Re:As a sports fan (2)

ThatsNotPudding (1045640) | more than 2 years ago | (#39340843)

If there is a core of deep, personal knowledge about early upsets in the NCAA BB Tourney, it would definitely be at Kansas University (KU). Oh; I meant University of Kansas: UK. No, wait... what?

Re:As a sports fan (0)

Anonymous Coward | more than 2 years ago | (#39342451)

When's the last time your Mizzery Tiggers made it to a Final Four?

Re:As a sports fan (0)

Anonymous Coward | more than 2 years ago | (#39342081)

Since you're a KU fan, I also expected you to object to Mizzou getting to the Final Four. I'm not sure what's more ridiculous: Belmont and VCU being ranked the 7th and 8th best teams, respectively, or Mizzou getting to the Final Four. By comparison, Jeff Sagarin has Belmont and VCU ranked at #33 and #50, respectively, which seems reasonable. The idea of the ranking system sounds a lot like random walkers, which do a respectable job of ranking teams, but are very limited as they don't account for important details like margin of victory. The rankings are pretty much a joke, and therefore, so is the bracket.

MUCK FIZZOU!

Re:As a sports fan (0)

Anonymous Coward | more than 2 years ago | (#39342947)

I thought I could not care any less about sports. You have proven me wrong.

"Like many technically-minded people, I don't... (1)

Anonymous Coward | more than 2 years ago | (#39337801)

... have a lot of time to keep up with sports."

Yes, if you enjoy sports, you must not be technically-minded. Tis for the plebes...

But, I bet you have time for Skyrim!

Re:"Like many technically-minded people, I don't.. (2)

Hatta (162192) | more than 2 years ago | (#39339135)

At least in Skyrim, you're an interactive participant. That, and Skyrim isn't just a polite way for people to act out their base tribalistic instincts.

Re:"Like many technically-minded people, I don't.. (1)

TwistedChopper (2472796) | more than 2 years ago | (#39339987)

Skyrim isn't a polite way to act out my base tribalistic instincts? Am I offending people when I slaughter entire towns of virtual Argonian civilians?

Re:"Like many technically-minded people, I don't.. (0)

Anonymous Coward | more than 2 years ago | (#39349699)

Yeah, screw those people reading books, watching movies, or listening to music. They should be doing something more interactive! Seriously, why do you care if people want to relax by watching sports?

Moral of the story... (1)

hcs_$reboot (1536101) | more than 2 years ago | (#39337819)

...you're rich! 70~80% accuracy beats the 70~80% of people who don't know/use/master the graph theory, thus you're gonna win 70~80% of online bets.

Re:Moral of the story... (2)

kenrblan (1388237) | more than 2 years ago | (#39338183)

Not quite. Picking winners =/= winning at gambling. Margin of victory, aka the spread, comes into play. That is a bit harder to account for in these types of situations involving so much human variability. Granted, being able to identify some potential upsets could allow someone to bet big on those and potentially become rich.

Re:Moral of the story... (1)

TheRaven64 (641858) | more than 2 years ago | (#39338281)

And then he has to persuade someone to take the bet. You can be sure that betting establishments will pay someone to work out the odds at least as well as he does. It's okay for informal betting among friends, but if you're trying to make money then fleecing your friends only works a few times before you run out of friends...

Re:Moral of the story... (2)

Bill, Shooter of Bul (629286) | more than 2 years ago | (#39339055)

Ah, you would think that the casino sports book odds were the most accurate available and only determined by scientific study of the sports.

BZZZT! Wrong. Casinos need to make a profit. So they determine the *initial* odds by studying the sport, but then change the odds in reaction to the bets that are placed. They try to have equal amounts on both sides of a bet. They pay less to the winners than they get from the losers.

What's the point of pointing that out? Well, you have some pro gamblers who actually do make an incredible living off of betting on sports who use the above factlet. They simply move the odds the casino gives by placing money on the other side of the bet. So if they want the odds to go up on team A winning, they place a large bet on the opposite team B, and the casino increases the odds on team A in order to attract gamblers to help balance out the bet on team B. They then place an even larger bet on team A with the odds they really wanted in the first place. If team A wins, it will cover the loss of the first bet on team B. If team B wins, they obviously lose more money than they win.

The joys of single elimination (2)

PPalmgren (1009823) | more than 2 years ago | (#39337869)

March Madness is notoriously hard to predict, partly because of the number of teams involved and also because of the single elimination system that I love so much. It's prevalent in few sports and makes each game mean a lot more, also opening the door for Cinderella to take her 15 minutes of fame. 7-game playoff rounds like they have in baseball and the NBA tend to nullify those outliers. I honestly think that's a big reason for the success of the NFL too - every game and every play means a hell of a lot more when the best possible record is 19-0.
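
The "series nullify outliers" effect is easy to quantify with a back-of-the-envelope sketch. It assumes each game is independent and the better team wins any single game with the same probability p; the 0.6 figure is just an illustrative value, not a measured one.

```python
from math import comb

def series_win_probability(p, games=7):
    """Chance the favorite wins a best-of-`games` series.

    Assumes every game is independent and the favorite wins each one with
    probability p. Playing all `games` and counting majorities gives the
    same answer as stopping once a team clinches.
    """
    need = games // 2 + 1
    return sum(comb(games, k) * p**k * (1 - p) ** (games - k)
               for k in range(need, games + 1))

single = series_win_probability(0.6, games=1)   # 0.6
best_of_7 = series_win_probability(0.6)         # about 0.71
```

A team that wins 60% of single games takes a best-of-7 about 71% of the time, so the underdog's chance drops from 40% to under 30%. Single elimination keeps that extra upset room at every stage of the bracket.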

Doesn't matter if it works (2, Insightful)

Anonymous Coward | more than 2 years ago | (#39337959)

Can you write a windows installer for it and sell it to gamblers?

Nerd (1)

utahjazz (177190) | more than 2 years ago | (#39338003)

Like many technically-minded people, I don't have a lot of time to keep up with sports.

The word you're looking for is "Nerd". It's OK to say it, it's in the title-bar of Slashdot.

Re:Nerd (2)

tragedy (27079) | more than 2 years ago | (#39341413)

You know, for stereotypical nerd behaviour like communicating to each other in incomprehensible jargon and obscure references that other people don't get, obsessive behaviour, dressing up in ridiculous costumes for gatherings, etc, I've come to realize that nothing beats a hard-core sports fan.

Forget your fancy graph theories (0)

Anonymous Coward | more than 2 years ago | (#39338111)

I make my brackets based on mascots. Look at the mascots and decide, "In a fight, which would beat the other one".

some python source code for this! (0)

Anonymous Coward | more than 2 years ago | (#39338257)

here are my two algorithms for ncaa football and basketball.
http://homepages.uc.edu/~carrahle/rankFootball.py [uc.edu]
http://homepages.uc.edu/~carrahle/ncaa.py [uc.edu]
The first uses a HITS model, so multi-component ranking, while the latter uses principal component analysis.
I agree with the above comment, 70-80 seems high. I actually went through a somewhat lengthy projection-to-latent-space learning phase to try to better rank information about teams. Instead of simply using win/loss through multiple years (also weighting time distance) I also used the full box scores such as rebounds, shots on, fouls etc. for each player, and didn't come near that accuracy. My end conclusion is that, especially for basketball, the teams are just too turbulent to be very consistently accurate with. Nonetheless, good luck. I think this is why we don't use PCA to seed bowl games in football; simply put, people are better at ranking.
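
For readers who haven't met HITS, here is a rough sketch of how it can be applied to a win/loss graph. It is not the code from the linked scripts: the loser-to-winner edge direction, the `hits_rank` name and the toy games are assumptions made for illustration.

```python
def hits_rank(games, iterations=50):
    """Rank teams with HITS on a graph with an edge from loser to winner.

    Authority is high for teams that beat strong opposition; hub is high
    for teams whose losses came against high-authority teams.
    """
    teams = sorted({t for game in games for t in game})
    authority = {t: 1.0 for t in teams}
    hub = {t: 1.0 for t in teams}
    for _ in range(iterations):
        # Authority update: sum the hub scores of every team you beat.
        new_auth = {t: 0.0 for t in teams}
        for winner, loser in games:
            new_auth[winner] += hub[loser]
        # Hub update: sum the authority of every team you lost to.
        new_hub = {t: 0.0 for t in teams}
        for winner, loser in games:
            new_hub[loser] += new_auth[winner]
        # Normalize so the scores stay bounded.
        a_norm = sum(v * v for v in new_auth.values()) ** 0.5 or 1.0
        h_norm = sum(v * v for v in new_hub.values()) ** 0.5 or 1.0
        authority = {t: v / a_norm for t, v in new_auth.items()}
        hub = {t: v / h_norm for t, v in new_hub.items()}
    return sorted(teams, key=lambda t: authority[t], reverse=True)

# Hypothetical toy season: A beats everyone, B beats C and D, C beats D.
toy_games = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
toy_ranking = hits_rank(toy_games)
```

The two scores are what makes it "multi component": a team can rank as a decent hub (it only loses to good teams) without being much of an authority, which a single win percentage would hide.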

Behind the Curve (1)

TheMathemagician (2515102) | more than 2 years ago | (#39338283)

You're some way behind the curve if you want to make money sports betting on this. There is an extreme non-stationarity problem with basketball teams which inevitably means methods using past statistics will never be that successful. I know of professional basketball modellers who pay an army of students and the like to watch college games while clicking on hand-held devices to record second-by-second data on passes, interceptions etc. This data is then fed into their models and provides a very accurate picture of how a team is playing right now. They are then able to handicap the games and look for value where the line is wrong.

temporal modelling is important too! (0)

Anonymous Coward | more than 2 years ago | (#39338327)

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1176059&tag=1 : "Temporal principal component analysis - advances in dual auto-regressive modeling for blind Gaussian process identification "
This paper combines an autoregressive model with a principal component model. It could be useful to better discover hot streaks, i.e. UC beating Marquette, Georgetown, Villanova, then Syracuse despite being heavily outranked. It was a winning streak, and streaks like that totally need to be modelled.

Bringing the excitement back (0)

Anonymous Coward | more than 2 years ago | (#39338351)

Nothing like a little graph theory to bring excitement to NCAA Basketball.

(I'm not even kidding.)

*YAWN* (1)

OWJones (11633) | more than 2 years ago | (#39338493)

Between 70 and 80%. That's a HUGE difference. That means that compared to the other computerized systems out there [thepredictiontracker.com] you're either totally awesome or really suck.

That's like saying, "I did a lap in a Formula 1 car, and I'm either 15 seconds ahead of last year's world champion, or I'm a minute behind the field."

You haven't done this before, have you?

A plug for Nate Silver's FiveThirtyEight (1)

madro (221107) | more than 2 years ago | (#39338505)

His statistical reasoning is always well described, so that if you disagree with his results, at least you understand why you disagree. He's got "picks" [nytimes.com] and a description of the system [nytimes.com] used to generate them.

The original article is an interesting network analysis exercise, but it is really limited by its assumption of no a priori quality data. (Any time you beat Kentucky or North Carolina or other perennial powerhouses, that's almost always a quality win.) Sagarin and LRMC follow similar logic, but without an explicit network piece.

Not enough time? (3, Insightful)

babyrat (314371) | more than 2 years ago | (#39338601)

You don't have time to follow sports, but you have time to model "information from 5242 games played during the 2011-2012 season".

You could be honest and just say you don't really care, but get involved in the playoffs because everyone else is talking about it.

I'm guessing your level 80 warlock probably doesn't 'have time' either. :)

Re:Not enough time? (1)

LanMan04 (790429) | more than 2 years ago | (#39342397)

You don't have time to follow sports, but you have time to model "information from 5242 games played during the 2011-2012 season".

How is that a contradictory statement? He's so busy doing data modeling stuff that he doesn't have time to watch sports.

When someone says they "don't have time" to do something, it's generally because they're very busy with....gasp....other things!

Re:Not enough time? (0)

Anonymous Coward | more than 2 years ago | (#39343309)

It is level 85 now, you noob.

Why are you revealing this? (0)

Anonymous Coward | more than 2 years ago | (#39338679)

1. Go to Vegas
2. Use your predictions to place bets
3. profit!

Re:Why are you revealing this? (1)

a_fuzzyduck (979684) | more than 2 years ago | (#39340055)

Vegas :D Yeah, the one place where they *hate* people being successful with gambling :D

Re:Why are you revealing this? (0)

Anonymous Coward | more than 2 years ago | (#39342257)

As much as you might know about math to predict the winners, you probably don't know as much as the people setting the lines know about sports match-ups. Those guys research everything about each match-up, beyond just the stats.

My Best Luck (1)

lbmouse (473316) | more than 2 years ago | (#39338729)

It seems in office pools I do the best by picking favorite team colors.

Re:My Best Luck (1)

PRMan (959735) | more than 2 years ago | (#39340789)

I "won" an office pool once without even playing. I told the guy that I could win the whole thing, but, as I didn't want to take their money through gambling, I would just tell him my picks after it was closed.

The problem? He was giving points to each "winner" based on their number. If a #1 won, you got one point. If a #12 won, you got 12 points. I just picked 9-16 the whole tournament through. (He admitted that they all would have hated me had I played.) After the first round, with 4 upsets, there weren't enough points left in the tournament for anyone to beat me. But there are always 3-4 upsets in the first round.
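To see why that scoring rule favors picking underdogs, here is a rough sketch of the arithmetic for one region's first round. The results below are invented for illustration (three upsets out of eight games); `seed_points` is a hypothetical helper, not anything the pool actually used.

```python
def seed_points(picks, winners):
    """Score one round: a correct pick earns points equal to the winner's seed."""
    return sum(seed for game, seed in picks.items() if winners.get(game) == seed)


# First-round matchups in one region, keyed as (favorite seed, underdog seed).
matchups = [(1, 16), (2, 15), (3, 14), (4, 13), (5, 12), (6, 11), (7, 10), (8, 9)]

# Hypothetical results: the 12, 10 and 9 seeds pull off upsets.
winners = {(1, 16): 1, (2, 15): 2, (3, 14): 3, (4, 13): 4,
           (5, 12): 12, (6, 11): 6, (7, 10): 10, (8, 9): 9}

favorites = {game: min(game) for game in matchups}   # always pick the better seed
underdogs = {game: max(game) for game in matchups}   # always pick seeds 9-16

print(seed_points(favorites, winners))  # 1 + 2 + 3 + 4 + 6 = 16
print(seed_points(underdogs, winners))  # 12 + 10 + 9 = 31
```

With these made-up results, the all-favorites bracket gets five of eight games right and still scores about half of what the all-underdogs bracket does with three.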

Re:My Best Luck (0)

Anonymous Coward | more than 2 years ago | (#39341241)

You should have taken their money through gambling. It sinks the lesson in better.

a complex way of not increasing accuracy (0)

Anonymous Coward | more than 2 years ago | (#39338845)

70%-80% accuracy is what most algorithms can do (the others being worse), which is also similar to expert prediction. Here we have a way of being very precise, but of not significantly increasing accuracy. I suppose it might be more accurate than the designers' guesswork, however.

some dubious results (0)

Anonymous Coward | more than 2 years ago | (#39339151)

After reviewing his results, both the team rankings and the bracket results, I notice that he ranks West Virginia University 91. Two teams WVU beat handily, Kansas State and Georgetown, are ranked 66 and 65 respectively. WVU also lost some very close games to ranked teams, including at Syracuse by 1 point, to Baylor by a basket, and to Marquette by a basket.

Regarding the bracket, the four No 1 seeds march along, undefeated, until they meet in the final four. While this can happen, it seems like a trivial and unsophisticated result to me. Perhaps the algorithm should include a bump for a team that loses a close game to a team ranked 3rd in the nation? Maybe some way to account for a coach's history in tournament play? Some coaches are better able to prepare a team for specific games, aren't they?

Re:some dubious results (1)

cforciea (1926392) | more than 2 years ago | (#39341321)

Regarding the bracket, the four No 1 seeds march along, undefeated, until they meet in the final four. While this can happen, it seems like a trivial and unsophisticated result to me.

The problem with these predictions is, of course, that this is the most likely scenario. There are enough other things that can happen that it is probably a worse than 50-50 shot, but there isn't another scenario that is more likely. Really, all any algorithm can do to beat picking the better seed every time is try to find spots where teams are seeded either higher or lower than they should be, and the very top and bottom of the list are probably not the most likely spots for this to happen.
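A quick sketch of that argument, with numbers that are purely made up: suppose each region's No. 1 seed has a 40% chance of reaching the final four, the No. 2 seed 22%, and so on, and suppose the four regions are independent. Both the odds and the independence are assumptions for illustration only.

```python
from itertools import product
from math import prod

# Hypothetical chance that each seed wins its region; the remaining
# probability is spread thinly over seeds 5-16 and is ignored here.
REGION_ODDS = {1: 0.40, 2: 0.22, 3: 0.14, 4: 0.09}


def most_likely_final_four(region_odds, regions=4):
    """Enumerate every combination of regional winners and return the likeliest."""
    scenarios = product(region_odds, repeat=regions)
    best = max(scenarios, key=lambda seeds: prod(region_odds[s] for s in seeds))
    return best, prod(region_odds[s] for s in best)


best, p = most_likely_final_four(REGION_ODDS)
print(best, round(p, 4))
```

Under these assumed odds, the single likeliest final four is all No. 1 seeds, yet it only comes up about 2.6% of the time. So "most likely scenario" and "probably won't happen" are both true at once.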

The problem is (1)

Endo13 (1000782) | more than 2 years ago | (#39340539)

There's too much data and too many variables. Even just inputting all the known, public data might significantly improve the accuracy, but there's also lots of unknown private data that can influence games. Algorithms like this can't account for things like the coach's son getting killed in an automobile accident the night before a game, or the star center getting hit with a bad flu. And when you make it complex enough to take in all that data, it still has to get all that data somewhere, which means it has to have access to all news feeds, and it has to be accurate at knowing which ones are applicable, etc. Or you have to manually input all that data, which would take a horrific amount of time. In the end, it's so much easier to just intuitively account for things like that without using a computer, which I believe is why human experts are just as good as computers at predicting outcomes. We don't calculate the hard statistics as well, but we can account for the human element.

Yes, I see... (1)

fahrbot-bot (874524) | more than 2 years ago | (#39341055)

.

... modeled the win-loss history of all Division I teams as a weighted network. The network included information from 5242 games played during the 2011-2012 season. From this, teams came be ranked using tools from graph theory ...

... you obviously don't have enough time to keep up with sports.

Predictions look unlikely to me (0)

Anonymous Coward | more than 2 years ago | (#39341865)

A quick glance at the predictions suggests that something just doesn't look right.

Typically, the 14 seeds are teams that got in by virtue of winning their conference. It's highly unlikely to have an at-large selection seeded below 13. It's very odd to see two at-large teams, BYU and Iona, in a play-in game for a 14 seed. I'm surprised just to see an at-large team come from the Metro Atlantic. This makes a huge difference because the at-large teams got in because of their resume and not because they got an automatic bid from a weak conference. There's a big drop-off from the 13 to the 14 seeds, and another drop-off with the 15 seeds. Although upsets by 14 seeds are hardly unheard of, unlike the very rare 15 over 2 upset, it would be very surprising to see two 14 seeds win. I would be even more surprised to see a 14 seed advance to the elite eight. The lowest seed I can remember advancing to the elite eight was Missouri, a 12 seed, back in 2001-02. But that was an incredibly talented Missouri team that was once ranked as high as #3 before collapsing during the Big 12 conference schedule. They were an at large team, and played some good basketball right at the end of the season to earn their way into the tournament.

If I was going to pick a 14 seed to advance, I might pick the winner of the BYU-Iona game. But even that's questionable. I'm not impressed at all with Iona's resume. I might suspect they could beat a quality team in the tournament if they had pulled out the win against Purdue. I know, it's at the beginning of the season, but there's not much impressive on Iona's schedule. And, had they won against Purdue, they likely wouldn't be in a play-in game as a 14 seed, either. While BYU has a similar record to Gonzaga, there's a big difference between the overall profiles. Gonzaga has non-conference wins that look really impressive, especially against Notre Dame. Again, BYU is a 14 seed for a reason and there's not a whole lot to like. They had a chance for a much bigger win than Iona's game against Purdue when BYU lost by three points to Baylor. Beating Baylor would have convinced me they could pull out a win against a higher seeded team in the tournament. But, then again, if they had beaten Baylor, they would quite possibly be a 12 seed instead of a 14 seed. None of the 14 seeds look worthy of an upset pick.

Teams like VCU and George Mason, which also made runs in the past few years, were 11 seeds. Harvard and Long Beach State might be good candidates to make a run in the tournament. VCU can't be ignored, either. Even Davidson showed they could beat good teams when they upset Kansas at the Sprint Center, which might as well have been a home game for the Jayhawks. If you want to pick an upset below the 5-12 games, I'd pick Davidson to win a game.

It's also very unlikely to see all the teams seeded at 6, 7, and 8 winning their first games. Those teams are very evenly matched and there is basically no difference between an 8 and a 9 seed. I have no problem with three 12 seeds winning in the first round. But I would pick some 11, 10, and 9 seeds to win games, too.

Maybe the predictions will work out better than I expect. But if I'm entering a bracket in a pool, it sure wouldn't look like that. If Belmont makes it to the elite eight, someone will make a lot of money from that bracket. But I sure wouldn't be betting on that.

"Belmont upsets KU" (0)

Anonymous Coward | more than 2 years ago | (#39366039)

Yeah... no. If I ever see that headline I'll eat my hat. And I'm not even a KU fan.
