Mathematician Predicts Yankees To Dominate

CowboyNeal posted about 7 years ago | from the safe-bets dept.

Math 170

anthemaniac writes "Computerized projections in sports are nothing new, but Bruce Bukiet of the New Jersey Institute of Technology has developed a model that seems to work pretty well. He projects how many games a Major League Baseball team will win by factoring in how each hitter ought to do against each pitcher in every game. His crystal ball says the Yankees will win 110 games this year, a pretty safe bet, many might agree. But he also projects all the divisional winners. He claims to be right more than wrong in five of the past six years."

170 comments

Claims to be right more than wrong, heh? (-1, Troll)

Anonymous Coward | about 7 years ago | (#18629509)

that's right, Yankee haters, if you want to be the next boy-genius statistician, just predict the Yanks will win the World Series every year. You'll be batting > .500 too! Impress your friends, woo the women, be the first on your block to break the elusive 500 barrier.

Re:Claims to be right more than wrong, heh? (0)

Anonymous Coward | about 7 years ago | (#18629523)

86 Mets rock all y'all worlds!

Re:Claims to be right more than wrong, heh? (3, Informative)

BridgeBum (11413) | about 7 years ago | (#18629761)

Bruce is actually a die hard Mets fan. I helped work on this project with him back in my undergrad days 15 years ago or so. I doubt any of my code is still being used though. :-)

Re:Claims to be right more than wrong, heh? (1)

thatshortkid (808634) | about 7 years ago | (#18630057)

Bruce is actually a die hard Mets fan. I helped work on this project with him...

does he account for beltran removing the bat from his shoulder or just watching strike 3? and if so, constant or variable?

Re:Claims to be right more than wrong, heh? (1)

arodland (127775) | about 7 years ago | (#18630477)

Well actually if you predicted the Yankees to win the series every year from 1903 to 2006 you'd only have a .257 success rate. On the other hand that's a plurality, and more than double the wins of the next best thing, the Cardinals.

Re:Claims to be right more than wrong, heh? (1)

sebi (152185) | about 7 years ago | (#18630813)

If you had done this this millennium you'd have struck out a lot. Like all the time. Let's face it. The time of total dominance by one team is over. Wild card and luxury tax seem to be doing what they're supposed to. The last six world series were won by six different teams. Of course that won't get my team any closer to a championship, but all Cubs fans agree: If we don't manage this year MLB just has to give the trophy to us. After 100 years that is the least we deserve.

Also, the play-offs are a total crap-shoot. 8 teams make it every season. The Yankees are pretty much always one of the 8. That doesn't guarantee a championship. Hell, a crappy team with 83 wins can win it all. Why spend 183 million dollars on your roster?

110 wins? (5, Insightful)

nebaz (453974) | about 7 years ago | (#18629517)

It's a safe bet that the Yankees will do well; they always seem to spend almost twice as much as most other teams on talent, not to mention luring good players from other teams away to crush competition. Having said that, they have always spent such money, and not done exceptionally well as of late. 110 wins is a lot, and not many teams have accomplished that. Safe bet? Hardly.

Re:110 wins? (0)

Anonymous Coward | about 7 years ago | (#18630031)

The model is total rubbish. One of the most accurate gauges of how many games a team will win is by estimating 1) the total number of runs the team is expected to score over the course of the year, and 2) the total number of runs the team is expected to allow. Looking at these numbers, you can very reasonably expect the Yankees to win 90-95 games this year. To say they'll win 110 games is just wrong.

Re:110 wins? (2, Interesting)

sebi (152185) | about 7 years ago | (#18630861)

I agree that RS vs RA is a good way to predict the success of a team. It's not always so helpful looking back. The Indians scored 870 runs last season and only allowed 782. How did they do? Not so well: a 78-84 record, good enough to finish fourth in their division. How can one explain that disparity? Blowouts. Those 22-0 games that happen every once in a while. I like Runs Scored vs Runs Allowed models. Just not the ones that get updated during the season.

A Much Safer Bet... (2, Funny)

Black-Man (198831) | about 7 years ago | (#18630063)

The Pirates - 2nd lowest payroll - will suck again. 14 losing seasons in a row. I give it a 99.9% certainty they make it 15. I'm not even a MIT grad!

Re:110 wins? A Safe bet? (1)

Nick_Allain (997908) | about 7 years ago | (#18630677)

Not a safe bet at all. Especially considering that the AL East is fairly strong this year. It seems April Fools' Day comes 4 days late for baseball fans... The prediction is a joke. While math can certainly be applied to predict things like this, it fails to take into account that the Yankees overspend on old players. A more accurate prediction would be that by the end of the season, the cumulative number of years that Yankees players are past their primes is about 110.

He left out several important variables (2, Interesting)

PFritz21 (766949) | about 7 years ago | (#18630779)

Injuries. Did he take these into account? A lot of good teams have had lousy seasons due to players being hurt for long periods of time. MAYBE if every member of every team was able to play a full schedule of 162 games...

Performances. If every player played consistently every day, but some guys go on hot streaks and get moved up in the batting order. Some guys go cold and get bumped down, or even worse, sent to the minors. MAYBE if the 25-man rosters stayed constant for the entire season.

Luck. Three teams each score 750 runs over the course of a season. Each one also allows 750 runs. http://en.wikipedia.org/wiki/Pythagorean_expectation [wikipedia.org] Bill James' Pythagorean expectation says that each team should play .500 ball; 81 wins and 81 losses. But one team could win a lot of close games and lose a couple dozen blowouts, finish with 90+ wins. Another could lose a bunch of close games and win a couple dozen blowouts, ending up with only 70 wins.
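For anyone who wants to play with the numbers, here is a minimal sketch of the Pythagorean expectation mentioned above, assuming Bill James' original exponent of 2 and a 162-game schedule. The function names are made up for illustration. The second call uses the 2006 Indians run totals quoted elsewhere in this thread (870 scored, 782 allowed).

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Estimated winning percentage from season run totals."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def expected_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the estimated winning percentage to a win total."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Equal runs scored and allowed gives exactly .500 ball.
print(expected_wins(750, 750))  # 81.0

# Run totals quoted in this thread for the 2006 Indians, who actually won 78.
print(round(expected_wins(870, 782), 1))
```

The formula only sees season totals, so it cannot tell a team that wins close games from one that piles up runs in blowouts, which is exactly the "luck" gap described above.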

Um. Yeah. (1)

Shadow Wrought (586631) | about 7 years ago | (#18629533)

He claims to be right more than wrong in five of the past six years.

Whoopty fsck. So's RailGunner [slashdot.org] . Runs are fun to watch, but pitching is what wins. And the Yanks have? Anyone? Anyone at all? Yep. They got nothin' at pitcher.

If he's so confident... (2, Interesting)

The Living Fractal (162153) | about 7 years ago | (#18629537)

Has he put up beaucoup bucks in Vegas on his numbers? If not, why not. If so, how much did he win, and where can I get his numbers this year?

TLF

Better Places to Put Your Effort (0)

Anonymous Coward | about 7 years ago | (#18629543)

His crystal ball says the Yankees will win 110 games this year, a pretty safe bet, many might agree.
Get to work on the weather, then we'll start talking about me investing in your Vegas trips.

I am skeptical (2)

5, Troll (919133) | about 7 years ago | (#18629553)

Just look at his predictions from 2006:
http://www.egrandslam.com/SeasonPredictions2006.html [egrandslam.com]
for example:
he had Tigers 4th in AL Central (74-88 vs actual WC 95-67), Cubs 1st in NL Central (90-72 vs actually 6th and 66-96), Red Sox getting WC (99-63 vs 3rd 86-76).

I predict Red Sox will win 161-1 (that 1 being the opening day loss to the Royals. The fact that I am a physicist makes my predictions newsworthy!)

Re:I am skeptical (1)

MrSelfDestruct (30535) | about 7 years ago | (#18630085)

"I predict Red Sox will win 161-1 (that 1 being the opening day loss to the Royals. The fact that I am a physicist makes my predictions newsworthy!)"

OMGBBQ! This made me laugh on /. for the first time in years.

WARNING! (0)

Anonymous Coward | about 7 years ago | (#18629583)

Purple backgrounds may cause permanent retinal damage.

It's a given that NYY will make the playoffs (0)

Anonymous Coward | about 7 years ago | (#18629611)

That's one of the biggest locks in sports. That's what a $200+ million payroll does for you, they have all stars at every position year after year, and whatever position they're lacking at midseason, George and Cash'll spend another $20-25 million to pick up for the stretch run.

Then the playoffs start, and that's when pitching, defense, clutch hitting, and team chemistry take on a much bigger role. Ref: Yankees 2001-06. Sometimes I think Jeter is just waiting for Arod to leave so he (Jeter) can go all out to win the WS again.

$$ does not = champion (1)

p51d007 (656414) | about 7 years ago | (#18629633)

Well, considering the Yankees spend a zillion times the amount that most teams spend, then their odds are better than most teams. What I find funny is that they have spent that huge amount of money forever and they STILL haven't won. Goes to show you that you don't have to spend a ton of money to win! In October, if the Yankees don't win, I'll be saying na-na-na-na....along with the weather people who predicted all those hurricanes last year.

Re:$$ does not = champion (0)

Anonymous Coward | about 7 years ago | (#18629715)

Well, considering the Yankees spend a zillion times the amount that most teams spend, then their odds are better than most teams. What I find funny is that they have spent that huge amount of money forever and they STILL haven't won. Goes to show you that you don't have to spend a ton of money to win! In October, if the Yankees don't win, I'll be saying na-na-na-na....along with the weather people who predicted all those hurricanes last year.

What are you talking about? It isn't as if they are trying to get to the World Series every year. Only every other year. And they are damn close at doing that (they've been to the World Series 39 times, winning 26 of them). How many Major League Baseball teams are there again and how many times has the World Series been played? Winning a quarter of the World Series games and almost half of the AL pennants are no trivial feat. But you go ahead and pick your favorite team and I'll pick the Yankees in a bet. We'll see how those teams fare in the next 20 years.

Re:$$ does not = champion (0)

Anonymous Coward | about 7 years ago | (#18629793)

If money really won the championships, the Red Sox (second-highest spenders in all of baseball, they only spend something like 5% less than the Yankees) would have won more championships than they have.

Number of times the Red Sox won the world series in the past 75 years? Once.

Number of times the Yankees won the world series in the past 75 years? Twenty-three times.

There's no real correlation between spending and winning.

My prediction is that the Cubs won't win. It's an even safer bet.

I never understand these things... (4, Informative)

krbvroc1 (725200) | about 7 years ago | (#18629645)

Isn't there some rule or law about 'fitting a curve' to past data? Yet the sports predictions, and many of the 'stock market systems', are all about finding some seemingly obvious pattern in past data. While you might come up with a 'back tested' model that matches really well, it doesn't mean squat for the future.

Re:I never understand these things... (4, Informative)

BridgeBum (11413) | about 7 years ago | (#18629789)

His models have evolved over the years, but he tries to simulate actual games using both individual statistics (players batting averages, etc.) as well as team trends (how well does a player do against a specific pitcher). He uses a large Markov chain to predict state transitions (Runner on first, no outs - how often does it go to two outs? That sort of thing.) Very interesting project, it was a lot of fun to work on. (I was an undergrad working with Bruce 15 years ago, when he was first starting this project. He's kept it going for years.)
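To make the Markov chain idea concrete, here is a rough sketch of a base-out chain for one half-inning. It is not Bukiet's model or any code from the project: the event probabilities are invented, every batter is treated as identical, and runner advancement is simplified (every runner moves exactly as many bases as the batter, and walks only force runners). All names are hypothetical.

```python
# Hypothetical per-plate-appearance probabilities (made up; they sum to 1).
EVENT_PROBS = {"out": 0.68, "walk": 0.08, "single": 0.15,
               "double": 0.05, "triple": 0.01, "homer": 0.03}
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "homer": 4}


def transition(outs, bases, event):
    """Return (new_outs, new_bases, runs) for one plate appearance.

    `bases` is a bitmask: bit 0 = first, bit 1 = second, bit 2 = third.
    """
    if event == "out":
        return outs + 1, bases, 0
    if event == "walk":
        if bases == 0b111:
            return outs, bases, 1  # bases loaded: a run is forced in
        # Forced runners shuffle up, so the lowest empty base gets filled.
        lowest_empty = next(b for b in range(3) if not bases & (1 << b))
        return outs, bases | (1 << lowest_empty), 0
    n = BASES_GAINED[event]
    shifted = (bases << n) | (1 << (n - 1))  # move runners, place the batter
    runs = bin(shifted >> 3).count("1")      # anyone pushed past third scores
    return outs, shifted & 0b111, runs


def expected_runs(probs, sweeps=200):
    """Expected runs from each (outs, bases) state to the end of the half-inning."""
    value = {(o, b): 0.0 for o in range(3) for b in range(8)}
    for _ in range(sweeps):  # repeated sweeps converge to the fixed point
        for (outs, bases) in value:
            total = 0.0
            for event, p in probs.items():
                new_outs, new_bases, runs = transition(outs, bases, event)
                # Three outs ends the half-inning, so that state is worth 0.
                total += p * (runs + value.get((new_outs, new_bases), 0.0))
            value[(outs, bases)] = total
    return value


values = expected_runs(EVENT_PROBS)
print(f"bases empty, no outs:  {values[(0, 0b000)]:.3f} expected runs")
print(f"bases loaded, no outs: {values[(0, 0b111)]:.3f} expected runs")
```

A model along the lines described above would presumably swap the single probability table for batter-versus-pitcher numbers at each spot in the lineup; this sketch only shows the state-transition bookkeeping.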

Re:I never understand these things... (4, Insightful)

Burdell (228580) | about 7 years ago | (#18630297)

It is still trying to predict future results based on past performance. No matter what you predict, last year's Chipper Jones will never again face last year's Roger Clemens. Even if Clemens un-retires (again), he is not the same person, and neither is Chipper Jones. You also can't predict injuries, trades, managers' decisions, umpires' calls, weather, etc., all of which have an impact on the outcome of an individual game.

Re:I never understand these things... (2, Insightful)

Anonymous Coward | about 7 years ago | (#18630819)

You're right. We should stop trying to predict anything because we won't ever be 100% correct.

Re:I never understand these things... (0)

Anonymous Coward | about 7 years ago | (#18631409)

Damn. Why the hell am I going to grad school for signal processing anyway?! My signals and noises aren't the same from last moment to this moment, either!

Re:I never understand these things... (0)

Anonymous Coward | about 7 years ago | (#18630889)

That's why you partition your data into calibration and verification sets. This allows a bias-free calibration.

The best way to test... (1)

Dr. Eggman (932300) | about 7 years ago | (#18629651)

The best way to test any model is to start with the end points. How low does it score the New York Mets?

Re:The best way to test... (0)

Anonymous Coward | about 7 years ago | (#18630021)

But did he predict the Cardinals would win the World Series last year? I did and it paid nicely.

Huh? (4, Insightful)

Kuukai (865890) | about 7 years ago | (#18629701)

While Bukiet is the first to admit he's not a baseball expert, in five out of the past six years, he says that his model has produced more correct than incorrect predictions.
What? Does this even mean anything? If, say, he was right 51% of the time five years and wrong 90% of the time that other year, wouldn't that make his number of successes less than the expected number of successes from just guessing "win" or "lose"? I guess he's either really modest ("I don't like to brag, so I'll just say the accuracy is higher than 42%."), or a really, really bad statistician.

Re:Huh? (-1, Flamebait)

Anonymous Coward | about 7 years ago | (#18629837)

Nitpick much, asshole?

On a serious note, you're fucking stupid.

Re:Huh? (2, Informative)

AstrumPreliator (708436) | about 7 years ago | (#18630501)

...or a really, really bad statistician.

Or a really good statistician. Remember, when you ask a statistician to crunch some numbers for you he'll reply back with "and what would you like the numbers to say?". They'll make it fit any curve you throw at them.

Keeping up appearances (4, Funny)

ScrewMaster (602015) | about 7 years ago | (#18629703)

"Hello Mr. Bukiet"

"It's pronounced bouquet!"

Re:Keeping up appearances (0)

Anonymous Coward | about 7 years ago | (#18630261)

I wonder how many people reading /. got that reference.

amazing (2, Insightful)

flynt (248848) | about 7 years ago | (#18629733)

Wait, you mean you can use past data to try to predict future events under certain assumptions, and sometimes it works? Someone should generalize this into some sort of academic discipline!

Re:amazing (1)

luckystuff (836232) | about 7 years ago | (#18629963)

No way. That bridge I walked across this morning was sturdy enough. It's just that I'm never going to walk on that bridge again. Not for those historical, "can't cross the same stream twice" reasons either. I just don't trust engineers.

We did this in college too... (1)

jpellino (202698) | about 7 years ago | (#18629763)

It was called Strat-O-Matic Baseball, and many a night in the hills of Worcester I had to fall asleep to the constant clinkity-clink-clink-clinkle of a pair of dice in a stolen cafeteria coffee cup.

Re:We did this in college too... (1)

Fastball (91927) | about 7 years ago | (#18630653)

1-5 HOMERUN

:)

PS - My all-time favorite Strat-O-Matic cards belonged to Bobby Witt. Especially his 1987 card. 143 IP, 160 K, 140 BB. Every inning an exciting one. :D

Re:We did this in college too... (1)

aero2600-5 (797736) | about 7 years ago | (#18630733)

Wow, someone else that knows what Strat-O-Matic is.

By the way, backgammon boards and cups really keep the noise down quite a bit.

Aero

But... Yankees Suck!! (3, Funny)

Jon_S (15368) | about 7 years ago | (#18629773)

signed,

Red Sox fan

Red Sox suck!! (1, Funny)

doormat (63648) | about 7 years ago | (#18630397)

Signed,

Yankees fan

PS Have fun blowing up more innocuous devices because you think they're bombs

Re:Red Sox suck!! (0)

Anonymous Coward | about 7 years ago | (#18630485)

Wait, Yankees Suck is +5 Funny, but Red Sox suck is -1 Flamebait? How so?

I suppose it's because the Red Sox suck is demonstrably true and that the Yankees suck is demonstrably false.

Besides, blowing up devices is fun. The stupid thing was that they had to evacuate parts of Boston over some blinking lights attached to batteries.

Re:Red Sox suck!! (1)

Wannabe Code Monkey (638617) | about 7 years ago | (#18630753)

The stupid thing was that they had to evacuate parts of Boston over some blinking lights attached to batteries.

No parts of Boston were evacuated; they shut down part of the subway and a bridge or two. Nonetheless, it was pretty stupid and I had a good laugh over it. Luckily I had to get into work really early that day so I completely missed the orange line closing.

Re:Red Sox suck!! (0)

Anonymous Coward | about 7 years ago | (#18631285)

So, in other words, they EVACUATED the bridges and subway stops, unless "shutting them down" means something different in Boston than in the rest of the world.

They EVACUATED parts of Boston. It's a fact. Live up to it and stop rewriting history.

Be glad it's just LEDs as bombs that people think of when they think of Boston. They could remember the Big Dig and the glued tiles that fell and killed a woman, paid for by federal tax dollars...

deconstruction (0)

Anonymous Coward | about 7 years ago | (#18631207)

Because his post was a tongue-in-cheek retort based on the standard battle cry of the Red Sox nation that is so ingrained that innumerable Bostonians are chanting it in their sleep right now, following their second win and a strong showing by Dice-K, pushing them to the top of the AL East. Applying the saying to retort a mathematical analysis of baseball is humorous because of the contrast drawn between the imagined stiff analytic mathematician and the rabid Red Sox fan, with a fenway frank in one hand and an overpriced fenway brew in the other. One knows which one to ask about the fluttering chaos of Wakefield's knuckleball on a blistering summer day at Tropicana Field, or, later, the chaos that will happen if Papelbon hits another batter deep into extra innings, and it's not the mathematician. For those in the know, the nature of baseball is that it's simply not subject to the mathematician's techniques, while the Sox fan's analysis is a succinct, withering critique of both the mathematician's methods and his results, while superficially the opposite appears to be true.

Your post was just dumb.

That's why he got +5 and you got a -1.

Plus the yankees do suck.

Re:Red Sox suck!! (1)

soundonsound (829141) | about 7 years ago | (#18631579)

I got news for you both. The Yankees AND the Red Sox suck. Put 'em both in the AL Central, and they're fighting for third place tops.

Biased (0, Flamebait)

Thirdsin (1046626) | about 7 years ago | (#18629795)

This guy is a jackoff. He lives in Jersey... last I checked, oh wait, yea Yankee bandwagon fan territory.

For my logical argument, his calcs cannot possibly account for trades, injuries, or the fact that every god damn player ever in the MLB does not have many seasons that are close enough to each other in terms of production to be predictable... This guy needs to go calculate the chances he gets hit by a bus on the Jersey Turnpike...

btw BoSox all the way this year :-P

Exactly 110 or at least 110? (1)

dircha (893383) | about 7 years ago | (#18629811)

The article says he has made more correct than incorrect predictions in his several years of doing this.

Something tells me that when he predicts that the Yankees will win 110 games, for example, he is counting his prediction as fulfilled if the Yankees win AT LEAST 110 games.

Because it would be pretty remarkable if he has correctly predicted the EXACT number of games teams will win more often than not over the past several years.

And since no margin of error is provided, there's really no basis for saying whether his model is impressive or not. Probably not.

Re:Exactly 110 or at least 110? (1)

Fred Ferrigno (122319) | about 7 years ago | (#18631649)

My model predicts that they will win at least one game. That makes me right for all six out of the last six years, so I guess I've got him beat.

That's nothing... (5, Funny)

ericpi (780324) | about 7 years ago | (#18629843)

He claims to be right more than wrong in five of the past six years.

That's nothing: I've developed a new mathematical algorithm that correctly predicts the outcome of the past six years with 100% accuracy.

Re:That's nothing... (-1, Troll)

Anonymous Coward | about 7 years ago | (#18629975)

I've developed an algorithm that plots how little anyone with a three digit IQ could possibly care about a boring prediction about a boring team in boring fucking sport.

Re:That's nothing... (0)

Anonymous Coward | about 7 years ago | (#18632189)

That's nothing: I've developed a new mathematical algorithm that correctly predicts the outcome of the past six years with 100% accuracy.

Hahaha. You said "mathematical algorithm", as if there could be a "non-mathematical algorithm". Hahaha.

(Oh no why am I laughing at idiots again. I am the pathetic one here.)

the yankees song... (0)

Anonymous Coward | about 7 years ago | (#18629849)

Yankees yankees yankees,
They win all the games!
Yankees yankees yankees,
yankees yankees yankees!

110 Games? (1)

spike2131 (468840) | about 7 years ago | (#18629923)

The Yankees have weak-ass pitching this year. No chance they win 110 games. More likely 90.

Wanna bet on that? (0)

Anonymous Coward | about 7 years ago | (#18630335)

Whaddya say.. $50?

Re:Wanna bet on that? (1)

shoemilk (1008173) | about 7 years ago | (#18630451)

$50? coward, I'll see you $100,000. No way in hell the Yanks win 100 games let alone exactly 110.

Bah (1, Redundant)

localhost00 (742440) | about 7 years ago | (#18629951)

Don't Yankees fans predict they will dominate every year? That being said, I never take predictions like this seriously, especially if it is another "Yankees will pwn" claim. Odd, however, that I didn't see anyone predict what the 2001 Seattle Mariners did [wikipedia.org] (116 wins).

Oh, and yes, I am a mathematician (will obtain BA degree in math this June).

Re:Bah (1, Insightful)

Anonymous Coward | about 7 years ago | (#18631203)

You claim to be a mathematician with merely a BA in mathematics? Please, get off your high horse, son.

Mathematicians know nothing about (0)

iminplaya (723125) | about 7 years ago | (#18629995)

professional sports. Now, Jimmy the "Moose" Morgan, him I'll believe. He don't guess the probabilities, he makes them. A lead pipe trumps your modern math any day of the week.

Re:Mathematicians know nothing about (1)

spisska (796395) | about 7 years ago | (#18631011)

Now Jimmy the "Moose" Morgan, him I'll believe. He don't guess the probabilities, he makes them. A lead pipe trumps your modern math any day of the week.

But setting the odds on sports matches isn't really about the probability of one team winning or losing. It's about balancing the way that people will bet. The odds are structured to minimize the risk and maximize the return of the bookmaker, based on bettor behavior.

"Moose" Morgan doesn't need to know or care whether the Yankees are likely to beat the Orioles tomorrow, only what the balance will be between bets on the Yankees and Orioles. As an experienced bookmaker, Moose will naturally give favor to the Yankees from the outset. But if he's getting twice the number of bets on the Orioles to win than the Yanks, then he will shift his odds accordingly.

Because of this, the science (and math) in sports gambling comes down to finding the inefficiencies -- i.e. figuring out where the bettors have moved the gambling odds far enough beyond the real odds that it makes a bet attractive. Meaning that if you can come up with an algorithm that is reasonably accurate and says that the Yankees:Orioles ought to be 6:5 but the betting odds are 7:2, you stand to make a killing.
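As a rough worked version of that example: the sketch below assumes both figures are odds against the same outcome, quoted as against:in_favour, so 7:2 means a winning 2-unit stake returns 7 units of profit. That reading of the 6:5 and 7:2 numbers is an assumption, and the function names are made up.

```python
from fractions import Fraction


def implied_probability(against, in_favour):
    """Win probability implied by odds quoted as `against:in_favour`."""
    return Fraction(in_favour, against + in_favour)


def expected_profit_per_unit(model_prob, against, in_favour):
    """Expected profit on a 1-unit stake at odds `against:in_favour`,
    if the true win probability is `model_prob`."""
    payout = Fraction(against, in_favour)  # profit per unit staked on a win
    return model_prob * payout - (1 - model_prob)


model_prob = implied_probability(6, 5)   # the algorithm's "fair" odds of 6:5
book_prob = implied_probability(7, 2)    # the betting market is offering 7:2
edge = expected_profit_per_unit(model_prob, 7, 2)

print(model_prob, book_prob, edge)  # 5/11 2/9 23/22
```

The bet is only attractive if the model's probability is actually right; with the market's own implied probability of 2/9 plugged in, the same calculation comes out to exactly zero.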

This is much easier to do with something like horse racing than with baseball. With horses, you have a relatively small number of people spread across 6 to 15 or so potential winners in every race. Inefficiencies abound, in the sense that favorites often win but pay at 2:1 when realistically they should be 4:1, eg. The secret to a happy day at the track is finding the horse that should be 4:1 or 5:1 but is running at 16:1.

Jimmy knows his business, and deserves credit for that. After all, why would you try to gamble and risk losing when you're smart enough to take a piece of every bet and always win?

Not that impressive (0)

Anonymous Coward | about 7 years ago | (#18630005)

Statistically speaking, there is a 50% chance of blindly guessing correctly for each baseball game: win or lose. The fact that he has been "right more than wrong five out of the past 6 years" simply means that each year, except for one, he was more than 50% right. That is a very modest claim for such a comprehensive system. And for that one year, one would have better luck blindly guessing than following his calculations. There are a lot of methods of predicting the results of sports games and I fail to see why his deserves any attention, esp with such vague references of his collective results.

He's been way off-the-mark for years... (4, Interesting)

Golgafrinchan (777313) | about 7 years ago | (#18630023)

First, a link to the professor's baseball page. [njit.edu]

In 2006, he predicted 102 Yankee wins. They won 97. Not too bad.

In 2005, he predicted 113 Yankee wins. They won 95. Way off.

In 2004, he predicted 117 Yankee wins. They won 101. Way off.

In 2003, he predicted 110 Yankee wins. They won 101. Not great.

In other words, take this forecast with a big boulder of salt.
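A quick tally of those four seasons, using only the predicted and actual win totals quoted above (not independently checked):

```python
# year: (predicted Yankee wins, actual Yankee wins), as quoted in the comment above
forecasts = {2003: (110, 101), 2004: (117, 101), 2005: (113, 95), 2006: (102, 97)}

errors = {year: predicted - actual for year, (predicted, actual) in forecasts.items()}
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)
mean_error = sum(errors.values()) / len(errors)  # positive means over-predicting

print(errors)                      # {2003: 9, 2004: 16, 2005: 18, 2006: 5}
print(mean_abs_error, mean_error)  # 12.0 12.0
```

The two averages match because every miss is in the same direction: the forecast was too high in all four years.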

Re:He's been way off-the-mark for years... (1)

jayhawk88 (160512) | about 7 years ago | (#18630393)

So basically he's just a myopic Yankees fan. Got it.

Although that is funny, him predicting in 2004 that the Yankees would break the season record for wins.

Big Whup... (2, Informative)

Anonymous Coward | about 7 years ago | (#18630029)

Bill James came up with simple quantifiable statistics that could very accurately predict the success rate for a baseball team back in the '70s. The Oakland A's had a lot of success using those methods to put teams on the field that would win between 95 and 100 games per year while spending as little as possible. It worked remarkably well and a book (Moneyball, by Michael Lewis) was written about it.

In short, this is old and well covered news, unless this guy has come up with a simulation that is significantly more accurate (doubtful).

Predicting the past is... (1, Interesting)

Anonymous Coward | about 7 years ago | (#18630107)

easier than predicting the future.

He modeled his program on the past 5-6 years' data; that's why: "He claims to be right more than wrong in five of the past six years."

How does he factor rookies? Does he model injuries and use the data to rank teams susceptibility to lost talent?

Unless this program is 6 years old his model is only back-tested; not proven.

Re:Predicting the past is... (1)

Paradise Pete (33184) | about 7 years ago | (#18630831)

Predicting the past is...easier than predicting the future.

Are you basing that statement on past results?

Re:Predicting the past is... (1)

hyfe (641811) | about 7 years ago | (#18632251)

Predicting the past is easier than predicting the future.

No, it's seriously not. They are exactly the same. There's no difference between taking the first 3 of the last 5 years, training your model on them and validating on the last 2, and training on the last 3 years and validating on the next two to come. The model doesn't know the clock, and datasets are datasets.

There is a world of difference between accuracy rates on your training/calibration set and your model's performance on the validation set. One of them is occasionally useful; the other never is. As the lackluster comments on this story somewhat illustrate, in regular computer science there are way too many people who don't know the difference. This is also one of the major reasons I really don't consider Computer Science a real Science; it's just hack'n'slash and fancy words (kinda like dentists).
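A minimal sketch of that split, under loud assumptions: the five "seasons" are synthetic (a made-up true skill per team plus random noise), the team names are placeholders, and the "model" is just each team's average over the calibration seasons. Nothing here is real baseball data.

```python
import random

rng = random.Random(0)
TRUE_SKILL = {"A": 95, "B": 88, "C": 81, "D": 74, "E": 67}  # hypothetical teams

# Five synthetic seasons of win totals: true skill plus noise (made-up data).
seasons = [{team: round(skill + rng.gauss(0, 8)) for team, skill in TRUE_SKILL.items()}
           for _ in range(5)]

# Calibrate on the first 3 seasons, validate on the last 2.
train, validation = seasons[:3], seasons[3:]


def team_average_model(train_seasons):
    """'Fit' the model: predict each team's mean win total from the training seasons."""
    return {team: sum(s[team] for s in train_seasons) / len(train_seasons)
            for team in train_seasons[0]}


def mean_abs_error(prediction, seasons_to_score):
    """Average absolute miss, in wins, over every team-season being scored."""
    errors = [abs(prediction[team] - wins)
              for season in seasons_to_score
              for team, wins in season.items()]
    return sum(errors) / len(errors)


model = team_average_model(train)
print("calibration MAE:", mean_abs_error(model, train))
print("validation MAE: ", mean_abs_error(model, validation))
```

Only the second number says anything about forecasting, since the first is scored on the same seasons the model was fitted to.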

In Other News: (1)

Miseph (979059) | about 7 years ago | (#18630117)

"Accountant predicts Yankees will dominate based on salary spending."

"Sports historian predicts Yankees will dominate based on past seasons."

"Incoherent drunk predicts Yankees will dominate based on voices in his head telling him so."

"Everyone who's even remotely familiar with MLB dies of a massive simultaneous aneurysm trying to comprehend why anyone predicting the Yankees will be one of the top teams in the league for any reason at all qualifies as "news" rather than statement of the obvious."

Seriously, I'm from Massachusetts and detest the Yankees, and I still have to acknowledge that even if the Yankees are "having a bad season", they're still one of the best teams in the league.

What about Daisuke? (1)

stubear (130454) | about 7 years ago | (#18630141)

I want to know how he calculated Daisuke Matsuzaka's numbers since he's never played ball in the states. Theoretically he should dominate the AL given his performance in Japan but those numbers don't mean much when considering the power hitters in the AL, much less MLB. Here's hoping Bukiet is wrong though. I'd love to see the Yankees tank and not make the play-offs but I'm a Red Sox fan and I always hope that happens.

Climate Models? (5, Insightful)

Matteo522 (996602) | about 7 years ago | (#18630341)

So let me get this straight..

Climatologists use past data, computer models, and mathematical projections to support global warming and predict future results, and everyone calls it strong science based on facts. If the models are off, it's just a part of the scientific process, but the overall claim is still valid.

But if a statistician uses past data, computer models, and mathematical projections to predict baseball results, it's dismissed as some crack job's phony science. If the models are off, it's proof that he has no idea what he's doing and how these kinds of models don't work.

Am I missing something here?

Re:Climate Models? (0)

Anonymous Coward | about 7 years ago | (#18631103)

This has got to be the most insightful comment on Slashdot in the past few years.

Re:Climate Models? (1)

mumrah (911931) | about 7 years ago | (#18631185)

I would think climatologists have a bit more data to work with than a handful of baseball players' stats and past trends. Also climate models are based on physical laws as well as statistics, whereas baseball is pretty much purely statistics.

Re:Climate Models? (2, Insightful)

zippthorne (748122) | about 7 years ago | (#18631191)

Yes. In the public experience, most fancy sports predictions have a history of being inaccurate. This is unlike the experience with climate models, which historically have also given us some predictions.

Re:Climate Models? (2, Insightful)

Ibag (101144) | about 7 years ago | (#18632087)

What you are missing is that not all models are created equal, and not all things are as easy to model. It's all about variance. Consider the weather, for example. We can accurately predict what it will be for a day or two, and we have a decent guess for about a week, but beyond that, there is too much complexity and variability for us to say much (not to mention that weather appears to be a chaotic dynamical system, which means that precise long-range prediction is effectively impossible). However, if I were to ask you what kind of weather I could expect this July, you could make a fairly accurate guess of "warm". All the small scale variations cancel out, and you can have a very good prediction of what the average temperature, or average rainfall, or average anything else will be over the next year, or 10.

For long term climate, we have a good idea how many of the processes involved work, and we can vary all the parameters to give ranges on the possible outcomes. While we can't use them to predict the rainfall in Boston on July 4, 2057, we can use them to say that the mean global temperature will be 3-5 degrees warmer that year (or some other similar statement).

Compare this to baseball. There aren't enough interactions for small variations not to throw everything off. Things like injuries, marital problems, drugs, rivalries, and weather could shift the outcomes of major games and change the results of this model more severely than China switching to nuclear power would change the results of climate models. There is a better chance at predicting total numbers of runs or hits during the season, as the variation on things like that is smaller. Predicting the number of games won is almost as hopeless as predicting the outcome of an individual game, and if you could do that, you could hire people to post to slashdot for you.
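To put a rough number on the variance argument, here is a minimal sketch that treats every game as an independent coin flip with a fixed win probability. That is a deliberate simplification for illustration, not Bukiet's model, and the names `simulate_season_wins` and `win_total_spread` are made up for this example.

```python
import math
import random
import statistics

def simulate_season_wins(win_prob, games=162, rng=None):
    """Simulate one season, assuming each game is an independent coin flip."""
    if rng is None:
        rng = random.Random()
    return sum(1 for _ in range(games) if rng.random() < win_prob)

def win_total_spread(win_prob, games=162):
    """Standard deviation of the season win total under the coin-flip assumption."""
    return math.sqrt(games * win_prob * (1.0 - win_prob))

if __name__ == "__main__":
    rng = random.Random(2007)  # fixed seed so repeated runs match
    totals = [simulate_season_wins(0.55, rng=rng) for _ in range(5000)]
    print("mean wins:", round(statistics.mean(totals), 1))
    print("simulated spread:", round(statistics.pstdev(totals), 1))
    print("formula spread:", round(win_total_spread(0.55), 1))
```

Under that assumption, a true .500 team has a standard deviation of about 6.4 wins over 162 games from luck alone, before injuries or trades enter the picture.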

Win Expectancy and available data (1)

h4ter (717700) | about 7 years ago | (#18630349)

FTA: "Were the model to be commercialized, it could be updated on a play-by-play basis, which fans could monitor to see how every play changes the outcome of a game. "I think some fans would think that's cool," Bukiet said."

How individual plays affect the outcome (or probable outcome) has been a well-worn subject of late in the blogs and discussion lists of baseball fans. And you don't need commercial products for answers. Retrosheet.org [retrosheet.org] provides play-by-play data reaching back decades, from which I calculated how often given game-states have resulted in wins for the home team. Taking the win expectancies before and after an event tells you how important the event was. My Win Expectancy Finder lives here [walkoffbalk.com] .

I imagine this guy's using Markov chains, too.
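The win expectancy calculation described above can be sketched roughly as follows. This is an illustration only: the game-state tuple layout, the function names, and the handful of toy records are all invented here, and real input would come from parsed play-by-play data rather than a hand-written list.

```python
from collections import defaultdict

def build_win_expectancy(records):
    """Map each game state to the fraction of times the home team went on to win.

    records: iterable of (state, home_won) pairs, where state is any hashable
    description of the situation (hypothetical layout used below:
    inning, half, outs, runners, home score minus away score).
    """
    counts = defaultdict(lambda: [0, 0])  # state -> [home wins, times seen]
    for state, home_won in records:
        counts[state][1] += 1
        if home_won:
            counts[state][0] += 1
    return {state: wins / seen for state, (wins, seen) in counts.items()}

def event_swing(table, before, after):
    """Change in home win expectancy caused by a single event."""
    return table[after] - table[before]

# Toy records, made up purely to exercise the code.
bases_empty = (9, "bottom", 2, "---", -1)
runner_on_first = (9, "bottom", 2, "1--", -1)
toy_records = [
    (bases_empty, False), (bases_empty, False),
    (bases_empty, False), (bases_empty, True),
    (runner_on_first, True), (runner_on_first, False),
]

table = build_win_expectancy(toy_records)
swing = event_swing(table, bases_empty, runner_on_first)
```

With decades of real games the table gets enough samples per state for the fractions to be meaningful; with six toy rows they only show the mechanics.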

From one of his students (5, Informative)

kenb215 (984963) | about 7 years ago | (#18630443)

Wow, I never expected somebody that I knew to get on Slashdot. Bruce Bukiet is my Calculus II professor at NJIT.

He mentioned this before a few times, including today after that article made it to the most popular spot on Yahoo! [yahoo.com] News. This is more of a hobby for him than an official project.

From what he has said in the past about the model, it tends to overestimate the Yankees, among other reasons because they often buy good players at the end of their prime, so those players won't play as well as they had in the past. He hasn't used it to make any bets. For the model, coming within a game or two of the actual results is considered a good prediction.

As some people above said, the model isn't intended to be extremely accurate, and is frequently off by a significant amount. The interviews he does are more to get people interested in math, and to see how it has real use, rather than to try and show off. He used to go into more details in the past, but doesn't now because they tend to confuse the interviewer, and don't make it into the final article.

Some pages of his own about the project are:
http://m.njit.edu/~bukiet/baseball/baseball.html [njit.edu]
http://www.egrandslam.com/ [egrandslam.com]

Baseball and nerdiness go hand-in-hand... (1)

walkie (794662) | about 7 years ago | (#18630499)

Two of the more respected, statistically-based projection systems out there are Nate Silver's PECOTA [wikipedia.org] and Diamond Mind Baseball [diamond-mind.com] .

Their 2007 Yankees projections:

PECOTA: 93
Diamond Mind: 96

Steinbrenner and Bush (0, Flamebait)

Dracos (107777) | about 7 years ago | (#18630505)

Just as President Bush ignores Congress, so does George Steinbrenner ignore the salary cap rules of Major League Baseball. The Yankees literally buy a spot in the playoffs every year.

Not a real world application (1)

HotDogWater (1084827) | about 7 years ago | (#18630605)

This sounds like a good idea but you are gonna go crazy just like Maximillian Cohen trying to predict life. You can't predict a player going on the injured list like you can calculate RBIs. It is illogical to use something like this in a chaos filled world. For all you know the whole Yankees team can be thrown out for illegal sports betting. It is also wrong because you forgot about the Detroit Tigers.

Regression toward the mean (0, Flamebait)

tyrr (306852) | about 7 years ago | (#18630689)

I guess your "mathematician" is not a big fan of "regression toward the mean" [wikipedia.org]. Very unfortunate.
Please stop calling charlatans mathematicians. Mathematicians do know that luck cannot be predicted or replicated.

Shameless Plug: (1)

noSignal (997337) | about 7 years ago | (#18631349)

Speaking of computerized projections, if you're at all interested in horse racing, check out http://www.desertsea.com/ [desertsea.com] . Oh, that and it takes some guts to predict a good season for the Yankees. That's like going to a casino and rooting for the dealer.

Math? Hardly (1)

ffejie (779512) | about 7 years ago | (#18631487)

AL East: New York Yankees
AL Central: Cleveland Indians
AL West: Los Angeles Angels
AL wildcard: either the Boston Red Sox, the Toronto Blue Jays or the Minnesota Twins


OK, so he managed to choose division winners and then say that the Wild card would come from one of THREE other teams. I don't think there's much math or stats going on here. Shouldn't he be able to pick ONE team and say they're going to win the Wild Card? This sounds more like a baseball fan's prediction than a mathematical prediction.

Re:Math? Hardly (0)

Anonymous Coward | about 7 years ago | (#18631849)

AL East: New York Yankees
AL Central: Cleveland Indians
AL West: Los Angeles Angels
AL wildcard: either the Boston Red Sox, the Toronto Blue Jays or the Minnesota Twins

OK, so he managed to choose division winners and then say that the Wild card would come from one of THREE other teams. I don't think there's much math or stats going on here. Shouldn't he be able to pick ONE team and say they're going to win the Wild Card? This sounds more like a baseball fan's prediction than a mathematical prediction.

And if you bothered to look at his win/loss predictions [egrandslam.com] :

AL East             AL Central          AL West
Yankees 110-52      Indians 91-71       Angels 94-68
Red Sox 87-75       Twins 88-74         A's 80-82
Blue Jays 87-75     Tigers 84-78        Rangers 77-85
Orioles 75-87       White Sox 82-80     Mariners 74-88
Devil Rays 55-107   Royals 58-104

The Sox and Jays have 87 wins and the Twins 88. Given an error of +/- 1, this makes perfect sense.
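The margin-of-error reasoning can be checked mechanically. The sketch below uses the projected win totals quoted above; the function name `wildcard_candidates` and the rule "a team is still in contention if its total plus the margin reaches the best non-winner's total minus the margin" are assumptions made for illustration, not anything stated by Bukiet.

```python
# Projected win totals as quoted in this comment.
projected_wins = {
    "AL East": {"Yankees": 110, "Red Sox": 87, "Blue Jays": 87,
                "Orioles": 75, "Devil Rays": 55},
    "AL Central": {"Indians": 91, "Twins": 88, "Tigers": 84,
                   "White Sox": 82, "Royals": 58},
    "AL West": {"Angels": 94, "A's": 80, "Rangers": 77, "Mariners": 74},
}

def wildcard_candidates(divisions, margin=1):
    """Return non-division-winners whose totals overlap the best one within +/- margin."""
    winners = {max(teams, key=teams.get) for teams in divisions.values()}
    others = {
        team: wins
        for teams in divisions.values()
        for team, wins in teams.items()
        if team not in winners
    }
    best = max(others.values())
    # Overlap test (an assumption): team's upper bound reaches the leader's lower bound.
    return sorted(team for team, wins in others.items()
                  if wins + margin >= best - margin)

candidates = wildcard_candidates(projected_wins, margin=1)
```

With a margin of one game the Tigers at 84 fall outside the overlap, which leaves exactly the three teams named in the wildcard line.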

This sort of thing is explained in detail.... (1)

iritant (156271) | about 7 years ago | (#18631739)

... in the book Moneyball by Michael Lewis. He follows Billy Beane through a season with the Oakland A's, where they beat their division even though they were outspent by nearly every other team. This prompted former Fed Chair Paul Volcker to comment that Beane had found a market inefficiency. He had used such an inefficiency, but it wasn't Beane who had found it.

To do this right, however, you have to do legwork, because according to the model described in Moneyball, On Base Percentage is really what you're after, not batting average, and from a pitching/fielding perspective you want to do something more nuanced. He broke the field out into zones and provided feedback based on that. My recollection is that he didn't go into too many details about that part.

The important part was to get a $/runs scored number.
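For anyone unsure why On Base Percentage and batting average can rank hitters differently, here is a small sketch using the standard definitions of the two statistics. The two stat lines are hypothetical hitters invented for the comparison, not real players.

```python
def batting_average(hits, at_bats):
    """Hits per official at-bat; walks do not count at all."""
    return hits / at_bats

def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """Times on base per plate appearance (standard OBP definition)."""
    reached = hits + walks + hit_by_pitch
    chances = at_bats + walks + hit_by_pitch + sac_flies
    return reached / chances

# Hypothetical hitter A: high average, rarely walks.
avg_a = batting_average(150, 500)
obp_a = on_base_percentage(150, 20, 0, 500, 0)

# Hypothetical hitter B: lower average, walks a lot.
avg_b = batting_average(135, 500)
obp_b = on_base_percentage(135, 100, 5, 500, 5)
```

Hitter A wins on batting average while hitter B wins comfortably on OBP, which is the kind of gap the Moneyball approach was looking to buy cheaply.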
 

claims to be right in.... (0)

Anonymous Coward | about 7 years ago | (#18631815)

>> He claims to be right more than wrong in 5 of the past 6 years.

60% of the time, it works every time.

read this title here in Europe when half awake... (1)

Herve5 (879674) | about 7 years ago | (#18631839)

OMG, here in Europe I always enter /. on RSS (so with no tags indicated): honestly, this morning when half asleep I understood someone had mathematically determined that the US is to dominate everyone forever...