
New Leader In Netflix Prize Race With One Day To Go

Soulskill posted more than 4 years ago | from the sniped-like-an-ebayer dept.

Programming 87

brajesh writes "The Netflix Prize, an algorithm competition to improve the Netflix Cinematch recommendation system by more than 10%, has a new leader — The Ensemble — just one day before the competition ends. The 30-day race to the end was kicked off after BellKor's Pragmatic Chaos submitted the first entry to break the 10% barrier, with the results showing a 10.08% improvement. The Ensemble, made up of three teams who chose to join forces ('Grand Prize Team,' 'Opera Solutions,' and 'Vandelay United'), has managed to overtake BellKor with a score of 10.09% — an improvement of .01% over the former leaders. From the article on TechCrunch: 'The competition will end [today], so teams still have a little bit of time left to make their last-second submissions, but things are looking good for The Ensemble. This has to be absolutely brutal for team BellKor.'"


87 comments

new leader in first post (-1, Offtopic)

Anonymous Coward | more than 4 years ago | (#28826713)

you can still suck my ass, though.

I think (5, Insightful)

sys.stdout.write (1551563) | more than 4 years ago | (#28826715)

that other websites should do this as well.

Slashdot, for instance, could have a contest to unbreak their fucking code by 10%.

Re:I think (3, Funny)

Anonymous Coward | more than 4 years ago | (#28826869)

Are you joking? Slash is written in Perl, the best maintenance method is too start again.

(Joking, partly).

Re:I think (0)

Anonymous Coward | more than 4 years ago | (#28827627)

And your suggestion would be too rewrite it in PHP ? *cough* *script-kiddie* *cough*

I just hope your syntax is better than your grammar.

Re:I think (0)

Anonymous Coward | more than 4 years ago | (#28828007)

Syntax is a subset of grammar, you insensitive clod!!

Re:I think (1)

brentonboy (1067468) | more than 4 years ago | (#28828037)

Syntax is a subset of grammar, you insensitive clod!!

Ha! But you're equivocating, of course. He means code syntax.

Re:I think (3, Insightful)

Vectronic (1221470) | more than 4 years ago | (#28826871)

(-1 Offtopic) But I've sort of hoped that a site such as Slashdot would somehow open-source its site code and make it a sort of community project. Considering the context of the site and the number of users, there are probably about 5,000 people capable of contributing decent code/help, and there has to be a rather significant number of those who are willing to.

Add a section devoted to it, then Polls, about which contribution should be implemented, etc. Articles/Submission are sort of (controlled) "open-source", why not the site itself?

Re:I think (5, Funny)

Blue Stone (582566) | more than 4 years ago | (#28826901)

>Slashdot, for instance, could have a contest to unbreak their fucking code by 10%.

I remember playing Call of Cthulhu many years ago and being told of the hideously deranging results of mere mortals who happened to gaze upon the unspeakable things that lurked in the dark places.

I beg you not to lead others down your insane and twisting path.

NO GOOD CAN COME OF IT! NO GOOOD!

Re:I think (1)

houstonbofh (602064) | more than 4 years ago | (#28827101)

So you're saying that looking at all of the Slashdot code, and actually understanding it, breaks your mind? Well, that explains this nasty, system-choking JavaScript then.

Re:I think (1)

cream wobbly (1102689) | more than 4 years ago | (#28854419)

Only half-scanning your comment, I initially read the GP as "unbreak [Slashdot's] fucking code: 10%", and wondered what would happen if I rolled 11. Would the ref decide that Slashdot's fucking code had broken more? Or would he be reasonable and decide I had narrowly avoided a total fucking code breakage? Or was he playing the rule that missing a skill by 1% meant that I would be eaten by shoggoths?

Uve Boll (2, Funny)

Afforess (1310263) | more than 4 years ago | (#28826717)

What did they do, make sure that all of Uve Boll's movies never came up as a "Recommended for you" movie?

It's not Uve (2, Informative)

thetoadwarrior (1268702) | more than 4 years ago | (#28826803)

Uwe Boll. It only sounds like a v because he's German.

Re:It's not Uve (0)

Anonymous Coward | more than 4 years ago | (#28838817)

Sorry. I thought the v came from evil.

Re:It's not Uve (0)

Anonymous Coward | more than 4 years ago | (#28851923)

No because he's a fucktard.

I used to be very elitist about my reading (3, Interesting)

BadAnalogyGuy (945258) | more than 4 years ago | (#28826727)

Back when I first began using Amazon.com, I never bought a book based on the recommended items. I felt the recommendations were trite, ill-advised, and typically only peripherally related to the item I was buying.

Then the recommendations got better. Much better. I started to find myself buying things right out of the recommended section, and the product combination deals also became very tempting.

If Netflix can turn their recommendation engine into something similar, they will be sitting on a goldmine. As they say, people hate being sold to, but they love buying.

Re:I used to be very elitist about my reading (0)

Anonymous Coward | more than 4 years ago | (#28827507)

Looks like you aren't a Netflix customer. Their recommendation engine is already way better than Amazon's.

If you are going to make assumptions, at least try to be on the safe side, for fuck's sake.

Re:I used to be very elitist about my reading (1)

BadAnalogyGuy (945258) | more than 4 years ago | (#28827587)

The only assumption that I made was that the recommendation engine could be improved.

With approximately 10,000 subscribers (as of 2008), and 1.3B in revenues from these subscribers, even a 1% increase in rentals would be worth 10 times the 1M they are paying to the winner of this contest.

Amazon has almost 20B in revenues from a much larger group of customers. A 1% increase per customer here would be huge.

Netflix, in addition to increasing the number of rentals per customer, should also be thinking about increasing the total number of customers.

Re:I used to be very elitist about my reading (0)

Anonymous Coward | more than 4 years ago | (#28828383)

With approximately 10,000 subscribers (as of 2008), and 1.3B in revenues from these subscribers, even a 1% increase in rentals would be worth 10 times the 1M they are paying to the winner of this contest.

How do you figure that? It's not as if Netflix charges by the DVD rental; people pay for the ability to rent/keep x DVDs at a time, for as long as they pay for that plan. People will (usually) already have a large number of movies they want to watch stored in their queue, so queue length isn't driving these people to more expensive subscription plans. I fail to see how a better recommendation system (which will simply lead to more movies in the queue) is going to translate into a significant number of customers making the leap to an x+1 subscription plan.

Re:I used to be very elitist about my reading (1)

mattack2 (1165421) | more than 4 years ago | (#28828927)

Wow, you need movies & books to be recommended?

I have far more movies & books that I'm at least *vaguely* interested in than I can 'consume'. (A large part of the reason I started using the Netflix profile system was because of the 500 item limit in the queue.. and yes, I realize I won't ever watch the VAST majority of them.. but I would add movies/TV shows/documentaries that sounded interesting, and hit the limit. Note obviously a lot of the multiple items are separate discs in a collection, such as 'extras' discs or multiple discs of a TV show. I have since mostly separated into movies & TV profiles, but haven't moved all TV shows on my orig huge list to the TV profile.)

Don't get me wrong, I'm interested in improving the recommendation system just for curiosity/algorithmic reasons.

Why now? (1, Insightful)

Anonymous Coward | more than 4 years ago | (#28826741)

Why not wait another day before submitting the improvement? All they did was give the other team one day to respond, and if the other team succeeds, I doubt they will be able to submit yet another improvement. So why not simply wait until an hour or so before the deadline? Or am I missing something about the rules, e.g. do submitted improvements prolong the deadline by one day?

Re:Why now? (1)

garcia (6573) | more than 4 years ago | (#28826823)

Maybe they already have a solution which is higher and they are just being dicks? Maybe they aren't dicks at all and want to see the best team win? Maybe they think that their solution is unbeatable?

Whatever it is, it is certainly a lot more interesting than I thought it'd ever be. Kudos to the groups that have broken the 10% barrier!

Re:Why now? (3, Insightful)

Anonymous Coward | more than 4 years ago | (#28826929)

It does seem like a slight flaw in the rules that there is only one 30-day countdown timer; a competing team can hold off until the last moment to release a version that bests the current leader, as happened here. Now that this improvement has been made public, there should be something like a 10-day response window for the other competing teams.

Re:Why now? (4, Interesting)

caffeinemessiah (918089) | more than 4 years ago | (#28826961)

Why not wait another day before submitting the improvement? All they did now was giving the other team one day to respond, and if they succeed, I doubt they will be able to submit yet another improvement. So why not simply wait until an hour or so before the deadline, or am I missing something about the rules, e.g. any submitted improvements prolong the deadline by one day?

For the grand prize, there was a final 30-day countdown from the time the first entry that achieved greater than 10% was received, which was a month ago. So it seems like this will indeed come down to an ebay-like sniping situation in the last few hours.

I wouldn't feel too sorry for BellKor/KorBell though -- they've got many, many best-paper awards at conferences and a huge degree of publicity out of the whole endeavor. In fact, at KDD 2009 they detailed most of the methods that most likely got them to the top -- i.e., they incorporated the fact that tastes and preferences drift over time. Simple in retrospect, of course. If you have an ACM subscription, you can read the 2009 paper here [acm.org].

Plus, since they work for AT&T/Yahoo Research, I remember Yehuda Koren stating that the money wouldn't have gone to them anyway -- possibly a large bonus, but I think they're entitled to that anyway. So I wouldn't feel too sorry for them.

Re:Why now? (5, Informative)

brian_tanner (1022773) | more than 4 years ago | (#28827871)

It's also true that the winner is not the person who gets the highest score on the leaderboard. Most people seem to miss this.

The leaderboard gives the score on the QUIZ dataset, which is half of the answers that a team submits. The WINNER of the million dollars is the team that does best on the TEST dataset, the other half of the answers they submit. Nobody knows how good these guys are doing on the TEST set, either team could be overfitting [wikipedia.org] the quiz set.

Re:Why now? (0)

Anonymous Coward | more than 4 years ago | (#28828983)

Nobody knows how good these guys are doing on the TEST set, either team could be overfitting [wikipedia.org] the quiz set.

Yeah, but if they're not hitting the 10% mark on the quiz set, then they're probably not going to hit the target 10% on the test set either, regardless of whether they're overfitting to the public data.

Re:Why now? (1)

brian_tanner (1022773) | more than 4 years ago | (#28829549)

Yeah, but if they're not hitting the 10% mark on the quiz set, then they're probably not going to hit the target 10% on the test set either, regardless of whether they're overfitting to the public data.

Yeah, there is a flaw in the evaluation mechanism, in my opinion. The good thing is that you don't need to hit 10% on the test set to win the money. Whatever team is qualified (10% on quiz) AND has the best test score wins. Even if they have terribly overfit the quiz set (the quiz set has been around for years now), and have terrible performance on test, one of the two qualified teams will win the money.

The flaw is that other teams that have not hit 10% on quiz might be doing better on test. If that's true, those people cannot win the money, even though they apparently have a stronger (less overfit) solution. Of course, all of these scores are ridiculously close to each other anyway, but it seems contrary to the nature of a competition if the winner is not the team with the best submitted solution.

I sincerely hope that no matter what happens, ALL of the test scores (for all teams) are revealed, so everyone can see what was really happening.
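The quiz/test mechanics described in this thread can be sketched in a few lines. This is a hedged illustration: the synthetic data, the trivial constant predictor, and the 50/50 split are all made up; only the idea of a public quiz half versus a hidden test half follows the contest rules.

```python
import random

def rmse(predictions, actuals):
    """Root mean squared error, the metric used by the Prize."""
    return (sum((p - a) ** 2 for p, a in zip(predictions, actuals))
            / len(actuals)) ** 0.5

# Synthetic "true ratings" for a hidden evaluation set, split at random
# into a public-facing quiz half and a hidden test half.
random.seed(0)
hidden = [(i, random.uniform(1, 5)) for i in range(10_000)]
random.shuffle(hidden)
quiz, test = hidden[:5_000], hidden[5_000:]

def score(predict, subset):
    return rmse([predict(i) for i, _ in subset], [a for _, a in subset])

def predict(movie_id):
    return 3.0  # trivial stand-in for a real model

# Teams only ever see their quiz score on the leaderboard; the winner is
# decided by the test score, which stays hidden until the end.
print("quiz RMSE:", round(score(predict, quiz), 4))
print("test RMSE:", round(score(predict, test), 4))
```

With a large enough holdout the two halves score similarly, which is exactly why a tiny quiz-set lead tells you little about who wins on the test set.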

Re:Why now? (2, Informative)

currivan (654314) | more than 4 years ago | (#28831039)

In fact, according to the second post by Yehuda Koren in this thread, it looks like BellKor does have the best test error rate and will be declared the winner. http://www.netflixprize.com/community/viewtopic.php?id=1498 [netflixprize.com]

Mod parent +5 Informative, BellKor et al have won (0)

Anonymous Coward | more than 4 years ago | (#28831545)

Yehuda claims to have the best test error rate. They will win the million dollars. It was super-exciting: it seemed BellKor et al. would be defeated thanks to a little-known rule of the competition (the 30-day last-call rule). But they have won after all, thanks to an even less-known rule (the quiz dataset/test dataset distinction).

Re:Why now? (1)

SpinyNorman (33776) | more than 4 years ago | (#28826979)

Trying to flush out the competition, maybe? (Unless it really is the best they have, or think they'll have.)

Or perhaps trying to lull the competition into a false sense of security by only edging them out by a hair, when they have something better held back?

Of course, with the amount of effort the teams have put into this, and the money at stake, you'd be nuts not to keep working on it flat-out until the time runs out. But still, if you're tired, it could make a difference whether you think you've got the competition beaten by a comfortable margin, or know you're in a losing position because they've already submitted their true best shot, or something close to it.

Re:Why now? (1)

tonycheese (921278) | more than 4 years ago | (#28827257)

Well, perhaps they did not know exactly how Netflix would rate their efficiency until after a submission. A .01% difference is pretty close, and they might not have known whether they would overtake first place without submitting and having their algorithm run by Netflix.

should've "gamed" it (4, Interesting)

petes_PoV (912422) | more than 4 years ago | (#28826799)

Rather than declaring their best result early, the BellKor team should have employed a bit of strategy and declared only a lesser result (if any). That would give the other teams something to aim at without giving away their best results. Those would be held back right up until the last minute and then submitted, so that other teams would not have time to make any further improvements (in fact, maybe this IS what they're doing). It's been a successful bidding strategy on eBay for years, so why wouldn't it translate into other competitive areas too?

Re:should've "gamed" it (5, Insightful)

stuckinarut (891702) | more than 4 years ago | (#28826837)

Who's to say they haven't? People smart enough to win this competition are probably smart enough to think of this.

Re:should've "gamed" it (1)

SpinyNorman (33776) | more than 4 years ago | (#28826863)

I'd be very surprised if BellKor doesn't have something better to submit at the last second.

It'd certainly have been an awful strategy to trigger the endgame with all your cards on the table.

Re:should've "gamed" it (4, Insightful)

Manip (656104) | more than 4 years ago | (#28826865)

This isn't eBay, they can't just magic high scores.

If you game it or otherwise, everyone will end up submitting their max score, because, well... Why wouldn't they? Who cares if the other team knows you have 10.8%... Either they can beat it and will submit that score, or they cannot and won't.

Re:should've "gamed" it (1)

Jah-Wren Ryel (80510) | more than 4 years ago | (#28827043)

If you game it or otherwise, everyone will end up submitting their max score, because, well... Why wouldn't they? Who cares if the other team knows you have 10.8%... Either they can beat it and will submit that score, or they cannot and won't.

OR maybe they can do better than 10.8% but because they thought they had it in the bag, they didn't put the extra effort in to really push those improvements through and now, with less than a day left, they don't have the time to get those improvements fully polished enough for submission

This isn't eBay, they can't just magic high scores.

Actually, this is precisely like eBay. It appears that the prize got "sniped" out from under BellKor. The problem, just like eBay, is that the process has a fixed end-date. The way to avoid this problem (and produce the best results for the "seller", in this case Netflix) is to have a rolling end-date that is always a fixed period after the most recent highest result submission.

Don't get me wrong, I am a BIG fan of sniping, but then I'm always a buyer on ebay, not the seller, and sniping is the best bidding policy to keep bidding-wars at bay.
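The rolling end-date idea can be sketched directly. Everything here is hypothetical (the class, the window length, the dates); the actual Prize used a single fixed 30-day last call, which is what made the snipe possible.

```python
from datetime import datetime, timedelta

class RollingDeadline:
    """Anti-sniping sketch: every new best submission pushes the deadline
    out by a fixed window, so a last-second snipe always leaves rivals
    time to respond."""

    def __init__(self, start, window_days=30):
        self.window = timedelta(days=window_days)
        self.deadline = start + self.window
        self.best = None

    def submit(self, when, score):
        if when > self.deadline:
            return "closed"
        if self.best is None or score > self.best:
            self.best = score
            self.deadline = when + self.window  # extend on every new leader
            return "new leader"
        return "accepted"

start = datetime(2009, 6, 26)
contest = RollingDeadline(start)
print(contest.submit(start + timedelta(days=1), 10.08))   # new leader
print(contest.submit(start + timedelta(days=30), 10.09))  # the "snipe" now extends the clock
```

Under this rule a last-day overtake simply restarts the countdown, so the contest ends only when a result stands unbeaten for the full window.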

Re:should've "gamed" it (1)

Stile 65 (722451) | more than 4 years ago | (#28827055)

Actually, Netflix used a different way to prevent gaming the system. They split the submitted predictions into two sets - the "quiz" set and the "test" set. The quiz set results are on the leaderboard; the test set is used for final judging.

Re:should've "gamed" it (0)

Anonymous Coward | more than 4 years ago | (#28827767)

Maybe Netflix should have learned something from the online auction sites and made the deadline push back whenever a new winner emerged. That would prevent snipers from waiting until the last minute. Say the deadline gets pushed out 7 days every time someone takes the lead.

Sure, that could result in the competition getting pushed out indefinitely. But if that much progress is being made, I don't see the harm in it. Once someone gets a result superior enough to stand for 7 days, they win.

Re:should've "gamed" it (1)

mrvan (973822) | more than 4 years ago | (#28826875)

Maybe they did, and the 10.08 (pretty minimal increase from 10) was their low-end result, and they will announce their 25%-increase result in the coming day.

Then again, maybe they didn't :-)

Re:should've "gamed" it (1)

MartinSchou (1360093) | more than 4 years ago | (#28826959)

The 10.08 was a 10.08% improvement over the original system. That's not exactly a minimal increase, and considering that the new leaders posted a 10.09% improvement over the original (only about 0.1% better, in relative terms, than 10.08%), it's rather harsh to write off the 10.08% improvement as "pretty minimal".

Re:should've "gamed" it (1)

MrShaggy (683273) | more than 4 years ago | (#28827059)

Does Mighty Mouse come in time to save the day?

Tune in next week, to see the Action-packed conclusion!

Re:should've "gamed" it (1)

shentino (1139071) | more than 4 years ago | (#28827145)

True, and if only their own interest counts, that would be a good choice.

Thing is, it's not good sportsmanship to "game the rules" that way.

Re:should've "gamed" it (0)

Anonymous Coward | more than 4 years ago | (#28830659)

Sportsmanship is, possibly, appropriate to .. sports. Despite the fact that the submission title has "race" in it, it is not a sport and sportsmanship is unlikely to be a concern for the submitters.

Indeed, even in sports, there are acts that are considered "gamesmanship" rather than "sportsmanship" because they are perfectly acceptable, but harm the efforts of the opposing team.

Re:should've "gamed" it (1)

flynt (248848) | more than 4 years ago | (#28827437)

Basically impossible. The teams cannot compute their improvement. Netflix computes the improvement. The improvement is computed on a "secret" test dataset that only Netflix has access to. The models are developed on a public dataset available to everyone.
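For concreteness, the improvement percentage Netflix computes is derived from RMSE against the Cinematch baseline; Cinematch's reported RMSE on the quiz subset was 0.9514. A quick sketch of the arithmetic:

```python
# A 10% improvement means the submission's RMSE is 10% lower than the
# Cinematch baseline RMSE (reported as 0.9514 on the quiz subset).
CINEMATCH_RMSE = 0.9514

def improvement(rmse):
    """Percentage improvement over the Cinematch baseline."""
    return (1 - rmse / CINEMATCH_RMSE) * 100

target = CINEMATCH_RMSE * 0.90  # RMSE needed for a 10% improvement
print(round(target, 4))         # → 0.8563
print(round(improvement(0.8558), 2))
```

Since only Netflix holds the hidden test answers, teams can compute this number for the public training split but never for the score that actually decides the prize, which is the commenter's point.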

Ensemble learning (1)

mysterons (1472839) | more than 4 years ago | (#28826915)

I'm actually surprised that this hasn't been done before. You can prove that using multiple models will, on average, produce better results than using any single model in isolation. For example, each Netflix system will make different errors; using multiple systems will tend to average out these errors, and the consensus decision is most likely to be correct.
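The averaging argument can be demonstrated numerically. This is a synthetic illustration: the two "models" are just the truth plus independent noise, which is the idealized case where blending helps most.

```python
import random

# Two imperfect "models" whose errors are independent; averaging their
# predictions tends to cancel the errors out (all numbers are synthetic).
random.seed(1)
truth = [random.uniform(1, 5) for _ in range(5_000)]
model_a = [t + random.gauss(0, 0.5) for t in truth]
model_b = [t + random.gauss(0, 0.5) for t in truth]
blend = [(a + b) / 2 for a, b in zip(model_a, model_b)]

def rmse(pred):
    return (sum((p - t) ** 2 for p, t in zip(pred, truth))
            / len(truth)) ** 0.5

print(rmse(model_a), rmse(model_b), rmse(blend))  # blend beats both
```

With independent errors of equal variance, averaging two models cuts the error variance in half (RMSE by a factor of about 1.41); correlated errors, as between real competition models, shrink the gain but rarely erase it.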

Re:Ensemble learning (5, Informative)

Stile 65 (722451) | more than 4 years ago | (#28827037)

Many teams actually combined multiple methods to get a better score. In fact, "BellKor's Pragmatic Chaos" is a combination of three teams, I'm guessing - BellKor, BigChaos and Pragmatic Theory.

Also, it helps to remember that what's posted on the leaderboard is the result of the "quiz" set - half of the actual set of recommendations you're asked to make. The other half, the "test set," is used for final judging. With such a small difference between BellKor's Pragmatic Chaos and The Ensemble on the quiz set (.0001 RMSE), the test set rank may actually end up reversed.

Re:Ensemble learning (0)

Anonymous Coward | more than 4 years ago | (#28827095)

Most "individual" contest entries are, at this point, made up of over 100 different models. The improvement that can be gained by just "throwing another model on the pile" is very small.

Re:Ensemble learning (1)

janwedekind (778872) | more than 4 years ago | (#28827137)

Actually it is not about averaging out. It's about building a better classifier from many good ones. See Adaboost [wikipedia.org].

Re:Ensemble learning (1)

mysterons (1472839) | more than 4 years ago | (#28827227)

Well, you really want to think about bias/variance reduction, which brings the ideas of averaging and of building better classifiers together. For example, "bagging" can be thought of as a variance-reduction technique; "boosting," if I recall correctly, reduces both.

Algorithms? (1)

wkurzius (1014229) | more than 4 years ago | (#28826939)

I thought Vandelay was into manufacturing latex.

Re:Algorithms? (1)

frieko (855745) | more than 4 years ago | (#28827129)

They're thinking of quitting the exporting, and adding more import statements. And this is causing a problem, because, why not do both?

Re:Algorithms? (0)

Anonymous Coward | more than 4 years ago | (#28827809)

Ultimately, they're waiting for Professor Von Nostrand to weigh in with his opinion before they decide.

Any winner at all? (4, Interesting)

Fnord666 (889225) | more than 4 years ago | (#28827315)

My question is whether there will be any winner at all other than Netflix. One of the rules of the competition was that you could not form multiple teams. This was to prevent people from gaining multiple submissions per day; otherwise a five-person group could create 30 teams and thus be able to submit 30 attempts per day. I believe both teams that have exceeded the 10% threshold, and thus are eligible for the grand prize, are composed of members from other teams and could be disqualified.

Re:Any winner at all? (0)

Anonymous Coward | more than 4 years ago | (#28827383)

I think you've missed a key difference between people forming multiple teams with the same people and people from multiple teams consolidating into a single team.

Re:Any winner at all? (3, Insightful)

ceoyoyo (59147) | more than 4 years ago | (#28827447)

Why would that disqualify them? They didn't form multiple teams; they did the opposite -- they started with multiple teams and then merged them into one, abandoning or deleting the old, multiple accounts.

I suppose you could speculate that the teams weren't ever independent, but I think that's fairly obviously not the case.

Re:Any winner at all? (0)

Anonymous Coward | more than 4 years ago | (#28828057)

This rule only applies to teams that have the exact same set of members. The rule tries to prevent teams from creating many aliases to get around the one-result-submission-per-day rule. You can imagine that particularly now is the time that teams would like to be able to submit results as frequently as possible, which is why Netflix reminded everybody of the no-aliasing rule recently.

There is more than 1 day left (0)

Anonymous Coward | more than 4 years ago | (#28827385)

Call me crazy, but if you actually *read* the rules it says the contest is going until at least October 2nd, 2001.

Netflix Prize Rules [netflixprize.com]

Terms and Conditions in a Nutshell

        * Contest begins October 2, 2006 and continues through at least October 2, 2011.

Re:There is more than 1 day left (2, Funny)

Qubit (100461) | more than 4 years ago | (#28827517)

Call me crazy, but if you actually *read* the rules it says the contest is going until at least October 2nd, 2001.

Actually, yes, I think I will call you crazy.

Re:There is more than 1 day left (2, Funny)

tomhudson (43916) | more than 4 years ago | (#28827591)

Call me crazy,

Okay, you're crazy :-)

but if you actually *read* the rules it says the contest is going until at least October 2nd, 2001.

So, there's approximately minus 2855 days left?

I just want to know if netflix gets to keep John Titor's time machine [wikipedia.org] ... the time frame (2001) is right ...

Re:There is more than 1 day left (1)

sleeper0 (319432) | more than 4 years ago | (#28827605)

Competitors had 30 days to submit after the qualifying submission was presented. From your link: "After three (3) months have elapsed from the start of the Contest, when the RMSE of a submitted prediction set on the quiz subset improves beyond the qualifying RMSE an electronic announcement will inform all registered Participants that they have thirty (30) days to submit additional candidate prediction sets to be considered for judging."

Re:There is more than 1 day left (0)

Anonymous Coward | more than 4 years ago | (#28828067)

I haven't gotten any email notifying me of only 30 days left... the last email I got was on July 9th, 2009 (and before that, Oct 02, 2008):
A reminder for participants in the Netflix Prize contest:

Some participants have failed to comply with the contest rules, for example
by creating multiple teams with an identical set of members. Such participants
and any teams to which they belong may be suspended from participation in the
contest, become ineligible for any Contest Prize, and/or have their
submissions rejected by the judges.

Teams which combine the work of multiple participants or multiple teams must
ensure that all their contributing participants are in compliance with the
contest rules.

The contest rules are posted at http://www.netflixprize.com/rules

It's an exciting time for the Netflix Prize contest.
We thank you for participating and wish you luck!

Sometimes better design beats better algorithms (3, Insightful)

davidannis (939047) | more than 4 years ago | (#28827713)

They could improve the predictive value immensely if they allowed me and my wife to each rank the movies we watch together separately. With the current system, some movies are rated by just me, some by just her, and some have a consensus rating. It leads to a dataset full of garbage.

Re:Sometimes better design beats better algorithms (1)

memristance (1285036) | more than 4 years ago | (#28828307)

This brings up an interesting point. The Netflix algorithm is working from flawed/incomplete data generated from poor design decisions, so no matter how good the algorithm gets it still won't be able to accurately predict what movies will actually interest people based on a very subjective unidimensional rating. For example, the same person might rate a movie differently under differing conditions, and the rating itself may hinge entirely on one thing in the movie (s)he did(n't) like, whereas the movie might have been overall pretty good. It's like asking someone, 'on a scale of 1 to 5, what is your favorite color?'; it has next to no relation to its supposed objective.

On top of all this, people are capricious at best when it comes to movie tastes; they might not even like a movie based on its own merits, but something completely orthogonal to the question such as it being the movie they saw on their first date. As such, no set of ratings from any given user can really be accurately matched with those of another to provide suggestions, since they may have liked/hated those movies for entirely separate reasons. Granted, some of these things can't easily be transcribed into data for formulaic processing, but you'd think Netflix could at least add an optional 'detailed rating' section (e.g., rate by pace, plot, action, acting, dialogue, etc.) to better describe why a user did or didn't like a flick.

Re:Sometimes better design beats better algorithms (2, Insightful)

Hawke666 (260367) | more than 4 years ago | (#28828893)

That'd be all your fault. You should be creating separate account profiles for yourself and your wife.

Re:Sometimes better design beats better algorithms (1)

St.Creed (853824) | more than 4 years ago | (#28829189)

Yeah, I should totally jump through hoops to improve their ability to sell to me. Just because it would make the programmers' lives easier :)

No, if Netflix wants to sell more, they should follow up on that recommendation and make it very, very easy to have multiple identities on a given account, with a button on the page to switch between them.

The reason is that there is a difference between the information needs of the administration of purchases (tied to an account in a 1:1 relationship) and the information needs of the marketing department (tied to people, who can be tied to an account in a many:1 relationship, or a 1:1, or 1:many relationship as well). If you put a one-size-fits-all discipline in there (as lots of IT-departments are unfortunately wont to do), you lose information.
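The account/profile split being argued for here amounts to a 1:many relationship between billing accounts and people. A sketch in illustrative types only (not Netflix's actual schema):

```python
from dataclasses import dataclass, field

@dataclass
class Profile:
    """One person; ratings hang off the person, not the account."""
    name: str
    ratings: dict = field(default_factory=dict)  # movie_id -> stars

@dataclass
class Account:
    """The billing entity, 1:many with profiles."""
    email: str
    profiles: list = field(default_factory=list)

    def add_profile(self, name):
        p = Profile(name)
        self.profiles.append(p)
        return p

acct = Account("couple@example.com")
him, her = acct.add_profile("him"), acct.add_profile("her")
him.ratings[101] = 5  # his rating no longer pollutes hers
her.ratings[101] = 2
print([(p.name, p.ratings) for p in acct.profiles])
```

Keeping purchases on the account and tastes on the profile is exactly the separation of administrative and marketing data the comment describes; collapsing the two loses the information about who actually liked what.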

Re:Sometimes better design beats better algorithms (4, Informative)

Hawke666 (260367) | more than 4 years ago | (#28829413)

Yeah, they do. See "Your account", "Account profiles". And then there's a dropdown at the top of the page. I don't see how they could make it much easier.

Re:Sometimes better design beats better algorithms (1)

Alpha830RulZ (939527) | more than 4 years ago | (#28833005)

At least some of the ensemble modeling techniques handle this just fine. They will develop classifiers that detect your ratings, classifiers that detect her ratings, and classifiers that detect your joint ratings. See the previous citation for AdaBoost at Wikipedia. They do this by looking at the error from a given classifier and finding additional weak classifiers that address that error. So if your wife likes Schwarzenegger movies, your liking for tearjerkers will show up as errors, and the algorithm will seek an additional classifier to select for tearjerkers. Then eventually you get True Lies in the suggestion list. ;-)

Re:Sometimes better design beats better algorithms (2, Insightful)

coaxial (28297) | more than 4 years ago | (#28829333)

Data sets like this always have garbage. There's the jackass who rates everything 5 stars. There's the jackass who rates everything 1 star. There's the jackass who rates the worst movies by consensus 5 stars, and vice versa.

There are 61,441,618 ratings by 478,548 unique users in the publicly available training set.

It just doesn't matter.
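For scale, those published ratings ship in a simple per-movie text format: a "MovieID:" header line followed by "CustomerID,Rating,Date" rows. A minimal parser over a small illustrative sample:

```python
import io

# Illustrative sample in the training-set format: a "MovieID:" header
# line, then "CustomerID,Rating,Date" rows for that movie.
sample = io.StringIO(
    "1:\n"
    "1488844,3,2005-09-06\n"
    "822109,5,2005-05-13\n"
    "2:\n"
    "885013,4,2004-10-19\n"
)

ratings = []  # (user_id, movie_id, stars)
movie_id = None
for line in sample:
    line = line.strip()
    if not line:
        continue
    if line.endswith(":"):
        movie_id = int(line[:-1])  # new movie section begins
    else:
        user, stars, _date = line.split(",")
        ratings.append((int(user), movie_id, int(stars)))

print(len(ratings), ratings[0])
```

The same loop, pointed at the real files instead of a StringIO, is how most teams built their in-memory (user, movie, rating) triples before any modeling started.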

Re:Sometimes better design beats better algorithms (0)

Anonymous Coward | more than 4 years ago | (#28833103)

In fact, there are jackasses who rated every movie 2 stars and jackasses who rated every movie 3 stars and jackasses who rated every movie 4 stars as well. Approximately 3000 people altogether. Predicting those is easy.

Re:Sometimes better design beats better algorithms (0)

Anonymous Coward | more than 4 years ago | (#28831011)

1) Movies rated by individual users of a joint account often cluster very nicely. Probably movies rated jointly work the same way. What is the experimental basis for saying that knowing who is doing the rating would "improve the predictive value immensely"?

2) In real life (not a netflix contest) there is a lot of other information that can improve predictions substantially. Just knowing the zipcode of the subscriber is a huge advantage.

Stop Women's Suffrage (0)

Anonymous Coward | more than 4 years ago | (#28832779)

Great point, stop the suffering!

Obvious (0)

Anonymous Coward | more than 4 years ago | (#28828085)

Screw the code. All they have to do is stop hiding the new releases. They purposely bury recent releases so they don't have to keep up with demand. Which is total bullshit. 10% improvement of what? Getting the same garbage they offer out there even faster? *sigh*

Amazon follow suit (0)

Anonymous Coward | more than 4 years ago | (#28828587)

Amazon could really benefit from a similar contest for its laughable Gold Box picks; the recommendations are so bad that even a 5000% increase in accuracy would be low-hanging fruit.

Photo Finish (0)

Anonymous Coward | more than 4 years ago | (#28828675)

It's a tie, check the leaderboard again...

Be afraid.... be very afraid... (3, Interesting)

Baldrson (78598) | more than 4 years ago | (#28829417)

It's interesting that the fearmongering of the prior /. post about AI got hundreds of responses, but this /. post, which is far more relevant to real AI, has gotten fewer than a hundred responses thus far. Anyway, congratulations to Netflix for doing the right thing for their business in response to the Hutter Prize.

Creepy (1)

TranscenDev (1602045) | more than 4 years ago | (#28839449)

While I would appreciate some good movie recommendations, I can't help but feel a little creeped out that Netflix may be able to read my mind one day... maybe I can make up a movie in my imagination and Netflix can play it for me! ~Ami

Who's Better? (0)

Anonymous Coward | more than 4 years ago | (#28863237)

Since the test set was effectively a lottery draw in which either team could have come out on top, isn't it a little awkward for BPC to justify why their results would be better when the other team (The Ensemble) is visibly better than them on the leaderboard?
