Google Launches a Data Prediction API

CmdrTaco posted more than 3 years ago | from the i-predict-they-launch-something-else-too dept.

Google 70

databuff writes "Google has released a data prediction API. The service helps users leverage historical data to make predictions that can guide real-time decisions. According to Google, the API can be used for prediction tasks ranging from product recommendations to churn analysis (predicting which customers are likely to switch to another provider). The API involves three simple steps: upload the data, train the model, then generate predictions. The API is currently available on an invitation-only basis." Google also recently announced several other API additions, including Buzz, Fonts, and Storage.

well? (1, Insightful)

snmpkid (93151) | more than 3 years ago | (#32278554)

Does it use users' Wi-Fi sniffer captures to aid in this prediction?

I can do that too (0, Offtopic)

suso (153703) | more than 3 years ago | (#32278670)

Despite history (and having really good access to historical information), people will keep making stupid choices, voting for someone that screws them in the end and buying products that they think will make them happy but end up at the next garage sale for 90% off.


I Predict ... (2, Insightful)

WrongSizeGlass (838941) | more than 3 years ago | (#32278562)

... that Google will do their own analysis on your data. They're nothing if not thorough.

Re:I Predict ... (0)

Anonymous Coward | more than 3 years ago | (#32280548)

Think of all the great things Google could do knowing that you have a sequence of numbers, and the next one may or may not be 100!

In Soviet Amerika: (0, Funny)

Anonymous Coward | more than 3 years ago | (#32278596)

Data prediction launch YOU!

Yours In Astrakhan,
Kilgore T.

P.S.: Maybe Google can scrape invitation-only users' predictions for its bond trading floor.

Three simple steps? How about four. (2, Funny)

olsmeister (1488789) | more than 3 years ago | (#32278610)

1. Upload the data.
2. Train the model.
3. Generate predictions.
4. PROFIT!!!!

Re:Three simple steps? How about four. (3, Funny)

dingen (958134) | more than 3 years ago | (#32278712)

Holy shit, you... you... you figured out step 3!!

Re:Three simple steps? How about four. (1)

L4t3r4lu5 (1216702) | more than 3 years ago | (#32278870)

What is listed as step 4. is actually step 5. There wasn't much of a wait involved at all, so we skipped it to keep things simple.

Think of the step you're thinking of as being more of an extension of step 3... "3.b) ..." if you will.

That's the power of Cloud Computing.

Psychohistory ? (3, Interesting)

Vapula (14703) | more than 3 years ago | (#32278624)

What about feeding it with historical events, train with the outcome from these events and try to get a glimpse at which way the future will evolve ?

Re:Psychohistory ? (3, Interesting)

0100010001010011 (652467) | more than 3 years ago | (#32278776)

Or use the last half of your data set as blind data. Train the model on 1900-1990 and see if it can predict 1990-2000.

How far can you predict? 1%, 10%, 50%?

If you want to really see how good it is feed it stock market data and see how well it predicts that.
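A minimal sketch of that holdout idea, assuming a made-up yearly series and a plain least-squares line standing in for whatever model the API would train. The function names and the toy data are hypothetical; nothing here calls the Google API.

```python
def chronological_split(records, cutoff_year):
    """Split (year, value) pairs into training (before cutoff) and held-out (cutoff onward)."""
    train = [(y, v) for y, v in records if y < cutoff_year]
    test = [(y, v) for y, v in records if y >= cutoff_year]
    return train, test


def fit_line(points):
    """Ordinary least-squares fit of value = slope * year + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def mean_abs_error(model, points):
    """Average absolute gap between the line's prediction and the actual value."""
    slope, intercept = model
    return sum(abs(slope * x + intercept - y) for x, y in points) / len(points)


# Toy data (an assumption): a perfectly linear trend from 1900 through 2000.
records = [(year, 2.0 * year + 5.0) for year in range(1900, 2001)]

# Train only on years before 1990, then score on the years the model never saw.
train, test = chronological_split(records, 1990)
model = fit_line(train)
error = mean_abs_error(model, test)
print(f"trained on {len(train)} years, held out {len(test)}, mean abs error {error:.4f}")
```

The held-out error is only meaningful because the split is by time: shuffling the years before splitting would let the model peek at the period it is supposed to predict.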

Re:Psychohistory ? (1)

MozeeToby (1163751) | more than 3 years ago | (#32279930)

Stock market data by itself is insufficient to predict the stock market because of all the external variables. It would be impossible to predict the post 9-11 crash for instance because there is nothing in the markets that changed leading up to it. It would be difficult to predict the more recent meltdown because it was caused by a combination of lax oversight, repealed laws, semi-legal trading techniques, and a culture of over borrowing. It's possible that you may be able to predict the minute to minute changes, maybe even day to day. But long term trends are almost impossible.

Re:Psychohistory ? (3, Interesting)

Kilrah_il (1692978) | more than 3 years ago | (#32280626)

The nice thing about the stock market is that when everything is fine the analysts say that their models are great, but when something unexpected happens they go all "but we couldn't have foreseen that. Except for this unexpected incident, our models are great!". The problem is that these "unforeseen incidents" are what drives most of the extreme changes in the stock market, and more generally, in our entire society.
Just look at 9/11 (to use your example): It not only affected the economy, it affected (and still affects) our entire lives - from airport searches, to US PATRIOT acts to wars in Iraq and Afghanistan.
These extreme events are called Black Swans ( http://en.wikipedia.org/wiki/Black_swan_theory [wikipedia.org] ) and I do recommend the book by the same name by Nassim Nicholas Taleb. Fascinating reading (if a bit repetitive sometimes :) ).
The bottom line: Trying to predict the future from past events is fine, until it breaks up, and it does so more than we care to imagine.

Re:Psychohistory ? (2, Interesting)

fusiongyro (55524) | more than 3 years ago | (#32281102)

+1! "Past performance is not a predictor of future success." Taleb is my hero. Everyone should read Fooled by Randomness, which I didn't find repetitive at all.

Re:Psychohistory ? (2, Insightful)

S-100 (1295224) | more than 3 years ago | (#32281832)

Like I heard a seasoned stock trader once say: "Technical analysis works great, until it doesn't".

Re:Psychohistory ? (1)

E IS mC(Square) (721736) | more than 3 years ago | (#32280246)

Nothing new there. The risk analysis used by most Wall Street firms to calculate their risk exposure does just that. And we all know how that turned out.

Dear Google, (0)

Anonymous Coward | more than 3 years ago | (#32278630)

plz predict for me if this turing machine haltz, kthksbye.

Dear google.. (3, Funny)

cntThnkofAname (1572875) | more than 3 years ago | (#32278634)

Given my family history... is there a girl for me?

Re:Dear google.. (3, Funny)

0100010001010011 (652467) | more than 3 years ago | (#32278800)

Every male in your family tree has had sex at least once, so the odds look good for you.

Re:Dear google.. (1)

asukasoryu (1804858) | more than 3 years ago | (#32278918)

He is the only recorded male in his family tree. All women in his family were artificially inseminated. All the sperm donors were virgins.

Re:Dear google.. (1)

cntThnkofAname (1572875) | more than 3 years ago | (#32279572)

that may be relevant if I was a male...

Re:Dear google.. (0)

Anonymous Coward | more than 3 years ago | (#32279634)

Then my guess: insufficient data. Must supplement with additional observation.

Re:Dear google.. (1, Funny)

Anonymous Coward | more than 3 years ago | (#32280294)

Past performance is not a guarantee of future results.

Re:Dear google.. (0)

Anonymous Coward | more than 3 years ago | (#32286630)

talk about a biased sample

Gambling API (2, Funny)

psbrogna (611644) | more than 3 years ago | (#32278778)

I can't wait to take my Droid to Vegas once this launches!

Re:Gambling API (0)

Anonymous Coward | more than 3 years ago | (#32279554)

With the money you make, you can hire a doctor to extract it from wherever the casino security guard leaves it.

Data mining (3, Informative)

JayJayEm (220851) | more than 3 years ago | (#32278842)

When I used to work in the financial services industry we used to call this "data mining". The result is usually at best worthless and at worst dangerous as it is so often misused.

It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

Re:Data mining (3, Interesting)

LizardKing (5245) | more than 3 years ago | (#32279536)

It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

A friend of mine works as a quant at one of the big investment banks. He admitted that the models his team creates are useless at predicting the unexpected (as you'd probably expect). Adding in a degree of randomness rarely produces better models, as there are too many possible sources of such unpredictability and the reactions to them depend on many unquantifiable forces. This results in models that are OK at telling traders what they want to know - that they're doing the right thing by all doing the same thing. As soon as something undesirable or unexpected happens, then all hell breaks loose and the traders panic. Having mulled this over for a bit, I suggested his job was pointless, to which he agreed, but pointed out that the pay's great. So much wasted mathematical genius.

Data mining certainly not worthless (2, Informative)

gnieboer (1272482) | more than 3 years ago | (#32279852)

It's absolutely data mining, but it's far from worthless.

Every time you go to Amazon and it recommends something to you, guess what, that's data mining using basically the same techniques that this service will use. And as you might expect, that equates to big $$$ for them (or else they wouldn't be bothering).

Many many fields use the technology, particularly the medical fields for analyzing the relationships between a large number of input variables (which may or may not be correlated) and some desired output variable. Spam filters, Google Search itself... all data mining algorithms. Nah, no money to be made there...

Now, the reality normally isn't as simple as 'upload the data, train the model, and generate predictions'. It takes time to figure out what factors to include, ETL the training data from the actual source(s), plug in algorithm parameters, and carefully validate your output model. Most models I've worked on have taken several iterations to get right, as you learn more about your input data relationships while using the model.

And your second sentence is sadly true, if management wants a certain output, then the endeavor is pointless. But when used appropriately (and it's on the experts to explain the limitations of the tech to the users), this stuff is really powerful.

But will a lot of businesses be willing to send their 10 year history of accepted/declined credit card transactions with all the related demographic data to the cloud? Or their medical scenarios with the medical details of each patient? I think not. The type of data most mining projects use is critically sensitive. So I predict this will be limited to experimental users 'playing around', nothing more.

Re:Data mining certainly not worthless (2, Insightful)

JayJayEm (220851) | more than 3 years ago | (#32282880)

OK - I'll admit it - I did engage in a little bit of hyperbole.

But you have to admit that "at best worthless" has a better ring to it than "at best, when combined with a qualitative analysis of the model itself, and some testing with out of sample data, can be a useful tool in decision making".

You are right that no investment bank will go anywhere near this.

Re:Data mining (2, Interesting)

E IS mC(Square) (721736) | more than 3 years ago | (#32280354)

Often misused, definitely. But that does not invalidate the importance of any tool, including data mining.

One good example is Netflix recommendation engine. I know it's far from perfect (as there is nothing perfect about prediction), but is it useful? Hell yeah. It's the best recommendation engine I have used and have benefited greatly from.

Problem is when it's applied to areas where stakes are higher - like risk analysis by the investment banks.

And that brings me to mention an interesting (old) and related read - "Fooled by Randomness" by Nassim Taleb.

Re:Data mining (1)

Bakkster (1529253) | more than 3 years ago | (#32281474)

It's worth remembering the saying with data: "if you look hard enough, you can find anything you want to".

I was just thinking that this automation will save unscrupulous scientists all the trouble of fudging the models to make the prediction fit their expected results.

Re:Data mining (1)

afeeney (719690) | more than 3 years ago | (#32282148)

I prefer the more direct: "Numbers are like people. Torture them enough and they'll say whatever you want to hear."

More seriously, though, a solid predictive system usually needs both the qualitative and the quantitative analyses. These tools can inform decision-making, but can't make the decisions for anybody, unless the decisions are in the same discrete closed system. There aren't that many entirely closed systems in the world.

Re:Data mining (1)

laddiebuck (868690) | more than 3 years ago | (#32289520)

The way one of my co-workers puts it: If you torture the data long enough, it'll tell you anything you want to hear.

Available to US developers only (2, Informative)

inkhorn (650877) | more than 3 years ago | (#32278906)

Google require you to have a current Storage-For-Developers account, which is only available for US parties currently.

ACNielsen is banging their collective heads... (0)

Anonymous Coward | more than 3 years ago | (#32278954)

Could be bad news for the already ailing ACNielsen. Based on my experience there, I'd guess that many companies that use the services of ACNielsen would also be willing to plug their data into an API like this and not only compare the output, but compare it to the outcome. If the API does a satisfactory job, they'll drop Nielsen like a ton of hot bricks.

Nielsen has some slick and useful software, but making their own API like this is the kind of ingenious thing that they could really use about now.

Re:ACNielsen is banging their collective heads... (0)

Anonymous Coward | more than 3 years ago | (#32280464)

Perhaps I'm mistaken, but I thought Nielsen's claim to fame was their ability to match up household attributes (ethnicity, income, age) to consumer behavior. Anyone with a suitable BI platform can do the rest, but it's not worth nearly as much without that household data, which is their bread and butter.

Re:ACNielsen is banging their collective heads... (0)

Anonymous Coward | more than 3 years ago | (#32286168)

My experience was that their "bread and butter" was that once they had *your* data, they were also allowed to use that data to get better results for any competitors that might also be using Nielsen.

The short version: The data I worked with *was* regional based, but not influenced by household income. However, it *was* influenced by competitor data for the same region. The exception to this was when sales data was side-by-side with regional income data and then the only influence was how the analysts read it.

With the data that I worked with, it all boiled down to, "If I release a coupon in region A on date x/y/zzzz..." - or - "If I start an ad campaign in region A on date x/y/zzzz, how will it affect my sales?"

All that being said, I only worked with a slice of their operation.

Prediction Battle Prediction (1)

Ukab the Great (87152) | more than 3 years ago | (#32279000)

I predict that within the next year someone's blog or the Wall Street Journal will feature a cage match between Google's Prediction API, a chimp with a dartboard, and a magic 8-ball.

Eye trick (1)

adeft (1805910) | more than 3 years ago | (#32279012)

I know that word is "churn" but the first 3 times I read it as "chum". Anyway, is this similar logic to how Google is able to advertise based on what is discussed in your email?

Great, day trading here I come! (1)

kalirion (728907) | more than 3 years ago | (#32279050)

What could possibly go wrong?

Re:Great, day trading here I come! (1)

dward90 (1813520) | more than 3 years ago | (#32279200)

Given that actively managed funds often outperform the market at large, I would actually expect you to do decently with this strategy, even if Google only gives you random stocks.

Filler sentence (0)

Anonymous Coward | more than 3 years ago | (#32279230)

The service helps users leverage historical data to make predictions that can guide real-time decisions.

This sentence hurts my brain in how vague it is. You could say the same thing about Excel, Lotus 1-2-3, your kid's history homework, my filing cabinet, or the library. If it was removed from the summary, no meaning would be lost.

Here's my FREE data prediction API: (1)

proc_tarry (704097) | more than 3 years ago | (#32280132)

    delete data
    prediction = random()
    return prediction

Re:Here's my FREE data prediction API: (1)

mujadaddy (1238164) | more than 3 years ago | (#32282104)


Patent it in Germany!

Actually, change the last line to "return whatClientWantsToHear" and you've really got something!

Hmm... I smell an internet-scale prank opportunity (2, Funny)

pdxp (1213906) | more than 3 years ago | (#32280670)

Google probably wants to use the data for their own analysis. So, I suggest all of Slashdot team together and forge a large volume of the most bullshit data that will convince Google that, without a doubt, they need to make every first search result named "Frosty P1ss!" linked to goatse in order to make their customers happy.

Amazon has one too. (1)

Animats (122034) | more than 3 years ago | (#32280686)

Now I see why the Amazon Cloud people have been so insistent that people in Hacker Dojo's machine learning class run problems on their "cloud".

This stuff is actually fairly routine by now. It's much the same technology that's behind spam filters.

Acting on prediction (1)

gmuslera (3436) | more than 3 years ago | (#32281076)

The API predicts that there will be an empty niche/opportunity in a day, then everyone who uses it jumps there, so the prediction fails because the niche becomes overcrowded. It is very easy to turn predictions for everyone into predictions for no one if all try to take advantage of that knowledge.

More information is not necessarily better (1)

Stupid McStupidson (1660141) | more than 3 years ago | (#32281956)

It's interesting to see this coming, as in Google becoming a digital Hari Seldon. But while it's good to have plenty of info on which to base decisions, it's becoming what in the Army is referred to as "paralysis by analysis". At some point, you need to trust your instincts, and do it. Poring over the amount of data Google can provide, filtering what is relevant (Google isn't perfect), and then deciding what to do would likely take longer than going with your gut, or the smaller amount of available data, and then adjusting from there.

Classification algorithms as web service (1)

msbmsb (871828) | more than 3 years ago | (#32283552)

The use of the word "predict" is for ease-of-understanding for the business market and those not familiar with machine learning. Many of the comments here are getting lost in that word. The algorithms behind the API are most likely the same basic ones that have been around for a long time: naive bayes, svm, knn, etc. The actual novelty of this service is that it puts these methods in easy reach for people who otherwise wouldn't know where to start looking, or wouldn't know how to use one of the many available libraries already around, or much less implement something themselves.

See also: http://mlcomp.org/ [mlcomp.org] for a service that allows you to try out different classification algorithms on your own data sets.
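To make the "same basic algorithms" point concrete, here is a minimal k-nearest-neighbours sketch in plain Python. It illustrates knn in general and is not a claim about what the API actually runs; the data points, labels, and the `knn_predict` name are invented for the example.

```python
from collections import Counter
import math


def knn_predict(training, query, k=3):
    """Return the majority label among the k training points nearest to query."""
    nearest = sorted(training, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up 2-D feature vectors with categorical labels (hypothetical churn data).
training = [
    ((1.0, 1.0), "stay"), ((1.2, 0.8), "stay"), ((0.9, 1.1), "stay"),
    ((5.0, 5.0), "churn"), ((5.2, 4.9), "churn"), ((4.8, 5.1), "churn"),
]

print(knn_predict(training, (1.1, 1.0)))
print(knn_predict(training, (5.0, 5.1)))
```

There is no training step at all: the "model" is just the stored examples, which is part of why these methods are easy to wrap behind a web service.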

Wow, a RESTful API (1)

mzechner (1351799) | more than 3 years ago | (#32284986)

that's actually a pretty nice idea. The thing seems to have some caveats though: only categorical labels are allowed, training sets are limited to 100 MB and no sparse features can be used. There's also no info on whether things like cross-validation are done and what algorithm will be chosen. I also wonder about how fast the prediction phase will be. Still pretty neat.
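For anyone wondering what cross-validation would involve if you had to do it yourself before uploading, here is a rough k-fold sketch. The single-feature 1-nearest-neighbour classifier, the helper names, and the toy rows are all assumptions picked to keep it short; the API's own behaviour is not documented in this respect.

```python
def k_fold_indices(n_items, k):
    """Yield (train_idx, test_idx) pairs; every index is held out exactly once."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_items) if i not in held]
        yield train, held_out


def nearest_neighbour_label(train_rows, x):
    """1-nearest-neighbour on a single numeric feature."""
    return min(train_rows, key=lambda row: abs(row[0] - x))[1]


def cross_validated_accuracy(rows, k):
    """Fraction of rows labelled correctly when each is predicted from the other folds."""
    correct = 0
    for train_idx, test_idx in k_fold_indices(len(rows), k):
        train_rows = [rows[i] for i in train_idx]
        correct += sum(
            nearest_neighbour_label(train_rows, rows[i][0]) == rows[i][1]
            for i in test_idx
        )
    return correct / len(rows)


# Toy rows (an assumption): one numeric feature and a categorical label.
rows = [(0.1, "low"), (0.2, "low"), (0.3, "low"), (0.4, "low"),
        (5.1, "high"), (5.2, "high"), (5.3, "high"), (5.4, "high")]
accuracy = cross_validated_accuracy(rows, k=4)
print(f"4-fold accuracy: {accuracy:.2f}")
```

Every row is scored by a model that never saw it, so the accuracy is an estimate of how the classifier does on new data and not on data it has memorised.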

Completely useless (1)

Bitmanhome (254112) | more than 3 years ago | (#32287280)

I asked the Google Prediction API what the next Google API would be, and it said "Google Prediction API".
