
Google Begat the End of the Scientific Method?

CmdrTaco posted more than 6 years ago | from the well-i-begat-a-roast-beef-sandwich dept.


TheSauce writes "In a fairly concise one-pager from Chris Anderson at Wired, the editor posits that all of our current (or now previous) models for collecting data are dead. The content is compelling. It notes that we've entered the Age of the Petabyte — where one can collect immense amounts of data that are paradigm agnostic. It goes on to add a comment from the head of Google's R&D that we need an update to George Box's maxim: 'All models are wrong, and increasingly you can succeed without them.' Have we reached a time where all of our tool-sets are now made moot by vast clouds of information and strictly applied maths?"


Ahem (5, Insightful)

Anonymous Coward | more than 6 years ago | (#23935741)

The content is compelling. It notes that we've entered the Age of the Petabyte — where one can collect intense amounts of data that is paradigm agnostic. It goes on to add a comment from the head of Google's R&D, that we need an update to George Box's maxim: "All models are wrong, and increasingly you can succeed without them." Have we reached a time where all of our tool-sets are now made moot by vast clouds of information and strictly applied maths?
I believe I speak for not a few of us when I respond:

WTF?

English, ---, do you speak it?

WTF indeed (5, Insightful)

GameboyRMH (1153867) | more than 6 years ago | (#23935775)

I saw the article yesterday, but it was so WTFey I just moved on...definitely not Slashdot submission material (especially being a Wired article).

Re:WTF indeed (5, Funny)

eggoeater (704775) | more than 6 years ago | (#23935875)

"WTFey"
I hadn't seen WTF adjective-ised before, but I love it... there's just so much I can use it with. In fact, I gotta go now and tell my boss how my project is going....

Re:WTF indeed (5, Funny)

mrchaotica (681592) | more than 6 years ago | (#23936061)

adjective-ised

And I hadn't seen adjective verbed!

Re:WTF indeed (3, Funny)

melikamp (631205) | more than 6 years ago | (#23936613)

And I—a pronoun slashed. Only on /.

Just to clarify (5, Insightful)

GameboyRMH (1153867) | more than 6 years ago | (#23936161)

To avoid the same fate as the GP, let me clarify that by WTFey I specifically meant that the article was full of fluff, light on details and generally pointless...which makes me think "WTF." The closest thing to a point I could get from the article was "Nice big blobs of data can be useful, and statistical data based on said blobs could replace the results of scientific research." Mmmkay.

A sensational headline leading to a rather pointless article consisting mostly of fluff: WTF.

Re:WTF indeed (5, Funny)

MightyMartian (840721) | more than 6 years ago | (#23936249)

It reads like some sort of brain-damaged new-age technohippy tripe. Yeah, we don't need methodologies any more, because, maaaan, we've got tubes! Gimme a break.

Re:WTF indeed (1)

arivanov (12034) | more than 6 years ago | (#23936437)

Yep.

And if it were true, all the investment shops would be using this tech instead of paying silly money to people who know math and can do modelling. I haven't heard of that happening just yet, so as they say: "keep me posted..."

Re:Ahem (5, Insightful)

smallfries (601545) | more than 6 years ago | (#23935923)

I used to think that I could translate most dialects of bullshit into English, but this one caught me off guard. The most reasonable explanation is that Chris Anderson is a tool and doesn't know what he is talking about.

For example, data is now "paradigm agnostic". Seriously, wtf? When was data ever not "paradigm agnostic", and when did we develop the need for a term to describe it? Data is data. It is raw and unanalysed, and as such the notion of a paradigm is completely irrelevant.

Re:Ahem (2, Insightful)

Anonymous Coward | more than 6 years ago | (#23936023)

"For example, data is now "paradigm agnostic". Seriously, wtf?"

Just look at the creation-evolution controversy to see how data is not 'paradigm agnostic'. Each claims the other's data is unsound based on the paradigm's umbrella it falls under.

Re:Ahem (5, Informative)

Anonymous Coward | more than 6 years ago | (#23936175)

Each claims the other's data is unsound based on the paradigm's umbrella it falls under.

No, each claims the other's theory is wrong.

Nobody (sane) disputes the existence of ring species, or microevolution, or other observable forms of data. The only thing in dispute in the controversy is "species are species because they were made that way" versus "species are species because after some really big N evolutionary steps they become that way".

Re:Ahem (4, Funny)

clang_jangle (975789) | more than 6 years ago | (#23936125)

Data is data. It is raw and unanalysed, and as such the notion of a paradigm is completely irrelevant.


Well, we already know it wants to be free, so maybe now it's just exercising its sentient status in other areas.

The Paradigm is the Data Subset (5, Insightful)

fictionpuss (1136565) | more than 6 years ago | (#23936261)

The paradigm is embedded in the quantity, or subset, of data you choose to analyse.

For example, to detect stress you might traditionally measure heartbeat, skin conductivity, pupil dilation.

In the "petabyte age" you throw in the number of times the subject uses the letter 's'; how frequently they use the 'reload' button on the browser; what colour of pants they wore last tuesday; Pepsi vs. coca cola; the number of times they picked their nose in 1997 and any and every other bit of data you have on the subject.

In the "petabyte age", most of the data you sift through will show no correlation, but you have a much better chance of finding the unexpected if indeed, there is some unknown factor out there.

Re:The Paradigm is the Data Subset (5, Insightful)

kurthr (30155) | more than 6 years ago | (#23936713)

Don't you run a much higher probability of finding high correlation by chance?

I can expect to find a result that matches my model to 95% certainty about 5% of the time in random data. You can correct for this, but it's against human nature because people like to see the face of Mary in toast.

Learning how to look for correlation in huge uncontrolled data sets will require a new paradigm... or it will ultimately be useless and perhaps even unsuccessful.
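
To make the 5% point above concrete, here is a minimal simulation (a sketch in Python with NumPy/SciPy; the sample sizes and seed are arbitrary assumptions): test a thousand purely random predictors against a random target at p < 0.05 and roughly fifty of them will "correlate" by chance.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_samples, n_candidates = 100, 1000

# One target signal and 1000 candidate predictors -- all pure noise.
target = rng.normal(size=n_samples)
candidates = rng.normal(size=(n_candidates, n_samples))

# Pearson correlation test of each candidate against the target.
p_values = np.array([stats.pearsonr(c, target)[1] for c in candidates])

# With no true effect anywhere, ~5% still pass the p < 0.05 threshold.
print("spurious 'discoveries':", int(np.sum(p_values < 0.05)), "of", n_candidates)
```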

Re:Ahem (4, Insightful)

commodoresloat (172735) | more than 6 years ago | (#23936267)

Well, in the abstract data may be "paradigm agnostic," but the selection of data one has access to at any given time is inevitably not. Which data you choose to collect, how much of it you collect, which data you ignore - these are all decisions that are ultimately subjective. (BTW I think this is probably true even in the age of Google, but his point is that one is now collecting, storing, and accessing so much data that the "paradigm" influencing those decisions is no longer a specific scientific theory or point of view.)

Paradigm agnostic (0, Funny)

Anonymous Coward | more than 6 years ago | (#23936333)

I firmly believe in paradigm. Call me what you will: "paradigm freak", "irrational", "stupid", etc...

Nothing will shake my faith in data or the paradigm. My faith has given me peace and happiness. I just hope you agnostics and paradigm atheists respect my beliefs and I'll respect yours.

Thank you and peace.

Re:Ahem (5, Interesting)

nine-times (778537) | more than 6 years ago | (#23936355)

Yeah, I don't know what "paradigm agnostic" means specifically, but I think it's a mistake to think that "data is data".

Not all data is created equal. You have to ask how it was collected, according to what rules, and with what purpose. I can collect all sorts of data by stupid means and have it be unsuitable for proving anything. It's even possible that I could collect a bunch of data in an appropriate way, accounting for the variables that matter for my particular experiment, and have that data be inappropriate for other uses.

Of course, if what's intended by "paradigm agnostic" is that we no longer pay attention to those things, then I hope we're not becoming paradigm agnostic. I'm just bringing this up because I think some people think numbers don't lie, and that when you analyze data, either your conclusions will be infallible or your analysis is flawed. On the contrary, data can not only be bad, but it can be inappropriate.

Re:Ahem (0)

Anonymous Coward | more than 6 years ago | (#23936675)

The "petabyte age" thing seems to assume that all data is available - even the stuff that nobody thought was important at the time. How you get that previously unimportant data is anybody's guess.

Re:Ahem (5, Interesting)

eln (21727) | more than 6 years ago | (#23936037)

It's simple really: The article seems to be saying that we have access to such a ludicrously large amount of data that trying to draw any real meaning from it is pointless. So, we employ a "shotgun" approach at reading the data, and voila, we get data that at least appears to be interesting.

Of course, since we have no particular purpose in mind when we do this, and no particular method other than "random", we end up with mostly useless data (in the example given, we have a bunch of random gene sequences that must belong to previously unknown species, but we know nothing about those species other than that we found some random DNA that probably belongs to them, and have no particularly good way of finding out more).

The article seems to be saying that since we have so much data, we can now draw correlations between different pieces of data and call it science. No reason is given why this is useful other than that we have so much of it, and Google is somehow involved. Apparently when you have enough data, "correlation does not equal causation" is no longer true. Again, no coherent reason is given for this stance.

I think the article makes the same mistake a lot of ill-informed people who get excited by big numbers make: it seems to treat data as an end in itself, when really vast amounts of data are useless unless they can help us as humans answer questions that we want answered. Yes, knowing that there are lots of species of organisms in the air that we didn't know about before is sort of interesting, I guess, but it doesn't really tell us anything useful.

Above all, the article proves that you can be almost entirely incoherent and still get your article published in Wired if it says something about how Google is changing the world.

Re:Ahem (1)

truthsearch (249536) | more than 6 years ago | (#23936171)

Above all, the article proves that you can be almost entirely incoherent and still get your article published in Wired if it says something about how Google is changing the world.
Chris Anderson is the editor-in-chief of Wired Magazine. He's the one who gets to choose what they publish.

Re:Ahem (1)

AKAImBatman (238306) | more than 6 years ago | (#23936285)

Explains a lot, doesn't it?

Re:Ahem (2, Funny)

loonycyborg (1262242) | more than 6 years ago | (#23936185)

the article proves that you can be almost entirely incoherent and still get your article published in Wired
And linked to on slashdot even!

Re:Ahem (2, Insightful)

tshetter (854143) | more than 6 years ago | (#23936353)

I didn't see the article as really saying that "correlation does not equal causation" stops holding at some point with a large enough data set.

I saw it as saying "With so much data, you can use that as a base for preliminary research."

You then research those interesting things in traditional ways, but you have started with some sort of insight.

If you have enough images of the sky and stars, you can use the images to look for interesting things first, and then jump on a telescope or satellite when you have something solid to look for.

But to be sure, the author was selling "Google is the Answer" pretty hard. The application of math to problems is never a bad idea, and they are doing it pretty well. And with the evolution of computers, more data and more processing are naturally going to occur.

Re:Ahem (3, Interesting)

MightyMartian (840721) | more than 6 years ago | (#23936361)

It's an idiotic notion. We've had vast amounts of data for well over a century now, more than we can hope to fully measure and catalog in a lifetime. Everything from fossils to space probe readings to seismic measurements fills up data archives, in some cases literally warehouses full of data tapes, artifacts and paper. The way you deal with this sort of thing never changes. Provided the data is stored in a reasonable fashion, if you have a theory, you can go back and look at the old measurements, artifacts, bones, whatever, and test your theory against the data. The only difference is that rather than going out and making the observations yourself, you're using someone else's (or some computer's that just transmitted its data).

Very WTFey (1)

fictionpuss (1136565) | more than 6 years ago | (#23936405)

Yes, knowing that there are lots of species of organisms in the air that we didn't know about before is sort of interesting I guess, but it doesn't really tell us anything useful.
WTF? It tells us that money spent on discovering those organisms will not be in vain -- that there is an area worthy of further investigation.

What is not useful about that?

Re:Very WTFey (1)

eln (21727) | more than 6 years ago | (#23936631)

Maybe, but the article also says we don't need the scientific method or any other "model" to interpret the data once we have it. Instead, we use some sort of ill-defined "Googlish" method to derive meaning from it.

Seems to me that no matter how much data you have, and no matter how efficiently you can search through it, you're still going to need some sort of model, and especially the scientific method, if you want to derive any useful science out of it. The article seems to be suggesting that Googling through the data is good enough to find all the answers you need.

Re:Ahem (1)

MrMarket (983874) | more than 6 years ago | (#23936561)

Above all, the article proves that you can be almost entirely incoherent and still get your article published in Wired if it says something about how Google is changing the world.

...or if you are the publication's editor

Re:Ahem (2, Insightful)

jank1887 (815982) | more than 6 years ago | (#23936697)

Translation:

Old way: develop a physical model of how we think things work, test a few cases, refine the model.

New way: collect a huge relevant data set, mine the data for interrelationships, make a correlation. Correlation models replace scientific models; no more need for hypothesis testing.

We had this coming (2, Insightful)

nova.alpha (1287112) | more than 6 years ago | (#23935761)

> made moot by vast clouds of information

Sure, seeing how 90% of websites are doorways, satellites, and other SEO tricks. Way to go, interwebz.

Definitions (3, Insightful)

sir_eccles (1235902) | more than 6 years ago | (#23935785)

"Data, information, knowledge, intelligence."

They may lead from one to the other but they are not all the same thing.

Re:Definitions (4, Insightful)

Itninja (937614) | more than 6 years ago | (#23935895)

A bit OT here, but don't forget 'wisdom' after intelligence. So many people stop at intelligence.

Re:Definitions (2, Funny)

Anonymous Coward | more than 6 years ago | (#23936071)

Also, charisma and dexterity are very important.

Re:Definitions (4, Insightful)

gnick (1211984) | more than 6 years ago | (#23936095)

don't forget 'wisdom' after intelligence. So many people stop at intelligence.
From what I've seen, it's not completely a progression from one to the other. I've met people I would describe as 'knowledgeable', 'intelligent', or 'wise' without possessing the other attributes. Those traits are often coincidental and one can help beget another, but it's far from a hard-set 'intelligent'->'knowledgeable'->'wise' progression.

Re:Definitions (0)

Anonymous Coward | more than 6 years ago | (#23936453)

A bit OT here, but don't forget 'wisdom' after intelligence.
Yes, wisdom is after intelligence, and then comes charisma.

It depends (2, Funny)

geekoid (135745) | more than 6 years ago | (#23936461)

Fighter classes generally stop at Con, whereas casters generally go for Int or Wis. No one cares about Cha.

Re:Definitions (1, Funny)

Anonymous Coward | more than 6 years ago | (#23936513)

Wisdom does come after intelligence in the stat arrays.

Str, Dex, Con, Int, Wis, Cha

There's no getting around it, Wisdom is simply a good dump stat for most classes.

Re:Definitions (1, Funny)

Anonymous Coward | more than 6 years ago | (#23936571)

Feel free to continue using Charisma as a dump stat, though.

Somebody didn't understand Kuhn (0)

Anonymous Coward | more than 6 years ago | (#23935815)

"one can collect intense amounts of data that is paradigm agnostic"

No data is paradigm agnostic. You already chose to either collect it or pay attention to it, and neither of those decisions is paradigm-agnostic. Not to mention, the data must be stored in paradigm-laden formats: units and categories that may mean nothing, or everything.

Don't worry, hardly anyone else really understood Kuhn either.

Re:Somebody didn't understand Kuhn (1)

Breakfast Cereal (27298) | more than 6 years ago | (#23936731)

Thank you, this is exactly what I was going to post. Without a paradigm, there's no way to determine what "the data" even means, much less set about collecting it.

I wish paradigm had never become a buzzword. Now it means whatever people want it to mean.

Not quite (4, Funny)

edwebdev (1304531) | more than 6 years ago | (#23935817)

Until cells, molecules, atoms, and subatomic particles start publishing blogs, the scientific method will remain useful.

Re:Not quite (2, Funny)

cp.tar (871488) | more than 6 years ago | (#23936541)

Quite.

And no matter the amounts of data, no matter the computing power, I don't think pure statistics will ever be able to analyze human language efficiently.

So... (5, Insightful)

dunnius (1298159) | more than 6 years ago | (#23935823)

So everything possible has been researched now and therefore no more research is necessary since it will all be on the internet? Ridiculous!

humm - the hell? (1)

Amouth (879122) | more than 6 years ago | (#23935827)

what?

the current quote "Never frighten a small man -- he'll kill you." seems more relevant

this was kinda predictable.... (1)

jrathe89 (1310829) | more than 6 years ago | (#23935845)

kinda predictable if you ask me.....hardware as well as software will always be ever expanding...

The problem with this newly coined 'age'. (0)

Anonymous Coward | more than 6 years ago | (#23935881)

I just hope Petaphile never becomes mainstream.

Re:The problem with this newly coined 'age'. (2, Funny)

Vectronic (1221470) | more than 6 years ago | (#23936497)

Petaphile [urbandictionary.com]

1) Someone who loves their pets more than human beings or, at the extreme, someone willing to kill a human to save a lower animal's life.

2) Somebody who has sex with animals because they cannot attract any humans, or they are attracted to animals

(and the best one)

3) someone so caught up in his own egomaniacal conception of the world that he is compelled to spew vomit and blood on a stranger's clothes to show his contempt for anybody's thought but his own.

Which sounds kinda like the summary for the article, as well as some of the article.

How bout no (5, Insightful)

Anonymous Coward | more than 6 years ago | (#23935903)

Um, no. Claims like this demonstrate a lack of understanding of what a model is.

From the perspective of physics, the universe is just a massive amount of data--more data than any single human can comprehend at once. But thanks to the models of Newton we have a set of relatively simple equations that describe, generally, the way bodies in the universe interact. The model is not perfect, but it is useful.

Likewise, Google uses a very explicit model to describe the universe of the web: some pages are more relevant to a given search query than others, and these pages will generally be more 'popular' among other important pages. Again, the model is not perfect, but it is useful.

The fallacy is that somehow what Google is doing is a paradigm shift. It's not. It's just applying the same kind of scientific method to a type of data that hadn't existed before.

What I think the article is really trying to say is that Google's data is so massive and complex that we can't ascribe any explanation to the results it gives us. First of all, that is false, because the PageRank algorithm in its simplest form does give us a very explicit explanation (popular pages generally return better results). But even if it were true, Newton faced the same kind of accusations when people called his model of the universe 'Godless' and claimed, for example, that he described how gravity works without actually explaining "why" it works like it does. And that accusation is always with science. There are always more questions raised than answered. This is nothing new.
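
For reference, the "very explicit model" mentioned above fits in a few lines. This is a minimal power-iteration sketch of PageRank in its simplest textbook form (Python with NumPy; the pagerank function, the damping factor 0.85, and the toy link graph are illustrative assumptions, not anything from Google):

```python
import numpy as np

def pagerank(adjacency, damping=0.85, iterations=100):
    """Power iteration over a column-normalized link matrix."""
    n = adjacency.shape[0]
    out_degree = adjacency.sum(axis=0)
    out_degree[out_degree == 0] = 1.0        # avoid division by zero for sinks
    link_matrix = adjacency / out_degree     # each page splits its vote evenly
    rank = np.full(n, 1.0 / n)
    for _ in range(iterations):
        rank = (1 - damping) / n + damping * link_matrix @ rank
    return rank

# Four pages; adjacency[i, j] = 1 means page j links to page i.
links = np.array([[0, 0, 1, 0],
                  [1, 0, 0, 0],
                  [1, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
print(pagerank(links))                       # well-linked pages rank higher
```

Real PageRank handles dangling pages and much else differently; this skeleton only makes concrete the explanation that popular pages pass their votes to the pages they link to.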

One word: (1, Funny)

Anonymous Coward | more than 6 years ago | (#23935905)

"Computers."

Honestly, that's about the gist of the article, and it left me wondering just what the point of it was. Until I remembered the career advice from The Graduate.

Bullshit (1, Interesting)

Anonymous Coward | more than 6 years ago | (#23935907)

This might have been true if all of your data were within the same order of magnitude. But consider things like the hyperfine structure. A petabyte is pretty large, but it is nothing compared to what you would need to randomly sample the entire electromagnetic spectrum at a resolution that would detect hyperfine levels. When fields like physics deal with subjects spanning over 40 orders of magnitude, random sampling isn't going to displace intelligent sampling.

Don't rule science out yet. (5, Insightful)

russotto (537200) | more than 6 years ago | (#23935911)

The article is utter nonsense. But it's such a rambling mess it's hard to know where to start picking it apart. Perhaps the best place is where he presents, as an example of this new "model-free" approach, a program which includes "simulations of the brain and the nervous system". Uh, hello... a simulation IS a model.

Re:Don't rule science out yet. (5, Funny)

feed_me_cereal (452042) | more than 6 years ago | (#23936105)

He didn't bother writing more than one rambling page because he figured someone said it better somewhere else on the internet and that we're all bound to find it.

Re:Don't rule science out yet. (4, Interesting)

ColdWetDog (752185) | more than 6 years ago | (#23936163)

The article is utter nonsense. But it's such a rambling mess it's hard to know where to start picking it apart.

I suppose you could start where he, again, tries to present the argument that correlation really is "good enough" - causation be damned. What he is blathering on about is that you can infer lots of things via statistical analysis - even complex things. That's certainly true. Where he fails (and it's an EPIC fail) is his assertion that this method is a general phenomenon, suitable for everyday use.

The other major failure of TFA is that I can't find a car analogy anywhere.

Re:Don't rule science out yet. (5, Insightful)

JustinOpinion (1246824) | more than 6 years ago | (#23936205)

it's such a rambling mess it's hard to know where to start picking it apart.
Agreed. I want to do a line-by-line rebuttal... but I fear that would be a waste of time.

The article does not make a compelling point. It keeps saying that we can give up on models (and science), because now we just have lots of data, and "correlation is enough." What utter BS. Establishing a correlation is not enough. Even if it is predictive for the given trend, it doesn't allow us to generalize to new domains the way a well-established scientific model does. If an engineer is designing a totally new device that goes above and beyond what any established device has done, what data can he draw upon? If there is no mountain of data, he must rely on the tried-and-true techniques of engineering/science: use our best models, and predict how the new device/system will behave.

The article actually makes this point perfectly clear when it says:

Venter can tell you almost nothing about the species he found.
Indeed. Merely having tons of data doesn't actually give you insight into what you have measured. You must distill the data, pull out trends, and construct models. I just don't see how having mountains of data about a species, while still being unable to answer simple questions about it, is superior to conventional science (which can answer questions about the things it has discovered).

A deluge of data and data-mining techniques is a boon to science. But I don't see the benefit of giving up on the remarkably successful strategy of constructing models to explain the phenomena we've observed. I somehow doubt that having 20 petabytes of data on electron-electron interactions is more useful than having a concise theory of quantum mechanics.

Fat data stores might not be the right data (1)

postbigbang (761081) | more than 6 years ago | (#23936411)

We agree.

Restated:

The information quality of data isn't implied by large amounts of it. Correlation (read: petabytes of foo) != causation.

simulation != model (1)

migloo (671559) | more than 6 years ago | (#23936579)

Uh, hello... a simulation IS a model.

Simulation is a copy.
A model involves a shortcut.

There is a confusion here between technology, which indeed is taking the peta-brute-forcing route, and science, whose beauty resides precisely in its computational economy.

A perfect copy of a brain would be an engineering feat without providing any understanding of why it works.

He's trying to sell you Enterprise Search (0)

Anonymous Coward | more than 6 years ago | (#23936593)

The root of all this is a pitch for Enterprise Search. I can't tell you how much more productive I am now that I can search all my email with Google Desktop (or whatever application anyone thinks is better.)

I am getting in screaming matches with my boss because management wants me to "moderate" all our 10 different corporate "portals", each of which has been created because some pissant minor manager didn't like the way the 9 other pissant managers were moderating their portals. Fuck that, the corporate intranet is a big pile of data, the tools exist for users to search it themselves, and I can do more interesting things than argue with users over the difference between "its" and "it's".

The real death of the scientific method came with (-1, Troll)

Iowan41 (1139959) | more than 6 years ago | (#23935913)

The rejection of the metaphysical basis for knowing that there is an objective reality that can be known by the human mind. It was nearly totally destroyed by the recent politicization of science 'by consensus' by the IPCC, James Hansen and other fraudsters, and by the insistence that the experimental method can only be called science if those experimenting happen to also be atheists - a religious test for being a scientist!

My Start menu has been Googled (4, Insightful)

spyrochaete (707033) | more than 6 years ago | (#23935939)

I am definitely a victim of this "Google effect". Search makes me lazy.

For example, for years I would pride myself on my well-tended Windows Start menu. I'd create base categories for my application folders like Hardware, Games, and Internet, and move applications into those folders to keep my Start menu manageable. I blogged about this procedure [demodulated.com] and included a screenshot.

Now that I'm using Vista I have little need to be so organized. I rarely have to navigate manually to an application folder thanks to the embedded search box on the Start menu. So now my Start menu is a huge clutter, but so what? I see that exercise as being as futile as dusting the cardboard boxes in the attic.

Re:My Start menu has been Googled (1)

Hal_Porter (817932) | more than 6 years ago | (#23936255)

Now that I'm using Vista I have little need to be so organized. I rarely have to navigate manually to an application folder thanks to the embedded search box on the Start menu. So now my Start menu is a huge clutter, but so what? I see that exercise as being as futile as dusting the cardboard boxes in the attic.
If you were fighting an enemy and wanted to wipe them out, would you want them to be capable of organising shit for themselves, or would you want them to think organisation was a futile exercise? It's a lot easier to hunt slipshod hippies with Terminators and Hunter-Killers than organised types who know where they hid the ammunition stash. The hippies will type "amuniton" and expect a machine to fix the typo and find it.

Re:My Start menu has been Googled (1)

ScentCone (795499) | more than 6 years ago | (#23936695)

The hippies will type "amuniton" and expect a machine to fix the typo and find it.

The details don't matter. The point is that it takes a village to find the ammunition. And if it turns out that one person is clever enough to do it on their own, without being properly vetted by the village's elites, then that person must be punished in some way, so as to avoid making the other villagers feel bad that they can't do it themselves. It's not that it DOES take a village to do something, see, it's that the village will squash anyone who has the gall to demonstrate the ability to function without the village's bureaucracy and permission. Hopefully the village's enemies will be so horrified by the unstoppable monster of centrally dictated collectivism that they'll run away without a shot being fired. In fact, the village doesn't even have to HAVE an ammunition stash. They should be able to bluff their way through a few generations of rule by their benign dictatorship before their enemies realize what a paper tiger they actually are, or the villagers themselves realize they're being cruelly shit on by authoritarian hippies. Nah, that would never happen. I mean, not again, right?

Re:My Start menu has been Googled (2, Insightful)

Hatta (162192) | more than 6 years ago | (#23936427)

Now that I'm using Vista I have little need to be so organized. I rarely have to navigate manually to an application folder thanks to the embedded search box on the Start menu.

If you're going to take your hands off the mouse to run an app, why not just pop open a console and start it from there? I have no use for any sort of start menu; I have a console. It's certainly more flexible than a search bar: you can pass arguments or file names (with wildcards, even) to the application.

Re:My Start menu has been Googled (2, Interesting)

maxume (22995) | more than 6 years ago | (#23936531)

There are third party apps to add similar functionality to XP. Launchy is the one I use:

http://www.launchy.net/#download [launchy.net]

I think they are all clones of some Mac app though.

Chicken Egg (0)

Anonymous Coward | more than 6 years ago | (#23935945)

"It's time to ask: What can science learn from Google?"

Science had nothing to do with founding Google.

Not so much (0)

Anonymous Coward | more than 6 years ago | (#23935957)

For a long time we have had two ways of looking at the world: deterministic and statistical. More data may make for better statistical models or maybe not!

The best example I can think of is weather forecasting. In the 1970s we thought that if we had enough data and powerful enough computers, we could totally predict the weather, nay even the climate. We didn't take butterflies into account.

So, sometimes no matter how much data you have, you're euchred. The scientific method still works in the domains where it works (and doesn't where it doesn't). Nothing has changed. Nothing to see here, folks. Move along.

What question do you ask the data. (4, Insightful)

xzvf (924443) | more than 6 years ago | (#23935961)

Searching data is a tool. You still need to have insight to formulate a theory, develop a test for the theory, and ask the data pool the right (non-leading) question. Then evaluate the data looking for both proof and disproof of the theory, and be smart and ego-neutral enough to let the data suggest a new theory, test, and question. Don't confuse a new and useful tool that makes insight easier with the ability of humans to have that insight.

Re:What question do you ask the data. (3, Insightful)

Daniel Dvorkin (106857) | more than 6 years ago | (#23936269)

Exactly. The "deluge of data" is a useful tool, no doubt about it. But Google doesn't make the job of collecting and analyzing data irrelevant any more than the advent of the telescope made the skills and knowledge of astronomers obsolete.

I particularly love this line from TFA:

For instance, Google conquered the advertising world with nothing more than applied mathematics. It didn't pretend to know anything about the culture and conventions of advertising -- it just assumed that better data, with better analytical tools, would win the day. And Google was right.

(Applied) science at its best! "The culture and conventions of advertising" are basically folk wisdom, and folk wisdom is often right but more often wrong. Google took a scientific, unbiased view of how to move bits around and make money with them: start with as few preconceptions as possible, analyze the data, see what happens.

Re:What question do you ask the data. (1)

stranger_to_himself (1132241) | more than 6 years ago | (#23936659)

An emerging problem is that we have more data than questions. So we go from the traditional approach of 'hypothesis -> data collection -> statistical test of hypothesis -> profit' to the new sort of data-driven approaches that things like genome sequencing are giving us. These go more like 'data collection -> data mining -> hypothesis generation -> ???'.

The main problem with this new approach is that you get so many possible findings from these huge datasets that you need an awful lot more data and replication to be sure they aren't flukes.
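
One standard guard against those flukes is to correct for the number of hypotheses tested. Below is a minimal sketch of the Benjamini-Hochberg false-discovery-rate procedure (Python with NumPy; benjamini_hochberg is a helper written for this sketch, not a library function):

```python
import numpy as np

def benjamini_hochberg(p_values, alpha=0.05):
    """Boolean mask of discoveries at false-discovery rate alpha."""
    p = np.asarray(p_values)
    m = len(p)
    order = np.argsort(p)
    # Find the largest k with p_(k) <= (k/m) * alpha, then reject ranks 1..k.
    thresholds = (np.arange(1, m + 1) / m) * alpha
    passed = p[order] <= thresholds
    k = int(np.max(np.nonzero(passed)[0]) + 1) if passed.any() else 0
    reject = np.zeros(m, dtype=bool)
    reject[order[:k]] = True
    return reject
```

Fed a thousand p-values from pure-noise tests, this typically rejects nothing, which is the right answer; fed data with a few real effects, it keeps most of them while screening out the flukes.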

quality still as important as quantity (2, Interesting)

peter303 (12292) | more than 6 years ago | (#23935973)

There are still several computing problems from earlier, smaller eras that haven't been solved by the "more" paradigm. One example is realistic synthetic voice. The bandwidth is megabytes, achieved by mp3 players some years ago. However, voice is the last part of the "real world" we have to capture instead of synthesize to implement computer-generated feature movies or video games. This keeps the need for having some "flesh" actors around, at least for a few more years :-)

Then there was Slashdot's retrospective of Artificial Intelligence a few days ago. Many of the interesting advances were made in the kilobyte and megabyte eras. It seems the gigabyte and terabyte eras have barely made a dent in progress.

Google =/= scientific method (5, Informative)

Rubikon (218148) | more than 6 years ago | (#23935979)

That an incredible amount of data exists on any given topic does nothing to describe relationships, causality, precision, accuracy, distribution, correlation, or anything else. Data is information, and information must be processed in order to make it meaningful. Additionally, everything that's written, printed, published, etc, is not necessarily true, accurate, precise, etc.

If anything, the Google phenomenon demands more rigorous examination by accepted methods.

The preceding message has been brought to you by Captain Obvious and the letters O,R,L,Y.

Say what now? (1)

TubeSteak (669689) | more than 6 years ago | (#23935993)

Correlation supersedes causation
Since when?
I'm pretty sure I was told the opposite in [every stats class ever]
 
Crunching large amounts of data is useless if you don't sort out which results are meaningless.
 
Side Note: WTF is up with /.?
I always post using Plain Old Text and hitting enter twice (two line breaks) only displays as one line break.
{p} doesn't create a new paragraph.
{br}{br} is the only thing that shows up correctly for me. /. was not behaving this way last week.

Re:Say what now? (1, Interesting)

Anonymous Coward | more than 6 years ago | (#23936483)

For a long time we've known that causality is a broken paradigm. Correlation is all there really is. Your "causal" laws of physics are just an expression of very very high correlation. People like to talk about "mechanisms" but the mechanism is defined in terms of other imponderables (such as "forces", whatever they may be). It's all just to make things look like how we want them to look. Causation is make-believe. Useful make-believe, but it doesn't generalize, while correlation also extends down into complex systems where "cause and effect" are impossible to observe.

As an example, we know perfectly well that if you smoke you are *a lot* more likely to get lung cancer than if you don't smoke. But there is no evidence whatsoever that smoking causes lung cancer. The problem is not that we can't prove that smoking causes lung cancer, but that our concept of causation does not apply to systems as complex as the human body.

So in the words of the original article, the Scientific Method in that sense has been dead for at least 100 years.

No. (4, Insightful)

qw0ntum (831414) | more than 6 years ago | (#23935999)

First, not everyone has access to vast clouds of information, due to expense, and I don't think that's going away any time soon. So we'll still get to understand what's going on around us and not just rely on regression analysis to inform our every decision.

Second, in my experience with large sets of data, you can do all kinds of math to them to bring out interesting relationships, but someone with domain expertise is going to have much better insight into what the data is saying than someone without it. It seems the peak of hubris to think that the techniques taught in every science (social, hard, or otherwise) are worth nothing compared to massive amounts of data. How do you know where to get the data from? How do you apply the data?

I don't think it's quite time to throw out "correlation != causation". In fact, I think now more than ever we need to be able to understand the underlying phenomena behind the data, precisely because there is so much of it. With so much data, coincidental correlation is going to happen quite often, I'm sure.

And, of course, the ultimate reason we need to understand things is for, you know, when the cloud's not there.

Nonsense (1)

Michael Restivo (1103825) | more than 6 years ago | (#23936009)

Out with every theory of human behavior, from linguistics to sociology. Forget taxonomy, ontology, and psychology. Who knows why people do what they do? The point is they do it, and we can track and measure it with unprecedented fidelity. With enough data, the numbers speak for themselves.
Out with meaning and understanding as well.

Cheers, -m

Wrong (5, Insightful)

DogDude (805747) | more than 6 years ago | (#23936021)

This is typical web 2.0 hype... more is better. Which, as anybody who has used Wikipedia knows, is utter bullshit. The scientific method can't be supplanted by a large amount of questionable data. Tons and tons of bad data is still bad data. It doesn't get any more correct just because there's more of it.

Re:Wrong (1)

SBacks (1286786) | more than 6 years ago | (#23936537)

I disagree! And, I have evidence:

Googling "more is better" returns 177,000,000 results.
Googling "more is worse" returns 114,000,000 results.

QED

Interesting, ranty, and wrong (5, Insightful)

xPsi (851544) | more than 6 years ago | (#23936053)

A thought-provoking piece written by someone who understands neither the scientific method nor Google. Who doesn't understand the difference between a Theory and a model. Who still doesn't get correlation != causation. Who probably has never had to actually analyze any substantial amount of data before. And who has clearly been raised on a self-important intellectual diet consisting of too much Buckminster Fuller, Kurzweil, Frank Tipler, and Derrida. I'm sure there are some kernels of insight buried in there someplace, but I'm just not clear what they are. If his rant is indicative of the future direction of science, we're all doomed.

Re:Interesting, ranty, and wrong (1)

Bazer (760541) | more than 6 years ago | (#23936207)

If his rant is indicative of the future direction of science, we're all doomed.
I wouldn't be too concerned about that. I'd be more concerned about the reason behind this quote:

"All models are wrong, and increasingly you can succeed without them."
I sincerely hope there's some merit behind it. If there isn't any, then Google would have to revise this guy's job position.

Re:Interesting, ranty, and wrong (1)

TerranFury (726743) | more than 6 years ago | (#23936391)

I hadn't heard of Frank Tipler until you mentioned him. WOW! That man speaks crazy talk.

And he has a faculty position.

I'm starting to realize that academia is similar in some ways to pop culture, in that name recognition is everything. It differs only in that the publicity stunts you do need to impress a different sort of person.

In a way, it's career advice. *Goes back to getting PhD*

Quite... (3, Informative)

denzacar (181829) | more than 6 years ago | (#23936457)

I'm sure there are some kernels of insight buried in there someplace, but I'm just not clear what they are
My thoughts exactly.
And since most Slashdot readers don't RTFA, most comments here have proven useless in trying to figure out what those kernels you mention are.
But this guy, who has read TFA (and commented on it on Wired's site), seems to have found them.

Posted by: technophile
20 hours ago | 1 point

I think what you have hit on here is the difference between analytical and empirical solutions. Analytical relationships are usually first determined from empirical ones. Once you have the empirical relationships you can determine the missing factors or constants.

(See also http://en.wikipedia.org/wiki/Empirical_method [wikipedia.org] )

They are both necessary and a part of the scientific process. You collect data, generate empirical equations, then try to derive or otherwise model the empirical relationship with an analytical one. Empirical relationships are limited because they are somewhat system-dependent. For instance, an empirical relationship for the ideal gas law could be generated using methane. This might be accurate for methane, but limited in its use for a gas that deviates from the ideal behavior (i.e. hydrogen fluoride gas). You could generate an empirical relationship for every single molecule in the universe but that would be impractical, which is why analytical relationships can often be more useful. Hopefully the "Petabyte Age" will allow the scientific method to flourish, not replace it.

edit: Rethinking my reply, what the article seems to say is that the Petabyte Age will make determining empirical relationships for everything practical. The scientist who generates loads of empirical relationships and never questions the underlying theory is not a scientist at all, just an observer of scientific processes. I suppose it depends on your goal as to whether this will suit you or not.
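
A tiny numerical illustration of the empirical-versus-analytical distinction in the quoted comment (Python with NumPy; the "measurements" are synthetic and the ideal-gas setup is just the example the commenter picked):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic "measurements": pressure of one mole of a near-ideal gas at 300 K,
# with 1% instrument noise (illustrative numbers only).
R, T, n = 8.314, 300.0, 1.0
volumes = np.linspace(0.01, 0.05, 20)                      # m^3
pressures = n * R * T / volumes * (1 + 0.01 * rng.normal(size=20))

# Empirical route: fit P against 1/V with no theory attached...
slope, intercept = np.polyfit(1.0 / volumes, pressures, 1)

# ...then notice the fitted slope recovers the analytical constant nRT.
print("fitted slope: %.1f  analytical nRT: %.1f" % (slope, n * R * T))
```

The empirical fit predicts well inside the sampled range; the analytical model PV = nRT is what lets you extrapolate to gases and conditions you never measured.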

Biggest Data Collector LHC relies on Models (4, Insightful)

markk (35828) | more than 6 years ago | (#23936101)

I thought this was a joke at first. One thing to think about is that the biggest data collector of them all, the Large Hadron Collider (which fits the frame given perfectly, delivering terabytes of data in huge data sets), is just the opposite of the described scenario. Models are crucial to picking what data is actually recorded. In fact, a large part of how good the LHC data will be lies in using models to select which events to capture. The way the data is captured is of course also based on long effort and knowledge from previous detectors. This isn't just randomly, or even generically selectively, gathering data and then analyzing it. This is targeted data gathering based on complex scientific theories. There have been shouting matches over what to tag for collection based on what people think is important for a given theory - and these will happen again.

As our collection abilities rise exponentially, our storage and analysis abilities are not growing exponentially, even though they are increasing at a fast rate! I would argue exactly the opposite of what this article said. We are going to be more and more dependent on our current scientific theories to even be able to choose appropriately from the rich data that new sensors and techniques will let us collect. That is, we are more and more dependent on our scientific theories when we get data, not less. Did we even know to get methylation data when sequencing a genome? How about some other "ylation"? Without background theory and experience we wouldn't even know some of that stuff was there to collect!

WTF, be serious (2, Insightful)

mlwmohawk (801821) | more than 6 years ago | (#23936135)

This is nonsense pure and simple.

One needs to acquire facts. Now these "facts" can come from your own research or, in the age of the internet, someone else's data, but they still need to be collected and verified.

The *only* advantage that Google provides is a more efficient way of sharing and finding facts. Not even all facts: those that are popular and topical are what you'll most likely find.

Historical information, from when newspapers only used dead trees, can be very difficult to find on the internet unless someone else did the research first.

this one needs a "haha" tag (1)

quixote9 (999874) | more than 6 years ago | (#23936147)

Vast clouds of information used without intelligence are just garbage going nowhere. You can't even call it Garbage In Garbage Out, because it's not being processed by any kind of mind at all.

What could possibly go wrong?

Nonsense. (1)

going_the_2Rpi_way (818355) | more than 6 years ago | (#23936201)



This is grade school stuff. Correlation is not causation.

Which means if you're approaching a region you haven't sampled, then you can't understand what's going to happen because you've thrown away your interest in 'why [something] does what it does, [because] it just does it.'

If you're only using models as correlations or proxies, what are you using models for anyways? There's nothing 'increasingly' true about that.

WHAT? (1)

vivin (671928) | more than 6 years ago | (#23936215)

A scientist doing an experiment still relies on the scientific method to collect his own data to see if they support his hypothes[ie]s. I really don't see anyone publishing a paper and saying "Dudes! I used Google to find my data points!" How the hell is Google going to stop people from doing experiments and finding their own data?

This article is complete crap. I don't think this person even understands what the "Scientific Method" means.

It's still alive (1)

rnaiguy (1304181) | more than 6 years ago | (#23936229)

Sure, you've got tons of data, but you need a theory to use it to solve real scientific problems.

For example, Craig Venter may have tons of genes that look like something that can make gasoline from grass, but you still need to test each one the old-fashioned way, with careful application of theory and experiment, to see if it works before you start using it.

Sidney Brenner (legendary Biologist and Nobel laureate) calls these methods "low-input, high-throughput, no-output biology." http://www.mc.vanderbilt.edu/reporter/index.html?ID=5027 [vanderbilt.edu]

Re:It's still alive (1)

Daniel Dvorkin (106857) | more than 6 years ago | (#23936585)

Brenner, like a lot of older wet-lab scientists, makes some good points but goes way too far in his criticisms. High-throughput biology is increasing our understanding of basic cellular processes at an exponential rate. The key point he misses, I think, is that high-throughput techniques are certainly low-output on a per-experiment basis compared to traditional techniques -- but "low" is not the same as "no", and if you do a very large number of experiments in parallel, there's a good chance that one or two of them will yield useful data. Furthermore, with large public repositories like GEO [nih.gov] , there's a good possibility that the hundreds or thousands of experiments that don't yield useful results for your work can still be useful to someone else.

To paraphrase Mark Twain... (1)

Kid Zero (4866) | more than 6 years ago | (#23936273)

There's lies, damn lies, and statistics. Now "Clouds" of information.

Feh.

No. Science Scales. (3, Informative)

mbone (558574) | more than 6 years ago | (#23936299)

Have we reached a time where all of our tool-sets are now made moot by vast clouds of information and strictly applied maths?

No. And also no to the basic premise of the article.

Meteorologists have been doing this for decades (principal component analysis has been a crucial tool there since the 1960's, and correlation analysis has been used in some form since the 1920's if not earlier) and so have the astronomers. Oh, and the particle physicists have been sifting data in their own way on a big scale ever since World War II.

As one of many examples, if you have ever heard of an "El Niño event": that was discovered through correlation analysis and is best understood through principal component analysis. BTW, the original work predates electronic computers and was all done by hand. The vast quantities of meteorological data require statistical analysis to make any progress at all, but that certainly does not mean you cannot use the scientific method.

So, no, this does not invalidate the scientific method. In the Internet jargon, science scales.
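
For anyone who hasn't met principal component analysis, the parent's point fits in a few lines. A minimal sketch (Python with NumPy; the "gridded anomaly" data is invented for illustration): one shared large-scale mode hides inside noisy observations, and PCA pulls it back out.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data: 120 monthly anomaly observations at 50 grid points, driven by
# one shared large-scale mode plus local noise (numbers are illustrative).
months, points = 120, 50
mode = np.sin(np.linspace(0, 12 * np.pi, months))          # shared time series
pattern = rng.normal(size=points)                          # its spatial footprint
data = np.outer(mode, pattern) + 0.3 * rng.normal(size=(months, points))

# PCA via SVD of the mean-centered data matrix.
centered = data - data.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)

# The leading component recovers the hidden mode and most of the variance.
print("variance explained by PC1: %.0f%%" % (100 * explained[0]))
```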

Consequence of the Post-Modern Age (2, Informative)

phobos13013 (813040) | more than 6 years ago | (#23936307)

Anyone who has read any work by Lyotard, Baudrillard, or Derrida has seen this interpretation of reality coming for years. This is basically the consequence of the Post-modernist/Post-structuralist mentality.

In a sense, what the article is proposing is the "simulation" of reality in a computer system based on the available "data". This simulation, as I will suppose in a moment, is merely a flawed model, since the data being related must in some sense be based on an algorithm which inherently MIMICS reality and is not a substitution for it (no matter how "accurate" the agreement). But nonetheless, the result of this, as Baudrillard observed, is not a simulation but a simulacrum of reality, and eventually it will take the place of reality. The implication is that reality is not created or manufactured by the interaction of people in a "real" sense but is actually led by the operation of the simulacrum!

Nonetheless, the fact is there is no possible way to store ALL the data of the entire world (since some data is not recordable by a binary machine, and no, a "quantum" computer is not a way around that); however, this fact does not mean we cannot be misled by the simulacrum and led into a future where human interaction is what I would call inhuman, but what some who have (in some cases unknowingly) fallen for the post-modern myth would call merely an evolutionary result of human interaction.

In the future the storage of data, the usage of data, and the power of data will have a huge impact on our humanity, as the past twenty years should already be evidence of. I am not an apocalyptic fear-monger, but the proof is in the pudding. For further reading, I recommend a highly prescient book written in 1990 by a Mr. Mark Poster called The Mode of Information, which talks about some of these implications that are in the process of becoming as we speak.

To the contrary! (1)

anmida (1276756) | more than 6 years ago | (#23936321)

The idea in the article is interesting, but I personally feel it's totally bogus. Yes, crunching data with mathematical formulas can help extract something useful, but...
Strictly speaking, isn't a mathematical formula a model? All of the theories (models) we use in materials science to explain things (quantum mechanics, stress/strain relations of materials, etc.) are mathematical. Qualitative understanding doesn't give you a numerical prediction.
Perhaps the above is a bit of a logical flaw, but you still need the maths to get information out of all the data. You need to know what to look for and make the necessary algorithm (a low-level model?). AFTERWARDS, though, you need to understand that data. Otherwise, you have not done much to advance your understanding. I did RTFA, and the person mentioned who "discovered a new species" but doesn't know anything about it... neat. What, really, has he done? Just thrown out some meta-data for someone else to analyze, model, and study. Google is not the end of the scientific method. To the contrary, I think it will only help.

Offtopic (0)

Anonymous Coward | more than 6 years ago | (#23936327)

.. but found this information on wikipedia's current events page: "The US state of Florida purchases a large dildo to add to the pleasure of the senate about their lands in the Everglades."

When people say shit like this... (3, Funny)

iluvcapra (782887) | more than 6 years ago | (#23936337)

Have we reached a time where all of our tool-sets are now made moot by vast clouds of information and strictly applied maths?"

It means there's about to be an explosion in models and theoretical sciences. Always beware the End of History ;)

Houses will fall down, Tumors will go unchecked (1)

tezza (539307) | more than 6 years ago | (#23936371)

Here is some 'compelling' comment. Lots of very important things require the scientific method.

Obvious ones are medicines and housing materials.

Important ones are accurate global warming models and electric battery efficiency tests.

This Wired jerkoff and his band of know-little acolytes think that because they can accomplish everything in _their_ day without science, it will die out.

This myopic self-centredness would not have yielded them a clear signal on their iPhone. Science did that.

This is a lazy way to work (1)

wcrowe (94389) | more than 6 years ago | (#23936403)

I'm concerned that placing too much trust in such a model-less paradigm is dangerous. Why? There could be several reasons. Data can be artificially manipulated, for example. This can cause us to draw erroneous conclusions, and consequently, to make poor decisions. I would still want to know "why".

"Intense" amounts of data? (0)

Anonymous Coward | more than 6 years ago | (#23936431)

"Intense" amounts of data?

WTF is an "intense" amount?

Foundation Series (2, Interesting)

javelinco (652113) | more than 6 years ago | (#23936527)

Just finished rereading the Foundation series for the one millionth time. Anyone remember some of the signs of the decay of the first empire? The idea that these "scientists" were no longer experimenting, no longer looking for new ways to do things - just spending their time looking at old books and old experiments and trying to squeeze a "new" thought or two out of them? That a sociologist would study a society through books written about it? An archeologist would explore the ruins of a world by reading descriptions written by someone centuries before?

Anyway, catching the parallels here? The "search engine" is a great tool for gathering existing data - but our current tools help us:

1. Analyze that data
2. Gather more data

Can you honestly say that those aren't important anymore? The summary seems pretty crazy to me.

good point of article ... (1)

peter303 (12292) | more than 6 years ago | (#23936569)

Even though it's a bit exuberant, as some techies are, the interesting point is that each new generation of a "thousand" (giga, tera, peta) should give you new ways of looking at scientific problems. The claim that it "ends science" is just a red herring to get you to think about the issue.

Google marketing (1)

12357bd (686909) | more than 6 years ago | (#23936597)

is getting out of control.

Wired. (2, Funny)

E-Sabbath (42104) | more than 6 years ago | (#23936605)

You know, this may be the most pure Wired article I've read in a long time. Reminds me of the magazine's layout when it first came out. Complete bull, unreadable, unstructured, but slick.

Interesting, but off target. (1)

Techguy666 (759128) | more than 6 years ago | (#23936693)

What Google has done is represent the world, mathematically, as it existed a moment ago. This is a massively impressive feat for which we slashdotters don't give enough credit. On the other hand, I still say, "meh".

Think of it this way: Google has created a "model" world. I'm not thinking of "model" in the scientific sense, but "model" as in the Gundam robots with snap-tite (tm) parts. Instead of plastic bits, this model Earth is built with data. It's pretty to look at and has a lot of great details. It's a darned good-looking likeness of our world. (And no, I don't mean Google Earth, either.)

But it can't predict things like a scientific model.

One still needs the scientific model and hypothesis testing to make predictions and see what our world will be like in the future. This, in turn, also helps explain how or why things came before. The Google model just shows what currently is.
