Wal-Mart's Data Obsession

timothy posted more than 9 years ago | from the what-do-you-want-today dept.

Businesses 581

g8oz writes "The New York Times covers Wal-Mart's obsession with collecting sales data. Fun fact: 'Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts.' That much information results in some interesting data-mining. Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?"

"Nothing for you to see here. Please move along." (0)

Anonymous Coward | more than 9 years ago | (#10814549)

Looks like Wal-mart is hiding something

Re:"Nothing for you to see here. Please move along (4, Informative)

MaxPower2263 (529424) | more than 9 years ago | (#10814579)

Walmart probably doesn't even know what all that data means. Think of the processing power needed to make sense out of it all. I'm sure there are countless interesting trends that are lost in that data ocean.

Re:"Nothing for you to see here. Please move along (3, Insightful)

PKPerson (784484) | more than 9 years ago | (#10814694)

I would assume this data is more than just shopping trends. I guess it includes surveillance photos, employee data, backups of it all, etc. If it is all shopping trends, they are either very observant or stalkers.

Yeah (4, Funny)

Anonymous Coward | more than 9 years ago | (#10814550)

and shopping there means your income has dropped 7-fold

Re:Yeah (1, Funny)

Anonymous Coward | more than 9 years ago | (#10814687)

Sorta like running Linux.

Re:Yeah (5, Funny)

takeya (825259) | more than 9 years ago | (#10814741)

Wal-Mart has 460 terabytes of data stored on Teradata mainframes, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts. ...

normally, but I guess they didn't check when I was sharing my pr0n on direct connect.

I would have thought that the Internet had more. (4, Interesting)

caluml (551744) | more than 9 years ago | (#10814551)

Who says how much data the Internet has available?

Re:I would have thought that the Internet had more (1, Insightful)

nerd256 (794968) | more than 9 years ago | (#10814572)

I agree,
Wouldn't Walmart's records constitute some part of the internet also? It has to be connected at some point to the internet, and given some clever haXing skills... one could access it.

It really depends on your definition of the bounds of the internet, but I think someone is being hyperbolic.

Re:I would have thought that the Internet had more (1)

krymsin01 (700838) | more than 9 years ago | (#10814603)

Uh... It doesn't have to be connected at any point to the net. Seeing as how they want to keep it to themselves....

So, if Walmart put up a web interface... (3, Interesting)

DoorFrame (22108) | more than 9 years ago | (#10814622)

If Walmart created a web interface for their data, would the amount of data on the Internet suddenly triple?

I think the expert they got their information from was full of baloney.

Re:So, if Walmart put up a web interface... (5, Insightful)

Frnknstn (663642) | more than 9 years ago | (#10814695)

Firstly, there is no way they can be talking about all the data available on the internet. Filesharing networks alone have WAY more data than this, and when you add all the FTP servers and mirrors, the webmail archives, the home Windows users with insecure shares...

There is no way this can be true. Even if you ONLY take publicly available WWW pages, it would far exceed their measly estimate.

Re:I would have thought that the Internet had more (0)

Anonymous Coward | more than 9 years ago | (#10814625)

The people making that estimate are probably only counting 'legitimate' data on the WWW. They probably don't include, for example, data made available via file sharing, which would make 460TB look minuscule.

Re:I would have thought that the Internet had more (3, Interesting)

Deliveranc3 (629997) | more than 9 years ago | (#10814691)

Perhaps non redundant DATA?

Re:I would have thought that the Internet had more (2, Interesting)

hankwang (413283) | more than 9 years ago | (#10814646)

Who says how much data the Internet has available?

Google has 8E9 web pages and documents indexed. If the average document is 20 kB in length, then we have 160 TB of publicly available data on the internet, not including pictures and filesharing. The latter probably has a great deal of duplicate data anyway.
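The back-of-envelope figure above is easy to check. A minimal sketch, assuming decimal units (1 kB = 1,000 bytes, 1 TB = 10^12 bytes); the variable names are only illustrative:

```python
# Back-of-envelope estimate from the comment above.
# Assumption: decimal units (1 kB = 1,000 bytes, 1 TB = 10**12 bytes).
indexed_documents = 8_000_000_000   # "8E9" pages and documents in Google's index
avg_document_bytes = 20 * 1_000     # assumed average document size of 20 kB

total_bytes = indexed_documents * avg_document_bytes
total_terabytes = total_bytes / 10**12

print(f"{total_terabytes:.0f} TB")  # 160 TB
```

The result is only as good as the 20 kB guess; doubling the average document size doubles the total.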

Re:I would have thought that the Internet had more (4, Informative)

Hobbex (41473) | more than 9 years ago | (#10814688)


People who call themselves "experts" but are really just talking out of their asses do. Consider that The Internet Archive [wikipedia.org] alone contains more than a petabyte (1024 terabytes) of data, all of it accessible, and that they are adding on the order of 20 terabytes a day, and you start realizing how much bigger the Web is.

Re:I would have thought that the Internet had more (1)

cyberise (621539) | more than 9 years ago | (#10814717)

I'd say google would have a pretty good idea...

Re:I would have thought that the Internet had more (1, Informative)

Anonymous Coward | more than 9 years ago | (#10814722)

What constitutes the internet anyway? I know some dc hubs on the internet that have over 100TB, sure it's p2p, but what about archive.org? I know they have at least a few dozen TBs by themselves. That number in the article can't be right at all.

I, for one, (5, Funny)

fiftyfly (516990) | more than 9 years ago | (#10814552)

would like to welcome our new (evil) data collecting overlords.

Re:I, for one, (-1, Troll)

Anonymous Coward | more than 9 years ago | (#10814566)

suck the fuck up you cock sucker

Re:I, for one, (1)

dedeman (726830) | more than 9 years ago | (#10814601)

I don't think that they are new, by any stretch of the imagination. Of course, this story falls right on the heels of the same story I saw on CNN last night (or this morning, depending on your definition).

In Soviet Walmart... (1)

brq22 (776340) | more than 9 years ago | (#10814555)

...the tinfoil hats are $1.99 plus tax. Made in China.

FUCK the New York Times (-1, Redundant)

Anonymous Coward | more than 9 years ago | (#10814556)

FUCK.

Re:FUCK the New York Times (2, Funny)

UserGoogol (623581) | more than 9 years ago | (#10814605)

The moderation on this guy amuses the hell out of me. Instead of saying "Why can't you be nice? -1 Troll" you say "Yeah, I know. -1 Redundant."

Nothing to see here. Move along. (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#10814557)

But slashdot hid the story from me for 5 tries! Now I lost that boootiful first post I was gunning for!

230 terabytes? Please (5, Interesting)

Anonymous Coward | more than 9 years ago | (#10814569)

My company alone has over 50 terabytes of data available for download on the internet. Whoever thinks there's that little data on the internet is very poorly-informed.

Huh? (2, Interesting)

phoxix (161744) | more than 9 years ago | (#10814573)

I'd be highly surprised if the internet combined didn't reach the exabyte mark ...

Sunny Dubey

Re:Huh? (1, Insightful)

Anonymous Coward | more than 9 years ago | (#10814647)

That 460 terabyte mark sounds fishy.

There are 300 million Americans. Let's include Canada and a couple other countries Walmart is in to make it a round number.

460,000,000,000,000 /300,000,000 ~= 1,500,000

1.5 Megabytes per person?? I don't believe the average person has generated 1.5 megabytes of data at Walmart! If you listed every single item I ever bought at ANY store and even included timestamps, this would not reach 1.5 megs! These figures must be exaggerated and include a lot of redundancy.
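The division above works out as stated. A quick sketch, assuming decimal terabytes and the round 300 million shoppers used in the comment (both are the commenter's simplifications, not Wal-Mart figures):

```python
# Per-person share of the reported data store.
# Assumptions: 1 TB = 10**12 bytes, and a round 300 million shoppers.
walmart_bytes = 460 * 10**12
shoppers = 300_000_000

bytes_per_person = walmart_bytes / shoppers
print(f"{bytes_per_person / 10**6:.2f} MB per person")  # 1.53 MB per person
```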

Re:Huh? (1)

nizo (81281) | more than 9 years ago | (#10814674)

Heck just the duplicates of porn alone seems like it would be way beyond 600 terabytes.

Re:Huh? (1)

Spectre_03 (786637) | more than 9 years ago | (#10814739)

I personally would have to disagree, with a caveat that in actual data I think the article is correct, in capacity, an exabyte is easy to see as realistic.

All that aside though does anyone think that may be a bit excessive? Especially for a company starting to delve into RFID and likely not only tracking what is bought but with that kind of data store I would guess it's tracking your debit/credit card purchases against you and also knows not only that you like those pop tarts but your underwear size as well.

Anyone happen to think like me that this is a bad thing?

Haha... (5, Funny)

GR1NCH (671035) | more than 9 years ago | (#10814575)

you fools have no idea that I would never let you hurt the Wall-Mart

More than the Internet ?! (4, Funny)

architimmy (727047) | more than 9 years ago | (#10814576)

Someone at Walmart has A LOT of pr0n!

460 Terabytes (0)

Anonymous Coward | more than 9 years ago | (#10814577)

Think how much porn you could fit on 460 Terabytes!

Maybe I'm obsessed with data too :/

2004 = 1984 + 20; (1)

piquadratCH (749309) | more than 9 years ago | (#10814581)

George Orwell got it all wrong! It wasn't 1984, it's 2004!

Re:2004 = 1984 + 20; (3, Funny)

nwbvt (768631) | more than 9 years ago | (#10814658)

Yeah, all that evil marketing data is really oppressing the masses and restricting the free flow of ideas.

Re:2004 = 1984 + 20; (1)

Ziviyr (95582) | more than 9 years ago | (#10814714)

We love our two minutes of 50% off though!

Re:2004 = 1984 + 20; (1)

piquadratCH (749309) | more than 9 years ago | (#10814729)

No, but it's all part of a giant system of surveillance. Once all those pieces come together, 1984 will look like a birthday party.

Re:2004 = 1984 + 20; (2, Interesting)

nwbvt (768631) | more than 9 years ago | (#10814747)

Yeah, how dare they find out how many pairs of socks I've bought in the past year.

Listen, if you really are that paranoid, pay in cash. Then there is no way for the evil Wal-mart overlords to find you and force you to buy more pop tarts.

economies of scale (4, Insightful)

man_ls (248470) | more than 9 years ago | (#10814584)

When you have 460TB of data, how the hell do you even begin to search it?

Seems like they'd need to license map-reduce from google or something. (That's a distributed data correlation engine. With extremely high fault tolerance, to boot.)

Re:economies of scale (5, Insightful)

Sexy Bern (596779) | more than 9 years ago | (#10814613)

More to the point - how do they back it up?

Re:economies of scale (5, Funny)

seann (307009) | more than 9 years ago | (#10814627)

select sFirstName,sLastName,iPhone from LargeAssDatabase where bWelFare = False;

go on vacation for a week or ten..

deal with resulted data.

Re:economies of scale (0)

Anonymous Coward | more than 9 years ago | (#10814681)

I'd mod you up if I could but I can't so I won't. I'll expand on the joke.

UPDATE Customers SET [WhiteTrash] = True WHERE sales IS NOT NULL

Re:economies of scale (4, Informative)

antifoidulus (807088) | more than 9 years ago | (#10814746)

I know this is a joke but as far as I know, Wal-Mart does not collect individual customer names for most purchases, there is no customer card thing like there is at a lot of supermarkets. I suppose they could collect data via credit cards, but I doubt that is legal.....

Re:economies of scale (5, Informative)

kimanaw (795600) | more than 9 years ago | (#10814668)

When you have 460TB of data, how the hell do you even begin to search it?

With SQL.

Teradata was built to handle processing very large datasets from day 1. 460 Terabytes distributed across a large number of CPUs and disks working in parallel with a robust SQL implementation isn't really the challenge. The hard part is keeping all those disks spinning when you start pushing MTBF limits, handling the thousands of concurrent users all banging away at the data, and the constant streaming of new data into the system in order to support near real-time DSS.

For those inclined to know more, check here. [ncr.com]
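The idea described above, spreading rows across many units that each work on their own slice, can be illustrated with a toy sketch. This is not Teradata's actual implementation; the row layout, `NUM_UNITS`, and the function names are all hypothetical, and the units run one after another here rather than in parallel:

```python
from collections import Counter

NUM_UNITS = 4  # hypothetical number of processing units

# Hypothetical sales rows: (store_id, product, quantity)
sales = [
    (1, "pop tarts", 7),
    (2, "beer", 3),
    (1, "beer", 2),
    (3, "pop tarts", 1),
    (2, "pop tarts", 4),
]

def partition(rows, num_units):
    """Assign each row to a unit by hashing its store_id."""
    units = [[] for _ in range(num_units)]
    for row in rows:
        units[hash(row[0]) % num_units].append(row)
    return units

def local_totals(rows):
    """Each unit sums quantities per product over its own rows only."""
    totals = Counter()
    for _store, product, qty in rows:
        totals[product] += qty
    return totals

def total_sales(rows, num_units=NUM_UNITS):
    """Merge the per-unit partial sums into one answer."""
    merged = Counter()
    for unit_rows in partition(rows, num_units):
        merged.update(local_totals(unit_rows))
    return merged

print(total_sales(sales))
```

Because each unit only touches its own rows and the partial sums are small, the merge step stays cheap no matter how many rows each unit holds.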

Re:economies of scale (0)

evilviper (135110) | more than 9 years ago | (#10814673)

When you have 460TB of data, how the hell do you even begin to search it?

I would (first off) imagine it's not in one single database. You can probably address it in any number of different ways.

I'm sure they don't run "grep pop.tarts hugefile.txt"

Even full-text search should still be possible, if they confined it to just one store/region/etc.

Re:economies of scale (3, Informative)

sql*kitten (1359) | more than 9 years ago | (#10814684)

Seems like they'd need to license map-reduce from google or something.

As the article says, they're using Teradata [ncr.com] . This is not a product that I'd expect the average Slashbot, who thinks "IT" and "internet" are synonymous, to have heard of. Nevertheless, if you work with industrial amounts of data, you will know that Teradata databases can reasonably claim to be to Oracle as Oracle is to MySQL.

Re:economies of scale (0)

Anonymous Coward | more than 9 years ago | (#10814700)

how the hell do you even begin to search it?

Step 1: Remove Microsoft Access

That is... (1)

Wig (778245) | more than 9 years ago | (#10814587)

...that is...extremely....lame... I wonder, does rain increase Halo 2 sales?

Re:That is... (1)

beaverbrother (586749) | more than 9 years ago | (#10814664)

No, but Halo 2 sales do increase sick days.

You gotta love "experts" (4, Funny)

broothal (186066) | more than 9 years ago | (#10814590)

the Internet has less than half as much data, according to experts

What's the word I'm looking for? Oh yeah - it's bullshit

Re:You gotta love "experts" (1)

SJS (1851) | more than 9 years ago | (#10814682)

100 copies of the same 500k image file doesn't constitute 50 megs of data...
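One way to make that distinction concrete is to count each distinct file only once. A minimal sketch, assuming files are compared by a SHA-256 digest of their contents; the `unique_bytes` helper and the zero-filled stand-in "image" are made up for illustration:

```python
import hashlib

def unique_bytes(files):
    """Total size, counting each distinct content only once (by SHA-256 digest)."""
    seen = {}
    for data in files:
        seen[hashlib.sha256(data).digest()] = len(data)
    return sum(seen.values())

image = bytes(500_000)      # stand-in for a 500k image file
copies = [image] * 100      # 100 copies of the same file

print(sum(len(f) for f in copies))  # 50000000 raw bytes
print(unique_bytes(copies))         # 500000 unique bytes
```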

And in other news... (4, Insightful)

wesmills (18791) | more than 9 years ago | (#10814594)

...Microsoft has an astonishing amount of information collected from Windows Update users (none of it personally identifiable, of course).

I highly suspect Wal-Mart didn't get into the position it's in of being the largest retailer by being stupid, at least business-wise. This is the sort of project that allows them to stock a 120,000 square-foot big box store from JIT shipments every night, and why every Wal-Mart in a region looks the same. Though I would be interested to read more on the pop-tart to hurricane correlation...

Re:And in other news... (1)

micromoog (206608) | more than 9 years ago | (#10814737)

Yeah, Wal-Mart is pretty well accepted as the most technologically advanced retailer of all time. And I hear most of their business systems are home-grown, too.

the real interesting part is... (5, Funny)

Coneasfast (690509) | more than 9 years ago | (#10814595)

they're storing them on a huge cluster of their $200 lindows systems. ;)

according to "experts" (0)

Anonymous Coward | more than 9 years ago | (#10814599)

To put that in perspective, the Internet has less than half as much data, according to experts.

According to other experts, "In June, an average of 8 million P2P users were online at any one moment, with 1 petabyte of data available to share."

http://digital-lifestyles.info/display_page.asp?section=cm&id=1396 [digital-lifestyles.info]

Correlation doesn't imply causation!!!!! (5, Insightful)

Baldrson (78598) | more than 9 years ago | (#10814602)

Did you know hurricanes increase strawberry Pop Tarts sales 7-fold

Correlation doesn't imply causation!!!!!

I mean what if a third factor caused both the hurricanes and strawberry Pop Tart sales to increase 7-fold????

Somebody was going to blurt that bromide out at that statement, so it may as well be me.

Re:Correlation doesn't imply causation!!!!! (4, Insightful)

krymsin01 (700838) | more than 9 years ago | (#10814637)

It makes sense though. If you are going to ride out a storm, you are going to need lots of food that requires neither refrigeration nor cooking.

Beer makes sense also. There are always a hell of a lot of hurricane parties in Florida whenever a hurricane comes 'round.

Re:Correlation doesn't imply causation!!!!! (0)

Anonymous Coward | more than 9 years ago | (#10814683)

Hurricanes aren't really caused by anything meaningful (a butterfly in china?), so in this case correlation does imply causation.

Re:Correlation doesn't imply causation!!!!! (3, Insightful)

zbyte64 (720193) | more than 9 years ago | (#10814686)

Yes, there could be a third reason, but let's think about this. When a hurricane comes, you want non-perishable foods. Computer geeks like myself like poptarts cuz you just open them up and eat em, and those things don't go bad for a while. No need for a microwave or stove, something you would want for soup and such. So if a hurricane comes by and wipes out gas & electric and everything is friggen wet, you need something that requires no cooking or heating -> poptarts. Of course, 7-fold does seem a bit high.

Re:Correlation doesn't imply causation!!!!! (3, Funny)

Anonymous Coward | more than 9 years ago | (#10814728)

It's a well-known fact that hurricanes bring toasters and mini-fridges, so Pop Tarts and beer are logical purchases.

Re:Correlation doesn't imply causation!!!!! (1)

miyako (632510) | more than 9 years ago | (#10814744)

Although it is possible that a third factor caused both the hurricane and the increase in sales of strawberry pop tarts, it really seems reasonable that what is actually going on is that people stock up on food that they can eat without having to cook or add water to. Pop Tarts are a really common brand of a fairly popular type of food, and Strawberry is kind of the default flavor.

We can stop this! (1)

MrP- (45616) | more than 9 years ago | (#10814608)

We can stop this! We just have to kill The Heart of Wal-Mart.. It's a mirror towards the back of the store, just smash it and all will be ok.

Sure about that? (0)

Anonymous Coward | more than 9 years ago | (#10814609)

From summary "To put that in perspective, the Internet has less than half as much data"

Unless the mainframes are connected to the internet, in which case they're part of it. Does data have to be broadcast from a service to count?

Seen it! (5, Informative)

Number44 (41761) | more than 9 years ago | (#10814610)

As a guest of WalMart I was able to enter their data center and see this Terraplex first hand. It's massive. It's thousands upon thousands of disks in ~8' frames, rows upon rows of racks. I walked down it and across it and up it and was simply awestruck by the idea of that many disks in one spot.

The gentleman who gave me the tour indicated they have something like 72 weeks (1 year plus 2 weeks) of purchase data on LIVE disk arrays, plus huge archives of the same data on tape. If you buy anything and use your credit, debit, or whatever card they can figure out your sales history obscenely quickly. Be afraid. Be very afraid.

I also got to see Walmart.com (Sun E15k) and Samsclub.com (A bunch of HP boxes in a smallish frame), they were creepy, in a sense... all those sales going on at once, converging on a spot not a few feet from me.

Re:Seen it! (2, Informative)

nizo (81281) | more than 9 years ago | (#10814696)

I wonder how many people they have running around replacing failed disks in the arrays. It would have to be at least several full-time jobs worth of people, not to mention they must have a gigantic pile of disks waiting on-site.

Re:Seen it! (4, Insightful)

SamMichaels (213605) | more than 9 years ago | (#10814748)

The gentleman who gave me the tour indicated they have something like 72 weeks (1 year plus 2 weeks)

According to Google:

1 year = 52.177457 weeks

So 72 weeks is 1 year plus 19.822543 weeks.
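The subtraction is easy to reproduce. A tiny sketch that takes the weeks-per-year figure quoted above as given:

```python
# Weeks-per-year figure as quoted in the comment above.
WEEKS_PER_YEAR = 52.177457
weeks_on_disk = 72

extra_weeks = weeks_on_disk - WEEKS_PER_YEAR
print(f"{weeks_on_disk} weeks = 1 year plus {extra_weeks:.6f} weeks")
```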

Great data (1)

LordHatrus (763508) | more than 9 years ago | (#10814611)

Great, maybe they even have data on the average slashdotter; for instance, for every 3 people that read the article, a hurricane destroys northern Taiwan. ... now notice northern Taiwan isn't being hit by hurricanes... CONSPIRACY!!! PUT ON THE TINFOIL HATS!!!

couldnt be as bad as... (1)

dwgranth (578126) | more than 9 years ago | (#10814616)

Acxiom, who in my mind are far worse than data hoes... they sell your information to the highest bidder.. and that's their business model.. Wallyworld would never give up their data... for their own self interest of course

Every move you make (5, Funny)

cloudkj (685320) | more than 9 years ago | (#10814618)

Every step you take, every move you make
Every single day, every time you pay
Wal-mart will be watching you

Expert source (1)

wombatmobile (623057) | more than 9 years ago | (#10814620)

By its own count, Wal-Mart has 460 terabytes of data stored on Teradata mainframes, made by NCR, at its Bentonville headquarters. To put that in perspective, the Internet has less than half as much data, according to experts.

What experts?

The NYT doesn't say.

Want more information? You can buy [nytimes.com] some more from the New York Times.

Re: Southpark.. (1)

meff (170550) | more than 9 years ago | (#10814621)

Everyone go to the back of your Wal-Mart and smash the mirror behind the little door... NOW!

Oh, and don't forget to shop at Target ;) *evil laughter*

Please remind me (4, Funny)

nerd256 (794968) | more than 9 years ago | (#10814626)

I've been reading the comments
I forgot, are we supposed to hate Wal-Mart?

On one hand they are a large corporate empire and on the other, they promote cheap linux computers.

arg, I'm so confused

Pop Tarts (4, Funny)

evilviper (135110) | more than 9 years ago | (#10814628)

Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?

Yes I did. God help me!

Heh, lets see if this "predicting" works (5, Interesting)

UncleJam (786330) | more than 9 years ago | (#10814630)

A few years ago when I worked in retail, everything was going smoothly. Every night the managers would go around with electronic guns and see what needed ordering the next day. Except for the busiest times of the year the backroom was pretty much empty of stock, and on top of the aisles the extra stock was minimal.

Then one day, the managers were really excited, as we were going to have a computer order everything for us, from records of sales from before and it would "predict" what we would need. They said the extra stock on top of the aisles would be eliminated. We would be able to concentrate on customer service.

Well, the day came, and for a few months you could tell the computer was fighting with limited data. Some weeks would be ridiculously overstocked on a few items, others, the leading sellers in the store would have empty shelves. When it finally settled down after a year, it was worse than before the computer.

The top of aisles were jammed to the ceiling with stock, there was never any room to put anything up there, and getting to the bottom for something you needed cost a lot of time. Plus, the backroom was packed with stock. You could hardly move around, and trying to find the last box of something buried underneath these huge piles was a task that killed your morale. During the slow months, one stocker for the whole store was enough for a night, now 3 were common to deal with all the stock.

Re:Heh, lets see if this "predicting" works (1, Funny)

Anonymous Coward | more than 9 years ago | (#10814661)

You worked for Kmart, right?

programming for retail (1)

loid_void (740416) | more than 9 years ago | (#10814732)

Wal Mart has the most sophisticated retail and inventory control programs in the world. This is the reason they have eaten everyone's lunch.

Re:Heh, lets see if this "predicting" works (1)

magarity (164372) | more than 9 years ago | (#10814749)

Well, the day came, and for a few months you could tell the computer was fighting with limited data. Some weeks would be ridiculously overstocked on a few items, others, the leading sellers in the store would have empty shelves. When it finally settled down after a year, it was worse than before the computer.

So I hope everyone realises the point of the above anecdote is that a badly programmed computer system is MUCH worse than no computer system. However, and this the average slashdotter can probably understand, it's just as important that a properly programmed system can lead to tremendous benefits. See: Wal-Mart. Beware vendor sales reps claiming their product is your one-stop panacea! Sounds like that's what happened to this person's company!

Compressed? (1)

tiredwired (525324) | more than 9 years ago | (#10814633)

The article does not mention if that is compressed data or not. It seems like inventory & sales data should compress really well.

460 terabytes? (1, Funny)

Anonymous Coward | more than 9 years ago | (#10814640)

460 terabytes? chargen would seem to disagree.

Only 230TB on the internet LOLOLOL (0)

Anonymous Coward | more than 9 years ago | (#10814641)

i've seen single DC hubs that store more than Walmart

Teradata mainframes? (1)

Kaemaril (266849) | more than 9 years ago | (#10814644)

Wow, I didn't realise they still made mainframes. Ever since the DBC/1012 I thought they just ran Teradata software emulated under Unix or NT.

Now the DBC/1012's, with the hardware AMPs ... things of beauty :)

Re:Teradata mainframes? (1)

kimanaw (795600) | more than 9 years ago | (#10814713)

Now the DBC/1012's, with the hardware AMPs ... things of beauty :)

Esp. when the cards and disks are replaced by a beer keg and tap!

how much space do you need to describe pop-tarts (1)

yorkpaddy (830859) | more than 9 years ago | (#10814652)

I know walmart does an amazing amount of business, but I still don't see how their CRM system needs 400 terabytes. How much space do you need to say, "person A bought pop tarts, a CD, and milk on 11/14/04"

Re:how much space do you need to describe pop-tart (1)

Cheeze (12756) | more than 9 years ago | (#10814707)

internet website search for poptarts...
looking up geography of ip address....got it ... ...
purchase of poptarts within 20 minutes at walmart 5.3 miles from website search.
ip address also searched for toaster ovens but there was no purchase...better send an order for more ovens to that store. ...
contacting ip provider...got it
assimilating customer data...got it
sending snail mail to address about new toaster ovens at local walmart with 10% off ad...

Yeah, okay... (1)

SamMichaels (213605) | more than 9 years ago | (#10814653)

They're also one of the most successful businesses in the country next to Microsoft. Maybe the data is working.

Nope, its location. (4, Informative)

mekkab (133181) | more than 9 years ago | (#10814736)

We learned a lot about Walmart and Data mining in my database 101 class. And the professor asks "Why do you think Walmart is so successful?"

And everyone says something about leveraging technology and JIT delivery, etc.

Professor Liu [jhu.edu] says "Nope. Location."
Walmart chose most of their initial locations in cities/regions where there was no other competition. Places where there was no Kmart, no department stores, no malls. And they flourished.

Hmm... (1)

northcat (827059) | more than 9 years ago | (#10814655)

Wal-Mart has 460 terabytes of data

The Internet Archive [archive.org] has 100 terabytes of data.

even the mango is tracked (4, Insightful)

loid_void (740416) | more than 9 years ago | (#10814659)

My brother sells mangoes to the Wal Mart Beast. He says it's all computerized, beginning with an order for the fruit, following the trucks, even the rotation of the ripening process in the warehouses is computer related. It's as close to virtual management as any company comes.

Just imagine (2, Insightful)

nizo (81281) | more than 9 years ago | (#10814660)

Imagine what evil could be done with this data: how about a service where you can track your spouse's/SO's buying habits? See if they buy condoms and flowers every night they work late for example. Imagine what would happen if they started keeping track of fingerprint data off of cash/checks that people use in stores too. Well I am off to go buy some tin foil now (with cash, wearing gloves) :-)

Re:Just imagine (1)

Ziviyr (95582) | more than 9 years ago | (#10814675)

No no no, they can track the cash implants via satellite. Buy your stuff with shiny rocks!

Upload or Bittorrent it! (1)

GabrielPM (633823) | more than 9 years ago | (#10814672)

They should put it online, so the Internet can triple (if you buy that "twice the internet" part).
Or maybe get a torrent started. That data sounds juicy!

There's a name for this.. (4, Insightful)

k98sven (324383) | more than 9 years ago | (#10814676)

The Law of truly large numbers. [skepdic.com]

Basically, the more data you have, the more likely you'll find weird coincidental correlations.

I guess these kinds of 'statistical findings' will become more and more prevalent in the future, given that we're living in an age where we're collecting ever-larger amounts of data, and have the resources to process all this data automatically.

It would be a good thing if people were a bit more sceptical of this kind of stuff. Correlation isn't causation.
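The effect is easy to reproduce with nothing but noise. A rough sketch, assuming a hypothetical ten-week "hurricane activity" series and a pile of equally random "product sales" series; none of these numbers come from the article:

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

random.seed(0)  # fixed seed so the run is repeatable
WEEKS = 10

# Hypothetical target: ten weeks of pure noise standing in for hurricane activity.
target = [random.gauss(0, 1) for _ in range(WEEKS)]

def strongest_correlation(num_products):
    """Largest |r| between the target and num_products unrelated random series."""
    return max(
        abs(pearson(target, [random.gauss(0, 1) for _ in range(WEEKS)]))
        for _ in range(num_products)
    )

for num_products in (10, 1_000, 10_000):
    print(num_products, round(strongest_correlation(num_products), 2))
```

The more unrelated series you scan, the stronger the best "match" tends to look, even though every series here is random by construction.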

Re:There's a name for this.. (4, Insightful)

sql*kitten (1359) | more than 9 years ago | (#10814720)

It would be a good thing if people were a bit more sceptical of this kind of stuff.

Ermm, RTFA.
  1. They predicted that pop tart sales would increase
  2. They shipped additional pop tarts in anticipation
  3. The pop tarts sold like, umm, hot pop tarts

You can be skeptical all you want. Someone at Walmart made the call, and they were right.

Speaking of food trends, stop buying yeast! (3, Funny)

JoeShmoe950 (605274) | more than 9 years ago | (#10814677)

Did you know?
EVERY TIME A LOAF OF BREAD IS BAKED,
APPROXIMATELY
150,000,000 YEASTS ARE
KILLED

Come to the award-winning 1987 film,
"The Very Small and Quiet Screams"
-- a cinematic electromicrograph of yeasts being baked.

A must for those who care about yeast, and especially for those who don't.

SPONSORED BY
Brown Anaerobe Rights Coalition (BARC)
Student Bakers for Social Responsibility
Coalition for the Elevation of Life (CELL)

Defend all life: "From greatest to least, from human to yeast!"

I smell a lie (1)

Zoko Siman (585929) | more than 9 years ago | (#10814685)

The internet archive has a lot more info [archive.org] than that. And grows by a lot each month. If they think Walmart's 460 TB of data is more than the internet I'd wager that they're wrong.

The Internet? (1)

methangel (191461) | more than 9 years ago | (#10814701)

The "Internet" has a hell of a lot more data than what the article stated. I don't know about you, the last time I checked, the Internet is a collective of Web Pages, Usenet, IRC, Sharing Networks, etc.

Hell, DC++ (Direct Connect Client/Server) has had more than 500 terabytes of shared data in several of my favorite hubs.

My guess is that the "expert" is Al Gore.

The Problem? (3, Informative)

squirel_dude (810037) | more than 9 years ago | (#10814709)

I hate to sound like some pro-totalitarian next generation Big Brother, but it's not as if they are collecting personal information on customers without the customer's consent. Wal-Mart are just doing some major (I agree with obsessive though) market research so as they can optimise their stores to maximise profits, exactly the same as every other business in the world.

That doesn't mean they know what to do with it... (3, Interesting)

Anonymous Coward | more than 9 years ago | (#10814711)

Coworkers who have worked with Wal-mart IT tell me that Wal-mart does indeed have mountains of data. However, they have so much data that they do not know what to do about it. They can't interpret it all because there is just too much of it.

This makes me wonder... there must be some ideal point where a certain amount of data collected is worth the most money because you can act on that data. After that point, collecting additional data is increasingly more costly and counterproductive unless you invest in an infrastructure that lets you process more data. How does one figure out that ideal point? Just a thought.

Did you know... (5, Interesting)

GoMMiX (748510) | more than 9 years ago | (#10814718)

Wal-Mart employees who use their employee discount cards have every purchase tracked and monitored.

Activity of the cards is ACTUALLY monitored for discrepancies in buying habits to find abusive employees who buy things for their friends?

Did you also know Wal-Mart's employee name badges have RFID tags (and have had for many years) that allow Wal-Mart to track where an employee is at any given time?

Another interesting tidbit, did you know at Wal-Mart's Jewelry warehouses they actually WEIGH the amount of metal in your body when you enter and leave? (And I don't mean they ask you to put things in a dish and weigh the dish - they scan YOU)

Another interesting thing, Wal-Mart has a fallout facility in Oklahoma that has a near-real-time backup of each BIT of that 460 terabytes of data?
Wal-Mart could survive a direct nuclear blast and still keep on a truckin'.

And, of course, if you're in a Wal-Mart home office - ISD building - distribution center - et al... and dial 911 - BOOM - you get Wal-Mart's private security? Niiice, hope it's not a real emergency, you first have to explain it to them - then if they deem it necessary THEY will call the REAL 911!

And I really hope it's not on SQL (2, Informative)

The-Bus (138060) | more than 9 years ago | (#10814733)

To put that in perspective, the Internet has less than half as much data, according to experts.'


How the hell can they estimate that? Assuming "less than half" means about 45%, that gives us about 207 TB. Let's just round that up to 240.148445 TB to make it a nice, even number.

Google is searching 8,058,044,651 "webpages"* -- who knows what that means. Now, Google isn't searching every single page on the internet, certainly. But also, they can't be searching pages that don't exist. So the 8bn Google pages aren't certainly all the internet. But Google isn't double or triple counting pages. Still, at 240.148445 TB (my rough estimate), we come up with a page size of exactly 32KB per page.**

Is this just counting the text? The code for this page right here (comments.pl) weighs in at about 14KB. Wal-Mart, in no way, has twice as much info as the internet. I would say the "internet" should be measured in at least petabytes. Archive.org itself already has 1PB, and I consider any of that content available to me "on the internet".

* I'm not even counting the Google cache.
** Which means Mr. Gates over-estimated by a factor of 20 when considering how much memory we all needed!
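For anyone wondering where the "nice, even number" comes from, it falls out if you assume binary units. A short sketch, assuming 1 KB = 1024 bytes and 1 TB = 2^40 bytes (an assumption about how the figure was built, not something stated above):

```python
# Reconstructing the "rough estimate" under binary units.
# Assumptions: 1 KB = 1024 bytes, 1 TB = 2**40 bytes.
pages = 8_058_044_651          # Google's page count quoted above
bytes_per_page = 32 * 1024     # "exactly 32KB per page"
BYTES_PER_TB = 2**40

total_tb = pages * bytes_per_page / BYTES_PER_TB
print(f"{total_tb:.6f} TB")    # about 240.148445 TB
```

The 45% figure checks out the same way: 0.45 × 460 TB is 207 TB.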

I'm not afraid.. (3, Funny)

Robber Baron (112304) | more than 9 years ago | (#10814735)

Did you know hurricanes increase strawberry Pop Tarts sales 7-fold?

...and if you needed a 460 TB data array to tell you that then you're too stupid to live.