Slashdot: News for Nerds

Dell Says 90% of Recorded Business Data Is Never Read

timothy posted about 4 years ago | from the sounds-lowball-no-matter-the-methodology dept.

Businesses 224

Barence writes "According to a Dell briefing given to PC Pro, 90% of company data is written once and never read again. If Dell's observation about dead weight is right, then it could easily turn out that splitting your data between live and old, fast and slow, work-in-progress versus archive, will become the dominant way to price and specify your servers and network architectures in the future. 'The only remaining question will then be: why on earth did we squander so much money by not thinking this way until now?'" As the writer points out, the "90 percent" figure is ambiguous, to put it lightly.

224 comments

Coincidence? (5, Funny)

Hognoxious (631665) | about 4 years ago | (#32859370)

90% - just like the percentage of statistics that are made up on the spot.

Re:Coincidence? (5, Funny)

dov_0 (1438253) | about 4 years ago | (#32859728)

Or is dell about to make a press release about faulty storage in their servers resulting in about 90% data loss?

Re:Coincidence? (5, Funny)

espiesp (1251084) | about 4 years ago | (#32860008)

Or having developed a new memory technology.

"Dell releases a new drive based on their patented WORN architecture. Because this device forgoes the need to read your data they can be made lighter and faster and more power efficient than even the latest SSD drive technology."

Re:Coincidence? (0)

Anonymous Coward | about 4 years ago | (#32860208)

90% of Slashdot comments are written, never having read TFA.

Which 90% ? (5, Insightful)

mbone (558574) | about 4 years ago | (#32859380)

I could believe the 90% number. There is plenty of data sitting around in case it is needed. Some of it will be needed. Much of it won't be. How do you predict which is which?

Re:Which 90% ? (5, Insightful)

eldavojohn (898314) | about 4 years ago | (#32859404)

I could believe the 90% number. There is plenty of data sitting around in case it is needed. Some of it will be needed. Much of it won't be. How do you predict which is which?

Yeah, as someone who has implemented a few auditing solutions where I work, I must confess that it seems to be 99% of the data we archive is never looked at again. A lot of it is due to policies and is only used after something goes dreadfully wrong. If they are well thought out, the metrics can be collected as the data is written instead of needing to search across the data.
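A minimal sketch of collecting a metric at write time rather than searching the archive later, in Python. The `AuditLog` class, its fields, and the event names are hypothetical and do not come from any system described in this thread.

```python
from collections import Counter


class AuditLog:
    """Append-only audit log that updates running counters on every write."""

    def __init__(self):
        self.records = []          # archived records, rarely read back
        self.by_event = Counter()  # metric maintained as data is written

    def write(self, user, event):
        # Keep the full record in case an incident needs investigating...
        self.records.append({"user": user, "event": event})
        # ...and update the metric now, so reporting never scans the archive.
        self.by_event[event] += 1


log = AuditLog()
log.write("alice", "login")
log.write("bob", "login")
log.write("alice", "delete")
print(log.by_event["login"])
```

The trade-off is that only metrics thought of in advance are available this way; any new metric still needs a pass over the stored records.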

I think their "90% dead-weight rule" is really a misnomer as you could probably claim that 90% of Google's indexing is never read but we all know that it's the potential that data holds that makes it so valuable and necessary. If Google knew every future possible search then they could delete the data they will never use ... but how do they know they will never use it? How do I know that the auditing data will never have a use--by new metric or incident investigation? The truth is simply that you don't.

Re:Which 90% ? (2, Insightful)

sco08y (615665) | about 4 years ago | (#32859632)

I think their "90% dead-weight rule" is really a misnomer as you could probably claim that 90% of Google's indexing is never read but we all know that it's the potential that data holds that makes it so valuable and necessary.

Another problem is figuring out _why_ data isn't used before archiving it. Is it not useful, or are the tools not in place to use it?

If companies decide that the x% least used data will be shoved away in the attic, then "x% of data isn't useful" becomes a self-fulfilling prophecy.

Re:Which 90% ? (1)

shentino (1139071) | about 4 years ago | (#32859644)

Just like 90 percent of the time you don't need to file an insurance claim, but when you do, you really do need it.

It's just insurance.

Sorta like how we have a big military that is spending more time in training than actual combat.

Cost of storage (1)

mangu (126918) | about 4 years ago | (#32860144)

it's the potential that data holds that makes it so valuable and necessary

What matters is the cost/benefit ratio.

The potential for the data being valuable may be very low, but the cost of storing it is going down all the time. Disk space today is a dime a gigabyte, so let's keep it just in case.

Re:Cost of storage (1)

vivian (156520) | about 4 years ago | (#32860354)

The real cost is not storing it - but rather the cost in recording all that info in the first place. Someone has to type in all that data to start with, and possibly someone else has to at least glance at the resulting reams of reports that are produced from it.

It is all too tempting to create database apps to record all sorts of information "just in case", but more often than not all you end up doing is making the system more complex than it has to be, and more time consuming in maintenance of both the application and the data.

Re:Which 90% ? (1)

AnonymousClown (1788472) | about 4 years ago | (#32859408)

Name, address, phone #, and shit purchased.

Anything else is a waste.

Re:Which 90% ? (2, Funny)

camperdave (969942) | about 4 years ago | (#32859706)

People change addresses and phone numbers at the drop of a hat, so recording that would be pointless.

Re:Which 90% ? (1)

ta bu shi da yu (687699) | about 4 years ago | (#32859998)

HIPAA laws say differently. And fair enough too - they should definitely be retaining your medical records for up to 7 years!

It's funny that the last sentence in this slashdot piece asks why didn't people do this before. They did, and indeed they still do! People have secondary and even tertiary backup - in fact I happen to know that even stodgy old EMC have made a mint out of their Centera storage devices for this sort of thing. It's called Content Addressable Storage [wikipedia.org] , and despite a particularly brain-dead mechanism for addressing stored data (the content becomes the address - it gets hashed!), it's been pretty popular in the marketplace.
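To illustrate "the content becomes the address", here is a rough Python sketch of a content-addressed store. It is not a description of how any EMC product works internally; the `ContentStore` class is hypothetical, and SHA-256 is simply an assumed choice of hash.

```python
import hashlib


class ContentStore:
    """Toy in-memory store where each blob is addressed by its own hash."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        # The address is derived from the content itself.
        address = hashlib.sha256(data).hexdigest()
        self._blobs[address] = data
        return address

    def get(self, address: str) -> bytes:
        return self._blobs[address]


store = ContentStore()
first = store.put(b"quarterly report")
second = store.put(b"quarterly report")  # identical content, same address
print(first == second)
```

One consequence of this design is that storing the same content twice takes no extra space, and a stored object cannot be modified without its address changing, which suits write-once archives.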

Re:Which 90% ? (2, Interesting)

Anonymous Coward | about 4 years ago | (#32859574)

I work for a large resource company and we collect loads of data. Some of it is valuable today and some of it will be valuable tomorrow. Interestingly, what is of value tomorrow depends on how mature our data consumption is today.

so we collect the data not because it's of value today, but because we might analyse it tomorrow in a new way.

Re:Which 90% ? (3, Interesting)

alexhs (877055) | about 4 years ago | (#32859790)

If each piece of data has 90% probability of not being read again...

You discard only 10 pieces out of 100, or out of 1 billion, whatever...

The probability that none of these 10 pieces of data would ever have been needed again is 0.9^10 ≈ 0.349 = 34.9%, so there is roughly a 65% chance that you would have needed at least one of them.

Which means that you keep all of your data.

Caveats :

  • This assumes that all pieces have equal interest (but maybe you store a field that the interface doesn't allow you to retrieve).
  • Assuming random access to the 10% that is used, removing 10 pieces out of 100 gives a much higher retrieval failure rate than removing 10 out of a billion. Some retrieval failure rate could be acceptable.
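The arithmetic above can be checked with a few lines of Python. This sketch assumes, as the comment does, that every piece of data has the same independent probability of never being read again; the function name is made up for illustration.

```python
def prob_none_needed(p_unread: float, discarded: int) -> float:
    """Probability that none of the discarded pieces is ever needed again,
    assuming each piece is independently unread with probability p_unread."""
    return p_unread ** discarded


p = prob_none_needed(0.9, 10)
print(f"none needed: {p:.4f}")              # 0.9 ** 10
print(f"at least one needed: {1 - p:.4f}")  # complement
```

Under that independence assumption, the chance of getting away with a deletion shrinks quickly as the number of discarded pieces grows.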

Re:Which 90% ? (0)

Anonymous Coward | about 4 years ago | (#32859836)

I worked for a company like this once. All incoming, outgoing faxes, and fax confirmations (physical copies) were stored for 6+ months on the off chance something needed to be verified. There were lots of errors in our system and our vendor's system, so who had the oldest archived copy usually ended up not having to pay for that particular mistake. My job was to archive and label all these, and then rotate out the oldest and fill the old, now emptied boxes with newer papers. Sales quotes (on paper!) were saved for 3-5 years. This was a full time position, rotating paper, so we could save about a thousand dollars a week in fees and clerical errors. What we saved on clerical errors almost paid for my salary+benefits. I ended up engineering myself out of the job when I suggested we could just scan in all the documents for the day that I was archiving, and then have a monthly checklist to delete anything older than 5-7 years if we ran out of hard disk space. The job went from a full time salaried position to a 3 hour a week job that a college intern in a short skirt did on friday mornings scanning in papers and making coffee. Shockingly, this happened in 2008, not 1958.

Re:Which 90% ? (1)

bryonak (836632) | about 4 years ago | (#32860258)

The same applies to backups as well.
Almost all of the backups you make are never actually needed, but that's a weak argument to forgo them.

Hope They Don't Want the Z-Series (2, Insightful)

eldavojohn (898314) | about 4 years ago | (#32859382)

From the article:

Opportunity too good to pass up

It was just about then that one of my favourite bargain-hunting websites turned up a device called the CORAID EtherDrive. Take a look at the product range at CORAID, but don’t spend too long on it.

That's the same device from a story I submitted yesterday [slashdot.org] . I hope they don't plan on getting a Z-Series running ZFS.

It's true and pretty well known (0)

Anonymous Coward | about 4 years ago | (#32859384)

Or at least I've certainly heard it before, that in large storage systems, the average number of times that a file is accessed during its lifetime is less than one. That is, some files are accessed lots of times, but most are never accessed.

That's certainly true of lots of the files I use. For example, I shoot a lot of digital photos and upload them to my computer. A few of them get a lot of use, the rest just sit there occupying some tens of GB of disk space (which is pretty cheap these days) and are never accessed except for the occasional migration to a new disk drive.

which 90% (3, Insightful)

marmusa (557884) | about 4 years ago | (#32859386)

Which 90% though? Like the Coca Cola exec who remarked that he was pretty sure half of his advertising budget was wasted, he just wasn't sure which half.

Re:which 90% (1)

dohzer (867770) | about 4 years ago | (#32859410)

Exactly. You don't know what you need until you need it, which is why you record all that you can so in the odd case that you do require it, it's where you need it.
Deciding how quickly and frequently you will require the data is a separate problem.

Re:which 90% (5, Informative)

Koby77 (992785) | about 4 years ago | (#32859600)

I worked in a call center, and I can definitely believe that 90% of the data is never read again. However, when a customer is calling back (and is angry!), you don't have time on a live call to wait to see what's up with the account. Also there can be some litigious aspects, and a lot of information was recorded for C.Y.A. purposes. Again, you never know which part is needed for C.Y.A. purposes, but that 10% sure is valuable.

So yeah, we needed to store ALL the account information, and we needed fast access to ALL of it ALL the time.

Re:which 90% (2, Informative)

bwintx (813768) | about 4 years ago | (#32859770)

Like the Coca Cola exec who remarked that he was pretty sure half of his advertising budget was wasted, he just wasn't sure which half.

FWIW, and pointing this out only because I've seen this quote referenced so many times over the years...

John Wanamaker, a 19th century entrepreneur, Lord Leverhulme, founder of consumer goods giant Unilever, and Franklin Winfield Woolworth, the founder of Woolworth's, have all been credited with the quote: "I know that half of my advertising is wasted. I just don't know which half."

-- Citation [businessop...dideas.com]
-- Google search [google.com]

It's like Office features (5, Informative)

drinkypoo (153816) | about 4 years ago | (#32859388)

People always bitch that they have to pay for Microsoft (or whatever) Office's features because they only use 5% of its functionality. But you buy all those features at once because you don't know which you will need in the future. Data warehousing is the same way. If you start taking data offline you'll just need that data. That's why analyses of very large data sets are performed before archiving.

But what is really wanted is a way to cluster the database servers, with old data automatically cycled to the slowest, most remote nodes, and with the most frequently-altered data heavily replicated and aggressively synchronized.

Re:It's like Office features (2, Insightful)

1u3hr (530656) | about 4 years ago | (#32859480)

People always bitch that they have to pay for Microsoft (or whatever) Office's features because they only use 5% of its functionality. But you buy all those features at once because you don't know which you will need in the future.

Bullshit. True only if you've never used a wordprocessor in your life before. If you have, you know what you use. And you can read the description of other features to decide if you want them.

And this is a pointless analogy because if in the future you decide you do need the 3D porn embedding, you can upgrade to get it. If you don't backup some of your data, you can never change your mind if you find you need it 10 years later.

Re:It's like Office features (2, Insightful)

drinkypoo (153816) | about 4 years ago | (#32859628)

Bullshit. True only if you've never used a wordprocessor in your life before. If you have, you know what you use. And you can read the description of other features to decide if you want them.

It doesn't make it unreasonable to purchase a lighter word processor with less features, but I for one would not want to support a word processor where you buy access to toolbar buttons. And if I'm doing database reporting (for which I have been paid in the past) I would not want to have to request that pieces of data be reloaded into the database so I can perform analyses. And further, if I have to do a year-by-year analysis, I do not want to have to load and unload data sets, crunching one year at a time. I want to build one report that goes forth and executes subreports to produce year-by-year reports without me having to sit at my desk and watch Crystal Reports grinding.

Re:It's like Office features (0)

Anonymous Coward | about 4 years ago | (#32860338)

I could take the nickel-and-diming if it was ultimately going to cost less. The problem is, I'd probably have to use that feature *at work*, where the purchase process is typically a complex arcane bureaucratic process that is best summed up as "you'll be dead first".

(captcha, apropos enough, is "keeled")

Re:It's like Office features (2, Insightful)

icebraining (1313345) | about 4 years ago | (#32859578)

No, I think Office features are different; everyone only uses 5%, but each person uses a different 5%.

Re:It's like Office features (0)

Anonymous Coward | about 4 years ago | (#32859622)

That isn't really expanding on anything to be honest.
5% is still 5%, it isn't 100%, which is what they are paying for.

I remember a time when Microsoft used to go on about snap-ins and modular programs. Yeah, that was a good time.
Now we just have monolithic crapware, that, yep, you guessed it, barely anybody uses!
What happened to that awesome modular Windows 7 they went on about for a year+? Oh yeah, that's right, never existed since they took the lazy and fast route and just made a service pack to Vista and renamed it. Maybe Windows 8, maybe. We will get our Modular Windows one day. /mini-rant

It's not like it is hard to make modular software. They could even add trials in for each of the features so people can download it on the spot and try it out.
Microsoft are too stupid to realize just how BIG a market this could make for them if they sold Office cheaper and add in paid-for upgrades.

Re:It's like Office features (0)

Anonymous Coward | about 4 years ago | (#32859898)

A modular Windows will never be titled under the Windows family; it will be too different. It may be a new operating system entirely, possibly underneath Singularity. Old programs won't work (by necessity) unless they run in a VM.

I look forward to this day... Microsoft has the unique capability to create the only good user-oriented OS since BeOS. Apple could be argued as being another contender, but they don't have the experience to ground-up design a massive-scale project in this field IMO. Although, lately I begin to question the future of the general-purpose OS entirely.

Re:It's like Office features (0)

Anonymous Coward | about 4 years ago | (#32859832)

No, I think Office features are different; everyone only uses 5%, but each person uses a different 5%.

I like these phrases with a punch, that deliver the idea clearly and shortly. But if it was true, then 100%/5% = 20 total Office users that use different 5%.

Things get further complicated by the fact Office does have multiple tiers (Starter, Home, Edu, Enterprise), so it looks like Microsoft wouldn't agree with the above.

Re:It's like Office features (1, Insightful)

Anonymous Coward | about 4 years ago | (#32859616)

But what is really wanted is a way to cluster the database servers, with old data automatically cycled to the slowest, most remote nodes, and with the most frequently-altered data heavily replicated and aggressively synchronized.

George Santayana: "Progress, far from consisting in change, depends on retentiveness. When change is absolute there remains no being to improve and no direction is set for possible improvement: and when experience is not retained, as among savages, infancy is perpetual. Those who cannot remember the past are condemned to repeat it."

The concept and implementations of hierarchical storage (http://en.wikipedia.org/wiki/Hierarchical_storage_management [wikipedia.org]) are several decades old in the mainframe world. Why did "we squander so much money by not thinking this way until now"? Because "we" are savages/infants who refuse to retain experience.

Re:It's like Office features (0)

Anonymous Coward | about 4 years ago | (#32859850)

Not just mainframes. Windows used to support this [microsoft.com] but the feature got removed because nobody used it.

Just like /. (1)

mtmra70 (964928) | about 4 years ago | (#32859390)

Wow, this percentage is the same as /. articles! Well, at least I assume - I haven't read the article.

Re:Just like /. (1)

sco08y (615665) | about 4 years ago | (#32859640)

Wow, this percentage is the same as /. articles!

Much more than 90%. When it comes to uselessness, /. has a rock solid 5 9's methodology.

Actually, there's one more question: (1)

AnonymousClown (1788472) | about 4 years ago | (#32859400)

It’s an odd statistic. How is that data measured? 90% of all documents? 90% of stored bytes? When they said “ever again” did they mean explicitly retrieved by name, or should we include free text searches in that statistic? How long an interval needs to pass before some piece of data is clearly identified as belonging to the 90%, so that steps can be taken to reflect its reduced importance?
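As a small illustration of why the unit of measurement matters, the Python sketch below computes the "never read" share of one made-up set of files, first by file count and then by bytes. The file sizes and flags are invented for the example and are not data from Dell or PC Pro.

```python
# Each entry is (size_in_bytes, read_again). Invented sample data:
# one larger file that is read again, nine small ones that never are.
files = [(9_000, True)] + [(1_000, False)] * 9


def unread_fraction_by_count(entries):
    unread = sum(1 for _, read_again in entries if not read_again)
    return unread / len(entries)


def unread_fraction_by_bytes(entries):
    total = sum(size for size, _ in entries)
    unread = sum(size for size, read_again in entries if not read_again)
    return unread / total


by_count = unread_fraction_by_count(files)
by_bytes = unread_fraction_by_bytes(files)
print(by_count, by_bytes)
```

With this sample, the same data is "90% never read" by document count but only half never read by stored bytes, so a bare percentage says little without the method behind it.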

Why is so much data being collected? They should go back and review what data they're collecting and why.

The problem is "Write-only" applications (4, Insightful)

shoppa (464619) | about 4 years ago | (#32859406)

Interesting that this seems to have been written up as a "hardware" or "storage" topic.

The problem is, that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

For example, some people worry about privacy when they go to the grocery store and know that all their purchases are being tracked by their loyalty card, or worry that the big bad US government is tapping all the E-mail.

In fact, I'm 100% sure that some IT geek had some wet dream years ago about recording everybody's purchases, E-mail, and phone calls, and it's being done every which way.

The true "IT application" issue is that there is no real business need for this data 99.999% of the time. It gets recorded, probably gets staged off to tape, maybe indexed in some giant table, and then ... sits there for years with no actual need for it.

I'm sure the IT geeks who dreamed up the technical ability to record all this stuff, thought they were hot shit when they came up with it. Oh, man, those IT architects were just having a big go-round whipping this problem in scalability. In their heads, they were gonna record everything on disk, then go home and fuck the prom queen.

Re:The problem is "Write-only" applications (5, Insightful)

mikael_j (106439) | about 4 years ago | (#32859486)

The problem is, that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

These plans mostly come into being because us "IT people" (read: developers) know that the "business people" love changing the specs and they'll blame us if they want to start using data they didn't ask us to save and we tell them we can't save data retroactively (really, they'll basically blame the developers for not being able to time-travel). This is why we'd rather save everything than not save enough.

Re:The problem is "Write-only" applications (2, Insightful)

DerekLyons (302214) | about 4 years ago | (#32859536)

The problem is, that IT people dream up all these "write only" applications that record data, without any rational plan for what the data might actually be used for in the business.

Seems to me that the IT folks shouldn't be making these decisions (what data to capture and store) any more than they should be deciding what to stock for the Memorial Day sale.

Re:The problem is "Write-only" applications (1)

zarzu (1581721) | about 4 years ago | (#32859654)

yes, exactly! don't we all know this scenario where the management talks to the it head and is all like:

"we need a system to save customer data. what data is really no concern to us, you decide and we see that our customers provide it! you just go ahead and implement whatever kind of sophisticated system you see fit, we'll pay for everything, tell us when it's done and we'll launch it! there's no need to check back with anyone here, we're sure you're gonna find the most profitable solution to this! we love you it guys!"

This isn't a 'new way of thinking' (5, Insightful)

sirwired (27582) | about 4 years ago | (#32859412)

Automated Hierarchical Storage Management has literally been around for decades. It may be new-ish on low-end crap x86 servers, but for say, mainframe users, it isn't new at all.

What is new is available implementation choices. When your tier choices are between enterprise disk and enterprise tape, you are biased towards keeping data on disk; there's still use cases for HSM with only high-end disk and tape, but they aren't as great. Now with lower-cost disk available, you have a cheap disk choice too, with fairly reasonable access time.

SirWired

In other news (0)

Anonymous Coward | about 4 years ago | (#32859414)

...Dell/EMC have several products available to help with this problem.

Coincidence?

Perfect (4, Funny)

Andreaskem (999089) | about 4 years ago | (#32859418)

A perfect application for my patented write-only memory.

Re:Perfect (1)

Dachannien (617929) | about 4 years ago | (#32859596)

A perfect application for my patented write-only memory.

Bob Pease, is that you? [national.com]

Human brain (0)

Anonymous Coward | about 4 years ago | (#32859426)

Only a small part of the human brain is active at any given time, but you just try to think without the rest of it...

Re:Human brain (1)

sco08y (615665) | about 4 years ago | (#32859650)

Only a small part of the human brain is active at any given time, but you just try to think without the rest of it...

I'll bet I'd still get credit card offers.

University Text (0)

Anonymous Coward | about 4 years ago | (#32859428)

I'm taking Business Systems units as electives to my degree and they always push the assumption that "more data is always a good thing because it could help you later in decision support etc.". I always found that assumption to be poorly grounded, guess I was right.

So, how many of us have 2 hard drives? (1)

Golbez81 (1582163) | about 4 years ago | (#32859430)

A fast SSD or 10,000 RPM'er for your OS or critical apps, and a larger 7200 or 5400 drive for all your other "media"? Personally I've been doing that setup since like 1997...

Re:So, how many of us have 2 hard drives? (1)

jabuzz (182671) | about 4 years ago | (#32859976)

You have the idea right, but what you want to do is automate the process. For example why should say a Brazilian keyboard layout or a driver for a printer I don't own be on fast disk just because it happens to be part of the OS? Why does the word document I am working on today get to be on slow disk?

That is you have some fast disk that new stuff gets written to, and then after a period of time if it is not accessed it gets moved to slower disk. You can even add in an extra layer so stuff that has not been used for a long period of time is moved to tape. If I access the file on tape, I want it to come back automatically, and if I start using that word document I wrote last year, I want it to come back to fast disk.
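The policy described above can be sketched as a function that picks a tier from the time since last access. This is only an illustration in Python: the tier names and the 30-day and 365-day thresholds are assumptions, not settings from any real storage product.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; a real system would make these configurable.
FAST_DISK_MAX_IDLE = timedelta(days=30)
SLOW_DISK_MAX_IDLE = timedelta(days=365)


def choose_tier(last_access: datetime, now: datetime) -> str:
    """Return the tier a file belongs on, given when it was last accessed."""
    idle = now - last_access
    if idle <= FAST_DISK_MAX_IDLE:
        return "fast-disk"
    if idle <= SLOW_DISK_MAX_IDLE:
        return "slow-disk"
    return "tape"


now = datetime(2010, 7, 10)
print(choose_tier(datetime(2010, 7, 1), now))  # touched last week
print(choose_tier(datetime(2010, 1, 1), now))  # idle for months
print(choose_tier(datetime(2008, 1, 1), now))  # idle for years
```

Because the decision depends only on the last access time, opening last year's document resets its idle period and the same function would place it back on fast disk.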

All the major storage vendors are introducing block based storage tiering to their line up as we speak. The other option is to build into the file system like IBM's GPFS so you can have more control to begin with, such as forcing all ISO images onto slow storage from the get go, along with all those MP3's. You also get the option of tape here as well which you don't get with block based tiering.

It is one of the reasons why ZFS simply does not cut the mustard. If a file system does not have storage tiering then it sucks; period. Of course, the fanboys who think ZFS is the greatest thing since sliced bread just don't have a clue about enterprise storage.

Better safe than sorry? (1)

Allnighte (1794642) | about 4 years ago | (#32859438)

I imagine it's more a matter of "better safe than sorry" when management asks whether or not something should be kept.

Chances are, this "business data" is somewhat financially related. This also means there's a fair chance the government can/has/will tax something in those documents. And how long are we supposed to keep our records in case of audits?

This is new? (4, Interesting)

rapturizer (733607) | about 4 years ago | (#32859440)

I saw this over a decade ago when I was working as an IT consultant in the advertising industry. They regularly used only 5% - 10% of their information (and that's being generous). The systems I designed included a server for active work, an archive server for information used in the last 24 months, and then an archive solution (Magneto Optical at the time) that allowed for the information to be available, just not on demand. This idea has been working since for the clients that are still in business.

Re:This is new? (1)

daeglo (1822126) | about 4 years ago | (#32859796)

This idea has been working since for the clients that are still in business.

Seems as though it is working REAL well.

Re:This is new? (1)

rapturizer (733607) | about 4 years ago | (#32860042)

The advertising industry in the 2000's went through the worst recession since the great depression. Companies that specialized in print media and ones that were marginally profitable either went out of business or were purchased by other companies. Of all the clients I had, only one closed its doors (they specialized in quick turn around newspaper advertising), I did have three acquired by other companies. One of them purchased by another client of mine. One of the points that sealed the deal was compatible data structures that made the acquisition mostly seamless.

We already know about it (1)

Wolfraider (1065360) | about 4 years ago | (#32859444)

Businesses already know that most data is stored once and never looked at again. The simple solution would be to offer multiple locations to store data, one for frequently accessed data and one for archival, etc. The problem comes down to training. There are a lot of people who can barely use a computer, and the whole concept of multiple folders would confuse them. Another solution is to solve the issue with software. There are several archival solutions that will look at the file accessed date and move the file to cheaper disk or even tape. The software leaves a stub file in the original location, and if a user tries to access the file, it pops up a box saying "please wait while the file is restored". This solution is nice in that the users don't have to change how they save data, but it is harder to manage. You have your data spread across multiple systems instead of one, and backups could become harder. Overall, it just depends on which direction you want to go with your data and what makes the most sense.
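A rough Python sketch of the stub-file approach described above, using only the standard library. The function name, the `.stub` suffix, and the one-year threshold are hypothetical; real archival products work differently and also handle restoring the file on access, which is not shown here. Note that access times can be unreliable on file systems mounted with options such as noatime.

```python
import os
import shutil
import tempfile
import time

STUB_SUFFIX = ".stub"  # hypothetical naming convention for stub files


def archive_stale_files(live_dir, archive_dir, max_idle_seconds, now=None):
    """Move files not accessed recently to archive_dir, leaving a stub behind."""
    now = time.time() if now is None else now
    os.makedirs(archive_dir, exist_ok=True)
    moved = []
    for name in sorted(os.listdir(live_dir)):
        path = os.path.join(live_dir, name)
        if not os.path.isfile(path) or name.endswith(STUB_SUFFIX):
            continue
        if now - os.stat(path).st_atime > max_idle_seconds:
            target = os.path.join(archive_dir, name)
            shutil.move(path, target)
            # The stub records where the real file now lives.
            with open(path + STUB_SUFFIX, "w") as stub:
                stub.write(target)
            moved.append(name)
    return moved


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as root:
    live = os.path.join(root, "live")
    archive = os.path.join(root, "archive")
    os.makedirs(live)
    for name in ("old.doc", "new.doc"):
        with open(os.path.join(live, name), "w") as f:
            f.write(name)
    # Pretend old.doc was last touched 400 days ago.
    long_ago = time.time() - 400 * 86400
    os.utime(os.path.join(live, "old.doc"), (long_ago, long_ago))
    moved = archive_stale_files(live, archive, max_idle_seconds=365 * 86400)
    remaining = sorted(os.listdir(live))
    archived = sorted(os.listdir(archive))
    print(moved, remaining, archived)
```

The management burden mentioned above shows up even in this toy: the stub and the archived file now live in two places, and a backup of the live directory alone would capture only the stub.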

Signetics invented the needed chip back in the 70s (3, Funny)

ve3id (601924) | about 4 years ago | (#32859458)

FINALLY !!! AN APPLICATION FOR THE WOM!!!! http://www.national.com/rap/files/datasheet.pdf [national.com] Bob Pease sure was fore-sighted, since this memory chip was invented back in the seventies!

In my experience (1)

AbbyNormal (216235) | about 4 years ago | (#32859460)

In my experience with small businesses, it may be never read but will absolutely need to be found for some type of emergency presentation/proposal.

Good argument for tape? (3, Interesting)

mlts (1038732) | about 4 years ago | (#32859474)

This is one reason I like tape: The drives are expensive, but the tapes are $30-$50 (LTO-4 is $30 on mail-order). So having an autochanger moving all the rarely used data into storage is likely the most efficient way of moving data to long term archiving. Even better is making sure that 2-3 sets of tapes are used (one onsite, one offsite.)

Of course, hard disks by themselves may seem cheaper, but they are not a true archival medium. There are so many moving parts in a HDD and each of them (bearings, heads, spindles, motors, controller card) is a point of failure.

With HDD capacities starting to not grow as exponentially as they did last decade, it would be nice if tape companies would not just catch up with 2-3TB native tape offerings, but be able to offer drives at a lower price so home and SOHO users can use them for long term storage. I'm sure that if someone offered a consumer level tape drive for $500 with a decent capacity, that a lot of small businesses would buy it, especially if it came with decent backup software (Retrospect, Backup Exec, Amanda, bru, or another utility that is similar.) Since some tape drives are even bootable (some HP offerings have a section of the tape to emulate a boot CD or DVD), it would be ideal for bare metal recoveries even by nontechnical users. Pop in the tape, boot the machine, type in the encryption key, select where the data should be restored to, walk off for a bit and it is done.

Even though the SAN companies have said tape is going to die, until another form of media (perhaps super-inexpensive flash media [1]) is as reliable as tapes and can be put in the Iron Mountain case and sent offsite for safekeeping for decades on end, tape will be with us. Only optical comes close to tape for long term archiving abilities.

[1]: I can see someone make flash media that is semi-smart where it is put in a specific case, shipped to an offsite warehouse, and that warehouse plugs in the cases into 5-12VDC. Then over time, the circuitry on the flash drives periodically checks the stored flash media for damage or bit rot, corrects errors by rewriting blocks, and good blocks it would periodically move to ensure that there is a high signal to noise level on all media. Of course, this requires power, while tapes can happily sit in a climate controlled warehouse and be still recoverable.

Re:Good argument for tape? (0)

Anonymous Coward | about 4 years ago | (#32859764)

There are so many moving parts in a HDD and each of them (bearings, heads, spindles, motors, controller card)

Controller cards don't move during normal operation.

Re:Good argument for tape? (3, Informative)

mbone (558574) | about 4 years ago | (#32860108)

Tapes are not archival storage either. In either case, archival storage is a system, not a medium.

I hope you are reading all of those tapes on a 5 year cycle, and writing new ones with the recovered data. I also hope you are making sure that the humidity and temperature are strictly controlled at all times in the tape storage room.

"Once" may be pushing it (1)

jayhawk88 (160512) | about 4 years ago | (#32859494)

But if you revised this to say, "Never accessed again a week after its creation", I'd believe it.

Re:"Once" may be pushing it (1)

lorg (578246) | about 4 years ago | (#32860062)

Not so sure, there is always the "what the heck is this thing ..." **accessing data** "oh it's this bollocks .. nevermind ...", then some period of time passes and you come back to it and repeat the procedure again.

We aren't thinking this way until now? (1)

al-ahlex (937743) | about 4 years ago | (#32859512)

So why do all serious RDBMS systems have functionality for dynamically partitioning data based on the relevance of the data? Big databases are often set up to (for instance) have the last month's data on fast storage, and older data on slower/cheaper storage.
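
For anyone who hasn't seen it, the idea is nothing exotic. Here is a rough Python sketch of age-based routing, not how any particular RDBMS does it internally. The 30-day cutoff and the names `tier_for` and `partition` are made up for illustration.

```python
from datetime import date, timedelta

# Assumed cutoff standing in for "the last month's data".
HOT_WINDOW = timedelta(days=30)

def tier_for(row_date: date, today: date) -> str:
    """Return the storage tier a row belongs to, based only on its age."""
    return "fast" if today - row_date <= HOT_WINDOW else "slow"

def partition(rows, today):
    """Split (row_date, payload) pairs into one list per tier."""
    tiers = {"fast": [], "slow": []}
    for row_date, payload in rows:
        tiers[tier_for(row_date, today)].append(payload)
    return tiers
```

A real database does this declaratively with range partitions on a date column, and each partition is placed on its own storage.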

Re:We aren't thinking this way until now? (1)

theshowmecanuck (703852) | about 4 years ago | (#32859994)

We have been. It is called data warehousing, data marting, operational data stores, etc (and they aren't the same things). People have been doing this for a long time. That is why there are analysts who specialize in these areas. They help the business identify the things that are used regularly, things not used often, things that are nice to keep somewhere, and things that you can throw out after a few weeks. They also identify the most suitable storage mechanisms (but not necessarily the specific technology).

Whenever I've seen these issues, it is when managers assume people who are SMEs in hardware management are also SMEs in data management just because they know how to set up the hardware the data resides on. Adding more drives is easy... to a point. From the little of the bio on the author of the article, it looks like maybe he is an SME on the hardware side of things...

Seriously this kind of smells to me like maybe this is the beginning of a new marketing push. Maybe Dell wants to start marketing solutions for Data Warehousing or similar.

this is actionable: think of the storage savings (4, Funny)

rubycodez (864176) | about 4 years ago | (#32859524)

this helps me to be a better employee. From now on I'll only save 25% of the data I acquire, because the odds are the other 75% would only be needed 7.5% of the time. In other words, 92.5% chance not likely to be needed at all.

90% of our brains, 100% of our spirit, unused (-1, Offtopic)

Anonymous Coward | about 4 years ago | (#32859532)

business documents are not like novels. they must be saved somehow 'in case' there's ever a need. paper is still the only genuine legal medium.

meanwhile (looking like a shorter while all the time now); the corepirate nazi illuminati is always hunting that patch of red on almost everyones' neck. if they cannot find yours (greed, fear ego etc...) then you can go starve. that's their (slippery/slimy) 'platform' now. see also: http://en.wikipedia.org/wiki/Antisocial_personality_disorder

never a better time to consult with/trust in our creators. the lights are coming up rapidly all over now. see you there?

greed, fear & ego (in any order) are unprecedented evile's primary weapons. those, along with deception & coercion, helps most of us remain (unwittingly?) dependent on its' life0cidal hired goons' agenda. most of our dwindling resources are being squandered on the 'wars', & continuation of the billionerrors stock markup FraUD/pyramid schemes. nobody ever mentions the real long term costs of those debacles in both life & any notion of prosperity for us, or our children. not to mention the abuse of the consciences of those of us who still have one, & the terminal damage to our atmosphere (see also: manufactured 'weather', hot, mass hypenosys etc...). see you on the other side of it? the lights are coming up all over now. the fairytail is winding down now. let your conscience be your guide. you can be more helpful than you might have imagined. we now have some choices. meanwhile; don't forget to get a little more oxygen on your brain, & look up in the sky from time to time, starting early in the day. there's lots going on up there.

"The current rate of extinction is around 10 to 100 times the usual background level, and has been elevated above the background level since the Pleistocene. The current extinction rate is more rapid than in any other extinction event in earth history, and 50% of species could be extinct by the end of this century. While the role of humans is unclear in the longer-term extinction pattern, it is clear that factors such as deforestation, habitat destruction, hunting, the introduction of non-native species, pollution and climate change have reduced biodiversity profoundly.' (wiki)

"I think the bottom line is, what kind of a world do you want to leave for your children," Andrew Smith, a professor in the Arizona State University School of Life Sciences, said in a telephone interview. "How impoverished we would be if we lost 25 percent of the world's mammals," said Smith, one of more than 100 co-authors of the report. "Within our lifetime hundreds of species could be lost as a result of our own actions, a frightening sign of what is happening to the ecosystems where they live," added Julia Marton-Lefevre, IUCN director general. "We must now set clear targets for the future to reverse this trend to ensure that our enduring legacy is not to wipe out many of our closest relatives."--

"The wealth of the universe is for me. Every thing is explicable and practical for me .... I am defeated all the time; yet to victory I am born." --emerson

no need to confuse 'religion' with being a spiritual being. our soul purpose here is to care for one another. failing that, we're simply passing through (excess baggage) being distracted/consumed by the guaranteed to fail illusionary trappings of man'kind'. & recently (about 10,000 years ago) it was determined that hoarding & excess by a few, resulted in negative consequences for all.

consult with/trust in your creators. providing more than enough of everything for everyone (without any distracting/spiritdead personal gain motives), whilst badtolling unprecedented evile, using an unlimited supply of newclear power, since/until forever. see you there?

"If my people, which are called by my name, shall humble themselves, and pray, and seek my face, and turn from their wicked ways; then will I hear from heaven, and will forgive their sin, and will heal their land." )one does not need to agree whois in charge to grasp the notion that there may be some assistance available to us(

boeing, boeing, gone.

Much, much higher - probably 99% +++ (3, Funny)

petes_PoV (912422) | about 4 years ago | (#32859560)

If you're talking about blog entries. Almost all of them (well, almost all of *mine* :-) are written once and never read, unless you count spiders as reading them.

I only read the headline... (1)

erroneus (253617) | about 4 years ago | (#32859570)

...I didn't bother to read any further because I felt it was probably useless data anyway.

dell's new line of fire extinguishers coming soon! (5, Insightful)

drfireman (101623) | about 4 years ago | (#32859588)

Over 92% of fire extinguishers will never be used, we could probably save a bit of space by having the unneeded ones stored off-site, or in less accessible corners of the garage.

Slightly more seriously, we can certainly answer this question posed by the linked article easily: "why on earth did we squander so much money by not thinking this way until now?" The answer is: because you are a moron. Anyone who has given even a moment's thought to storage has known this, either implicitly or explicitly, for a long time. So whoever's included in your "we," Steve Cassidy, is just profoundly stupid. I think that quite easily explains why you all squandered so much money by not thinking about this. Next question?

Re:dell's new line of fire extinguishers coming so (2, Insightful)

mbone (558574) | about 4 years ago | (#32860128)

Well over 99% of all lifeboats are never used.

In other news... (2, Interesting)

argStyopa (232550) | about 4 years ago | (#32859598)

...at least 70% of the crap you store in your house isn't really needed, either. Do you really ever LOOK at the pictures hanging on the walls? Are you sure you're going to read every book you own, again?

Re:In other news... (0)

Anonymous Coward | about 4 years ago | (#32859660)

Are you going to fap over that ever again?

Don't answer that.

Use Plan 9 file system? (0)

Anonymous Coward | about 4 years ago | (#32859604)

Plan 9 from Bell Labs, an OS they released in the early 90s, had a file system for this. Hard drives as cache with WORM drives for bulk storage.

It did some interesting things. cd /n/dump/2009/1225 puts you in the root of the file system as it existed last Christmas.

yikes, more stuff that doesn't really matter (-1, Offtopic)

Anonymous Coward | about 4 years ago | (#32859608)

http://redactednews.blogspot.com/2010/07/did-military-stop-cheney-from.html

what a dick huh? alas just a puppet. although the faces have changed (horns, fangs etc.. coming out)the 'game's' still on, if a bit behind schedule. see you on the other side of it?

Because storage is easy. (1)

Yaos (804128) | about 4 years ago | (#32859634)

Why go to all the trouble of setting up two different systems for live data and archived data when you can spend half the money on just one system for both and more storage space?

Databases should handle this automagically (1)

Proudrooster (580120) | about 4 years ago | (#32859652)

Anyone who manages large systems knows that this is very true, yet the data piles up. I've often wished that databases would allow us to make a view or some other type of abstraction which would allow you to make the decision whether or not to join an archive table. Right now, everything needs to be handled on a program by program or query by query basis. Hey, maybe I should quickly patent this idea, then I can license it to Oracle. :)
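
For what it's worth, a plain UNION ALL view gets part of the way there. Below is a small sketch using Python's built-in sqlite3 module. The table names, the `archived` flag and the `fetch_orders` helper are all invented for the example, and the caller still has to opt in to the archive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders_live    (id INTEGER PRIMARY KEY, placed TEXT, total REAL);
    CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, placed TEXT, total REAL);
    -- One view that stitches both tables together and marks where a row came from.
    CREATE VIEW orders_all AS
        SELECT id, placed, total, 0 AS archived FROM orders_live
        UNION ALL
        SELECT id, placed, total, 1 AS archived FROM orders_archive;
""")
conn.execute("INSERT INTO orders_live VALUES (2, '2010-07-01', 40.0)")
conn.execute("INSERT INTO orders_archive VALUES (1, '2004-03-15', 12.5)")

def fetch_orders(conn, include_archive=False):
    """Read the live table by default; go through the view only on request."""
    source = "orders_all" if include_archive else "orders_live"
    return conn.execute(f"SELECT id, total FROM {source} ORDER BY id").fetchall()
```

It is still a per-query decision, but at least it collapses to one flag instead of a rewritten query.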

This should be obvious... (1)

jridley (9305) | about 4 years ago | (#32859666)

to anyone aware of Sturgeon's Law [wikipedia.org] . 90% of everything is crud.

If you weren't thinking this way (0)

Anonymous Coward | about 4 years ago | (#32859678)

You should have been. This isn't new.

Many of the SANs I've worked on automatically migrate data off to slower storage or even tape as it ages without modification. If you do a read to something that's offline, it has to get fetched from the tape jukebox and it takes a bit.

So what? (4, Insightful)

davidbrit2 (775091) | about 4 years ago | (#32859696)

And if you didn't have that 10% that is eventually needed, you'd be totally screwed. Do we really need to play the 20/20 hindsight game every time somebody thinks of something like this?

Re:So what? (1)

DaveGod (703167) | about 4 years ago | (#32859916)

And if you didn't have that 10% that is eventually needed, you'd be totally screwed. Do we really need to play the 20/20 hindsight game every time somebody thinks of something like this?

I know /. summaries are traditionally highly unreliable and jumping to obvious conclusions after picking up on a couple of key words is often a safer bet, but this time we have a good one. It goes straight (perhaps too straight) to the point that some data is in use that needs to be on expensive servers, and there is data that is not in use and can be stored on much slower and cheaper systems. There is no suggestion at all in TFA that the other 90% should be deleted or not collected in the first place - a debate worth having at individual companies perhaps, but that's another story.

There's nothing new in TFA except that the unused data is as high as 90%, and that there are a few gizmos on the way to facilitate this, so the cost savings may be much more significant now than previously perceived.

Only 90% (1)

flyingfsck (986395) | about 4 years ago | (#32859698)

Many businesses work with a customer file a few times and then never again - for example lawyers and realtors. I'd like to see a file system that will auto archive data and shift it transparently into long-term storage, and then transparently undo it when needed again.
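
To make the wish concrete, here is a toy sketch in Python of what "transparent" would mean. Two dicts stand in for the fast and slow tiers, and the class name, the idle threshold and the `now` parameter (there only so it can be tested without waiting) are all my own assumptions, not any real file system's API.

```python
import time

class TieredStore:
    """Toy two-tier store: idle files move to 'archive', reads bring them back."""

    def __init__(self, max_idle_seconds):
        self.max_idle = max_idle_seconds
        self.fast = {}       # name -> data, the expensive tier
        self.archive = {}    # name -> data, the cheap tier
        self.last_used = {}  # name -> timestamp of the last access

    def write(self, name, data, now=None):
        self.archive.pop(name, None)  # a rewritten file is live again
        self.fast[name] = data
        self.last_used[name] = time.time() if now is None else now

    def read(self, name, now=None):
        if name in self.archive:      # transparent recall from the slow tier
            self.fast[name] = self.archive.pop(name)
        data = self.fast[name]        # KeyError if the file was never written
        self.last_used[name] = time.time() if now is None else now
        return data

    def migrate(self, now=None):
        """Move everything idle for longer than max_idle to the archive tier."""
        now = time.time() if now is None else now
        idle = [n for n in self.fast if now - self.last_used[n] > self.max_idle]
        for name in idle:
            self.archive[name] = self.fast.pop(name)
```

The caller only ever uses read and write; the slow recall is the price of a cold file.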

Re:Only 90% (1)

IrquiM (471313) | about 4 years ago | (#32859876)

Dell delivers servers with this system. It's nothing new. We've used it the last 3-4 years.

Health Insurance? (1)

Kozz (7764) | about 4 years ago | (#32859704)

I also wonder if +90% of all health insurance benefits go unused each year. And you probably have business data and insurance for some of the same reasons: it's better to have it and not need it than need it and not have it. amirite?

Question (1)

chazzf (188092) | about 4 years ago | (#32859710)

Is there a reliable metric as to which 10% will be needed again at the time the data is written? If not then I don't see what this buys us.

If Dell is talking about its failure rate (2, Funny)

christoofar (451967) | about 4 years ago | (#32859738)

If the data was recorded by Dell computers... then yeah I would expect that 90% of business customers aren't able to read it back.

Re:If Dell is talking about its failure rate (0)

Anonymous Coward | about 4 years ago | (#32860032)

Yep. We know the data on those bad capacitors is in that 90%.

It's written for a purpose (1)

davidwr (791652) | about 4 years ago | (#32859756)

I create a lot of business data, and 90% of "never read again" or "never read again after 2-3 days" is not far from true.

However, the data serves a purpose. I frequently do searches on the data and you never know when something you wrote months or years ago will turn out to be just the document you need. Keeping records for years instead of days has more than paid for itself in the long run.

Now, will these records be useful 5 or 10 years from now? Probably not except to an archivist or someone researching how we did business during 2010 and earlier. Or perhaps to a lawyer *groan*.

what about law enforcement data policy? (0)

Anonymous Coward | about 4 years ago | (#32859768)

There are some laws that say you must keep some data for a period of time, five years or more.
Even if you know that you almost certainly won't be using it in the future, the law is the law and you must obey.

About 90% of slashdot posts are also WORN (1)

aapold (753705) | about 4 years ago | (#32859786)

At least 90% are Write Once Read Never.

Wonder if you could go into business archiving never-read data. I mean you could guarantee privacy....

Datenbrief (0)

Anonymous Coward | about 4 years ago | (#32859810)

The German Chaos Computer Club (CCC) was addressing this with its draft law "Datenbrief". Businesses would have to report all types of data collected about a person to that person via slow-mail or e-mail on an annual basis. The time and effort spent on this would constitute a 'fee' for data and thus force them to hold as little data as possible. See http://www.ccc.de/de/datenbrief (in German)

Exactly. (3, Insightful)

brusk (135896) | about 4 years ago | (#32859824)

I wasted money on a dictionary that has tens of thousands of words but have only ever looked up a few hundred. I should have bought one that just had the words I would actually need.

Solutions: (5, Interesting)

drolli (522659) | about 4 years ago | (#32859862)

a) Forbid *unmanaged* storage of documents. If the question "where is the most up-to-date version of this document stored?" is systematically and easily answered, then people can delete the crap from their laptops.

b) Forbid in-company attachments to mails. If the last version can be easily found, including the revision history, a link to this revision is worth *more* than the current state of the document. Most of the space in my inbox is taken up by totally useless attached documents.

c) Forbid the use of formats unsuitable for storing a certain kind of information. (Where I work, they use powerpoint/word files for electronic forms)

d) Provide a good archiving and backup service. Besides the quality improvement from using a service, this also prevents the 100th unsystematic copy of the same data (forbid such copies explicitly)

e) Thin clients. Store the data on a server. Deduplicate.

f) I would expect that most of the documents in a company can (and should) be stored in a database.
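
On the "deduplicate" part of (e), the core trick is just content hashing. Here is a minimal sketch in Python, assuming whole-file dedup with SHA-256 (real products usually work on blocks or chunks). The function name and the in-memory dicts are placeholders of mine.

```python
import hashlib

def deduplicate(files):
    """Map {path: bytes} to (blobs, index) with one stored blob per distinct content."""
    blobs = {}  # sha256 hex digest -> content, stored once
    index = {}  # path -> digest, so every path still resolves
    for path, content in files.items():
        digest = hashlib.sha256(content).hexdigest()
        blobs.setdefault(digest, content)  # keep only the first copy
        index[path] = digest
    return blobs, index
```

Every path still resolves through the index, but the 100th copy of the same slide deck costs one hash entry instead of another few megabytes.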

It's not like this is something new (1)

IrquiM (471313) | about 4 years ago | (#32859866)

Dell has been doing this for our company the last 3-4 years now.

I think they're already implementing this . . . (0, Offtopic)

bedouin (248624) | about 4 years ago | (#32859948)

Made an order for 6 computers and received 11 in January. Returned the extra 5 and they refunded me for all 11, then took 6 months to realize their mistake.

All this after I called them trying to tell them about their error, and getting some script/screen-reading Indian who didn't understand me.

Imagine what it would have been like if the situation was reversed . . . yikes.

Fuck Dell. This is the kind of dumb thinking that will lead to their inevitable downfall. Welcome to Gateway Country Part Deux.

The more things change... (0)

Anonymous Coward | about 4 years ago | (#32860080)

Back in the day I ran (operations manager) a very large mainframe shop with three supercomputers churning out enough numbers to consume several tons of paper each month. The scientists would make their way to the output distribution area each morning, pick up a 6" stack of 11"x17" paper, and flip to the last page to find an eigenvalue. All too frequently they'd shake their heads and say something like "Should have been higher" and drop the whole stack into the recycling bin strategically posted near the exit.

A purely statistical analysis might suggest that we have the trucks delivering the paper just drop it off at the recycling center, saving wear and tear on the printers, printer ribbon costs, and scientists' time, as they would no longer have to come by to pick up their output. Could probably have cut a couple of staff as well. And the loss of information would be negligible, statistically speaking. The scientists failed to see the humor in this, however, so we continued killing trees at an alarming rate.

In related news... (1)

GodfatherofSoul (174979) | about 4 years ago | (#32860096)

Backup snapshots are wasting space 99% of the time!

I can easily believe it (1)

onyxruby (118189) | about 4 years ago | (#32860178)

Most people don't understand the nature of large amounts of data like that. They think "I want more, more, more" and never beyond that. Getting data is easy, getting useful data is far more important and for that you need to have your customers spend some time with the database where they can tell you everything that they don't need or want. Once you can confirm the accuracy of that information you can then purge your data of the clutter.

What people really fail to understand though is that getting rid of data is just as important. Unless you're dealing with something like scientific research data, or have a compelling legal reason (SEC etc), or another really good reason (manufacturing plans), then your data needs to have a planned lifecycle just like any other asset. You need to have a date for end of life for data (SQL Data, documents, etc) just like you would for emails or other documents. As a rule of thumb, set up an end of life asset policy for your data, notify the stakeholders and users and from that point forward - every chance you have to destroy that data, do so.

If you destroy data when you had a subpoena, knew a subpoena was coming or knew a criminal investigation was coming you can end up a felon. Any data that isn't destroyed can be used against you in a court of law. However - if your data is destroyed via policy on a given date and that destruction doesn't violate something like an SEC requirement, then you are safe. Yes, I do speak as someone that has at times been heavily involved in litigation (the technical expert that has prepared data for use in court and explained what everything means to lawyers) more than once.
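
A policy like that is easy to express in code. The sketch below is Python and entirely illustrative: the retention periods are invented (yours come from counsel and the regulators), and `due_for_destruction` is a name I made up. The one part that matters is the legal hold check, which runs before anything else.

```python
from datetime import date, timedelta

# Hypothetical retention periods per record class; not legal advice.
RETENTION = {"email": timedelta(days=365), "invoice": timedelta(days=7 * 365)}

def due_for_destruction(records, today, legal_holds=frozenset()):
    """Return ids of records past end of life and not under a legal hold."""
    due = []
    for rec_id, kind, created in records:
        if rec_id in legal_holds:
            continue  # never destroy anything covered by a hold
        if today - created > RETENTION[kind]:
            due.append(rec_id)
    return due
```

Run it on a schedule, log what it destroyed and under which policy version, and the destruction is demonstrably routine rather than reactive.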

90% is reasonable value (1)

stanlyb (1839382) | about 4 years ago | (#32860226)

Let's face it, in practice, you are making a backup every Friday, of almost everything: Database, SVN, CVS, builds, release, etc...And there is a good reason for it, like computer burned, sysadmin left without giving up the password ;).... But in reality, this backup data is almost never used. In my long long practice I never had the chance to see the need of these backups, nevertheless, you just have to have it. Period.

Observation (1)

halcyon1234 (834388) | about 4 years ago | (#32860230)

And now that Dell's looked at the files, they've been read. There goes that theory.

Data warehousing (1)

v1x (528604) | about 4 years ago | (#32860250)

I have serious doubts about how they came up with that number. Data captured once can be stored in a data warehouse and analyzed and reused in many different ways for analytics and reporting, so I am not sure how they estimate that 90% of data is never used again (unless, of course they meant that it is not pulled up again on the frontend application side, which would still make no sense at all).

At our hospital, they have replaced the inpatient electronic medical records system at least 3 times in the last 20 years, and our data warehouse, which has been around for more than 15 years, contains a large percentage of that clinical data from the different (current & historical) systems. A lot of this data is still used pretty actively for retrospective research, recruitment of patients for clinical trials, operational and financial resource planning, forecasting, cost-accounting, etc. In other words, at our institution, most of our data is used all the time, but for different purposes.