
The 1-Petabyte Barrier Is Crumbling

CmdrTaco posted more than 5 years ago | from the so-much-data dept.

Databases 217

CurtMonash writes "I had been a database industry analyst for a decade before I found 1-gigabyte databases to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling. Specifically, we are about to see data warehouses — running on commercial database management systems — that contain over 1 petabyte of actual user data. For example, Greenplum is slated to have two of them within 60 days. Given how close it was a year ago, Teradata may have crossed the 1-petabyte mark by now too. And by the way, Yahoo already has a petabyte+ database running on a home-grown system. Meanwhile, the 100-terabyte mark is almost old hat. Besides the vendors already mentioned above, others with 100+ terabyte databases deployed include Netezza, DATAllegro, Dataupia, and even SAS."


217 comments

Porn collection (4, Funny)

Anonymous Coward | more than 5 years ago | (#24735439)

No porn collection jokes please.

Re:Porn collection (-1, Redundant)

Anonymous Coward | more than 5 years ago | (#24735483)

They seem to forget Ron Jeremy has over 9.75 petabytes of pr0n on his hard drive.

Won't somebody think of the children.... (2, Funny)

Anonymous Coward | more than 5 years ago | (#24735459)

Oh wait, that was petabyte...

Fixed it for you... (5, Funny)

hyperz69 (1226464) | more than 5 years ago | (#24735477)

I had been a Porn Collector for a decade before I found 1-gigabyte Porn Collections to write about. Now it is 15 years later, and the 1-petabyte barrier is crumbling.

Noob (4, Funny)

SmallFurryCreature (593017) | more than 5 years ago | (#24735795)

My porn collection has long since achieved infinity.

Re:Noob (5, Funny)

Gilmoure (18428) | more than 5 years ago | (#24736089)

It has an event horizon and is actively acquiring porn on its own?

Re:Noob (0)

Anonymous Coward | more than 5 years ago | (#24737125)

It has an event horizon and is actively acquiring porn on its own?

And surprisingly, it only needed to be seeded with "the 1000 pound woman" with a banana.

Chia Pet a Bite? (0)

Anonymous Coward | more than 5 years ago | (#24736351)

No no, if your Chia Pet is a-bitin' you're most likely doing something wrong.

Just add water, and sing along: Ch-ch-ch-chia!

This isn't really news (1, Interesting)

Anonymous Coward | more than 5 years ago | (#24735487)

This has been a reality since 500GB drives. A couple of companies started selling petabyte arrays around the time those drives became established.

Petabyte DBs are old news to... (2, Funny)

C_Kode (102755) | more than 5 years ago | (#24735489)

Petabyte DBs are old news to techie porn collectors. They always mix their two favorite subjects into one. Tech + Porn = Petabyte+ Porn Database

Re:Petabyte DBs are old news to... (5, Interesting)

houghi (78078) | more than 5 years ago | (#24735607)

This is intended as a joke, I assume, but it also raises the point that a different sort of data is now being collected.

When I look at CRM systems, they used to contain basically the customer's address and perhaps logs of calls made to the call center. Now whole phone conversations are stored, along with scanned faxes and letters, and whatever images and video are available. Faxes and letters used to get only a reference number; you looked them up in a file cabinet.

So even though not that much more data is collected (the material was already available), it is now all put in the database. Where there used to be an entry like 'customer was extremely angry and cursed a lot', the mp3 is now saved for all eternity (where legal).

So yes, the HD space it takes is bigger and thus the amount is bigger, yet that does not automatically mean the data itself is richer. E.g., do we suddenly have shoe size or other data available? Could be, but it could also be that we are just saving different file formats in the database.
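Rough numbers make the point: even low-bitrate voice recordings add up fast. A back-of-envelope sketch, with every figure an illustrative assumption rather than data from any real call center:

```python
# Back-of-envelope storage for archiving call-center audio as mp3.
# All figures are illustrative assumptions, not data from the thread.
CALLS_PER_DAY = 10_000
AVG_CALL_MINUTES = 5
MP3_KBPS = 32                     # low-bitrate mono voice

bytes_per_call = AVG_CALL_MINUTES * 60 * MP3_KBPS * 1000 / 8   # 1.2 MB
gb_per_year = CALLS_PER_DAY * bytes_per_call * 365 / 1e9

print(f"~{gb_per_year:,.0f} GB/year")   # ~4,380 GB/year for one call center
```

A few terabytes a year from one modest call center, and that's before faxes, letters, and video.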

Re:Petabyte DBs are old news to... (0)

Anonymous Coward | more than 5 years ago | (#24737227)

Hmmm... you say most places record and store the whole call now? Like, even what I say when on hold?

Uh-oh.

I guess let's just hope nobody listens to my recordings, lest they find out how I truly feel about their hold music.

Oh s***! I'm calling my Congressman! (5, Funny)

BitterOldGUy (1330491) | more than 5 years ago | (#24735497)

We must protect the children from the petabytes! These petabytes are everywhere trying to have sex with our children!

I have to find my kid. Last time I saw her, she was with her Uncle Micky while he was having his morning martini.

Google Street View must be most massive db ever? (3, Interesting)

Anonymous Coward | more than 5 years ago | (#24735515)

They now have many towns with fewer than 50k people completely photographed, every street in high res. That has to be well over 1 petabyte, though I doubt it's all in one location; it must be distributed.

I am confused !! (5, Funny)

neonux (1000992) | more than 5 years ago | (#24735523)

How many Libraries of Congress are necessary to break the 1-petabyte barrier ??

Re:I am confused !! (1, Interesting)

n3xg3n (994581) | more than 5 years ago | (#24735573)

1 Library of Congress = 10 Terabytes = ~0.009 Petabytes

Re:I am confused !! (0)

Anonymous Coward | more than 5 years ago | (#24735723)

You did your math wrong... I'll give you another try...

Re:I am confused !! (1, Funny)

Anonymous Coward | more than 5 years ago | (#24735913)

but, if they submit it to the LoC, then what?

Re:I am confused !! (4, Informative)

Anonymous Coward | more than 5 years ago | (#24735945)

1 Petabyte = 1,000 Terabytes
1 LoC = 10 Terabytes
100 LoC = 1,000 Terabytes
======
100 LoC = 1 Petabyte

Re:I am confused !! (2, Informative)

Lachlan Hunt (1021263) | more than 5 years ago | (#24736401)

You seem to be trying to calculate in Tebibytes (TiB) and Pebibytes (PiB), which are based on the binary system, rather than Terabytes (TB) and Petabytes (PB), which are base 10.

Although some operating systems incorrectly use the decimal-based units with binary-based values (i.e. 1TB = 1024GB), that is technically wrong. Hard drive manufacturers actually report correctly using the decimal-based values (i.e. 1TB = 1000GB).

Also, you still got your maths slightly off: 10TiB = ~0.0098PiB.

Re:I am confused !! (1)

hattig (47930) | more than 5 years ago | (#24736509)

I speak on behalf of many people when I say this:

Screw the SI units for data capacity.

1PB = 1024TB
1TB = 1024GB
1GB = 1024MB
1MB = 1024KB
1KB = 1024B

In this case the historical, de-facto standard wins. Base-2 capacities are all that matter for computer data stored in base-2 units, such as a block on a disc, computer memory, etc. You won't catch me using the inanely stupid SI unit names for this.
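For what it's worth, the gap the two camps are arguing over is easy to see in code:

```python
# "1 PB" under the two conventions being argued about in this thread.
PB_SI = 1000 ** 5    # SI petabyte, base 10
PB_IEC = 1024 ** 5   # IEC pebibyte (PiB), base 2

print(PB_IEC - PB_SI)            # 125899906842624 bytes of disagreement
print(round(PB_IEC / PB_SI, 4))  # 1.1259 -- the gap widens with each prefix
```

At the kilo level the two differ by 2.4%; by the peta level it's nearly 13%, which is why the LoC arithmetic above keeps coming out differently.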

Re:I am confused !! (1)

suggsjc (726146) | more than 5 years ago | (#24737029)

Isn't the Library of Congress continually growing? If so, doesn't the conversion need a dynamic algorithm to adjust for its rate of growth? I couldn't find any documentation or historical data, but I would think it's out there somewhere... then we can start working on this algorithm.

Yawn... (1, Insightful)

admi-systems (1334401) | more than 5 years ago | (#24735565)

Hard drives keep getting larger. Hard drive consumption keeps getting larger. How much larger it keeps getting really isn't all that impressive.

Re:Yawn... (1, Flamebait)

bconway (63464) | more than 5 years ago | (#24735633)

Database, not filesystem. Thanks for almost bothering to read the summary, though.

Re:Yawn... (3, Insightful)

Beale (676138) | more than 5 years ago | (#24736013)

As soon as you have the capacity, people will fill the capacity. There's always more data to collect.

I could see practical applications (2, Informative)

gravis777 (123605) | more than 5 years ago | (#24736395)

Okay, I know the article is referring to databases, but the comments seem to have drifted toward disc storage, so I will take the bait and go off topic.

Petabyte drives would not really be that impractical for people who like to archive stuff. I just filled up a 300 gig drive and a 750 gig drive with stuff off the DVR in under a year. While National Geographic HD may be compressed so badly that it barely looks better than SD, and a one-hour show is under 2 gig, try archiving something with a higher bitrate. For example, I recorded the Olympics and saved the opening and closing ceremonies and all the gymnastics events. A single 4-hour day saved is around 40 gig.

So, let's think media server for HD material. Let's just stick with HDTV for a while. Say I want to archive a Blu-ray disc on a media server, and say the movie takes up all 50 gig of the disc. Ten movies, 500 gig. 100 movies, 5 terabytes; 1,000 movies, 50 terabytes.

Now let's say we are an IMAX theater upgrading to the new IMAX Digital standard. I read not too long ago that an IMAX film is equivalent to 18K (most digital theaters project 2K, although some are now installing 4K systems). So, to keep from having these big massive films of the 20-year-old science documentaries we keep in rotation, we get the digital versions. Does anyone want to do the math?
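Taking that bait too: here is one very rough cut at the math. The resolution, aspect ratio, bit depth, and runtime below are all guesses for illustration, not published IMAX specs:

```python
# Rough size of ONE uncompressed "18K" IMAX documentary scan.
# Every figure here is an assumption for illustration only.
H_PX, V_PX = 18_000, 12_600   # assumed 1.43:1 IMAX aspect ratio
FPS = 24
BYTES_PER_PIXEL = 3           # 8-bit RGB; real film scans often use more
RUNTIME_S = 40 * 60           # a 40-minute documentary

tb = H_PX * V_PX * BYTES_PER_PIXEL * FPS * RUNTIME_S / 1e12
print(f"~{tb:.0f} TB uncompressed")   # ~39 TB per film, before compression
```

So even a short documentary library, stored raw, would be knocking on the petabyte door; compression is the only thing keeping it sane.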

I am waiting for the day when neural implants can actually read the human brain, and as such, you can archive experiences to some type of storage medium. I am sure Wikipedia says somewhere how much information the human brain processes a second. We will surely find ways of compressing the stuff; we can already do audio and video, so one day we may be able to compress smell, taste, and touch, granted we actually have a way of capturing them. Still, the amount of data would be massive, and it will probably be a whole new avenue for the porn industry.

Granted, these are extremes, but who would have thought 15 years ago, when we first started hitting the 1 gig barrier, that in 2008 we would have discs for storing movies with a capacity of 50 gig, and would even consider saving video at a resolution of 1920x1080 with PCM sound at a bitrate of 4.6Mbps?

Give us the storage space, and we will find a use for it.

Re:I could see practical applications (1)

N!k0N (883435) | more than 5 years ago | (#24737383)

I am waiting for the day when neural implants can actually read the human brain, and as such, you can archive experiences to some type of storage medium. I am sure wikipedia has somewhere how much information the human brain processes a second.

I don't know how *accurate* this is, but I ran across this...

Current estimates of brain capacity range from 1 to 1000 terabytes! [google.com] "Robert Birge (Syracuse University) who studies the storage of data in proteins, estimated in 1996 that the memory capacity of the brain was between one and ten terabytes, with a most likely value of 3 terabytes. Such estimates are generally based on counting neurons and assuming each neuron holds 1 bit. Bear in mind that the brain has better algorithms for compressing certain types of information than computers do."

and this

couldn't find anything on wikipedia though.

No big news here.... (5, Interesting)

edwardd (127355) | more than 5 years ago | (#24735577)

Take a look at almost any large financial firm. The email retention system alone is much larger than a petabyte, and that's just the online media, not including what's spooled to tape. Due to deficiencies in RDBMS systems, each of the large firms usually develops its own system for managing archival on top of the database.

Re:No big news here.... (1)

jiayao (1350937) | more than 5 years ago | (#24737127)

Email retention systems and archival systems in general are boring. I wouldn't be excited to see such a system manage even exabytes of data. The only operation you need is search; you don't do joins or updates. Another uninteresting example is a huge database where most of the data is opaque blobs. That's more like a file system than a database. It's the live data, queried and updated constantly, that counts, e.g. Yahoo's web analytics database. I think it's amazing they managed to keep such an amount of data, possibly distributed across data centers, consistent during updates and still keep the system responsive under constant queries.

It's not... (0)

Anonymous Coward | more than 5 years ago | (#24735581)

It's not how big it is, but how you use it. :)

Oh, come on. (4, Interesting)

seven of five (578993) | more than 5 years ago | (#24735583)

Call me old fashioned, but I don't see why anyone but a search engine like google would need anything like a petabyte. You can have only so much useful information about anything. Sounds to me like, fill your garage with sh1t, build a bigger garage.

Re:Oh, come on. (4, Insightful)

poetmatt (793785) | more than 5 years ago | (#24735649)

So the fact that movies have gone from 780mb (dvdrips) to 4.8gb (straight up copies) to 25gig (blu ray) doesn't bear any significance to you?

Or how about games which have gone from 1mb to installations that are upwards of 10gigs now (warhammer IIRC is 9 something).

Not to mention MS's fiasco of an Office XML format, where things take up a ridiculous amount of space compared to OpenOffice (10mb docx vs 2.9mb OpenOffice)... someone's level of tech knowledge is what determines their space usage.

I wouldn't mind 3-4 TB, I'd split it off into about 4 partitions or raid stripe and call it a day for a while.

However consumer use is indicative of business use, so I would expect things to head towards exabyte eventually.

Re:Oh, come on. (4, Insightful)

seven of five (578993) | more than 5 years ago | (#24735869)

However consumer use is indicative of business use, so I would expect things to head towards exabyte eventually.

This is kind of my point. Do companies keep libraries of pr0n, video, music? Sure, if you're a media company you will. But say you're a plumbing distributor. You'll have the usual accounting stuff, and media for marketing, and some BS overhead, but don't tell me it adds up to a TB much less a PB.

On the other hand, if you have the extra space, it invites the usual waste in the form of archive directories for closed-out years, development junk, etc. Spinning round and round, doing nothing.

Re:Oh, come on. (3, Insightful)

AP31R0N (723649) | more than 5 years ago | (#24735665)

Agreed.

And i'd also be worried about losing a PB all at once. There are TB drives at my local Best Buy, but that's a lot to lose at once. i'd rather split my files and programs between two or more smaller drives (and have a RAID).

Re:Oh, come on. (1)

tekiegreg (674773) | more than 5 years ago | (#24736663)

This might be going slightly offtopic but yeah I've noticed that with the increases in data size, an increase in backup awareness and redundancy has been percolating down even to the home users.

For example, recently I set up a mirrored drive system for my stepdad for his home photos (which are somewhere in the 200GB range as he is semi-professional) just in case one drive goes out. Also I've been looking at a cheap DVD Autoload backup option. Any ideas there from the Slashdot crowd?

Science! (5, Informative)

edremy (36408) | more than 5 years ago | (#24735791)

Petabytes are actually pretty common in the sciences. I visited NCAR (National Center for Atmospheric Research [ucar.edu]) in Boulder five years ago, and their main database was in the 2PB region even then. I'm sure it's a lot larger today.

The LHC will generate several PB of data per year, as will the Large Synoptic Survey Telescope [lsst.org]. These projects aren't all that uncommon.

Re:Science! (1)

dargaud (518470) | more than 5 years ago | (#24736631)

The LHC will generate several PB of data per year, as will the Large Synoptic Survey Telescope [lsst.org]. These projects aren't all that uncommon.

Shit, I'm working on both of those projects. I'd better ask management for a bigger hard drive...

Re:Science! (1)

boombaard (1001577) | more than 5 years ago | (#24737253)

don't forget projects like LOFAR [wikipedia.org] (snippets from lofar website)

In the first digital processing step 256 kHz subbands are formed. Only a subset of these bands is further processed. The maximum total bandwidth selected for further processing will be 32 MHz. Each Remote Station delivers a single dual polarization beam at 32 MHz, or 8 dual polarization beams at 4 MHz or any combination in between. The resulting output data rate is 2 Gb/s. The secondary filtering stage (to 1kHz channels) is done in the Central Processing system.

LOFAR produces large data streams, especially for the astronomy application (e.g. 6 TB of raw visibility data for an 8 beam, 4 hour synthesis observation, after integration for 1 sec and over 10kHz). One month of observing in this mode results in a PetaByte of data. (Systematic long-term storage for such data volumes thus becomes extremely expensive.)

The project is hardly up and running yet, but still, quite a bit of raw data to process. (powered by IBM's BlueGene/L)
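The month-to-a-petabyte claim checks out directly from the figures quoted above (assuming continuous observing in that one mode, which is an idealization):

```python
# Sanity check on the quoted LOFAR figures: 6 TB of raw visibility data
# per 4-hour synthesis observation, observing in that mode for a month.
TB_PER_OBSERVATION = 6
HOURS_PER_OBSERVATION = 4
HOURS_PER_MONTH = 30 * 24

tb_per_month = (HOURS_PER_MONTH // HOURS_PER_OBSERVATION) * TB_PER_OBSERVATION
print(tb_per_month)   # 1080 TB -- right at the quoted "PetaByte of data"
```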

Re:Oh, come on. (1)

garcia (6573) | more than 5 years ago | (#24735931)

You can have only so much useful information about anything.

If you have the space available and the tools to utilize the stored data, why not? The more data you keep, the more information you will have available when techniques or routines become available to you to utilize this data.

Re:Oh, come on. (3, Insightful)

Kjella (173770) | more than 5 years ago | (#24736275)

Call me old fashioned, but I don't see why anyone but a search engine like google would need anything like a petabyte. You can have only so much useful information about anything. Sounds to me like, fill your garage with sh1t, build a bigger garage.

Unfortunately, you gather up a lot of digital stuff fast, and most of the time it's not useful. Take my business mail, for example: it's full of old presentations and random versions of various documents and whatnot. Is it worth cleaning up? No. Is it worth keeping? Well, from time to time clients ask about old things, and it's very useful to have them. I figure 90% of it could be deleted, keeping only final versions and important mails; of what remains, 90% will never be asked for again. So I keep 100% for maybe 1%. Make a company with hundreds of thousands of people all like that and you get huge, huge amounts of data. It's still cheaper than going through those huge, huge amounts of data. That goes double for many automated data-collection processes: it's cheaper to keep everything until it's all guaranteed useless than to try to sort it out.

Re:Oh, come on. (1)

Orestesx (629343) | more than 5 years ago | (#24736659)

Do you want to figure out which is the useful stuff? Better just to store it all; you don't know what is useful until you need it.

Re:Oh, come on. (1)

jimmux (1096839) | more than 5 years ago | (#24736685)

I'm currently working on a project which has a working database of around 1.5 petabytes (at last count).

What's more, this database is constantly ingesting more data and shuffling off old data to tape archives. If the technology was available, this DB would be even bigger so we wouldn't have to retrieve data from archives in order to query data more than a year old.

There is an unbelievable amount of data out there. As long as there is somewhere to put it, we will find reasons to stick it in a database and analyse it.

Re:Oh, come on. (1)

abigor (540274) | more than 5 years ago | (#24736965)

a. How on earth would you know? Do you work in a data-intensive industry?

b. Do you understand what a data warehouse even is?

c. Data mining is statistically based. The more information that's available to mine, the more accurate the results will be. And by "information", I don't mean some kid's hard drive filled with terrible mp3s and downloaded movies.

Re:Oh, come on. (1)

MrMarket (983874) | more than 5 years ago | (#24737467)

I'm guessing most of these databases are keeping CYA information, most of which will never be used.

Too Bad Most of that is Due to Poor... (1, Insightful)

eno2001 (527078) | more than 5 years ago | (#24735587)

... DB design and old data that should be purged. Color me unimpressed.

Re:Too Bad Most of that is Due to Poor... (2, Interesting)

Anonymous Coward | more than 5 years ago | (#24735753)

... DB design and old data that should be purged. Color me unimpressed.

I'm convinced now that regardless of attempted discrimination, HUMANS are pack-rats. THAT I can deal with, as people can be trained to actually throw shit away. The problem is when lawyers get involved in the matter. Yes, most of the shit we have today in the corporate world we are FORCED to keep due to some insane lawsuit and follow-up "fix-it-forever" law that calls for us to keep a copy of every damn thing that flows electronically for the next 7 - 70 years.

Could you almost call it corruption? Yes, I can. The similarities between supply and demand feeding the corruption of oil companies can also be seen in data storage markets. Hard drives probably wouldn't be eclipsing 80GB if it were not for laws driving it that way. New personal computers with almost a terabyte of storage, yeah like Grandma is ever gonna fill that up. Give me a break.

Re:Too Bad Most of that is Due to Poor... (1)

N!k0N (883435) | more than 5 years ago | (#24737565)

Hard drives probably wouldn't be eclipsing 80GB if it were not for laws driving it that way.

I'm not sure it's laws in the consumer market rather than a cyclical "failure"* in thinking.

Let's take the hypothetical situation that Company A has made a 1GB drive when all predecessors were making 250-500MB. Company (M)$ sees this, and instead of going through all the necessary steps to clean the bloat from their new piece of software, leaves it in, because "everyone has big drives now". Software companies follow in company (M)$'s footsteps, hard drive manufacturers are then forced to make ever-larger drives, which Joe Consumer then fills with pr0n and other random junk because the space is just *there* now....

* I say "failure" because at some level it is a flaw in the thinking regarding the whole harddrive issue that you've stated...

Effect of the scale (2, Insightful)

cefek (148764) | more than 5 years ago | (#24735761)

Imagine having tens of millions, or just millions, of users, all of them with their records, history, and targeted-ads data. Or a mail provider that stores attachments in a database. Or a file-sharing service like those you and I know. That's plenty of information to manage. Add overhead, and it's easy to overfill even the biggest database.

Also I agree with you that bad design might be a concern. Of course there's no big database that couldn't get on a "purge" diet.

Now it seems to me we might have a problem querying such a big bucket of random data. Imagine a query taking months to complete. We're gonna be there in another ten years.

And then we lose the capacity to make electricity, and we can use our CDs and DVDs, not to mention magnetic media, to... well, dig trenches.

Those pesky petabytes of data are going to doom us.

Five alarm privacy invasion (-1, Troll)

Anonymous Coward | more than 5 years ago | (#24735595)

The potential for misuse of that kind and amount of private data, combined with the propensity of government and private industry to abuse it should be setting off alarm bells in everyone's heads. Not to mention that every black hat out there is diligently looking for the chinks in the (likely not effective) armor around those things.

OO databases have done this ten years ago (5, Interesting)

cjonslashdot (904508) | more than 5 years ago | (#24735641)

I remember encountering a 1+ petabyte database 10 years ago: it was the database to record and analyze particle accelerator experiment data at CERN. And it was built using a commercial object database - not relational. Oh but wait - the relational vendors have told us that OO databases don't scale....

That was ten years ago.

Re:OO databases have done this ten years ago (0)

Anonymous Coward | more than 5 years ago | (#24735849)

If all you are doing is reading the data, then OO is OK. If you are doing a lot of writes, then you need relational.

Re:OO databases have done this ten years ago (0)

Anonymous Coward | more than 5 years ago | (#24736015)

So, 10 years ago, when hard drives were 50 gigs max, you saw a 1 petabyte database?

That is what? 20,000 drives, minimum? I think 50 gigs would have been enormous for the time; probably a lot closer to 5.

So it was probably a tape system, if it existed at all. I wouldn't call a system living on tape a database by any means; the access would be too slow to do anything. Something isn't adding up.

There are many reasons to have DB vendors, but don't be a sore loser just because your technology of choice turned out to be a massive failure.

Re:OO databases have done this ten years ago (1)

dfetter (2035) | more than 5 years ago | (#24736587)

Storing it is one thing. Querying is a very different thing. What happens when somebody wants to find out something not specifically envisioned in the original experiment?

Re:OO databases have done this ten years ago (3, Interesting)

littlewink (996298) | more than 5 years ago | (#24737285)

You are mistaken. While certainly almost everything (right or wrong) has been said at some time by someone, nobody respectable who knew what they were doing ever claimed that object-oriented databases would not scale.

In fact OO and similar (CODASYL, network-style, etc.) databases were used, and continue to be used, very heavily in applications where relational databases do not scale.

Google Maps is way bigger... (3, Informative)

Plantain (1207762) | more than 5 years ago | (#24735651)

Google Maps' database is far bigger...

A base of 8 tiles, each splitting into four smaller tiles, in two modes (map/satellite), across 16 zoom levels.

Each tile is approx. 30kB.

(((0.03* (8 * (4^16)))/1024)/1024) == 983.04TB right there.

My calculator doesn't handle numbers big enough for streetview. O_O
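Python's arbitrary-precision integers handle it fine. Redoing the parent's estimate exactly as written (the tile counts and 0.03 MB/tile are the parent's assumptions, not anything published by Google):

```python
# The parent's tile estimate, redone with arbitrary-precision integers.
tiles = 8 * 4 ** 16                # 8 base tiles, quartered at each of 16 levels
tb = tiles * 0.03 / 1024 / 1024    # 0.03 MB (~30 kB) per tile -> GB -> TB

print(tiles)          # 34359738368 tiles
print(round(tb, 2))   # 983.04 TB, matching the parent's figure
```

Note the formula doesn't multiply by the two modes; doing so would double the estimate to nearly 2 PB.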

Re:Google Maps is way bigger... (4, Funny)

Speare (84249) | more than 5 years ago | (#24735689)

Google Maps' database is far bigger...

A base of 8 tiles, with each becoming four more smaller tiles, in two modes (map/satellite), and 16 zoom levels.

We are sorry, but we don't
have maps at this zoom
level for this region.
Try zooming out for a
broader look.

Re:Google Maps is way bigger... (1)

Plantain (1207762) | more than 5 years ago | (#24735743)

There's actually 20 zoom levels, but I'm approximating 16 as the average.

Re:Google Maps is way bigger... (1)

Speare (84249) | more than 5 years ago | (#24736339)

My point is, two thirds of the surface of the earth is water. Oceans have maybe two or three zoom levels. Given the fractal nature of the data, your estimate of "16 levels" as the global average is waaaaaay off base. I'd be very surprised if all the unique graphics for all modes ends up being more than 1 terabyte.

Re:Google Maps is way bigger... (1)

imsabbel (611519) | more than 5 years ago | (#24736441)

Then be surprised.

The Landsat data alone comes close to 1TB, and that is just the whole world at the coarse 30m-or-so resolution. (I know, because waaay back I mirrored part of the NASA WorldWind data.)

This data is in no way fractal in nature.

And just do the math (to see that the argument is bogus):

A km^2 at level 20 has 4^4 = 256 times as much data as one at level 16. If you do the math, central Europe alone is enough to push the world to an average of level 16 (Germany, e.g., is completely covered in airplane pictures, equaling about 25% of the earth's surface in level-16 equivalent).

"Barrier"? (1, Insightful)

Anonymous Coward | more than 5 years ago | (#24735673)

Gigabyte barrier. Petabyte barrier.

In what sense are these barriers? Does the database resist putting more data in it the closer to a petabyte you get? Is it likely to explode once it reaches 1 petabyte?

Re:"Barrier"? (1)

n9hmg (548792) | more than 5 years ago | (#24736439)

That is exactly why I bothered to post. I think banal idiots try to amplify the importance of a milestone, and a PB IS something of a psychological milestone, by calling it a barrier. There WAS a barrier, of sorts, at 2G or 4G depending on addressing scheme, but that was easily put away with other addressing schemes, and with 64-bit architecture, it's not even relevant any more.

Hey, I just passed the 384-character barrier! Whoa, breezing right on past! This is amazing!

When the petafile barrier crumbles ... (5, Funny)

cpu_fusion (705735) | more than 5 years ago | (#24735709)

... we'll need an army of Chris Hansens and a mountain of beartraps. God help us.

the only *real* barrier is backup time (5, Interesting)

petes_PoV (912422) | more than 5 years ago | (#24735751)

or more correctly, restore time.

Any organisation that wishes to be classed as in any way professional knows that the value in its databases has to be protected. That requires the means to recover the data if something bad happens. A hot-mirrored copy is simply not good enough (one corruption would get written to both copies).

As a consequence, the size of commercial databases is limited by the amount of time the organisation is willing to have it unavailable while it is restored, in the case of a disaster, or the time taken to create/update secure, offline, copies.

Not by intrinsic properties of the database or host architecture.
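A sketch of that arithmetic, with an assumed restore throughput (the 400 MB/s figure is illustrative, not a vendor number):

```python
# Back-of-envelope restore time for a 1 PB database.
# Throughput is an illustrative assumption for aggregate restore speed.
DB_BYTES = 1e15               # 1 PB of user data
RESTORE_MB_PER_S = 400        # assumed aggregate restore throughput

hours = DB_BYTES / (RESTORE_MB_PER_S * 1e6) / 3600
print(f"~{hours:.0f} hours")  # ~694 hours -- roughly a month offline
```

Even an order-of-magnitude improvement in throughput still leaves days of downtime, which is exactly the ceiling described above.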

Re:the only *real* barrier is backup time (1)

jdanton1 (1178389) | more than 5 years ago | (#24736357)

Block-based snapshots in conjunction with database backup packages are the only way to do this. For instance, with a NetApp filer you can take a block-level image and tie it in to Oracle's RMAN (Recovery Manager). It's the only way to deal with DBs that large. BTW, I think the size limit in Oracle 10 is on the order of 10 exabytes, and Oracle 11 has no size limit.

The world will only ever need 5 large databases (5, Funny)

davidwr (791652) | more than 5 years ago | (#24735857)

The world will only need 5 large databases.

None of them will ever need more than 640KB^H^HMB^H^HGB^H^HTB of RAM and 32MB^H^HGB^H^HTB^H^HPB of storage.

Is this article a commercial for proprietary dbs? (1)

GNUPublicLicense (1242094) | more than 5 years ago | (#24736061)

I think that's obvious... actually, with 1-terabyte hard disks now commonplace, reaching a petabyte is quite easy, even for a midsize organization. Where I work, we build our own disk arrays, and reaching 1000 terabytes is just a matter of putting together a thousand or so disks. Not a big deal.
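The parent's arithmetic checks out once redundancy is counted. A quick sketch (assuming a hypothetical RAID-6-style layout with 12-disk groups; the group size and parity count are illustrative, not a specific product's geometry):

```python
import math

TB = 10**12  # decimal terabyte
PB = 10**15  # decimal petabyte

def disks_needed(usable_bytes: int, disk_bytes: int = TB,
                 group_size: int = 12, parity_per_group: int = 2) -> int:
    """Raw disk count to present usable_bytes under RAID-6-style groups:
    each group of group_size disks donates parity_per_group to redundancy."""
    data_per_group = (group_size - parity_per_group) * disk_bytes
    groups = math.ceil(usable_bytes / data_per_group)
    return groups * group_size

print(disks_needed(PB))  # raw 1 TB disks for 1 PB usable
```

So "a thousand or so disks" is the right order of magnitude, creeping past it once parity and hot spares are included.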

IBM Boulder (2, Insightful)

Abattoir (16282) | more than 5 years ago | (#24736267)

Is the location of IBM's Managed Storage Services (MSS) division, which deploys SANs for customers in Boulder (including IBM internal) and other locations (over high-speed fibre links) on IBM "Shark" (ESS) and DS6000/DS8000 devices. When I worked at IBM, their marketing materials stated they were managing over 4 petabytes of data for enterprise customers out of that location alone, and that was four years ago! That doesn't count other MSS locations either, nor all the other areas where IBM implements large amounts of storage for customers. Remember, many if not most of IBM's customers are governments and Fortune 100 companies, particularly high finance. I think they've got some data.

So you want to talk about high levels of storage - IBM has the game covered, considering they invented the [ibm.com] HDD [wikipedia.org].

I wonder (1)

DragonTHC (208439) | more than 5 years ago | (#24736311)

How much of that data is marketing information?

seriously, is all of that data current and necessary?

seems to me that they should prune off and backup old data.

Re:I wonder (1)

arrowrod (1256976) | more than 5 years ago | (#24736675)

Wonder no more. Some jackass always puts a retention date on data, usually 25 or 100 years. Mostly trivial backups. The rationale: "Well, you never know." What is most amusing is that backup data is usually the same over and over, day after day, year after year.
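That observation, that successive backups are mostly identical, is exactly what content-addressed deduplication exploits. A toy sketch (fixed-size chunking for simplicity; real systems typically use variable-size, content-defined chunking):

```python
import hashlib

CHUNK = 4096
store = {}  # sha256 hex digest -> chunk bytes, each unique chunk stored once

def backup(data: bytes) -> list[str]:
    """Store data as chunks; return the 'recipe' (digest list) for restore."""
    recipe = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # no-op if chunk already stored
        recipe.append(digest)
    return recipe

# Two "daily" backups of an 8-chunk file, differing in one chunk:
day1 = b"".join(bytes([i]) * CHUNK for i in range(8))
r1 = backup(day1)
day2 = day1[:3 * CHUNK] + b"X" * CHUNK + day1[4 * CHUNK:]
r2 = backup(day2)
print(len(store))  # unique chunks kept for both backups
```

Two full backups cost barely more than one: only the changed chunk is stored twice, which is why "the same data over and over" needn't actually consume space over and over.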

Johnny Mnemonic (5, Funny)

vjmurphy (190266) | more than 5 years ago | (#24736347)

I need measurements I can understand, like how many Keanu Reeves' brains is a petabyte? And could he hold it indefinitely, or would his head explode at some point? If the latter, can we get him started on it now?

Re:Johnny Mnemonic (0)

Anonymous Coward | more than 5 years ago | (#24736573)

Well, it depends on whether or not he used a doubler.

Silly (0)

Anonymous Coward | more than 5 years ago | (#24736481)

My database will never reach 640K.

Chuck Norris (0)

Anonymous Coward | more than 5 years ago | (#24736513)

The hard drive was invented by Chuck Norris and he gave IBM the permission to use it. The petabyte is just a keyword Chuck Norris uses to describe the way he can take down Johnny Lawbreakers with his teeth.

My only concern (0)

Anonymous Coward | more than 5 years ago | (#24736611)

Is if this will run on Linux

How is this news? (4, Interesting)

Dark$ide (732508) | more than 5 years ago | (#24736617)

We've had petabyte databases on mainframes for a good couple of years. DB2 v9 on zSeries has two new tablespace types that make managing these humongous databases much easier.

So it may be news for the PC world but it's bordering on ancient history on IBM mainframes.

Gigabyte barrier? (1)

MaxEmerika (701730) | more than 5 years ago | (#24736793)

He meant that the terabyte barrier (not the gigabyte barrier) was broken fifteen years ago, correct?

The 1-petabyte Barrier is flattened (2, Informative)

cjjjer (530715) | more than 5 years ago | (#24737375)

Seems that Yahoo made this claim months [computerworld.com] ago but for a 2 petabyte database. The article goes on to list a couple of others that have more than 2 petabytes of archived data. So it's safe to say that the petabyte data barrier has been broken for some time.