Education

Google's Academic TB Swap Project 190

eldavojohn writes "Google is transferring data the old-fashioned way — by mailing hard drive arrays around to collect information and then sending copies to other institutions. All in the name of science & education. From the article, 'The program is currently informal and not open to the general public. Google either approaches bodies that it knows has large data sets or is contacted by scientists themselves. One of the largest data sets copied and distributed was data from the Hubble telescope — 120 terabytes of data. One terabyte is equivalent to 1,000 gigabytes. Mr. DiBona said he hoped that Google could one day make the data available to the public.'"
Comments Filter:
  • by garcia ( 6573 ) on Wednesday March 07, 2007 @12:02PM (#18262634)
    One terabyte is equivalent to 1,000 gigabytes.

    Uhh, no it isn't. It's really 0.9765625 terabytes.
    • by Cristofori42 ( 1001206 ) on Wednesday March 07, 2007 @12:06PM (#18262694)
      umm a terabyte is really 1 terabyte. Though 1 terabyte = 1024 gigabytes not 1000... but whatever.
      • Re: (Score:2, Informative)

        by garcia ( 6573 )
        Thanks for pointing out that I should have been hitting Preview instead of getting First Post :)

        1000GB = 0.9765625 TB, not 1TB.
      • Nope (Score:3, Informative)

        by sheldon ( 2322 )
        How you measure a terabyte depends on whether you are buying disk, or monitoring disk usage on your server.

        The disk manufacturers define a terabyte as 1000 gigabytes, a gigabyte as 1000 megabytes, a megabyte as 1000 kilobytes, and a kilobyte as 1000 bytes.

        The OS measures it as 1024 gigabytes, where each gigabyte is 1024 megabytes, each megabyte is 1024 kilobytes, and each kilobyte is 1024 bytes.

        Why? Because when you're buying a drive, 750 Gigs sounds bigger than 698.5 gigs.
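
        A quick sketch of that arithmetic (Python; the 750 GB drive is just the example above, not a figure from the article):

        # "750 GB" as the manufacturer counts it vs. what the OS reports.
        advertised_bytes = 750 * 10**9        # manufacturers: 1 GB = 10^9 bytes

        gib = advertised_bytes / 2**30        # most OSes count in 2^30-byte units
        tib = advertised_bytes / 2**40

        print(f"{gib:.1f} 'gigs' as the OS sees it")       # ~698.5
        print(f"{tib:.3f} 'terabytes' as the OS sees it")  # ~0.682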
      • Not according to NIST (Score:4, Interesting)

        by Ernesto Alvarez ( 750678 ) on Wednesday March 07, 2007 @04:51PM (#18267030) Homepage Journal
        If you want to be strict, the SI defines the "tera" prefix as 10^12, so 1 terabyte = 1000 gigabytes.

        If you want to use the binary values, you might as well use the correct "tebi" prefix. NIST [nist.gov] says you should, and it looks like the IEC, IEEE and BIPM agree.
    • Re: (Score:2, Informative)

      by wizzard2k ( 979669 )
      From wikipedia:
      A tebibyte (a contraction of tera binary byte) is a unit of information or computer storage, abbreviated TiB.

      1 tebibyte [wikipedia.org] = 2^40 bytes = 1,099,511,627,776 bytes = 1,024 gibibytes

      The tebibyte is closely related to the terabyte, which can either be an (inaccurate) synonym for tebibyte, or refer to 10^12 bytes = 1,000,000,000,000 bytes, depending on context.
    • Re: (Score:2, Insightful)

      by AchiIIe ( 974900 )
      Nope, that's wrong

      see: http://en.wikipedia.org/wiki/Tebibyte [wikipedia.org]
      * 1 Terabyte = 1000 Gigabyte
      * 1 Tebibyte = 1024 Gibibyte
    • Re: (Score:2, Insightful)

      by wolff000 ( 447340 )
      WHO CARES?!? I have worked with mathematicians who did not squabble over these terms, so why the hell are we?!? My mother, who can hardly turn a computer on, knows damn well that 1000 megabytes is roughly 1 gigabyte. Now let's get back to the topic. You would think Google would have some brilliant way to push a terabyte through the "tubes" instead of just mailing drives; how archaic.
    • I'm just happy they're not swapping tuberculosis [acronymfinder.com].
    • by guruevi ( 827432 )
      I'm old and interested enough to know what REALLY happened throughout history:

      First, as taught in every school book and computer manual of the era (see Apple, Amiga, Microsoft, Commodore): 1024 bytes = 1 kilobyte, 1024 kilobytes = 1 megabyte, etc., because the computer could only calculate in powers of 2 (1 and 0), and 20 MB (20,480 kilobytes) was about the largest hard drive you could get.

      A Kilobyte is 1024 (2^10) bytes. A Megabyte is 1024 Kilobytes or 1,048,576 bytes (2^20) and a Gigabyte is 1024 Mega
      • by HTH NE1 ( 675604 )
        The annoying part for me today is that flash memory is in powers of two (64 MB, 128 MB, 256 MB, 512 MB, etc.), be it for cameras or in USB thumbdrives, yet the units are metric, not binary (stating 1 MB == 1 million bytes on the packaging).

        When I see a power of 2 next to the units, I expect the units to be in a power of 2 too.
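
        The gripe above, in numbers (a minimal Python sketch; the 512 MB card is a hypothetical example, not from any particular package):

        # A card sold as "512 MB" counted in decimal on the packaging, versus the
        # 512 * 2^20 bytes its power-of-two capacity suggests.
        decimal_bytes = 512 * 10**6
        binary_bytes = 512 * 2**20

        print(binary_bytes - decimal_bytes)                          # 24870912 bytes short
        print(f"{100 * (1 - decimal_bytes / binary_bytes):.1f}%")    # ~4.6% smaller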
  • Large datasets (Score:5, Informative)

    by BWJones ( 18351 ) * on Wednesday March 07, 2007 @12:03PM (#18262648) Homepage Journal
    This is absolutely the most cost-effective way of transferring large amounts of data like this. If you do the calculations on terabyte-size files, sneakernet (or FedEx net) is actually faster and less expensive. We also went to one of Jim Gray's seminars when he was here giving an Organick Memorial Lecture, and he made an incredibly compelling demonstration using a variety of data types. We ended up talking with him for some time afterward about new projects we are engaging in that will also generate terabytes of data, and his suggestion was to pass applications rather than data, which was interesting.

    This is becoming more and more the norm in scientific research and Google's work is quite welcome.
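
    For anyone curious why sneakernet wins, here's a back-of-envelope Python sketch; the link speeds and the one-to-two-day courier figure are assumptions for illustration, not numbers from the article:

    # Time to move the 120 TB Hubble set over the wire at various line rates.
    data_bits = 120 * 10**12 * 8

    for gbps in (0.1, 1, 10):                 # 100 Mb/s, 1 Gb/s, 10 Gb/s
        days = data_bits / (gbps * 10**9) / 86400
        print(f"{gbps:>4} Gb/s -> {days:.1f} days")

    # ~111, ~11, and ~1.1 days respectively, versus a day or two for a box of
    # drives, no matter how many drives are in the box.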

    • by Sobrique ( 543255 ) on Wednesday March 07, 2007 @12:09PM (#18262734) Homepage
      Never underestimate the bandwidth of a lorryload of backup tapes traveling at 60 miles an hour.

      Latency may leave something to be desired though :)

      • Never underestimate the bandwidth of a lorryload of backup tapes traveling at 60 miles an hour.

        Close enough.. This is attributable to Andy Tanenbaum according to http://www.bbc.co.uk/dna/h2g2/A678576 [bbc.co.uk] (and one of his books I read).

        Another on-topic remark:

        Google either approaches bodies that it knows has large data sets

        I know people who also approach bodies that they know have large 'data sets', but that doesn't get them a lot of 'bandwidth' ;)

    • by UnknowingFool ( 672806 ) on Wednesday March 07, 2007 @12:11PM (#18262774)
      FedEx delivered what appeared to be a ton of broken office chairs to Google headquarters this morning. When asked for the sender's ID, the severely beaten FedEx courier would only reply that the sender wished to remain anonymous.
      • Mod parent up (Score:3, Informative)

        by ari_j ( 90255 )

        Here's what happened when I FedExed my RMA to Newegg, packed very carefully. Note the bent motherboard - I didn't even know you could do that. The good news is that FedEx paid part of my claim ... they paid $100 plus the $8.33 that the FedEx store charged me to fax in the claim forms. The bad news is that they did not refund my original shipping or pay more than $100 on the over $280 of damage that they did. It also took about 4 hours of phone calls to even convince FedEx that I was not the seller, and

        • The bad news is that they did not refund my original shipping or pay more than $100 on the over $280 of damage that they did.


          Did you buy additional insurance over the $100 you get by default?
          • Re:Mod parent up (Score:5, Informative)

            by MajinBlayze ( 942250 ) on Wednesday March 07, 2007 @03:26PM (#18265958)
            As a former UPS employee (I worked as a package handler, the guy who beats the shit out of your boxes as he loads them on the truck), I will never ship anything of value without paying extra for the insurance. When you do that, a couple of things happen:
            1. The item goes into a big bag with red/white stripes (by itself, not mixed with other items), so employees know not to mess with it.
            2. It gets hand-carted to the destination truck, and is the last thing to be loaded and the first unloaded.
            3. Only seasoned workers ever touch your package, and they generally care about the state that it's in.
            4. Finally, they are good about paying up if the item arrives damaged.
            Did I forget to include ???? and Profit!
        • the insurance remedy was to return it to the origination address and ask to see an original purchase receipt to award the insurance claim

          Sorry to nitpick, but this scam has been around for ages - you broke something, oh no! I'll send it to myself and pretend UPS did it. Hell, I even saw it in Seinfeld. Not that you were doing this, but what you tried is pretty suspicious to an outside observer.

          They need SOME proof of value or even that the box was actually full to fight this type of fraud, and the
          • by ari_j ( 90255 )
            Customers in general ought not to be held to know FedEx's corporate structure. I did indeed use the Newegg-provided label. As to my prior shipment broken by UPS, of course I realize that there is the potential for scams. I was shipping Christmas presents to myself because it was cheaper and, on average, safer than trying to check them on my return flight. See my other replies in this thread for more on the FedEx $100 insurance situation.
            • Customers in general ought not to be held to know FedEx's corporate structure.

              I don't know if, in this age, this is wise. With so many corporations buying up major parts of our lives like food, communications, salaries, and transportation, I would challenge you to take a look at the structure of the different entities that affect you daily. The unfortunate fact is that every decision you make needs to be researched to find the most appropriate course of action based on who is behind the marketing. Su
              • by ari_j ( 90255 )
                I would agree with you, except that I don't think the average consumer should be held to that level of sophistication. This is mostly a cheapest-cost-avoider issue, for me. Who can more efficiently discover the relevant information? Clearly, the answer here is FedEx.
        • We ended up buying a bunch of these to ship the arrays around in. Cardboard == bad :-)
    • Re: (Score:3, Insightful)

      by dmayle ( 200765 )

      I remember an article I read on this, I think back in the year 2000. There was a research scientist who built a standardized platform (that is to say, a specific PC case with a certain number of hard drive bays and certain network cards) so that he could exchange data with other universities. They would fill the networked PC with data, and they could ship it to any of the participating projects, knowing that they'd get back the same hardware in return.

      I remember at the time thinking it was just one of

      • Re: (Score:3, Insightful)

        by BWJones ( 18351 ) *
        Yeah, there have been a number of folks using variations on this theme for a while now. It's been interesting that network performance really has not followed the same performance curve as storage and CPU throughput. Add to that the growing amount of data being pushed through "consumer" pipes from people obtaining broadband and pushing sources such as YouTube and company and you have the makings for a bandwidth crunch. This of course is the reason for separate academic and government Internet paths, but
        • In fact, at some universities engaging in data intensive projects, it is not uncommon for them to occupy the entire bandwidth of the university in off hours to transfer data around the country to various collaborators.

          Even using the full bandwidth between Internet2-connected universities, it would still take 2~3+ days to transfer 250 TB of data.

          10 Gb/s is close to the max you can do with one frequency. That will all change once they start pumping multiple colors down their fiber. Their bandwidth will explode & Go
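
          A quick sanity check on the parent's 2~3 day figure (Python; assumes a sustained, uncontended 10 Gb/s link):

          seconds = (250 * 10**12 * 8) / (10 * 10**9)    # 250 TB at 10 Gb/s
          print(f"{seconds / 86400:.1f} days")           # ~2.3 days, before protocol overhead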

        • by jcnnghm ( 538570 )
          Internet bandwidth hasn't kept up, but local bandwidth definitely has. My network throughput is more than capable of transmitting data faster than my hard drives are able to write it. And I wouldn't even agree about the net bandwidth. I have a 15 Mb/s connection where I used to have a 56k.
      • by chrisd ( 1457 ) *
        That might have been Queue's interview with Jim Gray? Check it out here:

        http://www.acmqueue.org/modules.php?name=Content&pa=showpage&pid=43 [acmqueue.org]

        Chris

    • Re: (Score:3, Informative)

      by Agent Orange ( 34692 )
      Yup. There was a paper a few years back entitled "TeraScale SneakerNet", by Jim Gray and a couple of guys at MSFT's research division, on this. You can find it in the arxiv [arxiv.org].

      This concept has also been applied to such things as the Sloan Digital Sky Survey [sdss.org]. Astronomers do tend to generate a lot of data with large surveys such as this.
    • As the old joke goes, never underestimate the bandwidth of a station wagon full of magnetic tapes. Or a FedEx plane full of hard drives. Your choice.
    • We have been sending two DVDs, with about 6-8 GB data, around every month for updates. Now we are trying rsync, which in our view has been more convenient.
      • by Laur ( 673497 )

        We have been sending two DVDs, with about 6-8 GB data, around every month for updates. Now we are trying rsync, which in our view has been more convenient.

        The article and the GP are about sending large amounts of data, as in terabytes. In this discussion, 8 GB is tiny, and is easily downloaded much faster than even express mail. Besides, rsync won't really help if all your data is unique (such as astronomical data). Rsync really helps when very little of your data set changes between updates, such as ba
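
        A rough sketch of why that is (Python; the 10 Mb/s link and the 5% change rate are assumed for illustration, not from the comments above):

        # rsync only saves time when most blocks are unchanged between updates.
        full_copy_bits = 8 * 10**9 * 8           # the whole 8 GB set
        delta_bits = 0.05 * full_copy_bits       # suppose 5% changed since last sync
        link_bps = 10 * 10**6

        for label, bits in (("full copy", full_copy_bits), ("rsync delta", delta_bits)):
            print(f"{label}: {bits / link_bps / 3600:.2f} h")

        # Unique data such as new astronomical images has no unchanged blocks to
        # skip, so rsync degenerates to the full-copy case.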

    • Wait, I'm confused, what happened to the tubes?
    • by Duncan3 ( 10537 )
      Shhhh... It's Google we're talking about, THEY came up with this groundbreaking shift in how data is handled.

      Praise the Google, don't point out they are just doing the same thing as everyone else.

      Google is watching.

  • But are they using station wagons?
  • Never underestimate the bandwidth of a station wagon... [bpfh.net]

    Still very much applies today.

    Ryan Fenton
    • The page you linked to had a smart idea. Rather than just carrying raw disks, build some sort of architecture inside to allow for rapid transmission of the data from the vehicle upon arrival. I could see specialized vehicles, hardened against accidents, with an inverter to power the drives and external fiber-optic ports hooked up to massive, high-speed RAID arrays, so the contents could be rapidly dumped to another system at the destination and new content uploaded for the next stop.

      Then a GPS syst

  • How long do you think it will be until some maroon somewhere plunks a hard drive into an unpadded envelope and drops it in the big blue mailbox on the corner?
  • so.. (Score:3, Interesting)

    by mastershake_phd ( 1050150 ) on Wednesday March 07, 2007 @12:08PM (#18262718) Homepage
    Who's going to own the data? I hope Google isn't going to say they do, like they want to with the old books they're scanning. Every time you download a Hubble picture, will it have a Google watermark?
    • Re: (Score:2, Flamebait)

      Who's going to own the data?

      As always the people of the world own the data. The copyright holders are, however, given a short term monopoly on making copies of it, with certain exceptions.

      I hope Google isn't going to say they do, like they want to with the old books they're scanning.

      Google has not, as far as I know, claimed "ownership" or even copyright on anything they've scanned. They have, however, created their own database of metadata about the works, which they use to enable people to more easily find specific items in the original data.

      Every time you download a Hubble picture, will it have a Google watermark?

      Umm, maybe. Why do I care if they add watermarks to it? If they are in the way

      • Every time you download a Hubble picture, will it have a Google watermark?

        Umm, maybe. Why do I care if they add watermarks to it?

        Because there's no water in space! Obviously then there shouldn't be any marks indicating as such on a picture the Hubble telescope took!
    • Re: (Score:3, Interesting)

      by cfulmer ( 3166 )
      The ownership of data is presumably a case-by-case thing that depends on what the data is and how it was acquired.

      For example, Google does not own the copyright on out-of-copyright books that it scans in (nobody does, by definition.) At best, it might own the copyright on the scan that it did, but that's really unlikely--copyright protects creative expression and a straight scan doesn't add any.

      However, they probably have some rights under unfair competition law because they have gone through a lot of work
      • Re: (Score:3, Informative)

        by oneiros27 ( 46144 )

        So, if Google takes the raw data and does that color assignment itself, well, the result is theirs.

        I'm not so sure that the result is theirs, necessarily. They'd need to properly attribute it. Many science archives have rules about how to properly attribute their work.

        Don't get me wrong -- many of the scientists want people to use their data (eg, see The Astronomer's Data Manifesto [ivoa.net]), but they also want to know who's using it, because it's how they justify the value of their projects, and the costs incurr

        • by cfulmer ( 3166 )
          Attribution is different from copyright. For example, say you have a novel scientific idea which you write about in some scientific journal and that I read your article and publish my own article, using your idea without attribution.

          Now, what I've done would reasonably upset you, but there is no law (at least in the US) that requires me to attribute your ideas to you. In fact, under those facts, I completely own the copyright in my article and you have no legal remedy. Now, there may be repercussions--I
  • by boyfaceddog ( 788041 ) on Wednesday March 07, 2007 @12:09PM (#18262736) Journal
    The bandwidth of a moving van full of disks.

    Looks like Google is hoarding data. Seems they at least are equating information with power and money. And them that has the power and money makes the rules.
  • by Anonymous Coward on Wednesday March 07, 2007 @12:12PM (#18262784)
    Moe: Say, Barn, uh, remember when I said I'd have to send away to NASA to calculate your bar tab?
    Barney: Oh ho, oh yeah, you had a good laugh, Moe.
    Moe: The results came back today. (reading a printout) You owe me seventy billion dollars.
    Barney: Huh?
    Moe: No, wait, wait, wait, that's for the Voyager spacecraft. Your tab is fourteen billion dollars.
  • Hubble Data (Score:2, Funny)

    by Ikyaat ( 764422 )
    120 TB of data from the Hubble telescope? I wish I was paid to go through that. And this picture is of a...star and this one is a star And a star another star OMG its a FRICKIN STAR
  • SUVs to transport those hard drives. That would be evil.
  • I don't know what the article title conjured up in your head, but when I saw:

    Google's Academic TB Swap Project
    ...the first thing I thought was "why are they swapping around samples of a dangerous infectious disease like tuberculosis?"
    • ...the first thing I thought was "why are they swapping around samples of a dangerous infectious disease like tuberculosis?"

      I'm glad I wasn't the only one!
  • Don't say I didn't warn you guys about this "don't be evil" thing. First they start swapping TB for "academic" purposes, then maybe some avian influenza in some apartments around Mountain View, and next thing you know, there'll be a smallpox outbreak and we will coincidentally receive advertisements on Gmail that we can buy the cure for a few thousand dollars from one of their AdSense "partners."
  • One terabyte is equivalent to 1,000 gigabytes.

    Hey, where do you think you are? It's Slashdot here! Everyone knows that! What people here want to know is how much that is in Libraries of Congress...

    The only thing you're getting by saying that is a flamewar between 10 kinds of people: those who count only in MB (and disagree with you) and those who count in both MB and MiB (and agree with you)!

    For my take on the issue, see this previous post [slashdot.org] of mine.

    • Actually it's 1024 gigabytes using binary units (base 2); we use binary units because formatted capacity is measured in binary units. For example: 1 Exabyte = 1(1024) Petabytes = 1(1024)(1024) Terabytes = 1(1024)(1024)(1024) Gigabytes, and so on... The formula to convert SI units into binary units is si_unit * (125/128), which comes out to 0.9765625. For example: a 750GB hard drive is 750(125/128) = 732.421875 Gigabytes. Also don't forget reserved space... On FreeBSD it's 8% of the format capacity, so 732.421875
      • by alexhs ( 877055 )

        we use binary units because formatted capacity is measured in binary units.

        It seems you haven't read my previous post I was linking to. Please do :)
        Your statement is wrong. The correct statement would be "we use binary units because some OSes report formatted capacity in binary units".

        Proof that I've read your post in its entirety is that I was going to write "MS Windows" (like I did in the aforementioned post) instead of "some OSes" :) . My server at home is a FreeBSD box; I launched fdisk and it reports size in "Meg", neither MB nor MiB. So I can't say :) What command did you ente

  • "The moral of the story is: Never underestimate the bandwith of a station wagon full of tapes hurtling down the highway."

    -Andrew Tannenbaum
  • Mr DiBona, who is a long-standing Linux evangelist, said: "I am comfortable with where Google is operating. People are often upset and feel we should be releasing more.

    "And I agree; I would love to release more. It's more a function of engineering time, than it is a function of desire."


    I call B.S. "Lack of engineering time" is why we haven't seen the source to the core search engines or gmail?

  • I've been thinking that the only home use for lots of HD storage space would be A/V. Now, I guess when 10 PB of HD are $100-1120, then we'll be able to get copies of these 120 TB of Hubble data, or TBs of other datasets, to fill up those future home PB HDs. One day we'll need home exabyte HDs to store and play around with public PB datasets.

    I can only hope that bandwidth can keep up. How long would it take to transfer a 120 TB BitTorrent file over either cable or DSL?

    Well, maybe we'll have small TB USB flashd
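
    To put some rough numbers on the question above (Python; the DSL and cable rates are assumed, typical-for-the-era figures, not from the article):

    data_bits = 120 * 10**12 * 8                 # the 120 TB Hubble set, in bits

    for label, mbps in (("DSL, 1.5 Mb/s", 1.5), ("cable, 8 Mb/s", 8)):
        years = data_bits / (mbps * 10**6) / (3600 * 24 * 365)
        print(f"{label}: {years:.1f} years")     # ~20 years and ~3.8 years

    # Hence the hard drives in a FedEx box.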
  • ...that a researcher sends them all the printouts of his/her data... on greenbar...

  • ...what does this new P2P technology mean for me? I guess the RIAA is really in for it now.
  • ...why not tapes? (Score:4, Interesting)

    by Penguinisto ( 415985 ) on Wednesday March 07, 2007 @01:18PM (#18263766) Journal
    I understand the whole "HDD w/ a common filesystem = more compatibility" thing, but wouldn't it be easier to simply send along some tapes of a type appropriate to the format/type that the scientific institution uses? LTO-3 can do 800GB compressed, SDLT can do up to 600... and neither is susceptible to data loss when it gets bounced too hard by FedEx/UPS/DHL/whatever. (Plus it would make for a lighter package, wouldn't require some poor IT schmuck to disassemble a server or wait forever for USB to transfer all of it, etc...)

    I'm not criticizing or anything; just curious is all.

    /P
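
    For scale, here's how many pieces of media the 120 TB Hubble set would need at the capacities mentioned above (a quick Python sketch; the 750 GB drive size comes from other comments in this thread, not the article):

    import math

    data_gb = 120000                              # 120 TB, counting 1 TB = 1000 GB
    for label, capacity_gb in (("LTO-3, 800 GB compressed", 800),
                               ("SDLT, 600 GB", 600),
                               ("750 GB hard drives", 750)):
        print(f"{label}: {math.ceil(data_gb / capacity_gb)} units")

    # 150 tapes, 200 tapes, or 160 drives, before any filesystem overhead.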

    • 1.3 TB each or so. About $150,000. The drive is about $5,500, so $155,000 in total. A 750 GB hard disk costs about $1,000, so it'd cost about $160k to do the same with hard disks.

       
    • by Laur ( 673497 )

      wouldn't it be easier to simply send along some tapes of a type appropriate to the format/type that the scientific institution uses?

      There are basically two reasons one would choose to use HDDs over tapes: compatibility and price.

      Compatibility: Sure, one scientific institution may have standardized on a specific type of tape, but what about all the rest? Pretty much everyone in the world can read a standard HDD formatted with a well-known filesystem.

      Price: what is the cost of HDDs vs. tapes per gigabyt

    • Re: (Score:2, Interesting)

      by kulover ( 967626 )
      The reason for not using tapes is exactly the compression. Compressing that data and then writing it to tape takes a lot of time, and the same process would have to be repeated on the other end.

      Besides, using HDDs for transfer means immediate access to the same data on the other end, with speeds that are unmatched by tape backup systems. It might also be worth noting that data sets that large are usually stored on large RAID systems like this one from LSI Logic, http [lsilogic.com]
    • Re: (Score:3, Interesting)

      by K8Fan ( 37875 )

      The "TeraScale SneakerNet" paper posted earlier [arxiv.org] anticipates and answers that. They ship a fully assembled computer with processor, RAM, OS and network interface. Plug it in to the wall, plug it in to the network and assuming you had previously agreed on a networking protocol, you're rolling as soon as it boots! No restoration, no decompressing, immediate access to the data.

      Does anyone have a Linux distro for this specific purpose? Preferably tiny enough to fit onto a USB key and optimized for bandwidth, p
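
      Not a distro, but the "previously agreed protocol" can be as simple as plain HTTP. Here's a minimal sketch of what such a shipped box could run on boot (Python standard library only; the port is an arbitrary placeholder, and none of this is prescribed by the paper):

      # Serve the data directory the box was shipped with over HTTP.
      import http.server
      import socketserver

      PORT = 8000
      handler = http.server.SimpleHTTPRequestHandler   # serves the current directory

      with socketserver.TCPServer(("", PORT), handler) as httpd:
          httpd.serve_forever()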

Say "twenty-three-skiddoo" to logout.

Working...