Google's Academic TB Swap Project

CmdrTaco posted more than 7 years ago | from the hey-look-it's-chris dept.

Education 190

eldavojohn writes "Google is transferring data the old fashioned way — by mailing hard drive arrays around to collect information and then sending copies to other institutions. All in the name of science & education. From the article, 'The program is currently informal and not open to the general public. Google either approaches bodies that it knows has large data sets or is contacted by scientists themselves. One of the largest data sets copied and distributed was data from the Hubble telescope — 120 terabytes of data. One terabyte is equivalent to 1,000 gigabytes. Mr. DiBona said he hoped that Google could one day make the data available to the public.'"

190 comments

Should we be continuing this fallacy? (3, Informative)

garcia (6573) | more than 7 years ago | (#18262634)

One terabyte is equivalent to 1,000 gigabytes.

Uhh, no it isn't. It's really 0.9765625 terabytes.

Re:Should we be continuing this fallacy? (5, Funny)

Cristofori42 (1001206) | more than 7 years ago | (#18262694)

umm a terabyte is really 1 terabyte. Though 1 terabyte = 1024 gigabytes not 1000... but whatever.

Re:Should we be continuing this fallacy? (2, Informative)

garcia (6573) | more than 7 years ago | (#18262766)

Thanks for pointing out that I should have been hitting Preview instead of getting First Post :)

1000GB = 0.9765625 TB, not 1TB.

Re:Should we be continuing this fallacy? (2, Informative)

wizzard2k (979669) | more than 7 years ago | (#18262714)

From Wikipedia:
A tebibyte (a contraction of tera binary byte) is a unit of information or computer storage, abbreviated TiB.

1 tebibyte [wikipedia.org] = 240 bytes = 1,099,511,627,776 bytes = 1,024 gibibytes

The tebibyte is closely related to the terabyte, which can either be an (inaccurate) synonym for tebibyte, or refer to 1012 bytes = 1,000,000,000,000 bytes, depending on context.

Re:Should we be continuing this fallacy? (1)

wizzard2k (979669) | more than 7 years ago | (#18262856)

Oops. /. doesnt do <sup>...
240 bytes = 2^40 bytes
1012 bytes = 10^12 bytes


My bad.
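For anyone who wants to check the arithmetic in this subthread, here is a minimal Python sketch. The constant names and the `convert` helper are illustrative assumptions, not taken from the article or from Wikipedia; the byte counts themselves are the ones quoted above.

```python
# Decimal (SI) and binary (IEC) storage units, expressed as byte counts.
GB, TB = 10**9, 10**12      # gigabyte, terabyte (powers of ten)
GiB, TiB = 2**30, 2**40     # gibibyte, tebibyte (powers of two)

def convert(amount, from_unit, to_unit):
    """Convert an amount between two units given as byte counts."""
    return amount * from_unit / to_unit

print(TiB)                       # 1099511627776
print(convert(1, TiB, GiB))      # 1024.0
print(convert(1, TB, GB))        # 1000.0
print(convert(1000, GiB, TiB))   # 0.9765625
```

The last line is the 0.9765625 figure from earlier in the thread: it only comes out that way if "GB" and "TB" are both read as binary units.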

Re:Should we be continuing this fallacy? (0, Troll)

vidarh (309115) | more than 7 years ago | (#18264064)

Just because some morons decided to randomly invent new names and try to get people to change established usage doesn't give them any authority... Whenever I see someone pushing the *bibyte crap the only thing it achieves is to annoy me.

Re:Should we be continuing this fallacy? (1, Insightful)

AchiIIe (974900) | more than 7 years ago | (#18262900)

Nope, that's wrong

see: http://en.wikipedia.org/wiki/Tebibyte [wikipedia.org]
* 1 Terabyte = 1000 Gigabyte
* 1 Tebibyte = 1024 Gibibyte

Re:Should we be continuing this fallacy? (0)

Anonymous Coward | more than 7 years ago | (#18263252)

I'll die stubborn and proud before I start saying words like "gibibyte" out loud. I appreciate the want for accurate language so that our poor little brains don't have to infer what is meant by the context, but "they" could have picked a less stupid-sounding word.

Re:Should we be continuing this fallacy? (4, Insightful)

Professor_UNIX (867045) | more than 7 years ago | (#18263342)

* 1 Terabyte = 1000 Gigabyte * 1 Tebibyte = 1024 Gibibyte
Yea, yea, yea. And you also believe a hacker isn't someone who maliciously breaks into computer systems, it's just a curious innocent person right... crackers are the criminals! Give it up. The general public is never going to adopt "Tebibyte" into the language because terabyte sounds much more fucking cool.

Re:Should we be continuing this fallacy? (2, Insightful)

wolff000 (447340) | more than 7 years ago | (#18263524)

WHO CARES?!? I have worked with mathematicians that did not squabble over these terms, so why the hell are we?!? My mother, who can hardly turn a computer on, knows damn well that 1000 megabytes is roughly 1 gigabyte. Now let's get back to the topic. It seems Google would have some brilliant way to push a terabyte through the "tubes" instead of just mailing drives. How archaic.

Re:Should we be continuing this fallacy? (1, Insightful)

Anonymous Coward | more than 7 years ago | (#18263684)

Very dumb people have convinced themselves that the entire disk industry has been cheating them out of storage space for decades. They seem to believe that if disk manufacturers printed the "real" size on the box, the disk drives would somehow cost less.

Problem (-1, Troll)

Anonymous Coward | more than 7 years ago | (#18262640)

I get sperm in my keyboard. Can't lick it out. Any suggestions ?

Re:Problem (0)

Anonymous Coward | more than 7 years ago | (#18263964)

Talk about physical transfer of terabytes of DNA info!!

How do you get the sperm separated from the semenal fluid before it hits the keyboard?

[mod -10 troll ;-)>]

1TB = 1024 GB (0)

adityapk (841961) | more than 7 years ago | (#18262644)

1 TB = 1024 GB, for gods sake

Re:1TB = 1024 GB (5, Insightful)

91degrees (207121) | more than 7 years ago | (#18262790)

Why?

Why is a kilobyte 1024 bytes if "kilo" means 1000, according to both the SI and the Greeks (kilo is derived from khilioi)? If 1 kg = 1000g, 1 kV = 1000V, 1 km = 1000m, why should hard disks break the pattern?

When we're talking about addressable computer memory, approximating the kilobyte to 1024 is a convenience, but since Terabyte gives such a huge error, and makes absolutely no sense for data transfer or disk sizes, it's really time we stopped this illogical naming convention just because some engineers found a term convenient 40 years ago.
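The "huge error" at the terabyte level can be made concrete with a short Python sketch. The function name is a hypothetical helper chosen for illustration; it simply compares 1024^n with 1000^n for each prefix.

```python
def binary_excess_percent(n):
    """How much larger 1024**n is than 1000**n, in percent."""
    return (1024**n / 1000**n - 1) * 100

# kilo = 1, mega = 2, giga = 3, tera = 4
for name, n in [("kilo", 1), ("mega", 2), ("giga", 3), ("tera", 4)]:
    print(f"{name}: {binary_excess_percent(n):.2f}%")
# kilo: 2.40%
# mega: 4.86%
# giga: 7.37%
# tera: 9.95%
```

So the gap between the two readings grows from about 2.4% at kilo to roughly 10% at tera.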

Re:1TB = 1024 GB (2, Interesting)

NinjaTariq (1034260) | more than 7 years ago | (#18263006)

Use the kibibyte [wikipedia.org] if you have a big problem with it.

But I have long since buried my problem with using the SI prefix with byte to mean a power of 2; actually, I'm not sure I ever had one, I just accepted it. I am happy with 1024 B = 1 KB, 1024 KB = 1 MB, 1024 MB = 1 GB and 1024 GB = 1 TB. The usable space is lower in the case of non-volatile storage anyway, so 1 TB never really means 1024 GB; it might be closer to 1000 GB (I don't know).

Re:1TB = 1024 GB (1)

91degrees (207121) | more than 7 years ago | (#18263118)

No. I'm quite happy to accept that a Terabyte is anywhere between 1,000,000,000,000 and 1,100,000,000,000 bytes for general use, simply because it doesn't matter. It gives an idea of the amount of storage, which is all we need. If I was specifying I'd use neither and just say 3.7*10^12 bytes or whatever.

I just get a bit fed up when people insist that the illogical and deprecated usage of terminology is correct, and that a usage that has been accepted for quite some time (and long before marketing got involved) is incorrect.

Re:1TB = 1024 GB (1)

0123456 (636235) | more than 7 years ago | (#18263142)

Because only real nerds have a problem with 1KB being 1024 bytes rather than 1000 bytes, and kibibytes or whatever you want to call them is a really stupid name. Who wants to have to deal with buying 1.073741 gigabyte DIMMs for their PC when we can just agree instead that a gigabyte is a power of two, not a power of ten?

As for why it's different for disks to RAM, disk manufacturers discovered a long time ago that they could make more money by using SI rather than binary measures for disk size, because it artificially inflated their size. Hence people now complain that they buy a 'one terabyte' drive and it actually only holds 900 gigabytes and change.
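A quick Python sketch of the two numbers in this comment, assuming the drive label uses decimal terabytes and the operating system reports binary gigabytes (GiB). The function names are hypothetical, chosen only for this example.

```python
def labeled_tb_in_gib(tb):
    """Binary gigabytes (GiB) in a drive labeled with decimal terabytes."""
    return tb * 10**12 / 2**30

def gib_in_decimal_gb(gib):
    """Decimal gigabytes in a module sized in binary gigabytes (GiB)."""
    return gib * 2**30 / 10**9

print(round(labeled_tb_in_gib(1), 2))   # 931.32
print(gib_in_decimal_gb(1))             # 1.073741824
```

Under those assumptions a "one terabyte" drive shows up as about 931 binary gigabytes, and a one-gigabyte DIMM is the 1.073741 decimal gigabytes mentioned above.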

Re:1TB = 1024 GB (1)

Vandilizer (201798) | more than 7 years ago | (#18263172)

Simple: we don't, we just work in a different base:

2^10 = 1024 bytes

See:
http://en.wikipedia.org/wiki/Kilobyte [wikipedia.org]

It's not illogical; it makes perfect sense to anyone who programs, well, anyone who does lower level programming. If computers were to work in base 10... Sorry, I can not even go there.

True false

v.s. the classic

True Maybe False

v.s. The new base 10 computing

True
Could be factual
Might be accurate
Maybe right
Slightly correct
Slightly fake
Maybe phony
Might be counterfeit
Could be wrong
False

(Ok maybe not this bad)

Re:1TB = 1024 GB (2, Insightful)

91degrees (207121) | more than 7 years ago | (#18263410)

It's not illogical; it makes perfect sense to anyone who programs, well, anyone who does lower level programming. If computers were to work in base 10... Sorry, I can not even go there.

If we want to worry about that then use KiB and MiB. But that doesn't make a huge amount of sense. 1KiB = 400h bytes. 1MiB = 100000h bytes. Powers of 256 would make a lot more sense.
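The hex values are easy to verify. A minimal Python sketch, using only the built-in `hex` function, with powers of 256 printed alongside for comparison:

```python
# Binary prefixes written in hexadecimal.
print(hex(2**10))   # 0x400     (1 KiB)
print(hex(2**20))   # 0x100000  (1 MiB)

# Powers of 256 fall on whole-byte boundaries (two hex digits each).
print(hex(256**1))  # 0x100
print(hex(256**2))  # 0x10000
print(hex(256**3))  # 0x1000000
```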

Re:1TB = 1024 GB (2, Insightful)

vidarh (309115) | more than 7 years ago | (#18264132)

Byte isn't an SI unit, so what makes you think we care?

Real geeks have no problem with overloading.

Large datasets (4, Informative)

BWJones (18351) | more than 7 years ago | (#18262648)

This is absolutely the most cost effective way of transferring large amounts of data like this. If you do the calculations on terabyte-size files, sneakernet (or FedEx net) is actually faster and less expensive. We also went to one of Jim Gray's seminars when he was here giving an Organick Memorial Lecture, and he made an incredibly compelling demonstration using a variety of data types. We ended up talking with him for some time afterward about new projects we are engaging in that will also be generating terabytes of data, and his suggestion was to pass applications rather than data, which was interesting.

This is becoming more and more the norm in scientific research and Google's work is quite welcome.
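The calculation is simple enough to sketch in Python. The 120 TB figure is the Hubble data set from the summary; the 24-hour shipping time and the 100 Mbit/s link are assumptions picked for illustration, and the time needed to copy data on and off the drives is ignored.

```python
def effective_bandwidth_bps(payload_bytes, transit_seconds):
    """Average bits per second achieved by physically shipping a payload."""
    return payload_bytes * 8 / transit_seconds

def transfer_days(payload_bytes, link_bps):
    """Days needed to push a payload over a network link at full rate."""
    return payload_bytes * 8 / link_bps / 86400

hubble = 120 * 10**12     # 120 TB, decimal, as stated in the summary
overnight = 24 * 3600     # assumed 24-hour shipping time, in seconds
link = 100 * 10**6        # assumed 100 Mbit/s network link

print(round(effective_bandwidth_bps(hubble, overnight) / 1e9, 1))  # 11.1 (Gbit/s)
print(round(transfer_days(hubble, link), 1))                       # 111.1 (days)
```

With those assumed numbers, the box of drives averages about 11 Gbit/s while the network link would need over three months.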

Re:Large datasets (4, Funny)

Sobrique (543255) | more than 7 years ago | (#18262734)

Never underestimate the bandwidth of a lorryload of backup tapes traveling at 60 miles an hour.

Latency may leave something to be desired though :)

In Other News (4, Funny)

UnknowingFool (672806) | more than 7 years ago | (#18262774)

FedEx delivered what appeared to be a ton of broken office chairs to Google headquarters this morning. When asked for the sender's ID, the severely beaten FedEx courier would only reply that the sender wished to remain anonymous.

Mod parent up (2, Informative)

ari_j (90255) | more than 7 years ago | (#18263138)

Here's what happened when I FedExed my RMA to Newegg, packed very carefully. Note the bent motherboard - I didn't even know you could do that. The good news is that FedEx paid part of my claim ... they paid $100 plus the $8.33 that the FedEx store charged me to fax in the claim forms. The bad news is that they did not refund my original shipping or pay more than $100 on the over $280 of damage that they did. It also took about 4 hours of phone calls to even convince FedEx that I was not the seller, and then they lost my claim in their e-mail system (and did not reply to my e-mails) and closed it out for inactivity after a month or so, until I called them and asked what happened.

On a side note, don't bother with UPS insurance. I insured something when I sent it to myself once, and they broke it and the insurance remedy was to return it to the origination address and ask to see an original purchase receipt to award the insurance claim. If you happened to make something yourself or even received something as a gift, don't insure it when you ship it. And hire a private courier (unless someone has found a common carrier that doesn't suck).

Re:Mod parent up (1)

monkeydo (173558) | more than 7 years ago | (#18263396)

The bad news is that they did not refund my original shipping or pay more than $100 on the over $280 of damage that they did.


Did you buy additional insurance over the $100 you get by default?

Re:Mod parent up (0)

ari_j (90255) | more than 7 years ago | (#18263802)

Of course not. None of these companies has ever honored their insurance for me in the past when I've shipped something other than as part of a sale. Moreover, insurance becomes largely irrelevant when you get into the "run over with a truck" territory that this particular shipment was in. Also, FedEx never offered me insurance when I told them what I was shipping and its value. Furthermore, I have had major problems with FedEx in the past, including "overnight" deliveries sitting on a truck in my city for over a week (including food products that were thus rendered worthless). I wasn't about to pay anything extra when I reasonably believed it'd make no difference in how my shipment was treated, before or after its destruction. I'd roughly estimate that FedEx has caused me at least $800 worth of uncompensated losses in the past 5 years, regardless of what insurance or delivery terms I paid for, because of unprofessional and incompetent behavior. Don't ask why I used them for this particular shipment - I will plead insanity and amnesia.

On a side note, apparently FedEx "express" gets you consequential damages whereas regular FedEx ground disclaims them. At least keep that in mind if you use FedEx. I have had better experience with UPS, notwithstanding them not honoring their insurance. And so far, I have had the best experience with DHL, with no problems to date, but I suspect that may largely be due to the frequency with which I use DHL compared to UPS. (That said, I use DHL more than FedEx, so either DHL is way better than FedEx or I am living in a statistical anomaly, a possibility I won't deny.)

Re:Mod parent up (1)

evilviper (135110) | more than 7 years ago | (#18264242)

Moreover, insurance becomes largely irrelevant when you get into the "run over with a truck" territory that this particular shipment was in.

Why? You said they did pay your $100 claim after all.

Also, FedEx never offered me insurance when I told them what I was shipping and its value.

No idea what you're talking about. You generally fill out the form yourself, and select what insurance you want.

I'd roughly estimate that FedEx has caused me at least $800 worth of uncompensated losses in the past 5 years, regardless of what insurance or delivery terms I paid for, because of unprofessional and incompetent behavior.

Your experience is most definitely atypical. You must be shipping unbelievable numbers of packages to get that much damage. I have yet to have one package seriously damaged by UPS or Fedex (or DHL).

Re:Mod parent up (1)

winnabago (949419) | more than 7 years ago | (#18263908)

the insurance remedy was to return it to the origination address and ask to see an original purchase receipt to award the insurance claim

Sorry to nitpick, but this scam has been around for ages - you broke something, oh no! I'll send it to myself and pretend UPS did it. Hell, I even saw it in Seinfeld. Not that you were doing this, but what you tried is pretty suspicious to an outside observer.

They need SOME proof of value or even that the box was actually full to fight this type of fraud, and the original merchant is one way to do it. Also, what are you doing sending packages to yourself? Is it cheaper than taking it with you?

And always buy the extra insurance, or instruct the shipper to declare it properly. Newegg is usually pretty good about this, if they provided the return label. If FedEx limited it to $100, you should have definitely added to the base value. Remember that FedEx Express, FedEx Ground (sometimes called home delivery), and FedEx Smartpost are all separate entities, with varying policies, too. You probably used their ground service, which is a conglomerate of private couriers, with little to no accountability to FedEx corporate.

Re:Large datasets (2, Insightful)

dmayle (200765) | more than 7 years ago | (#18263106)

I remember an article I read on this, I think back in the year 2000. There was a research scientist who built a standardized platform (that is to say, a specific PC case with a certain number of hard drive bays, and certain network cards) so that he could exchange data with other universities. They would fill up the data on the networked PC, and they could ship it to any of the participating projects, knowing that they'd get back the same hardware in return.

I remember at the time thinking it was just one of those smart little details that just make working together easier. It's not some great leap of genius, but enough of a well crafted idea that it could really help.

Re:Large datasets (2, Insightful)

BWJones (18351) | more than 7 years ago | (#18263256)

Yeah, there have been a number of folks using variations on this theme for a while now. It's been interesting that network performance really has not followed the same performance curve as storage and CPU throughput. Add to that the growing amount of data being pushed through "consumer" pipes from people obtaining broadband and pushing sources such as YouTube and company and you have the makings for a bandwidth crunch. This of course is the reason for separate academic and government Internet paths, but it is still a limited commodity. In fact, at some universities engaging in data intensive projects, it is not uncommon for them to occupy the entire bandwidth of the university in off hours to transfer data around the country to various collaborators.

Re:Large datasets (2, Informative)

Agent Orange (34692) | more than 7 years ago | (#18263612)

Yup. There was a paper a few years back entitled "TeraScale SneakerNet", by Jim Gray and a couple of guys at the MSFT research division on this. You can find it in the arXiv [arxiv.org].

This concept has also been applied to such things as the Sloan Digital Sky Survey [sdss.org]. Astronomers do tend to generate a lot of data with large surveys such as this.

Re:Large datasets (1)

SuperMog2002 (702837) | more than 7 years ago | (#18263758)

As the old joke goes, never underestimate the bandwidth of a station wagon full of magnetic tapes. Or a FedEx plane full of hard drives. Your choice.

rsync (1)

G3ckoG33k (647276) | more than 7 years ago | (#18263864)

We have been sending two DVDs, with about 6-8 GB data, around every month for updates. Now we are trying rsync, which in our view has been more convenient.

Oblig. (1)

RyanFenton (230700) | more than 7 years ago | (#18262666)

Never underestimate the bandwidth of a station wagon... [bpfh.net]

Still very much applies today.

Ryan Fenton

Re:Oblig. (1)

Paulrothrock (685079) | more than 7 years ago | (#18263250)

The page you linked to had a smart idea. Rather than just have the raw disks, create some sort of architecture inside to allow for rapid transmission of the data from the vehicle upon arrival. I could see specialized vehicles that have been hardened against an accident with an inverter to power the drives that have external fiber optic ports hooked up to massive, high speed RAID arrays to rapidly dump the contents to another system at the location and upload content for the next destination.

Then a GPS system in the front automatically generates a route for the driver and after a few hours of waiting for the data to transmit, off he goes!

Re:Oblig. (0)

Anonymous Coward | more than 7 years ago | (#18263910)

I don't reply to Anonymous Cowards.
And yet, we reply to you...

Like days of old (1)

tulmad (25666) | more than 7 years ago | (#18262680)

This sounds almost like stories of scholars trading/copying books from long long ago. It's actually a somewhat interesting plan.

Re:Like days of old (3, Interesting)

meringuoid (568297) | more than 7 years ago | (#18263564)

This sounds almost like stories of scholars trading/copying books from long long ago.

According to what I'm told every time I watch a DVD, these scholars were in fact stealing books.

Re:Like days of old (0)

Anonymous Coward | more than 7 years ago | (#18263676)

Considering the laws at the time vs. the laws now, you have no point. I'm sure that doesn't stop you in your quest for free entertainment, I just wanted to point out the facts aren't actually on your side.

How long until... (1)

gEvil (beta) (945888) | more than 7 years ago | (#18262684)

How long do you think it will be until some maroon somewhere plunks a hard drive into an unpadded envelope and drops it in the big blue mailbox on the corner?

Wha? (0)

Anonymous Coward | more than 7 years ago | (#18262780)

Um, I do not think that word means what you think it means:

maroon (plural maroons) [wiktionary.org]

1. An escaped negro slave of the Caribbean and the Americas or a descendent of escaped slaves.
2. A castaway; a person who has been marooned.
So, were you making some kind of racial remark about escaped negro slaves being stupid with computers? Or maybe you are going to claim you heard it from bugs bunny or your mother?

Re:Wha? (0)

Anonymous Coward | more than 7 years ago | (#18262876)

Don't Use 'Maroon' Negatively (0)

Anonymous Coward | more than 7 years ago | (#18263262)

That's the only instance of anyone claiming it's a jocular misspelling of 'moron.' Other sites [snarkout.org] point out why it shouldn't be used as a derogatory name. I suggest gEvil beta refrain from using that word in a negative light considering what that word (when used as a noun) has meant for a long time for many people.

That excuse is about as weak as George Allen's.

Re:Wha? (0)

Anonymous Coward | more than 7 years ago | (#18263268)

What a maroon.

The ironing is delicsious.

Re:How long until... (1)

maxume (22995) | more than 7 years ago | (#18262884)

If I had to choose between sometime in the future and sometime in the past, I would go with sometime in the past. I think that quote about the universe inventing better fools works at a rather quick pace.

fixed (-1, Offtopic)

Dance_Dance_Karnov (793804) | more than 7 years ago | (#18262688)

gonna fix your summary for free... "...one terabyte is equal to 1024 gigabytes..."

Re:fixed (0)

Anonymous Coward | more than 7 years ago | (#18262786)

Technically that would be a tebibyte. Tera does indeed mean 10^12.

I'm glad they explained what terabyte means though, I doubt the slashdot crows would be familiar with such a term!

Re:fixed (0)

Anonymous Coward | more than 7 years ago | (#18263126)


I doubt the slashdot crows would be familiar with such a term!

Most of them have TBs of pr0n running on a RAID in their mom's basement.

Re:fixed (0)

Anonymous Coward | more than 7 years ago | (#18262802)

Wrong. One tebibyte is equal to 1024 gibibytes. One tarabyte equals 1000 gigabytes. If you're going to correct someone, do it right.

Re:fixed (2, Funny)

Macthorpe (960048) | more than 7 years ago | (#18262870)

Wrong. One tebibyte is equal to 1024 gibibytes. One tarabyte equals 1000 gigabytes. If you're going to correct someone, do it right.

You meant 'terabyte', not 'tarabyte'. If you're going to correct someone, do it right.

Re:fixed (0)

Anonymous Coward | more than 7 years ago | (#18264146)

A tarabyte is a half a small cucumber sandwich with the crusts removed, served at tea-time on the Plantation.

so.. (2, Interesting)

mastershake_phd (1050150) | more than 7 years ago | (#18262718)

Whos going to own the data? I hope Google isnt going to say they do like they want to with the old books theyre scanning. Everytime you download a hubble picture will it have a google watermark?

Re:so.. (1, Flamebait)

99BottlesOfBeerInMyF (813746) | more than 7 years ago | (#18262830)

Whos going to own the data?

As always the people of the world own the data. The copyright holders are, however, given a short term monopoly on making copies of it, with certain exceptions.

I hope Google isnt going to say they do like they want to with the old books theyre scanning.

Google has not, as far as I know, claimed "ownership" or even copyright on anything they've scanned. They have, however, created their own database of metadata about the works, which they use to enable people to more easily find specific items in the original data.

Everytime you download a hubble picture will it have a google watermark?

Umm, maybe. Why do I care if they add watermarks to it? If they are in the way, I'll just get them from another source that does not add watermarks. Google can also provide free copies of public domain pictures from other sources with Google advertising slogans on them if they want. It's called "freedom."

Re:so.. (2, Interesting)

cfulmer (3166) | more than 7 years ago | (#18262940)

The ownership of data is presumably a case-by-case thing that depends on what the data is and how it was acquired.

For example, Google does not own the copyright on out-of-copyright books that it scans in (nobody does, by definition.) At best, it might own the copyright on the scan that it did, but that's really unlikely--copyright protects creative expression and a straight scan doesn't add any.

However, they probably have some rights under unfair competition law because they have gone through a lot of work acquiring all this data and it would be unfair for somebody else to piggyback on that work to compete with them.

Recognize also that many of the "Hubble Pictures" you see are colorized versions of raw data that incorporates non-visible parts of the EM spectrum, assigning colors to things you can't see with your eyes. That assignment of colors to create something pleasing to the eye is certainly creative expression. So, if Google takes the raw data and does that color assignment itself, well, the result is theirs.

Re:so.. (2, Informative)

oneiros27 (46144) | more than 7 years ago | (#18263344)

So, if Google takes the raw data and does that color assignment itself, well, the result is theirs.
I'm not so sure that the result is theirs, necessarily. They'd need to properly attribute it. Many science archives have rules about how to properly attribute their work.

Don't get me wrong -- many of the scientists want people to use their data (eg, see The Astronomer's Data Manifesto [ivoa.net]), but they also want to know who's using it, because it's how they justify the value of their projects, and the costs incurred from distributing the data (especially for non-active projects).

The science community is also working on the Science Commons [sciencecommons.org] (an equivalent of the Creative Commons for marking scientific data) and various federated search engines (eg, night time (astronomy) virtual observatories [wikipedia.org], as well as other space and earth science discipline specific VOs. [nasa.gov]).

This is NOT good news (0)

Anonymous Coward | more than 7 years ago | (#18263304)

I really don't like the idea of a "private" (yes i know its publically traded) company having control of this public information. The data was paid for by tax payers. Google will inevitably make money from this otherwise they wouldn't be doing it.

This is not right.

Re:This is NOT good news (1)

99BottlesOfBeerInMyF (813746) | more than 7 years ago | (#18263512)

I really don't like the idea of a "private" (yes i know its publically traded) company having control of this public information.

You do know many government agencies already outsource IT and other projects to "private" companies who have all this government generated information, right?

The data was paid for by tax payers. Google will inevitably make money from this otherwise they wouldn't be doing it.

Yeah, and right now Microsoft makes money off of selling them the OS and office suite. This isn't a question of if the government will be paying for the ability of their employees to do word processing, it is just a matter of how much and which companies will be getting the money. I don't trust Google any less than I do MS, who currently supplies the OS and the networking and the word processor. I don't trust them any less than the contractors the government already exports this data to. If they can save 75% of the current cost I pay in taxes, I'm all for it.

I'd probably rather they saved 50% of the cost and implemented Linux and OpenOffice in house instead, which would solve both the security issue and the finance issue, but given a choice between their current solution and going with Google, I don't see how Google is any worse.

Re:This is NOT good news (1)

astanix (923360) | more than 7 years ago | (#18263624)

So, what you're saying is that this public data shouldn't be copied? It's not like they're taking all of the data and destroying the originals. They are obtaining copies of all of the public data that they can.
I'd just like to point out Google's Corp Info page. http://www.google.com/corporate/index.html [google.com]

Company Overview

Google's mission is to organize the world's information and make it universally accessible and useful.

Old Truth (0, Redundant)

Nom du Keyboard (633989) | more than 7 years ago | (#18262720)

It was said some time ago that the fastest way to transfer data was in a station wagon full of backup tapes traveling down the Interstate. I guess we now update that to a mini-van full of hard drives...

Re:Old Truth (0)

Anonymous Coward | more than 7 years ago | (#18263312)

The latency sucks, though.

Never underestimate ... (2, Interesting)

boyfaceddog (788041) | more than 7 years ago | (#18262736)

The bandwidth of a moving van full of disks.

Looks like Google is hoarding data. Seems they at least are equating information with power and money. And them that has the power and money makes the rules.

Re:Never underestimate ... (1)

Paulrothrock (685079) | more than 7 years ago | (#18263324)

They're not hoarding the data. They're storing it online in open formats, at least according to the article.

Re:Never underestimate ... (1)

veektor (545483) | more than 7 years ago | (#18263748)

I first heard it as "never underestimate the bandwidth of a station wagon full of magtapes".

There, two archaic conveyances in the same cliche.

I Mail Externals (0)

moore.dustin (942289) | more than 7 years ago | (#18262742)

I mail my external hard drives to different friends a few times a year. I have several, but one specifically for mailing to friends and co-workers. I thought this was somewhat of a common practice. I have never had a fellow geek gawk at the idea; rather, it seemed like the only logical way to get what we wanted to do done.

Google is doing something cool by getting and hopefully displaying the data, but the method is not really anything newsworthy is it? I mean, this is the same as using a flash drive to transfer files real quick, this is just on a much larger scale :)

Re:I Mail Externals (1)

zappepcs (820751) | more than 7 years ago | (#18262924)

There are more uses than just sending data. I'm using removable hard drive trays instead of dual-booting my machine. Swap the tray, reboot, I'm running Ubuntu. Repeat and it's XP. I only keep that one as it came free with the PC, and boot it up now and then to keep it updated. It makes life easy when you know that you can't possibly fsck up your regular installation when playing with a new distribution or whatever. Never needed to send one to anyone else, but that might be a huge support possibility for family? Never thought of that.

Mea culpa. (0)

Anonymous Coward | more than 7 years ago | (#18262772)

Google is transferring data the old fashioned way -- by mailing hard drive arrays around to collect information and then sending copies to other institutions...
Old fashioned??? What about sneakernets?

Google vs Microsoft (1)

spazmolytic666 (549909) | more than 7 years ago | (#18262776)

Who's going to own the data? I hope Google isn't going to say they do, like they want to with the old books they're scanning. Every time you download a Hubble picture, will it have a Google watermark?

In 10 years Google will own just about all data worth owning. Then Slashdotters will be railing on them instead of Microsoft... or maybe Google and MS will merge and collect our taxes too.

Other Uses for Mass Data Transfer (4, Funny)

Anonymous Coward | more than 7 years ago | (#18262784)

Moe: Say, Barn, uh, remember when I said I'd have to send away to NASA to calculate your bar tab?
Barney: Oh ho, oh yeah, you had a good laugh, Moe.
Moe: The results came back today. (reading a printout) You owe me seventy billion dollars.
Barney: Huh?
Moe: No, wait, wait, wait, that's for the Voyager spacecraft. Your tab is fourteen billion dollars.

Hubble Data (2, Funny)

Ikyaat (764422) | more than 7 years ago | (#18262794)

120 TB of data from the Hubble telescope? I wish I was paid to go through that. And this picture is of a... star, and this one is a star. And a star, another star, OMG it's a FRICKIN STAR

Re:Hubble Data (0)

Anonymous Coward | more than 7 years ago | (#18262964)

Don't forget to give each one some ridiculous and meaningless name while you are at it.

Re:Hubble Data (1)

Chris Burke (6130) | more than 7 years ago | (#18263912)

Don't get too complacent...

"Star. Star. Star. Damnit, star. Star. God this sucks. Star. Star. Space ship. Star. Star. Star. God nothing but fucking stars! Fuck hubble, useless piece of shit!"

Isn't TB... dangerous? (1, Redundant)

Qubit (100461) | more than 7 years ago | (#18262846)

I don't know what the article title conjured up in your head, but when I saw:

Google's Academic TB Swap Project
...the first thing I thought was "why are they swapping around samples of a dangerous infectious disease like tuberculosis?"

Re:Isn't TB... dangerous? (1)

sckeener (137243) | more than 7 years ago | (#18263550)

...the first thing I thought was "why are they swapping around samples of a dangerous infectious disease like tuberculosis?"

I'm glad I wasn't the only one!

Dangerous precedent (1)

DebateG (1001165) | more than 7 years ago | (#18262860)

Don't say I didn't warn you guys about this "don't be evil" thing. First they start swapping TB for "academic" purposes, then maybe some avian influenza in some apartments around Mountain View, and next thing you know, there'll be a smallpox outbreak and we will coincidentally receive advertisements on Gmail saying we can buy the cure for a few thousand dollars from one of their AdSense "partners."

Units... (1)

alexhs (877055) | more than 7 years ago | (#18262862)

One terabyte is equivalent to 1,000 gigabytes.
Hey, where do you think you are? This is Slashdot! Everyone knows that! What people here want to know is how much that is in Libraries of Congress...

The only thing you're getting by saying that is a flamewar between 10 kinds of people: those who count only in MB (and disagree with you) and those who count in both MB and MiB (and agree with you)!

For my take on the issue, see this previous post [slashdot.org] of mine.

Re:Units... (1)

nbritton (823086) | more than 7 years ago | (#18263662)

Actually it's 1024 gigabytes using binary units (base 2); we use binary units because formatted capacity is measured in binary units. For example: 1 Exabyte = 1(1024) Petabytes = 1(1024)(1024) Terabytes = 1(1024)(1024)(1024) Gigabytes and so on... The formula to convert SI units into binary units is si_unit * (125/128), which comes out to 0.9765625. For example: a 750GB hard drive is 750(125/128) = 732.421875 Gigabytes. Also don't forget reserved space... On FreeBSD it's 8% of the formatted capacity, so 732.421875 * 92% = 673.828125 Gigabytes of usable space.

The Library of Congress is estimated at 3 petabytes, or 3(1024) terabytes:

http://www.lesk.com/mlesk/ksg97/ksg.html [lesk.com]

Re:Units... (1)

alexhs (877055) | more than 7 years ago | (#18264324)

we use binary units because formatted capacity is measured in binary units.
It seems you haven't read my previous post I was linking to. Please do :)
Your assertion is wrong. The correct assertion would be "we use binary units because some OSes report formatted capacity in binary units".

Proof that I've read your post in its entirety: I was going to write "MS Windows" (like I did in the aforementioned post) instead of "some OSes" :) . My server at home runs FreeBSD; I launched fdisk and it reports size in "Meg", neither MB nor MiB. So I can't say :) What command did you enter to get your MiB size as MB?

The formula to convert si units into binary units is si_unit * (125/128) which comes out to 0.9765625. For example: a 750GB hard drive is 750(125/128) = 732.421875 Gigabytes.
Also, your formula isn't accurate: the ratio 10^(3n) / 2^(10n) depends on n. Your estimate only works for n=1 (KB/KiB). For GB/GiB, n=3, and the ratio is approximately 0.93.

So a 750 GB HD is only 698.49 GiB.
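
To make the n-dependence concrete, here is a minimal Python sketch of the ratio described above. The function name si_to_binary is hypothetical and only for illustration; n is the prefix index (1 for KB/KiB, 2 for MB/MiB, 3 for GB/GiB).

```python
def si_to_binary(value, n):
    """Convert a size in SI units (10**(3n) bytes) to binary units (2**(10n) bytes).

    n = 1 for KB -> KiB, 2 for MB -> MiB, 3 for GB -> GiB, 4 for TB -> TiB.
    """
    return value * 10 ** (3 * n) / 2 ** (10 * n)


print(si_to_binary(1, 1))              # KB -> KiB ratio: 0.9765625 (125/128)
print(round(si_to_binary(1, 3), 4))    # GB -> GiB ratio: 0.9313
print(round(si_to_binary(750, 3), 2))  # 750 GB drive in GiB: 698.49
```

The 125/128 factor is just the n=1 case; applying it to gigabytes understates the gap, which is why the 732.42 figure is too high.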

As the old saying goes (1)

nweaver (113078) | more than 7 years ago | (#18262888)

"The moral of the story is: Never underestimate the bandwidth of a station wagon full of tapes hurtling down the highway."

-Andrew Tanenbaum
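
As a rough illustration of the quote, the effective throughput of shipping drives can be sketched in Python. The 120 TB figure is from the summary; the 48-hour transit time is purely an assumption, the function name shipping_mbps is hypothetical, and the time spent copying data onto and off the drives is ignored.

```python
def shipping_mbps(size_tb, transit_hours):
    """Effective throughput, in megabits per second, of physically moving
    size_tb terabytes (SI, 10**12 bytes) in transit_hours hours."""
    bits = size_tb * 10**12 * 8
    return bits / (transit_hours * 3600) / 10**6


# 48 hours door to door is an assumed figure, not one from the article.
print(f"{shipping_mbps(120, 48):,.0f} Mbit/s")  # 5,556 Mbit/s
```

Under those assumptions the box of drives works out to roughly 5.6 Gbit/s, with terrible latency.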

Re:As the old saying goes (1)

monkeypuzzle (644204) | more than 7 years ago | (#18263834)

One of my favorites. It still figures into our disaster plan, if you replace station wagon with helicopter...

Google != Open Source (1)

xxxJonBoyxxx (565205) | more than 7 years ago | (#18262902)

Mr Dibona, who is a long-standing Linux evangelist, said: "I am comfortable with where Google is operating. People are often upset and feel we should be releasing more.

"And I agree; I would love to release more. It's more a function of engineering time, than it is a function of desire."


I call B.S. "Lack of engineering time" is why we haven't seen the source to the core search engines or gmail?

Re:Google != Open Source (1)

the_B0fh (208483) | more than 7 years ago | (#18263720)

That's kind of stupid. Just because they want to help and release lots of open source software doesn't mean they have to release the family jewels.

Just because I want to show and teach you how to fish doesn't mean I'm going to give you the plans to my power boat.

That's how I got Linux (1)

jtownatpunk.net (245670) | more than 7 years ago | (#18263014)

My first copy of linux was received on a tape mailed across the country. We dragged (drug?) our CPUs down to the campus computer lab where we pulled the files off the tape with a VAX then transferred the files to our PCs using a null-modem cable. (We couldn't afford NICs in those days.)

/Misty watercolor memories

Now I just want my own PB HD. (1)

kabocox (199019) | more than 7 years ago | (#18263028)

I've been thinking that the only home-use app for lots of HD storage space would be A/V. Now, I guess when 10 PB of HD costs $100-1120, we'll be able to get copies of these 120 TB of Hubble data or TBs of other datasets to fill up those future home PB HDs. One day we'll need home exabyte HDs to store and play around with public PB datasets.

I can only hope that bandwidth can keep up. How long would it take to transfer a 120 TB bit torrent file over either cable or dsl?

Well, maybe we'll have small TB USB flashdrives that we can just mail those around instead of upgrading our bandwidth.
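
For the transfer-time question above, a back-of-the-envelope Python sketch follows. The link speeds are assumed examples of sustained home downstream rates, not measurements, and the function name transfer_days is hypothetical; protocol overhead is ignored.

```python
def transfer_days(size_tb, link_mbps):
    """Days needed to move size_tb terabytes (SI, 10**12 bytes) over a link
    sustaining link_mbps megabits per second (10**6 bits/s)."""
    bits = size_tb * 10**12 * 8
    seconds = bits / (link_mbps * 10**6)
    return seconds / 86400


# Both link speeds below are assumptions chosen for illustration.
for label, mbps in [("DSL at 1.5 Mbit/s", 1.5), ("cable at 6 Mbit/s", 6.0)]:
    days = transfer_days(120, mbps)
    print(f"{label}: {days:,.0f} days (~{days / 365:.1f} years)")
```

With those assumed rates, 120 TB takes about 20 years over the DSL line and about 5 years over cable, so mailing drives wins by a wide margin.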

Just waiting for the day... (1)

Billosaur (927319) | more than 7 years ago | (#18263066)

...that a researcher sends them all the printouts of his/her data... on greenbar...

...why not tapes? (3, Interesting)

Penguinisto (415985) | more than 7 years ago | (#18263766)

I understand the whole "HDD w/ a common filesystem = more compatibility" thing, but wouldn't it be easier to simply send along some tapes of a type appropriate to the format/type that the scientific institution uses? LTO-3 can do 800GB compressed, SDLT can do up to 600... and neither is susceptible to data loss when it gets bounced too hard by FedEx/UPS/DHL/Whatever. (Plus it would make for a lighter package, wouldn't require some poor IT schmuck to disassemble a server or wait forever for USB to transfer all of it, etc...)

I'm not criticizing or anything; just curious is all.

/P
