
The Astronomical Event Search Engine

kdawson posted more than 7 years ago | from the cataloging-a-firehose dept.

Google

eldavojohn writes "Google has signed on with the Large Synoptic Survey Telescope project that will construct a powerful telescope in Chile by 2013. Google's part will be to 'develop a search engine that can process, organize, and analyze the voluminous amounts of data coming from the instrument's data streams in real time. The engine will create "movie-like windows" for scientists to view significant space events.' Google has been successful in turning its search technology to several different media and realms. Will they be successful with helping scientists tag and catalog events in our universe?" The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.


93 comments


3/4 LoC a night (2, Interesting)

Oddscurity (1035974) | more than 7 years ago | (#17534680)

Will they be successful with helping scientists tag and catalog events in our universe? Will they defeat the monster and get the girl? And will they be home in time for tea? Find out next on GoogleTrek.

Seriously though, processing the equivalent of 3/4ths of the LoC [loc.gov] every night is nothing to be sneezed at. Over the course of those 10 years that's about 110 petabytes (30 TB * 365.25 * 10) of unprocessed data.
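A quick sanity check of that arithmetic, as a minimal Python sketch (the 30 TB/night figure comes from the story summary above; the decimal/binary split also explains the ~107 PB figure quoted later in the thread):

    # Back-of-the-envelope check of the survey's raw data volume.
    TB_PER_NIGHT = 30            # nightly volume from the story summary
    NIGHTS = 365.25 * 10         # ten years of nightly observing

    total_tb = TB_PER_NIGHT * NIGHTS
    print(f"{total_tb:,.0f} TB ~ {total_tb / 1000:.0f} PB (decimal)")   # 109,575 TB ~ 110 PB
    print(f"{total_tb / 1024:.1f} PiB (binary)")                        # 107.0, Google's figure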

Re:3/4 LoC a night (3, Interesting)

Wavicle (181176) | more than 7 years ago | (#17534796)

I actually did a small, insignificant portion of LSST's computation feasibility study at LLNL during my internship there a couple of summers ago. And yeah, the computational requirements were nothing to sneeze at. I'm not sure where they are now; the specs changed seemingly every month, but when I left, the CCD array was up to 3 gigapixels of 16-bit greyscale. I believe the observing cadence (at that time, again, everything was changing on a regular basis) was two of those for the same piece of sky every 30 seconds. Wish I could have stayed... ahh well. I did get a really nice full-color research poster (that I had to design) out of it though!

Near Earth Objects (2, Interesting)

Oddscurity (1035974) | more than 7 years ago | (#17534856)

I saw a documentary not long ago about doing just this kind of photographing of the same piece of sky, only with longer intervals than 30 seconds. Anything moving would automagically be flagged by the software and its vector computed. Correct me if I'm wrong, but from what I can tell of this project, it's going to do exactly that (and more), but on a larger scope, and with better accuracy?

Re:Near Earth Objects (2, Informative)

Wavicle (181176) | more than 7 years ago | (#17534974)

Correct me if I'm wrong, but from what I can tell of this project, it's going to do exactly that (and more), but on a larger scope, and with better accuracy?

Well, I was a very small cog for a very large telescope. But my understanding is pretty much exactly what you just said.

Re:3/4 LoC a night (1)

timeOday (582209) | more than 7 years ago | (#17534944)

Just archiving that much data is bad enough, and Google certainly has experience there. But what about making use of all that imagery? No human can look at that much data, and Google's experience indexing the web seems only tangentially related.

Re:3/4 LoC a night (0)

Anonymous Coward | more than 7 years ago | (#17534960)

What the other guy said.

Take two pictures at some interval (or a movie, or whatever else) and see if anything moves. Currently, at least, a human then reviews what the computer flagged as having moved to see if it really is something.

That's what I suspect this is for, anyway; perhaps not.
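A minimal sketch of that difference-and-flag idea (hypothetical threshold; a real pipeline would calibrate, align, and PSF-match the frames first):

    import numpy as np

    def flag_movers(frame_t0, frame_t1, threshold=50):
        """Flag pixels that changed significantly between two exposures.

        frame_t0, frame_t1: 2-D uint16 arrays of the same patch of sky.
        Returns (row, col) coordinates of changed pixels for later review.
        """
        diff = frame_t1.astype(np.int32) - frame_t0.astype(np.int32)
        return np.argwhere(np.abs(diff) > threshold)

    # Toy example: a "star" that moves one pixel between exposures.
    a = np.zeros((100, 100), dtype=np.uint16)
    b = a.copy()
    a[40, 40] = 1000
    b[40, 41] = 1000
    print(flag_movers(a, b))   # both the old and the new position get flagged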

Re:3/4 LoC a night (1)

Wavicle (181176) | more than 7 years ago | (#17535034)

When I was working on it, I never once heard the name "Google" dropped, so I don't know exactly what the relationship is. We were researching ways to have the computer identify phenomena based on pre-existing photometric pipelines already in use. In my case, I was taking an existing algorithm and fitting it onto a super-parallel numeric processor and racing it against general-purpose processors (okay, I only actually benched it against a P4; the drawback to these summer internships is they only last as long as the summer).

GPU-accel (1)

Oddscurity (1035974) | more than 7 years ago | (#17535104)

It occurs to me that it may be possible to speed the actual processing part up by splitting the gigapixel images into ever smaller quadrants, treating them as textures, and using shaders to do the actual heavy lifting.
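The tiling half of that idea, sketched on the CPU side (the GPU upload itself is elided; the tile size is an arbitrary assumption):

    import numpy as np

    def tiles(image, tile=1024):
        """Yield (row, col, view) for square tiles of a large image.

        Views, not copies, so each tile can be handed to a worker
        (or uploaded as a texture) independently.
        """
        h, w = image.shape
        for r in range(0, h, tile):
            for c in range(0, w, tile):
                yield r, c, image[r:r + tile, c:c + tile]

    # A 3-gigapixel mosaic won't fit in a demo; use a small stand-in.
    img = np.random.randint(0, 2**16, size=(4096, 4096), dtype=np.uint16)
    print(sum(1 for _ in tiles(img)))   # 16 tiles of 1024x1024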

Re:GPU-accel (1)

Wavicle (181176) | more than 7 years ago | (#17542668)

It's possible. The big problem with using a GPU to handle the processing is that they only support single-precision floats and aren't very good at branching algorithms. I'm not sure how a pixel shader would handle 16-bit greyscale pixels either.

Re:3/4 LoC a night (2, Insightful)

Ingolfke (515826) | more than 7 years ago | (#17535172)

No human can look at that much data, and Google's experience indexing the web seems only tangentially related.

Google's Ph.D.s and big thinkers could certainly play a part here. Google is about solving problems with large chunks of constantly changing data that have patterns, and creating systems to identify and use those patterns. The web is simply one way for Google to apply the model.

Re:3/4 LoC a night (1)

TopherC (412335) | more than 7 years ago | (#17538712)

My only "concern" here is that Google is used to producing search engine front-ends for casual users, and not for the scientist. When digging through data like this, new ideas generally require new kinds of searches (new algorithms). So instead of a polished front end, the scientists here really need a sort of library/API they can write high-level programs with.

I'm no expert here -- that's just my gut reaction, coming from the slightly-related field of experimental particle physics.

Re:3/4 LoC a night (1)

shadanan (806810) | more than 7 years ago | (#17537102)

I'm thinking Google are the best people to ask to deal with this, for three reasons:

1. Google indexes massive amounts of data. The telescope imagery will be a massive amount of data.
2. Google has huge data centers capable of a great amount of distributed processing. The telescope data will require a lot of possibly parallel data processing (multiple images, FFTs on the images, comparisons between sequences, etc.)
3. Google has a plethora of graduate level employees - who better than a bunch of PhD scientists to store / process / index massive amounts of scientific data?

Re:3/4 LoC a night (1)

Wavicle (181176) | more than 7 years ago | (#17541940)

1. Google indexes massive amounts of data. The telescope imagery will be a massive amount of data.

True enough, but Google indexes massive amounts of data substantially different from imagery data. This would be something more akin to Google Earth, which is really nice but not particularly groundbreaking technology so far.

2. Google has huge data centers capable of a great amount of distributed processing. The telescope data will require a lot of possibly parallel data processing (multiple images, FFTs on the images, comparisons between sequences, etc.)

Yes and no. A substantial portion of LSST's computing power needs to be in close proximity to LSST. Every minute you are doing 2 frame captures + slew + settle + probably another frame capture. So roughly 3 images per minute, with a raw data size of 6 GBytes per image = 18 GBytes/minute, all night long. Even if you could get 6:1 compression on the images, you would need a pipeline out of a remote area of Chile capable of 3 GB/minute (~400 Mbit/s) sustained. So for early event detection, the massive data centers in North America will be of little use. You can only get that kind of throughput using dedicated lines between the computing center and the telescope.
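That link budget, restated as a small Python check (all figures are the poster's assumptions, not official LSST numbers):

    # Link-budget check: 3 gigapixels x 16 bits = 6 GB per raw image.
    GB_PER_IMAGE = 6
    IMAGES_PER_MIN = 3
    COMPRESSION = 6              # assumed 6:1 lossless ratio

    gb_per_min = GB_PER_IMAGE * IMAGES_PER_MIN / COMPRESSION
    mbit_per_s = gb_per_min * 8 * 1000 / 60
    print(f"{gb_per_min:.0f} GB/min ~ {mbit_per_s:.0f} Mbit/s sustained")   # 3 GB/min ~ 400 Mbit/s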

Google will, however, be able to archive old images so that when something new is found, it will be possible to quickly look through old images of the same piece of sky to see if there was anything interesting there in the last 6 months or a year.

3. Google has a plethora of graduate level employees - who better than a bunch of PhD scientists to store / process / index massive amounts of scientific data?

To be fair, just about everybody who was working on LSST prior to Google's announcement is a Ph.D. or graduate student (just look at the list of institutions involved in the project - lots of universities and a couple of national laboratories). Many of them are in astronomy and computer science, and many of them are familiar with algorithms for analysis of astronomical imagery.

30 TB will be nothing in a few years. (0, Interesting)

Anonymous Coward | more than 7 years ago | (#17534798)

At the rate at which our storage capabilities are growing, 30 TB will be considered nothing. We're approaching consumer-grade hard drives with a capacity of 1 TB. We'll likely see 2 TB consumer-grade hard drives by the end of 2008. Remember, that's consumer-grade. Google will no doubt be able to afford far higher-quality drives with larger storage capacities. And by 2017, 30 TB drives will be considered minuscule.

In 1997, 1 GB hard drives were the largest available to the average consumer. Now it's 2007, and we have 850 GB hard drives available in most tech retail stores. That's an 850x increase over a decade. It's likely we will see that trend continue over the next decade. So it's more than likely that by the time this project is nearing its end, we'll be dealing with 700 TB hard drives, and that's at the low end of the market.
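The extrapolation made explicit (this just restates the poster's own ~850x-per-decade assumption; it is not a prediction):

    # If capacity grows ~850x per decade, an 850 GB drive in 2007 becomes...
    GROWTH_PER_DECADE = 850
    projected_tb = 850 * GROWTH_PER_DECADE / 1000   # GB -> TB
    print(f"~{projected_tb:.0f} TB by 2017")        # ~722 TB, hence the ~700 TB claim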

Re:30 TB will be nothing in a few years. (2, Informative)

Spikeles (972972) | more than 7 years ago | (#17534912)

You are a bit behind the times there... 1 TB consumer drives are here: http://www.engadget.com/2007/01/05/hitachi-breaks-1tb-hard-drive-barrier-with-7k1000/ [engadget.com]

HAMR (1)

Oddscurity (1035974) | more than 7 years ago | (#17534972)

And then there's Heat-Assisted Magnetic Recording [wikipedia.org]:

HAMR could increase the limit of magnetic recording by more than a factor of 100. This could result in storage capacities as great as 50 terabits per square inch. Seagate believes it can produce 300-terabit (37.5-terabyte) hard disk drives using HAMR technology. [1] Some news sites reported that Seagate would launch a 300 Tb HDD by 2010, but this is wrong. Seagate responded to this news stating that 50-terabit-per-square-inch density is well past the 2010 timeframe and that this may also involve a combination with Bit-Patterned Media.

It's not all about storage, though. This also requires a colossal amount of CPU time, which makes Google a good fit: they know how to store and access massive amounts of data, and they know how to parallelise.

Re:30 TB will be nothing in a few years. (1)

Ingolfke (515826) | more than 7 years ago | (#17535180)

Yeah, and thank goodness someone is developing the software and systems to do stuff with all of that data today. Storage is one thing... using the data that's stored is completely different.

Re:3/4 LoC a night (2, Funny)

Joebert (946227) | more than 7 years ago | (#17535598)

Pffft, I sift through that much data every night on limewire looking for por... err, movies.

Re:3/4 LoC a night (2, Interesting)

Nuroman (588959) | more than 7 years ago | (#17536308)

According to Google (how appropriate), 30 terabytes * 365.25 * 10 = 107.006836 petabytes.

LSST v. PanSTARRS Approach (3, Interesting)

cmholm (69081) | more than 7 years ago | (#17536346)

The shop I'm at has been working the image processing and data storage problem for PanSTARRS [hawaii.edu], another sky survey project that is a bit further along (they have a test scope up and running on Maui). It's interesting to me that both projects are at once using conventional solutions and thinking outside the box.

Conventional: LSST will use a single large telescope and detector; PanSTARRS (as it stands) intends to use a dedicated compute cluster for data reduction.
Novel: LSST is leaning towards distributing its data reduction task over Google's huge server farm; PanSTARRS will use four off-the-shelf 1.8 m telescopes, each with a 1.4-gigapixel detector, mounted together to image the same piece of sky, merging the overlapping images in post-processing.

When I was working on the project, one of PanSTARRS' requirements was to finish analyzing one night's viewing before the following sunset. Early on, the principal investigators decided to solve the image storage issue by not storing the images permanently. Instead, once the science for a night's imaging had been extracted (astrometry, NEO or supernova detection, etc.), the original images would hit the bit bucket. Whether they've stuck with that I don't know.

Re:LSST v. PanSTARRS Approach (1)

Michael Woodhams (112247) | more than 7 years ago | (#17544816)

Thanks - I've been out of the loop in astronomy for some time, so I didn't know about this. (I had a minute amount of involvement in planning for the SDSS.)

I don't like the idea of deleting the data - there are all sorts of ways it could be useful; most noticeably, you could add many images together to get a deep field. (Hm, I suppose you could do that anyway - keep a 'running total' image.)
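The 'running total' idea is simple to sketch (a toy that assumes frames are already registered to the same sky grid; real co-adds also track per-pixel exposure counts and masks):

    import numpy as np

    class RunningCoadd:
        """Accumulate registered exposures into a single deep image."""

        def __init__(self, shape):
            self.total = np.zeros(shape, dtype=np.float64)
            self.n = 0

        def add(self, frame):
            self.total += frame
            self.n += 1

        def deep_image(self):
            return self.total / self.n   # mean stack; S/N grows ~sqrt(n)

    stack = RunningCoadd((100, 100))
    for _ in range(25):                  # 25 noisy frames of a faint field
        stack.add(np.random.poisson(0.2, size=(100, 100)))
    print(stack.deep_image().mean())     # converges toward the true rate, 0.2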

This is going to have a big effect on the microlensing surveys - they won't be able to compete with this rate of data acquisition, so I expect they'll be reduced to inserting themselves into the data analysis pipeline of projects like this, rather than running their own telescopes.

2TB/day, starting next year (0)

Anonymous Coward | more than 7 years ago | (#17547916)

I am working on the Solar Dynamics Observatory, which is scheduled to fly in 2008. That is next year...
High cadence @ 16 megapixels.
It will produce 2 TB/day, every day that the Sun shines in space.
The 5-year mission will produce 3 petabytes of data, but satellites/missions often run much longer than their designed life.
We have a lot of the same issues and problems that LSST has, but we are studying a much more dynamic object (the Sun), which changes very rapidly (relatively speaking).
My boss wants to be able to watch a whole day's images as a movie...

I very seriously wish that Moore's Law would get its butt in gear so desktop (laptop?) machines with several terabytes of RAM were commonplace.

I am working on an ultra-high-resolution movie tool to play back time-series data as movies, but you VERY quickly run into the disk-to-RAM bottleneck with this amount of data.
Even 16GB of RAM is puny.

Re:3/4 LoC a night (1)

niktemadur (793971) | more than 7 years ago | (#17550290)

Seriously though, processing the equivalent of 3/4ths of the LoC every night is nothing to be sneezed at.

Yeah. Let's keep in mind that all astronomical observatory images are taken in a standardized lossless format, which is to say FITS. There's a helluva lot of data in every image; each individual file is huge.

BTW,
Venn ist das nurnstuck git und Slotermeyer? Ya! Beigerhund das oder die Flipperwaldt gersput!

Die ist ein Kinnerhunder und zwei Mackel über und der bitte schön is der Wunderhaus sprechensie...
"Nein" sprecht der Herren, "Ist aufern borger mit zweitingen"!

backup hell (0, Offtopic)

callmetheraven (711291) | more than 7 years ago | (#17534688)

The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.


(read in the voice of Pete Puma)
Gonna need a lotta DLT's. A whole lotta DLT's

Great news (2, Funny)

rolyatknarf (973068) | more than 7 years ago | (#17534690)

Now Google will be serving up advertisements on Uranus.

Re:Great news (2, Funny)

matr0x_x (919985) | more than 7 years ago | (#17534718)

How long until Microsoft wants to jump into this marketspace as well?

Re:Great news (1)

Score Whore (32328) | more than 7 years ago | (#17534782)

The sooner the better. Just because it's Google doesn't mean we should just go ahead and believe that they will implement the "best" solution without some pressure from a competitor. It's become real clear that Google can't be trusted to stay honest without someone watching over their shoulder.

Re:Great news (1)

SeaFox (739806) | more than 7 years ago | (#17535828)

Yup, until everyone's had a chance with Uranus, we'll never know who does it best.

Re:Great news (1)

Swimport (1034164) | more than 7 years ago | (#17534802)

Marketplace?? I don't think Google will be making a profit on this, even if they somehow put it online with ads.

Re:Great news (1)

somersault (912633) | more than 7 years ago | (#17537290)

Yeah, why would any business make a profit when they provide a service? Preposterous!!

Re:Great news (0)

Anonymous Coward | more than 7 years ago | (#17534734)

They better not, I just bought Uranus on eBay, and I want it all for myself. I don't want it tainted with advertising.

Re:Great news (1)

Joebert (946227) | more than 7 years ago | (#17535584)

I'm more worried about the advertisements that will be coming from Uranus.

Re:Great news (1)

jrockway (229604) | more than 7 years ago | (#17536138)

Nah, Google is going to rename Uranus in 2036 to end that stupid joke once and for all.

And the new name is already picked out... (1)

Namlak (850746) | more than 7 years ago | (#17543392)

Urasshole

Re:Great news (1)

aussie_a (778472) | more than 7 years ago | (#17536670)

They've been doing that ever since I accepted a check from Mr. Miller.

Why Google? (1, Interesting)

Anonymous Coward | more than 7 years ago | (#17534728)

Just wondering if Google can provide the right tool. Yeah, they can design a front end. Yep, they can give content. But can they really deliver the information you need w/o a whole pile of eBay ads?

Re:Why Google? (1)

lecithin (745575) | more than 7 years ago | (#17534762)

In addition to that question, what does that say about the future of Google?

Re:Why Google? (2, Funny)

edflyerssn007 (897318) | more than 7 years ago | (#17535208)

Skynet?

Re:Why Google? (2, Interesting)

Ingolfke (515826) | more than 7 years ago | (#17535226)

It says they're smart enough to take on challenging and related problems that they can learn from and use to enhance their information business. This is a real-time application. Imagine if Google could, based on all of the data Google is collecting and indexing, provide a real-time view of current trends and patterns of consumers on the web. An immediate zeitgeist, presented in a way that a business can use to make sure it's selling its products at the right time to the right people. Cool stuff.

Re:Why Google? (1)

lecithin (745575) | more than 7 years ago | (#17535300)

"Imagine if Google could, based on all of the data Google is collecting and indexing, provide a real time view of current trends and patterns of consumers on the web."

My opinion: they can. They're just not sharing it with the rest of the world.

Re:Why Google? (0)

Anonymous Coward | more than 7 years ago | (#17535206)

You can buy Alpha Centauri on eBay now...

Damn it! (0)

Anonymous Coward | more than 7 years ago | (#17534742)

I had that idea first!

well (0, Flamebait)

macadamia_harold (947445) | more than 7 years ago | (#17534818)

Will they be successful with helping scientists tag and catalog events in our universe?

That depends. Can you sell advertising doing that?

30 TB of data a night, (0, Redundant)

asifyoucare (302582) | more than 7 years ago | (#17534874)

30 TB ought to be enough for anybody.

The only problem (1, Funny)

eclectro (227083) | more than 7 years ago | (#17534922)

Is arranging AdWords so they don't get in the way of viewing planetary nebulae.

Funny Thing (1)

synonymous (707504) | more than 7 years ago | (#17535010)

A short time ago, I was pointing out to people that Hubble was only a billion dollars and that Google could buy a hundred of them - cripes, lots of big dopey slothing corps could buy even thousands of them. Funny, though, that they will now be at least part of one.

Re:Funny Thing (1)

hogghogg (791053) | more than 7 years ago | (#17537936)

Many corps do have Hubbles; they point them downwards, not upwards.

Re:Funny Thing (1)

synonymous (707504) | more than 7 years ago | (#17562112)

Good point. I believe that would fall under something like "The stars look down".

30 TB of data a night (0)

Anonymous Coward | more than 7 years ago | (#17535026)

I hope that during the day astronomers will point the telescope at a women changing room.

Re:30 TB of data a night (0)

Anonymous Coward | more than 7 years ago | (#17536010)

agreed. I've always wondered how these rooms change women.

30 TB a night... (1)

complete loony (663508) | more than 7 years ago | (#17535036)

... of raw video data. I'm sure you could compress that a lot without losing any detail. I thought Google ran most of its data centers on fairly normal hard drives. At that rate, even with Hitachi's new 1 TB disks, that's a lot of drives.

Hopefully though by 2013 this will be a lot easier.

Re:30 TB a night... (4, Informative)

Capt'n Hector (650760) | more than 7 years ago | (#17535268)

You can't compress this stuff unless you do it losslessly. Compression artifacts mess up photometry - if you're trying to compute apparent brightness, you need to factor in things like how bright the ambient sky is, and how much point sources get spread out (FWHM, seeing). That is, a point source that passes through the atmosphere looks like a normal probability distribution because of atmospheric distortions. So to get an apparent brightness, you have to correct for this effect. If compression artifacts are introduced, the FWHM is thrown off, and you have no idea how "crisp" your image really is. That's why these data sets are so large. Quite literally, they're doing a pixel dump from their massive CCD all night. But hey, somehow I doubt they'll be using this telescope for anything but object detection. There's no reason to store it all except to compare a current picture to one in a base set, kinda like KAIT [berkeley.edu] on steroids.
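The seeing point is easy to make concrete: a point source comes out roughly Gaussian, and its width is what the photometry leans on (a sketch with invented numbers, not LSST's actual PSF model):

    import numpy as np

    def gaussian_psf(size, sigma):
        """Toy seeing-blurred point source, normalized to unit total flux."""
        y, x = np.mgrid[:size, :size] - (size - 1) / 2
        psf = np.exp(-(x**2 + y**2) / (2 * sigma**2))
        return psf / psf.sum()

    sigma = 2.0                                          # pixels
    psf = gaussian_psf(size=21, sigma=sigma)
    fwhm = 2 * np.sqrt(2 * np.log(2)) * sigma            # FWHM = 2.3548 * sigma
    print(f"flux={psf.sum():.3f}, FWHM={fwhm:.2f} px")   # flux=1.000, FWHM=4.71 px

    # Lossy compression that redistributes even a little of this flux
    # changes the measured width - exactly the worry described above.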

Re:30 TB a night... (1)

complete loony (663508) | more than 7 years ago | (#17535338)

Yeah, I did say without losing any detail. Video would usually still have a lot of potential for compression. Of course, if you *can* do the analysis on the raw data before you store it...

Anyway, my point was that at 30 TB per day for 10 years, that's about 100K x 1 TB disks, assuming no further compression is possible. Google is definitely the company I would go to for that much distributed storage, processing, and retrieval. I wouldn't want to manage that myself.

Re:30 TB a night... (0)

Anonymous Coward | more than 7 years ago | (#17536096)

We have "cheap" 1TB disks now. In 2013 we will have larger disks...

Re:30 TB a night... (1)

StarfishOne (756076) | more than 7 years ago | (#17537390)

And that is without taking into account possible disk failures and so on. Would a cluster of tape libraries/jukeboxes/robots not be a better option?!?

Re:30 TB a night... (0)

Anonymous Coward | more than 7 years ago | (#17536370)

Fortunately, this stuff is very easy to compress losslessly. Each pixel will be very close to its neighbors, both spatially and temporally. That is, when you subtract the image taken at time T-1 from the image at time T, most pixels will be zero plus or minus a few. This will make them compress extremely well.
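That delta idea in a few lines (synthetic frames and an off-the-shelf compressor, purely to show the effect):

    import zlib
    import numpy as np

    rng = np.random.default_rng(0)
    base = rng.integers(0, 200, size=(512, 512), dtype=np.int16)
    nxt = base + rng.integers(-3, 4, size=base.shape, dtype=np.int16)  # small drift

    raw_size = len(zlib.compress(nxt.tobytes()))             # compress the frame itself
    delta_size = len(zlib.compress((nxt - base).tobytes()))  # compress the difference
    print(raw_size, delta_size)   # the delta stream is markedly smaller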

Also, they're probably using CMOS sensors, not CCDs.

dom

Re:30 TB a night... (1)

imsabbel (611519) | more than 7 years ago | (#17536464)

Still, the night sky has TONS of black.
And even considering the general noise level, the high dynamic range of the images will mean that outside of actual stars, nearly all of the bits will be zero.

I'm sure a lossless reduction of one order of magnitude is entirely in the realm of the possible.

Re:30 TB a night... (1)

TMB (70166) | more than 7 years ago | (#17537486)

It's still a very bad idea. One of the uses of LSST data will be to co-add many images of the same piece of sky to detect fainter objects. For an object that produces one electron in a CCD pixel 20% of the time, the difference between "1" and "0" in those 20% of the images is crucial.

[TMB]

Re:30 TB a night... (2, Informative)

hogghogg (791053) | more than 7 years ago | (#17537982)

You can't compress this stuff unless you do it losslessly. Compression artifacts mess up photometry

This is not strictly true. What's true is that the current standard lossy compression techniques mess up photometry. However, if you know what you are going to photometer and how you are going to photometer it, it is certainly possible to compress in a lossy way without ruining the photometry. In a trivial sense, photometry is lossy compression of data (you have turned huge images into a few numbers with error bars)!

Re:30 TB a night... (0)

Anonymous Coward | more than 7 years ago | (#17539280)

Are you the same person who found an interesting way to do exponentiation of ordinals in a set theory class by George Boolos around 1992?

Re:30 TB a night... (0)

Anonymous Coward | more than 7 years ago | (#17535272)

By 2013 an iPod will be able to hold 30TB. Probably.

Re:30 TB a night... (1)

AtomicSnarl (549626) | more than 7 years ago | (#17535360)

I'm sure that MySQL and PHP can handle it...

Re:30 TB a night... (1)

jo42 (227475) | more than 7 years ago | (#17540560)

Damn. Wouldn't the contract to supply the storage for this data be nice to have...

wow 3 gigapixels? (1)

koan (80826) | more than 7 years ago | (#17535058)

I wonder what the digital zoom is like on that camera.

Re:wow 3 gigapixels? (1)

gfreeman (456642) | more than 7 years ago | (#17555458)

If you install the CSI plugin, it approaches infinity.

Science... worth every penny. (1)

Ingolfke (515826) | more than 7 years ago | (#17535126)

In another thread, about the collapsed Pillars of Creation, I questioned the value of that type of research... who cares if they collapsed or not? I asked where the value in that particular research is.

This, however, provides real, obvious value. I couldn't care less about the astronomical events... I guess there is some physics and maths and stuff that can be done... but the databases, algorithms, and computing systems needed to process all of this data will drive innovation, particularly since it's being done by a company like Google that gets innovation. This is good science... pure science that requires real innovation to work and will return real benefits.

Lots of data, but not as much as the LHC (4, Informative)

Phat_Tony (661117) | more than 7 years ago | (#17535332)

That's a lot of data, but it's less than 1/10 as much data [physorg.com] as the Large Hadron Collider [web.cern.ch] will put out, and the LHC is supposed to be coming online within a year, not in six years. By the time the Large Synoptic Survey Telescope comes online, the LHC may have produced more data than the Large Synoptic Survey Telescope will over the life of the project.

I'd be interested to know more about the data handling methods they have in place for the LHC. I don't think they'll be using Excel.

*Note the correct, non-Freudian-slip spelling of "hadron [google.com]"

Re:Lots of data, but not as much as the LHC (1)

brxndxn (461473) | more than 7 years ago | (#17535406)

I'd be interested to know more about the data handling methods they have in place for the LHC. I don't think they'll be using Excel.

It looks like you have begun collecting ginormous amounts of data. Paperclip recommends you use Microsoft Access to handle large amounts of data. Would you like me to launch Microsoft Access now?

Re:Lots of data, but not as much as the LHC (1)

SinGunner (911891) | more than 7 years ago | (#17535426)

Once the LHC fires, nobody will need to worry about data storage (on this planet) again.

Re:Lots of data, but not as much as the LHC (4, Funny)

dido (9125) | more than 7 years ago | (#17535934)

Funny, but CERN itself makes that same misspelling of 'hadron' here [web.cern.ch] . "This is the underground tunnel of the Large Hardon (sic) Collider (LHC)..."

Re:Lots of data, but not as much as the LHC (1)

Minwee (522556) | more than 7 years ago | (#17538322)

Mispelling?

You may be right. Proof By Googlefight shows that hadron beats hardon by an eight to one margin [googlefight.com] . That's surprising. I didn't think that I would be able to find anything that beat a hardon on the Internet.

Re:Lots of data, but not as much as the LHC (4, Interesting)

mcelrath (8027) | more than 7 years ago | (#17536030)

The LHC will produce more data, but we also don't care about most of it. The vast majority of it is junk. The "interesting" physics (particles like W and Z bosons, top quarks, the Higgs, etc.) makes up about 10^-9 of the events. It is a huge needle-in-a-haystack problem, and we throw out most data. We have many experts and professors who design "triggers" which, based on a subset of information that can be delivered to them in a reasonable time, decide whether a given proton-proton collision contains new physics. Many theorists these days are making dents in walls with their heads trying to think of ways these triggers might be missing important information, so that we can suggest changes before it's too late. This is a lot of dedicated silicon, FPGAs, VME crates, etc. Slashdotters should drool. Anyway, we throw out the vast majority of information.
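A toy illustration of the trigger concept (invented variables and thresholds; real triggers are multi-level hardware/software pipelines, not three lines of Python):

    import random

    def level1_trigger(event):
        """Keep an event only if its cheap summary statistics pass a cut."""
        return event["missing_et"] > 100 or event["max_muon_pt"] > 25   # GeV

    events = ({"missing_et": random.expovariate(1 / 20),
               "max_muon_pt": random.expovariate(1 / 5)} for _ in range(1_000_000))
    kept = sum(1 for e in events if level1_trigger(e))
    print(f"kept {kept:,} of 1,000,000 events")   # the vast majority are dropped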

By comparison, LSST is trying to store everything. Scroll up for an interesting comment about calibrating ambient brightness and seeing. I can't answer which will deliver more information, but both are incredibly interesting challenges.

Data challenges abound. We have designed the LHC Grid [web.cern.ch] to distribute this information. There will be several data warehouses located around the world at national labs and universities. Even after the triggers decide what is "interesting", more sophisticated algorithms, with access to all the data in a single proton-proton collision, are applied. Then humans are applied to the data, and we will try to dig out new signals from it.

In all this we expect to find (among other things) the origin of mass [wikipedia.org] and Dark Matter [wikipedia.org] , and we're working hard to prepare for the onslaught of data. :)

-- Bob

Re:Lots of data, but not as much as the LHC (1)

Hoi Polloi (522990) | more than 7 years ago | (#17539466)

Should be easy to compress, though. Think of all of that blank space between objects.

Harddrive Speed (0)

Anonymous Coward | more than 7 years ago | (#17535408)

30 TB per night; say it's 12 hours per night, that's gonna be around 694 megabytes per second to write that data
(30,000 GB / 43,200 secs = 0.694 GB/s). Wonder what kind of hard drives they have...

-aespe-
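A one-liner check of that estimate (assuming, as the poster does, a 12-hour observing night):

    # 30 TB over a 12-hour night, in MB/s.
    tb_per_night = 30
    seconds = 12 * 3600                       # 43,200 s
    mb_per_s = tb_per_night * 1_000_000 / seconds
    print(f"{mb_per_s:.0f} MB/s sustained")   # ~694 MB/s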

Re:Harddrive Speed (0)

Anonymous Coward | more than 7 years ago | (#17535604)

n hard drives in parallel, each able to write at 694/n MB/s.

Re:Harddrive Speed (1)

themoneyish (971138) | more than 7 years ago | (#17536580)

The article says that the telescope will generate 30 TB of data per night, NOT that Google will store all that data. If they only need to catalog astronomical events, they would just cache some gigs of data at a time and trash whatever's useless. Once the engine realizes that an astronomical event has started, it would look into the cache for the time it started and start recording from that time until it's over. That's my take on it, anyway. No need to store 694 MB/s.

Obviously multiple processors can handle the caching, but it's definitely faster than the hard drives' speeds.

Re:Harddrive Speed (0)

Anonymous Coward | more than 7 years ago | (#17565572)

Just FYI, the data is going to be stored.

But can it work... (1)

FrozenFOXX (1048276) | more than 7 years ago | (#17535416)

But can it work for pr0n? To my understanding some users can generate nearly that much raw pr0n data every frustrated night, it'd be great if Google could release this groundbreaking (earth moving?) software for those poor users.

Googlescope... (0, Troll)

Shadyman (939863) | more than 7 years ago | (#17535570)

But will it run Linux?

Yeah, right (1)

OldManAndTheC++ (723450) | more than 7 years ago | (#17535646)

The real purpose of Google's involvement is to scan the skies for evidence of other Google-like entities, so they can gang up on us carbon-based lifeforms and take over the galaxy.

Don't think you can seduce us with your efficient search engine and high stock value. We're onto you!!

Search this! (2, Interesting)

xebecv (1027918) | more than 7 years ago | (#17536282)

Hm, Google searching space... I'm waiting for the time Google will search inside people's bodies and catalog their illnesses.

+5 Informative (-1, Redundant)

suv4x4 (956391) | more than 7 years ago | (#17536286)

The telescope will generate 30 TB of data a night

That's a lot of info.

Re:+5 Informative (3, Insightful)

VitaminB52 (550802) | more than 7 years ago | (#17536888)

The telescope will generate 30 TB of data a night

That's a lot of info.

No, that's a lot of data. Info is the result of analysing the data.

Re:+5 Informative (0)

Anonymous Coward | more than 7 years ago | (#17537104)

I reckon they should use holographic storage so that they could catalog more data than would otherwise be practicable (look for long-term trends).

mod Down (-1, Troll)

Anonymous Coward | more than 7 years ago | (#17536454)

First organizatiOn is the ultimate BSD sUx0rs. What

Nevermind Google Earth... (1)

flajann (658201) | more than 7 years ago | (#17536540)

Will we be seeing the beginnings of a Google Universe?

Impressive...

Re:Nevermind Google Earth... (1)

On Purpose (1049714) | more than 7 years ago | (#17554600)

Hmm. This conjures up the idea of a Google Earth-style function to waste our time on, inevitably leading to the discovery of phallic graffiti left on the moon by its various visitors. http://googlesightseeing.com/2006/12/14/pen-15-club/ [googlesightseeing.com]

Imagine... (1)

doti (966971) | more than 7 years ago | (#17536606)

The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.
Imagine a Beowulf cluster of these!

Dull viewing (2, Funny)

caluml (551744) | more than 7 years ago | (#17536636)

The telescope will generate 30 TB of data a night, for 10 years, from a 3-gigapixel CCD array.

I bet it makes dull viewing. Sort of like the recent Ashes Tests in Australia. If you're English.

Curling's not quite cricket (1)

Oddscurity (1035974) | more than 7 years ago | (#17537554)

In fact, it makes watching a cricket match absolutely riveting in comparison. And with cricket there's always the possibility of a spaceship landing to unload a bunch of robots in search of the trophy... which is probably the reason why people watch it, just in case it happens when they do.

Why Wait? World Wind has SDSS NOW (1)

AnswerIs42 (622520) | more than 7 years ago | (#17538158)

Why wait for this when the Sloan Digital Sky Survey http://www.worldwindcentral.com/wiki/Sdss [worldwindcentral.com] is available in NASA World Wind... NOW. (Yet again, Google is not the first to do something.)

I just worry that with Google "helping", the imagery could be locked up so that not everyone has free and equal access to the data.

edgy science at the edge of computing (1)

peter303 (12292) | more than 7 years ago | (#17541296)

It's routine in physics collider experiments and seismic exploration to collect several terabytes a day. The limiting factor seems to be data management.