
Digitizing 100 Years of Astronomical Data

ScuttleMonkey posted more than 6 years ago | from the jobs-for-photosynth dept.

Space 115

Maximum Prophet writes to mention that a collection of glass plates containing astronomical information from the late 19th century through the mid-1980s is being considered for digitization. "The accumulated result weighs heavily on its keepers on Observatory Hill, just up Garden Street from Harvard Square: more than half a million images constituting humanity's only record of a century's worth of sky. 'Besides being 25 percent of the world's total of astronomical photographic plates, this is the only collection that covers both hemispheres,' said Alison Doane, curator of a glass database occupying three floors, two of them subterranean, connected by corkscrew stairs. It weighs 165 tons and contains more than a petabyte of data. The scary thing is that there is no backup." I'm sure that anyone with a spare $5 million or so would be welcomed with open arms.


115 comments

That's quite a bit. (2, Funny)

L. VeGas (580015) | more than 6 years ago | (#19831693)

165 tons of glass plates?

Sounds like a typical lunch clean-up after Rosie O'Donnell.

Sorry. I'm truly sorry.

Re:That's quite a bit. (0)

Anonymous Coward | more than 6 years ago | (#19831733)

Sorry. I'm truly sorry.
Yep.

Glass plates will outlive the digital"backup" (4, Insightful)

gatkinso (15975) | more than 6 years ago | (#19831741)

now there is some irony.

Re:Glass plates will outlive the digital"backup" (5, Funny)

WhatHappenedToTanith (1126905) | more than 6 years ago | (#19831967)

Are you sure about the stability of glass plates? I hear a lot of people have real trouble with windows stability! Sorry, I'll go now....

Luckily glass isn't a liquid.... (2, Insightful)

Joce640k (829181) | more than 6 years ago | (#19833941)

Luckily glass isn't a liquid so they won't distort.

Re:Luckily glass isn't a liquid.... (1)

ZachMG (1122511) | more than 6 years ago | (#19834829)

You know, we could also ingrain the data into the structure of Twinkies and it would withstand a nuclear holocaust, as long as Rosie didn't get to it first.

Re:Luckily glass isn't a liquid.... (3, Interesting)

ajs (35943) | more than 6 years ago | (#19835547)

Glass *is* a liquid (sort of) [wikipedia.org], but it does not flow, which is what I think you were getting at.

Re:Luckily glass isn't a liquid.... (0)

Anonymous Coward | more than 6 years ago | (#19837431)

Glass does flow...a look at 80-150 year old windows will show this.

Re:Luckily glass isn't a liquid.... (2, Informative)

ajs (35943) | more than 6 years ago | (#19837853)

Glass does flow...a look at 80-150 year old windows will show this.
Please follow the link in the post you were replying to, and/or look this up on snopes. This is not true. 80-150 year old glass is simply warped due to less advanced manufacturing techniques, and often thicker at the bottom because window makers tended to place the thicker edge at the bottom.

Re:Glass plates will outlive the digital"backup" (4, Insightful)

KokorHekkus (986906) | more than 6 years ago | (#19832101)

now there is some irony.
But currently that also makes them vulnerable to a single point of failure (as indirectly pointed out in the article). If you have data of any real value to you, then having only one copy (or only one storage facility) isn't real protection, whatever method you use. In this case we have data that would be readily accepted for backup by organisations all around the globe and, barring a worldwide upheaval, the safety of the data would be much better than any single glass plate could offer.

Of course the ideal would be if we could develop a cheap digital permanent storage that had guaranteed physical longevity, say several millennia. That combination would allow easy dissemination of the data and safety by using a multiplicity of sources.

Re:Glass plates will outlive the digital"backup" (1)

bryan1945 (301828) | more than 6 years ago | (#19834123)

"Of course the ideal would be if we could develop a cheap digital permanent storage that had guaranteed physical longevity, say several millennia"

But would it be compatible with Office 5007 or OO v3452.23?

Re:Glass plates will outlive the digital"backup" (1)

DerekLyons (302214) | more than 6 years ago | (#19834511)

Of course the ideal would be if we could develop a cheap digital permanent storage that had guaranteed physical longevity, say several millennia. That combination would allow easy dissemination of the data and safety by using a multiplicity of sources.

Ceramic/fired clay tablets will do nicely. There are likely to be density issues, however.

Re:Glass plates will outlive the digital"backup" (3, Informative)

seaturnip (1068078) | more than 6 years ago | (#19832515)

So what? Copy the digital version onto a second set of disks when it comes close to expiring.

Lossless copying means that given a little bit of maintenance, expiration of digital media is a nonissue.

Re:Glass plates will outlive the digital"backup" (1)

Venik (915777) | more than 6 years ago | (#19833727)

Ever tried to maintain archival backups for a petabyte-worth of data? You would need a team of operators, millions of dollars worth of storage hardware, expensive climate-controlled facilities. This will not be a one-time expense either. The best way to back up 165 tons of photographic plates is to use more photographic plates to hold microfiche sets. Not exactly cutting-edge technology, but it will outlast any tape archive or storage array. And it will be cheaper.

Re:Glass plates will outlive the digital"backup" (1)

Smauler (915644) | more than 6 years ago | (#19834283)

A petabyte is not that impressive any more. It's only a few thousand consumer-level hard drives. I do realise that the challenge of managing all that data is far from easy, but claiming that a petabyte requires "millions of dollars worth of storage hardware" is wrong. I just checked on Newegg: 500 GB is $110, thus 1 TB is $220, thus 1 PB is $220,000, and these are retail prices. Managing all that crap would be the main problem.
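The back-of-envelope pricing above can be sketched in a few lines of Python. The drive size and price are the figures quoted in the comment; the function name, the decimal definition of a petabyte, and the `copies` parameter are assumptions added for illustration, and the result covers raw drives only (no enclosures, power, or staff).

```python
# Figures quoted in the comment above; they are not current prices.
PRICE_PER_DRIVE_USD = 110      # one 500 GB consumer drive
DRIVE_SIZE_GB = 500
ARCHIVE_SIZE_GB = 1_000_000    # 1 PB in decimal units (assumption)


def raw_drive_cost(archive_gb, drive_gb=DRIVE_SIZE_GB,
                   drive_price=PRICE_PER_DRIVE_USD, copies=1):
    """Return (drive count, total USD) for `copies` full copies of the archive."""
    drives_per_copy = -(-archive_gb // drive_gb)   # ceiling division
    drives = drives_per_copy * copies
    return drives, drives * drive_price


drives, cost = raw_drive_cost(ARCHIVE_SIZE_GB)
print(f"{drives} drives, ${cost:,}")

# A second, off-site copy simply doubles the raw hardware figure.
drives2, cost2 = raw_drive_cost(ARCHIVE_SIZE_GB, copies=2)
print(f"{drives2} drives, ${cost2:,}")
```

Even with a full second copy the raw disks stay well under a million dollars at these prices, which is consistent with the point that management, not hardware, is the hard part.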

Re:Glass plates will outlive the digital"backup" (1)

sendai2ci (629417) | more than 6 years ago | (#19835109)

I don't think they want management as much as equal or better longevity. Some of these plates have lasted over a century, most over 25 years...

How long would a RAID5|ZFS system last? Ten years? At which point you can no longer get spare parts for the system (well, not easily).

Tape systems would last somewhat longer (due to tapes being mainly used for backups).

Of course cost of storage will go down, so perhaps they could continually upgrade to smaller higher capacity systems, but then it becomes a matter of continual long-term funding...

Perhaps they could ask NASA for help? Gold disks [wikipedia.org] ought to last a while...

Oh, and never mind the trouble they'll have with digitising the images in the first place...

Re:Glass plates will outlive the digital"backup" (2, Insightful)

FST777 (913657) | more than 6 years ago | (#19835167)

By that time, other techniques will be available to copy the digital archive over. Heck, it might even be possible to make a copy of the digital data on glass plates, complete with descriptions of the used protocol.

It's true that digitized data is more prone to failure than most analog carriers. The whole point is that digitized data is much more easily copied over and over again, without loss, independent of whatever carrier is used.

Re:Glass plates will outlive the digital"backup" (2, Interesting)

profplump (309017) | more than 6 years ago | (#19834359)

And your photographic copy would A) degrade over time and B) lose quality with each copy. IMHO that's not a very good archive. Moreover, in order to slow the inevitable decay that comes with time and reactive chemicals on paper/plastic/metal/whatever, you'd still need a climate-controlled facility. And you'd still need a team of operators to make the copy, and to make later copies as the earlier ones degrade. And more than anything else, you'd need someplace to store *another* 165 tons of photos, which is certainly larger than the space required to store a petabyte of data in a modern digital format.

I'm not really seeing how your photographic archive saves money. I'm not convinced it would produce better longevity either. You might get better longevity for a single copy than with digital data, but it's a whole lot cheaper to make digital copy #2 than to make photographic copy #2.

If you're worried about file formats you could simply leave a printed text detailing the data format. Then anyone with the ability to read the media would be able to recreate viewing software, even if none existed for then-modern computers.

If you're worried about being able to read the media then you're really worried about ongoing funding -- someone to continue preserving the archive in the future. That's a problem that exists regardless of the format of the archive; if someone decided they didn't want to keep paying for 3 floors of a building, or to continue making copies of the photographic archive, you'd still be in trouble.

Re:Glass plates will outlive the digital"backup" (4, Insightful)

Cecil (37810) | more than 6 years ago | (#19834555)

Ever tried to maintain archival backups for a petabyte-worth of data?

Yes, as a matter of fact. Definitely a lot of work is involved, but do you believe that you wouldn't need a team of document managers, millions of dollars worth of floor space, and expensive climate controlled facilities for archival of microfiche? You most certainly do. It's a lot of data. Period. No matter what you try to do with it, it's a lot of data. It's going to require a lot of resources. That's just a fact of life.

Anyway, no one in their right mind would choose microfiche for that type of data. If you're only storing plain text pages it's adequate (though I still don't think it would be the "right way to do it" in this day and age), but for photographic plates? Not going to work.

Microfiche is vastly overrated, in my opinion. My current project involves taking 2 floors worth of 30-50 year old microfiche and scanning it, OCRing it, and PDFing it. Yes it certainly does age. Quite poorly, in fact. The quality is absolutely terrible compared to the paper versions, some of it is stuck together, and indexing and cataloging it is a nightmare all of its own.

Yes, there are challenges in the digital world too, but most are easily surmountable given a little bit of common sense in understanding that digital is not magic. It doesn't mean you can "fire and forget". The documents will still require maintenance, cataloging, protection and monitoring. Format obsolescence is very nearly a nonissue, it is blown way out of proportion. That's where the "maintenance" comes in. The key benefit of digital is that you can and should losslessly upgrade your format whenever obsolescence is becoming a concern. Formats do not disappear overnight and suddenly everyone forgets what to do with them, you have plenty of time to make your transition if you're paying attention (which you must be: again, digital is not magic).

Microfiche is terrible, short life span (1)

fantomas (94850) | more than 6 years ago | (#19835431)

Microfiche has a short life span. When I was working at the Royal Greenwich Observatory they'd done some research and discounted that as a feasible option. Something like 25 years if you're lucky? The glass plates in the RGO were from the same period as these American ones, and in equally reasonable condition (in most cases.... the problem was there as well... we were transferring them to acid free paper sleeves).

Re:Glass plates will outlive the digital"backup" (0)

Anonymous Coward | more than 6 years ago | (#19832821)

uh, no. If the backups were stored on optical media, with the glass plates being backed up outliving these backups, that would be irony.

Re:Glass plates will outlive the digital"backup" (1)

jon287 (977520) | more than 6 years ago | (#19834999)

Hey now...

People who curate glass museums shouldn't throw stones!

Re:Glass plates will outlive the digital"backup" (0)

Anonymous Coward | more than 6 years ago | (#19837779)

You're absolutely right. Who's going to shell out the next $5 million when the data has to be migrated?

BUT, it's still a good idea to have backups. We just need to find a smarter way to do it.

Would Google archive it, perhaps? (1)

gjyoung (320540) | more than 6 years ago | (#19831853)

I'd think they'd hit it.

Re:Would Google archive it, perhaps? (2, Insightful)

ScrewMaster (602015) | more than 6 years ago | (#19832327)

Google might do it just because it would be un-evil, and worth quite a few brownie points with scientists around the globe, not to mention that it would be a cool archive to search.

Re:Would Google archive it, perhaps? (4, Funny)

networkBoy (774728) | more than 6 years ago | (#19832851)

Google Universe (beta)
Searchable in Lat/Lon/time/intensity
that would be awesome...

Saving entertainment. (0)

Anonymous Coward | more than 6 years ago | (#19831857)

"I'm sure that anyone with a spare $5 million or so would be welcomed with open arms."

I'll do it. I've saved all the money I didn't spend by downloading entertainment, and I'm willing to give to a more worthy cause.

Re:Saving entertainment. (1, Funny)

Anonymous Coward | more than 6 years ago | (#19832183)

You only saved $5 million? Are you on dialup or something?

data/mass ratio (1)

tenco (773732) | more than 6 years ago | (#19831877)

That's about 6e6 Bytes per gram. Digitizing that data means lots of redundancy while preserving the total mass of this collection.

Re:data/mass ratio (3, Informative)

HappyEngineer (888000) | more than 6 years ago | (#19832677)

It depends on the number of pounds in a ton, but if it's short tons then

165 short tons = 149,685,482 grams
1e15 / 149,685,482 = 6,680,674 bytes per gram

A quick check of Amazon turns up a 1TB drive which weighs 2.4 pounds.
That's 1,089 grams which is 918,592,757 bytes per gram.

Unless I've messed up my math, it looks like hard drives store 137 times more information per gram. That's not as large a multiple as I had imagined though. The whole thing should still be between 1 and 2 tons when put on hard drives.
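For anyone who wants to re-run the numbers, here is a minimal Python sketch of the same calculation. The masses and capacities are the ones quoted above (165 short tons and 1e15 bytes for the plates, 2.4 pounds and 1e12 bytes for the drive); the variable names are made up for illustration.

```python
GRAMS_PER_POUND = 453.59237
POUNDS_PER_SHORT_TON = 2000


def bytes_per_gram(total_bytes, mass_grams):
    """Storage density of a medium in bytes per gram."""
    return total_bytes / mass_grams


# Glass plate collection: 165 short tons holding roughly a petabyte.
plates_grams = 165 * POUNDS_PER_SHORT_TON * GRAMS_PER_POUND
plates_density = bytes_per_gram(1e15, plates_grams)

# One 1 TB hard drive weighing 2.4 pounds.
drive_grams = 2.4 * GRAMS_PER_POUND
drive_density = bytes_per_gram(1e12, drive_grams)

# How much denser the drives are, and what the collection would
# weigh (in short tons) if it were stored on such drives.
ratio = drive_density / plates_density
drives_tons = 165 / ratio

print(f"plates: {plates_density:,.0f} bytes/g")
print(f"drives: {drive_density:,.0f} bytes/g")
print(f"ratio:  {ratio:.1f}x, about {drives_tons:.1f} short tons on drives")
```

The ratio works out to about 137.5, so the estimate of between 1 and 2 tons of hard drives holds up.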

This sounds like a job for Google (2, Insightful)

MDMurphy (208495) | more than 6 years ago | (#19831897)

Google provides views of the Earth, Moon and Mars, why not stars? If the information was made available for them to deliver to their users, they might be interested.

Re:This sounds like a job for Google (1)

TheSciBoy (1050166) | more than 6 years ago | (#19835247)

What a lousy internship that'll be.

"Yeah. Mmmm. I'm going to have to have you do this job for me; it's digitizing this 165 tons of astronomical photographs on glass plates. Mmm. It would be good if you're done before the summer is over. That's great. Mmmm. Thanks."

Irony (0, Flamebait)

iknownuttin (1099999) | more than 6 years ago | (#19831919)

...examined by a staff of "computers," women hired by Harvard, initially at 25 cents an hour, because they were good at math.

Who else felt the pang of irony considering what happened at Harvard last year with the then president saying something about women and science and math?

A Million People With $5 (3, Insightful)

Stranger4U (153613) | more than 6 years ago | (#19831947)

This seems like a great opportunity for either corporate sponsorship, or a grass-roots donation drive. In all honesty, $5 million isn't a whole lot of money for the likes of any real corporation, and it probably wouldn't be that hard to raise it through small donations from individuals. Especially if you could ascribe names to some or all of it. How would it feel to be able to personally identify which plates you paid to have scanned? (this image of the Crab Nebula brought to you by John Smith) I'm surprised Paul Allen or Richard Branson aren't all over this like stink on shit.

A Million Geeks With pocket lint. (0)

Anonymous Coward | more than 6 years ago | (#19834371)

"This seems like a great opportunity for either corporate sponsorship, or a grass-roots donation drive. In all honesty, $5 million isn't a whole lot of money for the likes of any real corporation, and it probably wouldn't be that hard to raise it through small donations from individuals."

I doubt there's a million geeks on the entire planet.

Google (4, Insightful)

blhack (921171) | more than 6 years ago | (#19831953)

I'm sure that a company like google would be MORE than willing to fund a project archiving these. The positive press, proliferation of their intended "do no evil/good guy/just another bunch of geeks" image, having their name on a major scientific project would easily be worth the investment.

Re:Google (1)

dpilot (134227) | more than 6 years ago | (#19832889)

I'm sure Microsoft has a way to store this data, no doubt with an "open" format. Things like this are too important to trust to anyone but Microsoft!

Backup? (1, Interesting)

Anonymous Coward | more than 6 years ago | (#19831957)

Of course, as long as they can keep mildew at bay, odds are that the plates will long outlast any digital record. Of course it always makes sense to keep a backup, not to mention the value of an instantly-retrievable library.

Why does it have to cost so much? (0)

Lumpy (12016) | more than 6 years ago | (#19832071)

I spearheaded a "digital backup" of around 90 filing cabinets of papers at my last job by spending $1500 on a decent scanner setup and paid a pair of temps to scan everything in. They simply scanned the documents, named the scan the identifier at the top of the page (invoice number or recordID number) and then went to the next one.

It took 2 years and way WAY WAY less than $5,000,000 to do it. Granted, these are delicate glass plates, but hiring a pair of competent people or astronomy students to do it would not take 5 million dollars unless someone was trying to squeeze a summer home and a new Ferrari out of the deal.

A decent lighted copy table and a cheapish 20 megapixel Hasselblad digital camera with the right lens will do the trick nicely.

Re:Why does it have to cost so much? (3, Insightful)

JohnnyGTO (102952) | more than 6 years ago | (#19832177)

Those plates, as well as being old and delicate, contain a LOT more information than a piece of paper. Considering that something less than 1/4 the size of the period at the end of this sentence is important, you're scanning at a much higher resolution.

Re:Why does it have to cost so much? (5, Informative)

ghostlibrary (450718) | more than 6 years ago | (#19832487)

> I spearheaded a "digital backup" of around 90 filing cabinets of papers ...
> It took 2 years and way WAY WAY less than $5,000,000 to do it

500,000 plates. Over 2 years, assuming 50 wks/yr means just 5000 plates need be scanned per week. 1000 plates per day. 125 plates per hour. And this is large, fragile glass with really high data density, so you have to be a) careful in handling and b) use slow high-res scanning.

Let's take a guess that it takes only 10 minutes per plate (to fetch, tag, load, scan, and return). So we need only 20 people to scan 125 plates/hour.

Well, assume 20 scanning people and 1 IT guy handling the sysadmin work for the petabyte storage. Also one scientist/manager. Take a low intern/grad student $35k, 1 sysadmin at $65k, 1 PM/sci at $85K. All x2.5 for overhead, for 2 years. That's $4.25 mil in salaries.

There's also buying a redundant petabyte and all the necessary gear. I'm amazed they figure $5mil can do it.
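The estimate above is easy to reproduce. The sketch below uses only the figures from the comment (500,000 plates, 2 years, 50 weeks a year, 10 minutes per plate, the three salaries and the 2.5x overhead); the 5-day week and 8-hour day are assumptions inferred from the 5000/1000/125 progression, and the function names are illustrative.

```python
def required_rate(total_plates, years, weeks_per_year=50,
                  days_per_week=5, hours_per_day=8):
    """Plates that must be scanned per week, per day and per hour."""
    per_week = total_plates / (years * weeks_per_year)
    per_day = per_week / days_per_week
    per_hour = per_day / hours_per_day
    return per_week, per_day, per_hour


def salary_bill(staff, overhead, years):
    """Total cost for `staff`, a list of (headcount, annual salary in USD)."""
    base = sum(headcount * pay for headcount, pay in staff)
    return base * overhead * years


per_week, per_day, per_hour = required_rate(500_000, years=2)

# At 10 minutes per plate one person handles 6 plates an hour.
plates_per_person_hour = 60 / 10
people_needed = per_hour / plates_per_person_hour

total = salary_bill([(20, 35_000), (1, 65_000), (1, 85_000)],
                    overhead=2.5, years=2)

print(per_week, per_day, per_hour)      # plates per week / day / hour
print(f"{people_needed:.1f} people")    # slightly more than the 20 assumed
print(f"${total:,.0f} in salaries")
```

Note that 125 plates an hour at 6 plates per person comes to just under 21 people, so rounding down to 20 makes the salary figure slightly optimistic.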

Re:Why does it have to cost so much? (1)

mikael (484) | more than 6 years ago | (#19832635)

The storage hardware wouldn't need to be stored onsite - all they would need is a high-speed data connection through a fibre-optic link wired up to the optical scanner. For redundancy, you would want the digital backup stored somewhere else as well as onsite.

Re:Why does it have to cost so much? (1)

simong (32944) | more than 6 years ago | (#19835533)

In fact, they should put them in Picasa and let Google store them.

>>coat.

Re:Why does it have to cost so much? (1)

going_the_2Rpi_way (818355) | more than 6 years ago | (#19833339)

Well, they could work on a cost-recovery basis at first, and prioritize 'client' files on demand, adding them to the master DB as they go. If two or more universities want the same plate or same series of plates, they can form an alliance to split the cost. That way, high demand or high interest plates are scanned first, and losses are minimized. That's not unlike what they've done for GIS datasets in many places for instance, and governments are often the largest single client so far as I know.

Re:Why does it have to cost so much? (0)

Anonymous Coward | more than 6 years ago | (#19833971)

Take a low intern/grad student $35k
I take offense that you lumped low-intern and grad student. I think us astro grad students deserve at least high-intern status. And I *wish* we made even close to 35k a year!

Re:Why does it have to cost so much? (1)

geekoid (135745) | more than 6 years ago | (#19832831)

haha, man, I run into people like you and wonder, "Have people always been unable to think?"

You are comparing 90 filing cabinets of paper to this.
Never mind that (a) the paper isn't at all as fragile, (b) it's not nearly as much data, and (c) they need special scanners. They would need at least 5 people to do this, probably 9, since you are going to want to have people whose only job it is to move the plates.

I wonder what you think the true cost of the work you mentioned was?
I would guess it at about 200K for your puny scanning job. This job is easily 25 times more complex than what your company did.

Re:Why does it have to cost so much? (1)

hey! (33014) | more than 6 years ago | (#19833765)

In addition to the other points made, remember that the plates are fundamentally analog. Last year's memos, even if they are typewritten, are discrete. A smudge that makes a letter hard to read hardly matters, but it could be the one item you are looking for on the plates, irretrievably lost.

Re:Why does it have to cost so much? (2, Insightful)

geekyMD (812672) | more than 6 years ago | (#19834011)

Holy crap dude, you just won the asshat of the year prize. Do you have any idea of the magnitude, delicacy, or importance of the data you're talking about? To say nothing of the needed precision when scanning.

"I scanzord 90 filing cabinets of paper into teh computerz"

You know what, I used to launch model rockets. It's really easy to make stuff go up. Just buy the kit, attach a little engine and off it goes. $30 easy! Freakin NASA I bet they're spending all of our tax dollars on pr0n.

"cheapish 20megapixel camera" - Ever hear of the Hubble? I hear people like it for more than those weird nebulae pictures. I guess we should have just given one of those astronuts a Nikon and let him go to town. Much cheaper.

And I guess we should use lossy compression, it's just empty space out there right? I bet we could get the infinite sky down to a couple hundred GB. (JPEG, it's for astronomy too!)

Re:Why does it have to cost so much? (2, Informative)

Chief Camel Breeder (1015017) | more than 6 years ago | (#19835581)

You need specialized scanning machines for astronomy. Office equipment doesn't do the job.

  • The plates have to be scanned in transmission, not reflection (they are photographic negatives).
  • You have to accurately measure the darkness of the plate in order to deduce the light intensity that fell on it. Office scanners only approximately measure the light and dark - enough for visual presentation, not enough to do maths with the result.

My colleagues in the UK had such a scanner. It was ~7 tonnes of metal, glass and electronics (heavy so as to be very stable), lived in its own building and needed several clever people to keep it running. Building one of these (or cloning one you already have so as to work faster) could cost a big chunk of the $5M.

The scanner I knew took ~30 minutes to scan a plate. For the Harvard collection, choose between one scanner (which they may not have; otherwise why did they wait until now to start the project?) and a long project with a big salary bill, or multiple scanners, at extra capital cost, and less money for people.
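To put rough numbers on that trade-off, here is a small Python sketch. The 30 minutes per plate comes from the comment above and the 500,000 plate count from earlier in the thread; the working-hours defaults and the function name are assumptions, so treat the output as an order-of-magnitude guide only.

```python
def project_years(plates, minutes_per_plate, scanners,
                  hours_per_day=8, days_per_year=250):
    """Years needed to scan `plates` with `scanners` machines running in parallel."""
    scanner_hours = plates * minutes_per_plate / 60
    return scanner_hours / (scanners * hours_per_day * days_per_year)


PLATES = 500_000

# One scanner on a normal working schedule.
print(project_years(PLATES, 30, scanners=1))

# One scanner running around the clock, every day of the year.
print(round(project_years(PLATES, 30, scanners=1,
                          hours_per_day=24, days_per_year=365), 1))

# Ten scanners on a normal working schedule.
print(project_years(PLATES, 30, scanners=10))
```

At that scan speed a single machine is not a realistic option on any schedule, which is why the capital cost of extra scanners (or a faster scanner design) matters so much to the budget.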

InfiniBytes (4, Informative)

Doc Ruby (173196) | more than 6 years ago | (#19832095)

contains more than a petabyte of data

Glass photographic plates, especially from silver emulsion, are analog at extremely fine granularity. Effectively molecular, depending on how flat the glass surface was settled from its molten liquid state. The features of its silver oxide crystals, laid in place by individual photons arriving from vastly distant stars, could be meaningful at less than a nanometer. Especially when measuring extremely subtle influences, like the gravity from one distant star bending the light of another distant star, measured across a century in which those stars lost gravitational mass, for comparison.

There is a practically infinite amount of data on each of those plates, limited by our precision in measuring them. It's a smaller degree of infinity than that of the sky. But the original infinite sky is lost. While the plates' lesser infinities are impossible to replace, and all we'll get to use to look back across all the billions of years we saw in a long century of them.

Re:InfiniBytes (0)

Anonymous Coward | more than 6 years ago | (#19832267)

I don't believe the optics used are of sufficient quality to warrant that statement. I highly doubt that the Airy disk is as small as you are implying.

Re:InfiniBytes (1)

Doc Ruby (173196) | more than 6 years ago | (#19833151)

The optics can be subtracted from the recording if their specific details are known. That's the beauty of analog recording: it is infinite.

And when the sampled phenomenon is as vast as all of interstellar space, that infinitude is relevant.

Re:InfiniBytes (1)

sholden (12227) | more than 6 years ago | (#19833807)

It's obviously not infinite. There is some number of atoms involved in the image, and they can only take on a finite number of states and still be an image on a plate. Plus of course the pigeonhole principle should make it pretty obvious that the image on the plate can't represent all possible configurations of the atoms in the part of the universe it took an image of (since there are more of them than there are possible configurations of the plate) and hence not everything has been recorded and hence it's not "infinite".

Good luck distinguishing all the atoms on the facing surface of every star captured on one of those infinite analog recordings.

Re:InfiniBytes (1)

Doc Ruby (173196) | more than 6 years ago | (#19833985)

I said "practically infinite". That's because the data is not limited by the storage medium, but by our limited ability to read it.

It's funny how many people jumped on me for my pointing out how much more than a "petabyte" is on those plates. But no one has joined me in laughing at the "petabytes" claim.

Slashdot is retarded.

Re:InfiniBytes (1)

sholden (12227) | more than 6 years ago | (#19834905)

Please point out the word practically in: "The optics can be subtracted from the recording if their specific details are known. That's the beauty of analog recording: it is infinite" cause I sure can't see it.

Maybe if you didn't make completely ridiculous claims, people would join in with you laughing at other people making less ridiculous claims? The claimant in question, the curator of the collection, possibly knows a little about it (though less about computing and digital versus analog). That claim is certainly closer than yours: whatever the actual storage requirement for representing those plates is (store the atomic configuration, Heisenberg be damned), the gap between it and a petabyte is smaller than the gap between infinity and it :)

Re:InfiniBytes (1)

Bugmaster (227959) | more than 6 years ago | (#19833845)

Ok, let's say that we've managed to get the original optics. We've discovered that, due to slight focusing error (which is fixed) and atmospheric effects (which are chaotic), each point on the photographic plate was blurred by 0.01 mm. How do you recover the "infinite" precision from this data ?

Re:InfiniBytes (1)

Doc Ruby (173196) | more than 6 years ago | (#19833957)

"Blurred" is an oversimplification. What we could get from the original optics is the function with which the optics convolved the incoming signal before it was recorded. So we could deconvolve the data on the plates. In a hundred years, I expect our skills at such deconvolution of optics at the nanoscale will be quite good. We held those plates for a century, we ought to hold them for at least another to get the full use of them.

Re:InfiniBytes (0)

Anonymous Coward | more than 6 years ago | (#19834833)

OK. We'll keep 'em at your place.

Re:InfiniBytes (4, Insightful)

modecx (130548) | more than 6 years ago | (#19832497)

There is a practically infinite amount of data on each of those plates, limited by our precision in measuring them.

And limited by the lenses/mirrors, and limited by atmospheric effects, and inconsistencies in the glass, and the silver, and, and....

I can't testify to the quality of the glass negatives, but I can testify to the fact that as much as people like to believe, even the best modern analog capture sources aren't anywhere near practically infinite, even in the best laboratory conditions.

Re:InfiniBytes (2, Insightful)

Doc Ruby (173196) | more than 6 years ago | (#19832611)

Well, the lenses/mirrors that are now lost to history do introduce noise. But the atmospheric effects, and inconsistencies in the glass and silver, and probably much of the "writing" noise from the optics do all hold the possibility of being filtered out. Maybe not now, with today's early signal processing tech. But in another hundred or more years, that signal info could be available. If we don't damage them in the interim.
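The kind of filtering argued about here, and the deconvolution mentioned earlier in the thread, can be illustrated with a toy example. The Python sketch below is not anything the Harvard project is known to use: the 16-pixel "sky", the blur kernel, and all function names are hypothetical, the blur is assumed to be known exactly, and there is no noise at all.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform (fine for tiny inputs)."""
    n = len(values)
    sign = 1 if inverse else -1
    result = []
    for k in range(n):
        total = sum(values[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t in range(n))
        result.append(total / n if inverse else total)
    return result


def circular_convolve(signal, psf):
    """Blur `signal` with `psf`, wrapping around at the edges."""
    n = len(signal)
    return [sum(signal[j] * psf[(i - j) % n] for j in range(n))
            for i in range(n)]


def deconvolve(blurred, psf):
    """Undo a known blur by dividing in the frequency domain."""
    blurred_f = dft(blurred)
    psf_f = dft(psf)
    restored_f = [b / h for b, h in zip(blurred_f, psf_f)]
    return [v.real for v in dft(restored_f, inverse=True)]


# Hypothetical 16-pixel strip of sky with two point sources.
sky = [0.0] * 16
sky[3], sky[10] = 1.0, 0.5

# Hypothetical blur: 60% of the light stays put, 20% leaks to each neighbour.
psf = [0.6, 0.2] + [0.0] * 13 + [0.2]

blurred = circular_convolve(sky, psf)
recovered = deconvolve(blurred, psf)

worst_error = max(abs(a - b) for a, b in zip(sky, recovered))
print(f"largest reconstruction error: {worst_error:.2e}")
```

In this idealised case the original is recovered to within floating-point error. The catch, as other posters point out, is noise: wherever the blur's frequency response is small, the division amplifies whatever noise is present, so real plates with grain and atmospheric turbulence cannot be restored without limit this way.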

Re:InfiniBytes (0)

Anonymous Coward | more than 6 years ago | (#19832565)

There are physical constraints that limit what we can resolve even with perfect optics. We hit that limit long, long ago.

Without much technical effort, you'd just be measuring noise.

As an amateur astronomer and telescope builder, I don't see a whole lot of value in this project. Considering the age of the universe, not much is going to be noticed over a few decades, or even a few centuries.

At best we might be able to detect a few comets or asteroids. Big whoop. (unless one of them is a big one that's going to hit us)

Going to extreme lengths to maintain this data is just a bit anal.

Re:InfiniBytes (2, Informative)

Doc Ruby (173196) | more than 6 years ago | (#19833223)

In another hundred years that kind of data collection will probably be easy. But still extremely valuable, because the data recorded in them is irreplaceable.

If the astronomers who recorded these plates weren't anal, then astronomy wouldn't be advanced enough by now for you to enjoy it as an amateur.

Re:InfiniBytes (1)

Smauler (915644) | more than 6 years ago | (#19834409)

You're ignoring noise, and noise is a huge factor with analogue (and digital) readings. Many people in the past have ascribed meaning to analogue readings that was not there. At the nanometer level, especially, I think it's almost bound to be all noise for most applications.

Saying that "The features of silver oxide crystals [...] could be meaningful at less than a nanometer" is not actually saying anything. If they are meaningful, say they are. If there is evidence of them becoming meaningful, reference the evidence.

Re:InfiniBytes (2, Insightful)

monopole (44023) | more than 6 years ago | (#19834693)

Having worked with holographic media for decades (which is about as fine a resolution as you can get optically), the maximum resolution is on par with the grain size, 40 nm (Agfa 8E75), and considerably worse in practice, due both to the wavelength of light and to the expansion of grains during exposure. To get 'molecular' resolution you'd have to go over to dichromate plates, which are far too slow.

Due to speed considerations the grain of these plates would be much worse. But well within the resolution of the 'scope used for recording.

All that said these plates are a goldmine once digitized due to the ability to do massive searches both spatially and temporally.

Re:InfiniBytes (1)

Teancum (67324) | more than 6 years ago | (#19837833)

The article does mention that the staff tried to strike a compromise between scanning so finely that they would record noise (a problem when doing analog-to-digital conversion in any format, but especially image data) and scanning so coarsely that they would lose data.

You can debate the choice here in terms of where to draw this line, and to suggest that they use some archival quality file format that does lossless image compression, but this is something that did go into their consideration.

I'm so glad that they have chosen to make this effort now instead of 10 years ago, where the decisions of image quality would have been a bit different (perhaps quite a bit) and the data storage requirements for this project would have been pushing or even beyond the capabilities of technology at the time. Petabyte storage facilities are now starting to be developed (it is still state-of-the-art even now) but it isn't completely unusual. And image processing techniques and storage concepts have matured to the point that you can reasonably determine as an archivist what technological route to take. Ten or twenty years ago that wasn't the case at all.

My largest worry here is that it appears as though only the "normalized" data is going to be preserved. I would strongly urge this staff to keep and preserve the original scanning data along with the normalized data, and store that as two totally different databases. I do understand the research needs for standardizing the brightness data and compensating for different emulsions, viewing conditions, plate making techniques, and even exposure times. But raw data is raw data, and often you can mine that original data for more details later on that sometimes the sanitized data doesn't make out.

I've seen this as a problem with other scientific fields, especially with climatological data. The sanitized version is preserved digitally and only the raw analog data (read here paper records from weather stations) is left in analog form.... only to be left in a warehouse to rot and be eaten by critters. Or even simply destroyed now that the information has been digitized and is stored in a "compressed" form.

I also hope that after an effort like this digitizing project that the original plates are still preserved, so they can be referenced twenty years from now and compared to the digitized data but with the plates scanned with future processing techniques.

sounds familiar (4, Funny)

Anonymous Coward | more than 6 years ago | (#19832221)

anyone with a spare $5 million or so would be welcomed with open arms

That's what she said!

All this stuff should be digitized and made public (2, Insightful)

syousef (465911) | more than 6 years ago | (#19832689)

When I completed my Astronomy masters, access to publicly available data from various sources (most notably NASA data made free to the public) was a real boon. It meant we could do analysis on actual real data instead of artificial or sanitized textbook material. A couple of the students built on this to do some original research. (Sadly that's not the way I went, as my time was more limited).

There are also lots of amateurs out there running a wide variety of very specialized packages to do everything from discovering asteroids to keeping tabs on the brightness of stars and watching for supernovae.

Re:All this stuff should be digitized and made pub (1)

dargaud (518470) | more than 6 years ago | (#19835503)

Two years ago I worked in a place [gdargaud.net] doing some preliminary astronomy experiments previous to going BIG. I was doing atmosphere science. During a chat with the resident astronomer, I asked where their data was publicly available. His answer, in short: "absolutely not, it's our funding, so it's our data. We release only the final paper. We don't want competition from other labs/astronomers."

That answer astounded me as in our own project the point was to make the data public as efficiently as possible. I mean, their funding is public, so why not their data? I can understand holding onto it until you have a paper published, but after that it should be required in the funding statement. I don't know if this is typical of the field of astronomy, but I've searched high-res sky images in the past without finding anything systematic except some specific projects such as the Sloan Sky Survey (which are just coordinates) or the odd marketing Hubble shot.

Re:All this stuff should be digitized and made pub (1)

faxafloi (228519) | more than 6 years ago | (#19840811)

...but I've searched high-res sky images in the past without finding anything systematic except some specific projects such as the Sloan Sky Survey (which are just coordinates) or the odd marketing Hubble shot.

If that's all you found, you didn't look hard enough. Sloan serves imaging and spectral data, and all of Hubble's science data (for example) has been available from three different data centers since 1992. (This is data we're talking about, not pretty pictures.) In fact, all NASA-funded missions are required to archive their data, and NSF is (finally) getting into the act. I don't know what ESA requires, but I know they're building a large archive. And just about every large ground-based project in development has a significant archival component.

I'd say your previous employer's attitude ("our funding, our data") is the exception nowadays. Even privately funded projects are looking at archives, if only to connect to the VO [ivoa.net].

Harvard can handle the burden (4, Informative)

tchdab1 (164848) | more than 6 years ago | (#19832835)

From here: http://www.hno.harvard.edu/guide/finance/index.html [harvard.edu],

This:
Harvard University's endowment, valued at $25.9 billion at the end of FY 2005, is a collection of more than 10,800 separate funds established over the years to provide scholarships; to maintain libraries, museums, and other collections; to support teaching and research activities; and to provide ongoing support for a wide variety of other activities. The great majority of these funds carry some type of restriction.

I think they can scare up the change.

Re:Harvard can handle the burden (2, Informative)

Anonymous Coward | more than 6 years ago | (#19833435)

Absolutely correct. According to records, Harvard saw their endowment fund appreciate over 16% in a single year (FY2005). Sixteen percent of $30 billion is nearly $5 billion which would allow them to quite easily fund this project. Even if Harvard has the fund invested in an interest-bearing account at 5%, they're still seeing around $1.5 billion per year in interest income - something more than $4 million per day. This project is chump change.
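For anyone who wants to check the arithmetic above, here is a minimal Python sketch. The function name and the 365-day year are assumptions made for illustration; the $30 billion, 16% and 5% figures are the ones used in the comment, and $25.9 billion is the FY2005 endowment value quoted from Harvard earlier in the thread.

```python
def endowment_income(principal, annual_rate, days_per_year=365):
    """Return (yearly income, average daily income) for a simple annual rate.

    Hypothetical helper: it ignores compounding, restrictions on the
    individual funds, and payout rules, so it is only a back-of-envelope check.
    """
    yearly = principal * annual_rate
    daily = yearly / days_per_year
    return yearly, daily


if __name__ == "__main__":
    # Figures used in the comment: $30 billion at 16% and at 5%.
    for rate in (0.16, 0.05):
        yearly, daily = endowment_income(30e9, rate)
        print(f"$30.0B at {rate:.0%}: ${yearly / 1e9:.2f}B per year, ${daily / 1e6:.2f}M per day")

    # Same 5% rate applied to the $25.9 billion figure quoted from Harvard.
    yearly, daily = endowment_income(25.9e9, 0.05)
    print(f"$25.9B at 5%: ${yearly / 1e9:.2f}B per year, ${daily / 1e6:.2f}M per day")
```

Either way the $5 million price tag is on the order of a day or two of income at 5%, which is the point being made, though it says nothing about how much of that income is actually unrestricted.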

Re:Harvard can handle the burden (2, Interesting)

moosesocks (264553) | more than 6 years ago | (#19835387)

They might, but I doubt it. Unless they could potentially turn it into a media blitz, I genuinely doubt that Harvard (or any private institution for that matter) would pick up this sort of project.

If they did, they'd keep it private, and only share it amongst other institutions "prestigious" enough to be deserving of the blood and sweat of Harvard scientists.

I'm sorry, but the Ivy League has quickly degenerated into a billionaire's playground. If they turn away thousands of "perfectly qualified" applicants per year, and have all this money lying around, there are very few legitimate reasons not to capitalize on this, build up their capacity, and start being equitable about who gets to study/work there.

The Ivy League has become a game of prestige, and nothing more. I don't trust them with vital bits of science that could potentially go toward the public good. They've tarnished the name of academia.

Re:Harvard can handle the burden (0)

Anonymous Coward | more than 6 years ago | (#19838329)

You have to understand that while Harvard's endowment is an impressive amount of money, that money is spread across many semi-autonomous schools and centers (as noted in your quote). The funds that make up the endowment are donated to specific arms of the university and sometimes earmarked for a specific research activity. In this environment, you can't move the money around at will. That's not to mention that the $30bil represents long-term investment capital and not necessarily liquid assets; the yearly payout of the endowment is below 5% I believe. Harvard College is a well-funded part of the university, but even so, they perform a lot of functions and $5mil isn't chump change.

Re:Harvard can handle the burden (1)

drwho (4190) | more than 6 years ago | (#19840453)

The reason why Harvard HAS so much money is because they don't SPEND it. Kind of like that crazy uncle Joe you have (or wish you had), the one that drove a school bus for his whole adult life and died with an estate worth over a million dollars. Yeah, Harvard is weird like that. I should know. I am depending on a NSF grant for my salary at Harvard, the school doesn't seem to want to give me any of its money. However, they seem to fund all sorts of useless "humanities" programs (I am not saying that all humanities interests are useless, but that Harvard doesn't place enough value on hard science).

A great idea. (4, Interesting)

niktemadur (793971) | more than 6 years ago | (#19833201)

If they manage to standardize a century of these plates, it would significantly extend the time range of data to digitally extrapolate and detect objects previously missed. Just to speak of mapping our own cosmic backyard, a significant amount of slow moving, previously undetected Kuiper Belt Objects, for example, would more easily pop into view. Surely a bunch of comets, too.

Clyde Tombaugh captured Pluto several times during his three decades long hunt for the elusive Planet X, but failed to put the pieces together. If he had had digital technology, he would have shaved off at least a decade of effort. So imagine all the extremely useful raw data still stored in those plates.

Re:A great idea. (1)

dwarmstr (993558) | more than 6 years ago | (#19840033)

Uh... Tombaugh was 24 when he discovered Pluto aka Planet X. He later searched for other objects on the ecliptic, is that what you mean?

Alternate funding. (0, Flamebait)

going_the_2Rpi_way (818355) | more than 6 years ago | (#19833393)

A simple way to fund this would be to sell the scanned plates as scientific history artifacts/souvenirs. I bet you could not only sell them to universities worldwide but also to cosmologists, scientists and astronomy fans in general.

I mean heck, $10 a plate for 500K plates gets us to $5M. I'd pay that without even knowing what I was getting. Up it to $25, $50 or $100 and I'm probably still interested.

More than just a flat scan (4, Informative)

CraterGlass (893417) | more than 6 years ago | (#19835279)

There is more to this than simply scanning a flat image. The emulsion on these plates is a three dimensional medium, and different data can be extracted depending on your focal depth into the emulsion. I believe David Malin did much pioneering work on this kind of thing, including the use of different layers for unsharp masking.

There will be information in the plates that is not yet part of human knowledge, and a simple scan of one focal plane is not going to get it all.

Certainly it is worth taking backup images of these plates in any way we know how, but we should remain aware that, as of today, no technology exists that will make exact duplicates of them, so great care should always be taken to preserve the originals.

Re:More than just a flat scan (1)

Richthofen80 (412488) | more than 6 years ago | (#19838175)

no technology exists that will make exact duplicates of them

Well, I mean, there is no technology that exists to make an exact duplicate of ANYTHING. That doesn't mean that digitizing this information is useless, however.

You mention that these are three dimensional mediums. Fine, three-dimensional data capture is nothing new. You're right, plopping this thing on a scanner is not going to work. But I'm sure that in some way we can get an image at a set of depth intervals and save those. Now there is an additional search criterion, the depth.

Like anything else analog, there is a certain granularity that cannot be surpassed... there is only so much 'meaningful' data that can be extracted. To say that at some point humans will be more knowledgeable and be able to extract even more info from these is a little bit of a stretch. Sure, we get smarter, but when we get smarter, wouldn't it also be possible we'd find a much more efficient way of seeing how the universe used to look? Maybe by looking at reflections of the light that bounced off of the stars that was 100 years ago? (or something equally fantastic)

Not only that, but how often are these plates even looked at? Having them available digitally, with quick referencing, probably makes the information in these plates 100 times more valuable.

GoogleSky (2, Insightful)

12357bd (686909) | more than 6 years ago | (#19836021)

Seriously, let Google index not only that collection, but any stellar image information and launch GoogleSky.

Funding suggestion (0)

Anonymous Coward | more than 6 years ago | (#19836319)

I don't have US $5 Million to offer, but how about a suggestion. Set up a mechanism to have individuals/entities sponsor (pay for) the digitization of individual plates. The sponsor gets a public credit for being the sponsor (perhaps displayed around the exterior perimeter of the digital image also). I think the astronomy fans would help. Perhaps organizations like Sky and Telescope magazine or Astronomy magazine would sponsor groups of plates. Perhaps schools could be induced to have the kids collect to sponsor a plate for their class/grade. Once the public excitement is largely over, then see if a white knight will finish the job.

Solution: Roaming "Scanner" Trucks (1)

weinrich (414267) | more than 6 years ago | (#19836485)

Step 1: Ask 15 to 20 major companies to each sponsor a "scanning trailer". They'd get their name and logo all over it and be part of the on-going story and never-ending literature, etc.
Step 2: Build-out a tractor trailer per sponsor to include everything needed to do scanning of archived materials (books, papers, photos, glass photo plates, etc.). Power source, scanners (many per trailer), etc.
Step 3: Drive the swarm of scanning trucks to the parking lots of an archive in need of backup.
Step 4: Connect the truck's network output to the archive's network to store the scanned data.
Step 5: Get local volunteers to work with the "full-time professional" in each truck to retrieve (a little at a time), scan, and return the materials to the archive.

The plan would be to drive this swarm around the country, full time, and do this kind of work wherever it is needed.

Crunching the numbers (1)

Muad'Dave (255648) | more than 6 years ago | (#19838805)

If we assume there are a half million plates as the article states (let's call it a 512K, i.e. 2^19), and there's a petabyte (2^50) worth of uncompressed data on them, that's 2^31 bytes (2GB) per plate. Assuming 3 bytes/pixel and square plates, that's about 26750x26750 pixels. With a 12x12 inch plate, that'd be about 2230 pixels/inch. If the plates are smaller, say 4 inches, that goes up to a more respectable 6700 pixels/inch.
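The same back-of-envelope numbers can be reproduced with a short Python sketch. The function name and its defaults are hypothetical; they simply encode the assumptions stated above (2^50 bytes in total, 2^19 plates, 3 bytes per pixel, square plates 12 or 4 inches on a side).

```python
import math


def plate_scan_estimate(total_bytes=2**50, plates=2**19,
                        bytes_per_pixel=3, plate_sizes_in=(12, 4)):
    """Estimate per-plate storage and the scan resolution it implies.

    Assumes uncompressed data, square plates, and an even split of the
    total across all plates. Returns (bytes per plate, pixels per side,
    {plate size in inches: pixels per inch}).
    """
    bytes_per_plate = total_bytes // plates
    pixels_per_plate = bytes_per_plate / bytes_per_pixel
    pixels_per_side = math.sqrt(pixels_per_plate)
    ppi = {size: pixels_per_side / size for size in plate_sizes_in}
    return bytes_per_plate, pixels_per_side, ppi


if __name__ == "__main__":
    per_plate, side, ppi = plate_scan_estimate()
    print(f"bytes per plate: {per_plate} (2^{int(math.log2(per_plate))})")
    print(f"pixels per side: about {side:.0f}")
    for size, value in ppi.items():
        print(f"{size}-inch plate: about {value:.0f} pixels/inch")
```

Changing `bytes_per_pixel` to 2 (say, a 16-bit grayscale scan, which is only a guess at what a real scanner might produce) raises the implied resolution accordingly.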



Cost of storage? Free!!! They should get a few gmail accounts and store the scans there. Occasionally mail them between accounts for redundancy. 8-)

Both hemispheres? (1)

faxafloi (228519) | more than 6 years ago | (#19839095)

Not sure what's meant exactly by it being the "only collection to cover both hemispheres". The Digitized Sky Survey [stsci.edu] covers the whole sky and it's been online [stsci.edu] for 12 years.

Re:Both hemispheres? (1)

bmk67 (971394) | more than 6 years ago | (#19840849)

It's the only collection of photographic plates that covers both hemispheres. Despite DSS, this data is extremely important to astronomical research.

Funding, complexity, ownership (1)

drwho (4190) | more than 6 years ago | (#19840667)

I don't think it would be hard to find some company to pay $5m, if they could keep the rights to the images, and pull a Westlaw type of scam. I am sure Harvard-Smithsonian isn't going to fall for this. They want to keep these images for the public, which makes it difficult for anyone to build a business model on and therefore difficult to get funding for. How would Google make money on this? Google adwords for a particular star? Or perhaps on google maps - "coffee near Barnard's star"? I am not saying that Google won't do this, just that it's not as simple of a decision for them as you might think. There's really no way to prove its value to Google stockholders.

I am sure that the glass plates aren't going to be thrown away when this is done. They'll just be moved away from the very expensive Cambridge real estate on which they currently sit, and the space will be reused for storing more astronomers.