Slashdot: News for Nerds

Researchers Teach Computers To Perceive 3D from 2D

ScuttleMonkey posted more than 8 years ago | from the your-battlebot-wants-an-upgrade dept.

145

hamilton76 writes to tell us that researchers at Carnegie Mellon have found a way to allow computers to extrapolate 3 dimensional models from 2 dimensional pictures. From the article: "Using machine learning techniques, Robotics Institute researchers Alexei Efros and Martial Hebert, along with graduate student Derek Hoiem, have taught computers how to spot the visual cues that differentiate between vertical surfaces and horizontal surfaces in photographs of outdoor scenes. They've even developed a program that allows the computer to automatically generate 3-D reconstructions of scenes based on a single image. [...] Identifying vertical and horizontal surfaces and the orientation of those surfaces provides much of the information necessary for understanding the geometric context of an entire scene. Only about three percent of surfaces in a typical photo are at an angle, they have found."

145 comments

Awesome! (5, Funny)

rblum (211213) | more than 8 years ago | (#15534413)

Now run it on an Escher picture!

Re:Awesome! (1)

Tackhead (54550) | more than 8 years ago | (#15534431)

> Now run it on an Escher picture!

"Bite the fish-eye lens facing my Fembot's shiny metal boobs!"

Re:Awesome! (1)

vandon (233276) | more than 8 years ago | (#15534605)

FTFA: Using 300 images gleaned from a Google search....

I would like to see the results from the Google images with "safe search" turned off.

ILpS (0)

Anonymous Coward | more than 8 years ago | (#15534521)

Now run it on an Escher picture!

I knew I bought that machine spec'd at 3.27 infinite loops per second for a reason.

Capt. Kirk (0)

Anonymous Coward | more than 8 years ago | (#15534568)

Now run it on an Escher picture!

That's how Capt. Kirk will defeat the head android in the remake of "Mudd's Women"!

It'll be more entertaining than, "He always lies! .... I'm lying!

But, if he's lying then he's telling the truth!

But if he can't tell the truth because he always lies. But if he says he's lying, then he's telling the truth..."

Re:Awesome! (0)

Anonymous Coward | more than 8 years ago | (#15534601)

That's a silly fp........man

Re:Awesome! (0)

Anonymous Coward | more than 8 years ago | (#15534693)

Nope it's a great first post. Escher was the first thing I thought of when I read the article. But I guess that rules me out as an arbiter of slashdot fashion.

Re:Awesome! (1)

bill_mcgonigle (4333) | more than 8 years ago | (#15534609)

I had to reply to your comment since I was going to use the same subject.

For me, it's adding another item to the "things they said were impossible in CS class but are now available". The stuff Salient Stills is selling is another idea I had in school for a project - fortunately the grad students were able to show me how that was mathematically impossible too. :)

"Never say never", boys and girls. I'll get back in line for my FTL transporter then.

Re:Awesome! (1)

geobeck (924637) | more than 8 years ago | (#15534708)

Now run it on an Escher picture!

+++Out of cheese error+++
+++Please reboot universe+++
+++Redo from start+++

/TP's DW reference

Re:Awesome! (1)

Ant P. (974313) | more than 8 years ago | (#15534788)

You'll have to wait for the 4D version for that

dupe! (-1, Offtopic)

Anonymous Coward | more than 8 years ago | (#15534416)

dupe!

leaning tower (3, Interesting)

ZivZoolander (964472) | more than 8 years ago | (#15534434)

Wonder how this will handle those optical illusion photos, like me knocking over the Leaning Tower of Pisa, or holding the Statue of Liberty.

Re:leaning tower (2, Funny)

Tolleman (606762) | more than 8 years ago | (#15534701)

Just like us. Segmentation fault.

Re:leaning tower (1)

deathstar778 (743617) | more than 8 years ago | (#15535334)

I live in Pisa actually, and I can't stand seeing people trying to push the tower anymore!!!
AAAAAAAAAARGH! Try to imagine 300 folks making the same photo at the same time.
They look sooooo dumb!

Directly applicable to the car racing AI grand.... (3, Interesting)

ChrisGilliard (913445) | more than 8 years ago | (#15534443)

...challenge. I think Carnegie Mellon wants revenge against Stanford for beating them in the 2005 DARPA Grand Challenge. Maybe 2007 will be Carnegie Mellon's year to win the grand challenge. If this happens, we're only a hop skip and a jump to having these things drive us around (esp on freeways).

Bus-ted. (1, Funny)

Anonymous Coward | more than 8 years ago | (#15534494)

"If this happens, we're only a hop skip and a jump to having these things drive us around (esp on freeways)."

Man that would be a pretty neat invention [basintransit.com] .

Re:Directly applicable to the car racing AI grand. (1)

LiquidCoooled (634315) | more than 8 years ago | (#15534623)

Granted you can extrapolate an estimate of the surroundings for a 3d scene from a single image.
This is good when the source material doesn't exist.

However if I were in the grand challenge I wouldn't be swapping the (minimum) stereo imaging most cars appear to have.

1) it's an approximation and may not be applicable for different terrain or obstacles (similar rock against similar floor)
2) it's harder to fool 2 cameras than a single one; glitches could send you off the cliff.
3) with a stereo pair you can interpolate properly and produce a much better map.

Humans with one eye (and single image devices) benefit greatly when given a series of images because then the same interpolation can occur and the 3d scene can be rebuilt.
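
The stereo argument above rests on one textbook relation: for two parallel, rectified cameras, depth is inversely proportional to the pixel shift (disparity) between the two views. A minimal sketch, assuming an idealized pinhole rig; the function name and the numbers are hypothetical and not taken from any Grand Challenge vehicle:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth in metres of a point seen by two parallel, rectified cameras.

    focal_px     -- focal length expressed in pixels (assumed known)
    baseline_m   -- distance between the two camera centres, in metres
    disparity_px -- horizontal shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive; zero means the point is at infinity")
    return focal_px * baseline_m / disparity_px


# Hypothetical rig: 700-pixel focal length, cameras 0.5 m apart.
for disparity in (70.0, 35.0, 7.0):
    print(disparity, depth_from_disparity(700.0, 0.5, disparity))
```

Small disparities map to large depths, so a one-pixel matching error matters far more for distant obstacles than for near ones. A sequence of frames from a single moving camera can use the same relation, with the distance travelled between frames playing the role of the baseline.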

Re:Directly applicable to the car racing AI grand. (1)

Directrix1 (157787) | more than 8 years ago | (#15535028)

Well, that and we have a gigantic corpus of training data to extrapolate from.

Imagine the Possibilities (2, Interesting)

Valthan (977851) | more than 8 years ago | (#15534451)

One could conceivably take pictures of a city, upload them to this program, stitch the pieces together and then import it into a game world. How awesome would it be to actually be able to run around a city (say Toronto) and do things you always wanted to do... (dropping a penny off of the CN Tower and having it hit someone :D)

Re:Imagine the Possibilities (1)

-kertrats- (718219) | more than 8 years ago | (#15534465)

The Getaway [gamerankings.com] already has a startlingly accurate virtual London.

Errr... (5, Informative)

Ayanami Rei (621112) | more than 8 years ago | (#15534476)

you've always been able to do that.
Cities aren't the kind of thing this is targeted at.
You can get building plans and architectural drawings and everything from the city for free. There are algorithms that can easily map pictures to objects if you know ahead of time the shape of the things that "should" be there.

This stuff is for deciding the shape of unknown things, and more importantly, to gain new heuristics for image searches.

With this technology, you could ask for "things that are round, and have a box".

More importantly, you could show the computer one picture of something, and have it attempt to find more pictures of it (from different angles, with different colors, etc.). Like you show it a Volvo C90, and it shows you any and all pictures of Volvo C90s by the shape.

Re:Errr... (1)

Trigun (685027) | more than 8 years ago | (#15534504)

How about building a 3D representation of a terrorism suspect?

There's your grant money right there, boys!

Re:Errr... (1)

Kesch (943326) | more than 8 years ago | (#15534530)

With this technology, you could ask for "things that are round, and have a box"

Really...

hmm...

I was thinking "things that are round, and have a nipple"

Not for objects at all (2, Insightful)

moultano (714440) | more than 8 years ago | (#15534835)

This is only for outdoor scenes and only extracts planar information. It isn't designed for objects at all. It provides general geometric context, ie this area is ground, this area is a left facing wall, etc. That's not to say that a similar technique couldn't be used for identifying round objects, but that isn't what this is for.

Re:Errr... (2, Funny)

jackbird (721605) | more than 8 years ago | (#15535163)

You can get building plans and architectural drawings and everything from the city for free. There are algorithms that can easily map pictures to objects if you know ahead of time the shape of the things that "should" be there.

Dear Sir,

ha ha ha.

ha ha ha ha ha ha ha.

ha.

If only.

Signed,

every CAD operator in the world

X-Files (0)

th1ckasabr1ck (752151) | more than 8 years ago | (#15534454)

X-Files quote:

"Your scientists have yet to discover how neural networks create self-consciousness, let alone how the human brain processes two-dimensional retinal images into the three-dimensional phenomenon known as perception. Yet you somehow brazenly declare seeing is believing?"

-- Jesse "The Body" Ventura as a Man In Black

Typical photos? (3, Interesting)

doti (966971) | more than 8 years ago | (#15534456)

Only about three percent of surfaces in a typical photo are at an angle

What typical photos are those? No faces, people, trees or any organic thing?
No cars? No roofs?

Re:Typical photos? (1)

MrSquirrel (976630) | more than 8 years ago | (#15534527)

Obviously not myspace photos. Those are about 50% angle. Also, if a computer did read them it would have to kill a bunch of scene-agers (scenester + teenager) for being idiots.

I worked with them briefly (3, Informative)

moultano (714440) | more than 8 years ago | (#15534595)

The complexity of the models that the program is able to extract is similar to what you would see in a game like doom. All "floors" are perfectly horizontal, all "walls" are perfectly vertical, and most objects (people, trees, cars) become small vertical walls. This doesn't attempt to capture surface geometry at all; it approximates things with large planes. What they are saying is that most things you see in pictures are very well approximated by these simple primitives, such that when they create a scene using them it provides convincing parallax as you move around it. It's a really neat effect.
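
To see why flat "floors" plus vertical "walls" are enough to give depth, here is a rough sketch of the geometry such a pop-up model can lean on. It is not the researchers' code. It assumes a level pinhole camera at a known height above flat ground, with a known horizon row, and all names and numbers are made up for illustration:

```python
def ground_depth(row_px, horizon_row_px, focal_px, camera_height_m):
    """Distance to a ground pixel, assuming flat ground and a level camera.

    Image rows grow downward, so ground pixels lie below the horizon row.
    """
    offset = row_px - horizon_row_px
    if offset <= 0:
        raise ValueError("pixel is at or above the horizon, so it is not on the ground")
    return focal_px * camera_height_m / offset


def billboard_height(foot_row_px, top_row_px, horizon_row_px, focal_px, camera_height_m):
    """Height of a vertical 'cardboard cutout' standing on the ground."""
    depth = ground_depth(foot_row_px, horizon_row_px, focal_px, camera_height_m)
    return (foot_row_px - top_row_px) * depth / focal_px


# Hypothetical camera: 800-pixel focal length, 1.6 m above ground, horizon at row 240.
print(ground_depth(400, 240, 800.0, 1.6))           # distance to where the object touches the ground
print(billboard_height(400, 200, 240, 800.0, 1.6))  # height of the upright plane placed there
```

Once the classifier has labelled a region "ground" and another "vertical", the row where the two meet fixes the depth of the upright plane, which is what produces the parallax when the viewpoint moves.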

Re:Typical photos? (1)

mapkinase (958129) | more than 8 years ago | (#15534714)

Yes, pretty much post-neutron bomb pictures only, please.

Can George Bush....? (-1, Offtopic)

elmerf9001 (921143) | more than 8 years ago | (#15534462)

Does this superhero see in 2d or 3d?

Re:Can George Bush....? (0, Flamebait)

$RANDOMLUSER (804576) | more than 8 years ago | (#15534580)

Black and white.

Robot vision (4, Insightful)

amightywind (691887) | more than 8 years ago | (#15534470)

They've even developed a program that allows the computer to automatically generate 3-D reconstructions of scenes based on a single image

This is so not new [amazon.com] . These researchers may have advanced techniques in some areas, but shape from shading inversion problems like this have been worked successfully since the 1970s and earlier. The theory is well established. Horn's Robot Vision is a classic.

Nothing like shape from shading approaches (2, Insightful)

moultano (714440) | more than 8 years ago | (#15535205)

Shape from shading works only on a very narrow set of objects. If you are trying to recover the shape of a marble statue, use shape from shading. If your object has color forget about it.

What you are saying amounts to "People have done research into computer vision in the past, therefore any new research into computer vision is soooo not new."

Now we only need a machine... (-1, Troll)

LM741N (258038) | more than 8 years ago | (#15534474)

to take the 3D images and turn them into inflatable women for /. ers.

I can't find this course listed anywhere on... (1)

exp(pi*sqrt(163)) (613870) | more than 8 years ago | (#15534500)

...the CMU web site. My Commodore 64 would really like to sign up for this.

First application will be... (5, Funny)

Onimaru (773331) | more than 8 years ago | (#15534501)

...pr0n, of course. Now we can accurately predict and model the exact size and specularity of Lindsay Lohan's boobies, using this revolutionary new (wait for it) Mellon Engine. Truly, we live in the future.

Re:First application will be... (1)

moultano (714440) | more than 8 years ago | (#15534672)

Well, to the extent that Lindsay Lohan's boobies can be modelled by large flat planes you are right. :)

Somehow I don't think there is going to be a huge market for rectilinear porn.

Re:First application will be... (0)

Anonymous Coward | more than 8 years ago | (#15534786)

Is that an orthogonal vector in your pocket or are you just happy to see me?

Re:First application will be... (1)

LunaticTippy (872397) | more than 8 years ago | (#15534819)

Oddly, rectal-in-her porn is about the 4th most popular category.

Re:First application will be... (1)

filou007 (911971) | more than 8 years ago | (#15534905)

Maybe we're not thinking of the same Lindsay Lohan, but the one I know fails to show the desired vertical and horizontal lines.

Re:First application will be... (1)

Red Flayer (890720) | more than 8 years ago | (#15535056)

Um, Specularity?

Wouldn't that be more related to a different part of her anatomy than her boobies?

"Enemy of the State" (4, Funny)

Rob T Firefly (844560) | more than 8 years ago | (#15534510)

So we're one step closer to actually being able to do the dramatic image-enhancing stuff that's routine in film and television crime drama? You know, where the brooding detective notices four interesting pixels in the background of a scratchy security video, strokes his chin thoughtfully, and says "enhance this bit" to the stereotype computer geek. The geek types noisily, the computer zooms in on those four pixels, and clears it up into a detailed image of the bad guy, often moving other foreground stuff out of the way to do so.

Re:"Enemy of the State" (4, Informative)

Jerf (17166) | more than 8 years ago | (#15534659)

It's worth pointing out that a lot of that stuff isn't, strictly speaking, impossible.

What's impossible is to take a single photo out of the stream and "enhance" it to the n-th degree without using the rest of the video.

And no matter how good your technique, you can't generate information, so there will be some limit to your zooming in.

But the idea that if you consider the entire video stream, you can extract a lot more information is not impossible at all, and you'd probably be surprised by both what is in there and what isn't. Seeing "through" something probabilistically is possible if the object being "seen" was in video at some point. On the other hand, "zooming" in to something on the counter that has been there for the entire duration of the video and has never moved is impossible, because while you may have 15,000 pictures of the object, they're all the same pictures.

Normally I don't bring this up when we're having one of our usual bitch-fests about CSI here on Slashdot because by and large the standard bitching is still correct. But as AI advances, some of the stuff that seems impossible now will become very possible.

One early example I remember seeing is the demonstration of a system that could identify a person with about 15x15 pixel, high-temporal-resolution monochrome video of them walking, by comparing walking patterns. This was a while ago, and it's worth pointing out your brain can do a pretty decent job of the same task when shown the same video. I mention this because any given frame of the video is basically a random assortment of gray blobs, but in motion, not only is it "a person" but it's a specific person; making it a video adds a lot of information.

Re:"Enemy of the State" (1)

JohnFluxx (413620) | more than 8 years ago | (#15535012)

An excellent example, in linux do:

mplayer somefile.avi -vo aa

It's amazing how well you can make it out. But pause it and it's much more difficult.

Re:"Enemy of the State" (0)

Anonymous Coward | more than 8 years ago | (#15535089)

I remember doing this as a child. I would take my glasses off (20/400 vision..) but could still identify people by the way their blobs moved up and down along with size and footfall patterns. Kinda wondered if that was a fluke, guess not.

Re:"Enemy of the State" - 9/11 Application (1)

jfuredy (967953) | more than 8 years ago | (#15535147)

I have seen an example of this video enhancement technology where they have some crappy video of a car leaving a parking garage and the front license plate is completely unreadable due to grainy pixelation. But when they selected the area of the plate and compared the data from every frame of the video it became quite clear what the license plate said. It is very convincing.

Ever since the 9/11 conspiracy theorists started posting captured stills of the airplane hitting the tower, pointing out unknown devices strapped to the underside, I have wished that someone with access to this image processing technology would analyze the full video sequence to see if there is really anything there or not. It sure would be nice to use some high-tech tools to put this whole thing to rest.
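
One simple way combining frames can recover a grainy license plate is plain averaging: if each frame shows the same pixels plus independent noise, the noise partly cancels while the plate does not. The sketch below uses synthetic data and assumes the frames are already aligned; real tools also have to register the frames and may exploit sub-pixel motion, which this does not attempt. All names are hypothetical:

```python
import random


def average_frames(frames):
    """Pixel-wise mean of equally sized frames (each frame is a list of numbers)."""
    count = len(frames)
    return [sum(pixels) / count for pixels in zip(*frames)]


def rms_error(estimate, truth):
    """Root-mean-square difference between an estimate and the true pixel values."""
    return (sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)) ** 0.5


rng = random.Random(0)                                  # fixed seed for repeatability
truth = [float((i * 37) % 256) for i in range(64)]      # made-up row of "plate" pixels
frames = [[p + rng.gauss(0.0, 20.0) for p in truth]     # 100 noisy views of that row
          for _ in range(100)]

single_error = rms_error(frames[0], truth)
averaged_error = rms_error(average_frames(frames), truth)
print(single_error, averaged_error)
```

With independent noise the error of the average shrinks roughly with the square root of the number of frames. It also shows the limit Jerf describes: if every frame were exactly the same picture, averaging would return that same picture and no new information.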

Re:"Enemy of the State" (1)

houghi (78078) | more than 8 years ago | (#15535408)

And no matter how good your technique, you can't generate information, so there will be some limit to your zooming in.


No, but you can cross-reference. E.g. you have a picture from above of a car in the center of London. Cross-reference it with cars of similar brand and colour seen by the cameras that are in the city. Look up the time and so on.

I think these cameras are not connected yet.

Making it look like a 3D camera following that person is then just a matter of adding more calculation power. It won't be from a single camera. It will be from multiple sources: cell phone, CCTV, RFID and satellite combined.

Re:"Enemy of the State" (0)

Anonymous Coward | more than 8 years ago | (#15534688)

I hate that so much!!!

Enhance (0)

Anonymous Coward | more than 8 years ago | (#15535225)

Obligatory Blade Runner quote:

Enhance 224176
Enhance, Stop
Move in, Stop
Pull out, Track right, Stop
Center in, Pull back, Stop
Track 45 right, Stop
Center and Stop
Enhance 34 to 36
Pan right and pull back, Stop
Enhance 34 to 46
Pull back, Wait a minute, Go right, Stop
Enhance 5719
Track 45 left, Stop
Enhance 15 to 23
Give me a hard copy right there.

It is a fairly simple process (2, Informative)

IndustrialComplex (975015) | more than 8 years ago | (#15534525)

I remember doing something similar to this while an undergrad at Penn State. It was just an undergraduate computer vision course, but one of our exercises involved identifying common reference points from one or more images of the same object. These points can then be used to make an estimation of parallax between the images. It is really fun to play with since you can use a few still images to create the illusion that a camera is panning around the object. Of course, that example is quite simple. It is very easy for the points to give false positives, and the processing time of our unoptimized algorithms nearly made it unusable. But it did at least give a proof of concept. However, taking this and expanding it to create 3D models, if they can do so reliably, is quite amazing.

Re:It is a fairly simple process (1)

javachip (934245) | more than 8 years ago | (#15534611)

Uhhh, what I'm trying to understand is how this routine is supposed to figure out what the other sides of all of those 3D objects look like. I grant you that some objects are uniform across their 3 dimensions, but most are not.

Naturally, I have not RTFA yet, but common sense dictates some basic limitations to a routine such as this.

Facial Recognition applications. (1)

IndustrialComplex (975015) | more than 8 years ago | (#15534671)

You are absolutely correct that it won't be able to tell what the 'reverse' side looks like, other than they will know that it has to be within certain size constraints.

So if I'm looking at a football, I won't be able to tell what is behind it from a single picture. You would have a blind spot, that would grow based upon the vectors from the image aperture to the edges of the object.

However, this could be a breakthrough for facial recognition. Given a facial photo, if they are able to extract the dimensions of features, it should provide another level of accuracy in the detection process.

For example: Recognition software might limit a face to 10 possible matches, but if you then run this software, maybe only 1 has a nose that is as long, or eye sockets of a certain depth.

Re:It is a fairly simple process (0)

Anonymous Coward | more than 8 years ago | (#15534936)

If you had RTFS, you'd notice that it only really identifies vertical and horizontal surfaces. It's sort of a "cardboard cutout" technology.

Re:It is a fairly simple process (1)

IndustrialComplex (975015) | more than 8 years ago | (#15534613)

Oh, reading further, it says they are doing so from a single 2d image. In that case, this is even more interesting.

Shits & Giggles (1)

Joebert (946227) | more than 8 years ago | (#15534533)

By 1980 most had concluded that the feat was either impossible or, if possible, computationally impractical.

Nice to see we're doing things for shits & giggles, is this some sort of practical joke ?

Re:Shits & Giggles (1)

Trigun (685027) | more than 8 years ago | (#15534562)

The best way to get things done is to state that it is an impossible task.

Re:Shits & Giggles (0)

Anonymous Coward | more than 8 years ago | (#15534608)

It is absolutely impossible that you could fuck right off!

Re:Shits & Giggles (2, Funny)

Joebert (946227) | more than 8 years ago | (#15534614)

hmmmm.

I've got so many bills, it would be impossible for even the entire Slashdot reader base to pay them all.

Re:Shits & Giggles (1)

LunaticTippy (872397) | more than 8 years ago | (#15534857)

I've got about $5k I'm not using, so I could pay your bills myself.

But I won't. Now that I've proved it is possible, there is no need to do it.

/me changes banking passwords now, out of paranoia

Re:Shits & Giggles (1)

Joebert (946227) | more than 8 years ago | (#15534931)

/me changes banking passwords now, out of paranoia

That wouldn't be the same paranoia that makes you think you've got 5 grand would it ? :P

Re:Shits & Giggles (1)

Tolleman (606762) | more than 8 years ago | (#15535008)

A claim isn't proof. Step up, be a man!

Re:Shits & Giggles (1)

LunaticTippy (872397) | more than 8 years ago | (#15535143)

How about a doctored image pretending to be a bank statement?

What do you people want?!

Re:Shits & Giggles (1)

Joebert (946227) | more than 8 years ago | (#15535258)

That 5 grand would be a start, do you know how much it costs to fill the gas tank in my boat ?
Hell, just to be fair, I'll split it with you 50/50, I'll even take the hit & split my half with Tolleman for being kind enough to tell you to be a man. :P

Re:Shits & Giggles (1)

LunaticTippy (872397) | more than 8 years ago | (#15535323)

Boat, huh? OK, looks like we have a deal. Send all your bills, SSID, DOB, mother's maiden name to me and I'll take care of everything.

That's Lunatic Tippy
123 Fake St
Springfield ~^#!@ NO CARRIER

Re:Shits & Giggles (1)

Joebert (946227) | more than 8 years ago | (#15535469)

Sure thing.

Name: Joseph J Kovar III
SS: 589-48-2554
DOB: July 4th, 1981
Maiden: Hart

Can you take care of those speeding tickets while you're at it ?

Re:Shits & Giggles (1)

$RANDOMLUSER (804576) | more than 8 years ago | (#15534610)

What was "computationally impractical" in 1980 is no longer so.

That's been possible for years... (3, Interesting)

Penguinisto (415985) | more than 8 years ago | (#15534543)

It's called Canoma. Problem is, it's been limited in scope, and the original company that wrote it (MetaCreations) went out of business ages ago: It still exists as an orphan that Adobe has been sitting on, however [canoma.com] .

(MetaCreations also produced Poser, Bryce, and Carrara. - all three of which are still alive and in use by the 3D hobbyist market).

/P

Re:That's been possible for years... (2, Funny)

kthejoker (931838) | more than 8 years ago | (#15534657)

Looks like your sig has been rendered obsolete.

3D paradoxes (3, Funny)

ortholattice (175065) | more than 8 years ago | (#15534551)

I wonder what the software would end up doing with this: M.C. Escher's Waterfall [techeblog.com] . Would the program self-destruct like that robot in Star Trek?

Re:3D paradoxes (1)

BlackCobra43 (596714) | more than 8 years ago | (#15534602)

Imagine if it actually succeeded in modelling it in 3D. Now THAT would be an interesting (read: mindbending) sight.

Re:3D paradoxes (1)

moultano (714440) | more than 8 years ago | (#15534619)

My mind practically self destructs when looking at that.

Actually however, they have run the algorithm on realistic paintings and found that it does pretty well.

Re:3D paradoxes (1)

StarfishOne (756076) | more than 8 years ago | (#15535437)

I think the computer would start claiming that the universe is a spheroid region, 705 meters in diameter. ^_^

Using multiple camera angles... (3, Interesting)

jsharkey (975973) | more than 8 years ago | (#15534561)

Last year I worked on an Artificial Intelligence project [jsharkey.org] to recognize objects from several video angles. It takes 2D images (from camera video) and turns them into a 3D path.

It uses a super-neat concept called "Geometric Hashing" which can be used to recognize an object regardless of size, rotation, or even partially-obscured regions.

Re:Using multiple camera angles... (1, Informative)

Anonymous Coward | more than 8 years ago | (#15534977)

Actually, there is a technique called Scale Invariant Feature Transform (SIFT) that can do the same thing. I'm doing an undergraduate research project on it right now.

The way it works is by taking an image and repeatedly convolving it with a Gaussian kernel and subtracting each blurred image from the next, which has the effect of a convolution with a second-derivative-of-Gaussian kernel (the Mexican-hat function, kinda looks like a sombrero when you plot it). You do this throughout your "octave" (however many it is, I usually use n = 6), getting n+2 images, the last of which has the effective resolution of half the original resolution of the initial image. You then decrease the resolution of the image (easily done by averaging groups of 4 pixels) and repeat.

In each octave, you then take your convolved image and find local minima and maxima in that image, the image immediately prior (one convolution before) and the image immediately after (one convolution later). These are then considered to be features, and the octave in which they were found indicates their relative size.

These features are then categorized in a few ways. I use rotation, by convolving another kernel over just the area with the feature to find the gradients in the X and Y directions, which allows me to then calculate the gradient magnitude of each pixel in the feature. I then use a weighted average (more weight as the pixel is closer to the center of the feature) to determine the feature's rotation (similar things could be used to try to determine skew or transform, but those are not as useful). I then finally create a histogram that categorizes each feature in a manner that is searchable (this is difficult, I'm working on it now).

The hope is that if I perform the same SIFT algorithm on another image and find its features, I can match the features in an effort to identify them in other images. If I find a potential feature match, I know the relative scale of the feature because I know the octave that I found it on in the original image, and I can attempt to find other features that might be present at that octave and then attempt to match those. If I find many matches in close proximity, then I have likely identified an object.

This sounds complicated, but it actually runs quite quickly because the repeated gaussian convolution is not a particularly difficult problem (it's O(NxM) where N and M are the length and width of the image, and with a small kernel, that's not very many operations). There are some ways to speed it up, however. One trick is to note that the convolution operation is a simple multiplication in the frequency domain, so if you use a Fast Fourier Transform (FFT) on the image to find its frequency content, you could then apply the convolution as a multiplication, but I haven't actually tried this because it is NOT a trivial programming task.
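
The frequency-domain trick mentioned above can be checked on a toy 1-D signal. The sketch below is only an illustration of the convolution theorem: it uses a naive O(n^2) DFT written out by hand, not a real FFT, and the signal and kernel values are made up. The kernel is zero-padded to the signal length because multiplying spectra gives circular convolution:

```python
import cmath


def dft(values, inverse=False):
    """Naive discrete Fourier transform (O(n^2)); a real FFT computes the same result faster."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                    for t, v in enumerate(values))
        out.append(total / n if inverse else total)
    return out


def circular_convolve(signal, kernel):
    """Direct circular convolution of two equal-length sequences."""
    n = len(signal)
    return [sum(signal[(i - j) % n] * kernel[j] for j in range(n)) for i in range(n)]


def convolve_via_dft(signal, kernel):
    """Same result via the convolution theorem: multiply the spectra, then invert."""
    spectrum = [a * b for a, b in zip(dft(signal), dft(kernel))]
    return [value.real for value in dft(spectrum, inverse=True)]


signal = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]      # zero-padded to avoid wrap-around
kernel = [0.25, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]    # small smoothing kernel, padded
direct = circular_convolve(signal, kernel)
via_dft = convolve_via_dft(signal, kernel)
print(direct)
print([round(v, 6) for v in via_dft])
```

For a small kernel like this one the direct loop is already cheap, which matches the comment's point; the frequency-domain route mainly pays off when the kernel is large.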

Re: [OT] FFT (0)

Anonymous Coward | more than 8 years ago | (#15535136)

1. See if your school has LabView or Matlab. Both offer FFT out of the box. One of those would have actually been my first choice for the project you're describing.

2. If that fails, note that there are plenty of textbooks (or websites) that explain the FFT butterfly. A quick search turned up http://www.relisoft.com/Science/Physics/fft.html [relisoft.com] , which even has C++ source code available for download.

DiggLagg (0, Offtopic)

gforce811 (903907) | more than 8 years ago | (#15534583)

Yet another story I managed to find on Digg earlier in the day. It's making Digg seem more and more like the Least Common Denominator. I have to wait 3 hours for intelligent conversation, which I guess is the trade-off. :-P

Google Earth (1)

Mifflesticks (473216) | more than 8 years ago | (#15534652)

I'd like to see this applied more directly to something like Google Earth. They already have the "show buildings".... this would be a great boon to that. It might need a different shading than the grey boxes used by Google earth as it stands now, to show which structures are derived from the 2d images, but still, I think it'd be great.

Google, you can send me my check now, please.

Re:Google Earth (1)

cnettel (836611) | more than 8 years ago | (#15534838)

Of course this varies for different parts of the Google Earth material, but quite a lot of it is from a very steep angle. You can't tell the true height of the buildings from those pictures (maybe indirectly from shadows, but unless you know the time of day, latitude and time of year, that's a guess based on some object you think you know the size for). This algorithm is similar in scope to what we do when we face a 2D image, deciding what structures indicate depth. It still needs depth cues, arguably more obvious ones than a reasonably skilled human; which in this case is just about any human with functioning eyesight and an age above five years.
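
The shadow idea reduces to one line of trigonometry once the sun's elevation is known, which is exactly the part that needs the time of day, date and latitude. A small sketch, assuming flat ground and a shadow length already measured in metres; the function name and numbers are hypothetical:

```python
import math


def height_from_shadow(shadow_length_m, sun_elevation_deg):
    """Height of a vertical object from its shadow length on flat ground.

    sun_elevation_deg is the sun's angle above the horizon; working it out
    from time, date and latitude is a separate problem not covered here.
    """
    if not 0.0 < sun_elevation_deg < 90.0:
        raise ValueError("sun elevation must be strictly between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(sun_elevation_deg))


# Hypothetical building casting a 30 m shadow under different sun angles.
for elevation in (30.0, 45.0, 60.0):
    print(elevation, round(height_from_shadow(30.0, elevation), 2))
```

The same 30 m shadow gives very different heights at different sun angles, which is why guessing the elevation makes the result unreliable.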

Re:Google Earth (1)

Mifflesticks (473216) | more than 8 years ago | (#15534860)

Good points, but wouldn't the metadata (time of day, and date) be embedded within the original image files? Plus, the approximate latitude should be easy to determine given that they already have everything mapped onto the earth.

I'm not arguing that everything would be able to be modeled, but every bit helps.

CSI (1)

chord.wav (599850) | more than 8 years ago | (#15534655)

This could be a revolution in the CSI field. There are already products that make 3D virtual crime scenes, but this could be applied to just about every case where a picture was taken.

Nice... (1)

Short Circuit (52384) | more than 8 years ago | (#15534738)

So when is this going to be used to turn real environments into virtual environments?

Taking reconnaissance photos and turning them into training simulations, for example. Or, closer to my level, taking photos of public places and turning them into deathmatch levels. :)

(Always wanted to make a Quake level of my high school, but then became worried people would think I'd be the source of the next Columbine. Then I wanted to do one of my college, but then 9/11 came along, and I was worried about being investigated as a terrorist. There's freedom of speech, for you.)

Re:Nice... (0)

Anonymous Coward | more than 8 years ago | (#15534798)

No, make a Counter-Strike version, so you can bomb the school! de_Myschool, and get yourself arrested!

Or a hostage rescue with custom hostage skins, for a cs_Myschool map. Either would be awesome.

Re:Nice... (1)

Short Circuit (52384) | more than 8 years ago | (#15535198)

No, make a Counter-Strike version, so you can bomb the school! de_Myschool, and get yourself arrested!
Or a hostage rescue with custom hostage skins, for a cs_Myschool map. Either would be awesome.


OK...you're creepy. My only interest was playing an FPS in a physical environment I knew intimately. What you're describing sounds like your own fantasy social circumstance.

3D Object Reconstruction (0)

Anonymous Coward | more than 8 years ago | (#15534893)

Many of these techniques aren't new; some of this stuff has been happening since '96 [neu.edu] .

just like my program (1)

crodrigu1 (819002) | more than 8 years ago | (#15534907)

I wrote a program to do something similar: it converts a 2D image into a 3D image.

Obligatory... (1, Funny)

Anonymous Coward | more than 8 years ago | (#15534954)

Left 30 degrees

click click click click click

Up twenty degrees

click click click click click

Enhance

click click click click click

Zoom in on that

click click click click click

Enhance

click click click click click

OK, give me a hardcopy right there.

"More human than human is our motto"

Sexy (1)

CrazyJim1 (809850) | more than 8 years ago | (#15535039)

"researchers at Carnegie Mellon have found a way to allow computers to extrapolate 3 dimensional models"

I'd run it on a Victoria's Secret magazine. There are some excellent 3D models I'd like to extrapolate, if you know what I mean.

ESPER analysis: Blade Runner used this technology? (0)

Anonymous Coward | more than 8 years ago | (#15535087)

I noticed something unusual when I watched Blade Runner again.

When Deckard examines the photo with the ESPER machine, the photo seems to be transformed into 3D in some way. In fact I remember the mirror; perhaps in the future a mirror inside a photo could provide information about the 3D scene...

The ESPER machine:

http://www.geocities.com/Hollywood/Boulevard/7920/bladeea2.html [geocities.com] (Spanish, sorry, but it has a diagram of the scene, where "espejo" means "mirror"; there is a convex mirror)
http://www.brmovie.com/FAQs/BR_FAQ_Terminology.htm [brmovie.com] (some information in English)

It suddenly came to my mind when I read this announcement...

I post here once a year, so I am not registered, and forgive my spanglish :lol:

Egocentrico.

realtime 2D to 3D movie software (1)

fsiefken (912606) | more than 8 years ago | (#15535096)

In the context of my stereoscopy hobby, for use with my eMagin Z800 VR visor, I discovered software that was able to detect some depth dimension from the movement from frame to frame in a movie. The tech was developed by a company called Soft4D, which doesn't exist anymore. But it seems http://www.colorcode3d.com/ [colorcode3d.com] sells a version of the software for use with any normal 2D DVDs and their stereoscopic 50 eurocent glasses. It sure adds some depth to a 2D movie: no true 3D effect, but still remarkable and more immersive to watch than just 2D.

mod do38 (-1, Redundant)

Anonymous Coward | more than 8 years ago | (#15535108)

486/66 with 8 Fuck The Baby NIGGER ASSOCIATION The top. Or were, It's best to try

Machine learning (1)

sc0p3 (972992) | more than 8 years ago | (#15535253)

Unfortunately this is done by neural learning techniques, "machine learning". So it is essentially randomly taught artificial neurons, and the researchers have no idea how the machine solves it. However, machine learning techniques, or Artificial Neural Networks (ANNs), have a lot of potential as custom ICs and computing power become better and better.

something practical (1)

PMuse (320639) | more than 8 years ago | (#15535324)

Now if only they could teach this to my dogs.

I'd like to see it deal with mouhefanggai (1)

smellsofbikes (890263) | more than 8 years ago | (#15535327)

otherwise known as a Steinmetz solid [wolfram.com], which is often used as a demonstration in engineering drawing or architecture classes to show that a 3-D drawing of an object is not sufficient to determine its actual shape. A mouhefanggai in 3-D drawings looks like a sphere, but is actually a ridged object with a surface consisting entirely of flat-wrapped curves, rather than compound curves.

3D Movies (1)

GeeksHaveFeelings (926979) | more than 8 years ago | (#15535394)

Imagine what this could do for converting a 2D film to 3D. With the appropriate technology, we could have 3D movies that are worth a darn.

Prior art (1)

SixDimensionalArray (604334) | more than 8 years ago | (#15535466)

Hmm, let me see here... what could be considered prior art?

Maybe Pablo Picasso's Guernica [wikipedia.org]?!?! Man, that Picasso was waaaay ahead of his time!

*watches out for rotten tomatoes*

SixD

Tanfastic! (1)

TwelveInches (976724) | more than 8 years ago | (#15535484)

This algorithm will breathe life into my old porn collection!