
Making 3D Models from Video Clips

ScuttleMonkey posted more than 6 years ago | from the fun-toys dept.

Media 103

BoingBoing is covering an interesting piece of software called VideoTrace that allows you to easily create 3D models from the images in video clips. "The user interacts with VideoTrace by tracing the shape of the object to be modeled over one or more frames of the video. By interpreting the sketch drawn by the user in light of 3D information obtained from computer vision techniques, a small number of simple 2D interactions can be used to generate a realistic 3D model."


first post (-1, Redundant)

Anonymous Coward | more than 6 years ago | (#21947546)

tremble at my first posting skills.

hahahahahha

i am the king of the nerds.

Re:first post (0)

Anonymous Coward | more than 6 years ago | (#21947638)

I've never seen a first post from Tim O'Reilly.

Re:first post (0)

Anonymous Coward | more than 6 years ago | (#21947732)

I like the fact he's been marked 'redundant'

Post Some News ( NOT Trivia ) (-1, Offtopic)

Anonymous Coward | more than 6 years ago | (#21947878)

instead of reading about toys for your PC-Windoze-Apple videoplayers,
try to facilitate democracy in the United Gulags of America and stop
BushCo's Sale of Nuclear Weapons to Pakistan [blogspot.com] .

I for one... (-1)

weedwhacker (1146417) | more than 6 years ago | (#21947562)

I for one welcome our new 3d video traced overlords.

Re:I for one... (1)

stonedcat (80201) | more than 6 years ago | (#21948050)

I for one welcome our new 3d video Traci Lords


fixed

Re:I for one... (2, Funny)

Anonymous Coward | more than 6 years ago | (#21948620)

When I was in grad school, I knew a fellow who was working on similar technology. I don't think he got anywhere near as advanced as this, but he did get good enough that given 10 to 15 still images, his software could create a primitive 3D model.

Unfortunately for him, he tried to make a 3D model of his erect penis. I'm not sure if he realized it or not, but he wasn't very well hung (he's Korean). Well, at one of the presentations he had to make regarding his work, he accidentally opened up the model of his penis. He couldn't even deny that it was his, since his name was in the filename. And his supervisor, an older woman, just couldn't stop laughing. He did go on to get his degree, but I think his pride took a real beating.

Terrible link (5, Informative)

masterz (143854) | more than 6 years ago | (#21947632)

wow, what a terrible link.

A quick search turns up the project homepage http://www.acvt.com.au/research/videotrace/ [acvt.com.au]

Youtube (5, Informative)

Anonymous Coward | more than 6 years ago | (#21948124)

Re:Terrible link (1)

GroeFaZ (850443) | more than 6 years ago | (#21948132)

A quick look and less desire for first post would have revealed the very same link at the end of the BB post.

Re:Terrible link (4, Insightful)

apankrat (314147) | more than 6 years ago | (#21948208)

Outside of /., this sort of news "wrapper" article (BB or not) is considered blog spam. There is absolutely no reason to link to a wrapper when it just rehashes what's in the original article and then forwards to it for details (which is what the vast majority of readers would want anyway).

Re:Terrible link (1)

mk_is_here (912747) | more than 6 years ago | (#21949032)

Sometimes the editors just wish to give credit to whoever discovered it, hence the wrapped link.
I think both links should be provided, the direct and the wrapped...

Re:Terrible link (1)

adolf (21054) | more than 6 years ago | (#21950366)

Why?

It is not as if they have a shortage of submissions [slashdot.org] . So why bother being kind to spammers who are more interested in self-promotion than producing content?

Re:Terrible link (2, Funny)

wdebruij (239038) | more than 6 years ago | (#21950782)

which is what a vast majority of readers would want

Are we on the same site? What is this "article" you talk of?

Wake me when... (1)

Derekloffin (741455) | more than 6 years ago | (#21948448)

...I can make a perfectly accurate 3-D character model by just feeding the program a bit of video and pointing out the character. Then, all we need is the same with voice and I can make my own animes! Man, that would be sweet, but I think we're still a ways off from that.

Re:Wake me when... (1)

Unoti (731964) | more than 6 years ago | (#21949234)

We're a heck of a lot closer with this than without it. This is a huge step in that direction. There's already quite a bit of technology out there to convert bitmaps to line drawings, and things to track the same object in a video. We'll wake you up later if you insist, but I expect a lot of hardcore developers are waking up now and getting started on some badass research projects.

Re:Wake me when... (3, Insightful)

pnewhook (788591) | more than 6 years ago | (#21949786)

We're a heck of a lot closer with this than without it. This is a huge step in that direction.

Actually our company has had technology more advanced than that described in the article for years. With ours you simply pan the camera around and the model creation is fully automatic - there is no need to trace the image at all.

It's called Instant Scene Modeller, and here's a link to a demo of the technology for anyone who's interested: http://www.demo.com/demonstrators/demo2005/54188.php [demo.com]

Re:Wake me when... (0)

elmarkitse (816597) | more than 6 years ago | (#21950406)

Mod Parent up. The linked video is hot shit and dead on relevant, save for a likely cost differential.

linking to wrappers is probably good (2, Insightful)

someone1234 (830754) | more than 6 years ago | (#21951284)

It surely mitigates the slashdot effect.

link me, link you (0, Redundant)

BadAnalogyGuy (945258) | more than 6 years ago | (#21947650)

Don't link to blogs.
http://www.acvt.com.au/research/videotrace/ [acvt.com.au]

A copy of the video on Youtube (0)

Anonymous Coward | more than 6 years ago | (#21947946)

Another step towards AI (3, Interesting)

CrazyJim1 (809850) | more than 6 years ago | (#21947654)

AI needs a way of interpreting video input into 3d objects and environment. Once a computer can represent objects in a 3d environment, it can then perform operations on them. Technically you could make AI without this tool, but you'd have to do extremely precise and patient CAD inputs that would take most of your life. With a tool to convert video into 3d objects, you can just start cataloging all the objects out there. Add in a 3d physics simulator, and you're halfway to true AI. I have a quick overview on how to do AI, and as you'll note on the very beginning of the page [geocities.com] : the reason I haven't worked on AI myself is that I can't code a video->3d object converter myself.

Re:Another step towards AI (4, Interesting)

QuantumG (50515) | more than 6 years ago | (#21947718)

Have you heard of the Scale Invariant Feature Transform [wikipedia.org] ? Well you have now. There are libraries written in C# (no less) which are publicly available to do this stuff. You can recognize a large collection of objects.
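For anyone curious, the matching side of SIFT is easy to illustrate. Here's a minimal sketch of Lowe's nearest-neighbor ratio test in plain Python; the 4-d "descriptors" are made-up toy data standing in for real 128-d SIFT descriptors, and the function names are just for illustration:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def match_descriptors(query, database, ratio=0.8):
    """Lowe's ratio test: accept a match only if the best database
    descriptor is clearly closer than the second best."""
    matches = []
    for qi, q in enumerate(query):
        dists = sorted((euclidean(q, d), di) for di, d in enumerate(database))
        best, second = dists[0], dists[1]
        if best[0] < ratio * second[0]:
            matches.append((qi, best[1]))
    return matches

# Toy 4-dimensional "descriptors" (real SIFT descriptors are 128-d)
db = [(0.0, 0.0, 1.0, 0.0), (1.0, 1.0, 0.0, 0.0), (0.5, 0.5, 0.5, 0.5)]
query = [(0.05, 0.0, 0.95, 0.0),   # close to db[0] -> unambiguous match
         (0.5, 0.5, 0.45, 0.4)]    # close to db[2]
print(match_descriptors(query, db))  # -> [(0, 0), (1, 2)]
```

The ratio test is what lets SIFT recognize objects reliably: ambiguous features (ones with two near-equal matches) are simply thrown away.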

Re:Another step towards AI (5, Interesting)

kudokatz (1110689) | more than 6 years ago | (#21948054)

SIFT is ok even for occluded objects, but is horrid in 3-d because SIFT features cannot match up for a significantly rotated scene. There are better algorithms that can recover both the shape of the scene as in the article and even produce the location of the camera as a by-product.

In terms of object recognition, there has been great work done by treating an "n×n" pixel image as a point in n^2 space, reducing that to a lower-dimensional subspace, projecting a given image onto the new, lower-dimensional approximation of the original object, and finding a match via a nearest-neighbor search through recognized objects.

There is also good work being done in terms of getting a detailed 3-d model using structured light methods: http://www.prip.tuwien.ac.at/research/research-areas/3d-vision/structured-light [tuwien.ac.at]

There is good literature out there, but sometimes the math gets over my head =P
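The "n×n image as a point in n^2 space" idea can be sketched in miniature. This is a toy version only: power iteration stands in for full PCA, it keeps a single principal axis, and the 2×2 "images" and object names are invented; real appearance-based systems use much larger images and keep many more dimensions:

```python
import math, random

random.seed(0)  # deterministic power-iteration start

def top_eigvec(vecs, iters=200):
    """Power iteration on the (implicit) covariance of mean-centered vecs."""
    dim = len(vecs[0])
    v = [random.random() for _ in range(dim)]
    for _ in range(iters):
        # Multiply v by the covariance: sum over samples of (x . v) x
        w = [0.0] * dim
        for x in vecs:
            dot = sum(a * b for a, b in zip(x, v))
            for i in range(dim):
                w[i] += dot * x[i]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def recognize(image, gallery_coords, axis, mean):
    """Project onto the 1-D subspace and return the nearest gallery object."""
    coord = sum((p - m) * a for p, m, a in zip(image, mean, axis))
    return min(gallery_coords, key=lambda kv: abs(kv[1] - coord))[0]

# Toy 2x2 "images" flattened to 4-vectors
gallery = {"cup": [0.9, 0.9, 0.1, 0.1], "pen": [0.1, 0.1, 0.9, 0.9]}
vecs = list(gallery.values())
mu = [sum(v[i] for v in vecs) / len(vecs) for i in range(len(vecs[0]))]
centered = [[a - m for a, m in zip(v, mu)] for v in vecs]
axis = top_eigvec(centered)
coords = [(name, sum((p - m) * a for p, m, a in zip(v, mu, axis)))
          for name, v in gallery.items()]
print(recognize([0.8, 0.85, 0.2, 0.15], coords, axis, mu))  # -> cup
```

The point of the projection step is that the nearest-neighbor search then happens in the small subspace rather than in the full n^2-dimensional pixel space.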

D should have replaced C# by now (1)

CarpetShark (865376) | more than 6 years ago | (#21950802)

There are libraries written in C# (no less)


People should really give up on that and start using D :)

Re:D should have replaced C# by now (1)

SomeoneGotMyNick (200685) | more than 6 years ago | (#21952706)

There are libraries written in C# (no less)


People should really give up on that and start using D :)
I'm more of a Blues Programmer and have tuned my compiler to A minor.

Re:D should have replaced C# by now (0)

Anonymous Coward | more than 6 years ago | (#21957110)

C# is D flat.

"True AI"? (2, Insightful)

Anonymous Coward | more than 6 years ago | (#21948056)

Add in a 3d physics simulator, and you're halfway to true AI.

I've never heard of "true AI" -- do you mean strong AI [wikipedia.org] ?

And no, computer vision plus physics simulation does not make half of strong AI, either. Russell and Norvig's classic AI text lists 9 abilities generally required for strong AI. 2 is not half of 9.

I have a quick overview on how to do AI, and as you'll note on the very beginning of the page [geocities.com]: the reason I haven't worked on AI myself is that I can't code a video->3d object converter myself.

I don't know what your dead geocities page has, but not working on AI because you can't write a video->3d object converter is like not working on video compression because you can't act.

Re:"True AI"? (0)

Anonymous Coward | more than 6 years ago | (#21950982)

I think when someone refers to 'true AI' they mean synthetic intelligences capable of being analogous to people. That includes factors NOT on the accepted list of key items for 'strong AI'. The additional factors include emotion and feeling like humans, empathy, ability to be creative in ways that humans can be, and much more.

'Strong AI' seems to aim at intelligent agents, and war machines, and the like, but misses big. Indeed, Russell and Norvig, in stating nine key items, are missing some items and some subtleties. But then the field is too. Their book is popular and an authority, but it is not the be-all and end-all. I take this position because in my AI work, I'm developing exactly the areas in which R&N are deficient. It's a little like medieval alchemists saying 'all is earth, air, fire, or water!', but I'm saying no, there are about a hundred elements and the fundamental model is different; you're looking at some pieces in the wrong way, and your capabilities model is too coarse, not always subtle enough.

Anyway, on the 3D issue, it's interesting to note that a one-eyed man is almost identical to a single-view camera, but he can recognize many 3D objects with some accuracy from a single 2D image, because he is merging a situation model with an attempt to model a 3D worldframe, and correlating that with learned knowledge about the world. By the way, R&N fail to recognize that human behavior, among other capabilities, generally involves (requires) integrating visual, aural, kinesic, and conceptual mental blackboards, and development of any 'strong AI' absolutely needs to understand that that is critical.

Re:Another step towards AI (0)

Anonymous Coward | more than 6 years ago | (#21948166)

Blind people can be intelligent. I don't believe that 3D object recognition is the key to AI. On your website you just assume the mind part of your AI works, but how exactly would you say the mind works? That's the real problem with AI.

Re:Another step towards AI (4, Interesting)

CrazyJim1 (809850) | more than 6 years ago | (#21948452)

I get that a lot. Blind people still have a 3D imagination. They need to know where the doors are, where the stairs are, and where the objects they use are. You need a 3D imagination space to have AI, and that is the primary reason that past attempts at making AI have failed. I love to watch the advances in video card technology and the competition between NVIDIA and ATI, because the more they work, the easier it will be to do AI, and all computer advances for that matter. I think I could start some basic AI with this 3D recognition software on the hardware of an average modern desktop. I think it is just a software problem and not necessarily a hardware one. We'll see. I'm going to keep in touch with this group and see if they let me use their software; I'm an unemployed coder and I might as well work on AI, because some group has to do it. I'll make it an open source project on SourceForge and maybe extra coders will jump on.

Re:Another step towards AI (1)

Lumpy (12016) | more than 6 years ago | (#21948828)

Incorrect. AI does not NEED 3D. Navigation needs 3D, and that already works. I have built autonomous robots that can navigate a room from point A to point B without prior knowledge of the room arrangement. It can even handle moving objects. This is easily accomplished with a pair of low-res B&W cameras and one rangefinder, or just a rotating rangefinder. (My first used the sensors out of mice and a couple of cheap 6mm screw lenses and mounts for el-cheapo video cameras.)

From that I can get the distance to an object to avoid, and quickly navigate a room for tasks. AI is making decisions like "Object C7 is blocking my way... I wonder if it moves" and trying to push it without having that preprogrammed, i.e. "I have an idea..."

THAT IS AI and does not need 3d.

Re:Another step towards AI (2, Interesting)

ADRenalyn (598918) | more than 6 years ago | (#21949340)

Navigation needs 3D and that already works.

Navigation might work, but it's far from perfect, or even good.

It's nice that your robot can tell when something is blocking its way. But how does it know when there is nothing left to walk/drive on? For instance, a stair leading down, or a change in materials (from sand to water, or asphalt to ice) that would prevent it from moving properly? Can it tell that certain variations are normal (a rug, or different colored tiles on a ceramic floor) and some are dangerous (the edge of an in-ground pool)?

When a robot/computer can tell that something is in its way, and figure out what that object is, and whether it can be moved (safely, and to where), then we're approaching *decent* AI.

Re:Another step towards AI (1)

mwvdlee (775178) | more than 6 years ago | (#21952306)

Is it only AI if it can move around?
I really don't see how text recognition AI needs to be able to handle 3D space.
Same for voice recognition and probably a lot of other types of AI.

Re:Another step towards AI (0)

AuntieWillow (1188799) | more than 6 years ago | (#21948396)

I'm still kinda new here.
Can I make a joke about welcoming our 3-D Modeling AI Overlords now, or do I need more Karma?

Re:Another step towards AI (0)

Anonymous Coward | more than 6 years ago | (#21951110)

The answers, in order, are:

Magic Eight Ball says "It's okay to be new",

Not until you have been through the Initiation of 1000 Inept But Smug Putdowns, and

Not if you buy Karma Helper, from General Mills.

Re:Another step towards AI (0)

Anonymous Coward | more than 6 years ago | (#21948554)

Ironically, in the early days of AI (decades ago), researchers in the field thought that computer vision was a small, easy task that would be solved quickly.

Re:Another step towards AI (0)

Anonymous Coward | more than 6 years ago | (#21949416)

Dude, are you like... secretly 14 years old or something?

Re:Another step towards AI (0)

Anonymous Coward | more than 6 years ago | (#21952902)

From this guy's website: funny, entertaining, and Wiitastic.

6:49 AM 8/5/02

HUGE NEWS!
Thanks to Crystal Space, I now have a 3D engine. Crystal Space is amazing; it's cross-platform like Java. You can make a video game on Linux then port it to Windows, PlayStation 2, Xbox, Macintosh, and Unix. Crystal Space is incredible.

Not only do I have a 3D engine now, but I am using Linux. Linux is amazingly coder-friendly. Windows tries to obfuscate its code so no one can write programs in its language (monopolistic capitalism). In Linux, everything is shared and people are helpful. I am using CamStream: code designed to capture images on a webcam and upload them via FTP. I modified the code, and now it transfers images into the 3D engine!

My biggest obstacles to continuing the project are out of the way. I'm looking forward to learning more about image detection and feature recognition in computer vision. I need to learn how to network cameras together in order to have cameras merge their sight. I need to get faster hardware so it can be done in real time. But I need more basic information about image detection first.

I decided to write a video game to step me in the right direction. RocketSword is the name. You play a superhero who has two swords, and they can be rocket-powered to help you slash, or lift the guy in the air. To play, you have two colored sticks, and the computer tracks them as swords for the in-game character. If you press a button on one of the swords, it fires the thruster out of the back of the sword and propels your guy through 3D space. You have two swords, so the vector math can get complex and fun :) Also, you have the plain slashes that come with swords. It seems like it will be a fun game, and all it needs is one camera to watch your guy.

I'll learn the basics when coding this game, and maybe make some money to fund the next system of cameras. It's a good step to take on the journey to true AI. It seems really fun and I am optimistic about doing it.

Buck Rogers (1)

conureman (748753) | more than 6 years ago | (#21947664)

Hasn't this been a mainstay of movies forever?

Re:Buck Rogers (1)

timthorn (690924) | more than 6 years ago | (#21947754)

Since 2001 at least. Boujou from 2d3 Ltd.

Re:Buck Rogers (1)

aseidl (656884) | more than 6 years ago | (#21948726)

Or try the free (as in beer) Voodoo Camera Tracker [uni-hannover.de].
It lets you export point clouds (not 3D models, as in the story) to a variety of formats, including Blender.

Re:Buck Rogers (1)

Chris Shannon (897827) | more than 6 years ago | (#21958028)

Voodoo is pretty great, although its automatic feature point estimation in 3D is a bit limited. For features far away, in Free Move mode, the stereo math breaks down and it only gets direction correct, while distance from the camera can be anything from -infinity to +infinity.

An option for creating a textured 3D VRML model from images is PTStereo [panotools.org]. It is designed to work with 2 or more images taken far apart from each other, not a video sequence. Hugin and SIFT can create the control points, but PTPicker must be used as the frontend for PTStereo.

Blender can import the textured VRML.

Software for 2D images for 3D models is not new (5, Informative)

bn0p (656911) | more than 6 years ago | (#21947694)

Software like Canoma from the now-defunct Metacreations would let you create 3D models from 2D images in the mid-to-late 90s. I also remember reading about people using Viz ImageModeler [realviz.com] to convert images from video to models even though the software is also designed for still images - the users would just capture those frames they needed to create the 3D model.

The only thing "new" about this is using video as the input without having to grab the individual frames yourself.


Never let reality temper imagination

Re:Software for 2D images for 3D models is not new (3, Interesting)

Anonymous Coward | more than 6 years ago | (#21947838)

Actually, algorithmically, you can make a substantial leap in processing capabilities when you switch from feeding in series of still images to video. This may seem a bit counterintuitive, since a video is just a series of still images, but the key is that a video is a continuous series of still images.

The main problem with existing techniques is that they often require a lot of user interaction to create a complete model, because points between images have to be delineated and correlated by hand, or at best with some minimal computer assistance.

A video-based process can take advantage of the fact that changes between the images will be relatively small, and follow definite trajectories, which would allow an appropriate algorithm to identify and correlate features with almost no manual intervention. This would be an absolutely huge improvement in usability, although it's not an easy problem by any means.

For example, the program may be able to easily isolate objects from the background by tracking differences in how points move due to perspective. That can be attempted with discontinuous still pictures, but it is much harder to say with any confidence which points correlate with which under arbitrary changes in point of view.

To give an analogy, it'd be like giving you a picture of a whole egg and a picture of a crushed egg, and asking you to accurately trace back where the individual pieces of the shell came from. It'd be much, much easier if you had a video of the egg being smashed, where you could trace out, frame by frame, where individual pieces came from.

It's not the same problem, but for a computer, it's comparably hard. For a human being, if the egg wasn't smashed, it'd be relatively simple to pick out which points relate to which, but that's only because we have a sophisticated image recognition system that allows us to reason about shapes. If you happen to have two pictures of an unfamiliar object from radically different points of view, it can be quite tricky to decide what the whole object must look like. Show a video of the same object, moving around between different points of view, and it's not nearly as hard.
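The continuity advantage is easy to demonstrate with a toy block-matching tracker. This is a sketch under invented assumptions (synthetic 8x8 frames, a single bright blob, sum-of-squared-differences matching): because consecutive video frames differ by only a pixel or two of motion, each patch need only be searched for in a tiny window rather than across the whole image:

```python
def patch(frame, x, y, r=1):
    """Flatten the (2r+1)x(2r+1) neighborhood around (x, y)."""
    return [frame[j][i] for j in range(y - r, y + r + 1)
                        for i in range(x - r, x + r + 1)]

def track(frame_a, frame_b, x, y, search=2):
    """Find where the patch at (x, y) in frame_a moved to in frame_b,
    searching only a small window. This is valid because consecutive
    video frames differ by only a few pixels of motion."""
    target = patch(frame_a, x, y)
    best, best_err = (x, y), float("inf")
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cand = patch(frame_b, x + dx, y + dy)
            err = sum((a - b) ** 2 for a, b in zip(target, cand))
            if err < best_err:
                best, best_err = (x + dx, y + dy), err
    return best

# Synthetic 8x8 frames: a bright 3x3 blob moves one pixel right and down
def make_frame(cx, cy, size=8):
    return [[1.0 if abs(i - cx) <= 1 and abs(j - cy) <= 1 else 0.0
             for i in range(size)] for j in range(size)]

f0, f1 = make_frame(3, 3), make_frame(4, 4)
print(track(f0, f1, 3, 3))  # blob found at (4, 4)
```

With two arbitrary still images the same search would have to cover the whole frame (and every rotation and scale), which is exactly the correspondence problem the parent describes.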

Re:Software for 2D images for 3D models is not new (4, Informative)

samkass (174571) | more than 6 years ago | (#21947868)

Yeah, the big breakthrough in this, IMHO, was a 1994 paper by Takeo Kanade of CMU's Robotics Institute titled "A Sequential Factorization Method for Recovering Shape and Motion from Image Streams [cmu.edu]", which did a pretty good job of factoring out the 3D model as well as the camera motion from a video stream... it could tell you not only the dimensions of the house you were videotaping, but the stride of the person holding the camera. This laid the groundwork for a lot of other "model from video" work done throughout the '90s. More recently a group there has done a lot of work on "Shape from Silhouette [cmu.edu]", which looks closer to the technology that this product uses.

I've been waiting for this technology to go big on eBay for a decade... maybe this'll be the year.
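The key observation behind that factorization line of work is that, under orthographic projection, the mean-centered 2F×P matrix of tracked feature coordinates (F frames, P points) has rank at most 3, so it factors into camera motion times 3D shape. A toy verification of the rank property, using pure-Python Gaussian elimination rather than the SVD-based algorithm from the paper; the points, rotations, and translations are made up for illustration:

```python
import math

def matrix_rank(m, eps=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    rank, rows, cols = 0, len(m), len(m[0])
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < eps:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][c] / m[rank][c]
            for k in range(cols):
                m[r][k] -= factor * m[rank][k]
        rank += 1
    return rank

# 6 arbitrary 3D points, observed by 4 orthographic cameras that
# rotate about the y-axis and translate between frames
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (2, 1, 1)]
frames = []
for f in range(4):
    th = 0.3 * f
    tx, ty = 1.5 * f + 0.7, 0.5 * f - 1.2       # per-frame translation
    iax = (math.cos(th), 0.0, math.sin(th))     # image x-axis
    jax = (0.0, 1.0, 0.0)                       # image y-axis
    frames.append([sum(a * p for a, p in zip(iax, pt)) + tx for pt in points])
    frames.append([sum(a * p for a, p in zip(jax, pt)) + ty for pt in points])

# Centering each row removes the translation; the rank collapses to 3,
# which is what makes the motion/shape factorization possible
centered = [[x - sum(row) / len(row) for x in row] for row in frames]
print(matrix_rank(frames), matrix_rank(centered))  # 4 before, 3 after
```

The real algorithm then factors the centered matrix (via SVD) into a 2F×3 motion matrix and a 3×P shape matrix, which is how it recovers both the model and the camera path at once.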

Re:Software for 2D images for 3D models is not new (1)

ashooner (834246) | more than 6 years ago | (#21949378)

I think this [unc.edu] is much more impressive. Tracing isn't needed if the location of the camera can be determined. Pretty cool stuff.

Old tech? (0)

Anonymous Coward | more than 6 years ago | (#21947706)

Isn't this old news? I remember seeing a demo of similar technology over 5 years ago. It must have been Japanese, because they traced boobs with it.

Wow, congrats to the submitter (0, Interesting)

Anonymous Coward | more than 6 years ago | (#21947768)

for finding the one boingboing post that's not about Doctorow's Disney fetish, or Xeni's insistence that she is, in fact, not a he.

Re:Wow, congrats to the submitter (0)

Anonymous Coward | more than 6 years ago | (#21948138)

you forgot knitted porn.

computer vision technology is pretty wild (3, Insightful)

jollyreaper (513215) | more than 6 years ago | (#21947772)

Remember back in the day when we were told that computers would never be able to learn how to understand human speech because it's too complicated? The arguments were compelling, but now we've got voice recognition working over crappy telephone connections, and dictation software is getting better all the time. As bad as the voice recognition problem was, computer vision seemed like an even harder nut to crack, given how impossible it seemed to get a machine to go from a two-dimensional image to 3D. All of this stuff seemed like impossibly difficult "we'll never get there" AI problems, and then we see a technology demonstration that nails it. I'm still astounded that DARPA is not only asking for robot-driven cars, they're actually getting teams producing working results. That's another problem I always thought would be impossible.

My prediction for the future: the 21st century will be for robotics what the 20th was for aviation. We've been thinking about it for centuries but now the technology is maturing to the point that we can really do something with it. The stuff we're amazed by today is going to seem like wood and canvas biplanes.

Re:computer vision technology is pretty wild (0)

Anonymous Coward | more than 6 years ago | (#21947926)

Probably because the 22nd century will be the one where our numbers die out and robots are our lasting legacy to the universe [Don't remember the source of concept]

Re:computer vision technology is pretty wild (1)

Atario (673917) | more than 6 years ago | (#21948064)

I'm still astounded that DARPA is not only asking for robot-driven cars, they're actually getting teams producing working results. That's another problem I always thought would be impossible.
I knew it was doable (even if only with assistance by way of special roads), but no one was putting any real effort into making it usable en masse. So thank you, DARPA.

Re:computer vision technology is pretty wild (4, Interesting)

MobileTatsu-NJG (946591) | more than 6 years ago | (#21948170)

Remember back in the day when we were told that computers would never be able to learn how to understand human speech because it's too complicated? The arguments were compelling, but now we've got voice recognition working over crappy telephone connections, and dictation software is getting better all the time. As bad as the voice recognition problem was, computer vision seemed like an even harder nut to crack, given how impossible it seemed to get a machine to go from a two-dimensional image to 3D. All of this stuff seemed like impossibly difficult "we'll never get there" AI problems, and then we see a technology demonstration that nails it. I'm still astounded that DARPA is not only asking for robot-driven cars, they're actually getting teams producing working results. That's another problem I always thought would be impossible.
Hmm. Though it's not really that clear from your post, I'm concerned that you're seeing one problem where there are really two. In the case of voice recognition, getting a computer to recognize a spoken word within a certain context is far easier than getting the computer to understand a phrase like "Set up an appointment for me on the fifth of May at 2 pm." One is simple signal analysis; the other is context-sensitive understanding. The former is easy and has been possible for years. The latter is virtually impossible without the computer in question having 'experience'.

The same is true for image recognition. You can get a computer to recognize movement pretty easily. Heck, the ability for software to detect the 3D form of an object has been around for ages. However, getting a computer to watch Star Wars and say "I see Denis Lawson sitting inside an X-wing fighter" is, as I said before, difficult to do without a concept of 'experience'.

We'll get there one of these days, but right now the sorts of cool-sounding advancements we've been seeing really only work in very specific circumstances.

Re:computer vision technology is pretty wild (1)

jollyreaper (513215) | more than 6 years ago | (#21949934)

Hmm. Though it's not really that clear from your post, I'm concerned that you're seeing one problem where there are really two. In the case of voice recognition, getting a computer to recognize a spoken word within a certain context is far easier than getting the computer to understand a phrase like "Set up an appointment for me on the fifth of May at 2 pm." One is simple signal analysis; the other is context-sensitive understanding. The former is easy and has been possible for years. The latter is virtually impossible without the computer in question having 'experience'.
I am aware; that's why I made the distinction between voice recognition on the telephone by automated attendants and dictation software. It's not quite perfect yet, but it's a lot better than it used to be. We're moving from stuff being years off to being in the next model year or two. I find that impressive.

Re:computer vision technology is pretty wild (1)

LordLucless (582312) | more than 6 years ago | (#21950518)

It depends what you mean by "understand". What current dictation programs do is really just pattern-matching. It analyzes each word, and finds a word in its database that fits. It's the same system as the automated phone stuff, just on a larger scale. What the grandparent was talking about is getting a computer to comprehend speech - it's known as the natural language problem, and it's nowhere near solved.

Even if a computer can pick the words out of your speech, it has no idea what they mean, unless such meanings have been pre-programmed in. And programming in every possible meaning in every possible context for every possible combination of terms in a human language is not realistic (and also doesn't compare to the way the human brain performs the same feat).
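The pattern-matching point can be made concrete with a toy "recognizer". This assumes a hypothetical vocabulary and uses Levenshtein (edit) distance as the similarity measure, which real speech systems don't (they match against acoustic models), but the structure is the same: pick the closest database entry, with no notion of meaning or context anywhere:

```python
def edit_distance(a, b):
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def recognize_word(heard, vocabulary):
    """Pure pattern matching: return the closest vocabulary entry.
    Nothing here knows what any of the words mean."""
    return min(vocabulary, key=lambda w: edit_distance(heard, w))

vocab = ["write", "right", "rite", "wrote"]
print(recognize_word("writ", vocab))  # -> write
```

Deciding whether the speaker meant "write", "right", or "rite" is exactly the part this kind of matching cannot do, which is the natural-language problem the parent describes.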

Re:computer vision technology is pretty wild (1)

PingXao (153057) | more than 6 years ago | (#21948252)

I'm still waiting for computers to be able to recognize speech, untrained and with a speaker-independent vocabulary range greater than a hundred or so recognizable patterns. One that can take dictation, get the grammar and punctuation right, capitalize words properly and distinguish between "right", "write" and "rite" (among others) depending on the context in which they're used.

You say this already exists? On what planet?

Re:computer vision technology is pretty wild (1)

WK2 (1072560) | more than 6 years ago | (#21948506)

Remember back in the day when we were told that computers would never be able to learn how to understand human speech because it's too complicated?

I remember being told a lot of things. Like there is no moon. Only a small percentage of people would say that a technological advance would never happen. Never is a long time. As a previous poster pointed out, this particular advance hasn't happened yet, but it probably will eventually.

now we've got voice recognition working over crappy telephone connections

That depends on how you define "working." I would not count yelling into a phone slowly and repeating yourself over and over as working. It is sad that so many places have replaced the old "press 1 to do X," which was slow, but worked.

Re:computer vision technology is pretty wild (1)

HTH NE1 (675604) | more than 6 years ago | (#21948936)

Remember back in the day when we were told that computers would never be able to learn how to understand human speech because it's too complicated? The arguments were compelling but now we've got voice recognition working over crappy telephone connections and dictation software is getting better all the time.
"Dear Aunt, so let's set double the killer delete select all"

Recognition != Understanding

Why dont google use this? (0)

Anonymous Coward | more than 6 years ago | (#21947846)

I've always thought converting various images (or in this case a video) into a 3D model wouldn't be too hard!
So why doesn't Google use this on Google Maps to make a 3D world?

Re:Why doesn't Google use this? (0)

Anonymous Coward | more than 6 years ago | (#21948182)

Because they only have one pic of each location.

Re:Why doesn't Google use this? (1)

Purity Of Essence (1007601) | more than 6 years ago | (#21948300)

Actually, Google already has similar tech (although a bit more primitive than this). It's called SketchUp [google.com] and it is designed for integrating 3D structures of photographed landmarks into Google Earth [google.com] . In the hands of an expert it is pretty powerful stuff.

Oh yeah ? (3, Funny)

witte (681163) | more than 6 years ago | (#21947866)

I'd like to see how it holds up against Calista Flockhart footage and not go Division By Zero.

Re:Oh yeah ? (0)

Anonymous Coward | more than 6 years ago | (#21948916)

2001 called, it wants its joke back.

Re:Oh yeah ? (0)

sgt_doom (655561) | more than 6 years ago | (#21949702)

Hmmmm.....first this....next the Replicant. Sheeeesh!!!

Re:Oh yeah ? (1)

mgblst (80109) | more than 6 years ago | (#21955258)

The year 2002 called, they want their joke back.

It's not that new (0)

Anonymous Coward | more than 6 years ago | (#21947934)

What they did is not that new. Voxel coloring has been around for a decade. However, the main problem has been that it only works well for perfectly diffuse surfaces, since a non-diffuse point viewed from different camera angles looks different in the real world. Not having enough camera angles of the same point (to fill in the gaps) to determine its 3D position via correlation is also a problem. It seems these researchers have found answers to those problems.
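The photo-consistency test at the heart of voxel coloring can be sketched in a few lines. This is a hedged illustration of the general idea only; the colour samples and threshold below are invented, not anyone's actual implementation:

```python
# Sketch of voxel coloring's photo-consistency test: a voxel is kept only
# when the pixels it projects to in every camera that sees it have
# (nearly) the same colour. This assumption breaks for non-diffuse
# surfaces, which is exactly the limitation noted above.
import statistics

def is_photo_consistent(samples, threshold=10.0):
    """samples: one (r, g, b) colour per camera that sees the voxel.
    Returns True when the per-channel spread stays below the threshold."""
    if len(samples) < 2:
        return True  # a single view can never contradict the voxel
    return all(
        statistics.pstdev(channel) <= threshold
        for channel in zip(*samples)
    )

# Two views agreeing on a reddish colour -> consistent:
print(is_photo_consistent([(200, 40, 40), (198, 42, 39)]))   # True
# Two views seeing wildly different colours -> carve the voxel away:
print(is_photo_consistent([(200, 40, 40), (30, 180, 200)]))  # False
```

A non-diffuse (shiny) surface fails this test from some viewpoints even when the voxel is real, which is why diffuse reflectance matters so much here.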

This sounds like a project I did some work on (5, Interesting)

markds75 (958873) | more than 6 years ago | (#21947962)

I'm a Ph.D. student at UC Santa Cruz. I finished my master's a few years ago working on enhancements to a project [ucsc.edu] with similar goals. My advisor, Jane Wilhelms [ucsc.edu] (who unfortunately died shortly after I finished my master's), had been working on computer vision techniques for several years. Her work focused on extracting the motion of animals (often children or horses) from videos. My master's contribution was to look at how the accuracy and usability of the software could be improved if we assume that the general motion of a walk is the same for all instances of a particular species (the knees all bend the same way, the legs move in the same order, etc.). I didn't have a high-quality capture to start with, so the results were a bit fuzzy in terms of accuracy, but it did make the process easier for the user. The user had only to make the "original" motion match the video at key frames (maybe 4 per "walk cycle"), and the computer could easily interpolate the rest; I don't recall off the top of my head, but I think the number of key frames the user had to specify was reduced by half or more over the former process (without the canonical motion as a starting point). I didn't publish any papers based on my work, but my master's thesis (with example filmstrips) is available [ucsc.edu].
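The keyframe-plus-interpolation idea described above can be sketched roughly like this. Plain linear interpolation stands in for whatever the actual system used, and all the frame numbers and angles are made up:

```python
# Sketch of keyframe interpolation: the user pins a joint angle at a few
# key frames and the software fills in every frame between them.

def interpolate_keyframes(keys, n_frames):
    """keys: dict {frame_index: angle}; returns one angle per frame."""
    frames = sorted(keys)
    out = []
    for f in range(n_frames):
        if f <= frames[0]:
            out.append(keys[frames[0]])    # hold the first key
        elif f >= frames[-1]:
            out.append(keys[frames[-1]])   # hold the last key
        else:
            # find the bracketing keyframes and blend linearly
            for a, b in zip(frames, frames[1:]):
                if a <= f <= b:
                    t = (f - a) / (b - a)
                    out.append(keys[a] * (1 - t) + keys[b] * t)
                    break
    return out

# Pinning a knee angle at frames 0 and 4 fills in frames 1-3:
print(interpolate_keyframes({0: 0.0, 4: 20.0}, 5))
# [0.0, 5.0, 10.0, 15.0, 20.0]
```

Starting from a canonical walk cycle, as the comment describes, means the in-between frames begin close to correct and the user only nudges the keys.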

Unfortunate? (0)

Anonymous Coward | more than 6 years ago | (#21949332)

If you think it was unfortunate she died after you finished your thesis, imagine where you'd be if she had died before. (Hint: still in grad school)

Re:This sounds like a project I did some work on (1)

Sleepy (4551) | more than 6 years ago | (#21952792)

About 8 years ago I worked on the SynaPix project, which was also very similar to the article. The SynaFlex system could recover motion paths, direction, camera path and scene geometry automatically from video. Or you could focus on one aspect/stage of this (such as manually tracking points to trace an object, or inserting perfect geometry to match a known object such as a floor plane, thus buttressing the geometry and sometimes the tracking results).

Pretty neat stuff. A pity the company outspent itself (too many goals, a >50% non-engineering staff before even a first sale - how many times has that scenario repeated itself?).

Breaking News: Vlad announces presidential bid (-1, Flamebait)

Anonymous Coward | more than 6 years ago | (#21947976)

At 5:00 PM EST, Vladinator announced from his Joliet home that he will be running for president as a member of the newly-formed Catholic Freedom Party. In a statement, Vlad explained that his party's main platform involves making it legal for people to "shit in public and fuck babies". More details will be available in tomorrow's full report.

Test case (3, Interesting)

kramulous (977841) | more than 6 years ago | (#21948040)

Hook up the Google Maps API with a polar-navigated flight path and some edge/point detection algorithms and start mapping. That'd be an interesting video.

I can see it now! (0)

Anonymous Coward | more than 6 years ago | (#21948184)

Imagine the porn!

Like photosynth (0)

Anonymous Coward | more than 6 years ago | (#21948218)

So it's kinda like MS' Photosynth, except it gathers the photos itself from a video.

This is pretty great... (1)

8bitmachinegun (855479) | more than 6 years ago | (#21948562)

I work as a video professional for a very large stock video [thoughtequity.com] provider. I could see software like this being an amazing tool for a company such as mine. Not only could we offer you footage of (for example) a horse running through a field - we might also be able to sell the individual elements themselves. Need some more horses? How about we just sell you the background and you pick which animals you want? A lot of the time the video industry is dictated by extremely tight deadlines and budgets - any tool offered to a producer or editor that makes it cheaper/faster to get to a desired outcome will get snatched up. I could see this as a real labor saver/enabler.

Re:This is pretty great... (1)

dimeglio (456244) | more than 6 years ago | (#21952098)

How about helping CSI folks reconstruct CCTV footage of a crime?

Maybe even helping UFO researchers get more details out of so-called video footage of a "real UFO."

Not sure if additional information can be extrapolated with the technique (I didn't read TFA), but it could in fact be very helpful.

Can someone ... (1)

elsJake (1129889) | more than 6 years ago | (#21948612)

Apply that to the 2D sprites in Doom? I like the new engines out there created to play Doom II WADs with fancy new polygonal objects, but it would be nicer if the monsters were 3D as well.

Re:Can someone ... (1)

Warbothong (905464) | more than 6 years ago | (#21951598)

Done [eduke32.com] . NEXT!

Similar concept for my thesis (2, Interesting)

ZedarSlash (982046) | more than 6 years ago | (#21948856)

In my thesis I'm also creating a 3D model from a video stream, only I'm using stereoscopy and pattern recognition to find matching objects in each frame and triangulating the depth to said objects. By the end I'm hoping to reduce the objects to small pixel clusters; the tricky part is that all this is happening in real time. By mounting the cameras on a device where the point of view is known, it could be used to map out any static terrain by just navigating through it. Adding more cameras from different perspectives increases the completeness of the generated model. The article has definitely got the right idea. With sufficient object detection and tracking algorithms, you could minimise or eliminate the need to draw the template.
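For a rectified stereo rig, the triangulation step boils down to the classic depth-from-disparity relation. A minimal sketch follows; the focal length and baseline are invented calibration values, not numbers from the thesis:

```python
# Depth-from-disparity for a rectified stereo pair: Z = f * B / d, where
# f is the focal length in pixels, B the baseline between the cameras,
# and d the horizontal pixel offset of a matched feature.

def depth_from_disparity(disparity_px, focal_px=800.0, baseline_m=0.12):
    """Depth (in metres) of a matched point."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a valid match")
    return focal_px * baseline_m / disparity_px

# A feature at x=420 in the left frame and x=400 in the right frame has
# a disparity of 20 px, so it sits at:
print(depth_from_disparity(420.0 - 400.0))  # 4.8 (metres)
```

Note the reciprocal relationship: nearby objects have large disparities and are localised well, while distant ones have tiny disparities and noisy depth, which is why wider baselines help for terrain mapping.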

What, all these comments (4, Funny)

SeaFox (739806) | more than 6 years ago | (#21948982)

...and no one is going to make a porn joke?

Re:What, all these comments (1)

hyades1 (1149581) | more than 6 years ago | (#21950006)

Damn...I was thinking about that, but since you seem to be one step ahead of me, it seems kind of redundant.

I'll leave the elements for you to polish up and use if you like: If you're thinking about creating a model of your favorite porn star, women will stand to benefit from this a bit more than the guys. Might go through a bit more construction material, though.

Re:What, all these comments (0)

Anonymous Coward | more than 6 years ago | (#21952410)

wget -m omdb.org/search?q=jessica+alba | grep trailer{.mov|.avi|.mpg} | wget %i | videotrace > /dev/3dprinter --silicon-lifesize

forgive the pseudo-bash and pseudo-regex

Any open source programs for this?(video or still) (1)

Sebastien_Bailard (1034810) | more than 6 years ago | (#21948998)

This is very interesting. Unfortunately, it is going to be closed-source and patented.

Does anyone know any open-source projects to do object reconstruction from video or still photographs? I'm asking because my group is building a 3D printer.
http://www.reprap.org/ [reprap.org]
(Self-link pimpage, etc. etc.)
and I think it would be cool and useful to be able to capture a 3D model from photos or video of a sculpted maquette, pet cat, broken part, human, or so on.

(I just stumbled across this by googling "gpl object reconstruction", which may be relevant):
https://ezra.dev.java.net/ [java.net]

People may be interested in
http://splinescan.co.uk/ [splinescan.co.uk]
which is a gpl laser scanner hardware (pen laser, prism*, webcam, and turntable) + software project to do 3D object scanning.

I'll follow comment responses to this thread, but I also welcome emails:
penguin at supermeta dot ihatespamtoo dot com

Re:Any open source programs for this?(video or sti (1)

am 2k (217885) | more than 6 years ago | (#21949280)

David [david-laserscanner.com] is another free DIY-laser-line-scanner-based implementation which doesn't need a turntable (merging multiple scans doesn't seem to be included with the free version, though).

Re:Any open source programs for this?(video or sti (1)

ZedarSlash (982046) | more than 6 years ago | (#21949458)

I'll be making my source available once I've finished my thesis, though that code won't be available until the end of 2008.

Can't wait to... (0)

Anonymous Coward | more than 6 years ago | (#21949152)

create a 3D model of my favorite p0rn movie

combine techniques (1)

BrandonBlizard (1007055) | more than 6 years ago | (#21949582)

If you could combine the techniques that create the models automatically with techniques like this, where a skilled artist is involved, you could produce some high-quality output indeed.

You can do it manually (1)

ch-chuck (9622) | more than 6 years ago | (#21949726)

You can easily make 3D images viewable with LCD shutter glasses and an Nvidia card if you find some shots where the camera is panning across a fairly static scene, using software like 3D Combine [photoalb.com] . Just take two frames a few frames apart and use one for each eye. I did this with some old Betty Boop cartoons (which were made by rotoscoping, that is, based on actual photographic images) and they worked great.

Re:You can do it manually (0)

Anonymous Coward | more than 6 years ago | (#21950354)

And that's related to the article, how...?

Swordbreaker (1)

jameskojiro (705701) | more than 6 years ago | (#21950324)

Now I can finally have a 3-D model of the starship Swordbreaker.

nothingware (1)

meburke (736645) | more than 6 years ago | (#21950346)

If Microsoft made this announcement it would be condemned as "vaporware." The main site claims it is in beta and that they are looking for commercial partners, so it apparently is not open source and of no use to us at this time.

I appreciate the links and information in the discussion prompted by this article. Although I'm underwhelmed by the actual announcement, I've learned a lot from the links you folks have provided.

I guess you're not familiar with SIGGRAPH? (1)

argent (18001) | more than 6 years ago | (#21950758)

Actually, Microsoft has made a number of presentations at SIGGRAPH over the years without any condemnation or other unpleasantness. Why would you think otherwise? This kind of thing is what SIGGRAPH is for.

Can we scale the models up? (1)

Nefarious Wheel (628136) | more than 6 years ago | (#21950542)

There's this hot little Night Elf Paladin chick I have my eyes on...

Not news, it's called photogrammetry (0)

Anonymous Coward | more than 6 years ago | (#21950712)

Most of the comments about this (both here and on BoingBoing) are clueless to say the least. You people must think the guys at Pixar and ILM have been asleep on the job for the last 20 years.

This has been done for ages. It's called photogrammetry, and it has been used in several movies (e.g., Fight Club). Maybe their approach is simpler or maybe it works faster than current techniques, but until they post a video showing the workflow, there's simply nothing new here.

Check out the Campanile movie! (1)

200_success (623160) | more than 6 years ago | (#21950730)

Even more impressive is the Campanile movie [debevec.org] , where an entire 3D model of the UC Berkeley campus and a fly-by shot was generated from just 15 still pictures. This was done a whole decade ago, for SIGGRAPH 97.

Re:Check out the Campanile movie! (0)

Anonymous Coward | more than 6 years ago | (#21954180)

Carnegie Mellon did this type of thing as well from a still image.
http://www.cmu.edu/PR/releases06/060613_3d.html [cmu.edu]

Good Job! Univ.ofKiel created the pro version 2003 (1)

holzschneider (1215146) | more than 6 years ago | (#21951808)

The University of Kiel (Germany) presented much the same thing (without the need to manually mark objects or object boundaries) at CeBIT 2003.

Check out this video (scroll the page to "Movie for presentation on CeBIT 2003"):
http://www.mip.informatik.uni-kiel.de/tiki-index.php?page=3D+reconstruction+from+images [uni-kiel.de]

Re:Good Job! Univ.ofKiel created the pro version 2 (1)

marjancek (1215230) | more than 6 years ago | (#21952274)

Nice link.

Also, 3D active contours can be used to track the shape and reconstruct the model.

Geeks Rejoice! (1)

quickpick (1021471) | more than 6 years ago | (#21957088)

Weird Science wasn't a movie...it was a prophecy!