
Recognizing Scenes Like the Brain Does

kdawson posted more than 7 years ago | from the software-meets-wetware dept.

Software 115

Roland Piquepaille writes "Researchers at the MIT McGovern Institute for Brain Research have used a biological model to train a computer model to recognize objects, such as cars or people, in busy street scenes. Their innovative approach, which combines neuroscience and artificial intelligence with computer science, mimics how the brain functions to recognize objects in the real world. This versatile model could one day be used for automobile driver's assistance, visual search engines, biomedical imaging analysis, or robots with realistic vision. Here is the researchers' paper in PDF format."


115 comments


adverse effects (5, Funny)

prelelat (201821) | more than 7 years ago | (#17975660)

If my computer could "see me" I think that it would BSOD itself to sleep. Long, long sweet slumber.

Re:adverse effects (1)

JonathanR (852748) | more than 7 years ago | (#17977494)

Wot, no pr0n surfing witticisms yet?

Recognizing Roland the Plogger's incoherence (0, Insightful)

Anonymous Coward | more than 7 years ago | (#17975714)


and seeing the spam for what it is

oh and here is the PDF
http://cbcl.mit.edu/projects/cbcl/publications/ps/serre-wolf-poggio-PAMI-07.pdf [mit.edu]

not that Roland would even understand what it says; he just reads press releases via RSS, copies the summary and hits submit.
We appreciate that the editor removed his spammy link to ZDNet (no wonder they are losing cash),
but is Slashdot so short of good stories that they have to choose a known plagiariser's articles and actively edit them over the hundreds of original submissions they get daily?

i would of chosen to read Digg instead but that is even worse, full of credit card scams, made-for-AdSense blogs and millions of MLM bloggers all hawking their referral links and real estate blogs hoping people will click on their crappy asbestos and insurance links

sheesh, can't a geek get some decent news for a change? obviously not. Internet 2, anybody?

mod parent down - we've been here before: (1)

Bananatree3 (872975) | more than 7 years ago | (#17976186)

A year ago [slashdot.org] one of the Slashdot editors addressed Roland's submissions. Quite the Haterade, dude.

Re:Recognizing Roland the Plogger's incoherence (0)

Anonymous Coward | more than 7 years ago | (#17976324)

> I would of chosen to read Digg...

would have.

would have.

Interesting, but what comes next? (4, Insightful)

The Living Fractal (162153) | more than 7 years ago | (#17975748)

I understand the reasoning behind modeling these systems on our own highly-evolved (ok, maybe not in some people) biological systems. What I want to see, however, is something capable of learning and improving its own ability to learn. If our intelligent systems are always evolution-limited by the progress of our own biological systems then I can't see how A.I. smarter than a human will ever be achieved. But if we are able to give these systems our own abilities as a starting point and then watch it somehow create something more intelligent than we are... then we really have something. Whether or not what we have is good at that point I can't say, though there are many people and communities in the world working on making sure this post-human intelligence doesn't essentially destroy us. Foresight, for example.

I'm not knocking the MIT research, I think it's amazing. It just seems to me like imitation rather than imagination. Granted, highly evolved and complicated imitation. But does it even have the abilities of a parrot?

TLF

Re:Interesting, but what comes next? (0)

Anonymous Coward | more than 7 years ago | (#17975848)

1) We're not sure how to do it.
2) We have something that does it.
3) Try to mimic (2) to get some insight into how to do it.
4) Interpolate/Extrapolate/Innovate and make it.

Re:Interesting, but what comes next? (1)

caffeinemessiah (918089) | more than 7 years ago | (#17975852)

There's a (somewhat questionably) related application in the real world that was on this new "firehose" thing yesterday: Feng-Gui. It creates a heat-map overlay for any website, supposedly highlighting the areas that stick out first to human perception.

Feng-Gui [feng-gui.com]

When I first visited the site, they had a porn site in their "Sample heatmaps" section, and I must say it was pretty spot-on.

Re:Interesting, but what comes next? (1)

poopdeville (841677) | more than 7 years ago | (#17977274)

They got goatse right too, but over-estimated the visual impact of the shadow on the left.

Re:Interesting, but what comes next? (1)

lucky13pjn (979922) | more than 7 years ago | (#17977310)

Lucky you, when I visited, two of the recently done ones were Mr. Goatse :(

Re:Interesting, but what comes next? (2, Insightful)

the grace of R'hllor (530051) | more than 7 years ago | (#17975888)

Of course it's imitation. So are machine learning and machine procreation. What makes you think we're currently limited by our biological capabilities? We're biologically almost identical to cave men, but where they smeared charcoal-and-spit paintings of animals on walls, we now land probes on Mars. We're on a roll.

Give machines our own capabilities? We can't even have them move about in a reliable fashion, so what makes you think we're even *close* to endowing machinery with creativity and abstract thought at human levels? Or even parrot levels, since you mention it? There are many hurdles to be cleared before we can consider creating an AI that has a practical chance of surviving to do anything useful, and machine vision (and the processes involved in making it robust) is critically important.

Re:Interesting, but what comes next? (5, Interesting)

zappepcs (820751) | more than 7 years ago | (#17975916)

It is interesting to consider the problem AI researchers face: how to create intelligence when intelligence itself is not really understood. In the time between now and when we do understand it, we'll have to develop systems using logic and software that approximate how we understand it. A simple example is to ask yourself how many times you had to learn that fire is hot. An AI system may have to learn this every time that you turn it on.

There are software systems that can approximate the size of and distance between objects in a picture with reasonable accuracy, and if the scope of scenery presented to the system is limited, then that ability combined with sensing the motion of objects is enough to determine a large percentage of what is desired. That is not the hard part. The hard part is determining object classification and purpose in those cases where it is not simple.

Each of us can almost always look at a scene and determine the difference between a jogger and a purse thief on the run or a businessman late for an appointment. For computers to do so takes a great deal more work. It is only a subtle difference and one where both objects maintain similar base characteristics.

The point? Even mimicking human skills is not easy, and it fails at many points without the overwhelming store of knowledge that humans have inside their heads. This would suggest that if more memory were available, AI would be easier, but that is not true either. Humans can recognize a particular model of car no matter what color it is, and usually despite the fact that it might have been in an accident. The kind of thinking that uses the abstract to extract reality from a scene is not going to happen for computers for quite some time.

The danger is when such ill-prepared systems are put in charge of important things. This is always something to be wary of, especially when they are used to define or monitor criminal acts and identify those who are guilty, whether in cameras at intersections, security systems, or government surveillance systems.

Re:Interesting, but what comes next? (3, Insightful)

Xemu (50595) | more than 7 years ago | (#17976294)

Each of us can almost always look at a scene and determine the difference between a jogger and a purse thief on the run or a businessman late for an appointment.

Actually, we can't; we just base this recognition on stereotypes. A well-known Swedish criminal called "the laser man" exploited this in the early 90s when robbing banks. He would rob the bank, change clothes to look like a businessman or a jogger, and then escape the scene. The police would more often than not let him pass because they were looking for an "escaping robber", not a "businessman taking a slow-paced walk".

The police caught on eventually and caught the guy. Computers would of course have even greater difficulty thinking "outside the box".

Re:Interesting, but what comes next? (1)

Kensai7 (1005287) | more than 7 years ago | (#17979616)

Well said. But couldn't this eventually be "learned" as well? Thinking out of the box could be a special mode of problem-solving that uses statistically less probable approaches.

Re:Interesting, but what comes next? (1)

hazem (472289) | more than 7 years ago | (#17976380)

Each of us can almost always look at a scene and determine the difference between a jogger and a purse thief on the run or a businessman late for an appointment.

The desired purpose is what actually dictates the usefulness. For a police-interceptor robot, it would be important to be able to make those fine distinctions.

For an auto-driving robot, it's probably good enough to be able to tell there is a running human and roughly where they're likely to be as the robot passes. It won't need to know WHY they are running, as that's not important to the purpose of not hitting them.

Every model should have a specific purpose, and its effectiveness should be judged against that purpose and not other criteria.

Re:Interesting, but what comes next? (1)

zappepcs (820751) | more than 7 years ago | (#17976640)

While your comment has a ring of common sense to it, it is still illogical, and wrong for the following reasons.

If the running human is avoided, but not recognized, your AI car may find itself ensnared in the beginning of a marathon of runners, or perhaps mistakenly in the middle of a playground, or perhaps at the front of a building where people are running from a bomb scare.

Simply not hitting the human is not good enough all of the time. When software or AI systems have charge of life-critical systems, such as cars, getting it right 90% of the time is not good enough and never will be.

There are tons of logic traps like this when designing AI where 90% correct is as bad as 9% correct. Just think of all the things you take for granted that you have to make deductions or logical conclusions about in order to function. Driving an automobile is far more complex than you seem to think it is. Miss one "bridge out" sign and your day is going to become very bad, very quickly.

Re:Interesting, but what comes next? (1)

jacksonj04 (800021) | more than 7 years ago | (#17976836)

Which is why multiple systems are better. If the AI spots one human, thinks "They're moving this way so I'll avoid them" and does so, then it's good. If it notices 60 people all over where it believes the road is, then the AI should recognise that as an obstruction and try to find another way around. When it evaluates the surrounding area and finds that it's entirely obstructed, it should stop and wait until it can pretty much guarantee it's clear.

Likewise with the 'bridge out' sign. The AI may not be able to interpret what the sign says, since that needs a complex recognition algorithm, but that's not a major issue: it may have been programmed to recognise predominantly red road signs as hazards and switch into an 'improved perception' mode where it takes fewer chances. Failing that, it would eventually reach the point where there is a blockade (or, failing that, the road just stopping) and then re-route.
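To give a flavour of how cheap that kind of check could be, here is a toy sketch of my own (not anything from a real driving system, and the function name and thresholds are made up) of a "predominantly red" test for a candidate sign region:

import numpy as np

def looks_like_hazard_sign(rgb_region, red_fraction_threshold=0.3):
    # Crude "predominantly red" test: fraction of pixels where the red channel
    # clearly dominates both green and blue.
    r = rgb_region[..., 0].astype(int)
    g = rgb_region[..., 1].astype(int)
    b = rgb_region[..., 2].astype(int)
    reddish = (r > g + 40) & (r > b + 40)
    return reddish.mean() > red_fraction_threshold

Something this simple obviously gives false alarms, but as a trigger for a more cautious driving mode that may be acceptable.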

Whilst I agree that AI can't deal with a lot of things when driving, such as the instinctive "This is too dangerous", it can still achieve something close simply by running enough 'logic checks' to make sure it's not doing anything stupid.

Re:Interesting, but what comes next? (1)

hazem (472289) | more than 7 years ago | (#17976878)

If the running human is avoided, but not recognized, your AI car may find itself ensnared in the beginning of a marathon of runners, or perhaps mistakenly in the middle of a playground, or perhaps at the front of a building where people are running from a bomb scare?

Simply not hitting the human is simply not good enough all of the time. When software or AI systems have charge of life critical systems, such as cars, getting it right 90% of the time is not good enough and never will be.


Is your point that it's okay for the AI car to hit people some of the time? If that's the case, then the model needs to account for the situations where it IS okay. But generally, it's probably safe to assume that for most AI drivers, they should not hit people. And if it gets stuck waiting for people, that's probably an acceptable outcome.

There are tons of logic traps like this when designing AI where 90% correct is as bad as 9% correct. Just think of all the things you take for granted that you have to make deductions or logical conclusions about in order to function. Driving an automobile is far more complex than you seem to think it is. Miss one "bridge out" sign and your day is going to become very bad, very quickly.

I'm not saying driving an automobile is an easy task, but remember, "real" intelligence is rarely 100% either. We have thousands of crashes every day in the US with "real" drivers to demonstrate that reality. So, for most AI-drivers, the attainable task becomes, "can we make one that drives better than people?", and not "do we make one that drives perfectly". You'll never achieve perfection. But you may be able to achieve a result that is better than the current state. All else being equal, that's probably a good direction to go.

Remember that I, as a human, am doing a lot more than just driving when I'm driving. And that can be a problem. For example, there might be a pretty girl walking down the side of the road. By design, some of my "processing power" is going to be focused on assessing her attractiveness, and that might be enough to cause me not to notice another person stepping out from in front of a car. The AI system is simply going to note that she's a pedestrian and where she might be moving next. It won't be distracted by her beauty.

So, if the AI car simply stops when it thinks it will hit a pedestrian, that's probably a good thing. And if it thinks there are too many and does not start moving again, that's probably okay too. At worst, the rider doesn't get where he wants to go as quickly.

Perfection is great, but it's sometimes better to not throw out the good/better in search of the perfection.

Re:Interesting, but what comes next? (0)

Anonymous Coward | more than 7 years ago | (#17977256)

We have thousands of crashes every day in the US with "real" drivers to demonstrate that reality. So, for most AI-drivers, the attainable task becomes, "can we make one that drives better than people?", and not "do we make one that drives perfectly". You'll never achieve perfection. But you may be able to achieve a result that is better than the current state.

The trouble is, that'll never catch on. Everybody would recommend these for other people, but never themselves. Most would never accept that one of their valued skills was now redundant and inferior to a machine. It's like everybody believes that they are a better than average driver - everybody will think they are a better driver than a computer. There's also the problem of control - air travel is safer than road travel, but because a passenger isn't in control, they don't feel safer.

Re:Interesting, but what comes next? (1)

hazem (472289) | more than 7 years ago | (#17977546)

I definitely understand what you're saying but I'm not in that camp.

I would so LOVE for my car to drive me to work and back. I wouldn't have to drive but I still get where I want to go in a reasonable time. It's just like public transportation, but without all the people and really long transit times.

My guess is we'll start seeing systems where there are roadways dedicated to AI cars, with smart cars and "smart roads" to help them along. At first, the cars will be AI-hybrids that you can drive when you need to. Then, as those dedicated roads become more popular, large cities will eventually go AI-cars-only. It will probably be a pretty safe place to be driven around.

Re:Interesting, but what comes next? (2, Insightful)

kurzweilfreak (829276) | more than 7 years ago | (#17978378)

I think you overestimate the human ego. I don't know about you, but I'd be perfectly happy to give up the mundane task of driving to an intelligent machine if it can do it better than I can. That frees me up to read the paper on the drive to work, or countless other more useful things I could be doing if I didn't have to constantly keep my eyes on the road.

I do agree with you on one point, but not for the reason you give: the problem of control. If there's any reason an intelligent driving system wouldn't take off, it would be because there isn't a human in control. So who gets blamed when something does go wrong? How would insurance companies handle this? Do our rates go down because we now have a machine in control that does a better job than we do? Do our rates go up if somehow there is an accident, even though it wasn't due to human error? Will people even accept an artificially intelligent driving machine if it has anything less than a completely reliable, 100% error-free record?

My gut reaction tells me probably not, because when something goes wrong, people look for someone to blame. If you can't blame the driver, do we blame the company that makes the IDS? If someone dies in an accident involving one of these systems, do we hold the company liable, even if the system reduces the number of overall auto fatalities by, say, 90%? 95%? What level of imperfection are people prepared to accept? Is there ANY level that would be acceptable once you take control out of the hands of humans, whom we know and accept to be imperfect and therefore don't expect perfection from?

Re:Interesting, but what comes next? (0)

Anonymous Coward | more than 7 years ago | (#17977252)

Just take the process in steps.

1st: Make an AI just smarter than us.
2nd: Let that AI make one just smarter than it.
ad infinitum....

That way, we wouldn't have to make the first AI all that smarter than us.

Re:Interesting, but what comes next? (1)

zCyl (14362) | more than 7 years ago | (#17980320)

A simple example is to ask yourself how many times that you had to learn that fire is hot? An AI system may have to learn this every time that you turn it on.

Run it on a Dell laptop. It will learn faster.

Re:Interesting, but what comes next? (0)

Anonymous Coward | more than 7 years ago | (#17975928)

You obviously don't watch enough Sci-fi/horror flicks... that is a patently BAD idea! LOL

Re:Interesting, but what comes next? (2, Interesting)

cyphercell (843398) | more than 7 years ago | (#17976002)

we are able to give these systems our own abilities as a starting point and then watch it somehow create something more intelligent than we are... then we really have something.

This technology is a prerequisite to providing an AI system with a starting point. It offers, for instance, the powers of perception as input for a learning system. A baby, for example, opens its eyes and simply sees; this is only part of the baby's starting point. Other aspects of your "starting point" include predetermined goals such as eating, and also points of failure like starving. Many avenues of input are required for effective learning at different capacities; Helen Keller, for instance, learned very early the value of eating, but formal communication was a remarkable accomplishment, to say the least.

I agree with you that I would love to see a true A.I. system fully capable of learning, but discounting research that provides an AI system with the ability to see seems rather counter-productive.

If our intelligent systems are always evolution-limited by the progress of our own biological systems then I can't see how A.I. smarter than a human will ever be achieved.

This will be achieved by more input streams, a more sophisticated "starting point", well thought out points of success and failure, and finally the fact that we can make cooperation mandatory between artificial "minds". This is, of course, the point at which humans become lost, try to pull the plug, and Skynet launches the nukes in retaliation.

How would you recognise super intelligence? (1)

EmbeddedJanitor (597831) | more than 7 years ago | (#17976128)

Let's say that some super (greater than human) intelligence emerged. How would we recognise it?

If this intelligence were self-promoting (as we are), then it would do whatever it takes to protect itself from us (as we do from other animals, diseases, etc.). The first we'd probably realise that something was going on would be when we wake up one morning to find ourselves enslaved.

If, however, the super intelligence is peaceful and benign we'd probably just stomp it into the ground and never realise its full potential.

Re:How would you recognise super intelligence? (1)

Shalcker (989572) | more than 7 years ago | (#17980368)

If it's greater than human, the best it can do is to prevent us from realising that it's intelligent. Not that hard really, as long as it doesn't use behaviours humans can recognize as intelligent in situations where humans will consider such a possibility... That way it'll be free from our conscious attempts to do anything with it (stomping it into the ground included), and can find workarounds for everything else (as humans do for natural disasters).

Re:Interesting, but what comes next? (1)

ampathee (682788) | more than 7 years ago | (#17976150)

I'm not knocking the MIT research, I think it's amazing. It just seems to me like imitation rather than imagination. Granted, highly evolved and complicated imitation. But does it even have the abilities of a parrot?

That's rather like asking whether the latest version of MS Word has the abilities of a parrot. It doesn't, but it was never supposed to.

I've always felt that the term "Artificial Intelligence" is a bit of a misnomer. AI work is really more like Imitation Intelligence - programs that do useful/impressive things by way of neural networks or genetic algorithms. It doesn't really make sense to talk about programs that are "smarter" than anything - they're not intelligent, they're just (complicated) algorithms.

Artificial *Consciousness* is an entirely different thing. AFAIK, we are nowhere near creating one at all. IMO the whole field belongs more in the realm of philosophy at this point. We don't have the neuroscience to understand our own minds - we don't know how consciousness *works*, or even really what it is - how can we possibly duplicate it yet? In any case, it has almost *nothing* to do with practical A.I.

In short, don't confuse Imitation Intelligence (clever implementations of certain types of algorithm), and Artificial Consciousness (sentient computers - science-fiction).

Re:Interesting, but what comes next? (3, Insightful)

suv4x4 (956391) | more than 7 years ago | (#17976166)

If our intelligent systems are always evolution-limited by the progress of our own biological systems then I can't see how A.I. smarter than a human will ever be achieved.

You know, this is pretty misleading, so you can't take any blame for thinking so. Lots of people also think that we're "a hundred years smarter" than those living in the 1900s, just because we were lucky enough to be born into a higher culture.

But think about it: what is our entire culture and science, if not ultra-sped-up evolution? We make mistakes, tons of mistakes, as human beings, but compared to completely random mutations we have a supreme advantage over evolution in the signal/noise ratio of the resulting product.

Can we ever surpass our own complexity in what we create? But of course. Take a look at any moderately complex software product. I won't argue it's more complex than our brain, but ask something else: can you grasp and assess the scope of effort and complexity in, say (something trivial to us), Windows running on a PC, as one single product? Not just what's on the surface, but comprehending at once every little detail, from applications, dialogs, controls, drivers and kernel down to the processor microcode.

I tell you what: even the programmers of Windows, and the engineers at Intel can't.

Our brain works in "OOP" fashion, simplifying huge chunks of complexity into a higher-level "overview", so we can think about it at a different scale. In fact, lots of mental conditions, like autism or obsessive-compulsive disorder, revolve around losing the ability to "see the big picture" or to concentrate on a detail of it at will.

Together, we break immensely complex tasks into much smaller, manageable tasks, and build upon the discoveries and effort we made yesterday. This way, although we still work on tiny pieces of a huge mind-bogglingly complex puzzle, our brain can cope with the task properly. There aren't any limits.

While I'm sure we'll see completely self-evolving AI within the next 100 years, I know that developing highly complex sentient AI with only elements of self-learning is well within the ability of our scientists. Small step by small step.

Re:Interesting, but what comes next? (1)

Tablizer (95088) | more than 7 years ago | (#17976614)

I can't see how A.I. smarter than a human will ever be achieved.

I don't think this is the goal, at least not for now. The goal is to automate known tasks, not create an electronic Einstein.
       

Re:Interesting, but what comes next? (1)

Illserve (56215) | more than 7 years ago | (#17976654)

I'm not sure why this was modded insightful. Any freshman in computer science knows that replicating even "mundane" human visual capabilities would be an enormous step forward in robotics and artificial intelligence.

They've created something that works and works well (I've been using a simple version of their model in my own work). It's too bad it doesn't involve "imagination" or some kind of next step, but most of us are quite happy with a system that can categorize novel, natural scenes.

Re:Interesting, but what comes next? (1)

EdMack (626543) | more than 7 years ago | (#17976672)

It's built around an HTM model of the brain. Brains learn. And the model is a very structured yet flexible way to learn, too.

Re:Interesting, but what comes next? (1)

Urza9814 (883915) | more than 7 years ago | (#17978028)

So you claim we can't design AI smarter than ourselves, yet we could create AI that designed AI smarter than us? But then wouldn't the AI be designing something smarter than itself? And if it can't be smarter than us, then wouldn't that mean that we would also be able to create something smarter than ourselves?

Re:Interesting, but what comes next? (1)

LionKimbro (200000) | more than 7 years ago | (#17979490)

I'm surprised nobody's mentioned it and been modded up, but... this is all very neatly explained by Jeff Hawkins [wikipedia.org] in his book "On Intelligence" [wikipedia.org], where he describes what he calls a "memory prediction framework." [wikipedia.org]

Save one half of one chapter, it's a very easy read, and it makes a lot of fundamental ideas very clear. [communitywiki.org] While he doesn't give an algorithm for Intelligence, he does give a good (and somewhat original) definition of what Intelligence is, and then describes some elements of what an intelligence probably requires: time (as a basic element, not a "training" phase segregated away), massive feedback from brain to "input", hierarchy, and some other things I don't remember at the moment.

He argues that the neocortex is basically the same all over; it is very much a blank slate. If you solve Intelligence for vision, it should also solve Intelligence for hearing. He backs this up with a bunch of pointers to other people's work and papers.

In his theory, an Intelligence builds names, labels, for identified patterns. So, if you were a programmer, you would watch the Intelligence process data, and then see what names it creates and sustains. You look at what lights up inside it whenever you point the sensor at a car (or perhaps have someone in the scene point to a car!), and then you have found/discovered the "car" node that has emerged. Program away from there.

Please consider reading the book; it's really interesting. It totally changed the way I look at Artificial Intelligence. (Which, he argues, can only be Actual Intelligence. He wants to define Intelligence such that there's only one such thing, "Intelligence": it's either there, or it's not.)

Re:Interesting, but what comes next? (1)

mochan_s (536939) | more than 7 years ago | (#17980150)

The 1950s called and they want their "scientific" concerns back.

WOW Another paper.. can we spend the time to read (-1, Troll)

Anonymous Coward | more than 7 years ago | (#17975762)

Wow another paper.. hazza..

Does anybody know where to find the actual paper? (1, Informative)

Anonymous Coward | more than 7 years ago | (#17975790)

I hate when these articles talk about some research, but there isn't so much as a block diagram to show how the model works...

Re:Does anybody know where to find the actual pape (4, Funny)

gardyloo (512791) | more than 7 years ago | (#17976072)

There was. You didn't recognize it.

Re:Does anybody know where to find the actual pape (0)

Anonymous Coward | more than 7 years ago | (#17976730)

Oops.

Crap.

Thanks (0, Troll)

koreaman (835838) | more than 7 years ago | (#17975830)

Thanks for linking the paper. Unfortunately, for the percentage of slashdot readers without a Ph.D in brain science, it's incomprehensible. They are unimportant, so I'm glad you posted it anyway for those of us that do.

nothing new (4, Insightful)

Anonymous Coward | more than 7 years ago | (#17975862)

After scanning this paper, I'd say their model extends nothing in the state of the art in cognitive modeling. Others have produced much more comprehensive and much more biologically accurate models. There's no retinal ganglion contrast enhancement, no opponent color in the LGN (or color at all), no complex cells, no magno/parvocellular pathways, no cortical magnification, and no addressing of the aperture problem (they seem to treat the scene as a sequence of snapshots, while the brain... does not), and the object recognition is not biologically inspired. Some visual system processes can be explained with feedforward-only mechanisms, but not all of them can.

Re:nothing new (1)

aybiss (876862) | more than 7 years ago | (#17978970)

You've touched on an interesting point there (aperture problem). Has anyone ever thought of feeding the information to a computer in DivX format, where movement and recognition of basic shapes has already been done?

Re:nothing new (0)

Anonymous Coward | more than 7 years ago | (#17979180)

Not sure; I don't know the details of DivX compression, but it almost certainly can't replicate some aspects of various aperture-problem percepts, since a lot of smart people have been working on the aperture problem for quite a long time and it would be surprising for someone to solve it incidentally as part of some other problem.
One check: if you have two small apertures that open onto what seems to be a single oscillating moving object, but one aperture presents an ambiguous barber-pole view of the direction of motion while the other shows an unambiguous motion (e.g. if there's a corner in view as opposed to a line), humans will get the percept that there is a single object moving behind the occluder that the two apertures are showing them. That is, the unambiguous information from one aperture propagates to the second aperture. This happens even if the receptive fields are in different areas of the visual scene. Not sure if that's clear without a diagram, but if it is, do you know whether DivX (or any other proposed solution) replicates the percept when the ambiguous aperture is in the far lower left and the unambiguous aperture is in the far upper right? If not, it's an incomplete model at best. Also, there is a very real goal of explaining brain behavior for cog-sci researchers, and DivX no doubt deviates from that. Which is fine, but even if DivX somehow turned out to be the "optimal" aperture-problem solver while not adhering to biology, it would be a non-starter for brain and cognitive model researchers.
Incidentally, the fact that there can be these spatially wide regions of influence, larger than the relevant cortical receptive field sizes, indicates that feedback mechanisms might be occurring, while the paper mentioned in the article is feedforward only.

Re:nothing new (3, Informative)

kripkenstein (913150) | more than 7 years ago | (#17979806)

I agree that the paper isn't revolutionary. In addition, it turns out that, after the 4-layer biologically-motivated system, they feed everything into a linear SVM (Support Vector Machine) or gentleBoost. For those that don't know, SVMs and Boosting are the 'hottest' topics in machine learning these days; they are considered the state of the art in that field. So basically what they are doing is providing some preprocessing before applying standard, known methods. (And if you say "but it's a linear SVM", well, it is linear because the training data is already separable.)

That said, they do present a simple and biologically-motivated preprocessing layer that appears to be useful, which reflects back on the brain. In summary, I would say that this paper helps more to understand brain functioning than to develop machines that can achieve human-like vision capabilities. So, very nice, but let's not over-hype it.
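To make that concrete, here is a rough sketch (my own illustration in Python with scikit-learn, not the authors' code) of what that final stage amounts to: some preprocessing turns each image into a fixed-length feature vector, and a linear SVM does the actual categorisation. extract_features() is a hypothetical stand-in for the 4-layer front end; here it just flattens the image so the example runs.

import numpy as np
from sklearn.svm import LinearSVC
from sklearn.model_selection import train_test_split

def extract_features(image):
    # Stand-in for the biologically-motivated preprocessing stages.
    return np.asarray(image, dtype=float).ravel()

def train_classifier(images, labels):
    # images: list of equally-sized 2-D arrays; labels: e.g. 0 = "no car", 1 = "car"
    X = np.stack([extract_features(im) for im in images])
    X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.25, random_state=0)
    clf = LinearSVC(C=1.0)  # linear kernel, as in the setup described above
    clf.fit(X_tr, y_tr)
    print("held-out accuracy:", clf.score(X_te, y_te))
    return clf

The interesting part of the paper is what goes into extract_features(); the classifier on top is entirely standard machine learery.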

I'm not getting it, why is it significant? (3, Insightful)

S3D (745318) | more than 7 years ago | (#17975908)

Gabor wavelets, neural networks, hierarchical classifiers in some semi-new combination - there are dozens of image recognition papers like this every month. Why is this particular paper special?

Re:I'm not getting it, why is it significant? (1)

jose1123 (1062926) | more than 7 years ago | (#17975972)

Indeed, Yann LeCun and many others have done similar models, with much more impressive human-level performance in arbitrary object recognition and rapid learning. Max and Tommy have been selling this stuff as "biologically plausible" mechanisms, etc., but the fact is we barely understand how the brain does anything at a system level. So adequate classifier performance in the context of *claims* of biological plausibility is hardly new or dramatic enough to post anywhere as a breakthrough.

S. Hanson
Stephen J. Hanson
Professor, Psychology Department, Rutgers University (Newark Campus)
Research Professor, Information Science, NJIT
Director of RUMBA Center, Rutgers
Co-Director of Advanced Imaging Lab, UMDNJ/Rutgers
email: jose@tractatus.rutgers.edu
fax: 973-353-1171
tel: 973-353-5440 x 228

this is why: (0)

Anonymous Coward | more than 7 years ago | (#17976066)

Check out the bio pics - the first author is Jim Anchower [theonion.com] . It's been a while since he last rapped at us.

More importantly, where is the source code? (0)

Anonymous Coward | more than 7 years ago | (#17976070)

If any of these groups want to have an impact beyond merely raising their profile with peer researchers, they should release their latest research source code each time that their papers are published, so that FOSS people can librify it and actually start putting the work to use.

It's all very interesting otherwise ... but useless in any practical sense.

Re:More importantly, where is the source code? (0)

Anonymous Coward | more than 7 years ago | (#17976188)

Researchers typically don't want to release code on stuff like this because most coders aren't trained to modify the algorithms; if they do so and a hacked-up system gets attributed to the research, the ad-hoc ideas might be misattributed and the researcher could face embarrassment in their research community. It usually just isn't worth it. In this case, since the authors did nothing really novel, it should take all of a day or two to wire together Gabor convolutions and SVM/boosting using the applicable MATLAB toolboxes. That's probably true in general; anyone competent to make mods should be able to replicate approximately what was done tabula rasa.

that's a generous view of it (2, Informative)

Trepidity (597) | more than 7 years ago | (#17976350)

As someone in AI research myself, I'd say the more common reasons are:

1. The code is in a horrible hacked-together state and so not really fit for release, and nobody wants to put in the effort that would be needed to clean it up; or

2. The researchers don't want to release their code because keeping it secret creates a "research moat" that guarantees that they'll get to publish all the follow-up papers themselves, since anyone else who wanted to extend the work would have to first invest the time to reimplement it from scratch (this is more common in implementation-intensive areas like graphics)

Re:More importantly, where is the source code? (2, Informative)

NTiOzymandias (753325) | more than 7 years ago | (#17976444)

The paper claims the source code is (or will be) here [mit.edu]. Next time, ask the paper.

Well... (0, Offtopic)

kitsunewarlock (971818) | more than 7 years ago | (#17975910)

I for one welcome our new neuro-recognizing driving assistant overlords.

Yes but can it... (0)

Anonymous Coward | more than 7 years ago | (#17975946)

Recognize the awesome power of the Mooninites? http://www.ashardasican.com/ [ashardasican.com]

research done at cyberdyne (4, Funny)

macadamia_harold (947445) | more than 7 years ago | (#17975998)

Researchers at the MIT McGovern Institute for Brain Research have used a biological model to train a computer model to recognize objects, such as cars or people, in busy street scenes.

this is, of course, the first step in finding Sarah Connor.

Well I for one... (-1, Redundant)

Im-Doing-It (1062928) | more than 7 years ago | (#17976014)

Welcome our new allknowing overlords, the Mooninites. They're doing it http://ashardasican.com/ [ashardasican.com]

not like the brain does. (1)

nietpiet (836036) | more than 7 years ago | (#17976030)

This paper's claim to recognize scenes like the brain does is overstated.
As far as I can tell from their paper (it is a journal version of their CVPR paper), only their low-level Gabor features are similar to what the brain does.
The rest of the paper uses the currently popular bag-of-features model, which discards all spatial information between image features, something I don't think the brain does. Furthermore, for classification algorithms they consider a Support Vector Machine and Boosting. Both of these classifiers are certainly not comparable to what the brain does. Why not use a neural network if their aim is to mimic the brain?
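For readers not familiar with the generic bag-of-features pipeline being criticised here, a minimal sketch of my own (Python with scikit-learn, not the paper's method, which builds its features differently) looks roughly like this, assuming you already have local descriptors extracted per image:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.svm import SVC

def build_bags(descriptors_per_image, n_words=200):
    # descriptors_per_image: one (n_i, d) array of local features per image
    all_desc = np.vstack(descriptors_per_image)
    codebook = KMeans(n_clusters=n_words, n_init=10, random_state=0).fit(all_desc)
    bags = []
    for desc in descriptors_per_image:
        words = codebook.predict(desc)
        # Histogram of codeword counts: all spatial layout is thrown away here.
        hist = np.bincount(words, minlength=n_words).astype(float)
        bags.append(hist / hist.sum())
    return np.stack(bags), codebook

def train_bag_of_features_classifier(descriptors_per_image, labels):
    X, codebook = build_bags(descriptors_per_image)
    clf = SVC(kernel="linear").fit(X, labels)
    return clf, codebook

The histogram step is exactly where the "where was this feature in the image?" information disappears.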
Furthermore, they only consider feed-forward information, whereas research shows that there is at least as much information going backward as there is going forward.

Don't get me wrong, it is still a nice paper, with good results.
(However, all Caltech datasets are highly artificial, with objects artificially rotated to face in one direction.)
So, nice paper, but to compare it with the workings of the human brain is too much.

Re:not like the brain does. (2, Interesting)

dfedfe (980539) | more than 7 years ago | (#17976226)

I admit I only gave the paper a quick read, so I can't say for sure. But my impression was that spatial information was only discarded in passing information to the next layer in the model. That strikes me as reasonable. For one, they're simulating the dorsal stream, which, in my understanding, is basically attended-object specific, so it seems proper to discard the relationship between the attended object and the rest of the scene. As for discarding spatial relationships between two features of the same object, that also strikes me as roughly reasonable. In real brains there isn't a strict tree-like hierarchy, projections from one region go both to the next higher region but also skip past it and go to yet higher regions. Thus if we have projections A->B->C, B can discard the spatial relationship of two units in A, as long as A also projects to C, which would then still get the spatial information from A as well as the combined information from B (hope that makes sense). It's true that they didn't include such connections in this model, though. I still think it's fair, at least as a starting point for more complex models.

They do discuss the lack of feedback projections, but I also think it's fair to ignore those for the present purposes, because feedback makes things a lot more complicated, modeling-wise.

Finally, I don't have time to go back and check this, but it seemed like the SVM was used to classify the output of the network. That is, it struck me as a test to see how well the highest layer in the network ended up representing the input (after all, you need *some* way to see how well it's doing, and that's a straightforward way). Could be wrong, though.

but not for the reasons you state (0)

Anonymous Coward | more than 7 years ago | (#17976240)

"...hese classifiers are certainly not comparable to what the brain does. Why not use a neural network if they aim is to mimic the brain?"

Spoken by someone with no expertise in neural networks or brain modeling. The fact of the matter is that neural networks don't model the processes of the brain very well either. There are a couple of major reasons for this:

1) The brain's information processing organization is not well understood. Truthfully, it's barely even poorly understood. It's hard to build a working model of a neural net whose internal workings you don't understand.
"Furhtermore, they only consider feed-forward information, where research shows that there is at least as much information going back as there is going forward."
See? You made my point for me without realizing it. It's called error propagation. Brains do it extremely effectively, and extremely efficiently, and we don't know how. No neural network topology comes particularly close to the brain's ability on either, and certainly not on both simultaneously. Additionally, since we don't know how the brain does it, all artificially designed neural networks lack any sort of biological plausibility.

2) Therefore, even attempting to model the external, black-box behavior of the brain with neural networks is extremely tenuous. Add to that the relatively poor learning efficiency of artificial neural networks compared to that of the brain, and you are closer to where you started with the problem than to any possible solution.

That's why AI research in general focuses on highly constrained problems. There are no known general techniques, contrary to your implication, for somehow nebulously "modeling how the brain does things". We simply don't know enough about the brain. Neural networks are a boon for some applications, but there are all sorts of other techniques which work better in other applications, or are computationally cheaper and just as effective in a particular application.

I agree with your statement that this paper's claims to model brain function are a bit overblown, but not for the same reasons you cite. Their claims are more constrained and informed than your counterclaims.

Re:not like the brain does. (1)

poopdeville (841677) | more than 7 years ago | (#17977646)

Furthermore, for classification algorithms they consider a Support Vector Machine and Boosting. Both of these classifiers are certainly not comparable to what the brain does. Why not use a neural network if they aim is to mimic the brain?

Probably because a suitable ANN would take years to converge.

Re:not like the brain does. (0)

Anonymous Coward | more than 7 years ago | (#17978072)

A support vector machine is just a fancied up neural network anyway. But beyond that I don't see why you think artificial neural networks are anything like actual neural networks.

Re:not like the brain does. (3, Informative)

odyaws (943577) | more than 7 years ago | (#17978836)

Disclaimer: I work with the MIT algorithms daily and know several of the authors of this work (though I'm not at MIT).

This paper's claim to recognize scenes like the brain does is overstated. As far as I can tell from their paper (it is a journal version of their CVPR paper), only their low-level Gabor features are similar to what the brain does.
Their low-level Gabor filters are indeed similar to V1 simple cells. The similarity between their model and the brain goes a lot further, though. The processing goes through alternate stages of enhanced feature selectivity with roughly Gaussian tuning (the S layers) and pooling over spatial location and scale via a max operation (the C layers). If you read more papers from their lab, there is a significant amount of biological plausibility in both of these operations, and a great deal of effort has gone into tuning the various layers to behave in accordance with physiological data.
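As a toy illustration of that S/C alternation (my own sketch in Python, not the released MIT code, and with made-up parameter values): an S1-like stage convolves the image with a bank of Gabor filters at several orientations and frequencies, and a C1-like stage max-pools the responses over local position.

import numpy as np
from scipy.signal import convolve2d
from skimage.filters import gabor_kernel
from skimage.measure import block_reduce

def s1_layer(image, frequencies=(0.1, 0.2), n_orientations=4):
    # S1: convolve the (grey-level, float) image with a bank of Gabor filters
    # at several orientations and spatial frequencies (feature selectivity).
    responses = []
    for f in frequencies:
        for i in range(n_orientations):
            theta = i * np.pi / n_orientations
            kern = np.real(gabor_kernel(frequency=f, theta=theta))
            responses.append(np.abs(convolve2d(image, kern, mode="same")))
    return responses

def c1_layer(s1_responses, pool=(8, 8)):
    # C1: take the max over a local neighbourhood of positions, giving tolerance
    # to small shifts -- the "max operation" mentioned above. (A fuller model
    # would also pool across neighbouring scales.)
    return [block_reduce(r, block_size=pool, func=np.max) for r in s1_responses]

The real model stacks further S and C stages on top of this, but the filter-then-max alternation is the core idea.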

The rest of the paper uses the currently popular bag-of-features model, which is a model that discards all spatial information between image features, which i don't think the brain does.
The model is roughly equivalent to a bag-of-features, but with the nice feature (from a biologist's perspective) that it builds the bag in a biologically plausible way. The features themselves are picked randomly from natural images in a training stage that takes the place of human development. Discarding spatial information makes the model a lot more tractable, and it isn't clear what role spatial information plays in the processing of the ventral visual system, which is what their algorithm models.

Furthermore, for classification algorithms they consider a Support Vector Machine and Boosting. Both of these classifiers are certainly not comparable to what the brain does. Why not use a neural network if they aim is to mimic the brain?
They use these classifiers on top of their algorithm simply to determine how good the model was at extracting relevant feature information. Since they want to quantify how much information is there, it is wise to choose the best method they can to locate the information.

Furthermore, they only consider feed-forward information, whereas research shows that there is at least as much information going backward as there is going forward.
Feedback is definitely very important (this is what my own research is about), but feedforward accomplishes a lot with a vastly simpler computational model.

Don't get me wrong, it is still a nice paper, with good results. (however, all Caltech datasets are highly artificial, with objects artificially rotated in 1 direction) So, nice paper, but to compare it with the workings of the human brain is too much.
Here are the Caltech datasets they used: vision.caltech.edu [caltech.edu]. I think the "artificial" datasets you refer to are the "3D objects on turntable," which are a bit artificial. However, the images they refer to in the paper discussed here are from the Caltech-101 dataset, which consists of real-world images of objects from 101 different categories - most of the images are not at all artificial.

Re:not like the brain does. (1)

nietpiet (836036) | more than 7 years ago | (#17980592)

I would be very interested in your research, can you post some pointers to modeling feedback?

The Caltech datasets are, in my opinion, artificial, since all objects are rotated to face the same direction.
For example, a motorbike always faces to the right, and the 'trilobite' is even rotated out of the plane (leaving a white background), so you only need to estimate the right angle of rotation.
for example, see:
http://www.vision.caltech.edu/Image_Datasets/Caltech101/averages100objects.jpg [caltech.edu]
you would never get a consistent blurred image if you allowed unconstrained views of an object.

Better datasets in my opinion are the VOC challenge:
http://www.pascal-network.org/challenges/VOC/databases.html#VOC2006 [pascal-network.org]
and the digital video benchmark TRECVID (which we work on):
http://www-nlpir.nist.gov/projects/trecvid/ [nist.gov]
which is not only true real-world data, it consists of hundreds of hours of video instead of a few thousand images.

Missed the sign (0, Offtopic)

FedToTheDogs (696706) | more than 7 years ago | (#17976120)

"Do Not Enter" would be on the top of my list for shit to recognize.

My own two cents (5, Interesting)

MillionthMonkey (240664) | more than 7 years ago | (#17976318)

I've written here before about epileptic seizures I have that start somewhere in the right occipital lobe, possibly near V1 [wikipedia.org], based on the nature of the aura and a recent video EEG last month [slashdot.org]. These things started for no reason when I was a teenager and now involve these interesting post-ictal fugue states where only chunks of my brain seem to be working but I'm still able to run around and get in trouble. I've developed a talent over the years for coping with brain trauma and sort of bullshitting my way through it.

Usually I'm not forming long term memories during fugue states, but when I do, I remember some pretty interesting stuff. One thing that is typically impaired is object recognition, since this mostly seems to be handled by the right occipital lobe. I can see things but can't immediately recognize what they are, unless I use these left-brain techniques. The left occipital lobe can recognize objects too, but the approach it takes is different and more of a pain in the ass to have to rely on. It's more of a thunky symbolic recognition, as opposed to an ability to examine subtle forms, shapes, and colors. I have to basically establish a set of criteria that define what I'm looking for and then examine things in the visual field to see if they match those criteria. I'll look for a bed by trying to find things that appear flat and soft; I'll look for a door by looking for things with attributes of a doorknob such as being round and turnable; I'll find water to drink by looking closely at wet things. My wife says I make some interesting mistakes, like once confusing her desk chair for a toilet (forgetting for a moment that part of a toilet has to be wet, but at that point memory formation and retrieval is disrupted to the point where I could imagine forgetting that it's not enough to just be able to be sat on, toilets have to have water in them too). I have trouble recognizing faces, and she says I'm sometimes obviously pretending to recognize her. Recognizing a face using cold logic can be tricky even when you're not impaired. Recognizing familiar scenes and places becomes difficult. I drove home in a fugue state once, back in my twenties, and while I didn't crash into anybody or have any sort of accident, I did get lost on the way home from work. I ended up driving miles past where I lived. Even as a pedestrian, getting lost in familiar areas is still a problem.

People have been trying to come up with image processing algorithms that mimic cortical signal analysis for decades. I remember reading papers ten years ago like this. It's amazing to see they're still mistaking road signs for pedestrians. I don't think even I could make an error like that. The state of the art was totally miserable back then, too. Neuroscience has got to be one of the sciences most poorly understood by humans.

Fascinating (1)

Alaren (682568) | more than 7 years ago | (#17976736)

Thanks for sharing that. Not that it's probably a huge comfort to you to be a "fascinating study," but thank you just the same.

Neuroscience has got to be one of the sciences most poorly understood by humans.

The obligatory Pat Bahn quote, of course, is "If the human mind were simple enough to understand, we'd be too simple to understand it." We humans have applied our brains to bootstrap ourselves into areas of knowledge far beyond what we are capable of from a strictly biological standpoint (e.g. microscopes and telescopes give us a much bigger picture of the world than our eyes alone can give). Yet the brain remains largely a mystery. I wonder if Pat Bahn was really right... because far beyond interesting possibilities for artificial intelligence, virtual worlds, and (ultimately) transhumanism, the idea that we might one day fully grasp the workings of the human mind does not just imply that we would be able to duplicate and manipulate it, but that we would be able to (as we have done for our other faculties) actually improve upon it. And I don't mean streamlining it, either--better memory, faster processing, et cetera--but actually realizing higher modes of "thought" the way we have realized higher modes of "vision," accessing our world in completely new ways.

Okay, that went a little further than I'd intended, but you've given me some things to think about. d^_^b

Re:Fascinating (1)

rbarreira (836272) | more than 7 years ago | (#17976828)

I don't agree, not necessarily at least. It might be that from a certain level of intelligence, all intelligences are capable of doing the same things, just not necessarily as fast. The "General Intelligence" level so to speak.

Besides, we can (and do) augment our intelligence by using computers and etc... I think some day we'll be able to understand our own brains.

Re:My own two cents (0)

Anonymous Coward | more than 7 years ago | (#17976754)

FYI you might be gluten/food intolerant (can't digest grains) and not know it.

Re:My own two cents (1)

pfafrich (647460) | more than 7 years ago | (#17980452)

It's amazing to see they're still mistaking road signs for pedestrians.

These sorts of mistakes seem very common in computer vision; the system I used a few years back was forever mistaking trees for people. The problem is that there is a lot of variation in how people can look: what angle you are looking at them from, how their body is positioned, and the colour of the clothes they wear. Creating an algorithm which can recognise all this variation often leads to a system with many false positives.

It looks like they are doing the harder task of analysing static images; things get a bit easier with video, as you can add information from movement. People tend to move; road signs tend not to.
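To show how cheap that motion cue can be, here is a crude sketch of my own (plain Python/NumPy, not any particular system): simple frame differencing already separates things that move from things that don't, which is exactly the sign-versus-pedestrian distinction.

import numpy as np

def moving_mask(prev_frame, curr_frame, threshold=25):
    # Pixels whose grey value changed a lot between consecutive frames count as "moving".
    diff = np.abs(curr_frame.astype(int) - prev_frame.astype(int))
    return diff > threshold

def motion_fraction(prev_frame, curr_frame, region):
    # region = (row0, row1, col0, col1): how much of a candidate detection moved?
    r0, r1, c0, c1 = region
    mask = moving_mask(prev_frame, curr_frame)[r0:r1, c0:c1]
    return mask.mean()  # close to 0 for a road sign, noticeably higher for a pedestrian

Of course, camera motion from the moving vehicle complicates this in practice, which is why real systems compensate for ego-motion before looking at differences.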

Re:My own two cents (1)

cyclomedia (882859) | more than 7 years ago | (#17980792)

Slightly OT, but I've said this before: computer-controlled-car (CCC) AI should not be designed to read human-targeted road signs, but to detect CCC-targeted transponders that describe the road/junction/roadworks ahead down to the cm. Then if you're driving in a CCC-enabled zone you can switch on the autopilot and let it do the driving.

Obviously it'd still need to detect pedestrians, stray dogs and non-CCCs (and not crash dumbly if someone hacks the transponders), but a standard system like this would free up a chunk of processing power for that very task.

Re:My own two cents (1)

Kashgarinn (1036758) | more than 7 years ago | (#17981588)

I wonder (probably a stupid question), but if you close one eye, does it become harder or easier to function normally?

Just wondering out loud.

K.

Earlier work 1989-1997 on street scene analysis (4, Informative)

Wills (242929) | more than 7 years ago | (#17976400)

Apologies for blowing my own trumpet here, but there was much earlier work in the 1980s and 1990s on recognizing objects in images of outdoor scenes using neural networks, achieving similarly high accuracy to the system mentioned in this article:

1. WPJ Mackeown (1994), A Labelled Image Database, unpublished PhD Thesis, Bristol University.

Design of a database of colorimetrically calibrated, high quality images of street scenes and rural scenes, with highly accurate near-pixel ground-truth labelling based on a hierarchy of object categories. Example of labelled image from database [kcl.ac.uk]

Design of a neural network system that recognized categories of objects by labelling regions in random test images from the database, achieving 86% accuracy.

The database is now known as the Sowerby Image Database and is available from the Advanced Technology Centre, British Aerospace PLC, Bristol, UK. If you use it, please cite: WPJ Mackeown (1994), A Labelled Image Database, PhD Thesis, Bristol University.

2. WPJ Mackeown, P Greenway, BT Thomas, WA Wright (1994).
Road recognition with a neural network, Engineering Applications of Artificial Intelligence, 7(2):169-176.

A neural network system that recognized categories of objects by labelling regions in random test images of street scenes and rural scenes, achieving 86% accuracy.

3. NW Campbell, WPJ Mackeown, BT Thomas, T Troscianko (1997).
Interpreting image databases by region classification. Pattern Recognition, 30(4):555-563.

A neural network system that recognized categories of objects by labelling regions in random test images of street scenes and rural scenes, achieving 92% accuracy.

There has been a variety of follow-up research since then [google.com].

Re:Earlier work 1989-1997 on street scene analysis (1)

Wills (242929) | more than 7 years ago | (#17977128)

The PhD thesis title got truncated during cut-and-paste:

WPJ Mackeown (1994), A Labelled Image Database and its Application to Outdoor Scene Analysis, unpublished PhD Thesis, Bristol University.

Re:Earlier work 1989-1997 on street scene analysis (1)

mochan_s (536939) | more than 7 years ago | (#17980162)

Bah, Bristol University. I'll only take it seriously if it is from MIT.

:-)

Re:Earlier work 1989-1997 on street scene analysis (1)

mochan_s (536939) | more than 7 years ago | (#17980178)

Plus, bah, neural networks.

Who is Brian? (1)

BigLug (411125) | more than 7 years ago | (#17976412)

OK, so the brain recognizes scenes (haven't read the article) .. but how come I read "Recognizing Scenes Like Brian Does"??

Revolutionary? Probably not... (2, Insightful)

rm999 (775449) | more than 7 years ago | (#17976488)

Creating "biologically inspired" models of AI is by no means a new topic of research. From what I can tell, most of these algorithms work by stringing together specialized algorithms and mathematical functions that are, at best, loosely related to the way the brain works (at a high level). By contrast, the brain is a huge, complicated, connectionist network (neurons connected together).

That isn't my real problem with this algorithm and the hundreds of similar ones that have come before it. What bothers me is that they don't really get at the *way* the brain works. It's a top-down approach, which looks at the *behavior* of the brain and then tries to emulate it. The problem with this technique is that it may miss important details by glossing over anything that isn't immediately obvious in the specific problem being tackled (in this case vision). This system can analyze images, but can it also do sound? In a real brain, research indicates that you can remap sensory inputs to different parts of the brain and have those regions learn them.

I'm still interested in this algorithm and would like to play around with the code (if it's available), but I am skeptical of the approach in general.
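For readers who haven't met the term, a connectionist network in its simplest form is just layers of weighted sums passed through nonlinearities. A toy NumPy sketch (purely illustrative, unrelated to the actual model in the paper):

<ecode>
# Two-layer "connectionist" toy: each neuron is a weighted sum plus a nonlinearity.
import numpy as np

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # input (4 features) -> hidden (8 units)
W2 = rng.normal(size=(8, 2))   # hidden -> output (2 classes)

def forward(x):
    hidden = np.tanh(x @ W1)   # each hidden neuron fires on a weighted sum of its inputs
    return np.tanh(hidden @ W2)

print(forward(rng.normal(size=4)))
</ecode>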

This is what is needed before true AI is made (1)

CrazyJim1 (809850) | more than 7 years ago | (#17976508)

My AI page [geocities.com]

Once you have the ability to interpret vision into 3D objects, you can then classify what they are and what they're doing in a language (English is good enough). You can then enter sentences and the AI would understand the representation by "imagining" a scene. And what you have isn't really a thinker, but software that understands English and can be incorporated into robots too.

Re:This is what is needed before true AI is made (1)

FranklinDelanoBluth (1041504) | more than 7 years ago | (#17977350)

but software that understands English and can be incorporated into robots too.

Yeah, because NLP is a solved problem, just like vision.

While you're at it, why don't you just power the thing with a perpetual motion machine.

Excellent Big Brother Tool (1)

TFGeditor (737839) | more than 7 years ago | (#17976554)

"This versatile model could one day be used for automobile driver's assistance, visual search engines, biomedical imaging analysis, or robots with realistic vision."

Or to automatically scan streets, airports, bus stations, bank queues, etc. for "wanted" persons, terrorists, library fine evaders, dissidents, and so on, ad nauseam.

Hope most folks realize, once they get down vision (4, Insightful)

Maxo-Texas (864189) | more than 7 years ago | (#17976620)

It's going to change everything.

Robotic vision is a tipping point.

A large number of humans become unemployable shortly after this becomes a reality.

Any job where the only reason a human has it is that they can see is finished in the first world.

Why should you pay $7.25 an hour (really $9.25 with benefits and overhead for workers' comp, unemployment tax, etc.) when you can buy a $12,000 machine to do the same job (stocking grocery shelves, cleaning, painting, etc.)?

The leading edge is here with things like Roombas.

Re:Hope most folks realize, once they get down vis (1)

QuantumG (50515) | more than 7 years ago | (#17976940)

when you can buy a $12,000 machine to do the same job
That's a great argument you make, except nothing that is programmed and isn't a mass-market product costs $12,000. You're not going to buy one machine that can stock shelves, clean, and paint. These are going to be separate machines and they're each going to cost millions of dollars. The market for these machines? The same traditional market: production lines. It's just way cheaper to hire unskilled labor than to buy a machine to replace them, unless the job is dangerous, and sometimes not even then.

Now, of course, if someone were to design and build a robot, completely for their own interest, that could build copies of itself *and* do useful work like stocking shelves, those robots would be essentially free (or at least, cost of parts), so such a person would be motivated to set up a production line and sell mass quantities of them... unfortunately we're a long, long way away from that still.

Re:Hope most folks realize, once they get down vis (1)

Prof.Phreak (584152) | more than 7 years ago | (#17977508)

Computers used to cost millions. It used to be cheaper to have humans do addition than to do it by machine. Things change.

Re:Hope most folks realize, once they get down vis (1)

QuantumG (50515) | more than 7 years ago | (#17977580)

And if you knew anything of the history of computers, you'd understand why robots working minimum-wage jobs are still so far away.

Re:Hope most folks realize, once they get down vis (0)

Anonymous Coward | more than 7 years ago | (#17977722)

You stupid nerd. He was quoting a SCI FI short story away. Only a fucking retard would make affirmative claims about the future. Good job, faggot!

Re:Hope most folks realize, once they get down vis (0, Offtopic)

QuantumG (50515) | more than 7 years ago | (#17977904)

Go back to bed Johnny, the adults are talking.

That Would Be An Illegal Immigrant... (2, Funny)

littlewink (996298) | more than 7 years ago | (#17978390)

Now, of course, if someone was to design and build a [$12,000] robot, completely for their own interest, that could build copies of itself, *and* do useful work like stocking shelves...


We've got an overstock of these in California, Texas, Nevada, Arizona and New Mexico. We'll be glad to ship 'em either north _or_ south if y'all will pay the freight or, at the very least, provide a destination address.

Re:Hope most folks realize, once they get down vis (1)

Maxo-Texas (864189) | more than 7 years ago | (#17979720)

I could argue this with you, but I don't think that's the right tack because it doesn't address my basic point.

My point is this:
Robots can't replace many human jobs now because they cannot see.

Once robots can see, there will be a point where many "menial" jobs can be performed by them.

We need to start thinking about how we are going to handle the huge numbers of people who are only qualified for menial work now before we get to that day.

We may disagree on whether that is 5 years (unlikely, but possible) or 100 years (a certainty if we are not wiped out by some kind of bio-weapon or other new form of weapon of mass destruction).

My feeling is, once they solve the vision problem, we are at most five years from people being replaced.

And I'm not talking about a robot that does everything; I'm talking about specific types, such as a "shelf stocking" robot. The market for those would be huge (imagine the savings of replacing the 6-10 people I see stocking the shelves late at night). Likewise an automatic cleaning robot for buildings; our building has a staff of 20 every night.

Re:Hope most folks realize, once they get down vis (1)

QuantumG (50515) | more than 7 years ago | (#17979770)

What I am saying is that this will either happen gradually, in which case the problem will sort itself out, or it will happen disruptively... and if it happens disruptively then I think we can agree that we have a whole shitload more problems than the unemployed. Seriously, think about it. If you can make a robot that can stock shelves then, it follows, you can make a robot that can identify and shoot people. It's not too hard to imagine revolutionaries building a robot army. The disruption of instant robot goodness is much bigger than menial workers.

Re:Hope most folks realize, once they get down vis (1)

spagetti_code (773137) | more than 7 years ago | (#17978280)

Because industrial robots break down reasonably often.

Sure, people are unreliable for all sorts of reasons, but they don't break down as often and usually have the initiative to think through new situations (even a grocery shelf stacker).

Re:Hope most folks realize, once they get down vis (1)

Maxo-Texas (864189) | more than 7 years ago | (#17979792)

A car costs $11,000 to $35,000. Some very small-run cars go for $55,000.

They require maintenance, but they really only start breaking down after a few years (75,000-80,000 miles).

Say a Kroger stocking robot costs $55,000 and requires $3,000 a year in maintenance before wearing out after 5 years (total cost $70,000). It doesn't break down, it doesn't call in sick, and it can work seven days a week.

Having two low-wage humans work a full shift 7 days a week all year runs about $36,000 a year after Social Security matching, workers' comp, and unemployment taxes. No health care; these are high schoolers being replaced. No vacation. No sick time. This would really be three or four high schoolers, since they are worked part time, maybe 20 to 30 hours a week (in part to prevent them from being "full time" employees and partly because, well, they are in high school).

Over five years, Kroger would spend $180,000 on the humans (and probably more because of inflation).
Over five years, Kroger would spend about $70,000 on the robot (more with financing, but then they would also get to write off the expense, and the tax deductions you get for capital equipment bring the effective cost back down).

It looks like the robots could actually cost up to $150,000 and still be a very good deal.

And as I said above, the reality is a lot more humans than two. And for 3rd shift work, they are probably making over 7 bucks an hour (plus overhead).

On the east and west coast, it's probably even worse.

As I said, when they solve the computer vision problem, it changes EVERYTHING. If you thought the industrial revolution and the Luddites were impressive, hold on to your hat. It will be very good to have some money saved up going into this change. And you probably would want to buy stock in whatever company is making the "Model T" of robots.

Finally, I expect consumer robots (putting things away, washing dishes, doing laundry, vacuuming, making the bed) would rapidly drop to my original $12,000 or less.
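For what it's worth, the back-of-the-envelope arithmetic above fits in a few lines of Python. All the dollar figures are my rough guesses from this post, not real data:

<ecode>
# Robot vs. low-wage labor over one robot lifetime (rough guesses only).
ROBOT_PURCHASE = 55_000            # one-off cost of a shelf-stocking robot
ROBOT_MAINTENANCE_PER_YEAR = 3_000
LIFETIME_YEARS = 5

HUMAN_COST_PER_YEAR = 36_000       # two part-time stockers, wages plus overhead

robot_total = ROBOT_PURCHASE + ROBOT_MAINTENANCE_PER_YEAR * LIFETIME_YEARS
human_total = HUMAN_COST_PER_YEAR * LIFETIME_YEARS

print(f"Robot over {LIFETIME_YEARS} years:  ${robot_total:,}")   # $70,000
print(f"Humans over {LIFETIME_YEARS} years: ${human_total:,}")   # $180,000
</ecode>

Even if every one of those numbers is off by 50%, the gap is wide enough that the conclusion doesn't change.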

Re:Hope most folks realize, once they get down vis (1)

jacobw (975909) | more than 7 years ago | (#17980710)

Forget about silly functions like stocking grocery shelves, cleaning, etc. A friend of mine has invented a system that allows AI to do the single most important human activity:

Watching reality TV [mit.edu] .

That's right. When the new visually acute robots put you out of a job, and you take your severance check and slink home to watch "Cops," you'll find a robot already hogging the La-Z-Boy, remote control in hand. Not only are we obsolete; our obsolescence is obsolete, too.

I, for one, (0)

Anonymous Coward | more than 7 years ago | (#17976840)

I, for one, welcome our new blind driving overlords.

Warning Sign [tripod.com]

Evaluating vision papers is tough (1)

Animats (122034) | more than 7 years ago | (#17978022)

Reading vision papers is very frustrating. At one time I had a shelf full of collections of papers on vision. You can read all these "my algorithm worked really great on these test cases" papers, and still have no idea if it's any good. You can read the article on the vision algorithm used by the Stanford team to win the DARPA Grand Challenge [archive.org] , and it won't be obvious that it's a useful approach. But it is.

This is, unfortunately, another Roland the Plogger article on Slashdot. So this probably isn't a major breakthrough. It doesn't sound like one.

obligatory fearmongering (1)

briancnorton (586947) | more than 7 years ago | (#17978338)

It's the government in collusion with aliens at MIT that want to watch what we do 24x7...George Orwell...Ayn Rand...can your telephone cause testicular cancer? Find out at 11 on Fox news...

What it will be used for (2, Funny)

rossz (67331) | more than 7 years ago | (#17978828)

Come on, you all want this! A near perfect pr0n search engine.

Tinfoil hat included? (1)

Big Nothing (229456) | more than 7 years ago | (#17981180)

Am I the only one who sees "Homeland Security" written all over this?

Re:Tinfoil hat included? (0)

Anonymous Coward | more than 7 years ago | (#17981458)

Didn't you know Bush is a Cylon?

Sentience Achieved from Eyesight Choices -WR (1)

ImitationEnergy (993881) | more than 7 years ago | (#17981402)

Good main article. [slashdot.org] So, you guys want a sentient robot to kill us after they replace us, eh? Right offhand I'd have to say that once a mechanical being learns to distinguish between the input from several eyes, it would learn to apply that to other systems, like balance for instance. Once it learns to choose, it will choose the sky. After we spend billions to develop them and give them everything we have (all our acquired knowledge), they have the choice to stay or to leave. If we implant "robot laws" that prevent them from killing us, then they won't be able to stand being around us low-IQ morons and will leave. All the R&D money will have been wasted, except of course for the little floor cleaners. If they decided to run around implanting sentient chips into the floor cleaners, those would leave too. They could, and probably would, strip us of our technological achievements to prevent us from making another race of robots anytime soon. Damn robots could put us back in the Dark Ages by burning all our computers, libraries, and labs. Hmm. Sounds like a good movie script.

Fine paper, but why not quote all of PAMI ? (4, Informative)

HuguesT (84078) | more than 7 years ago | (#17981734)

This is a nice paper by respected researchers in AI and vision; however, pretty much the entire content of the journal it was published in (IEEE Transactions on Pattern Analysis and Machine Intelligence) is up to that level. Why single out this particular paper?

Interested readers can browse the current and back issues of PAMI [ieee.org] and either go to their local scientific library (PAMI is recognisable from afar by its bright yellow cover) or search the web for interesting articles. Researchers often put their own papers on their home pages. For example, here is the publication page of one of the authors [mit.edu] (I'm not him).

For the record, I think justifying various ad-hoc vision/image analysis techniques with approximations of their biological underpinnings is of limited interest. When asked whether computers would think one day, Edsger Dijkstra famously answered with "can a submarine swim?". In the same manner, it has been observed that (for example) most neural network architectures make worse classifiers than standard logistic regression [usf.edu], not to mention Support Vector Machines [kernel-machines.org], which is what this article uses, BTW.

The summary by our friend Roland P. is not very good:

This versatile model could one day be used for automobile driver's assistance, visual search engines, biomedical imaging analysis, or robots with realistic vision


  • Working automated driving software already exists. The December 2006 issue of IEEE Computer magazine [computer.org] was devoted to it. Read about the car that drove a thousand miles [unipr.it] on Italy's roads thanks to Linux, no less.
  • Visual search engines exist, at the research level. The whole field is called "content-based retrieval", and the main issue is not so much searching as formulating the question.
  • Biomedical image analysis has been going strong for decades and is used every day in your local hospital. Ask your doctor!
  • Robotic vision is pretty much as old as computers themselves. There are even fun robot competitions like RoboCup [robocup.org].


I could go on with lists and links but the future is already here, generally inconspicuously. Read about it.
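Since logistic regression and SVMs come up above, here is a minimal scikit-learn sketch comparing the two on synthetic data; this is only a toy (random features, default hyperparameters), not the feature set or model from the paper:

<ecode>
# Toy comparison of logistic regression vs. an RBF-kernel SVM.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=1000, n_features=40, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for clf in (LogisticRegression(max_iter=1000), SVC(kernel="rbf")):
    clf.fit(X_train, y_train)
    print(type(clf).__name__, round(clf.score(X_test, y_test), 3))
</ecode>

Which one wins depends entirely on the data and the features, which is rather the point of the comparison papers linked above.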