Google Using ReCAPTCHA To Decode Street Addresses

timothy posted more than 2 years ago | from the you-are-the-crowd-being-sourced dept.

Google 104

smolloy writes "Apparently some users of reCAPTCHA have recently begun seeing photographs appear in their CAPTCHA puzzles — photos that look very much like zoomed in house numbers taken from Google Streetview. It appears that Google has decided to put the reCAPTCHA system to help clean up Google streetview images, and 'according to a Google spokesperson, the system isn't limited to street addresses, but also involves street names and even traffic signs.' A large collection of these has appeared on the Blackhatworld website."

104 comments

I'm a Microsoft whore (-1)

Anonymous Coward | more than 2 years ago | (#39515491)

Awesome. More invasion of privacy. Fuck Google.

bonch

Re:I'm a Microsoft whore (4, Insightful)

nedlohs (1335013) | more than 2 years ago | (#39515519)

Yeah, because those street numbers, designed to tell everyone passing by what number the house is on the street, are meant to be private.

Re:I'm a Microsoft whore (1)

Anonymous Coward | more than 2 years ago | (#39515575)

Don't feed the trolls... (but you're right, though)

Re:I'm a Microsoft whore (-1)

icebike (68054) | more than 2 years ago | (#39515961)

The odd part is most of these numbers and signs are automatically made blurry when you try to look at them to get around the "Address is Approximate" in google streetview.

The summary says:

It appears that Google has decided to put the reCAPTCHA system to help clean up Google streetview images,

Yet Google would have to know what the address numbers really were in order to validate the reCAPTCHA, so that can hardly be why they are doing it. They don't need to crowd source an answer that they already know.

Using a zoomed house number from some obscure place in the world seems to be a non issue, other than having a virtually unlimited source of images.
But unless Google is paying a zillion people to validate these images visually, all this says is they trust their algorithm of number extraction enough that they can use it in production. And if they have an algorithm that good, it's just one more proof of the fallibility and uselessness of image-based reCAPTCHAs.

Re:I'm a Microsoft whore (5, Informative)

Baloroth (2370816) | more than 2 years ago | (#39516037)

Yet Google would have to know what the address numbers really were in order to validate the reCAPTCHA, so that can hardly be why they are doing it. They don't need to crowd source an answer that they already know.

No they don't. They also add an altered text image alongside the picture (which presumably they generated), and can use that to validate the CAPTCHA. The street number can be validated by numerical probability (if 70% of them say it is "257", and the numbers "2,5,7" appear frequently in the rest, it is probably "257") even if they don't already know what it is.
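A rough Python sketch of the voting idea in the comment above. The 70% figure is the commenter's; the function names and the digit-overlap check are illustrative assumptions, not anything Google has documented.

```python
from collections import Counter

def leading_answer(answers):
    """Return (answer, share) for the most common transcription."""
    top, count = Counter(answers).most_common(1)[0]
    return top, count / len(answers)

def digit_overlap(top, answers):
    """Share of dissenting answers made up only of digits found in the leader."""
    dissent = [a for a in answers if a != top]
    if not dissent:
        return 1.0
    return sum(set(a) <= set(top) for a in dissent) / len(dissent)

# Made-up votes for one house-number image.
votes = ["257"] * 7 + ["275", "25", "267"]
top, share = leading_answer(votes)
```

With these made-up votes, "257" leads with a 70% share, and two of the three dissenting answers use only the digits 2, 5 and 7, which is the kind of supporting evidence the comment describes.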

Re:I'm a Microsoft whore (1)

poity (465672) | more than 2 years ago | (#39517551)

Plus, street numbers in the US typically go odd/even on either side of the street, so they can extrapolate most of the time.
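As a hedged illustration of the odd/even point, a crowd-sourced number could be sanity-checked against numbers already known on the same side of the street. The function name and the idea of a per-side list are assumptions made for this sketch.

```python
def fits_parity(candidate, same_side_numbers):
    """True if candidate agrees with the odd/even pattern of its known neighbors."""
    parities = {n % 2 for n in same_side_numbers}
    if len(parities) != 1:
        # No neighbors, or mixed parity: the convention gives no signal.
        return True
    return candidate % 2 in parities
```

For example, `fits_parity(257, [253, 255, 259])` holds while `fits_parity(258, [253, 255, 259])` does not, so the second reading would deserve a closer look.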

Re:I'm a Microsoft whore (5, Informative)

cforciea (1926392) | more than 2 years ago | (#39516039)

I don't think you know how reCAPTCHA works. You are always presented with two different items to decode. One of them is always a known answer, and the other they are less sure about, but become more sure after they show it to enough people and get a crowd sourced answer. They don't give you two prompts just to be double sure you are human.

Interesting (1, Insightful)

Sussurros (2457406) | more than 2 years ago | (#39516789)

Thank you for the information, I've often wondered about them.

I only have about a 60% success rate on those swirly semi-inverted ones. My wife's friend's decaptcha software does a much better job than I do with its 79% success rate. I had wondered whether, as they get harder to read, the day was almost here when only machines would be able to decode captchas and prove that they were human.

Re:I'm a Microsoft whore (1)

Anonymous Coward | more than 2 years ago | (#39517027)

ReCaptcha will accept any sequence of symbols for the unknown word. The most telling sign that a word is unknown is that, out of the two, it is the one that is ACTUALLY A WORD. Other signs are non-standard fonts, scanning distortions, non-Latin symbols, and punctuation marks.

Furthermore, there is a 1-character fault tolerance for the sequence of letters that ReCaptcha actually uses to check whether you pass or fail.
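A minimal sketch of what a one-character tolerance could look like, in Python. The comment does not say whether the tolerance covers a wrong letter, a missing letter, or an extra one, so this version assumes all three; the function name is made up.

```python
def within_one_edit(typed, expected):
    """True if typed equals expected or differs by one substitution, insertion, or deletion."""
    if typed == expected:
        return True
    if abs(len(typed) - len(expected)) > 1:
        return False
    short, long_ = sorted((typed, expected), key=len)
    # Skip the common prefix.
    i = 0
    while i < len(short) and short[i] == long_[i]:
        i += 1
    if len(short) == len(long_):
        return short[i + 1:] == long_[i + 1:]  # one substituted character
    return short[i:] == long_[i + 1:]          # one inserted or deleted character
```

Under these assumptions, "mornimg" and "mornng" would both pass for "morning", while swapping two letters ("mroning") counts as two errors and fails.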

Re:I'm a Microsoft whore (0, Troll)

Anonymous Coward | more than 2 years ago | (#39518817)

ReCaptcha will accept any sequence of symbols for the unknown word.

Wait - you can type something other than "nigger" for the unknown word?

One of these days I'm going to do that when someone's looking over my shoulder and get a serious WTF from them.

Re:I'm a Microsoft whore (1, Funny)

Anonymous Coward | more than 2 years ago | (#39520815)

ReCaptcha will accept any sequence of symbols for the unknown word.

Wait - you can type something other than "nigger" for the unknown word?

One of these days I'm going to do that when someone's looking over my shoulder and get a serious WTF from them.

To whoever modded the parent at -1, pay attention.
Out of the two images you are presented, one is known, the other is unknown. When a large enough number of people have entered the same answer for the unknown image, it gets moved to the 'known' list with that particular answer.

So on some places like 4chan, there has been a large effort to get as many people as possible to answer the unknown image with the word 'nigger'. If enough people do it on a single unknown image, it will get added to the pool with the "correct" answer set to the word 'nigger'... thus polluting the reCaptcha system. As the percentage of polluted entries in the "known" image list grows, so does the chance that the answer to any reCaptcha is 'nigger'.

Re:I'm a Microsoft whore (1)

rich_hudds (1360617) | more than 2 years ago | (#39520821)

And what would that achieve exactly?

Re:I'm a Microsoft whore (1)

Anonymous Coward | more than 2 years ago | (#39521037)

Fun.
I didn't know about the nigger thing, but I've always submitted nonsense for the book one.

Re:I'm a Microsoft whore (0)

Anonymous Coward | more than 2 years ago | (#39521305)

That is used to digitize books
http://www.google.com/recaptcha

Re:I'm a Microsoft whore (1)

MorderVonAllem (931645) | more than 2 years ago | (#39516049)

Yet Google would have to know what the address numbers really were in order to validate the reCAPTCHA, so that can hardly be why they are doing it. They don't need to crowd source an answer that they already know.

Doubtful. They post two images. One they know and one they don't. They use the data for the one they don't, combine it with data from 1000s of other people who have also solved that captcha to get an accurate picture of what that particular number is. They use the one they know to validate the recaptcha data and verify you're human...

Re:I'm a Microsoft whore (3, Informative)

eldorel (828471) | more than 2 years ago | (#39516081)

Recaptcha works by using a known value with an unknown, it's why you have to type 2 words.

One of the two words is considered solved, and is the actual captcha, the second word is using you as an ocr.

After enough people provide the same solution for the second word, it goes into the solved category and is used for validation.

They don't have to pay people to validate the addresses, we're doing it for free.
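The two-word flow described above can be sketched roughly like this in Python. The in-memory store, the function name, and the case-insensitive comparison are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Hypothetical store: image id -> tally of transcriptions submitted so far.
unknown_votes = defaultdict(Counter)

def check_challenge(control_answer, control_typed, unknown_id, unknown_typed):
    """Pass or fail on the known word only; record the other word as a vote."""
    passed = control_typed.strip().lower() == control_answer.lower()
    if passed:
        # Only people who got the known word right get to vote on the unknown one.
        unknown_votes[unknown_id][unknown_typed.strip()] += 1
    return passed
```

The point of the design is that the site gets its human check from the known word, while the answer for the unknown word is collected as a side effect.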

Re:I'm a Microsoft whore (1)

chrismcb (983081) | more than 2 years ago | (#39516485)

But unless Google is paying a zillion people to validate these images visually,

That is EXACTLY what Google is doing. And the payment is access to the site the reCAPTCHA is protecting.

Be a Roman harlot instead! (5, Funny)

AliasMarlowe (1042386) | more than 2 years ago | (#39515607)

And put your house number in Roman Numerals. Nothing like living in number CLXXIV to screw up the recaptcha. Anyone answering with 174 is likely counted as wrong...

Re:Be a Roman harlot instead! (1)

Translation Error (1176675) | more than 2 years ago | (#39515875)

And there's the added bonus of living at number 2 Maple Lane and getting all of number 11 Maple Lane's mail.

Re:Be a Roman harlot instead! (3, Funny)

gnick (1211984) | more than 2 years ago | (#39515997)

And put your house number in Roman Numerals. Nothing like living in number CLXXIV to screw up the recaptcha.

Not to mention the postal service! Damn snooty mailmen with their eagle-logo cars and fancy uniforms... Now I know how to get back at them.

Re:Be a Roman harlot instead! (1)

chrismcb (983081) | more than 2 years ago | (#39516505)

And put your house number in Roman Numerals. Nothing like living in number CLXXIV to screw up the recaptcha. Anyone answering with 174 is likely counted as wrong...

And then hope that the ambulance driver trying to find your place is skilled in the Roman ways.

Re:Be a Roman harlot instead! (1)

izomiac (815208) | more than 2 years ago | (#39517587)

You're much more of an optimist than me if you did that and expected to still get mail service and pizza deliveries.

Re:I'm a Microsoft whore (0)

hawguy (1600213) | more than 2 years ago | (#39515639)

Awesome. More invasion of privacy. Fuck Google.

bonch

What makes this more of an invasion of privacy than whatever they used to do to find house numbers? I assume they used some combination of databases, OCR, and paying someone to do it.

I'm surprised that this is a big help to them - if they can identify that something on a house is the house number (as opposed to a shadow or some home design pattern), it's surprising that they can't identify the number itself. It seems like there's going to be relatively few instances where something is identifiable as a house number, but the number itself is not OCRable -- especially when they already have a hint from the neighboring house numbers. Though I guess when you're dealing with identifying millions of structures, even "relatively few" is a lot.

Re:I'm a Microsoft whore (2)

b4dc0d3r (1268512) | more than 2 years ago | (#39515793)

Different angles make it hard to be sure you have the number right. If you look at a street photo like a book you're going to OCR, you have first the layout detection, then identify the image part and the text part. Solving this problem would be similar to identifying where the page number is, to be eliminated from the text.

Taking a laser measurement, un-warping the photo, and then doing traditional OCR would be awesome, if they had the forethought to include the laser part in their vast collection, but they didn't. Then you have the multiple "type faces" available.

Anyway, lots of places don't have a specific street address. Type something in and you get a blobby sort of approximation. Or data from Open Street Map - my home address is just a dot in the middle of the street. With street view they could get it more precise.

I would guarantee this is all shots from places like mine, where they may or may not have street names, and definitely don't have address ranges for the blocks. Connect the street name to the GPS tag in the photo, apply that to the orientation of the vehicle, and add the street numbers - accurate mapping better than any in-dash system has today.

Take off your tin foil hat (4, Insightful)

Anonymous Coward | more than 2 years ago | (#39515495)

This is an incredibly fascinating and great use of the technology.

Re:Take off your tin foil hat (0, Troll)

Anonymous Coward | more than 2 years ago | (#39515589)

Sorry, what is "great" about unwittingly asking people to do unpaid work?

Everyone knows that when filling in a re-captcha you write "nigger" for the word Google doesn't know. Or, better, make a minor mistake someone else is likely to make, in the hope that the wrong answer is recorded.

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39515677)

> Sorry, what is "great" about unwittingly asking people to do unpaid work?

Google isn't unwittingly asking anyone to do anything. Sites want/need bot filtering services for their sites, Google needs more accurate map data. It's a win-win situation.

> Everyone knows that when filling in a re-captcha you write [blah blah blah]

If you truly think you're smart enough to outwit Google go right ahead and try. Yes, they will have to deal with their fair share of scams and pranks, but I imagine they'll do a good job of finding these scenarios and weeding them out.

Re:Take off your tin foil hat (2)

Desler (1608317) | more than 2 years ago | (#39515811)

I'm guessing you've never done a copy-and-paste on, say, Google Books because the OCRed text quite frequently contains typos, random inserted spaces and completely wrong words. And since reCAPTCHA is used to supplement the OCR on Google Books, it would appear they aren't as smart as you would like them to seem.

Re:Take off your tin foil hat (2, Insightful)

Anonymous Coward | more than 2 years ago | (#39516139)

It's mostly the fault of 4chan.
Ever since Re-Captcha was implemented there, most of the RC results are
'(Checkword) Nigger'

Re:Take off your tin foil hat (1)

reub2000 (705806) | more than 2 years ago | (#39520617)

I'm guessing that these books haven't been subjected to recaptcha yet. Considering the number of books that google has scanned, this could take a while.

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39516883)

> If you truly think you're smart enough to outwit Google go right ahead and try.

I'm sure you know that ReCaptcha has one scanned-in word and one computer-generated sequence of symbols that is passed off as a word. To pass the ReCaptcha, you only need to type the latter, and even that has a fault tolerance of one character.

The great thing about the human brain is that it's pretty good at recognizing patterns. Knowing the above, how many ReCaptchas do you think it would take for someone to have a pretty good idea of what the required "word" is? 100? 10?

Here's a hint: it's almost never an actual word. I'd love to post an instructional graphic with more ways to avoid doing more work to make a post, but I can't on Slashdot.

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39515733)

People like you fuck up the research that people like us do. Please fuck off and die.

Re:Take off your tin foil hat (-1, Troll)

Anonymous Coward | more than 2 years ago | (#39516265)

People like you fuck up the research that people like us do. Please fuck off and die.

People like you doing research on people like me are the reason people like me own guns.

Adding the word "please" ahead of your fighting words won't save you if we ever cross paths.

Re:Take off your tin foil hat (1)

bhagwad (1426855) | more than 2 years ago | (#39516453)

Very brave on the Internet aren't we? :D

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39519587)

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39516877)

1) I see what you're saying and it sounds like: "You and/or Google are not giving me free labor."

2) If you're using any of Google's services to do "research", except as a Google competitor, then please supply me the name of the organisation that's funding you - it appears they're wasting their money.

3) Responses like yours only serve to confirm that I'm doing the right thing.

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39523273)

Captain, I'm detecting copious amounts of butthurt and faggotry emanating from this area.

Re:Take off your tin foil hat (0)

Anonymous Coward | more than 2 years ago | (#39519449)

Actually it isn't. ReCAPTCHA is a very annoying piece of shit. Most people never get it right the first time and have to reload it several times until something that is human readable finally shows up. Fuck the fucking asshole who created it and the arrogant Google assholes who bought the company (they call it 'innovation' inside of the Googleplex)

--
mchurch

Eyebleed site (1)

Desler (1608317) | more than 2 years ago | (#39515565)

Wow that site is so terrible looking that it makes Geocities and myspace look decent. The only thing it's missing is cosmic cursors.

Re:Eyebleed site (3, Informative)

bertoelcon (1557907) | more than 2 years ago | (#39515599)

Wow that site is so terrible looking that it makes Geocities and myspace look decent. The only thing it's missing is cosmic cursors.

Yeah, Techcrunch is really ugly isn't it.

Re:Eyebleed site (1)

Desler (1608317) | more than 2 years ago | (#39515721)

Baziiiing!

Re:Eyebleed site (2)

wmbetts (1306001) | more than 2 years ago | (#39516129)

Baziiiinga!

Fixed it for you.

If I just type out the necessary word... (2)

mykos (1627575) | more than 2 years ago | (#39515569)

What happens to the other part? Does google keep recycling it until it has multiples of the same answer? Can we all agree on a word for the addresses just to have some fun with google?

Re:If I just type out the necessary word... (1)

melikamp (631205) | more than 2 years ago | (#39515727)

Unless I know that I am contributing to a libre project, I always use "fuck" in reCAPTCHA.

Re:If I just type out the necessary word... (4, Insightful)

X0563511 (793323) | more than 2 years ago | (#39515799)

Great. You know what they were previously? OCR for things like libraries.

I think your own answer to them describes what you are.

Would make for some interesting kids' books (2)

mykos (1627575) | more than 2 years ago | (#39516839)

I do not like green eggs and FUCK
I do not like FUCK Sam I am

Re:Would make for some interesting kids' books (1)

X0563511 (793323) | more than 2 years ago | (#39517341)

... and you just put the idea of "Green Eggs and Ham" censored, just like The Count [youtube.com] . Disclaimer: I am not responsible for any injury occurring as a result of laughing at this video.

Re:If I just type out the necessary word... (1)

tbird81 (946205) | more than 2 years ago | (#39519849)

You are a loser.

Re:If I just type out the necessary word... (1)

martas (1439879) | more than 2 years ago | (#39520809)

I am in machine learning, and you are the worst kind of person. Go kill as many children as you want, but don't you poison my data!

Re:If I just type out the necessary word... (1)

AliasMarlowe (1042386) | more than 2 years ago | (#39515761)

Can we all agree on a word for the addresses just to have some fun with google?

Actually, words instead of numbers could be an issue already. My parents' house does not have a number anywhere. The house has a visible name instead, and that's what is used in letters addressed to them (including government letters): house-name, street-name, etc. Some houses on their street have numbers, but most just have names, and the house names are nothing to do with the names of the occupants. BTW that particular first world country does not have any postal codes, either.

Re:If I just type out the necessary word... (0)

Anonymous Coward | more than 2 years ago | (#39515839)

BTW that particular first world country does not have any postal codes, either.

Any reason you didn't want to post the country name?

Re:If I just type out the necessary word... (0)

Anonymous Coward | more than 2 years ago | (#39515879)

Any reason you didn't want to post the country name?

Ireland seems most likely.

Re:If I just type out the necessary word... (1)

AliasMarlowe (1042386) | more than 2 years ago | (#39519497)

Ireland seems most likely.

Bingo, AC got it. I think almost every other EU country has postal codes - at least they do where I live. Incidentally, I've had to complain to more than one web-shop in the EU since they have the postal code as a required part of the address. So when ordering a gift for a parent, I have to put some bogus crap down (e.g. repeat the town name) as their "post code".

Re:If I just type out the necessary word... (0)

Anonymous Coward | more than 2 years ago | (#39521163)

There are postcodes in Dublin, and they started to do them in Cork but never really finished, but that's it. I usually just put EIRE as the postcode when I order things.

Re:If I just type out the necessary word... (1)

ewieling (90662) | more than 2 years ago | (#39516197)

Somewhat off-topic, I admit.

In the USA, when e911 service is introduced into an area, each street is named and each house numbered in the maps e911 uses; I assume these are official postal addresses as well. It is a good idea to have your house number visible for emergency services to find you.

Re:If I just type out the necessary word... (1)

operagost (62405) | more than 2 years ago | (#39522349)

Can we pick on Ireland mercilessly for naming instead of numbering their houses-- like the USA gets picked on for still using measures based on some guy's foot?

Re:If I just type out the necessary word... (1)

nine-times (778537) | more than 2 years ago | (#39515987)

If they're using this as a way to identify the street numbers, then I would assume that they're randomly matching the numbers with different words and seeing if they can get several matches to the same numbers. I would guess that they're also comparing the results to attempts at automated OCR. It would be difficult to bomb.

Re:If I just type out the necessary word... (1)

chrismcb (983081) | more than 2 years ago | (#39516521)

Sure. But can you convince enough of the population to answer incorrectly?

Be careful what you wish for... (0)

Anonymous Coward | more than 2 years ago | (#39518357)

/b/ has standardized on "nigger" for anything unreadable.

Re:If I just type out the necessary word... (0)

Anonymous Coward | more than 2 years ago | (#39519961)

You want more fun with Google, get a whole bunch of people to make their house number designs copyrightable, and then copyright them.

This is actually kind of frightening... (0)

NIN1385 (760712) | more than 2 years ago | (#39515571)

They're using us to identify our own home and business addresses, does anyone else feel a little violated by this?

Could just be me being paranoid, but this sounds like something out of a science fiction book. Whoever had the idea to do this, I have to admit, was really using their head though.

Re:This is actually kind of frightening... (4, Insightful)

nine-times (778537) | more than 2 years ago | (#39515613)

I don't find it worrying. The existence of a street address is properly public knowledge. It's not an invasion of privacy until they link the address with who lives there.

Re:This is actually kind of frightening... (4, Funny)

medlefsen (995255) | more than 2 years ago | (#39515625)

Oh shit
http://www.whitepages.com/ [whitepages.com]

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39515713)

Um... I don't think whitepages has full addresses most of the time

Re:This is actually kind of frightening... (1)

thegarbz (1787294) | more than 2 years ago | (#39516471)

Interesting. It doesn't show the street numbers. That is bloody pointless.

Looks like our whitepages is far more useful [whitepages.com.au]

Re:This is actually kind of frightening... (1)

Kalriath (849904) | more than 2 years ago | (#39518525)

Click on the name and you get their street number as well, sometimes. That's what he meant by "most of the time".

But yes, us civilized (tongue in cheek) countries have way better white pages.

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39516647)

Also many states/counties/cities are putting their information up on the internet. Usually under some sort of searchable tax information. Or you can do it real old school and go down to the local city hall and look it up.

Re:This is actually kind of frightening... (1)

DMUTPeregrine (612791) | more than 2 years ago | (#39516687)

Wait, you mean that property records are in public databases! And sales of houses get reported to the government and published in local records! This brand new invasion of privacy cannot be allowed!
Seriously, there are privacy invasions out there that actually matter. Making public information more public is not a privacy invasion, and it makes for a "boy who cried wolf" appearance.

Re:This is actually kind of frightening... (1)

NIN1385 (760712) | more than 2 years ago | (#39515697)

I understand the public knowledge part of it, that is fine. I just find it kind of a sneaky way to get actual humans to identify the numbers.

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39516257)

Just think of it as recycling your keystrokes. Doesn't seem harmful to me, especially if it makes Google Maps and Google Earth more accurate (which I have to believe is what's going on here).

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39516369)

Depending on your country's laws, the property ownership records are part of public record. And for a large majority, people live in the houses they own.

It is not an invasion of privacy until they start combining information that is not part of public records with the one that is.

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39515635)

You're being paranoid. The white pages are not science fiction.

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39515665)

Whoa... it's not your personal address, it's a RANDOM address. I believe the world might have a few of those. No need to break out the tin foil hats....

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39515667)

How are they using you to identify your own home? If they knew the photo they're giving you is of your home then they already know where you live. But that's not at all what's happening. You're just an idiot like everyone else who believes google is looking to stalk you.

Re:This is actually kind of frightening... (1)

mikael (484) | more than 2 years ago | (#39517397)

Theoretically, there is enough information in the actual font of the text, the frame that the text is on, and the material and texture of the wall to identify that location uniquely.

If you have seen that sign before, you would be able to recognise that location. USA street names tend to be white text on small blue rectangular signs at 90 degrees to each other, and on posts. London street names tend to be white rectangular plates mounted on walls along with the postcode at the bottom. In Scotland, the street names are white on black tiles with a pointing hand.

Others I can't classify, like the word "essex" in white text on a green frame with a curvy bit on top.

Re:This is actually kind of frightening... (1)

amRadioHed (463061) | more than 2 years ago | (#39523389)

What is your point? Who cares where a street sign is from? None of them are private information. And you're wrong about US street signs, they vary regionally but the majority tend to be white on green. And we were talking about house numbers anyway, or so I thought. How is any of what you wrote relevant?

Re:This is actually kind of frightening... (1)

geekoid (135745) | more than 2 years ago | (#39515903)

You mean your public information is public? shocking.

"Could just be me being paranoid, but this sounds like something out of a science fiction book."
Warp drive, teleporters, FTL, light sabers and robots are all in science fiction. Why do you think things in science fiction are bad?

Re:This is actually kind of frightening... (1)

NIN1385 (760712) | more than 2 years ago | (#39516263)

I never stated that I thought public information being public was shocking me. Man, I sense some hostility. I don't recall saying that science fiction was bad either.

I just raised a question; I even stated that it "Could just be me being paranoid". When did /. become so damn hostile? Holy shit.

Re:This is actually kind of frightening... (0)

Anonymous Coward | more than 2 years ago | (#39515915)

They're asking "what do these numbers look like?"

Put away the tinfoil hat, bro.

How does ReCAPTCHA "solve" new images? (0)

Anonymous Coward | more than 2 years ago | (#39515647)

Do they allow the early set of people to type whatever they want, then pass the next set of people on if they type something that matches, let's say, the top five words typed by the first set of people, then assume after 1000 people or whatever that the most frequently entered word properly represents what's in the image?

Re:How does ReCAPTCHA "solve" new images? (2, Interesting)

Anonymous Coward | more than 2 years ago | (#39516293)

They give you two words to solve. One is an old, known word and the other is a new, unknown word. You have no way to tell which is which. To pass the CAPTCHA, you need to answer both and get the known one correct. Eventually entries can go from unknown to known when enough people provide the same answer.

Re:How does ReCAPTCHA "solve" new images? (3, Informative)

eldorel (828471) | more than 2 years ago | (#39516491)

Not exactly, but pretty close.

They give you 2 words, one is an already solved known value, and the other is an unknown word.
if you get the first word correct, they take the value from your second word and add it to the "possible solutions" list.

After 2000 or so people have solved the word, they examine the results for a statistically unique answer. If there is no outlier (say, an answer that 65% of people agree on), it goes back into the unknown pile.

Once they find a statistically significant answer, it's considered "solved" and is used as one of the initial validation words.

Rinse, repeat.
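A small Python sketch of the promotion step as described in this comment. The 2000-response and 65% figures are the commenter's own numbers, not confirmed by Google, and the function name and return values are invented for illustration.

```python
from collections import Counter

def review(tally, min_responses=2000, min_share=0.65):
    """Decide what happens to an unknown word given its vote tally (a Counter)."""
    total = sum(tally.values())
    if total < min_responses:
        return ("collecting", None)      # keep showing it to more people
    answer, count = tally.most_common(1)[0]
    if count / total >= min_share:
        return ("solved", answer)        # can now serve as a validation word
    return ("unknown", None)             # no clear winner: back into the unknown pile
```

A tally of 1500 votes for "174" against 500 for "114" would come back as solved, while an even split would send the image back for another round.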

Warning: You and I are livestock to the OWNERS. (-1)

Anonymous Coward | more than 2 years ago | (#39515663)

"RFID in School Shirts must be trial run"

The trial runs began a LONG time ago!

We're way past that process.

Now we're in the portion of the game where they will try and BRAINWASH us into accepting these things because not everyone BROADCASTS themselves on and offline, so RFID tracking will NEED to be EVERYWHERE, eventually.

RFID is employed in MANY areas of society. RFID is used to TRACK their livestock (humans) in:

* 1. A lot of BANK's ATM & DEBIT cards (easily cloned and tracked)
* 2. Subway, rail, bus, other mass transit passes (all of your daily activities, where you go, are being recorded in many ways)
* 3. A lot of RETAIL stores' goods
* 4. Corporate slaves (in badges, tags, etc)

and many more ways!

Search the web about RFID and look at the pictures of various RFID devices, they're not all the same in form or function! When you see how tiny some of them are, you'll be amazed! Search for GPS tracking and devices, too along with the more obscured:

- FM Fingerprinting &
- Writeprint

tracking methods! Let's not forget the LIQUIDS at their disposal which can be sprayed on you and/or your devices/clothing and TRACKED, similar to STASI methods of tracking their livestock (humans).

Visit David Icke's and Prison Planet's discussion forums and VC's discussion forums and READ the threads about RFID and electronic tagging, PARTICIPATE in discussions. SHARE what you know with others!

These TRACKING technologies, on and off the net are being THROWN at us by the MEDIA, just as cigarettes and alcohol have and continue to be, though the former less than they used to. The effort to get you to join FACEBOOK and TWITTER, for example, is EVERYWHERE.

Maybe, you think, you'll join FACEBOOK or TWITTER with an innocent reason, in part perhaps because your family, friends, business partners, college ties want or need you. Then it'll start with one photo of yourself or you in a group, then another, then another, and pretty soon you are telling STRANGERS as far away as NIGERIA with scammers reading and archiving your PERSONAL LIFE and many of these CRIMINALS have the MEANS and MOTIVES to use it how they please.

One family was astonished to discover a photo of theirs was being used in an ADVERTISEMENT (on one of those BILLBOARDS you pass by on the road) in ANOTHER COUNTRY! There are other stories. I've witnessed people posting their photo in social networking sites, only to have others who dis/like them COPY the photo and use it for THEIR photo! It's a complete mess.

The whole GAME stretches much farther than the simple RFID device(s), but how far are you willing to READ about these types of intrusive technologies? If you've heard, Wikileaks exposed corporations selling SPYWARE in software and hardware form to GOVERNMENTS!

You have to wonder, "Will my anti-malware program actually DISCOVER government controlled malware? Or has it been WHITELISTED? or obscured to the point where it cannot be detected? Does it carve a nest for itself in your hardware devices' FIRMWARE, what about your BIOS?

Has your graphics card been poisoned, too?" No anti virus programs scan your FIRMWARE on your devices, especially not your ROUTERS which often contain commercially rubber stamped approval of BACKDOORS for certain organizations which hackers may be exploiting right now! Search on the web for CISCO routers and BACKDOORS. That is one of many examples.

Some struggle for privacy, some argue about it, some take preventive measures, but those who are wise know:

Privacy is DEAD. You've just never seen the tombstone.

I seem to have missed something... (4, Informative)

Gen-GNU (36980) | more than 2 years ago | (#39515723)

I have read the quote from Google about what they are doing several times, and I don't see what everyone else sees. It appears to me that they are using the already known street names and numbers as possible ReCAPTCHA images. What they are NOT doing is using the results given by people to define what the image says. The point of the experiment is to determine whether these images are sufficient to separate people from web-bots. I imagine that they will look at the number of 'wrong' answers from both sides of the test, and see if bots are able to parse the street view images significantly more often than the standard test images.

So... can anyone point to something in the Google quote to show me where I went wrong? From TFA, here is the quote:

We’re currently running an experiment in which characters from Street View images are appearing in CAPTCHAs. We often extract data such as street names and traffic signs from Street View imagery to improve Google Maps with useful information like business addresses and locations. Based on the data and results of these reCaptcha tests, we’ll determine if using imagery might also be an effective way to further refine our tools for fighting machine and bot-related abuse online.

Re:I seem to have missed something... (0)

Anonymous Coward | more than 2 years ago | (#39516007)

It's a reCAPTCHA, which displays two words: a "known" control word, and a word from a scanned source that the computer was unable to confidently process using OCR. The CAPTCHA part only depends on the control word; as long as the user gets that part right, he is allowed to pass. It doesn't matter to the user if he gets the second word correct, but his entry will (presumably) be added to a database along with everyone else's entry for the same word, and then the system will more or less know what the scanned word is, despite being unable to recognize the characters with its own algorithms.
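The two-word check described above can be sketched roughly as follows. This is an illustration only: the function `submit`, its parameters, and the `guesses` store are hypothetical names, not part of any real reCAPTCHA API.

```python
from collections import defaultdict

def submit(control_answer, control_guess, unknown_id, unknown_guess, guesses):
    """Pass or fail on the control word only; record the guess for the unknown word."""
    def norm(text):
        return text.strip().lower()

    if norm(control_guess) != norm(control_answer):
        return False  # control word wrong: reject and ignore the second entry
    # Control word right: the user passes regardless of the second entry,
    # which is stored for later comparison with other users' entries.
    guesses[unknown_id].append(norm(unknown_guess))
    return True

guesses = defaultdict(list)  # hypothetical store: image id -> list of entries
submit("morning", "Morning", "streetview-img-1", "1428", guesses)
```

Only entries from users who got the control word right are kept, which is what lets the stored guesses be treated as probably human.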

Re:I seem to have missed something... (2)

icebike (68054) | more than 2 years ago | (#39516135)

Getting around reCAPTCHA logins is usually easy. Just correctly type the easy to read word, and an approximation of the number of characters in the hard to read one. You don't even have to be close.

Google could have a few thousand house numbers they already know (their own recognition system is probably capable of this), and they can swap these in as well as a hard to read scanned word from a book, and you could never be sure which one was the reCAPTCHA and which was the CAPTCHA.

Re:I seem to have missed something... (1)

Gen-GNU (36980) | more than 2 years ago | (#39516301)

Yes, I understand this. I understand that they can look for most common answers among correct control responses, and crowd source the OCR of difficult street view images. My point is that is not what the experiment is doing. The point of the experiment is to determine if these images are as effective as the current images used in the tests. For the purposes of that experiment, it would be much easier (and probably more scientifically accurate) to use images where the correct answer is already known. As Google already has a large number of those images where it has extracted street names and numbers, they would have a large sample size to use for this experiment. They do not need, at this point, to use the unknown images.

If the experiment shows the street view images are equally effective, people can debate whether it's ok to have random web users do your OCR for you. Until then I'm not going to panic.

Re:I seem to have missed something... (1)

chrismcb (983081) | more than 2 years ago | (#39516579)

Well if it was easy to OCR the picture, it would not be an effective CAPTCHA. Cause that sort of defeats the purpose, doesn't it.
Who is asking who to panic? What are you going to panic about?

Um, what? That's exactly what they're doing. (1, Informative)

LanMan04 (790429) | more than 2 years ago | (#39516619)

What they are NOT doing is using the results given by people to define what the image says.

Um, no, that's exactly what ReCaptcha is for! The standard ReCaptcha images are all from old books that were scanned in (and presumably had trouble being OCRed with high confidence), and Google used ReCaptcha to "read" the words.

For heaven's sake, ReCaptcha's MOTTO is: "reCAPTCHA: Stop Spam, Read Books"

I read how it works. Multiple users are shown the same image, and once a few people have identified a given image as the same word, it's treated as the "correct" answer, and then later users have to match that answer to get past the ReCaptcha. This is why they show you more than one word....one word has a "known" answer, the other word is one they're still trying to figure out the "right" answer to.

Re:Um, what? That's exactly what they're doing. (4, Interesting)

martin-boundary (547041) | more than 2 years ago | (#39517123)

Yeah, the problem with that is that it can't work when most of the humans are robots. The robots will make guesses using standard algorithms, and their guesses will be pretty consistent with the other robots' guesses (which are quite probably the same robot in another instance). Then Google thinks the robot guess is correct, because it's overwhelmingly the most consistent answer. And humans who give the correct answer get marked wrong, because they're a minority.

It's quite noticeable if you use a site which relies heavily on recaptchas. For example, when you get a word which has old English S [wikipedia.org] which looks like a modern lowercase F, you're much better off claiming it's an F instead of giving the correct answer.
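A toy illustration of that failure mode, assuming the "correct" answer is picked by a plain plurality vote. The counts and the example word are made up for the sketch:

```python
from collections import Counter

def plurality(answers):
    """Return the most common answer among all submissions."""
    return Counter(answers).most_common(1)[0][0]

# Hypothetical mix: bots outnumber humans and all misread the long s as an f.
human_answers = ["possess"] * 40
bot_answers = ["poffefs"] * 60

accepted = plurality(human_answers + bot_answers)
humans_marked_wrong = sum(1 for a in human_answers if a != accepted)
print(accepted, humans_marked_wrong)
```

Because the bots agree with each other, their shared misreading wins the vote and every human who typed the real word is the one who fails.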

Are people actually annoyed at this? (1)

unreadepitaph (1537383) | more than 2 years ago | (#39515833)

I don't see how anyone can be pissy they're doing this.
They already list the number of the house on maps.

Re:Are people actually annoyed at this? (1)

Anonymous Coward | more than 2 years ago | (#39515967)

My understanding of ReCAPTCHA is that it's to help digitize books for libraries. Google has distorted that by using it to improve its own databases. I personally don't have ReCAPTCHA on my website, but if I did I would be completely pissed off. Google is a for-profit company and can pay to do user studies to see how well people can read images. I'm willing to donate my time/reading ability to random libraries, not Google.

Re:Are people actually annoyed at this? (4, Insightful)

icebike (68054) | more than 2 years ago | (#39516169)

Oh, climb down off that ledge before you get hurt.

reCAPTCHA is for whatever you want to use it for. It's simply a technique for crowdsourcing guesses.

In my estimation, Google Maps and Street View are among the great accomplishments of our time, easily worth every penny Google monetizes out of them.

Re:Are people actually annoyed at this? (0)

Anonymous Coward | more than 2 years ago | (#39516563)

Exactly. Google Maps is a great tool of modern times. If they need to shift the focus for some time so they can improve the results, it's fine with me. After all it's people like us who take advantage of the maps. More accurate results are a good thing for everyone.

Re:Are people actually annoyed at this? (1)

Inda (580031) | more than 2 years ago | (#39521057)

It's worth every penny I pay for it and more.

The picture of my house has a funny join; two photos that span my house. The join is in the middle of my burglar siren and it makes it stand out. Brilliant.

Thank Goodness! (1)

scarboni888 (1122993) | more than 2 years ago | (#39516751)

I'm glad something is being done. I can't recall how many times I've looked up a street address to find Google Maps reporting it as being 4 or 5 blocks away (on average) from where it actually is.

Why? (-1)

Anonymous Coward | more than 2 years ago | (#39517069)

It's beyond me why the fuck you dumb foss-fags linked to a random Ad Words page where you can't even submit the captcha without filling in the form above it.

http://www.google.com/recaptcha/learnmore

There you can actually enter them in until you get bored. Refreshed it about 20 times and got 3 with building numbers.

At this point I only solve the generated word. (1)

EnsilZah (575600) | more than 2 years ago | (#39520339)

Back when reCaptcha showed two words that you could find in the dictionary, black on white, I had no problem with it; it seemed like a good idea and you might be contributing to digitizing a book or something.
But now you just get randomly generated characters with a zigzag going through the middle and blobs that invert it and it's hard to tell if this one letter is an 'i' or an 'r' or a 't'.
So I don't even bother looking at the real word and just solve the generated one.

So they aren't hiding them anymore? (1)

Snaller (147050) | more than 2 years ago | (#39521335)

So many addresses have been fuzzy that I thought it could only be a strange design choice.

Unclean by design? (1)

halcyon1234 (834388) | more than 2 years ago | (#39521667)

... to help clean up Google streetview images...

I thought text in Streetview was blurred out by design in the same way that faces were-- automatically and for security reasons (read: so Google doesn't get sued by crazy OMG I'M ON TEH INTERNET people).

I'd actually prefer if they un-blurred all street numbers and signs. It's fine to rely on Map's street number location when you're in a huge city, and the difference between 123 fake street and 125 fake street is ten feet or so. But last time I planned a road trip, the difference between 123 Country Side Road and 200 Country Side Road could be dozens of kilometers or more. Often I'll get a recommendation to visit Out Of The Way Restaurant that has the red sign, just keep an eye out for it. I'll go into Street View, "drive" along my intended route looking for that sign-- and pass by dozens of little buildings with red signs that read "{&o /// &&6$#q blurrrrrrrrrrrrrrrrry".
