
A Vision For a World Free of CAPTCHAs

Soulskill posted more than 5 years ago | from the is-that-an-oh-or-a-zero dept.

Security

An anonymous reader writes "Slate argues that we're going about verifying humans on the Web all wrong: 'As Alan Turing laid out in the 1950 paper that postulated his test, the goal is to determine whether a computer can behave like a human, not perform tasks that a human can. The reason CAPTCHAs have a term limit is that they measure ability, not behavior. ... the random, circuitous way that people interact with Web pages — the scrolling and highlighting and typing and retyping — would be very difficult for a bot to mimic. A system that could capture the way humans interact with forms algorithmically could eventually relieve humans of the need to prove anything altogether.' Seems smart, if an algorithm could actually do that."


168 comments


Just a Thought... (5, Insightful)

ryanleary (805532) | more than 5 years ago | (#27710327)

It seems to me that if you can design an algorithm to verify how humans interact with a computer, it should be relatively trivial to engineer an algorithm that mimics this interaction.

Maybe someone smarter than I am could clarify?

Re:Just a Thought... (5, Insightful)

Nazlfrag (1035012) | more than 5 years ago | (#27710339)

Using anything other than a human to judge the behaviour puts it outside of the Turing test. So not only does their proposed solution not match the goal they set, it should indeed be defeatable by another algorithm.

Re:Just a Thought... (3, Insightful)

Anonymous Coward | more than 5 years ago | (#27710429)

So if I have an algorithm that can verify an integer factorization quickly, it means there must be an algorithm that can factor any integer quickly? How would that work?

Re:Just a Thought... (2, Informative)

Devout_IPUite (1284636) | more than 5 years ago | (#27710611)

Factoring an integer has one answer, so trial and error doesn't work. Scrolling and clicking tempos have many acceptable answers, so trial and error does work.

Re:Just a Thought... (4, Insightful)

1 a bee (817783) | more than 5 years ago | (#27710675)

So if I have an algorithm that can verify an integer factorization quickly, it means there must be an algorithm that can factor any integer quickly? How would that work?

The anonymous poster makes a good counterargument against the idea that the algorithm must be easily defeatable: just because you have an algorithm that detects human behavior does not imply you have an algorithm that emulates the human behavior detected by the original algorithm.

In fact, there are many so-called one-way (correct terminology?) functions. So, for example, for a given file it's easy to compute its MD5; it's harder to compute a file for a given MD5 (though doable). And of course there's the AC's better example, which is impossibly hard in reverse for composite numbers made from very large prime factors.
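
For instance, a minimal sketch of that asymmetry in TypeScript (using Node's built-in crypto module; MD5 only because it's the example above, not because it's still a safe hash):

    import { createHash } from "crypto";

    const md5Hex = (s: string): string =>
      createHash("md5").update(s).digest("hex");

    // Forward direction: cheap -- one pass over the input.
    const digest = md5Hex("some file contents");

    // Verification: also cheap -- recompute and compare.
    console.log(md5Hex("some file contents") === digest); // true

    // Reverse direction: given only `digest`, finding *a* preimage means
    // enumerating candidates and hashing each one. Detecting is not the
    // same problem as generating.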

So no. Labeling the idea flawed by design is jumping the gun, logically speaking.

Re:Just a Thought... (4, Interesting)

Joce640k (829181) | more than 5 years ago | (#27710751)

I disagree. I don't think there's anything terribly un-mimicable about the way humans interact with web pages.

Besides, have you considered the effect of false positives (which will be many)?

With a captcha it's a black/white decision and people know why they passed/failed.

In the world being proposed in the article people will have to sit dejectedly wiggling their mouse while a web page decides if they're human or not based on some unknown criteria. Pass or fail? It's up to the machine.

After two or three sessions of this people will be running away screaming from your web pages.

Re:Just a Thought... (1, Insightful)

Helen Keller (842669) | more than 5 years ago | (#27710793)

people will have to sit dejectedly wiggling their mouse

I FNGnmehdo gnnnthat MNNNNEH GGGGCLOD!

Re:Just a Thought... (1)

1 a bee (817783) | more than 5 years ago | (#27710881)

I disagree. I don't think there's anything terribly un-mimicable about the way humans interact with web pages.

Maybe, maybe not. The point was that claiming

it should indeed be defeatable by another algorithm

is not a logical slam-dunk.

Re:Just a Thought... (2, Insightful)

Joce640k (829181) | more than 5 years ago | (#27711329)

I'd say it's a lot more of a slam-dunk than this:

"Read heavily distorted text on random patterned backgrounds with added noise and geometric figures drawn across it"

My real problem with the proposal is the false positives. There's no clear feedback to let a user know *why* he's not being allowed into the system; it's just that the machine doesn't like the look of him.

Re:Just a Thought... (2, Interesting)

cskrat (921721) | more than 5 years ago | (#27711639)

The anonymous poster that you're responding to was actually the one to introduce the word "quickly" to the discussion.

That being said, I think the method proposed at the end of the article is flawed in that the algorithm is reversible and facing the wrong direction.

Assuming that the website in question only has access to the message information passed to the GUI window of the browser by the OS (I'm sure as hell never installing a browser with ring 0 access to my system), it would be fairly trivial to produce an AI algorithm to replicate that behavior. A few hard-coded target parameters and a bit of randomization would sufficiently emulate a human, based on metrics gathered from a small sample of human subjects, possibly as small as just one. And don't forget that spammers don't need anywhere near a 100% success rate to be viable.

The checking process, on the other hand, would require a very large, heterogeneous sample of human subjects to determine the limits, distribution, and correlations of the tested metrics. A team of statisticians and psychologists would be required to analyze the data so that it could be converted into a working algorithm by software engineers. That's an enormous number of man-hours just to produce the system. Assuming, however, that the system is produced in spite of its high development cost, it would still be computationally expensive to analyze each potential human to see if it's generating a valid combination of metrics.

Think of it this way: it's trivial for me to write a PHP script to quickly generate valid XML markup to send to a remote system. Parsing a string of potential XML on the other side, however, is more computationally intensive, and the algorithms to do it are more complex, especially if you count the complexity of any prebuilt parsing tools, such as regular expression engines, as part of the overall algorithm complexity. Granted, a parser can reasonably be expected to run in linear time, but the script to produce XML can be reduced to constant time if optimized for a specific purpose.
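
As a rough sketch of the replay-with-jitter attack described above (the event shape and the send() hook are hypothetical, not any real API):

    interface RecordedEvent {
      kind: "move" | "scroll" | "key";
      x?: number;
      y?: number;
      delayMs: number; // time since the previous event
    }

    // +/-15% noise so replays aren't byte-identical.
    const jitter = (ms: number, spread = 0.15): number =>
      ms * (1 + (Math.random() * 2 - 1) * spread);

    // Replay one captured human session with randomized timing.
    async function replay(
      session: RecordedEvent[],
      send: (e: RecordedEvent) => void,
    ): Promise<void> {
      for (const e of session) {
        await new Promise((r) => setTimeout(r, jitter(e.delayMs)));
        send(e);
      }
    }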

What does it mean to be human? (4, Insightful)

mcrbids (148650) | more than 5 years ago | (#27710521)

It's a lot tougher to define what a human is than it may seem on the surface, and the difference between man and machine will, by definition, become more and more blurred until there is no effective difference.

It's an idea that I've become familiar with, especially after reading 'The Singularity is Near' by Ray Kurzweil. As our technology advances, we'll find that our capabilities beyond our technology will diminish. Machines long ago surpassed our running speed (cars/planes/trains), our ability to farm and grow food (tractors), our ability to hurl objects (guns), and our ability to swim (boats), but we've always had the ability to out-think our machines.

Increasingly, this isn't true.

We've already shown that spam filters are good enough to be more accurate than the people who read the messages. Machines have long been better than people at math-related tasks, keeping track of things, and the like, but now we're getting close to the threshold for image processing and character recognition. It's already true for voice recognition. CAPTCHA is therefore doomed to fall eventually as we approach the singularity, and it is already pretty weakened. The next question is therefore simple: what does it mean to be human?

Remember Lt. Commander Data on Star Trek, trying to be human? It's quaint largely because he/it was a minority on the show, but in reality machines will outnumber us by a wide margin. They already do!

So what does it mean to be human?

If you have a prosthetic leg, are you still human?

If the leg has a CPU in it, are you still human?

If the CPU is more powerful than your mind, are you still human?

If the chip is wired into your mind, are you still human?

If you use the CPU as though it were part of your mind, are you still human?

If you have transferred most of your thinking to the CPU, are you still human?

If you transferred all your thinking to the CPU and rarely use your 'wet' brain, are you still human?

If you find th

Re:What does it mean to be human? (2, Interesting)

Devout_IPUite (1284636) | more than 5 years ago | (#27710621)

I might recommend http://en.wikipedia.org/wiki/Homosapien [wikipedia.org] for further reading on this topic. Clearly, if you're a computer, you are not a human no matter how smart you are. Are you a person? Well, that depends on how you define 'person'.

Re:What does it mean to be human? (1)

alx5000 (896642) | more than 5 years ago | (#27710627)

If the leg has a CPU in it, are you still human?

Maybe. Maybe not. I'll be back.

Re:What does it mean to be human? (1)

martin-boundary (547041) | more than 5 years ago | (#27711113)

So what does it mean to be human?

Born of a human mother. Take that, Mister Data!

Re:What does it mean to be human? (0)

Anonymous Coward | more than 5 years ago | (#27711207)

Born of human mother.....?

It's not far off that we'll have children who are both born from engineered mixing of genetic material (not necessarily from a man and a woman) and potentially raised in artificial wombs. Will that child still be "human"?

We are gaining such an ability to manipulate and create life that any definition of HUMAN that relies on hot and sweaty sex followed by nine months of backache and haemorrhoids is doomed to failure.

Any definition that relies on any physical attribute is similarly doomed to fail, as more and more prosthetic technologies become available.

Just look at the history of people considering "lesser" primitive peoples to not be human... or indeed people not of the correct religion.

Similarly, any definition that requires a particular intellectual ability is doomed to failure... Is a child born with no brain human? Its parents would consider it so, and would give it a human funeral... Does a person who has an accident rendering them severely brain damaged suddenly become "not human"?

I think over the next couple of hundred years humanity is in for a lot of fun, and I don't doubt that in three hundred years, the average human will not look or think much like you and I.


Re:What does it mean to be human? (1)

Squeeonline (1323439) | more than 5 years ago | (#27711357)

So what does it mean to be human?

Born of a human mother. Take that, mister data!

So if you replace the mother's placenta with a machine that is linked to her brain so the right chemicals are transferred, are you still human when you are born?

If you replace that mother's mind with a computer because it will do things right, are you still human when you are born?

The same argument applies to biochemistry. At what point does a group of self-replicating molecules constitute life? At what ratio of silicon to organic flesh are you still considered human?

Re:What does it mean to be human? (0)

Hurricane78 (562437) | more than 5 years ago | (#27711647)

Machines have long ago surpassed our running speed (cars/planes/trains) and our ability to farm/grow food (tractors) and our ability to hurl object (guns) and swim (boats) but we've always had the ability to out-think our machines.

Machines have done nothing. Machines as you describe them do not act; they are tools. It's not the computer that does something; the programmer did something with the computer. The car didn't drive; you drove the car. And so on.

You wouldn't say that a glove, or even your hand, "has done" anything. You used it. You controlled it. :)

Of course, it's not guaranteed to stay that way in the future. ^^

Re:Just a Thought... (1)

phantomfive (622387) | more than 5 years ago | (#27710629)

The human brain works on an algorithm that is Turing-complete. It is also unlikely that the human brain has any algorithmic capability that a computer does not have, so it is reasonable to say that:

Any CAPTCHA that can be solved by a human will eventually also be solvable by a computer.

Re:Just a Thought... (1)

martin-boundary (547041) | more than 5 years ago | (#27710723)

it should indeed be defeatable by another algorithm.

True. Let's say you have a test T in mind. This test will have some inputs I1,...,In which represent some observations coming from the keyboard and the mouse input obtained from some websurfer. If a computer tries to pass the test T, all it has to do is know the observations I1,...,In that are being looked for and simulate plausible values.

What are plausible values? To obtain them, all you have to do, before the test T goes live, is ask some humans to act normally, and observe the quantities I1,...,In. That's how you calibrate the test T.

But here's the thing: an attacker can do that too. He observes some friends to get plausible values of I1,...,In and once those values are known, an algorithm can simulate those values and pass the test T.
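
A minimal sketch of that symmetry, assuming for illustration that each observation Ik is summarized as a Gaussian; defender and attacker fit the very same distribution from sampled human sessions:

    type Dist = { mean: number; sd: number };

    // Calibration (defender) and reconnaissance (attacker) are the same step.
    function fit(samples: number[]): Dist {
      const mean = samples.reduce((a, b) => a + b, 0) / samples.length;
      const variance =
        samples.reduce((a, x) => a + (x - mean) ** 2, 0) / samples.length;
      return { mean, sd: Math.sqrt(variance) };
    }

    // Defender: accept an observation within ~3 sd of the human mean.
    const plausible = (x: number, d: Dist): boolean =>
      Math.abs(x - d.mean) <= 3 * d.sd;

    // Attacker: draw observations from the same fitted distribution
    // (Box-Muller transform); these pass `plausible` almost always.
    function simulate(d: Dist): number {
      const u = 1 - Math.random(); // (0, 1]
      const v = Math.random();
      return d.mean + d.sd * Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
    }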

Re:Just a Thought... (1)

buchner.johannes (1139593) | more than 5 years ago | (#27711449)

Using anything other than a human to judge the behaviour puts it outside of the Turing test. So not only does their proposed solution not match the goal they set, it should indeed be defeatable by another algorithm.

I imagine there will have to be a new job description for the webmaster ...

Re:Just a Thought... (4, Insightful)

l3prador (700532) | more than 5 years ago | (#27710381)

Yep. If you can characterize the behavior pattern well enough to automatically determine that it's "human-like," then you can automatically generate "human-like" behavior. The only way around it that I can see is if there is some sort of asymmetric information involved, such as the invisible form honeypot mentioned in TFA--the website's creator (and thus the bot-detection script) knows that there is an invisible form field present, but it's difficult for a script to see that without rendering the site with standards-compliant CSS.
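
A rough sketch of that honeypot trick (the field name and handler shape are invented for illustration):

    // The form as served; the "website" field is hidden from humans by CSS.
    const formHtml = `
      <form method="post" action="/comment">
        <input name="comment">
        <input name="website" style="display:none" tabindex="-1" autocomplete="off">
        <button>Post</button>
      </form>`;

    // Server side: humans leave the hidden field blank; a naive bot that
    // fills every field it parses does not.
    function looksLikeBot(fields: Record<string, string>): boolean {
      return (fields["website"] ?? "") !== "";
    }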

Re:Just a Thought... (1)

RiotingPacifist (1228016) | more than 5 years ago | (#27710515)

but it's difficult for a script to see without rendering the site in standards compliant CSS.

But with many open-source web browsers, would it be that hard to work out what is rendered and what is not? It seems that bots could even run a hidden tab of Firefox/Chrome on a victim's computer if they had to. I suppose it does make cracking CAPTCHAs computationally more difficult, but isn't OCR much more intensive than rendering a page? (Wait, why not just put CAPTCHAs in terribly coded Flash apps?)

Re:Just a Thought... (0)

Anonymous Coward | more than 5 years ago | (#27710699)

All in all, this approach doesn't account for delegating CAPTCHA-breaking to cheap labor; those pesky laborers are 100% human.

Re:Just a Thought... (1)

Z00L00K (682162) | more than 5 years ago | (#27710385)

Aren't many of those things like captchas circumvented by a trial and error methodology?

What if you get three tries and then a blacklisted IP address? Not that the poster would realize it's blacklisted, just that further tries to crack the CAPTCHA won't work, even with the correct answer.
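
A minimal sketch of that three-strikes idea (in-memory counter only; a real deployment would need expiry and shared storage, both omitted here):

    const failures = new Map<string, number>(); // IP -> failed attempts

    function checkCaptcha(ip: string, answer: string, expected: string): boolean {
      if ((failures.get(ip) ?? 0) >= 3) return false; // blacklisted: always "wrong"
      if (answer === expected) return true;
      failures.set(ip, (failures.get(ip) ?? 0) + 1);
      return false;
    }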

Re:Just a Thought... (3, Insightful)

RiotingPacifist (1228016) | more than 5 years ago | (#27710519)

If you have a botnet, then a single computer probably doesn't need to try a site more often than a human would.

Re:Just a Thought... (1)

Z00L00K (682162) | more than 5 years ago | (#27711015)

That's assuming the botnet is targeting a single site or only a few sites.

Re:Just a Thought... (1)

roblarky (1103715) | more than 5 years ago | (#27710387)

Right. The problem can only be solved if the Internet removes all forms of anonymity. Otherwise, it's just jumping through hoops which a bot can emulate.

Re:Just a Thought... (1)

MichaelSmith (789609) | more than 5 years ago | (#27710509)

Right. The problem can only be solved if the Internet removes all forms of anonymity. Otherwise, it's just jumping through hoops which a bot can emulate.

We could see zombies skimming cemeteries for unused human identities.

Re:Just a Thought... (2, Funny)

cjfs (1253208) | more than 5 years ago | (#27710529)

It seems to me that if you can design an algorithm to verify how humans interact with a computer, it should be relatively trivial to engineer an algorithm that mimics this interaction?

Maybe someone smarter than I could clarify?

You're looking at this all backwards. This isn't the humans attempting to prevent access to the bots. It's the bots getting the humans to speed up their evolutionary arms race.

Think of it, bots trying to determine bot from non-bot. Bots honing their human-infiltration skills vs the best of the bots. It'll be the greatest leap since spam filtering. We'll^WThey'll be getting +5s again on Slashdot in no time!

Re:Just a Thought... (3, Insightful)

julesh (229690) | more than 5 years ago | (#27710567)

It seems to me that if you can design an algorithm to verify how humans interact with a computer, it should be relatively trivial to engineer an algorithm that mimics this interaction?

Maybe someone smarter than I could clarify?

Sometimes it's easier to write an algorithm that checks that something is correct than to generate that something in the first place. An example: if you have a public key, checking that a message is signed with it is fairly easy; forging a signature with only the public key is hard, because it would require you to factor the key.
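
A minimal sketch of that asymmetry with Node's built-in crypto module (RSA here, since that's the factoring case):

    import { generateKeyPairSync, sign, verify } from "crypto";

    const { publicKey, privateKey } = generateKeyPairSync("rsa", {
      modulusLength: 2048,
    });
    const msg = Buffer.from("some message");

    // Easy *with* the private key:
    const sig = sign("sha256", msg, privateKey);

    // Easy for anyone holding only the public key:
    console.log(verify("sha256", msg, publicKey, sig)); // true

    // Producing a valid `sig` with only the public key is the hard part.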

I see no evidence that "human behaviour" is such an algorithm. It might be, but we're way too far off understanding it to be able to make any sensible guesses in this field.

A simplified approach is doomed to failure; simplified human behaviour is much more likely to behave as you suggest than like public keys, I think. Also, different people interact with their browser in different ways; how do you cope with that? I tend to navigate via keyboard, so would the script reject me because I tabbed to the form field (thus jumping directly to it) rather than scrolling circuitously to reach it? I also make far fewer typos than average and type faster than the average user, so is this going to count against me?

Re:Just a Thought... (5, Insightful)

major_fault (1384069) | more than 5 years ago | (#27710575)

No algorithm will do. Ultimately, the question that must be solved is whether the user is malicious or not. The best possibilities so far are the tried-and-true invitation system and excluding malicious users from the system. Malicious users include users who keep inviting other malicious users; that's easily detectable with a proper moderation system, which needn't be gotten into right here and now.

Re:Just a Thought... (1)

bytesex (112972) | more than 5 years ago | (#27710757)

The only thing I can think of that could break this is lack of efficiency on the human's part. That is, if the test or the judgement takes time, then that is time that automated algorithms usually do not have. They want to inject, mass-mail, or do whatever they maliciously want to do, quickly. But then again, they might not.

Re:Just a Thought... (1)

mrsquid0 (1335303) | more than 5 years ago | (#27711591)

Perhaps the verification algorithm could reject any"one" who behaves too much like the algorithm expects a human to.

Seriously though, this sort of verification method seems like it would be easy to defeat.

Not so sure (4, Insightful)

Misanthrope (49269) | more than 5 years ago | (#27710345)

Assuming you could write an algorithm to determine humanistic behavior, it stands to reason that you could write a bot to fool the initial algorithm.

Re:Not so sure (0)

Anonymous Coward | more than 5 years ago | (#27710783)

I see you've graduated from the School of My Dad Always Said, and gotten yourself into the College of It Stands to Reason. I expect you're well on your way to being a postgraduate student at the University of What Some Bloke In the Pub Told Me.

Re:Not so sure (1)

Misanthrope (49269) | more than 5 years ago | (#27710809)

I have a PhD from Tongue Firmly in Cheek U.

Re:Not so sure (0)

Anonymous Coward | more than 5 years ago | (#27710853)

I have a PhD from Tongue Firmly in Cheek U.

So you're miming a BJ? Or perhaps BJing a mime?

Re:Not so sure (1)

noppy (1406485) | more than 5 years ago | (#27711091)

Assuming you could write an algorithm to determine humanistic behavior, it stands to reason that you could write a bot to fool the initial algorithm.

Write a bot that surfs all day at /.

Re:Not so sure (3, Insightful)

TheRaven64 (641858) | more than 5 years ago | (#27711229)

Not true. For example, any NP-complete problem can be solved in polynomial time on a nondeterministic Turing machine, but a solution can be verified in polynomial time on a deterministic Turing machine. There are lots of examples of this kind of asymmetry, for example factoring the product of two primes or the travelling salesman problem. In a vast number of cases, it is easier to test whether a solution is correct than it is to produce the solution. Even division is an example of this: it is easier to find c in a*b = c than it is to find a in c/b = a.

Of course, as the other poster said, there is no evidence that 'seeming human' is in this category, and it's a very woolly description of a problem, so it is probably not even possible to prove it one way or the other.

Re:Not so sure (1)

smallfries (601545) | more than 5 years ago | (#27711377)

Even division is an example of this; it is easier to find c in a*b = c than it is to find a in c/b = a.

That would be quite hard to prove... ;)

I read something about this (4, Interesting)

gcnaddict (841664) | more than 5 years ago | (#27710351)

I remember reading... I can't remember if it was a post about an algorithm already written or a proposal for one, but it would run alongside a CAPTCHA through the entire registration process. The basic premise was just that: measure the entropy and fluidity of human movement and determine whether or not the user is a bot, based on whether the user fits typical random human usage patterns.

I also remember the writer of the post noting that this kind of system would basically stretch the human-unwittingly-answers-CAPTCHA out such that humans would have to do the entire setup process manually instead of just the CAPTCHA, thus defeating the point of automated setup.

Does anyone have this article? I can remember reading it but I can't find it.

Re:I read something about this (0)

Anonymous Coward | more than 5 years ago | (#27710419)

Are you a bot?

Re:I read something about this (1)

fahrbot-bot (874524) | more than 5 years ago | (#27710453)

...algorithm ... which would run alongside a CAPTCHA through the entire registration process, ... measure the entropy and fluidity of human movement and determine whether or not the user is a bot based on whether or not the user fits typical random human usage patterns.

Ya. I don't think I'll be whitelisting *that* in NoScript... :-)

Re:I read something about this (4, Insightful)

abolitiontheory (1138999) | more than 5 years ago | (#27710595)

In addition to this, what about those humans who just happen to fall into the seemingly 'mechanical pattern' that a computer registrant would? I know some parents of friends who very meticulously and methodically fill out forms, reading every box and explanation to ensure that they're inputting the right data.

Any computer judgment of what is authentically human is in a way a reverse Turing test. It's a computer judging if humans are behaving enough like humans. The problem here is too many degrees of separation: a very specific type of human [engineer] designs a computer to assess the 'humanness' of other humans actions. Any such assessment would be based on certain assumptions and biases about how humans act. It sounds like putting a document through Google translator into another language and then back again, before turning it in for a final grade.

Re:I read something about this (1)

adamofgreyskull (640712) | more than 5 years ago | (#27710955)

In addition to this, what about those humans who just happen to fall into the seemingly 'mechanical pattern' that a computer registrant would? I know some parents of friends who very meticulously and methodically fill out forms, reading every box and explanation to ensure that they're inputting the right data.

Even the most "mechanical" of your friends wouldn't download the page, parse it in its entirety without scrolling the page in their browser, then enter all form fields in a fraction of a second, before submitting it. In fact what you're describing is probably exactly the kind of thing that the test would detect as normal human behaviour. Scroll down, read field label, read form field explanation, type answer into form field, scroll down, repeat.

The tricksiness of defining a useful (i.e. easy for a human to pass, difficult for a machine to pass) test will be in measuring things like "by how much did the browser viewport move that time?", "how fast did they type that word into the field?", "did they need to scroll the page to see the field?", "is the scroll exactly 20px every time?", "how much time has elapsed since the viewport was last scrolled?", etc. All of which will have to be measured client-side, *ahem*. THEN you have to feed that into your algorithm and determine how human those inputs make the form submitter. The test could be calibrated by having a number of known humans fill out the form and observing the inputs you get, how much variance there is, etc.

The simplest version of the proposed test is to calculate the amount of time between computer X requesting the form and computer X submitting it. If you've recorded the fastest human at 30 seconds, then you reject all form submissions made before 30 seconds have elapsed. But that's a single data point, and if you were writing a bot, it would be trivial to put in a 30-second wait between form load and submission, if you were willing to wait. Similarly, it will be possible to emulate a human browsing a form and submitting it... but it would hopefully involve a lot more time, effort and money than is economical for the spammer...
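
That simplest version, as a sketch (the token plumbing is hypothetical; only the timing check is the point):

    const MIN_HUMAN_MS = 30_000; // fastest human observed
    const issuedAt = new Map<string, number>(); // form token -> time served

    function serveForm(token: string): void {
      issuedAt.set(token, Date.now());
    }

    function acceptSubmission(token: string): boolean {
      const t0 = issuedAt.get(token);
      return t0 !== undefined && Date.now() - t0 >= MIN_HUMAN_MS;
    }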

Re:I read something about this (2, Insightful)

TheRaven64 (641858) | more than 5 years ago | (#27711237)

It's a nice idea, but unfortunately it's easy for a computer to work around. How does the client-side JavaScript know how much the page has been scrolled? Because the browser tells it. There is nothing stopping a bot from downloading the page and then submitting the same HTTP requests that the client-side JavaScript would (or even running it in a VM and injecting DOM events into it with some random wait events). Once you know the algorithm used on the server to determine whether something is human, it's easy to work around it. In your simple example, the client just needs to sleep for 30 seconds between downloading and submitting the form - one line of code to program, while the test is likely to need at least four lines. This limits the number of registrations a single bot can do in a single day, but only to one site - the bot can overlap its requests so that it's hitting 30 sites at once, and then it's back up to one spam per second. Or, it may keep using the slow approach, making its traffic harder to spot.

Re:I read something about this (1)

Atraxen (790188) | more than 5 years ago | (#27711547)

Plus, there are hardware based differences in interaction that modify your reading/interaction behavior. Analyzing mouse cursor movements for a trackball, mouse, and touchpad will likely give very different results - and that's assuming they're being moved the same way. When I'm reading with a mouse, I tend to 'follow along' on the page - with a trackball, I park the cursor to the side - with a touchpad, I tend to move in blocks. Add enough variables, and you can model any behavior (at the risk of losing the ability to probe correlation of real factors) - by adding enough exceptions to the algorithm to handle all these cases (and all the others) it strikes me as unlikely that the algo would be able to distinguish between humans and bots.

And if it does, the spammers will probably write a trojan that watches for the user generating a login, and swaps the interaction with the captcha the spammer wants solved. Reminds me of the good ole days of Cold War Arms Racing!

Re:I read something about this (1)

canthusus (463707) | more than 5 years ago | (#27710663)

I can't find the article itself, but there's a short summary of it here [slashdot.org].

Re:I read something about this (3, Interesting)

caramelcarrot (778148) | more than 5 years ago | (#27711505)

Last time this came up, I suggested the idea of constant Bayesian analysis on HTTP logs to determine the likelihood of the current user being a bot.

It could take into account things like whether the user bothered to visit previous pages, requested images, the time between requests, etc. You could then either just make the webserver kill the connection, or add a function to your preferred web language (e.g. PHP) that returns the probability that the current user is a bot, and redirect them to a more annoying Turing test or block them.

This would also work pretty effectively if people wanted to stop scrapers and bots in browser games. Of course a bot could mimic all this, but it'd raise the cost of entry significantly, and it might end up that the bot is no more effective than a human working 24/7, though even then you'd need to be changing IPs constantly.

I was thinking of trying to implement this over the summer, based on the comment spam bots on my website, all without any need for client-side spying.
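
A rough sketch of that scoring (the feature set and the per-class probabilities are invented for illustration; real values would be estimated from labeled logs):

    interface Features {
      fetchedImages: boolean;
      visitedPriorPage: boolean;
      subSecondRequests: boolean;
    }

    // P(feature = true | bot) and P(feature = true | human).
    const pBot: Record<keyof Features, number> = {
      fetchedImages: 0.1, visitedPriorPage: 0.2, subSecondRequests: 0.9,
    };
    const pHuman: Record<keyof Features, number> = {
      fetchedImages: 0.9, visitedPriorPage: 0.8, subSecondRequests: 0.1,
    };

    // Naive Bayes: combine per-feature likelihoods under independence.
    function botProbability(f: Features, prior = 0.5): number {
      let bot = prior;
      let human = 1 - prior;
      for (const k of Object.keys(f) as (keyof Features)[]) {
        bot *= f[k] ? pBot[k] : 1 - pBot[k];
        human *= f[k] ? pHuman[k] : 1 - pHuman[k];
      }
      return bot / (bot + human);
    }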

Kills itself like all CAPTCHA killerers (0)

Anonymous Coward | more than 5 years ago | (#27710427)

If you have algorithms to detect human behavior on a web page, you also have algorithms to simulate it. But it would be a small step toward better AI, so go ahead.

Anything you can do, I can do... (0, Redundant)

name*censored* (884880) | more than 5 years ago | (#27710433)

It seems to me that if a bot can check whether or not a person is "acting" human, then it must follow that the bot knows what rules are involved in "acting human". If it understands this, then there's nothing stopping someone from telling the computer to obey those rules itself, which means "AI". The main problem with artificial intelligence is that we don't have a complete and fully accurate list of rules for what a human can/will do; in other words, we're unpredictable. And it's not that computers can't act unpredictably, it's just that we don't know how to make them act unpredictably in the same way a human would.

So, in other words, even if someone could make this test, it would render itself redundant by design.

Re:Anything you can do, I can do... (0, Redundant)

name*censored* (884880) | more than 5 years ago | (#27710445)

redundant

*Ahem*.

Re:Anything you can do, I can do... (0)

Anonymous Coward | more than 5 years ago | (#27710561)

What's your problem? You're at least the fifth person to say this - that's pretty redundant.

Re:Anything you can do, I can do... (0)

Anonymous Coward | more than 5 years ago | (#27711039)

Yes, but look at the timecodes - they were all posted within a few minutes of each other (keep in mind the static page of slashdot updates slowly). It was a pretty obvious thought.

alternate captcha based ways (1)

mehrotra.akash (1539473) | more than 5 years ago | (#27710461)

If there were a way for a computer to determine that the behaviour is human, wouldn't the computer be able to do it anyway? But what about tricks like telling a user to leave a particular field blank and fill it in on the next page instead? This field could be indicated by a CAPTCHA which contains a URL; on opening the URL you get another CAPTCHA which has a number, and you leave that numbered field empty. If the second CAPTCHA is entered wrong, you have to repeat the process from the beginning and fill in two CAPTCHAs on the second page, and so on. This way most humans would be able to do it in one or two attempts, but bots doing it by trial and error would be stuck with thousands of different CAPTCHAs. Also, having a central database of all the types of CAPTCHAs and displaying a mix of two or three different types would be effective, as bots are designed for one type of CAPTCHA only, aren't they?

capture and copy (1)

tmk (712144) | more than 5 years ago | (#27710465)

A system that could capture the way humans interact with forms algorithmically could eventually relieve humans of the need to prove anything altogether.'

This system could also reproduce human interactions, so it's only a matter of time until this behavioural approach stops working.

BTW: I don't want Slashdot to check how I scroll the page, nor is my typing and retyping anybody's business but mine. Imagine not being able to comment anywhere because you block Google Analytics.

If an algorithm can be made to detect it (1)

rolfwind (528248) | more than 5 years ago | (#27710489)

doesn't that just mean a computer can also feed the correct data in, defeating it?

Anyway, the little tests these days are stupid and annoying, and perhaps, for some people, becoming impossible to do. Perhaps instead of the test being administered at the point of registration, new accounts should be automatically monitored for their type of activity.

For instance, if the first post at a forum has any links to blacklisted ad sites (could be EasyList USA, whatever), it's probably safe to just kick it out automatically, and things of that nature. Or use the old trick of signing up with a credit card and charging a one-time $0.41 (or whatever just covers the minimum fees) to keep bots out of the community's hair.

I'm sure other solutions will get the old How-To-Fix-Email response: "Yes, but your idea won't work because (mark a random assortment of 100 checkboxes)."

Re:If an algorithm can be made to detect it (1)

kvezach (1199717) | more than 5 years ago | (#27711203)

doesn't that just mean a computer can also feed the correct data in, defeating it?

Unless P == NP, checking a solution can sometimes be a lot easier than actually generating a solution. Consider, for instance, a hash like SHA-1. The whole point of a secure cryptographic hash is that checking whether a certain hash matches a document is very easy, but crafting a document that matches an already specified hash is very hard.

Re:If an algorithm can be made to detect it (1)

Spasemunki (63473) | more than 5 years ago | (#27711455)

But this problem isn't checking a number-theoretic property; it's applying a heuristic to a small pool of data points that may have been passed to you by a hostile reporter. Nothing indicates that this problem is significantly harder to solve programmatically than to check programmatically. Plus, the attacker gets a free oracle that tells you whether you've created a good set of attack data. Let one real person register through the system, capture their data, and add a small amount of randomness to the timing, and it would appear that you have permanently broken the system; I don't know how you could exclude an attacker doing that without also excluding a lot of actual users.

Tech Support (5, Funny)

cjfs (1253208) | more than 5 years ago | (#27710491)

I can see it now: "have you tried moving your mouse around randomly?", "how about clicking on a few different parts of the page then making coffee?", "still not working? Try slamming the mouse down several times", "okay, as a last resort click on the tabloid pop-up."

Re:Tech Support (1)

ElectricTurtle (1171201) | more than 5 years ago | (#27710591)

Ugh. Mod parent up for truth.

Modelling behaviour (1)

pfafrich (647460) | more than 5 years ago | (#27710497)

The tricky part of an alternative solution seems to be modelling human behaviour: in order to detect whether something is human or not, you need a pretty good model of what humans do. I suspect there would be a lot of variation in the way people interact; if I'm feeling sleepy, I present a very different profile of use than when I'm on task and in flow. A program to do this will probably have to be statistical in nature, with some sort of confidence interval of humanness. Maybe it will need some cluster analysis. This all makes for some pretty hard code, and I'm not convinced the difference between two humans will be smaller than the difference between human and bot.

All those CAPTCHAs... (1)

creimer (824291) | more than 5 years ago | (#27710501)

You mean I didn't need a new pair of glasses every time I couldn't read one of those CAPTCHAs? I want my money back.

good luck humans (0)

Anonymous Coward | more than 5 years ago | (#27710511)

Great. I can just see myself a year from now, getting banned from a website for acting "too much like a robot".

Honeypots are a satisfying solution. Offer actions that the bots will respond to, but that a human would never take.

Response Times (1)

Anenome (1250374) | more than 5 years ago | (#27710517)

Seems some things should be easy. There's a certain minimum amount of time that it takes a human to tab from one field to another as they fill in data, even if they're pasting info in. Even just slowing down bots to the speed that a human could reasonably do a task would put a dent in the problem =\

Re:Response Times (1)

Jason Pollock (45537) | more than 5 years ago | (#27710543)

The problem is already easily parallelised. If it takes you 10s to fill in a form, and it isn't using any CPU (you're sleeping), then run a couple of thousand attempts in parallel. You get the _exact_ same throughput as you do if they are all run serially.

For batch processes, latency isn't really an issue, it just means you need to do more transactions at once.

Re:Response Times (1)

rdnetto (955205) | more than 5 years ago | (#27710815)

Then limit it to one attempt per IP address to prevent the parallelization. The only downside would be that this would also block people behind NAT, since they would have the same address.

Re:Response Times (2, Informative)

Jason Pollock (45537) | more than 5 years ago | (#27710983)

These guys have botnets, and with networks like Tor, you can't limit access to one IP. Besides, if you've got captcha that is being attacked, to limit them by IP, you need to send them all through a single location to perform the detection, completely breaking your load balancing. It becomes a DoS target.

Basically, the attacker has more machines, more IP addresses and more time than the target.

Even if I only have one machine, that's fine, I attack 10 or 100 sites instead of just yours. Or, I use a network like Tor and select random out proxies. The only problem? All of my compatriots will be doing the same.

The target won't see any real decrease in attacks, they will only lose all of their corporate customers who are unable to access the network from home (or dorms, or school, or libraries).

Here's the exploit, zero AI (0)

Anonymous Coward | more than 5 years ago | (#27710535)

Capture those "random" interactions of people with some page of your own (or where you can inject script), replay on target.

hmmmm (1)

thatskinnyguy (1129515) | more than 5 years ago | (#27710581)

A system that can determine whether or not a user is human would have built-in characteristics describing what a human would do in such a situation. What's keeping someone from taking that same algorithm and adapting it for means other than its intended purpose?

If a machine knows what to do, another machine can take advantage of that.

Obligatory: import skynet; blah

The judge is a computer (1)

DeadboltX (751907) | more than 5 years ago | (#27710609)

If the judge of the test is a computer, then the test will always be passable by a computer.

Javascript will kill this idea. (1)

MasterOfDisaster (248401) | more than 5 years ago | (#27710637)

Everyone has been focusing on how easy or difficult it would be to reverse this hypothetical algorithm that would determine, based on your use of a webpage, whether you're human or not... I see a more fundamental problem. This is on the internet, so there are basically three options for implementing it:
1) Server side. The only variable you could track is the time between page requests. I don't see how that could possibly be enough information.
2) Client-side JS. Simple: just modify the JS to return &isHuman=true.
3) Client-side JS acting as a keylogger, sending data back for server-side verification (rough sketch below). Harder to defeat, but you'll lose my business, the business of all of my friends, and have a horde of angry nerds picketing your offices.
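
A rough browser-side sketch of what option 3 would collect (the /behavior endpoint is hypothetical):

    const pending: { type: string; t: number }[] = [];

    // Log only the timing of interactions, not their content.
    for (const type of ["mousemove", "scroll", "keydown"]) {
      window.addEventListener(type, () =>
        pending.push({ type, t: performance.now() }),
      );
    }

    // Batch the timings back for server-side verification every 2 seconds.
    setInterval(() => {
      if (pending.length > 0) {
        navigator.sendBeacon("/behavior", JSON.stringify(pending.splice(0)));
      }
    }, 2000);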

Also, this doesn't take into account any edge cases, for example if I've already been to your site, surf straight to /contact.html, and paste in an email I previously wrote in Word (err, excuse me, OOo).

Re:Javascript will kill this idea. (1)

mrbene (1380531) | more than 5 years ago | (#27710731)

JavaScript can send the data points to the server as events without waiting for navigation events - think AJAX [wikipedia.org].

As for current real world implementation, I've only seen a bank site that uses a Flash app for login, where they measure the typing cadence for user name and password.

Having an unexpected cadence does not prevent log in, but does inform the server to later do additional identity validation, and you get prompted from a pre-configured pool of questions when you try to do things like send money to that guy who just needs to bribe the official, and then the wealth of a princess will be yours...

Re:Javascript will kill this idea. (1)

Spasemunki (63473) | more than 5 years ago | (#27710863)

That still leaves the system vulnerable to a replay attack: modify the client-side JS to record the sequence of events in a successful login, and then play them back later. If you're using the timing between events to determine whether a user is human, you have the problem that Ajax runs asynchronously; the timing between events arriving at the server isn't going to be the same as the timing between the original events. You'll probably also want to batch event sends to keep the app responsive. Both of those things mean you need to record timing information in the data you send over, which means you can't trust the timing data, because it comes from the client.

Record a human, play it back (0)

Anonymous Coward | more than 5 years ago | (#27710641)

Problem solved. How hard is it to record human mouse and keyboard input and then play it back to "break" the security? Not very. How many seconds did they actually spend thinking about this awful scheme?

human usage patterns might vary too much (1, Insightful)

Anonymous Coward | more than 5 years ago | (#27710657)

I think there might be so much variation in human usage patterns, all of which need to be accepted by the algorithm, that it would be easy to simulate behaviour that stays within those bounds.

On the other hand, if the algorithm doesn't allow much deviation, it will annoy a lot of people who get falsely detected as bots. It would probably hit handicapped or elderly people first.

Simple, no? (1)

CaptSaltyJack (1275472) | more than 5 years ago | (#27710669)

Just use JavaScript and watch for either some mouse movements or onBlur/onFocus... and if those are present, then isHuman will == 1, and you pass that to the server side. Actually, you'll want to use some obscure variable name to make it less obvious.

Re:Simple, no? (1)

audunr (906697) | more than 5 years ago | (#27711051)

Actually, you'll want to have some obscure variable name to make it less obvious.

Like isDancer?

Strength in unity-in-diversity (2, Insightful)

brettz9 (969574) | more than 5 years ago | (#27710671)

The problem with a lot of sites dealing with spam is that they are using the same software that tries to solve everything at the top. Uniformity doesn't help.

But leaving people to their own devices to create or adapt their own forum/blogging/wiki software is not a good solution either. Uncoordinated diversity leaves a lot of people to fend for themselves.

Having unity-in-diversity (a common strength across systems and organisms), however, might well solve the problem.

If forum/blogging/wiki software creators would give sites the opportunity to make (and change) their own sets of questions and answers for first-time users (and not trouble them after that), I think bots would be hard-pressed to interpret all such site-specific questions on their own. If bots could actually be programmed to intelligently answer arbitrary human-language questions, I think the bot-makers could be making a lot more dough in legitimate business...

yeah sure (0)

Anonymous Coward | more than 5 years ago | (#27710713)

It takes a human to know one.

DO YOU KNOW HOW OLD THIS IS? (1)

Jane Q. Public (1010737) | more than 5 years ago | (#27710717)

The idea that behavior is a better judge of identity than "biometrics" is old, old. I wish I could remember the name of the program, but there was a GNU/Unix utility that measured word frequency, letter frequency, the delay between pressing any two-key combination on the keyboard, and more, all put together to verify identity. And it worked quite well. I think that program is close to 20 years old.

Biometrics fails for the same reason it always has... as soon as someone comes up with a halfway reliable way to identify somebody, others come up with a fairly reliable way to fake the system. But micro-delays on the keyboard, etc. make for a pretty individual signature.
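
A minimal sketch of that kind of keystroke-dynamics check (the distance measure and threshold are illustrative assumptions, not the old utility's actual method):

    // Inter-key delays are the "signature".
    function interKeyDelays(pressTimesMs: number[]): number[] {
      return pressTimesMs.slice(1).map((t, i) => t - pressTimesMs[i]);
    }

    function meanAbsDiff(a: number[], b: number[]): number {
      const n = Math.min(a.length, b.length);
      let sum = 0;
      for (let i = 0; i < n; i++) sum += Math.abs(a[i] - b[i]);
      return sum / n;
    }

    // Same typist if the sampled delays track the enrolled profile closely.
    const matchesProfile = (profile: number[], sample: number[]): boolean =>
      meanAbsDiff(profile, sample) < 40; // ms threshold, tuned per user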

It's a form of biometrics too (0)

Anonymous Coward | more than 5 years ago | (#27711037)

Measuring micro-delays is just another way of authentication based on something you are (as opposed to something you have and something you know). Just another form of biometrics, with similar pitfalls, as others have already pointed out.

Vision = Fail (1)

ZeroNullVoid (886675) | more than 5 years ago | (#27710727)

Simple enough

Rule of reverse CAPTCHA (0)

Anonymous Coward | more than 5 years ago | (#27710775)

"Whoever is against CAPTCHA (or claims that it has been broken) is someone who would like the web to be something like Facebook, where every user has a login ID in their database"

and at the same time is very pissed off because the captcha breaking programs are not really working.

follow the links to the profit...

April 1 was at the beginning of the month... (1)

Torodung (31985) | more than 5 years ago | (#27710817)

Wouldn't the ability to collect biometric information require a fairly potent piece of spyware to be loaded on the client system? How would a user, or even a security professional, easily tell the difference between a keylogger that reads our actual strokes, and one that is just timing the key presses?

Sounds like a kernel-mode device that would have to be part of the input drivers. It's an attack surface, IMO. I would think it's safer to have a separate input device for biometric authentication only than to attempt to collect biometric metadata from highly sensitive input devices like keyboards and mice.

I did enjoy the 'honeypot field' example (in TFA). I suspect it is probably easily defeated, unfortunately. If the field is hidden on the page, can't we write a bot to detect that physical fact, or any source code (JavaScript?) that hides it? How do you obfuscate something like that without serving it with the page?

Sounds to me like CAPTCHA still wins. Oh well, I didn't expect much. ;^)

--
Toro

Re:April 1 was at the beginning of the month... (1)

guyminuslife (1349809) | more than 5 years ago | (#27710953)

I think the deadline for making meta-April Fools jokes must have also passed. And yes, there's a deadline. "April Fools was last year!" so he says.

Re:April 1 was at the beginning of the month... (1)

perryizgr8 (1370173) | more than 5 years ago | (#27711381)

I did enjoy the 'honeypot field' example (in TFA). I suspect it is probably easily defeated, unfortunately. If the field is hidden on the page, can't we write a bot to detect that physical fact, or any source code (JavaScript?) that hides it? How do you obfuscate something like that without serving it with the page?

You don't actually hide it. You write above the field: "Please leave this field empty."

Spam Karma? (2, Informative)

nilbog (732352) | more than 5 years ago | (#27710843)

It seems like the old Spam Karma module for WordPress did this. It calculated how long you were on the page vs. how much you had typed, how fast you typed, and a bunch of other factors before it ever showed a CAPTCHA. Back when I used WordPress I remember it being pretty accurate, too.

hot or not (0)

Anonymous Coward | more than 5 years ago | (#27710891)

Simply show two pictures of women and ask which one is hotter. Make sure one is ugly and the other fuckable.

voice recording (2, Insightful)

Ofloo (1378781) | more than 5 years ago | (#27711011)

Think of every behavior as a voice recording: record and replay! And there you go, bots are able to mimic it.

External authentication (1)

Jeppe Salvesen (101622) | more than 5 years ago | (#27711073)

CAPTCHAs etc. won't work perfectly. Ever. There are always bot(net)s able to defeat them. If you can use software to make the lettering difficult to read, you can still write software to read it. Like the algorithms, we detect the order in the chaos.

So let's just face it:

The internet needs a unified authentication system if we are to kill spam. With a unified authentication system, you wouldn't need to store your passwords around the internet, and your mails would be traceable to you.

So, let those who need anonymity create their own solutions for interacting anonymously.

Not a great idea (3, Interesting)

jgoemat (565882) | more than 5 years ago | (#27711101)

The article did have links to some interesting topics, such as Google experimenting with image orientation as a test. The premise of using how a user interacts with a page is deeply flawed, though. There's not even a need for an algorithm or program to 'figure out' the CAPTCHA; just record how an actual user interacts once, and you can send the exact same thing every time to pass the test. This works because the 'question' doesn't change. It would be like showing the same text CAPTCHA every time. If identical values are rejected, they can just be fudged a bit.

Re:Not a great idea (0)

Anonymous Coward | more than 5 years ago | (#27711285)

I'm not sure I understood your point but sending the same response again will not work if the captcha is properly implemented, because the captcha question is usually associated with the IP of the sender and it will change on every pageload from the current IP. So if you send the same captcha answer a second time, it will not pass. One captcha answer would only be valid for one submission.

Re:Not a great idea (1)

perryizgr8 (1370173) | more than 5 years ago | (#27711393)

The image orientation test seems to be the real answer. Man, Google has some smart people.

Use Turbo Tax Lately (2, Interesting)

SunSpot505 (1356127) | more than 5 years ago | (#27711143)

When I posted a question to the TurboTax community forum, it asked a simple question as a CAPTCHA. Seems like an easy enough solution, and it changes each time to foil a persistent brute-force attack.

Of course, I'm sure it's only a matter of time before someone has an algorithm smart enough to answer questions. And I suppose that a botnet with enough time would work too. Still an interesting approach, I thought.

"Scrolling and typing" (2, Insightful)

Arancaytar (966377) | more than 5 years ago | (#27711153)

The user's local behavior before form submission is detectable only via a client-side script. There are therefore two ways this can go.

1.) You maintain accessibility standards and make the client-side script optional. The effectiveness of this approach is comparable to xkcd's [xkcd.com] "When Littlefoot's mother died in /Land Before Time/, did you feel sad? (Bots: NO LYING)"

2.) You require client-side script execution in order to submit the form. The effect is a lot of pissed-off users with NoScript or non-compatible Javascript interpreters (IE or the rest, depending on which one you support).

This idea is basically like visual captchas, but instead of the visually impaired, you're screwing everyone without Javascript.

There is one aspect of user behavior that can be detected, however, and that is the time passed between the user requesting the form and submitting it. From an AI perspective, humans spend an eternity typing, so setting a minimum delay between request and submission will slow the bot right down - especially with a flood control that requires a delay before submitting the next form. Slashdot does both of these things already, by the way.

Google Groups' implementation (1)

tfg004 (974156) | more than 5 years ago | (#27711205)

Some time ago I noticed that Google Groups has implemented bot detection based on behaviour.

However, often when I browse through a Google group in an efficient way, Google thinks I'm a bot and blocks me for quite a while. The only way around it is to work inefficiently on purpose, making my clicks as random as possible, with as-random-as-possible time intervals. This costs me at least five times as much time as the efficient way.
It's very annoying, so I think it would be better for them to ditch the behaviour detection and just rely on properly designed CAPTCHAs.

Another flaw in this idea (1)

olddotter (638430) | more than 5 years ago | (#27711343)

The CAPTCHA is entered into a field and submitted to the web server. However, our random highlighting, backspacing, scrolling, etc. all happen in the browser on our own system. The web server (thank ______ ) doesn't know about any of that; it just sees the end result. So it doesn't have access to any of that data to make any kind of determination. Currently only malware would be collecting this data and sending it somewhere. So the proposal here is to be verified as human by malware.

There are other flaws that others have pointed out.

Two more angles no one seemed to take... (0)

Anonymous Coward | more than 5 years ago | (#27711463)

First, ask yourself this simple question: Is CAPTCHA popular because no one has thought of anything else, like the alternatives in the article? I doubt it. I'd suggest that CAPTCHA is popular because it is a better solution than those simple alternatives. The only criticism I hear of CAPTCHA in all this debate is that it is inconvenient. The other solutions, while perhaps more convenient for the user, do not solve the problem of sorting bots from humans nearly as well.

To drive this point home, consider the simple fact that CAPTCHA is so effective at sorting out bots from humans that the spammers have taken to paying humans to solve them. Could any of these proposed alternatives be more effective? How will you sort out the humans-paid-by-spammers from the rest of the humans? And if your alternative is no more effective than CAPTCHA, just more convenient, then you have made the humans-paid-by-spammers' jobs easier.

Second, I propose a REAL criticism of CAPTCHA: accessibility. I don't mind that CAPTCHA is inconvenient for 999 out of every 1000 people. I mind that CAPTCHA is impossible for 1 out of every 1000 people. CAPTCHA doesn't just sort bots from humans; it is stronger than that. CAPTCHA sorts fully functioning, healthy humans from everything else, including handicapped humans. Yes, CAPTCHA puts people with disabilities into the bot category, and that is the REAL reason we should move on from CAPTCHA.
