Writing Style Fingerprint Tool Easily Fooled

samzenpus posted more than 5 years ago | from the is-that-your-signature dept.

The Courts 96

Urchin writes "Some of the techniques used by literary detectives and courts of law to identify the authorship of text are easily fooled, say US researchers. They found that non-professional writers could hide their identity from 'stylometric' techniques by writing in the style of novelist Cormac McCarthy. Stylometric methods have been used in a number of high-profile legal cases in recent decades, including the 'Unabomber' trial. 'We would strongly suggest that courts examine their methods of stylometry against the possibility of adversarial attacks,' say the researchers."

96 comments

Could have told you writing analysis was bogus.... (3, Insightful)

Peter Steil (1619597) | more than 5 years ago | (#29130591)

....from the beginning. Sure it may work on a limited set of individuals. It's the same thing as a polygraph test, it's not based on any sort of quantifiable data but mere suspicion at best. It is completely subjective and there is no real hard science to support such tests. This is the reason why polygraphs are not admissible in court, and why writing analysis shouldn't be either. Be sure to watch for writing analysis to show up on the next Maury show!

Re:Could have told you writing analysis was bogus. (5, Informative)

Anonymous Coward | more than 5 years ago | (#29130635)

Some analysis of handwriting can be useful. In forgery, for instance, a signature can show as false when compared to an authentic one by the presence of a "forger's tremor", because the forger must proceed more slowly to produce the signature than the person to whom it properly belongs.

Re:Could have told you writing analysis was bogus. (5, Interesting)

Jason Levine (196982) | more than 5 years ago | (#29131591)

I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors. For example, when we purchased our house I had to sign my name to a dozen or more papers. The first signature looked "normal" but the later signatures were glorified scribbles. If I needed to sign a check last and just scribbled my signature on the back, would the bank (not privy to my signature's declining quality in the previous paperwork) be able to tell that it wasn't a bad fake?

Re:Could have told you writing analysis was bogus. (5, Funny)

jbudofsky (1279064) | more than 5 years ago | (#29132091)

I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors.

Signatures written on paper are not all that helpful for a few reasons. First off, they are easy to forge. Second off, a single person might sign his name twice and produce two signatures which look very different to both the naked eye and some forms of analysis - hence not accurate. Where they actually are accurate, however, is when written on pressure-sensitive pads (such as those seen on newfangled credit card swipers). If you were to do an analysis of the pressure and speed at which the signer signed various parts of the signature, you would actually produce some very reliable information. This is because even when you sign your name in slightly different manners you have the tendency to use the same speed/pressure on certain parts of certain letters. Personally I would just use digital signatures...but calculating hash functions on the back of your restaurant receipt is never fun. It's also difficult to fit a 256-bit output on that minuscule "sign here" line.

Re:Could have told you writing analysis was bogus. (1)

Philip K Dickhead (906971) | more than 5 years ago | (#29132407)

Dear Sirs,

I find highly offensive your suggestion that styles of writing may be subject to gimmickry and impersonation. I wish to complain in the strongest possible terms about the broadcast, and am deeply dismayed at the judgment displayed by the BBC in funding and producing such rubbish. Many of my best friends groom haddock and other North Atlantic fishes and only a few of them are transvestites.

Yours faithfully, Brigadier Sir Charles Arthur Strong (Mrs.)

Re:Could have told you writing analysis was bogus. (2, Funny)

a whoabot (706122) | more than 5 years ago | (#29134031)

Dear Sirs and Madam,

I wish to complain about that last complaint. I can assure you that all groomers of haddock and every other species in order Gadiformes are indeed transvestites. This is in fact a necessary grade to be reached in the apprenticeship process for the Gadiformes Groomers Guild (GGF). If the former complainant indeed knows of any non-transvestite groomers as such, then he should report them both to the GGF and to the Ministry of Fish Groomers in Luton at once!

Angrily,
Mr. Pint

Re:Could have told you writing analysis was bogus. (1)

Jason Levine (196982) | more than 5 years ago | (#29132833)

I don't know, some of those pads are OK at capturing my signature but others leave it a jumbled mess worse than any signature I've ever written with a pen. And that includes the "just signed my name 100 times, here's another paper to sign for my house" signature. I'm guessing that the differences are either expense (places that go with cheap pads get horrid looking signatures) or when the pad was purchased (earlier ones worse at capturing signatures than later ones).

As a side note, am I the only one who doesn't like it when my signature is printed on my receipt? It means that a receipt that I'd otherwise just throw out (since it doesn't have my full credit card number on it) becomes one I need to shred.

Re:Could have told you writing analysis was bogus. (1)

feandil (873841) | more than 5 years ago | (#29134471)

what's wrong with a PIN ? that cannot be forged. civilised countries do use them for credit card payments now you know.

Re:Could have told you writing analysis was bogus. (1)

msaavedra (29918) | more than 5 years ago | (#29137929)

Signatures written on paper are not all that helpful...Where they actually are accurate, however, is when written on pressure-sensitive pads (such as those seen on newfangled credit card swipers)

This may be slightly offtopic (but hopefully interesting to the slashdot crowd), so I apologize in advance. I've been trying to figure out how to use electronic signature pads to verify job authorizations, and haven't been able to come up with a way to make them seem airtight to me if a customer denies issuing the authorization. Perhaps you or another reader can enlighten me.

I can record the data coming in from the signature pad and associate it with the job ticket in our database easily enough. However, if the customer denies authorizing the work, and we show them the signature data, they can just claim we copied it from another ticket. That seems like a reasonable defense to me, and one that very well might hold up in court if it came to that.

I've tried to think of various ways to hash the signature data with unique information from a job ticket, but can't think of anything that can get around the fact that we have access to the raw data that comes from the signature pad, and can do what we want with it. Therefore, I don't see how they can be used for anything like signing a contract.
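
The sticking point above can be shown with a small sketch. It assumes the pad hands back raw bytes and each ticket has an ID; the function name, the ticket IDs, and the capture data are all made up for illustration. Hashing the two together does bind a signature to one ticket, but whoever holds the raw capture can recompute the same binding for any other ticket:

```python
import hashlib

def bind_signature(raw_pad_data: bytes, ticket_id: str) -> str:
    """Hash the raw pad capture together with a job ticket ID (hypothetical scheme)."""
    ticket_bytes = ticket_id.encode("utf-8")
    h = hashlib.sha256()
    h.update(len(ticket_bytes).to_bytes(4, "big"))  # length prefix keeps the fields unambiguous
    h.update(ticket_bytes)
    h.update(raw_pad_data)
    return h.hexdigest()

# Made-up capture data standing in for whatever a real pad emits.
capture = b"x=10,y=12,p=0.4;x=11,y=13,p=0.5"

genuine = bind_signature(capture, "ticket-1001")

# The shop still holds `capture`, so it can mint a binding for a ticket
# the customer never saw, and that binding checks out just as well.
forged = bind_signature(capture, "ticket-2002")

print(genuine != forged)                                 # bindings differ per ticket
print(forged == bind_signature(capture, "ticket-2002"))  # yet the forged one verifies
```

So the hash by itself proves nothing about who produced it, which is the objection raised above; closing that gap would seem to need a secret the shop never sees.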

Of course, a signature on paper (which is what we currently do) can be forged, but there are ways to tell that have been mentioned elsewhere in this story.

Re:Could have told you writing analysis was bogus. (1, Interesting)

Anonymous Coward | more than 5 years ago | (#29142465)

Over in Japan, we use Hanko, which are simply ink stamps.
While signatures can be forged, Hanko is susceptible to theft AND duplication from the stamp.
I think signatures work on the assumption that signatures are like "artifacts" of one's personality - pretty much like statistics that describe
the character of a population. The same goes for stylometrics.
These, like MD5, are good for match identification, but not for authentication.
Using stylometrics as evidence IMHO is a misuse of technology.

Re:Could have told you writing analysis was bogus. (0)

Anonymous Coward | more than 5 years ago | (#29136755)

I've always wondered just how accurate signatures are. I've noticed that my own signature varies widely depending on various factors.

Yeah... Some of my signatures are clear and readable. Anybody could easily read my name from them. In others I don't even try to write all the letters. Today I actually signed one receipt without trying to write even one letter there. It is just curvy lines in the general shape of my autograph.

However, I have noticed that there are some factors that can be found from every single signature I've given. For example, I always make the last curve (which originally was the lower right corner of letter "a") in exactly the same shape, even when I don't actually write the correct letter before it. That's probably not the only similarity in all my signatures, even though the circumstances change. Most likely, the same applies to you.

The question is... How difficult would those similarities be to fake? Also, how much would I need to be able to change them to later say in court "You can't prove that I wrote that"?

Re:Could have told you writing analysis was bogus. (1)

sjames (1099) | more than 4 years ago | (#29155057)

The fact is, our entire banking system is built upon little more than trust. Neither teller nor merchant has any idea if the squiggle on the check is MY squiggle or not. The cost of analysis to gain any level of certainty exceeds the value of a typical check.

The security features on the check don't mean a lot either. The check printer has no real way of knowing that I am or am not the person whose information they are printing on the check. In any event, a bank will cash anything check-like they are given. A man once drew a check on the back of his shirt and sent it to the IRS. He received his shirt back from his bank, canceled.

All of the "security features" and the whole concept of signature for authentication are nothing more than a cursory measure to help keep honest people honest.

Re:Could have told you writing analysis was bogus. (1)

CarpetShark (865376) | more than 5 years ago | (#29132373)

a signature can show as false when compared to an authentic one by the presence of a "forger's tremor", because the forger must proceed more slowly to produce the signature than the person to whom it properly belongs.

Which is a totally arbitrary differentiation, considering that a confident, arrogant, or unconcerned forger might well write less hesitantly than a person worried about their handwriting quality, or whether they actually have enough in the bank to cover what they're signing for.

Re:Could have told you writing analysis was bogus. (1)

onionlee (836083) | more than 5 years ago | (#29132903)

sorry, but RTFA. this is stylometry, not handwriting analysis.
Wikipedia: Stylometry [wikipedia.org]
Wikipedia: Graphology [wikipedia.org]

Re:Could have told you writing analysis was bogus. (1)

mcmonkey (96054) | more than 5 years ago | (#29134047)

Some analysis of handwriting can be useful. In forgery, for instance, a signature can show as false when compared to an authentic one by the presence of a "forger's tremor", because the forger must proceed more slowly to produce the signature than the person to whom it properly belongs.

Perfect example! What you've detected is the speed and deliberation of the signer. In using this method to detect forgeries, you must make many assumptions regarding the state of mind of the signer.

Using myself as an example, my marriage certificate is one of the very few examples of my verified signature. That is, it was signed in front of witnesses, including a photographer, so there's no doubt I signed that piece of paper.

But that is not my typical signature. In addition to the audience and the occasion, I knew the certificate would be framed. So I didn't do my usual get-it-over-with quick illegible scrawl.

Since then, I've signed many checks, credit receipts, etc. when there was no witness. I'd guess the difference between my everyday signature and my wedding day signature could cast doubt on whether they were all signed by the same person.

The signatures are different, but forgery is just one explanation of that difference.

Polygraphs have the same weakness. Yes the person's heart rate or skin conductivity changes, but deception is just one explanation of that difference.

Re:Could have told you writing analysis was bogus. (1)

jonbryce (703250) | more than 5 years ago | (#29140385)

This is writing analysis, not handwriting analysis. It looks at the words and punctuation you write, not the shape of the letters you write, so it can be used for typed documents. If they were looking at my writing for example, they would look at my vocabulary, the fact I use British rather than American spellings and words and so on.

Re:Could have told you writing analysis was bogus. (4, Insightful)

KibibyteBrain (1455987) | more than 5 years ago | (#29130677)

I don't think anyone has ever sold writing analysis as a unique identifier. But it can be useful. If someone was an unpublished author in any significant form, and then "went unabomber" and started to write letters as a calling card, very similar writing styles and structures between the incriminating work and the unpublished/unpopularized previous work would be evidence to at least raise suspicion that the writer of the previous work was somehow uniquely tied to the crimes, even if not directly. Of course, all bets are off if it is plausible that someone could have pre-analyzed the author to imitate. It's also of note that this is only a positive test (i.e. a failed match in analysis makes no claim at all as to whether or not someone wrote it). A good example would be a set of writing that demonstrates an idiom used only in a certain locale, a business term used only in a certain company, and an ideological term used only in a certain fringe political movement. This is reasonable *evidence* of authorship, where of course evidence != proof. The polygraph, on the other hand, is complete BS because the only real thing a polygraph achieves is to psychologically motivate the taker to tell the truth due to "faith" in the fact he will be outed for lying by the device. It doesn't actually measure anything related to the statements, only the physiological condition which can depend on millions of independent factors.

Re:Could have told you writing analysis was bogus. (1)

Blackhalo (572408) | more than 5 years ago | (#29130693)

"I do not think anyone has ever sold as an analysis of writing a unique identifier. But it can be useful. If one was an unpublished author in any way, and then is Unabomber, "and began to write letters as a calling card, can be deduced from very similar writing styles and structures of work and unpublished incriminating / unpopularized previous evidence that at least raise the suspicion that the writer of the earlier work was somehow tied to the crimes, though not directly. Of course, all bets are off if it is possible that someone could have analyzed previously the author to imitate. Also of note, this is only one positive test (ie, a mismatch in the analysis is not intended at all as to whether someone wrote it). I would be a good example of writing that demonstrates a language used only in a particular location, a business in which the term is used only one enterprise, and an ideological term used only in a certain political stripe movement. * * This is reasonable evidence of authorship, which, of course, tests! = Test. The polygraph, on the other hand, is complete BS because the only real thing is a polygraph achieves the beneficial psychological reasons to tell the truth because of the "faith" in the fact that they lie for outt by the device. Not really measure anything related to the statements, only the physiological condition that may depend upon millions of independent factors."

If going unabomber you could always bounce it through a translator as I did with your post. Of course you could end up with quite a bit of "Someone set us up the bomb!" But still semi-legible. I wonder how the analysis would do with that?

Re:Could have told you writing analysis was bogus. (4, Interesting)

KibibyteBrain (1455987) | more than 5 years ago | (#29130719)

Again, that's why it's clear that writing analysis is only a positive test. If steps are taken to actively change the style of writing, of course it will fail. It is something like saying an audio recording of someone's voice in a phone call is invalid, because it is possible to speak in a different voice. While true, this doesn't significantly weaken the positive test value.

Re:Could have told you writing analysis was bogus. (0)

Anonymous Coward | more than 5 years ago | (#29134861)

From the Article:
And the techniques consistently identified Cormac McCarthy as the author of the imitations of his work.

Re:Could have told you writing analysis was bogus. (1)

mpe (36238) | more than 5 years ago | (#29131223)

If one was an unpublished author in any way, and then is Unabomber, "and began to write letters as a calling card, can be deduced from very similar writing styles and structures of work and unpublished incriminating / unpopularized previous evidence that at least raise the suspicion that the writer of the earlier work was somehow tied to the crimes, though not directly. Of course, all bets are off if it is possible that someone could have analyzed previously the author to imitate.

Or that either their previous work or their terrorist "press releases" were deliberately written in a different style, e.g. maybe they thought they had to write in a specific way to be published in a certain field or they are trying to incriminate someone else.

Yes, but here's the problem (5, Interesting)

Moraelin (679338) | more than 5 years ago | (#29130845)

Yes, but the problem is this:

1. It's not just that it's possible to fake not being myself, it's also that I can pretty much frame someone else. E.g., given enough messages written by KibibyteBrain (which just clicking on the user name or id will give me a list of), it's trivial to do a stylistic analysis on those and not just get an idea of how to write in the same style, but run the same analysis on the result and refine it until the match is outstanding.

2. From what I understand, the people in this test fooled it by merely being told to write in the style of someone else, without the help of any analysis tools, and still fooled it majorly. That's some pretty damn fragile "evidence" if anyone asks me. It's something Joe Sixpack can do by himself. Add some tools and it can only get crappier.

Even such idioms as you mention, are trivial to notice even without any tools. E.g., with only a little correspondence with another team here and reading some of their docs, I can tell that they use "solution" instead of "application".

3. While it can be handwaved as "eh, nobody said it's perfect", some people do seem to take it as less fallible than it really is. Even you just called it "This is reasonable *evidence* of authorship, where of course evidence != proof." And that's the whole point. Something that can be fooled by almost any Joe Sixpack without any tools or much effort, isn't reasonable evidence at all.

We allow evidence like handwriting, signatures, fingerprints, or DNA because they're supposedly very very hard to fake well. Ok, so DNA turned fakable as well, but you need a fair bit of expensive lab equipment and knowledge. It's something a biology prof at a medical college could probably do, but not something Joey Three-fingers the small time smuggler would even know where to start if he wants to plant someone else's fake blood at his latest shootout scene. Or fingerprints turned out easy to fake for the purpose of fooling a fingerprint reader, but it's still very very hard to transfer to an object in a way that looks genuine.

But here we have something that untrained people fooled by just being told to try. I'm sorry, but for me then it shouldn't be evidence at all.

Re:Yes, but here's the problem (1)

mdarksbane (587589) | more than 5 years ago | (#29131343)

I think it's about as much evidence as having someone's IP address. It can be spoofed, it's not necessarily linkable to that exact person - but it is sort of a pointer in the direction of that person, as occam's razor would suggest that it is more likely to be real than a frame.

So I would not say it should be admissible in court, or if it is it should come with a giant caveat, but I could see it pointing investigators in the direction of someone to try to find more hard evidence.

Article doesn't talk about incriminating others (2, Interesting)

neo (4625) | more than 5 years ago | (#29132287)

While you can attempt to write in someone else's style, you're going to run into problems duplicating it strongly enough for a stylometric analysis to implicate them. Even if you lifted exact phrases from previous works you will invariably need to come up with original words, phrases, and sentence structures to fill the gaps where the original author has not written. These should be enough to put reasonable doubt as to the authorship of the faked text.

Moreover, if it's identified as a fake, by eliminating the material that was copied from previous styles it's likely that your identity may be revealed from the pieces that you inserted to fill gaps. Obviously the longer the piece, the more likely this is.

The technique of hiding one's own identity is a matter of using the same techniques in stylometrics to identify phrases, words, and structures that would identify you, and then changing these until they no longer give an indication of your identity.

Attempting to create a work that duplicates someone else's stylometric signature would be fairly obvious to linguists.

RTFA, seriously (3, Informative)

Moraelin (679338) | more than 5 years ago | (#29133235)

From TFA: "Each volunteer was then asked to write a description of their neighbourhood in a way that masked their personal style, before writing a further passage in the style of novelist and playwright Cormac McCarthy." [...] "the techniques consistently identified Cormac McCarthy as the author of the imitations of his work."

So, yes, the whole bloody experiment was precisely about disguising your style as someone else, and no, it did not give the tests any reasonable doubt. People trying to imitate Cormac McCarthy were consistently identified as Cormac McCarthy by the stylistic analysis techniques. It doesn't get more clear cut than this, really.

So, yes, it is very possible for an average Joe Sixpack to incriminate someone else, if they so choose.

Re:RTFA, seriously (1)

An Onerous Coward (222037) | more than 5 years ago | (#29136347)

Somebody fairly well versed in these techniques ought to create a tool to help spoof another person. Upload the spoofing text, a substantial volume of the spoof victim's writing, hit go, and it comes back with a match rating, and perhaps suggestions for improving it (e.g.: longer sentences, compound sentences, more frequent use of the word "unfortunately", etc.)
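
A toy version of such a tool might look like the sketch below. Everything in it is an assumption made for illustration: the two features (words per sentence, commas per sentence), the 20% tolerance, and the sample sentences are invented, and real stylometry software uses far richer feature sets.

```python
import re

def style_features(text: str) -> dict:
    """Two toy features: mean sentence length in words and commas per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    n = max(len(sentences), 1)
    return {
        "words_per_sentence": len(words) / n,
        "commas_per_sentence": text.count(",") / n,
    }

def suggestions(draft: str, target_corpus: str, tolerance: float = 0.2) -> list:
    """Name each feature where the draft is off from the target by more than `tolerance`."""
    d, t = style_features(draft), style_features(target_corpus)
    tips = []
    for name in d:
        if d[name] == 0 and t[name] == 0:
            continue  # both texts agree that the feature is absent
        gap = (d[name] - t[name]) / max(t[name], 1e-9)  # relative gap, safe when target is 0
        if gap > tolerance:
            tips.append(f"lower {name}")
        elif gap < -tolerance:
            tips.append(f"raise {name}")
    return tips

# Made-up sample text, not taken from any real author.
target = "He rode on. The sun fell. Nothing moved."
draft = "He rode on through the long afternoon, and the sun fell slowly, and nothing at all moved."

print(suggestions(draft, target))
# ['lower words_per_sentence', 'lower commas_per_sentence']
```

Looping on the suggestions until the list comes back empty is the "hit go, get a match rating" workflow in miniature.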

That would pretty much doom the whole enterprise (or at least force it to advance beyond the current state of the art).

Re:Yes, but here's the problem (2, Insightful)

hairyfeet (841228) | more than 5 years ago | (#29133719)

It sounds to me like this "evidence" is just another case of bullet matching [sfgate.com] , which for those that haven't heard the term was the rage at the FBI for awhile and I'm sure there are innocent people rotting in jail right now over its bogus findings.

What we have to be seriously careful about with these pseudoscience "tests", is the simple fact that juries love CSI style mumbo jumbo that makes solving a case little more than a magic box pointing out someone and saying "He did it". And just like bullet matching juries would put far too much weight onto this type of evidence, strictly because of the "CSI Factor" and how scientific it sounds. That is why I am always leery of these kinds of "helpful evidence" simply because juries will give them much more weight than the science behind them says they are worth.

No information is better than bad information... (5, Insightful)

Xenographic (557057) | more than 5 years ago | (#29131157)

> I don't think anyone has ever sold writing analysis as a unique identifier. But it can be useful.

One problem with that is the human tendency to be overconfident as to how good these tests are. This happens everywhere. Court, business, whatever.

Say you have some metric at work (e.g. lines of code) that's easy to measure. If it's the only measure management has, it's what they'll use to measure how well you're doing. This applies even if the results are absurd, because they would rather believe that they have *some* idea what's going on than to accept the fact that they have no idea what's going on.

In summary, sometimes NO information is better than bad information, but people are very reluctant to accept that fact.

Re:Could have told you writing analysis was bogus. (1)

IBBoard (1128019) | more than 5 years ago | (#29130855)

Ah, the irony of someone saying "I could have told you" and then saying that it's "completely subjective" and has "no real hard science to support [it]"!

Writing style probably can be useful evidence where the style isn't known by others in advance, but it is quite easy to fake a style (much like having a "normal written style" and a "formal report style").

Re:Could have told you writing analysis was bogus. (0)

Anonymous Coward | more than 5 years ago | (#29131473)

....from the beginning. Sure it may work on a limited set of individuals. It's the same thing as a polygraph test

No. Comparing someone's writing style and their skin conductance are not even remotely similar. You're a fucking idiot.

Re:Could have told you writing analysis was bogus. (1)

element-o.p. (939033) | more than 5 years ago | (#29134791)

No. Comparing someone's writing style and their skin conductance are not even remotely similar. You're a fucking idiot.

I am truly stunned by the depth of the logical reasoning and analytical skills you displayed in the way you completely debunked GPP's post. After reading your argument, how could anyone possibly conceive that the usefulness of stylography and polygraphy in legal investigations could be at all comparable? Truly, I'm speechless at your boundless intellect.
</sarcasm>

Re:Could have told you writing analysis was bogus. (2, Insightful)

Lillesvin (797939) | more than 5 years ago | (#29131609)

It is completely subjective and there is no real hard science to support such tests.

I beg to differ. There's very little subjective in stylometrics, the subjective part is interpreting the results, but definitely not producing them. Take a look at http://en.wikipedia.org/wiki/Stylometry [wikipedia.org] and tell me which of the methods described there you think is "completely subjective".

The main problem with stylometry is not the methods, but the data. As TFA describes, changing writing style throws off the results - at least to some extent. Stylometrics relies on the fact that old habits die hard, but if someone is aware that the text they are producing might be subjected to stylometric analyses, they can employ various mechanisms to avoid identification and will probably have a better chance at succeeding than if writing casually. However, most texts used in court have been produced casually (letters, emails, text messages) and almost always have some unique traits specific to their author. Even in cases where people plagiarize a known author, they always miss some subtlety in his/her style that gives away the plagiarism. These subtle differences in style are usually caught somewhere in the stylometric analysis.

It occurs to me now that you may be talking about hand-writing analysis, in which case my reply is completely irrelevant and you have completely missed the point of summary and TFA.

Re:Could have told you writing analysis was bogus. (1)

element-o.p. (939033) | more than 5 years ago | (#29135351)

The main problem with stylometry is not the methods, but the data. As TFA describes, changing writing style throws off the results - at least to some extent...if someone is aware that the text they are producing might be subjected to stylometric analyses, they can employ various mechanisms to avoid identification and will probably have a better chance at succeeding than if writing casually. However, most texts used in court have been produced casually (letters, emails, text messages) and almost always have some unique traits specific to their author.

But therein lies the rub: how can you be certain that the actual author didn't consider that the text might be subject to stylometric analysis? Even as a kid, if I wrote something that I didn't want traced back to me, I made an effort to disguise my handwriting and writing style. If I thought of that back when I was a semi-delinquent teen/pre-teen (okay, not really delinquent, but I did get a little mischievous once or twice), I can just about guarantee that anyone who is doing something that might land them in real legal trouble will do likewise.

In other words, for stylometric analysis to have *any* degree of validity whatsoever, you not only have to prove that the styles of the sample text and suspected author's typical body of writing match, you also have to prove that the original author never considered that the writing style would be analyzed, and therefore that the original author did not take any steps to disguise writing style. You can't make any assumptions about what the real author expected when composing the message.

Re:Could have told you writing analysis was bogus. (1)

Lillesvin (797939) | more than 5 years ago | (#29138491)

That is true, but that's where the habitual aspect comes in. While you may be conscious about various aspects of your writing style, there are certain areas that are less prone to conscious manipulation --- e.g. certain syntactical constructions or your active vocabulary. No one (ie. no forensic linguists) will believe that you are Douglas Coupland if the frequency of certain prepositions in your text deviates wildly from his works. And yes, you can of course tamper with such frequencies, but the point is that most people don't. You don't totally dismiss fingerprints as evidence because some criminals wear gloves, do you?
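
To make the preposition example concrete, here is a minimal sketch of that kind of measurement. The preposition list and the distance measure are assumptions picked for illustration, not what any forensic linguist is known to use:

```python
import re
from collections import Counter

# Illustrative list only; a real analysis would use a much larger set of function words.
PREPOSITIONS = ("of", "in", "to", "on", "with", "by", "from", "at")

def preposition_profile(text: str) -> dict:
    """Relative frequency of each listed preposition among all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = max(len(words), 1)  # avoid dividing by zero on empty input
    return {p: counts[p] / total for p in PREPOSITIONS}

def profile_distance(a: str, b: str) -> float:
    """Sum of absolute frequency differences; 0.0 means identical profiles."""
    pa, pb = preposition_profile(a), preposition_profile(b)
    return sum(abs(pa[p] - pb[p]) for p in PREPOSITIONS)

print(preposition_profile("to be or not to be")["to"])  # 2 of 6 words
print(profile_distance("the cat sat on the mat", "the cat sat on the mat"))
```

A small distance between a questioned text and a known author's body of work is the habitual signal being described; it only means something on samples much longer than these one-liners.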

It's also important to note that no court has ever based a verdict solely on stylometry. Stylometry will never give any definitive answer, but it might corroborate other evidence, which is kind of the whole idea. Stylometry may help eliminate a subject as well as identifying one, so while it may not be usable as the sole base for a conviction, it's still very useful and should be acknowledged as such.

If you really want to know about stylometry (and forensic linguistics in general), I suggest taking a look at John Olsson's website: http://thetext.co.uk/ [thetext.co.uk] and/or reading his book Word Crime, which is easily read even by people without linguistic training. John Olsson is one of the only full time forensic linguists and has dealt with a lot of different cases --- some involving stylometry.

A final note: Please stop referring to it as "writing style fingerprint" --- no serious forensic linguists do that, since it's in no way similar to fingerprints. Writing style doesn't rely on biometrics and is much more easily changed than the pattern of the ridges on your fingertips.

Concealing style (4, Funny)

Anonymous Coward | more than 5 years ago | (#29130603)

hide their identity from 'stylometric' techniques by writing in the style of novelist Cormac McCarthy

... or Anonymous Coward.

Re:Concealing style (0, Offtopic)

fridaynightsmoke (1589903) | more than 5 years ago | (#29130961)

Never mind Anonymous Coward, I want more posts by Anonymous Cowardon! I haven't seen him/her around recently.

Re:Concealing style (2, Informative)

Thanshin (1188877) | more than 5 years ago | (#29131097)

What a crappy joke. I wish I could find you and kill you.

I mean...

Oh! A bad pun! Should we cross our paths, I'd rather extinguish your life.

My dear sir.

ENOSHIT (0)

Anonymous Coward | more than 5 years ago | (#29130641)

You mean human beings can mimic each other? Aw shucks

Duh! (3, Insightful)

k.a.f. (168896) | more than 5 years ago | (#29130647)

If the methods a stylometry analysis uses are known (and they couldn't very well be a secret to hold up in court), of course you can game them. As long as the algorithm outputs "no" for any reformulation of your message, you can easily find it, by generate-and-test if necessary. The only question is, how fast can you generate a text that (a) says what you intend and (b) does not point to you? Very fast, I'd wager.
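A toy version of that generate-and-test loop, in Python. The detector here is a made-up stand-in (it just looks for a couple of habitual words), and `slots`, `TELLS` and the function names are hypothetical; real stylometry software is not this simple:

```python
from itertools import product

def generate_and_test(slots, looks_like_me):
    """Return the first rewording the detector does not attribute to the author.

    `slots` is a list of lists: each inner list holds interchangeable
    phrasings for one position in the message. `looks_like_me` stands in
    for whatever (hypothetical) stylometric test is being evaded.
    """
    for choice in product(*slots):
        candidate = " ".join(choice)
        if not looks_like_me(candidate):
            return candidate
    return None  # every rewording was still attributed to the author

# Toy detector (an assumption): flags text containing the author's habitual words.
TELLS = {"utterly", "whilst"}

def toy_detector(text):
    return any(word in TELLS for word in text.lower().split())

slots = [["I find this", "This seems"],
         ["utterly", "completely"],
         ["useless whilst unproven", "useless while unproven"]]
print(generate_and_test(slots, toy_detector))
```

Note that the number of candidates is the product of the slot sizes, so brute force like this only stays cheap while the message is short or the detector is easy to satisfy.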

Did you RTFA? (4, Informative)

argent (18001) | more than 5 years ago | (#29130839)

If the methods a stylometry analysis uses are known (and they couldn't very well be a secret to hold up in court), of course you can game them.

Their volunteer "attackers" lacked formal training in linguistics and had no access to stylometry software.

Re:Did you RTFA? (5, Insightful)

Opportunist (166417) | more than 5 years ago | (#29131055)

No, but they knew they were being analyzed and for what. It's trivial to change my style (well, maybe not in English, I don't tend to have the word pool to draw from) and become someone else. If I know in advance that my writing would be used to find me.

You can, probably, given time and persistence, sift through the thousands and millions of board messages posted everywhere on the internet and find out who I am in other boards. I didn't try to hide my identity against comparison of writing styles.

I could see this working if applied to notes and texts written by someone who didn't have any reason to assume it would become the subject of an investigation. I'd deem it utterly worthless, though, when applied to ransom notes and the like.

Re:Did you RTFA? (2, Interesting)

k.a.f. (168896) | more than 5 years ago | (#29131549)

No, but they knew they were being analyzed and for what. It's trivial to change my style (well, maybe not in English, I don't tend to have the word pool to draw from) and become someone else. If I know in advance that my writing would be used to find me.

You can, probably, given time and persistence, sift through the thousands and millions of board messages posted everywhere on the internet and find out who I am in other boards. I didn't try to hide my identity against comparison of writing styles.

I could see this working if applied to notes and texts written by someone who didn't have any reason to assume it would become the subject of an investigation. I'd deem it utterly worthless, though, when applied to ransom notes and the like.

That's what I meant, sorry: even a computer program could outwit such analyses. Given the current state of automatic language analysis (Disclaimer: IAA computational linguist), I consider it obvious that a determined person can fool the discriminators enough to appear as someone else.

Re:Did you RTFA? (1)

steelfood (895457) | more than 5 years ago | (#29133057)

Don't you know, ransom notes are compiled by pasting the individual letters cut from magazines. They do it in the movies all the time.

Re:Did you RTFA? (1)

Opportunist (166417) | more than 5 years ago | (#29134661)

Still, you cut and paste to write what you plan to express. This may already be a lead. Not to mention that your choice of newspaper is a good hint for a profiler, amongst other things, how you cut and glue the paper snippets, how you choose words...

I'd write an email.

Re:Did you RTFA? (1)

element-o.p. (939033) | more than 5 years ago | (#29135623)

well, maybe not in English, I don't tend to have the word pool to draw from

I'm impressed. Your spelling, grammar and punctuation are much, much better than a good portion of the native speakers/writers posting here :)

I could see this working if applied to notes and texts written by someone who didn't have any reason to assume it would become the subject of an investigation. I'd deem it utterly worthless, though, when applied to ransom notes and the like.

Exactly. If you have any inkling that your texts will be analyzed to determine who the actual author is, and you don't want them traced back to you, then, as TFA states, it is trivial even for amateurs to mimic other writing styles to hide the actual author's identity. Even if writing is found in your possession that looks like your style, can you prove that someone didn't mimic your style and plant the sample? TFA says, "no".

Re:Duh! (2, Funny)

bitt3n (941736) | more than 5 years ago | (#29131605)

how fast can you generate a text that (a) says what you intend and (b) does not point to you? Very fast, I'd wager.

as fast as: type it out, auto-translate it into french, auto-translate it back into: "the person who is being hated by myself is to be killed by myself by employment of the method of the bomb conflagration saving if it is the case that I am receiving the stipend of an amount that is one million of dollars. sandwich."

Re:Duh! (1)

YttriumOxide (837412) | more than 5 years ago | (#29133037)

the person who is being hated by myself is to be killed by myself by employment of the method of the bomb conflagration saving if it is the case that I am receiving the stipend of an amount that is one million of dollars. sandwich.

Oddly, if you translate that into French and back (using Google translate), you get "the person who is hated by myself is to be killed by myself by using the method of the bomb save conflagration if it is that I receive an allocation of that amount is to a million dollars. sandwich.", which is (IMHO) slightly MORE readable than your original!

Re:Duh! (1)

nacturation (646836) | more than 5 years ago | (#29136111)

how fast can you generate a text that (a) says what you intend and (b) does not point to you? Very fast, I'd wager.

as fast as: type it out, auto-translate it into french, auto-translate it back into: "the person who is being hated by myself is to be killed by myself by employment of the method of the bomb conflagration saving if it is the case that I am receiving the stipend of an amount that is one million of dollars. sandwich."

FBI Agent 1: "This guy wants a million dollars. And a sandwich."
FBI Agent 2: "Bastard must be hungry. Let's try starving him out."

P vs. NP (1)

mdmkolbe (944892) | more than 5 years ago | (#29132783)

... you can easily find it, by generate-and-test if necessary.

If you think generate-and-test is an easy way to find it, then I've got some NP-complete problems for you to solve. While you're at it, I also have some public keys I'd like you to crack.

/sarcasm

(Not that I think fooling stylometry is hard, but generate-and-test is generally not useful for anything but the smallest problems.)

No surprise (4, Interesting)

AmiMoJo (196126) | more than 5 years ago | (#29130681)

This should not really come as a surprise to anyone. Like all evidence that has to be interpreted, the interpretation can be flawed.

Shows like CSI have computers getting an exact match on fingerprints and DNA, but the real world is not like that. Fingerprint matching is entirely subjective and the print recovered from a crime scene is rarely a nice clean one like they show on TV. DNA often has to be manipulated before a match can be made (due to the sample found at the scene being too small or of poor quality) and even then it often matches more than one person.

Even when you do get a match, it's not proof that someone was at a specific place because DNA and fingerprints can easily be transferred. Someone broke into my car a few years ago and, despite there being fingerprints, the police decided not to prosecute because they were on the outside of the car and the accused could just claim he leant on it on his way home from the pub.

There have been a few cases where fingerprint and DNA evidence have been challenged in the UK courts and shown to be unreliable, with innocent people spending years in jail before being cleared. Yet, the police seem to have started asking for everyone in the area of a crime to "volunteer" their DNA. Presumably if you don't "volunteer" you become a suspect.

The idea that handwriting is any more unique than those two and at all reliable is laughable.

Re:No surprise (3, Insightful)

abigsmurf (919188) | more than 5 years ago | (#29130815)

There was a good article here (or possibly some other social news type site) about the inherent flaw in DNA databases and the weight given to DNA evidence.

The theory goes like this: the chances of getting a false positive on a partial sample are something like 1 in 50 million. You have 50 million people on the database. This means you'd expect a false positive on every search. If you're unlucky enough to live close enough to a crime to have committed it, you could easily find yourself in court.

You'll then have to defend yourself based on a 1 in 50 million probability to a jury who won't understand the statistics. If you haven't got a solid alibi, it would be a tough thing to do.
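The arithmetic behind that can be sketched in a few lines of Python. The figures are the rough ones from this comment, and treating every database profile as an independent trial is a simplifying assumption of mine:

```python
def expected_false_positives(db_size, false_positive_rate):
    """Expected number of coincidental hits in one search of the database."""
    return db_size * false_positive_rate

def prob_at_least_one(db_size, false_positive_rate):
    """Chance of one or more coincidental hits, assuming independent profiles."""
    return 1 - (1 - false_positive_rate) ** db_size

n = 50_000_000        # profiles in the database
p = 1 / 50_000_000    # chance that a given innocent profile matches
print(expected_false_positives(n, p))
print(round(prob_at_least_one(n, p), 3))
```

With these numbers you get about one expected coincidental hit per search, and roughly a 63% chance that any given search turns up at least one.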

There's probably a good Terry Pratchett quote about 1 in a million chances to be used here.

Re:No surprise (0)

Anonymous Coward | more than 5 years ago | (#29130871)

That's the whole point.... what's the probability of you being somehow connected to the crime (e.g. living near the place it was committed, knowing the victim, having a past criminal record, etc.) and you having a DNA match? That combined probability will be much lower than 1 in 50 million.... maybe 1 in 1 billion or less. Combined with corroborating evidence, DNA still has to be considered the most reliable form of evidence short of a confession (and even that's not 100% sure).

Re:No surprise (1)

TheLink (130905) | more than 5 years ago | (#29130947)

A DNA match does not establish motive and other important things.

All it shows is "it is very likely that your DNA is here".

Once you add up everything (other evidence, alibi found to be false, etc) else you might have something. But it is certainly not "short of a confession", it is nowhere even close to a confession.

It's certainly very useful to help find out who else to investigate (and who to investigate first :) ).

Same goes for the "writing style" tool. Even if it's easily fooled, it may still be a useful tool.

Even if the tools aren't perfect, the criminals aren't perfect either - they make mistakes, or they never really planned the crime in the first place.

Re:No surprise (1)

onemorechip (816444) | more than 5 years ago | (#29134077)

I place it at 1 in 6.779 billion [wikipedia.org] ...There will always be somebody, somewhere in the world, who is connected to the crime and has a DNA match to the evidence (assuming the DNA in the evidence really came from someone associated with the crime). But in most cases, there will just be one such person.

Re:No surprise (1)

MaskedSlacker (911878) | more than 5 years ago | (#29150785)

Congratulations, you know nothing about how DNA matching works.

Example:

In the OJ Simpson case, the odds of a chance match as good as OJ's were 1 in 170 million. Not even close to 1 in 6 billion. From a 1 in 170 million odds of a random match that means that the median number of random matches in a population of 6 billion on the planet would be 51 random matches, with standard deviation of seven.

In the case of Nicole's blood on the sock there was a 1 in 21 billion chance of a random match. In a population of 6 billion we would then expect a median of 0.41 random matches, with a standard deviation of 0.64.

In the first case the match is pretty weak (unlikelier things occur every day). In the second however, we can say that there is a 97.8% chance that no other human being on the planet matches the blood samples on the socks, and that even in the 2.2% chance that there is, the odds of them being around OJ to get their blood on the socks is vanishingly small.

If we take into account both data points at once, the numbers get even smaller.

But nowhere, in any of this, did the DNA samples uniquely identify a person. That isn't how they work (and in fact, it is completely impossible given that some people do not have a single set of DNA). What DNA samples do is establish the probability that they match a person--and when you take into account multiple samples from multiple persons (OJ's blood at the crime scene, and Nicole's blood on his socks--not multiple samples from the same person at the same place) it can be a very good identifier.

Re:No surprise (1)

onemorechip (816444) | more than 4 years ago | (#29153765)

Wow. You're the king of "whoosh"!

Re:No surprise (1)

JasterBobaMereel (1102861) | more than 5 years ago | (#29131409)

This is the problem with fingerprint evidence as opposed to DNA evidence

DNA evidence is normally matched on a small number of key points against a database of these points. The probability of a mismatch is ~1 in 50 million, so with a world population of 6.5 billion you will get mismatches, and with a US population of 300 million you will get mismatches. CODIS has 5 million entries so far, so mismatches are less likely but not impossible ....

NB if you have two samples then they can be matched exactly with total confidence, excepting identical twins, but this takes longer, is more expensive, and the required level of detail is not kept in a database ...

Fingerprint evidence is subjective and they do not give a confidence probability .. it's either a match or not!

Re:No surprise (1)

jonbryce (703250) | more than 5 years ago | (#29140447)

It can be matched exactly with total confidence if you scan the entire DNA. However, it took the human genome project a long time to do that with just one sample, so I don't think that is being done with police samples.

Re:No surprise (1)

MaskedSlacker (911878) | more than 5 years ago | (#29150813)

For one, it isn't being done with police samples, and it would be utterly stupid to do so, and not for the reason you think.

It is unlikely that any two cells selected at random from your body share exactly the same DNA. Every cell division introduces errors. Some of these errors cause the new cells to malfunction and die (or malfunction and become cancer), but many will not. A total DNA comparison would rarely, if ever, return a perfect match except by chance (either the chance of having picked the right two cells, or having randomly matched a person who the sample didn't come from).

Your DNA is not a uniform monolith throughout your body (and in fact, there are people with multiple sets of DNA in different parts of their body--not from replication errors, but from multiple fertilization). Total DNA matching would be useless.

Re:No surprise (1)

AmiMoJo (196126) | more than 5 years ago | (#29148867)

An excellent point well made.

There is also danger of a match being made on another member of your family, but you being the one somehow tied to the case (in the same city or something) and so you get arrested. Siblings have close enough DNA that such matches can apparently be made.

I question the "1 in 50 million" statistic too. It's far too simplistic, as there are different ways of collecting and matching DNA. Also, so-called experts have been wrong about this sort of thing in the past. Remember that poor woman who spent years in jail because some idiot said that there was a "1 in a million" chance of having three children all die of cot-death?

Re:No surprise (1)

Attila Dimedici (1036002) | more than 5 years ago | (#29135117)

There was a study done where fingerprint "experts" were asked to identify fingerprints from a "crime scene". Except these fingerprints were actually fingerprints that the expert had previously identified. The experts identified the fingerprints the same the second time around at a rate of less than 50% (my recollection is that only 1 in 10 of the experts gave the same identification for the fingerprints the second time around).

a common feature of correlations (3, Insightful)

Trepidity (597) | more than 5 years ago | (#29130685)

Stylometrics is essentially a correlational field: it's not that people inherently must write in unique styles that are identifiable from a few measurable features: there is no strong genetic causation for handwriting or anything like that, which would mean that a handwriting style really does truly identify an individual or narrow set of individuals. Rather, it's that, all else being equal, people in practice, do tend to write in a way that lets the stylometric features distinguish them. But, when all else isn't equal, and people are actively trying to thwart that sort of analysis, they are, unsurprisingly, able to do so in a lot of cases.

I suspect that a lot of forensic analysis runs into this problem: it takes some fact that empirically is true among the general population, but only because the general population is not actively trying to thwart you. The set of robust empirical truths about people, that hold up even when the person is aware that you're trying to use it against them and actively trying to keep you from doing so, is much smaller.

Re:a common feature of correlations (3, Insightful)

digitig (1056110) | more than 5 years ago | (#29130757)

[Sigh] Somebody else who thinks this is about handwriting. It isn't.

Re:a common feature of correlations (1)

An Onerous Coward (222037) | more than 5 years ago | (#29136483)

Why speak you this thing? Can not a man conceal those affectations of his writing mannerisms just as said man can subvert the so natural loops and lines that emanate from the hand which writes? What draws you to this belief, that the prior speaker did mean the scribbles of the hand, not the selection and arrangements of the letters and the words?

Misdirection (1)

wanax (46819) | more than 5 years ago | (#29130723)

The real issue is why we continue to ban 'criminals' when forensics are both available for testimony but often not for further examination because of deliberate overuse. We've now been shown data that even DNA evidence can be manufactured, if it's not first tested for methyl levels. And that is totally independent of physical specification. Which brings back the essential question that we've not had updated since 2000: What are we willing to expend energy for?

Cormac McCarthy Style? (2, Interesting)

hansraj (458504) | more than 5 years ago | (#29130737)

What exactly is the "Cormac McCarthy style"? The article doesn't mention it at all. I even skimmed through the paper and all it does is quote a paragraph from some work of Cormac McCarthy.

I can't figure out what his style exactly is, and I certainly would not be able to fake it as the participants were supposed to. And the participants were supposed to not be literary geniuses.

Re:Cormac McCarthy Style? (0, Flamebait)

Saint Stephen (19450) | more than 5 years ago | (#29131207)

Well, I was just as confused as you were, but it's fairly obvious that he writes in some gimmicky convoluted way that people think is cool. Did you ever try to read Michel Foucault? He would string together sentences with about 6 clauses each containing participle/subject/predicate lists of 3 or 4 items, and the whole damn sentence would take up about a page or more.

And then there is James Joyce in Finnegans Wake, and that dumb guy we all had to read in college that told the story about the retarded guy and it jumped back and forth in time, so called "stream of consciousness."

So it's fairly safe to assume that he has some other gimmick that just makes his shit hard to parse. Unless I thought maybe he prints his letters in some unusual way like ee cummings.

Who the fuck knows.

Re:Cormac McCarthy Style? (2, Informative)

SappoMan (51574) | more than 5 years ago | (#29131447)

This is the epilogue from "Blood Meridian", a novel by McCarthy:
"In the dawn there is a man progressing over the plain by means of holes, which he is making in the ground. He uses an implement with two handles and he chucks it into the hole and he enkindles the stone in the hole with his steel, hole by hole, striking the fire out of the rock, which God has put there. On the plain behind him are the wanderers in search of bones, and those who do not search. And they move haltingly in the light, like mechanisms whose movements are monitored with escapement and pallet, so that they appear restrained by a prudence or reflectiveness which has no inner reality. And they cross in their progress one by one that track of holes that runs to the rim of the visible ground and which seems less the pursuit of some continuance than the verification of a principle, a validation of sequence and causality. As if each round and perfect hole owed its existence to the one before it there on that prairie, upon which are the bones and the gatherers of bones, and those who do not gather. He strikes fire in the hole and draws out his steel. Then they all move on again."

Re:Cormac McCarthy Style? (1)

langelgjm (860756) | more than 5 years ago | (#29131387)

The only thing of his I've read is "The Road", which is a great post-apocalyptic novel. I do remember his style was a little unusual... it's been a few years, but I'm thinking sentence fragments, half-finished thoughts, etc.

Whatever it was, though, it wasn't distracting enough to prevent me from finishing the book. I'm trying to read some Margaret Atwood now, and not really enjoying it...

Re:Cormac McCarthy Style? (1)

will_die (586523) | more than 5 years ago | (#29131691)

Most of the story is written as 3rd person but has various parts written as 1st person.

Re:Cormac McCarthy Style? (1)

value_added (719364) | more than 5 years ago | (#29131711)

What exactly is the "Cormac McCarthy style"? The article doesn't mention it at all. I even skimmed through the paper and all it does is quote a paragraph from some work of Cormac McCarthy.

Admittedly, they should have included an excerpt of reference text. From a randomly selected [quarterlyc...sation.com] website because I'm too lazy to walk to my bookshelf for something newer:

In large part, The Orchard Keeper is written with the same stylistic tics that Harold Bloom would later celebrate in Blood Meridian as, to paraphrase, the most remarkable American prose accomplishment since Pynchon. Already, we see: the fresh refurbishment of nouns and adjectives as verbs; the repeated joining of two unlikely nouns to create an adjective without precedent in English; quotation-less dialogue; language that reaches toward the portent and cadence of epic (commonly referred to as "vatic"); the frequent use of proper names and highly precise, almost scientific language to describe nature; and the casual employment of archaic-sounding, uncommon words that perfectly fit the bumps and flows of their sentences.

I can't figure out what his style exactly is, and I certainly would not be able to fake it as the participants were supposed to.

Well, it sounds like you've never read him, so why would you expect to? That said, if I give you a page pulled from a book written by, say Raymond Chandler and Mark Twain, I'd wager you'd have little problem telling them apart, and similarly have little trouble "aping" their respective styles at a single sitting.

And the participants were supposed to not be literary geniuses.

You need to be a "genius" to distinguish things? The CSI television series, Tony Scott movies, Johnny Carson standup humour (as done by Jay Leno, Bill Maher, etc.), Matt Groening cartoon characters, xkcd.com comics, and Chuck Berry guitar riffs span a wide range of disciplines, but each is readily recognisable to the average person, and if not, a single exposure would suffice. Don't know whether a book like All the Pretty Horses is taught in high schools, but Faulkner certainly is, and for those students (even the C ones), his style is obvious.

Re:Cormac McCarthy Style? (2, Interesting)

Anonymous Coward | more than 5 years ago | (#29131935)

Ummm Not Fair 20 years ago the exam board just labeled that 'Bad Grammar' and failed me.

Re:Cormac McCarthy Style? (1)

Mindcontrolled (1388007) | more than 5 years ago | (#29132777)

As you have already got some pointers to the style of McCarthy, let me tell a little anecdote. Some literary scientist once tried, only half-jokingly, to come up with a measure [jhu.edu] for the "southernness" of books. After some research, he found out that the deeper the southern roots of the author, the more dead mules appear in his texts. By this metric, Cormac McCarthy is the undisputed king of the genre, with over 100 dead mules in his novel "Blood Meridian" alone. He kills 50 alone when he lets them drop over a cliff while carrying mercury for a mining operation. To give some insight into the style, let me quote:

"the animals dropping silently as martyrs, turning sedately in the empty air and exploding on the rocks below in startling bursts of blood and silver as the flasks broke open and the mercury loomed wobbling in the air in great sheets and lobes and small trembling satellites."

Re:Cormac McCarthy Style? (1)

Daniel Dvorkin (106857) | more than 5 years ago | (#29136129)

That "Dead Mule Test" essay is OMFG brilliant. Thanks for the laugh.

Re:Cormac McCarthy Style? (1)

ChefInnocent (667809) | more than 5 years ago | (#29133849)

The style of an author has many factors. Mostly it deals with word choice, sentence structure, and information flow. Does the author over use certain words? Does he/she have a large vocabulary? Are the descriptive words long or short? Are the sentences short or rambling? Are the sentences passive or active? How much detail and how long does it take for the author to convey his/her ideas?
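A few of those questions can be turned into crude numbers. This is only a sketch in Python; the three measures and the name `style_summary` are my own picks for illustration, and the sentence splitting is deliberately naive:

```python
import re

def style_summary(text):
    """Crude style measures: sentence length, word length, vocabulary size."""
    # Naive split on ., ! and ? (abbreviations will confuse it).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    return {
        "avg_sentence_len": len(words) / len(sentences) if sentences else 0.0,
        "avg_word_len": sum(len(w) for w in words) / len(words) if words else 0.0,
        "vocabulary": len({w.lower() for w in words}),
    }

print(style_summary("I came. I saw. I conquered."))
```

Running it over two writers' texts and comparing the dictionaries gives a rough first impression of how differently they write.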

In college, I took a linguistics class on stylistics. One of the best courses I ever took, and I wish I had taken it much sooner. I went from a B/C paper writer to an A every time. First, I learned to not write like a mathematician (long sentences with excruciating numbers of quantifying clauses). Then, I learned to mimic my instructors. Few instructors are willing to give themselves less than an A. I'd gather as much of their writings as possible and study the words and sentences to figure out how they wrote.

Selfevident, isn't it? (2, Interesting)

Lundse (1036754) | more than 5 years ago | (#29130745)

If you can describe something in enough detail to put it in a certain category (X writes like this), then you can also imitate that category from that same description (I will now write like this in order to seem like X).

I do not really see how you would ever expect different.

lol gay (1)

Rogerborg (306625) | more than 5 years ago | (#29130829)

no 1 can f00l teh l33t XpertZ just bi change D way D write stuff kekeke stupit fags

Misrepresents forensic linguistics (4, Insightful)

digitig (1056110) | more than 5 years ago | (#29130949)

As the article says "the study only attacked some of the less complex stylometry techniques". In fact, I'm surprised that they even considered lexical density because that varies greatly within a single author's writing. It's usually high at the beginning of a text, usually (not always) gradually falls off, jumps when they change subject, and so on. I'm not aware of its being used in forensic linguistics (although it is used in analysing texts to identify, for example, objective divisions within a text).

The sort of thing that they used in the Derek Bentley [wikipedia.org] case (which contributed to the partial posthumous pardon) was analysis of his statement, which had

  • an unusually high proportion of passive constructions
  • the use of police jargon
  • use of language that was not consistent with an educationally sub-normal 17-year-old
  • word frequencies that didn't correlate well with general spoken or written English but that did correlate very well with police reports
  • unusual precision in the expression of times
  • frequent post-positioning of "then" after the subject ("I then went..." instead of "then I went..."), again characteristic of police reports

That all pointed to the statement not being Bentley's own words, but rather being the police version of his answers to a series of police questions that had been removed from the statement. One aspect of his original trial was a statement "I did not know he was going to use the gun", which was taken as evidence that he knew his accomplice, Craig, had a gun (and the inconsistency with the denial that he knew this, later in the statement, was taken as evidence that he was lying). Since the linguistic analysis shows that this was probably a reply to a question, it seems more likely that it went something like:

Police: Did you know he was going to use the gun?
Bentley: No.

Which makes sense because he knew at the time of the interview that Craig had a gun.

Yes, of course this sort of thing can be gamed, but it wasn't credible that Bentley would have been capable of such sophisticated gaming. The important thing as far as this thread is concerned is that forensic linguistics doesn't plug in a single measure, turn a handle and come out with a yes/no answer; it uses a whole range of measures and builds up an overall picture of what probably happened.

Bad Assumptions too (1)

DingerX (847589) | more than 5 years ago | (#29131253)

Some of the techniques tested by Brennan and Greenstadt discard prepositions because they are deemed to have no information content, says Michael Oakes, a computational linguist at the University of Sunderland, UK. This filters out the words that could have helped most, he says.

"deemed to have no information content" is actually a positive feature for analysis. Vocabulary is one thing, but the little things, like prepositions, malapropisms, punctuation and favorite constructions are harder to fake. If someone consistently uses it's as a possessive and writes "for all intensive purposes", it'll be difficult for that person to suddenly start writing consistently.

cut-and-paste is always an option.

Re:Bad Assumptions too (1)

ImprovOmega (744717) | more than 5 years ago | (#29136243)

If someone consistently uses it's as a possessive and writes "for all intensive purposes", it'll be difficult for that person to suddenly start writing consistently.

Well that right there would identify 90% of Slashdot, Fark, and Digg users as being the same author.

Re:Misrepresents forensic linguistics (1)

Half-pint HAL (718102) | more than 5 years ago | (#29132217)

In fact, I'm surprised that they even considered lexical density because that varies greatly within a single author's writing.

Did they? Yes, the article mentions lexical density, but it then* goes on to describe token/type ratio, which is a different beast entirely.

The problem with FAs is that they're anything but a primary source....

HAL.

* Am I hiding my writing style here or adhering to it...?

Re:Misrepresents forensic linguistics (1)

Half-pint HAL (718102) | more than 5 years ago | (#29132363)

Oops. Type/token, of course.
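For anyone unsure how the two measures differ, a minimal sketch in Python. The function-word list is a tiny stand-in of my own, and the exact definitions used in the paper may differ:

```python
import re

# Tiny illustrative stop list (an assumption); real work would use a proper one.
FUNCTION_WORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "or",
                  "but", "is", "was", "it", "that", "this", "with", "for"}

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def type_token_ratio(text):
    """Distinct word forms divided by total words (vocabulary variety)."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def lexical_density(text):
    """Share of words that are not function words (information load)."""
    tokens = tokenize(text)
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return len(content) / len(tokens) if tokens else 0.0

sample = "the dog saw the cat and the cat saw the dog"
print(type_token_ratio(sample), lexical_density(sample))
```

On the sample, the first counts 5 distinct forms in 11 words, while the second counts 6 content words in 11, so the two can move independently of each other.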

Bad news for Cormac McCarthy... (0)

Anonymous Coward | more than 5 years ago | (#29131101)

Now every time some anonymous terrorist publishes his manifesto Cormac McCarthy gets a visit from police.

Talk Like Yoda, I Will (0)

Anonymous Coward | more than 5 years ago | (#29131283)

When your child's ransom note, I write.

Re:Talk Like Yoda, I Will (1)

element-o.p. (939033) | more than 5 years ago | (#29136209)

Police: Hello, Mr. Taco? I have a search warrant for all of the posts on your website by a Mr. Anonymous Coward. We believe Mr. Coward may be responsible for the ransom note using a peculiar grammar that is characteristic of the character "Yoda" from the Star Wars movies, and we would like to verify this claim by analyzing all of Mr. Coward's posts on your site.

CmdrTaco: Right......

Bad news for Mr. McCarthy (0)

Anonymous Coward | more than 5 years ago | (#29131329)

Now every time some anonymous terrorist publishes his manifesto Cormac McCarthy gets a visit from the police.

When writing samples abound (1)

HikingStick (878216) | more than 5 years ago | (#29131451)

The fact that one person may write in the style of another is nothing new. While the use of such writing-style analysis may still have a valid use in some cases, it is clear that it, like any other forensic tool (even DNA analysis) can be beaten.

Prior to contemporary times, I believe the number of people who would have had access to enough writing samples (of persons other than authors, columnists, and other published figures) to successfully mimic another's style would have been limited to family members, friends, and confidants. Today, with broad use of blogs and social networking sites, many more people are exposed to those writers' styles. In workplaces, a small subset of individuals are often called upon to produce a majority of documentation. While such writing samples may well vary from the style a person may use in personal correspondence, each will share characteristics that, together, are unique to the writer's preferred style.

I've been one who has been called on to create much documentation by my employers. Bits of my memos, procedures, and instructional materials would get lifted and used by other departments (my departmental IT writings were most often copied and used by our central IT department). When they did so, other employees would approach me and ask if I wrote the memo, web page, or training manual, because they could recognize my writing style.

Even here on Slashdot, there are those posters who have been around for a long time, who have posted often, and who have identifiable writing styles. Anyone with enough familiarity, if given the opportunity to post under such another's screen name, would likely be able to post something that would seem to be the words of another. Of course, if the context of the message were a great departure from the copied target's known views, the post would be suspect. If it were in line with such views, it might not raise suspicion.

If anything, the number of available sources of writing samples today increases the likelihood that someone else could learn and mimic a writing style. Of course, this could be a handy defense so long as the accused has produced loads of writing that is accessible to others.

as if law enforcement cares (2)

BigHungryJoe (737554) | more than 5 years ago | (#29131457)

"'We would strongly suggest that courts examine their methods of stylometry against the possibility of adversarial attacks,' say the researchers."

Of course, this assumes that law enforcement actually cares about the guilt or innocence of the people they convict. They don't. They only care about putting as many people in prison as they can.

Re:as if law enforcement cares (2, Informative)

TimSSG (1068536) | more than 5 years ago | (#29131985)

They only care about putting as many people in prison as they can.

Wrong!

The Basic Metric used on the police is case closed.
In other words, it is easy to say a dead person committed a crime; because it closes a case.

Metrics have very bad sides.

Tim S.

All evidence is tentative (2, Insightful)

JoshuaZ (1134087) | more than 5 years ago | (#29131971)

So handwriting analysis has problems. Another recent Slashdot article was about how DNA evidence might be falsifiable. And we all know that eye-witnesses have serious problems. We don't, however, reject any of these. Why not? Because we don't care about single pieces of evidence but rather about bodies of evidence. It is the collective narrative which matters. It might be possible for one or two types of evidence to be wrong or falsified. But it is extremely difficult to falsify four or five. The real problem is when overzealous prosecutors try to portray something like handwriting analysis as a CSI-style magic bullet. This is, moreover, being balanced by a problem in the opposite direction, with juries increasingly wanting all sorts of technical evidence to convict even when it would be unnecessary, prohibitively expensive or, in some cases, a form of evidence that really only exists in fiction.

read for comprehension (1)

fuzzylollipop (851039) | more than 5 years ago | (#29133351)

this article is NOT about handwriting analysis, it is about style analysis ( stylometrics ) which is NOT about handwriting at all!

Cat-and-mouse game, with end (1)

SlideRuleGuy (987445) | more than 5 years ago | (#29132445)

This has been known for quite a while, with one pair of researchers speculating about how much time the average person would need, with a tool's assistance, to sufficiently hide their identity. A change of 14 words out of 1000 was sufficient to hide their identity pretty well against word-level attacks.

http://research.microsoft.com/apps/pubs/default.aspx?id=69343 [microsoft.com]

It is something of a cat-and-mouse game, for as stylometric analyses become more sophisticated, so will the techniques of obfuscation. However, the more of one's personal style is "blurred", the more likely it is that other as-yet-undetected patterns will get swept along in the document alterations. In the end, obfuscation must win.

read for comprehension (1)

fuzzylollipop (851039) | more than 5 years ago | (#29133433)

this article is NOT about handwriting in any way, it is about writing style, which is the style you write in. Nowhere does it mention anything about handwriting!

Book about this (1)

British (51765) | more than 5 years ago | (#29136627)

I read the book "Author Unknown" which talked about this for the forensic side. It was only an okay book. Here's how I would do it.
1. Inconsistently spell things wrong. Misspell a word one way, then down a few paragraphs, misspell it another way.
2. Type in all caps. All capitalization errors you might normally make go away.
3. Don't use your regional sayings for things. Use some other region's, or use all of them.
4. Run it back & forth with translation services to really obfuscate it.

easy peasy.

Surprised nobody linked to one... (1)

jbrazile (1622211) | more than 5 years ago | (#29144685)

The "Gender Genie" works surprisingly well. I even tried it on a female blogging about math and a male blogging about gay-bashing...

The Gender Genie [bookblog.net]

Gender Genie: pah! (1)

meowhous (1592411) | more than 4 years ago | (#29152847)

Well, it got me wrong.