Search Engines for Handwritten Documents

michael posted more than 9 years ago | from the lost-art dept.

Software 172

An anonymous reader writes "Researchers at the University of Massachusetts have created a tool for automatically searching handwritten historical documents, such as the 140,000 pages that make up George Washington's personal papers in the Library of Congress. The most interesting part is that the papers are scanned versions of the originals and the search tool actually recognizes the handwritten text from these images."

172 comments

Who still reads those? (5, Funny)

Anonymous Coward | more than 9 years ago | (#10992151)

In America, handwriting is only for old people.

Handwriting sucks (4, Interesting)

October_30th (531777) | more than 9 years ago | (#10992259)

You were modded as funny, but I fully agree with you.

I hate reading/producing anything longer than a post-it note that's in handwriting.

Re:Handwriting sucks (5, Insightful)

metlin (258108) | more than 9 years ago | (#10992808)

You're apparently not into the pure sciences like math or physics.

I'd hate to have to type in my equations; there's a feel to working things out with pen and paper. Besides, the tactile sensation of writing on paper is simply wonderful. No amount of typing can replace that.

Nothing beats a good old fountain pen and writing on good paper =)

Re:Who still reads those? (5, Interesting)

gcaseye6677 (694805) | more than 9 years ago | (#10992266)

Cursive writing certainly is. I can barely even read it anymore, much less write it. Does anybody else who is under 30 still write in cursive, other than when they made you do it in elementary school?

Sure. Lots of artists/commercial artists do. (0, Redundant)

aristus (779174) | more than 9 years ago | (#10992314)

With ink pots and nibs and everything. But, like fish-tickling and lice-picking, it's a dying art.

Re:Who still reads those? (4, Funny)

Sheepdot (211478) | more than 9 years ago | (#10992374)

I write out my checks in cursive. The other day I was admiring how pretty my cursive looked and how well it had developed from when I was in second grade and told to "TRY HARDER WEAKLING OR YOU WILL NEVER GET A JOB!". Then I realized just how ghey it was that I was enjoying the sight of it and hurriedly gave it to the cashier... who was a guy... who (ick) winked at me.

Re:Who still reads those? (-1, Flamebait)

Anonymous Coward | more than 9 years ago | (#10992491)

He could tell you were gay from your pretty cursive. When are you going to find out, too?

Re:Who still reads those? (0)

Anonymous Coward | more than 9 years ago | (#10992758)

Then I realized just how ghey it was

Yes, but when will you realise how "ghey" it is to use the word "ghey"?

Re:Who still reads those? (3, Insightful)

realdpk (116490) | more than 9 years ago | (#10992387)

I wish they'd never taught cursive. Cursive destroyed my handwriting. At least, that's my current theory on why my handwriting sucks. :)

Re:Who still reads those? (2, Interesting)

smacktits (737334) | more than 9 years ago | (#10992412)

I'm 23 and I write in perfect cursive. In fact, I prefer it to typing. Maybe I like it because I suffered a serious injury to my hand when I was 12 that necessitated my learning to use it again from scratch.. I dunno. I just like to write, it relaxes me.

Re:Who still reads those? (1)

Feynman (170746) | more than 9 years ago | (#10992416)

Does anybody else who is under 30 still write in cursive, other than when they made you do it in elementary school?

When I was in sixth grade [k12.ia.us] , my teachers all got together and decided to ban me from writing cursive (D'Nealian [geocities.com] , to be exact). I've never looked back.

(Of course, I just turned 30.)

Re:Who still reads those? (1)

AceCaseOR (594637) | more than 9 years ago | (#10992445)

I still write cursive occasionally, mainly in personal notes. If I'm writing something that I need somebody else to be able to read, I definitely print instead of using cursive.

Re:Who still reads those? (4, Interesting)

jgardn (539054) | more than 9 years ago | (#10992537)

Yes, and I use it to record notes in my lab book I use at work. I record all sorts of things I discover there. Some entries are several pages long with charts and graphs and tables and diagrams. Try doing that in a few minutes in Word or OpenOffice.

The best part is I don't have to worry about backing up my lab books. The only real threat is fire, and it is no more dangerous than it is to CDs or hard drives.

While the cursive handwriting of the 1700's and early 1800's may seem curious to us (notably, the tall 's' that looks like an 'f'), it is a very easy style that is neat, legible, and painless. Notice how there are very few back strokes.

For those who are wondering, cursive is what you use when you get sick of trying to write in print legibly and quickly without getting carpal tunnel. Every culture has it. It's unfortunate it isn't common knowledge anymore in the US. Handwriting is a wonderful skill. It used to be people would judge others based on their handwriting skills in addition to their oratory.

Re:Who still reads those? (1)

Lenale (792831) | more than 9 years ago | (#10992599)

I write letters and such in cursive, but it's too slow for course notes... read: becomes a tangled mass of lines and ink blots.

I do write my lab journals in cursive, and three colours of pen... according to one of my classmates I'm not human. :)

Re:Who still reads those? (1)

jacksonj04 (800021) | more than 9 years ago | (#10992800)

I'm learning shorthand just to get notes down easily, it's well worth it if you plan on doing a lot of note taking.

Yes, I do write in cursive (admittedly on my palmtop, so it then just transcribes it).

Re:Who still reads those? (1)

xgamer04 (248962) | more than 9 years ago | (#10992659)

I had a friend in high school who always wrote in cursive, and this was...a year ago, so I'm pretty sure he's still under 30. I think that he was the only one in the whole school who still did, though.

Re:Who still reads those? (0)

Anonymous Coward | more than 9 years ago | (#10992791)

Yeah, cursive sucks. Just a couple of weeks ago I heard a story on NPR about how elementary school teachers go about teaching cursive. Of course the teachers tell the kids it's a necessary life skill. Bullshit, being able to write is a necessary skill, being able to write in cursive is a bonus. If cursive's so great, why aren't books printed using it?

I switched back to print style characters with added personal embellishments years ago. Print characters are orders of magnitude easier to read than cursive no matter how good the writing is.

Re:Who still reads those? (1)

wintermute1000 (731750) | more than 9 years ago | (#10992815)

I do. And I do it well, and I'm proud of it. Of course, I'm in the dying breed that considers the ability to write legibly by hand a part of fluency in one's language. Maybe I should just give in and go back to third grade where I belong.

Re:Who still reads those? (1)

Sheepdot (211478) | more than 9 years ago | (#10992316)

... and second-graders.

To Faith, Family, And Values Coalition: +1 (-1, Troll)

Anonymous Coward | more than 9 years ago | (#10992403)

To borrow a phrase from Dennis Hopper:

Fuck you, you fucking fucks.

Seditiously as always,
Kilgore Trout, CTO

Re:Who still *writes* those? Well, after college? (0, Flamebait)

Tackhead (54550) | more than 9 years ago | (#10992477)

> In America, handwriting is only for old people.

And college students during exam season. (Can't speak for the Koreans.)

Blue-stained hands-up, all those who remember those glorious essay exams from the mandatory humanities courses, where your grade ceases to be based on the merits of your ideas (and/or your ability to parrot your professor's ideas), but is solely a function of how well-developed the muscles in your right hand are, in order to keep scribbling for the entire three hours what would have taken you 90 minutes to type.

Of course, even in the dark days before I discovered Slashdot, my CS education had proven to be more than ample preparation for the worst that any Philosophy, History, or (worst of all) English prof could throw at me. *rimshot*

So, when does henscratch.google.com (searchable handwritten blogs) come out?

Umm (5, Insightful)

swtaarrs (640506) | more than 9 years ago | (#10992155)

The most interesting part is that the papers are scanned versions of the originals and the search tool actually recognizes the handwritten text from these images.

How else would it search handwritten documents? Am I missing something here?

Re:Umm (2, Funny)

KillerDeathRobot (818062) | more than 9 years ago | (#10992569)

Yeah, it would have been much more "interesting" if the papers were, I don't know, read psychically by the computer or something.

Re:Umm (2, Funny)

ZagNuts (789429) | more than 9 years ago | (#10992727)

How else would it search handwritten documents? Am I missing something here?

You write down exactly what you want to find in exactly the same handwriting that the document is written in and then it blocks scans it for what you wrote... duh.

Re:Umm (1)

lawpoop (604919) | more than 9 years ago | (#10992845)

It might search for certain kinds of penstrokes or something like that. You could input a vector map and it would find similar vectors. Or even bitmaps I guess.
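To make the bitmap idea concrete, here is a minimal sketch in Python. It assumes the word images have already been cut out of the page, binarized (1 = ink, 0 = paper), and scaled to the same size; the function names and the 0.8 threshold are made up for illustration and are not taken from the UMass tool.

```python
def bitmap_similarity(a, b):
    """Fraction of pixels on which two equal-sized binary bitmaps agree."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("bitmaps must have the same dimensions")
    total = sum(len(row) for row in a)
    if total == 0:
        raise ValueError("bitmaps must not be empty")
    same = sum(pa == pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return same / total


def find_similar(query, candidates, threshold=0.8):
    """Return (name, score) pairs scoring at least `threshold`, best first.

    `candidates` maps a hypothetical word-image name to its bitmap.
    """
    scored = [(name, bitmap_similarity(query, bmp)) for name, bmp in candidates.items()]
    hits = [(name, score) for name, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

Plain pixel agreement is brittle against slant, stroke width, and spacing, which is presumably why real systems compare more forgiving features than raw bitmaps.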

gnaa 4 life (-1, Troll)

Anonymous Coward | more than 9 years ago | (#10992159)

fp niggas

Doc (3, Funny)

savagedome (742194) | more than 9 years ago | (#10992161)

Huh? Well, let's see how well it keeps up with my doctor's handwriting...

The search tool? (-1, Redundant)

Andorion (526481) | more than 9 years ago | (#10992164)

Somehow I doubt the search tool recognizes the handwritten text... rather, the text has gone through OCR and is stored as a text-only version somewhere, which is what's actually searched through. The OCR isn't done on the fly each time you search.

Re:The search tool? (2, Interesting)

Skippy_the_Evil_Twin (453297) | more than 9 years ago | (#10992192)

No OCR is performed on the documents. The search tool operates on the image.

Re:The search tool? (0)

Anonymous Coward | more than 9 years ago | (#10992291)

>No OCR is performed on the documents.

Yeah but, um, why?

Re:The search tool? (1)

AHumbleOpinion (546848) | more than 9 years ago | (#10992524)

No OCR is performed on the documents. The search tool operates on the image

The search tool is doing the OCR then. OCR is simply taking an image and analyzing it to recognize text.

Re:The search tool? (1)

tonsofpcs (687961) | more than 9 years ago | (#10992612)

No, OCR stands for Optical Character Recognition. This is Digital Character Recognition on an Optically Acquired Digital Image. Don't you see the difference?

Re:The search tool? (1)

rjelks (635588) | more than 9 years ago | (#10992219)

RTFA :) It actually looks pretty cool, the software is looking through the actual handwritten pages.

Re:The search tool? (1)

ShadeARG (306487) | more than 9 years ago | (#10992260)

From TFA:
Manmatha says, "Right now, searching a scanned handwritten document is very hard to do. Scanned historical documents are basically images, or pictures, and currently can only be searched if someone manually transcribes the documents or creates an index of their contents. This is time consuming and expensive to do. Given the cost, most handwritten documents are never transcribed or indexed," Manmatha says. "But there is an enormous amount of handwritten, historical material."
I forgot, this is Slashdot after all.

Re:The search tool? (2, Funny)

TykeClone (668449) | more than 9 years ago | (#10992404)

We could use it as a jobs program for monks. Their predecessors wrote the manuscripts, and now they could transcribe them into digital form...

slashdot slashdotted (0)

Anonymous Coward | more than 9 years ago | (#10992166)

slashdot page down after 0 posts!

OPEN LETTER TO SLASHDOT MODERATORS AND MICHAEL (1)

arothstein (233805) | more than 9 years ago | (#10992176)

SHOVE IT UP YOUR ASSES, MY LITTLE BITCHES. K? THX. XXOO

PLEASE DON'T MOD ME DOWN.

*_g_o_a_t_s_e_x_*_g_o_a_t_s_e_x_*_g_o_a_t_s_e_x_*_
g_______________________________________________g_ _
o_/_____\_____________\____________/____\_______o_ _
a|_______|_____________\__________|______|______a_ _
t|_______`._____________|_________|_______:_____t_ _
s`________|_____________|________\|_______|_____s_ _
e_\_______|_/_______/__\\\___--___\\_______:____e_ _
x__\______\/____--~~__________~--__|_\_____|____x_ _
*___\______\_-~____________________~-_\____|____*_ _
g____\______\_________.--------.______\|___|____g_ _
o______\_____\______//_________(_(__>__\___|____o_ _
a_______\___.__C____)_________(_(____>__|__/____a_ _
t_______/\_|___C_____)/_TOSS_\_(_____>__|_/_____t_ _
s______/_/\|___C_____)___MY___|__(___>___/__\____s _ _
e_____|___(____C_____)\SALAD!/__//__/_/_____\___e_ _
x_____|____\__|_____\\_________//_(__/_______|__x_ _
*____|_\____\____)___`----___--'_____________|__*_ _
g____|__\______________\_______/____________/_|_g_ _
o___|______________/____|_____|__\____________|_o_ _
a___|_____________|____/_______\__\___________|_a_ _
t___|__________/_/____|_________|__\___________|t_ _
s___|_________/_/______\__/\___/____|__________|s_ _
e__|_________/_/________|____|_______|_________|e_ _
x__|__________|_________|____|_______|_________|x_ _
*_g_o_a_t_s_e_x_*_g_o_a_t_s_e_x_*_g_o_a_t_s_e_x_*_


Important Stuff: Please try to keep posts on topic. Try to reply to other people's comments instead of starting new threads. Read other people's messages before posting your own to avoid simply duplicating what has already been said. Use a clear subject that describes what your message is about. Offtopic, Inflammatory, Inappropriate, Illegal, or Offensive comments might be moderated. (You can read everything, even moderated posts, by adjusting your threshold on the User Preferences Page) If you want replies to your comments sent to you, consider logging in or creating an account.

Re:OPEN LETTER TO SLASHDOT MODERATORS AND MICHAEL (-1, Offtopic)

Anonymous Coward | more than 9 years ago | (#10992220)

What the hell was that? Spam has hit a new low.

Re:OPEN LETTER TO SLASHDOT MODERATORS AND MICHAEL (0)

Anonymous Coward | more than 9 years ago | (#10992380)

You are new here. (PS, your post is spam too).

This is so cool! (2, Funny)

raehl (609729) | more than 9 years ago | (#10992179)

Somebody invented a way for computers to recognize handwriting.

Like, so 10 years ago.

Re:This is so cool! (0)

Anonymous Coward | more than 9 years ago | (#10992369)

Mod the parent up.

Raehl is exactly right. It's just OCR. Whether it transcribes the images on-the-fly with a neural net, or has a preconstructed text version (probably also done with a neural net, bayes network, etc.), it's still just OCR.

Re:This is so cool! (0)

Anonymous Coward | more than 9 years ago | (#10992382)

Or instead of modding the parent up, you could read the article. 8)
They aren't doing OCR.

They are doing OCR (1)

AHumbleOpinion (546848) | more than 9 years ago | (#10992470)

They aren't doing OCR

Yes, they are. They are not using an off-the-shelf OCR package. The OCR functionality is embedded into their software, it is highly specialized, but it is OCR. For those who are fixated on the letter 'C', recognizing multiple characters as a single unit is nothing new.

Re:This is so cool! (1)

networkBoy (774728) | more than 9 years ago | (#10992472)

No....
10 years ago someone invented a (hand)writing style that computers could recognize, à la Graffiti on the Palm.
-nB

More like twenty years ago ;-) (3, Interesting)

AHumbleOpinion (546848) | more than 9 years ago | (#10992597)

Somebody invented a way for computers to recognize handwriting. Like, so 10 years ago.

I worked on an OCR system about 20 years ago. No pre-defined bitmaps of text, you trained the system on the font to be recognized. After a few hours you could turn it loose and it did fairly well. While goofing off we tried handwritten text. With good penmanship it worked to a degree.

Hard to read! (2, Interesting)

DeionXxX (261398) | more than 9 years ago | (#10992202)

Wow, looking at some of those examples, I was amazed by the fact that I couldn't READ most of the words. It looks completely foreing to me, might as well be trying to read Japanese.

Re:Hard to read! (3, Funny)

kfg (145172) | more than 9 years ago | (#10992275)

It looks completely foreing to me. . .

That's because it's written in a dead language.

English.

KFG

English is dead? (0)

Anonymous Coward | more than 9 years ago | (#10992452)

Long live inglish!!!!!!!!1

Great, but... (0)

Anonymous Coward | more than 9 years ago | (#10992216)

...will this tool be open source, or at least free to use?

Accuracy? (2, Interesting)

b0lt (729408) | more than 9 years ago | (#10992217)

How good is the accuracy? The OCR technology of today might not be able to recognize the "flowery" text of most historical documents (look at "We the People" in the Constitution)

Re:Accuracy? (1)

LiquidCoooled (634315) | more than 9 years ago | (#10992488)

I think consistency matters more than individual letter formation.

I could write entirely in scribbled hieroglyphs, but if it has a pattern, and the same squiggle means the same thing, then a computer could decipher it.

Re:Accuracy? (1)

hords (619030) | more than 9 years ago | (#10992654)

I agree, my grandmother was heavy into genealogy. She had hundreds of pages of neatly hand written, non-cursive documents. I tried to scan them with many different OCR programs, but none even came close to deciphering the text without skewing it badly. I tried ABBYY, Omnipage Pro 14, and a few others. Anyone have any successes with this kind of thing?

A waste? (5, Insightful)

Anonymous Coward | more than 9 years ago | (#10992230)

These documents are old and handwritten. Why waste the processing power decyphering results for each search when you can decypher the text once with a similar algorithm and search an index built that way? It's not like the information is ever going to change. (unless we do rewrite history)

Re:A waste? (0)

Anonymous Coward | more than 9 years ago | (#10992297)

In America, only old documents are decyphered.

Re:A waste? (2, Interesting)

42forty-two42 (532340) | more than 9 years ago | (#10992327)

Um, that's almost certainly what they did. Running OCR over 140,000 pages every time you do a search is nearly impossible. I only say nearly because, in theory, you can do it, but then searches would take a few days to complete for zero net gain.

Re:A waste? (1)

spud603 (832173) | more than 9 years ago | (#10992407)

These documents are old and handwritten. Why waste the processing power decyphering results for each search when you can decypher the text once with a similar algorithm and search an index built that way? It's not like the information is ever going to change. (unless we do rewrite history)

Context, context, context! If there's one thing I've learned in all of my schooling (and there is a lot), it is that how the information is portrayed is just as important as the information itself. Think about hearing vs. reading a speech, or reading a document printed on a dot-matrix instead of a laser printer. Yes it matters. Does it matter enough to use the many more resources necessary in this case? That's another issue...

Re:A waste? (1)

GigsVT (208848) | more than 9 years ago | (#10992544)

I don't get it.. he's advocating building an index. That would point to the image of the original document. Which is what they already did.
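For anyone wondering what "recognize once, then search an index that points back at the page image" looks like, here is a rough Python sketch. It assumes some recognizer has already produced text for each page (however accurate); the page ids and function names are hypothetical, and nothing here is claimed to be how the UMass system stores its data.

```python
from collections import defaultdict


def build_index(transcribed_pages):
    """Map each lowercased word to the sorted list of page ids containing it.

    `transcribed_pages` maps a page id (e.g. an image filename) to whatever
    text the one-time recognition pass produced for that page.
    """
    index = defaultdict(set)
    for page_id, text in transcribed_pages.items():
        for raw in text.lower().split():
            word = raw.strip(".,;:!?\"'()")  # drop surrounding punctuation
            if word:
                index[word].add(page_id)
    return {word: sorted(pages) for word, pages in index.items()}


def search(index, query):
    """Look up one query word; returns the page ids to display as images."""
    return index.get(query.lower(), [])
```

The expensive recognition step runs once per page; each search is then a single dictionary lookup, and the result is shown as the original scanned image, so the handwriting itself is never thrown away.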

This is nothing new (2, Informative)

42forty-two42 (532340) | more than 9 years ago | (#10992234)

Google already did it! [google.com] Well, it's not handwritten, but that's just a logical progression.

Re:This is nothing new (2, Funny)

js7a (579872) | more than 9 years ago | (#10992389)

Vannevar Bush [uni-sb.de] described it before anyone could do it. Actually maybe Babbage and Lovelace, Asimov, and/or probably someone like Jay Williams [amazon.com] did a better job.

Not handwritten... (0)

Anonymous Coward | more than 9 years ago | (#10992422)

So they didn't already do it, then?

Re:This is nothing new (1)

cuteseal (794590) | more than 9 years ago | (#10992732)

Now... if they only had a search engine for My socks....

In related news... (1)

Sheepdot (211478) | more than 9 years ago | (#10992245)

such as the 140,000 [handwritten] pages that make up George Washington's personal papers in the Library of Congress.

In related news, the family of Tobias Lear, George Washington's personal secretary [64.233.167.104] , who took his own life [64.233.167.104] (arguably due to the horrible pain in his wrists), has filed suit.

OCR, anyone? (0)

Anonymous Coward | more than 9 years ago | (#10992263)

Why not use OCR? It's not like there aren't techniques for dealing with the OCR errors, such as language models for error correction, n-gram document retrieval and relevance feedback.

Also OCR systems are trainable to learn handwriting styles, even on a per-document basis. But I guess it's a cool hack.
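As a small illustration of the n-gram retrieval idea mentioned above, the Python sketch below matches a query word against noisy OCR output by comparing character trigrams, so a word survives a misread letter or two. The function names, the padding character, and the 0.5 threshold are assumptions chosen for the example, not part of any particular OCR package.

```python
def char_ngrams(word, n=3):
    """Set of character n-grams of a word, padded so word edges count too."""
    pad = "_" * (n - 1)
    padded = f"{pad}{word.lower()}{pad}"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def ngram_score(query, candidate, n=3):
    """Dice coefficient between the n-gram sets of two words (0.0 to 1.0)."""
    q, c = char_ngrams(query, n), char_ngrams(candidate, n)
    return 2 * len(q & c) / (len(q) + len(c))


def fuzzy_lookup(query, vocabulary, threshold=0.5):
    """Return (word, score) pairs from the OCR vocabulary, best match first."""
    scored = [(word, ngram_score(query, word)) for word in vocabulary]
    hits = [(word, score) for word, score in scored if score >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)
```

With this, a query for "governor" still finds a page where the recognizer read "govemor" (the classic "rn" read as "m" error), while an unrelated word such as "regiment" scores zero.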

!WOW (1, Interesting)

Anonymous Coward | more than 9 years ago | (#10992267)

... eh eh !gniddik tsuJ. !skoobeton inciV ad eht no esool ti teL.

Re:!WOW (0)

Anonymous Coward | more than 9 years ago | (#10992294)

Tsk, tsk. You didn't mirror reverse your font...

Re:!WOW (0)

Anonymous Coward | more than 9 years ago | (#10992339)

--Tsk, tsk. You didn't mirror reverse your font...-- !sdnoces 03 dah ylno I ,wonk I

Useful for more than just historians (4, Interesting)

Thunderstruck (210399) | more than 9 years ago | (#10992289)

I took a lot of notes in College. I took a lot more notes in graduate school. I've even taken notes on books I've read for the fun of it. If I could run all of these through my scanner & search them from an application on my desktop, I could be really obnoxious in an argument.

Re:Useful for more than just historians (1)

terraformer (617565) | more than 9 years ago | (#10992555)

If I could run all of [my notes] through my scanner & search them from an application on my desktop, I could be really obnoxious in an argument.

This is slashdot. You would definitely be obnoxious if you argued a point with actual facts behind you...

Re:Useful for more than just historians (1)

Dubber (101609) | more than 9 years ago | (#10992615)

If I could run all of these through my scanner & search them from an application on my desktop, I could be really obnoxious in an argument.


Hell, *I* don't need all that much processing power to be obnoxious in an argument. Oh, wait...

Uh Oh (1, Funny)

griffitts (739673) | more than 9 years ago | (#10992307)

The article points out that the handwriting reader is a Newton.

EAT UP MARTHA (-1, Troll)

Anonymous Coward | more than 9 years ago | (#10992323)

wHat parcel co ong whit tha it?

Re:EAT UP MARTHA (0)

Anonymous Coward | more than 9 years ago | (#10992631)

Wow, a moderator who has never seen the Simpsons, next they'll have dating Slashdot posters and editors who spell check posts.

Yes, but what they don't tell you... (3, Funny)

aristus (779174) | more than 9 years ago | (#10992344)

You have to be able to handle a quill pen to use it.

Interesting, but limited (3, Interesting)

InternationalCow (681980) | more than 9 years ago | (#10992347)

It's an interesting approach that should be extended to other languages than English. Most of the world's history is not about the US and it has certainly not been written down in English. What I would really like to have is a similar tool that can search, say, Greek, or Latin, (or whatever) handwritten text. Imagine being able to query Ovid for an item of interest without having to consult everything he's written. I can imagine that this might encourage people to study the classics (a pet peeve of mine is that many people lack historical sense...) and it would certainly facilitate research in this area.
If you can put the queries in English, with the search engine taking care of translation, it would be even better. Then, extended historical study comes within everyone's reach and the classical studies (or humaniora) might be transformed.

Re:Interesting, but limited (0)

Anonymous Coward | more than 9 years ago | (#10992433)

Good point. Why would an english speaking institute in an english speaking country care about recognizing english handwriting?

Re:Interesting, but limited (0)

Anonymous Coward | more than 9 years ago | (#10992831)

Have you seen the Perseus Project? It's web-based, free, and lets you search through many of the Greek & Latin classics (in original or in translation):
http://www.perseus.tufts.edu/ [tufts.edu]

Good Work! (4, Funny)

CaptainCarrot (84625) | more than 9 years ago | (#10992354)

How pleafant that they've done what waf neceffary to make this happen. How did they train the foftware to recognize the quirky 18th Century handwriting?

Re:Good Work! (4, Funny)

That's Unpossible! (722232) | more than 9 years ago | (#10992685)

How pleafant that they've done what waf neceffary to make this happen.

Personally, I think it fucks.

Standards at risk (0)

Anonymous Coward | more than 9 years ago | (#10992385)

This kind of activity is putting our standard measurement systems at risk. For decades, it has been universally agreed that the most fundamental unit of information capacity in computer science, the "Library of Congress", has been measured in terms of ASCII bytes of text.

Now, if the Library of Congress starts instead storing its data willy-nilly in random image formats, possibly with unpredictable compression algorithms, we are truly on a slippery slope. We risk losing altogether any meaningful standard for what it really means to have a LOC's worth of information. Is it the ASCII text version? Is it the scanned image file? Is it the sum of both? The numbers vary wildly based on the arbitrary choices about which data we include in the LOC. What's worse, there is no single right or wrong answer to this subjective data classification question, so we will never have agreement on this most fundamental of issues.

Clearly, the risks presented by this new untested technology experiment outweigh any possible benefits for the few people who might be interested in these obsolete documents. Consistency must be preserved. Boycott this search system!

Re:Standards at risk (1)

tonsofpcs (687961) | more than 9 years ago | (#10992593)

it has been universally agreed that the most fundamental unit of information capacity in computer science, the "Library of Congress"
Really? Whatever happened to the bit???????

Doesn't work (1)

badmammajamma (171260) | more than 9 years ago | (#10992413)

Their handwriting recognition system doesn't work for shit. It couldn't even correctly retrieve results from words that I know are in its scanned letters. The word "governor" appears as a result from one of their suggested queries (*cough* hard coded results *cough*), but if you do a separate search for governor it returns stuff that doesn't even contain the word.

It's not OCR (4, Funny)

Anonymous Cowdog (154277) | more than 9 years ago | (#10992462)

It's "Pixelative Text Cognizance."

It's different. With OCR these rays of light scan the original, translate each scanpoint to discrete RGB values, and do pattern recognition.

With this system, they just read the discrete RGB values directly from pixels of documents scanned in with rays of light, then they do recognition of patterns. See, it's totally different.

Re:It's not OCR (2, Funny)

GigsVT (208848) | more than 9 years ago | (#10992563)

I'm not sure what's more funny, your post, or that it was modded "informative". :)

Re:It's not OCR (1)

Anonymous Cowdog (154277) | more than 9 years ago | (#10992600)

Heh, yeah, I was just noticing that. Wow! Tears in my eyes!

Re:It's not OCR (2, Insightful)

imsabbel (611519) | more than 9 years ago | (#10992604)

Er.... Am I seriously missing something here, or was some mod just fooled by a troll?

Let's examine your definitions:
OCR: document->RGB(via light)->pixels->pattern recognition
PTC: document->pixels(via light)->RGB->pattern recognition.
Of course, you forget that there are no RGB values here, because it's black/white, so there is only a brightness value per pixel left. So what is the difference?

Sounds really AWFULLY different...

Maybe it's just your description that is lacking...

They should do an image search instead (1)

alexislashdot (808899) | more than 9 years ago | (#10992510)


Convert the search text into an image to look as written by hand.

Then do an image search on the documents. You will need a powerful image recognition software.

This would be news.

*** Find that COM error at http://www.comerrors.com [comerrors.com] **
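A toy version of "search by image instead of by text" might look like the Python sketch below. It assumes each word has already been segmented into a binary bitmap (1 = ink), reduces the bitmap to a per-column ink count, and compares two words with dynamic time warping so that stretched or squeezed handwriting can still line up. This is one well-known way to compare word images, offered only as an illustration; the helper names are hypothetical and it is not presented as the UMass implementation.

```python
def column_profile(bitmap):
    """Count ink pixels in each column of a binary word image (list of rows)."""
    return [sum(column) for column in zip(*bitmap)]


def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cost table
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            # extend the cheapest of: skip in a, skip in b, or match both
            curr.append(cost + min(prev[j], curr[j - 1], prev[j - 1]))
        prev = curr
    return prev[-1]


def rank_word_images(query_bitmap, word_images):
    """Return names of candidate word images, most similar profile first."""
    query = column_profile(query_bitmap)
    scored = [(dtw_distance(query, column_profile(bmp)), name)
              for name, bmp in word_images.items()]
    return [name for _, name in sorted(scored)]
```

The query image could be a word the user picks on one page, or typed text rendered in a handwriting-like font as suggested above; either way the ranking step only ever compares images with images.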

National Treasure (2, Funny)

Torgo's Pizza (547926) | more than 9 years ago | (#10992541)

If only Nicholas Cage had this tool at his disposal, it would have made things much, much easier.

Re:National Treasure (1)

tonsofpcs (687961) | more than 9 years ago | (#10992679)

How? The map was on the back side, these things only scan the front.......

this FP fo;r GNAA?! (-1)

Anonymous Coward | more than 9 years ago | (#10992577)

Numbers continue BSD's aaclaimed

Isn't this just combining existing technology? (0)

Anonymous Coward | more than 9 years ago | (#10992587)

Just run the text through an existing handwriting-aware scanner, then run your favorite search tool on it.

Step 1: Combine two existing software technologies.
Step 2: ???
Step 3: Profit!

Wait, that's how software patents work...

OCR (1)

dustinbarbour (721795) | more than 9 years ago | (#10992591)

Holy shnikes! Optical Character Recognition! Bah.. I'm part of a research team at the Center for Cybermedia Research who are working on new algorithms for OCR with $4 million from Homeland Security. It's to be used on a gi-normous database containing scanned images of documents relating to Yucca Mountain.

On top of that, OCR has been around for years. Yes, it isn't the best, but it's functional. Doesn't the Census Bureau use OCR for its census forms?

So, yeah.. where is the news in the article?

Re:OCR (0)

Anonymous Coward | more than 9 years ago | (#10992657)

So, yeah.. where is the news in the article?
You're new here, aren't you?

Must be expensive search engine (1)

CmdrPuto (246448) | more than 9 years ago | (#10992627)

For sure it would cost five times as much, and need a more complicated algorithm, if it were used to search doctors' handwriting.

Awesome! (0)

Anonymous Coward | more than 9 years ago | (#10992741)

I keep a handwritten log of daily work; when I arrive, when I leave and what I did. Every week these logs are run through an HP Digital Sender and a PDF version is emailed to me. I then take these PDF files and post them on my personal website. If I can add search capabilities, then that's about as ideal setup as I can imagine.

Stupid (1)

bryan986 (833912) | more than 9 years ago | (#10992743)

This is really, really, really, really stupid. It would be faster just to hand type the documents into the database, then search it. You could link to pictures of the documents if you really needed it.

OneNote does this already (1)

MOGua (750520) | more than 9 years ago | (#10992820)

I've been using this feature in OneNote for a long time now. It searches through my handwriting with amazing accuracy