Clever Clues Clobber Crossword Computer

Soulskill posted about 2 years ago | from the blithe-banter-bamboozles-brainy-bots dept.

AI 70

Hugh Pickens writes "Steve Lohr reports that an impressive crossword-solving computer program called Dr. Fill matched its digital wits against 600 of the nation's best human crossword-solvers, finishing only 141st at the American Crossword Puzzle Tournament in New York. 'I wish it had done better,' says Dr. Matthew Ginsberg, the creator of Dr. Fill and an expert in artificial intelligence. Dr. Fill typically thrives on conventional crosswords, even ones with arcane clues and answers; it solved one of the most difficult puzzles at the tournament perfectly. But the computer does poorly with clever clues based on puns or jokes, because humans and machines solve the crosswords very differently. Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer. The computer program is literal minded, and tends to struggle on puzzles with humor, and puzzles with unusual themes or letter arrangements. Take this clue from a 2010 puzzle in The Times: Apollo 11 and 12 (180 degrees). The answer is SNOISSIWNOOW, seemingly gibberish. A clever human could eventually figure out that those letters, when rotated 180 degrees, spell MOON MISSIONS. Humans get the joke, while a literal-minded computer does not. 'Occasionally, Dr. Fill just doesn't get it,' says Ginsberg. 'That's my nightmare.'"

70 comments

He named it a pun, but it sucks at puns. (2, Funny)

Anonymous Coward | about 2 years ago | (#39424633)

Can it manage ironic clues?

I can do that too (-1)

Anonymous Coward | about 2 years ago | (#39424637)

/. = Cynical cesspool contemplating computer curiosities

GIGO (2)

Jimbookis (517778) | about 2 years ago | (#39424647)

It's not Dr Fill's fault for not getting it. Clearly it's Dr Ginsberg who is not getting it.

My favourite "not real" crossword clues (5, Funny)

whyloginwhysubscribe (993688) | about 2 years ago | (#39424649)

If I see someone doing a crossword I usually say "I was stuck on a crossword the other day - the clue was 'very busy postman'". Eventually (sometimes it takes a while) they ask "how many letters" at which point you can say "hundreds!"

I'm such a funny guy...

Oh - another one is to say "seven up is lemonade"...

Re:My favourite "not real" crossword clues (2, Funny)

Anonymous Coward | about 2 years ago | (#39426417)

Perhaps unsurprisingly, one down is justifiable homicide.

Re:My favourite "not real" crossword clues (2)

Inda (580031) | about 2 years ago | (#39428405)

I went to a funeral the other day. The deceased was a crossword compiler.

He was buried 6 down and 4 across.

^^ you can have that one for free :)

I'd luck to congratulate (5, Interesting)

mapkinase (958129) | about 2 years ago | (#39424661)

I'd luck to congratulate submitter on a clever title. Does not happen very often here.

Re:I'd luck to congratulate (1)

mapkinase (958129) | about 2 years ago | (#39427375)

*like. If somebody thought that this was some kind of word play and modded up because of that (that's the only explanation of modups I have: my typo somehow made my comment clever beyond my understanding), it was not.

Re:I'd luck to congratulate (0)

Anonymous Coward | about 2 years ago | (#39427947)

I'd luck to congratulate submitter on a clever title. Does not happen very often here.

And it still hasn't happened, unless by "clever" you mean "something that a 12 year old would think was clever if he came up with it himself".

Re:I'd luck to congratulate (1)

DaFallus (805248) | about 2 years ago | (#39428253)

I have to respectfully disagree. I find these over the top alliterative titles to be incredibly contrived and annoying.

Computers have no sense of humor... (1)

dargaud (518470) | about 2 years ago | (#39424667)

...film at 11. And extended debugging session afterwards.

Re:Computers have no sense of humor... (1)

SnarfQuest (469614) | about 2 years ago | (#39430319)

...film at 11. And extended debugging session afterwards.

Is that 11 across, or down?

Re:Computers have no sense of humor... (1)

dodobh (65811) | about 2 years ago | (#39434801)

Non ex transverso, sed deorsum.

Poor example? (1)

John Pfeiffer (454131) | about 2 years ago | (#39424715)

Would the Apollo example really trip up a decently-written program though? I mean, my first thought was "Well, what if it had a fallback routine where it tries anagrams of possible answers?" so I have to imagine someone smarter than me has thought of that. I guess there's some limitation I'm not seeing...

On second thought...what am I still doing awake at 5:48am commenting on a post about crossword-solving computers?!

Re:Poor example? (1)

N1AK (864906) | about 2 years ago | (#39424797)

You could, and chances are that each time it hits something new like this it can be improved a little, so that any clue about rotation, flipping, etc. will check for this kind of skulduggery. However, functionality to look for something that looks like the word when viewed backwards and upside down isn't automatically something you'd add to a crossword computer.

Re:Poor example? (0)

Anonymous Coward | about 2 years ago | (#39424805)

SNOISSIWNOOW is not an anagram of MOONMISSIONS, so your suggestion wouldn't help.
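The anagram claim is easy to check by comparing letter counts. This is only a minimal sketch: the helper name `is_anagram` is made up for illustration and has nothing to do with how Dr. Fill works.

```python
from collections import Counter

def is_anagram(a: str, b: str) -> bool:
    """True if a and b use exactly the same letters (ignoring case and spaces)."""
    def letter_counts(s: str) -> Counter:
        return Counter(s.replace(" ", "").upper())
    return letter_counts(a) == letter_counts(b)

# The grid entry has two Ws and no M, so the counts cannot match.
print(is_anagram("SNOISSIWNOOW", "MOON MISSIONS"))
print(is_anagram("LISTEN", "SILENT"))
```

An anagram fallback would therefore never reach this entry; it needs a letter substitution (W for M) as well as a reordering.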

Re:Poor example? (3, Funny)

dkf (304284) | about 2 years ago | (#39424829)

Nor is it a word in any conventional sense, nor comprised of words. The only place it ever appears is in relation to discussion of this particular clue. The correct clever human response to such thing is to punch the setter in the face; they have broken the formal compact of crossword setting by using a non-word/non-phrase as an answer.

Re:Poor example? (1)

NekSnappa (803141) | about 2 years ago | (#39425131)

The correct clever human response to such thing is to punch the setter in the face; they have broken the formal compact of crossword setting by using a non-word/non-phrase as an answer.

Exactly. I like doing crosswords, but I hate it when they do stupid shit like that. I really hate it when the answer is two words and there's no indication of it.

Re:Poor example? (2)

fuzzyfuzzyfungus (1223518) | about 2 years ago | (#39425613)

TFA does make it sound like crossword puzzles are the paper equivalent of the most horrible, broken, 'adventure'/'puzzle' games of the 90s. The ones that were only solvable by either having a direct mind-meld with the developer and gaining mystic insight into "puzzles" or by brute-force-clicking every single pixel on every single ill-drawn background to interact with the entire set of interactable objects in the gameworld in all possible orders...

There is a fine line between 'subtle clue' and 'inside joke with a population of one', and it is entirely possible to cross it.

Re:Poor example? (1)

mwvdlee (775178) | about 2 years ago | (#39425995)

Exactly. If not, then what's to stop somebody from using "bunch of random letters" as a clue to a bunch of random letters?

OTOH, without such back-assed clues, a brute force attack would probably be more efficient at solving a crossword than some AI.

Re:Poor example? (1)

million_monkeys (2480792) | about 2 years ago | (#39424821)

I don't think it's that simple. Of course you could write the program to include answers made of letters rotated 180 degrees. But how often does an answer like that happen? I'm not a big crossword person, but I'd guess it's pretty rare. Maybe 1 out of 10,000 - if even that much? So while you might get that one clue right, meanwhile you've expanded the set of possible answers to include numerous elements with an extremely low chance of ever being correct. In addition to the added expense of considering them, you may inadvertently get wrong answers from including those options.

Whether those issues would be addressed by making that search a fall back option depends on how the program works. For that particular clue, the program may have come up with another answer, in which case the fall back routine would never have been called. Or the fallback might produce false answers that are worse than not having an answer for a given clue.

I guess the point is, you can add custom routines to account for every possibility you're smart enough to think up. And for common tricks, it's probably worth doing. But for what are essentially one off edge cases, whether those extra routines will actually help isn't clear. They may make things worse.

Re:Poor example? (3, Interesting)

bws111 (1216812) | about 2 years ago | (#39425999)

Here is one that just appeared this week (LA Times, I think):

Clue: Hail Answer: DANTESINFERNO
Clue: Poe Answer: FLATBROKE
Clue: What you need to get the above two answers: SOUTHERNDRAWL

Not sure how you make a routine to come up with those answers.

Re:Poor example? (1)

operagost (62405) | about 2 years ago | (#39428959)

That clinches it... not going to bother doing a crossword again!

Re:Poor example? (1)

John Pfeiffer (454131) | about 2 years ago | (#39433295)

Christ! Those are... Those are just ridiculous... I guess what I didn't understand is what crosswords have become!

Now I get what they mean by requiring some cleverness... I mean, there're PEOPLE who wouldn't get those! It makes a lot more sense now.

Not cryptic though (1)

ISoldMyLowIdOnEbay (802697) | about 2 years ago | (#39424739)

All decent crosswords in the UK tend to be of the cryptic kind, rather than just needing a thesaurus most of the time. Writing answers backwards wouldn't be allowed, though, as the answer has to be an actual word. Here's one that a computer might struggle with.... V? (6,2,7) Answer: Centre of Gravity

Re:Not cryptic though (1)

digitig (1056110) | about 2 years ago | (#39424949)

That would be considered an exceptionally poor cryptic clue, though, because one of the "rules" is that there should be some allusion to the -- or an -- actual meaning of the answer, however misleading. A better version of the clue would be "V is a source of strength".

Re:Not cryptic though (1)

Col. Bloodnok (825749) | about 2 years ago | (#39426017)

A clue definition of '...V' is acceptable, if it follows on from a previous related question or answer. Even then it may seem related, but isn't. This will throw one off the scent, but the answer will often be part of an overarching theme. A theme often allows weaker definition cluing. Several other clues might include tangential references to science, but there will be one clue that is often referred to by clue number only, elsewhere in the puzzle - e.g. 'Amphibian is working by force' (6) - hinting at the overall theme 'Newton'. I would also be on the lookout for oblique references to the biblical character of 'Isaac'.

Themes are particularly popular with Araucaria in the Guardian. He loves to bend the conventional rules and it is expected that he will. Which brings up another reason why this AI would struggle with UK cryptics - a priori knowledge of the setter's style, or even the house style.

Re:Not cryptic though (1)

digitig (1056110) | more than 2 years ago | (#39442351)

I'm not sure, but I seem to remember that it was Araucaria who laid down the general rules that most British setters are nowadays expected to follow. In which case he probably feels somewhat at liberty to bend them :-)

Re:Not cryptic though (1)

netwarerip (2221204) | about 2 years ago | (#39425631)

The computer would definitely struggle with that one because you spelled 'Center' incorrectly. :)

Re:Not cryptic though (1)

CastrTroy (595695) | about 2 years ago | (#39425755)

Not in the UK. Or Canada, or Australia. Actually I think the Americans are the only ones that have it wrong.

Re:Not cryptic though (1)

operagost (62405) | about 2 years ago | (#39428995)

Whoosh.

Re:Not cryptic though (1)

tlhIngan (30335) | about 2 years ago | (#39429419)

The computer would definitely struggle with that one because you spelled 'Center' incorrectly. :)

There is a nasty one inside the crossword app on the Nook Color (not sure if it's in other Nooks, but I'm guessing it is) where the answer is spelled "CENTRE". The problem is the down answers really want "CENTER" to make any sense (one of the down ones was "TENT" which became "TRNT").

Not sure if it was a typo or not. And the puzzles have no identifier so you can point it out.

Re:Not cryptic though (1)

Col. Bloodnok (825749) | about 2 years ago | (#39426221)

Other one AI might struggle with:

Lisping girl of legend (4) - (ans: 'Myth')

How about some Cockney Rhyming slang?:

Beehive in North London? (4, 6) - (ans: 'High Barnet')

Re:Not cryptic though (1)

91degrees (207121) | about 2 years ago | (#39426671)

A really good thesaurus might help with that second one.

Use of cockney isn't uncommon so it would make sense to include both of those in the definition for "hairstyle".

Re:Not cryptic though (0)

Anonymous Coward | more than 2 years ago | (#39437329)

Not difficult to hide the Cockney though.

Intoxicated Walford do?

To be fair, I made the Beehive one up in 30s, and it lacks a

Cultural differences (0)

Anonymous Coward | about 2 years ago | (#39424755)

Different cultures apparently have different rules for crossword puzzles. AFAIK, Finnish crossword puzzles would require that each answer must be a valid, independent word (in singular or plural). Moreover, in Finnish crossword puzzles, half of the clues are graphical [sanaristikot.net] .

Re:Cultural differences (1)

mwvdlee (775178) | about 2 years ago | (#39426053)

It probably has a lot to do with the frequency of letters in a language. I can't think of any reasonable way to fill out all the 4x4 blocks in your puzzle with actual words in my native language, Dutch. I'm guessing the distribution of letters in Finnish is much more skewed.

Looks like (1)

Anonymous Coward | about 2 years ago | (#39424853)

the computer needs help from Joe Piscopo [youtube.com]

My favourites (1)

troon (724114) | about 2 years ago | (#39424889)

I'd like to see if Dr Fill manages these two:

HIJKLMNO (5)
___ (2, 3, 4, 1, 4)

Re:My favourites (1)

91degrees (207121) | about 2 years ago | (#39424965)

HIJKLMNO

Yes, I remember seeing that one. One of those clues that filled me with delight when I got it.

___ (2, 3, 4, 1, 4)

To Not Have A Clue?

Re:My favourites (1)

dkf (304284) | about 2 years ago | (#39426695)

HIJKLMNO (5)

That reminds me of this one:

ABCDEFGHIJKMNOPQRSTUVWXYZ (4)

Re:My favourites (0)

Anonymous Coward | about 2 years ago | (#39427515)

And:

e (13)

Answer: "Senselessness"

Redundancy (3, Insightful)

Hentes (2461350) | about 2 years ago | (#39424975)

Not being able to guess a few words might not be a problem: skip them and solve the other ones. Once enough crossing letters are filled in, a computer can easily look up the words that fit, and if there is more than one, even use a nonlinear approach. Even without any clues, a few words can't be that hard to bruteforce.
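Looking up words from the crossing letters could be sketched like this. The word list and the function name `candidates` are invented for the example; a real solver would load a large dictionary and, as other comments point out, would still miss entries that are not words at all.

```python
import re

def candidates(pattern: str, word_list: list[str]) -> list[str]:
    """Return the words that fit a partly filled entry.

    pattern uses '.' for each unknown square, e.g. 'M..N'.
    """
    regex = re.compile(pattern.upper())
    return [w for w in word_list if regex.fullmatch(w.upper())]

# Hypothetical mini-dictionary, for illustration only.
words = ["MOON", "MAIN", "MEAN", "NOON", "MOONS"]
print(candidates("M..N", words))
print(candidates("M.O.", words))
```

The more crossing letters are known, the shorter the candidate list gets, which is the redundancy this comment is relying on.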

Re:Redundancy (0)

Anonymous Coward | about 2 years ago | (#39425259)

But all the competitors are already doing that. The computer is trying to win, not draw even.

Re:Redundancy (1)

Hentes (2461350) | about 2 years ago | (#39425565)

A computer can bruteforce much faster than a human.

Re:Redundancy (2)

Bigby (659157) | about 2 years ago | (#39425795)

A human can cleverly clobber a crossword computer with brute force.

Re:Redundancy (0)

Anonymous Coward | about 2 years ago | (#39425517)

It's not always that simple. Some years back one of the easiest puzzles at the competition was completely backwards. Every clue was written backwards so it was obvious that all the answers would be backwards -- well, obvious to everyone but the computer. Just about every human got the puzzle right, but the computer got only one word just by sheer luck.

dom

Re:Redundancy (1)

CastrTroy (595695) | about 2 years ago | (#39425789)

This is what I was thinking. Crossword puzzles should be pretty easy to solve if you can brute force the thing. If you can't solve something, solve all the other clues in the other direction, and you have an answer. Unless these contests don't actually solve whole puzzles, but rather are given a partial puzzle with some parts filled in, and have to answer particular clues. Also, are the contestants allowed to use dictionaries, thesauruses, and other reference materials? Because if they aren't, it's even more disappointing how well this computer did, since it would be easy for the computer to contain an entire dictionary, and every crossword puzzle ever published, in its memory.

Re:Redundancy (1)

Hillgiant (916436) | about 2 years ago | (#39426783)

Not necessarily. The computer likely double checks cross clues to verify any individual answer. If one of the answers starts to appear to be gibberish, it will throw a large swath of intersecting answers in doubt. If the gibberish answer crosses a large portion of the grid, the doubt can propagate through the entire puzzle.
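A cross-check of this kind might look roughly like the sketch below. Everything in it is an assumption made for illustration (the data layout, the name `conflicts`, the sample grid); the article does not say how Dr. Fill actually scores crossings.

```python
def conflicts(filled: dict, cells: list, answer: str) -> int:
    """Count squares where `answer` disagrees with letters already placed.

    filled: maps (row, col) -> letter already written by crossing answers
    cells:  the (row, col) squares the answer would occupy, in order
    """
    return sum(
        1
        for cell, letter in zip(cells, answer)
        if filled.get(cell, letter) != letter  # empty squares never conflict
    )

# Hypothetical situation: a down answer has already put an E in square (0, 4).
filled = {(0, 4): "E"}
across_cells = [(0, col) for col in range(6)]
print(conflicts(filled, across_cells, "CENTER"))
print(conflicts(filled, across_cells, "CENTRE"))
```

A long entry that looks like gibberish touches many squares, so if the solver distrusts it, every crossing answer loses its confirming letter, which is how the doubt spreads.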

Nice post (0)

normanlee (2600137) | about 2 years ago | (#39425059)

Ya that is right and computer will randomly change the crossword puzzle. The writer has discussed a very good point that we humans recognize and take decision on patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer not even knowing what are they doing. Bathroom Faucets [remodelproducts.com]

"That's my nightmare" (2)

Anonymous Coward | about 2 years ago | (#39425467)

Unfortunately for Dr Fill's creator, the problem of how to get the program to work with such unorthodox solutions is the same as getting it to think like a person. At a certain point, all AI questions become the same AI question: this is the very essence of Turing horizon, and all such efforts converge there.

The program he wants to write is, sadly, doomed, as it will be impossible until such time as our species generates a true artificial consciousness with human intelligence, at which point the problem will be trivial - we will have much larger concerns that day.

Witty Watson Won (1)

Anonymous Coward | about 2 years ago | (#39425479)

The crossword puzzle guy needs lessons from Watson, who clobbered several Jeopardy human champions [slashdot.org] . That show has clever categories and clues. Watson probably had more impressive computing power, but I doubt that was the issue. The Watson designers clearly had a better grasp of natural language, including humor-filled and storied language.

Re:Witty Watson Won (0)

Anonymous Coward | about 2 years ago | (#39429855)

Watson won, in part, because he was faster than his opponents. The way that Jeopardy is structured, Watson needed only to know a high percentage of the answers and be faster. It could (and did) fail miserably at some questions just like Dr. Fill did.

If there was a prize for who could fill out 75% of the crossword puzzle first, I'm sure Dr. Fill could have won that easily.

this does not show humans are special (0)

Anonymous Coward | about 2 years ago | (#39425879)

There is nothing magical about the example given. 180 degrees correlates with rotation. Apollo correlates to Moon missions. Rotation correlates to crew rotation, reversing numbers, rotating around the center... bingo. That fits. Humorous answers follow the exact same rules as any other answers.

different processes? (1)

shadowrat (1069614) | about 2 years ago | (#39426041)

Humans recognize patterns based on accumulated knowledge and experience, while computers make endless calculations to determine the most statistically probable answer.

What's different about that? I've often said, "The more I learn about AI, the more I think it lacks any intelligence at all. The more I learn about psychology, the more I believe that humans think just like an AI."

Humans are also determining the most statistically probable answer. They just have a better algorithm for factoring humor into those statistics.

When is a moon mission not a moon mission? (-1)

Anonymous Coward | about 2 years ago | (#39426339)

QUOTE: The answer is SNOISSIWNOOW, seemingly gibberish. A clever human could eventually figure out that those letters, when rotated 180 degrees, spell MOON MISSIONS.

RESPONSE: Using that criterion (rotated 180 degrees) you'd still have the wrong answer; the answer provided is mixing the letter W with the letter M ... was the answer Woon Wissions or Moon Missions? Slash-dot editors at their best once more...

Re:When is a moon mission not a moon mission? (1)

OldeTimeGeek (725417) | about 2 years ago | (#39426795)

No, they're correct - you're just thinking of rotation on the wrong axis. To see how it works, try this: Take the word SNOISSIWNOOW and write it on a piece of paper. Rotate the paper 180 degrees so the lower right becomes the upper left. The part of the letters that were the bottom are now on the top - hence "W" becomes an "M" and the result is MOONMISSIONS.
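The paper experiment can be written out as a short sketch: turning the page reverses the order of the letters and turns each letter upside down. The lookup table below is an assumption about which capitals still read as letters after the turn (real typefaces vary), and `rotate_180` is just an illustrative name.

```python
# Capital letters that still read as a letter after a 180-degree turn.
# Assumed for illustration; this depends on the typeface.
ROTATED = {
    "H": "H", "I": "I", "N": "N", "O": "O", "S": "S", "X": "X", "Z": "Z",
    "M": "W", "W": "M",
}

def rotate_180(text: str):
    """Return the text as read after turning the page upside down,
    or None if some letter has no rotated counterpart."""
    letters = text.replace(" ", "").upper()
    if any(ch not in ROTATED for ch in letters):
        return None
    # Last letter becomes first, and each letter is flipped individually.
    return "".join(ROTATED[ch] for ch in reversed(letters))

print(rotate_180("SNOISSIWNOOW"))
print(rotate_180("MOON MISSIONS"))
print(rotate_180("APOLLO"))
```

Reversing alone gives WOONWISSIONS; it is the per-letter flip that turns each W into an M.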

Re:When is a moon mission not a moon mission? (3, Funny)

grandpastackhouse (2036004) | about 2 years ago | (#39426929)

*HSOOM*

Re:When is a moon mission not a moon mission? (0)

Anonymous Coward | about 2 years ago | (#39427347)

Seems like you're a failed AI.

typo? (-1)

Anonymous Coward | about 2 years ago | (#39426517)

you put W for M...

So Doctor Fill... (1)

Oswald McWeany (2428506) | about 2 years ago | (#39427653)

So Doctor Fill has the same ability to comprehend humour as Doctor Phil?

Sounds familiar. (0)

Anonymous Coward | about 2 years ago | (#39427683)

Blaine the Mono was unavailable for comment.

The two puzzles it tanked on (0)

Anonymous Coward | about 2 years ago | (#39428801)

The first puzzle it tanked on (#2 in the tournament) had every other across line reading backwards. Thus, the answer to "Title in a Joel Chandler Harris story" was RERB, not BRER. I would expect a computer to fail at this.

The other puzzle it tanked on (#5) had long words containing the trigram ANT split into three parts, with the ANT portion connecting the beginning and the end by running diagonally through the grid. I'd expect the computer to fail on that one too.

However, it did very well on the straightforward, gimmick-less puzzles, cruising through the final puzzle, which was deemed the hardest by humans.
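The gimmick in puzzle #2 is simple to state once you know it, which is the whole problem: the solver has to guess the rule first. A sketch of the rule itself, with the function name and the choice of which rows are reversed both assumed for illustration:

```python
def as_entered(answer: str, row: int, odd_rows_reversed: bool = True) -> str:
    """Return the across answer as it must be written in the grid when
    every other row reads right to left.

    Which rows are reversed (odd or even, counting from 0) is an
    assumption here; the comment above does not say.
    """
    is_reversed = (row % 2 == 1) == odd_rows_reversed
    return answer[::-1] if is_reversed else answer

print(as_entered("BRER", row=1))  # a reversed row
print(as_entered("BRER", row=0))  # a normal row
```

A solver that only ever tests dictionary words against the grid never considers RERB, so it would need either this rule built in or a way to infer it from the crossing letters.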

SNOISSIWNOOW (1)

operagost (62405) | about 2 years ago | (#39428905)

Isn't having a gibberish answer in a crossword puzzle like making up your own words in Scrabble? Doesn't the creation of a crossword puzzle have any rules? No wonder I often do poorly with them; I had no idea that they could be making up nonsense words.

Re:SNOISSIWNOOW (1)

SnarfQuest (469614) | about 2 years ago | (#39430497)

Isn't having a gibberish answer in a crossword puzzle like making up your own words in Scrabble?

All they need to do is claim it's Polish. Most puzzles in the major papers consist of 50% foreign words anyway. That allows them to make up very long words containing no vowels.

I thought Crosswords... (1)

thatbloke83 (1529851) | about 2 years ago | (#39430875)

...were supposed to be composed of REAL words.

WTF is "SNOISSIWNOOW"??

Re:I thought Crosswords... (0)

Anonymous Coward | more than 2 years ago | (#39471895)

the clue, and therefore not part of the crossword. ltr.

so it is just plain stupid... (0)

Anonymous Coward | about 2 years ago | (#39432657)

so it is just plain stupid...

I hate crossword puzzles (1)

dentin (2175) | more than 2 years ago | (#39435615)

I can't blame the computer for not doing well on these; a lot of crossword puzzles are a puzzle of "guess what the creator was thinking", and not a puzzle of words and language. Quite frankly, I'm not interested in guessing what someone else happens to be thinking when they write down a clue like "blue, red, and big"; I find that fundamentally uninteresting and of no long term value.

I have the same problem with many Mensa puzzles. A lot of them I can do, but puzzles that require deep and specific information from an extremely narrow field are really not useful tests of intelligence. Just as the "SNOISSIWNOOW" question above requires deep and specific information about an extremely narrow field (the narrow field being "how certain subsets of human minds think"), so too do questions such as "guess the next number in the sequence: 3, 5, 205782654, 6, 308" (which also requires knowledge about how certain human minds think). Neither answer is derivable, and both must be guessed via trial and error from models that few people will have.

The most common response to this viewpoint has been along the lines of "ha ha, you just can't think outside the box". In reality, I'm actually pretty damned good at thinking outside the box. What I'm not particularly good at is thinking inside someone else's box, because if I actually need to know, I can simply talk to them and ask them. It's far more efficient.
