
Mac Version of NaturallySpeaking Launched

kdawson posted more than 6 years ago | from the listen-what-i-say dept.

Input Devices 176

WirePosted writes "MacSpeech, the leading supplier of speech recognition software for the Mac, has canned its long-running iListen product and has launched a Mac version of Dragon NaturallySpeaking, the top-selling Windows speech recognition product. MacSpeech had made a licensing agreement with Dragon's developer, Nuance Communications. The new product is said to reach 99% accuracy after 5 minutes of training."


176 comments


ai (2, Funny)

User 956 (568564) | more than 6 years ago | (#22063790)

MacSpeech, the leading supplier of speech recognition software for the Mac, has canned its long-running iListen product and has launched a Mac version of Dragon NaturallySpeaking

Tell me more about has launched a Mac version of Dragon NaturallySpeaking.

Re:ai (1)

arazor (55656) | more than 6 years ago | (#22064026)

Teach me of fire mancub.

At Last! (5, Interesting)

Slurpee (4012) | more than 6 years ago | (#22064188)

I was at the Apple Dev conference in 1999 (or so) when the CEO of Dragon got up during Steve's keynote and announced that they were going to develop a Mac version of Dragon.

Almost 10 years later - and it's finally here!

Or at least a follow up announcement is here.

Re:At Last! (1)

fortunato (106228) | more than 6 years ago | (#22064610)

All I know is that if this means that my wife will be able to get to the right department when calling the insurance company to make a doctor appointment for our kids I'll be a happy camper. ;) I would forgo the cursing, redialing, and angry expletives that are required right now in order to make a simple pre-note that we are taking the kids in for their required annual physical.

Re:ai (0, Redundant)

ozmanjusri (601766) | more than 6 years ago | (#22064300)

Dear aunt, let's set so double the killer delete select all.

Re:ai (0)

Anonymous Coward | more than 6 years ago | (#22066226)

People spent an extra decade yelling at their PCs and the PCs pretended to listen.

Talking to oneself (4, Informative)

flyingfsck (986395) | more than 6 years ago | (#22063844)

I tried Dragon a number of times, but it feels too much like talking to oneself. Training it is a chore too. 99% accuracy after 5 minutes is probably true, but I type much better than that. I suppose it will be great for people who either can't type properly or are lysdexic.

Posting to oneself (0)

Anonymous Coward | more than 6 years ago | (#22063962)

"I suppose it will be great for people who either can't type properly or are lysdexic."

Getting First Post! will be a lot easier.

Re:Talking to oneself (0, Redundant)

calculadoru (760076) | more than 6 years ago | (#22064356)

people who either can't type properly or are lysdexic

Which one are you then?

Re:Talking to oneself (0)

Anonymous Coward | more than 6 years ago | (#22064722)

you found the joke!

have a cookie.

Re:Talking to oneself (2, Insightful)

Seumas (6865) | more than 6 years ago | (#22064408)

I tried Dragon years ago and after a couple hours or so of training, it still completely sucked. Same with IBM ViaVoice. Perhaps Google will help improve things with their GOOG411 service that they're using to build up a massive bank of phonetics. Otherwise, it seems like real speech recognition is never seriously going to get off the ground.

Re:Talking to oneself (4, Funny)

rucs_hack (784150) | more than 6 years ago | (#22064896)

I tried it a few years back. I stopped when my youngest, who was still learning to talk, started going round the house saying 'mousegrid' all the time.

Good job he didn't get the whole thing though, which was typically:

"Mousegrid...."

"Mousegrid...."

"MOUSEGRID!!...."

"FUCKING MOUSEGRID YOU PIECE OF SHIT PROGRAM!!!"

Re:Talking to oneself (2, Funny)

alex4u2nv (869827) | more than 6 years ago | (#22064518)

Training is tough because they replaced the iListen package with iStoppedListening.

Also, its use may be weak in dictating a paper, but it's great for dictating a command.

Think about it, you could walk up to your iComputer and say "Main Screen Turn on!!"
instead of pressing the power button.

Re:Talking to oneself (2, Interesting)

rolfwind (528248) | more than 6 years ago | (#22064556)

I used it too a number of times - I probably have an accuracy rate not much better than 99% typing - I'm a klutz. But whereas fixing a mistake in the middle of typing is pretty smooth and not too time consuming, Dragon makes a chore of every little mistake.

I won't recommend "Don't use it" because it's really a personal choice - some people love it and some hate it. But I have tried 3 versions so far (including the latest) and it wasn't so much a conscious decision to stop using it as much as I just eventually stopped bothering.

I could see using it to write up letters, a chore Dragon is very competent at once trained (not necessarily faster or even as fast as typing, though), but that's a task I seldom engage in for extended durations.

But part of the dream of Speech Recognition is telling the computer to do this and that -- even just a simplistic version of what is in some Sci-Fi like in Star Trek -- and the computer just knows what it needs to do and does it. I'm not even talking anything as complicated as AI, just something like "look up slashdot" and it fires up the browser and goes to the site. Or while using Dragon the command won't be "Set my dentist appointment for 4:00pm Wednesday" but more like (open calendar app with mouse, put mouse on correct textbox and click) "Dentist Appointment.... Tab..... tab.... numeral 7...." (bring mouse over AM/PM selector and select PM).

This isn't something that is Dragon's fault -- I think in many years programs and OSes as well will have a number of keywords that will control them built in (if I'm not mistaken Apple has a primitive version of this but the speech recognition is crap). Dragon has great accuracy but the program is hopeless in commands and context (yes, I know it can be trained -- like a dog; a lot of effort for a few piddly tasks) and I think that's a major aspect of what many people would secretly like when they try out the program.

Re:Talking to oneself (1)

CastrTroy (595695) | more than 6 years ago | (#22064720)

I would recommend "don't use it" in an office environment, or any other environment where people can hear you speaking. Nothing more annoying than listening to somebody else say "Dentist appointment.... Tab.... Tab.... numeral 7..." all day long.

Re:Talking to oneself (1)

ubrgeek (679399) | more than 6 years ago | (#22065796)

If I'm not mistaken Apple has a primitive version of this but the speech recognition is crap.

Actually, from my experience it's pretty good, at least for short expressions. I've got mine set up to do things exactly like your Slashdot example. (I tell it "Browser slashdot" and it works great. I'm guessing that's because it knows "browser" means that I want the word right afterward to be the phonetic term "Slashdot" that I've previously told it means the website, not "/ .") It's also useful for things like launching Mail.app and checking email. With AppleScript, it becomes even more useful. (I can tell it to launch a pre-built "app" that can do just about any number of things using Automator.) While it would obviously be trivial to have those apps on the Dock so that I can click them to launch, this way I don't have to take up Dock space to do so.
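For anyone curious how a keyword prefix like "Browser" can narrow what the next word means, here is a minimal sketch. It is not Apple's speech API; `SITES`, `APPS`, `interpret` and the action names are all made up for illustration, and the utterance is assumed to arrive as already-recognized text.

```python
# Hypothetical lookup tables; the spoken names and targets are illustrative only.
SITES = {"slashdot": "https://slashdot.org"}
APPS = {"mail": "Mail.app"}


def interpret(utterance):
    """Map a short spoken command to an (action, target) pair.

    The first word selects which vocabulary applies to the rest of the
    phrase, so "slashdot" after "browser" is looked up as a site name
    instead of being transcribed as punctuation.
    """
    words = utterance.lower().split()
    if len(words) < 2:
        return ("unknown", utterance)
    keyword, name = words[0], " ".join(words[1:])
    if keyword == "browser" and name in SITES:
        return ("open_url", SITES[name])
    if keyword == "launch" and name in APPS:
        return ("open_app", APPS[name])
    return ("unknown", utterance)
```

Calling `interpret("Browser slashdot")` would hand back an open-URL action for the stored site, while a bare "slashdot" with no keyword falls through as unknown.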

Re:Talking to oneself (2, Informative)

samkass (174571) | more than 6 years ago | (#22066158)

Yeah, Apple's speech recognizer has very dissimilar goals to Dragon's (although both, if I recall correctly, got their start at Carnegie Mellon's speech labs). Apple is trying to build a speaker-independent, no-training-required recognizer that can handle short commands. Dragon doesn't care as much about speaker independence, but requires accuracy over sentences and paragraphs. Very different algorithmic, HCI and optimization problems.

Re:Talking to oneself (1)

xtracto (837672) | more than 6 years ago | (#22064596)

I tried Dragon a number of times, but it feels too much like talking to oneself. Training it is a chore too. 99% accuracy after 5 minutes is probably true, but I type much better than that. I suppose it will be great for people who either can't type properly or are lysdexic.

99% accuracy means that for every 100 words (a paragraph) you will have one wrong word. Now, that accuracy is in the "optimal conditions" and talking at a specific pace. The problem with the other 1% is that the wrong word might not even be related to the text (whereas when you are writing, the error is mostly in spelling).

Personally, last time I tried Dragon was about 4 years ago (installed it for my mom to test it) and it was terrible. I wonder if it won't be a good idea to design a special soundcard (not only the mic) to aid in the recognition?

Re:Talking to oneself (0, Flamebait)

kurt555gs (309278) | more than 6 years ago | (#22065256)

I wonder if they thought of including an algorithm to deal with the lisp that is present in most male MAC users?

This would be different than the Windows version.

Cheers
 

Re:Talking to oneself (0)

Anonymous Coward | more than 6 years ago | (#22065370)

I wonder if they thought of including an algorithm to deal with the lisp that is present in most male MAC users?
They have. That same algorithm also corrects distortion from their holding of the microphone too close to their mouth, and the echo when inside it.

Re:Talking to oneself (0)

Anonymous Coward | more than 6 years ago | (#22066110)

This would be different than the Windows version.

Yeah, the Windows version would fuck YOU in the ass... and you'd like it, Kurt. Come out of the closet!

Re:Talking to oneself (0)

Anonymous Coward | more than 6 years ago | (#22064902)

Training is not a chore at all on Dragon; in fact, on version 9 you can skip it entirely - not sure whether MacSpeech will implement that feature. But can you type at 160 words a minute? I doubt you would achieve a third of that! That's as fast as you can talk, and the Dragon software does a good job of keeping up!
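The speed argument can be put into a rough back-of-the-envelope model. This is only a sketch: the 160 words a minute comes from the comment above, but `seconds_per_fix` is an assumed parameter that nobody in the thread measured, and the function name is made up.

```python
def effective_wpm(raw_wpm, error_rate, seconds_per_fix):
    """Net words per minute once correction time is counted.

    raw_wpm: speed before any corrections
    error_rate: fraction of words that come out wrong (0.01 means 1%)
    seconds_per_fix: assumed time to spot and correct one wrong word
    """
    # Time per word is the raw entry time plus the average correction cost.
    minutes_per_word = 1.0 / raw_wpm + error_rate * seconds_per_fix / 60.0
    return 1.0 / minutes_per_word
```

Whether dictation beats typing in this model depends almost entirely on what you plug in for `seconds_per_fix`, which is the part people in this thread disagree about.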

noocular (2, Funny)

Hognoxious (631665) | more than 6 years ago | (#22065068)

I don't think dictation's the solution. If you're discelyc what you really need is a spielchucker.

And what about people who speak dyslexically? Yes, Dubya, as it happens I am looking at you.

Re:noocular (1)

CSMatt (1175471) | more than 6 years ago | (#22065328)

That depends on how good your spelling skills are. You still need to spell the word well enough that the spell checker can guess what word you want to use. I'm a horrible speller, and I know that I've encountered a number of times where I had to keep guessing at a word's spelling just to get the spell checker to recognize what it should be.

Re:Talking to oneself (2, Funny)

Sox2 (785958) | more than 6 years ago | (#22065258)

hey, i'm using it now. It wonks fine.

Re:Talking to oneself (4, Interesting)

duvel (173522) | more than 6 years ago | (#22065350)

I am entering this comment while using Dragon NaturallySpeaking version 8.

I am not a native English speaker, but I am usually able to say just about anything I want. In this comments, I have not altered any of the mistakes (if any) that Dragon NaturallySpeaking made while I was dictating. As you can see, the error rate is probably a bit higher than 99 per cent correctness. Nevertheless, I used this extensively, because it increases the speed at which I can work.I often have to type reports, and it goes a lot faster while using this tool. The only problem is that these reports contain lots of enterprise specific (and IT specific) terms. Naturally, it takes a while before Dragon NaturallySpeaking knows all of these terms.

Other than that, I am very happy with it.

Re:Talking to oneself (0)

Anonymous Coward | more than 6 years ago | (#22065902)

About 9 months ago, I broke my elbow. fucking ouch..

anyway, as an experiment work gave me Dragon NaturallySpeaking so I could complete a bunch of reports and documents on time. I could type and use the mouse, but not for the sheer volume of work I had to complete, plus I was also enjoying some awesome pain killers for about a month there as well which, whilst enjoyable, didn't help in the speed and accuracy department.

After about an hour of learning it and it learning me, I could use it as fast as I could type.

You do get punished for making mistakes. Making corrections or minor edits becomes really tedious, and after a while you simply cannot be bothered 'training' it to avoid the same mistake the next time.

There is absolutely no way it can replace a mouse and keyboard for regular interaction with the computer, but for keying in loads of text it is actually really great and I can recommend it.

Re:Talking to oneself (1)

autophile (640621) | more than 6 years ago | (#22065980)

Out of curiosity, how long did it take you to dictate the comment?

--Rob

Isn't that... (2, Informative)

Sylos (1073710) | more than 6 years ago | (#22063868)

the whole intention of Dragon? For those people who *are* impaired in some way or another? I mean...I could never "speak" out a paper or something. I'd end up tearing my vocal cords out.

Re:Isn't that... (1)

cheater512 (783349) | more than 6 years ago | (#22064132)

Yes, it's useful for those people.

It's also incredibly useful for people who can't shut up.
I know quite a few people like that. ;)

Re:Isn't that... (1)

coolGuyZak (844482) | more than 6 years ago | (#22065278)

I'm one of those people, but I wouldn't use it to enter text into a computer. It seems better suited as a transcription device.

Re:Isn't that... (3, Interesting)

Propaganda13 (312548) | more than 6 years ago | (#22064138)

David Weber http://www.baen.com/author_catalog.asp?author=DWeber [baen.com] uses voice recognition software for writing novels.

David talking about it back in 2002.
"On a more technical front, I began using voice-activated software when I broke my wrist very badly about two years ago. I've found that it tends to increase the rate at which I can write while I'm actually working, but that it's more fatigue-sensitive than a keyboard. You can push your fingers further than you can push your voice when fatigue begins to blur your pronunciation and confuse the voice recognition feature of your software.

I don't think it's had a major impact on my writing style, but it does affect how I compose sentences. What I mean by that is that because the software prefers complete phrases, in order to let it extrapolate from context when it's trying to decide what word to use for an ambiguous pronunciation, I have to decide how I want a sentence to be shaped before I begin talking to a much greater extent than I had to do before I began typing."
http://sfcrowsnest.co.uk/features/arc/2002/nz5718.php [sfcrowsnest.co.uk]

I for one... (5, Funny)

tieTYT (989034) | more than 6 years ago | (#22063890)

Re:I for one... (0)

Anonymous Coward | more than 6 years ago | (#22064662)

I personally like the Perl scripting one: http://www.youtube.com/watch?v=KyLqUf4cdwc [youtube.com]

As the Apple ads have demonstrated... (2, Funny)

kcbanner (929309) | more than 6 years ago | (#22063898)

...Mac users will have no trouble chatting with their computer for 5 minutes. Think of how accurate the system will be if the users got into a heated debate!

Apple version (2, Funny)

Wiseman1024 (993899) | more than 6 years ago | (#22063956)

Will it recognize metrosexual accents?

Re:Apple version (5, Funny)

bobdotorg (598873) | more than 6 years ago | (#22064070)

Will it recognize metrosexual accents?

Yes, select the check box: preferences/language settings/accent/Fanboi/Apple

This is the Mac equivalent to your current setting:

options/language setings/accent/Troll/WindowsME

Re:Apple version (0)

Anonymous Coward | more than 6 years ago | (#22064282)

Score 3, Informative? Change metrosexual to homosexual and watch yourself get modded troll/flamebait. What's the difference?

Re:Apple version (2, Funny)

ozmanjusri (601766) | more than 6 years ago | (#22064322)

What's the difference?

Drop the soap and you'll find out.

Re:Apple version (5, Funny)

_merlin (160982) | more than 6 years ago | (#22064326)

Score 3, Informative? Change metrosexual to homosexual and watch yourself get modded troll/flamebait. What's the difference?

The difference is that while metrosexuals try hard to be gay, homosexuals succeed.

Re:Apple version (0)

Anonymous Coward | more than 6 years ago | (#22064942)

Precisely. Which is why it's PC to make fun of them. Me, I would've just said homosexual.

(I'm the AC you're replying to)

Re:Apple version (0)

Anonymous Coward | more than 6 years ago | (#22064972)

Score 3, Informative? Change metrosexual to homosexual and watch yourself get modded troll/flamebait. What's the difference?


The difference is that while metrosexuals try hard to be gay, homosexuals succeed.

Score 4, Insightful? The defining trait of homosexuality is same-sex mating.

Mocking incidental characteristics associated with certain Western homosexual lifestyles is hateful towards homosexuals whether the target of your mocking commits to intercourse with same-sex partners or not. You fear and loathe the traits of homosexuals, but political correctness limits the expression of your loathing to non-protected groups exhibiting them.

Re:Apple version (0)

Anonymous Coward | more than 6 years ago | (#22065048)

butt fucking another dude is not an incidental characteristic of the homo lifestyle.

Re:Apple version (2, Funny)

Wiseman1024 (993899) | more than 6 years ago | (#22064820)

Lol, Apple iFanboys are wasting their mod points on this. Better keep them busy here rather than have them influence meaningful discussion.

Re:Apple version (1)

Malevolent Tester (1201209) | more than 6 years ago | (#22065550)

Yeth.

Whatever became of this technology? (5, Insightful)

lhaeh (463179) | more than 6 years ago | (#22063972)

The last time I tried using voice dictation was when I was running OS/2 Warp 4. Training took forever, and the experience of using it was nothing but an exercise in frustration, ending with me screaming at the bloody thing and then seeing neat, yet random expletives on my screen. I later came across some budget software that required no training, yet worked surprisingly well compared to the $400 packages made by the big boys. That software really showed what voice dictation should be like, if only it was developed further.

The training and accuracy seem like things that can be overcome, but I would really like to see a solution for things like punctuation and function keys, things that don't naturally come with speaking. Instead of having to say "delete that" or " delete" it would be nice to just have a button that I can hold down when saying things I want interpreted as commands.

Re:Whatever became of this technology? (4, Insightful)

jimicus (737525) | more than 6 years ago | (#22064160)

A few things became of the technology:

1. 99% accuracy rate is actually pretty bad in the real world. In a typical document, you might expect 12-15 words per line - so you have one error every 7 lines or so.
2. 99% accuracy rate is only achievable under ideal circumstances - ie. using a top quality microphone hooked up to a good soundcard in an environment with very little background noise and no echo. Basically, circumstances you only get in a half-decent recording studio. In the real world, you seldom get this.
3. It only works if you happen to be blessed with amazing self-discipline (and/or can guarantee that nobody is going to approach you while you're working). Otherwise you get back to work after a distraction and find yourself having to delete a conversation you just had with a co-worker.
4. If you're in an open-plan office (that's probably about 99% of UK offices these days) your colleagues will not thank you for spending all day talking.
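The arithmetic in point 1 is easy to check. A small sketch, with function names invented for illustration and the simplifying assumption that errors are spread evenly through the text:

```python
def lines_between_errors(accuracy, words_per_line):
    """Expected number of lines between wrong words at a given accuracy."""
    words_per_error = 1.0 / (1.0 - accuracy)
    return words_per_error / words_per_line


def expected_errors(accuracy, word_count):
    """Expected number of wrong words in a document of word_count words."""
    return (1.0 - accuracy) * word_count
```

At 99% accuracy and 12 to 15 words per line this works out to roughly one error every 6.7 to 8.3 lines, which matches the "every 7 lines or so" figure above.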

I've used the software (1)

TheVelvetFlamebait (986083) | more than 6 years ago | (#22064784)

I had about a 98% accuracy rating with the included microphone and no sound card.

Re:I've used the software (1)

jimicus (737525) | more than 6 years ago | (#22065220)

That extra 1% is the part that's difficult to get.

If you'd said "I got 98.995% accuracy with the included microphone", I'd be more interested.

When the software's history involves jail terms... (5, Interesting)

Futurepower(R) (558542) | more than 6 years ago | (#22065016)

This software's history includes jail terms. Speech recognition has gotten an extremely bad reputation for being worthless garbage, maybe because it is worthless garbage.

Even a 0.5 percent recognition failure rate is enough to make speech recognition software worse than worthless. The reason is that speech recognition software never makes a spelling mistake. Instead, the mistakes are often extremely difficult to recognize, and sometimes change the meaning in subtle ways. That's partly because when the software is confused it tries to select something that is grammatically plausible.
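To illustrate why the mistakes are real words rather than misspellings, here is a toy sketch of context-based word choice. It is not a description of Dragon's engine; the bigram counts are invented, and `BIGRAM_COUNTS` and `pick_word` are hypothetical names.

```python
# Toy bigram counts, invented for illustration; a real recognizer would
# estimate statistics like these from a large body of text.
BIGRAM_COUNTS = {
    ("works", "fine"): 50,
    ("wonks", "fine"): 0,
    ("to", "type"): 40,
    ("two", "type"): 1,
    ("too", "type"): 1,
}


def pick_word(candidates, next_word):
    """Choose the sound-alike candidate that most often precedes next_word.

    Every candidate is a dictionary word, so the result is always spelled
    correctly even when it is the wrong word.
    """
    return max(candidates, key=lambda w: BIGRAM_COUNTS.get((w, next_word), 0))
```

When the counts favor the wrong candidate, the output still reads as plausible text, which is what makes this kind of error hard to spot when proofreading.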

The result is that it has become difficult to sell speech recognition software. A high enough percentage of people in the U.S. culture know that it isn't actually useful. The original owners of Dragon NaturallySpeaking sold the product to a company that sold it to the company that became Nuance, maybe because they felt the product was damaging the credibility of their trademarks.

Here is a quote from the ComputerWorld story [computerworld.com] linked in the earlier Slashdot story, Is Speech Recognition Finally 'Good Enough'? [slashdot.org] :

"In 1993 two executives from Kurzweil Applied Intelligence (which pioneered SR for the medical market) went to prison for faking sales. That firm was sold in 1997 to a Belgian SR firm, Lernout and Hauspie (L&H), which was reporting phenomenal sales growth at the time. Dragon Systems, which originated DNS that year, was reporting only anemic growth, and L&H had no trouble acquiring Dragon Systems in early 2000 in a stock deal. Within a year a series of accounting frauds came to light and L&H collapsed into bankruptcy. Its SR technology was sold in late 2001 to ScanSoft Inc., which kept the DNS line going. (It was then at Version 6.0.) ScanSoft later acquired Nuance and adopted its name.

"Thereafter, "It was with the launch of Version 8.0 (in November 2004) that the market became reinvigorated and took off," said Chris Strammiello, director of product management at Nuance. "We crossed an invisible line with Version 8.0, where the software actually delivered on its promises and offered real utility for the users. Sales have been growing at a rate of 30% yearly since then, except that we expect it to do better than 30% this year."

Read that again: "... the software actually delivered on its promises and offered real utility..." I called Nuance and was told that version 8 did not have a new recognition engine, but only had improvements in the user interface. A friend who owns and tested version 8 told me he could see no difference in accuracy between that and version 7.

So, in my opinion, Nuance has done common deceitful things that are called "Marketing":

1) Bring out new versions. Previously, when there has been a "new version" of Dragon NaturallySpeaking, I have called Nuance technical support and asked if there is a new recognition engine. I didn't call for version 9, but for the last two versions they have said no. So, nothing is changed; the software is still worse than useless to me, in spite of the fact that they advertise that the software is now more accurate.

How is it possible that the software is more accurate, if the recognition engine did not change? Maybe it isn't true. Or maybe the company improved the guesses the software makes when the software really has no clue what the user said. As I mentioned, those guesses have become so sophisticated that you can become confused about what you actually said, and you have to spend time re-creating your ideas. If you are saying simple things about a simple subject, this is not as much of a problem as when you are writing about contract negotiations, for example.

In the words of a Slashdot reader: "The opinions expressed here may be those of my speech recognition software."

2) Take advantage of user ignorance. There is no mention whatsoever of the problems.

3) Get a marketing meany to write stories that make the product sound interesting, while carefully avoiding the truth, in this case, that apparently nothing important has changed.

When the background of software involves jail terms, be very careful.

Here's another quote from the article linked in the earlier Slashdot story:

"Today, "A person can get 95% accuracy right out of the box, and enrollment is optional and only takes five minutes," said Howard Parks, president of Microref Systems Inc., a firm in Highland Park, Ill., that sells SR systems and trains users."

Notice that in that paragraph there is no mention of the fact that 5% errors make the speech recognition software worthless. Later the article talks about a 2% error rate, still avoiding the issue that even 0.5% errors make the software slower than typing, while possibly introducing embarrassing errors.

Did he admit fraud? Read the statement quoted above from Chris Strammiello, director of product management at Nuance, again:

"We crossed an invisible line with Version 8.0, where the software actually delivered on its promises and offered real utility for the users."

Nuance owned the software when it was sold as version 7, and before. So, Mr. Strammiello is apparently saying that his company didn't deliver on its promises before, and was knowingly involved in fraud, but, don't worry, now the company is honest.

Wow. Sometimes people lie so much that they have no idea how what they say sounds to others.

At least, that's how it seems to me.

Marketing is meant to be methods whereby a company makes healthy connections. However, most marketing people seem to think that marketing means lying. And, when they sink the company, they just get a job somewhere else.

Talk like a robot: Perhaps the biggest problem with speech recognition software is that it slowly changes the way you speak. I once lived in England for 5 months and when I came back to the U.S. people said I had an English accent. I was completely unaware of any change. But sounding like an Englishman is not bad compared to the changes that occur when you use speech recognition software regularly. You begin speaking in a slightly mechanical and less human-sounding manner to try to get the software to make fewer errors. Once you have that habit, it doesn't just go away when you are talking to other people rather than to software. If you eventually notice that people have stopped talking to you, go back to typing.

Re:Whatever became of this technology? (3, Interesting)

forkazoo (138186) | more than 6 years ago | (#22064512)

The training and accuracy seem like things that can be overcome, but I would really like to see a solution for things like punctuation and function keys, things that don't naturally come with speaking. Instead of having to say "delete that" or " delete" it would be nice to just have a button that I can hold down when saying things I want interpreted as commands.


Yes, and to follow along the same line of thought, nobody has ever come out with anything like a speech recogniser designed for programming. Personally, I always figured that a good speech recognition system for both text and commands would need to make use of sounds that don't occur as text. So, you could do something like a special double-whistle to enter command mode, or honk like a goose for undo. Likewise, you could use gibberish words as commands instead of "delete that."

Obviously, it violates the principle that all computers you can talk to should work like Star Trek. But, it seems that just like a command line interface, a spoken interface could be fantastically useful if only somebody would decide that the operator will need some instruction in a few special arcane incantations.

Then, all we'll need is an extension to C so that function prototypes include a way to express the pronunciation of a function name, so a spoken interface IDE could use something like intellisense to parse the API I am using and away we go.
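A rough sketch of the gibberish-word idea, assuming the recognizer hands over a list of already-recognized tokens. The reserved words (`zorp`, `blarg`, `fnord`) and the action names are made up for illustration.

```python
# Hypothetical reserved tokens: nonsense words unlikely to occur in dictation.
ENTER_COMMAND = "zorp"
COMMANDS = {"blarg": "undo", "fnord": "delete_last_word"}


def route(tokens):
    """Split a token stream into dictated text and commands.

    ENTER_COMMAND marks the next token as a command instead of text.
    Returns a (text_words, actions) pair.
    """
    text, actions = [], []
    expect_command = False
    for tok in tokens:
        if expect_command:
            # Unrecognized words after the marker are reported, not typed.
            actions.append(COMMANDS.get(tok, "unknown"))
            expect_command = False
        elif tok == ENTER_COMMAND:
            expect_command = True
        else:
            text.append(tok)
    return text, actions
```

The payoff of this design is that ordinary phrases such as "delete that" can be dictated literally, because only the reserved nonsense word switches the recognizer into command mode.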

Re:Whatever became of this technology? (0)

Anonymous Coward | more than 6 years ago | (#22064696)

So, you could do something like a special double-whistle to enter command mode, or honk like a goose for undo.

It's bad enough in my open plan office with people's cheesy ringtones and 'loud Howard' style phone conversations, without adding a menagerie of poor animal impersonations. And as for the whistling - imagine the scope for sexual harassment lawsuits!

Re:Whatever became of this technology? (1)

Ed Avis (5917) | more than 6 years ago | (#22064960)

Yes, and to follow along the same line of thought, nobody has ever come out with anything like a speech recogniser designed for programming.
Stay tuned for Perl 7.

Already been done.. (3, Funny)

Anonymous Coward | more than 6 years ago | (#22063974)

"Computer... computer... hello computer?"

Grate product (4, Funny)

Library Spoff (582122) | more than 6 years ago | (#22063982)

Am oosing it two type this comment. Didn't knead the fave mins train ming though...

Re:Grate product (0)

Anonymous Coward | more than 6 years ago | (#22064088)

Eye donut lycen ewe. Huma stubby real eSofa King Wee Todd Ted.

But does it run on linux? (1)

js_sebastian (946118) | more than 6 years ago | (#22064002)

I know the answer. No, it doesn't.

I own a copy of Dragon 9 but having to reboot into Windows to use it makes it too much of a hassle. Wine doesn't seem to handle it either.

It actually works quite well, although mileage may vary depending on the sound quality you get from your microphone and soundcard setup.

Re:But does it run on linux? = VMware (1)

jackjeff (955699) | more than 6 years ago | (#22064492)

Have you heard of VMware?

Re:But does it run on linux? = VMware (1)

markdavis (642305) | more than 6 years ago | (#22065132)

In his case, that might be OK.

But for the rest of us- we choose to use Linux because we want to use Linux. For most Linux users, it doesn't make much sense to buy and install MS-Windows and Dragon to use in the free/open Virtualbox or the proprietary/closed VMware. With such a model, you cannot use the speech recognition in the Linux applications.

Re:But does it run on linux? (1)

markdavis (642305) | more than 6 years ago | (#22065100)

Probably not.

But I, personally, know several people that would buy a Linux version of NaturallySpeaking... including myself.

Perhaps the Mac version would be easier to port? Don't know. Best thing to do is send them an email saying you would pay for a Linux version. I did: questions@macspeech.com

Minion, do my bidding! (5, Interesting)

Anonymous Coward | more than 6 years ago | (#22064016)

I'll have to play with Dragon at some point; I just haven't gotten around to it yet. Aside from accuracy errors, the primary issue that bothers me about speech recognition solutions I've tried is the general lack of being able to recognize speech that seems natural to humans but isn't what the system is expecting as input.

This is especially true with over-the-telephone solutions. For example, I am with Rogers Wireless carrier here in Canada, and their automated customer service system prompts you for your phone number. My last 4 digits are 2125, and it is very natural to say "twenty-one, twenty-five" when giving the number to a human being. The speech system, unfortunately, is only sophisticated enough to understand one-digit-at-a-time mode, so you have to suffer through saying "two one two five". Which isn't truly a big deal, but it's frustrating having to learn each system's unique quirks and limits. I suppose the same can be said of any technology.
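For what it's worth, accepting grouped numbers doesn't look hard in principle. Here is a rough sketch, not anything Rogers or a real phone system is known to use: the function name `spoken_to_digits` and the word tables are made up for illustration, and it only handles numbers below 100. It greedily reads "twenty one" as 21, which is a guess, since the same words could also mean 20 followed by 1.

```python
import re

# Hypothetical word tables for illustration only.
ONES = {"zero": 0, "oh": 0, "one": 1, "two": 2, "three": 3, "four": 4,
        "five": 5, "six": 6, "seven": 7, "eight": 8, "nine": 9}
TEENS = {"ten": 10, "eleven": 11, "twelve": 12, "thirteen": 13,
         "fourteen": 14, "fifteen": 15, "sixteen": 16, "seventeen": 17,
         "eighteen": 18, "nineteen": 19}
TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
        "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}


def spoken_to_digits(phrase):
    """Turn a transcript such as 'twenty-one, twenty-five' into '2125'."""
    words = re.findall(r"[a-z]+", phrase.lower())
    digits = []
    i = 0
    while i < len(words):
        word = words[i]
        if word in TENS:
            value = TENS[word]
            # Greedy choice: "twenty" followed by "one" is read as 21.
            nxt = words[i + 1] if i + 1 < len(words) else None
            if nxt in ONES and ONES[nxt] != 0:
                value += ONES[nxt]
                i += 1
            digits.append(str(value))
        elif word in TEENS:
            digits.append(str(TEENS[word]))
        elif word in ONES:
            digits.append(str(ONES[word]))
        else:
            raise ValueError("not a number word: " + word)
        i += 1
    return "".join(digits)
```

Both `spoken_to_digits("twenty-one, twenty-five")` and `spoken_to_digits("two one two five")` give the same digit string, which is all the caller wants.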

Oral dictation (as opposed to fixation) is frustrating at best. Punctuation is a critical item that I can't stand dealing with. Trying to get the goddamn software to insert commas and semi-colons can be difficult enough, let alone wanting to actually insert the word "comma" into a paragraph. Then there's trying to spell out acronyms (aka "aka"), or inserting the contents between and including those parentheses. Until dictation of a document can be done with truly minimal correction and post-editing, and can be spoken at a very comfortable pace, I will stick to a keyboard.

Of course, the most entertaining aspect of watching someone else play with speech recognition is the inevitable habit of sounding completely unnatural while speaking. The monotone voice and sounding like a robot are bad enough, let alone those who think that shouting or talking ree... aaa... llll... lllyy... sloowwwww.... llly is going to help. The funniest I've seen was a woman who seemed to think that talking in cutesy baby-talk would win the system over to her side. :)

I just want a system that responds to commands via a programmable keyword. Only when speech recognition is Star Treky enough to respond to its name will I be happy. My computer will be named Minion.

  • Minion, inform the family I love them.
  • Minion, crawl the web for the highest quality, free pr0n you can find
  • Minion, order me my favourite pizza. Oh, and hack a credit card number from the net to pay for it.
  • Minion, tell some Slashdotters off for me. Make sure it's worthy of +5 funny.

Re:Minion, do my bidding! (2, Insightful)

LordLucless (582312) | more than 6 years ago | (#22064102)

I used Dragon Naturally Speaking for a while ages ago, and you could program it to respond to its name. Or rather, you set up a "start" sound that would activate the listening algorithm. I had mine set to respond to "computer", but "minion" would work just as well.

I stopped using it after I accidentally left it on in training mode one day, when I was teaching it the word "bonza". The pet lorikeet outside my room made such a wide variety of noises, that from that time forth, it thought every word I said was bonza, and I couldn't be bothered retraining it - training time was more than 5 minutes back then.

I was using it more for commands than for dictation, and it was good at that, but there was one major drawback, and that was background noise - especially loud background noise emitted by the computer itself. One of the things I wanted to do was to get the computer to start and stop playing music on command. Unfortunately, once the music was playing, you had to really yell for the computer to differentiate the command from the music.

Re:Minion, do my bidding! (2, Informative)

Narcogen (666692) | more than 6 years ago | (#22064850)

MacOS has had a built-in feature called Speakable Items that does exactly this, and as an option you can have it respond only to things said after a specific key word-- in essence, the machine's name. "Minion" would work fine.

It is not true dictation. Essentially you create a script and give it a name. When your speech is recognized as the name of a corresponding script, the script is executed.

You can even make scripts that require multiple inputs. Some of the built-in ones in the Mac OS 9 days were knock knock jokes.
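To illustrate the idea (this is not Apple's implementation, just a minimal sketch of name-to-script dispatch behind a keyword), something like the following would do. `make_dispatcher` and the sample script names are hypothetical, and the input is assumed to be an already-recognized text transcript.

```python
def make_dispatcher(keyword, scripts):
    """Return a function that runs the script named right after the keyword."""
    keyword = keyword.lower()

    def dispatch(utterance):
        words = utterance.lower().replace(",", " ").split()
        # Ignore anything that does not start with the machine's name.
        if not words or words[0] != keyword:
            return None
        name = " ".join(words[1:])
        script = scripts.get(name)
        # Unknown script names are ignored rather than guessed at.
        return script() if script is not None else None

    return dispatch


# Hypothetical scripts; real ones would launch programs instead.
scripts = {
    "what time is it": lambda: "telling the time",
    "open my mail": lambda: "opening mail",
}
minion = make_dispatcher("Minion", scripts)
```

Saying "Minion, open my mail" runs the matching script, while the same words without the keyword do nothing.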

Re:Minion, do my bidding! (1)

LordLucless (582312) | more than 6 years ago | (#22065176)

That's essentially all software dictation is - it recognizes the pattern of your speech, and executes the corresponding instruction (prints the correct word). The thing that really defines quality software is the accuracy of its comparison algorithm, and the speed of its learning algorithm. But essentially they do the same as you describe, just with a much larger search space.

Re:Minion, do my bidding! (2, Funny)

andrewjhall (773595) | more than 6 years ago | (#22064676)

I think I'd name mine Igor. Then, assuming I can find the right USB widgets, I can shout "Igor! Raise the lightning rod and find me a fresh brain" - at which point my life's final ambition will have been achieved.

That said, the USB iBrainExtractor is probably as much of a technical challenge as producing speech recognition that isn't a pain in the ass.

Re:Minion, do my bidding! (1)

mwvdlee (775178) | more than 6 years ago | (#22065026)

"twenty-one, twenty-five" = 201205.
Why do you expect a computer to get this right when humans don't?

Re:Minion, do my bidding! (0)

Anonymous Coward | more than 6 years ago | (#22066120)

http://docs.info.apple.com/article.html?path=Mac/10.4/en/mh696.html [apple.com]

"Use the Speech Recognition pane of Speech preferences to turn speech recognition on, _set up how to signal your computer_ that you're speaking a command, create commands for applications, and open the Speakable Items folder."

OSX supports named Star Trek commands, built-in.

Posting from a Dragon Naturally Speaking Mac (5, Funny)

Anonymous Coward | more than 6 years ago | (#22064032)

iIt iworks iso iwonderfully iand iintegrates iwell iinto ithe iother iiproducts.

Fanboys are getting awfully silent as of late.. (-1, Troll)

Anonymous Coward | more than 6 years ago | (#22064100)

It's funny how the Apple fanboys aren't saying a word. Must be pretty boring to see the Mac getting a black eye from Microsoft/Intel time after time.. OWNED

Re:Fanboys are getting awfully silent as of late.. (0)

Anonymous Coward | more than 6 years ago | (#22064162)

Yawn. We had it about a decade ago.

Not to knock it.. (1)

rastoboy29 (807168) | more than 6 years ago | (#22064146)

But 99 out of 100 words correct still makes for a pretty lousy experience if you're trying to do anything serious.

Personally, I think so much when I'm writing that typing is quite fast enough.  Of course, I know not everyone is so fortunate.
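To put a number on the 99-out-of-100 point, here is a back-of-the-envelope sketch. It assumes the 99% is a per-word figure and that errors are independent of each other; both are my assumptions, since the summary doesn't say how accuracy was measured.

```python
def expected_errors(word_count, accuracy=0.99):
    """Average number of misrecognized words in a document."""
    return word_count * (1 - accuracy)


def clean_sentence_probability(sentence_length, accuracy=0.99):
    """Chance that every word in a sentence is right, assuming independent errors."""
    return accuracy ** sentence_length
```

Under those assumptions a 500-word email averages about 5 wrong words, and a 20-word sentence comes out clean roughly 82% of the time, so nearly one sentence in five needs a fix.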

according to who? (1)

dwater (72834) | more than 6 years ago | (#22064176)

> The new product is said to reach 99% accuracy after 5 minutes of training.

According to MacSpeech, I suppose?

I'll bet what was said was something 99% different to what MacSpeech thought.

I just saw these guys at macworld (2, Interesting)

Capt'n Hector (650760) | more than 6 years ago | (#22064204)

I was a bit put off by their pricing scheme. It's $50 off the normal price (something like $200) if you buy it at macworld. The only problem is that it's a pre-order, so you can't try before you buy. Also, nobody has reviewed the software, since it doesn't exist yet, so if it turns out to be a stinker you're out $150. And if you don't like the product, their tech support will try and "walk you through" your problem to make it go away. They explicitly said "no refunds". No, thanks.

Practical speech recognition, "House, lights on" (1)

the grace of R'hllor (530051) | more than 6 years ago | (#22064224)

So I've always wanted to rig my house up with voice commands. My guess is I need the following:
  • *Simple* speech recognition. I want it to react to a keyword ("Computer", or "House", or similar sci-fi-ey) and then a few simple commands. Sphinx-2 [sourceforge.net] seems ideal, but I'd need good dictionary files.
  • Ubiquitous microphones (preferably exclusively usable by the speech recognition engine. Setting proper /dev permissions will help). Probably the most difficult/expensive to get right; it needs to work in noisy environments.
  • Machine controllable electronics, sufficiently protected so that . Where those 433MHz remote switches come in I guess. Needs to be code protected, for obvious reasons.
  • Scripts to tie all this together.
Has anyone done this properly/successfully/usefully?
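For the "scripts to tie all this together" part, a minimal sketch of the keyword-plus-simple-command step might look like this. It assumes the recognizer already hands you a text transcript; `parse_command`, the device list and the action list are all hypothetical, and the actual switching hardware is left out entirely.

```python
# Hypothetical vocabulary; adjust to whatever is actually wired up.
DEVICES = {"lights", "heating", "music"}
ACTIONS = {"on", "off"}


def parse_command(transcript, keyword="house"):
    """Parse 'House, lights on' into ('lights', 'on'); return None otherwise."""
    words = transcript.lower().replace(",", " ").split()
    # Require exactly: keyword, device, action. Anything else is ignored.
    if len(words) != 3 or words[0] != keyword:
        return None
    _, device, action = words
    if device in DEVICES and action in ACTIONS:
        return (device, action)
    return None
```

Being strict about the three-word shape is deliberate: ordinary conversation that happens to contain "house" should not match.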

Re:Practical speech recognition, "House, lights on (0)

Anonymous Coward | more than 6 years ago | (#22064310)

Surely you mean...

'Illuminate'
'Deluminate'

Re:Practical speech recognition, "House, lights on (2, Interesting)

amRadioHed (463061) | more than 6 years ago | (#22064348)

Back in the late 90's, using only AppleScript and Apple's built-in speech recognition, I was able to voice-automate my music library. I don't remember all the details, but I could start and stop the music and select what artist I wanted to hear. It was pretty neat being able to say "Computer, play Nirvana" and getting my music all from the comfort of my bed.

Re:Practical speech recognition, "House, lights on (1)

forkazoo (138186) | more than 6 years ago | (#22064598)

*Simple* speech recognition. I want it to react to a keyword ("Computer", or "House", or similar sci-fi-ey) and then a few simple commands. Sphinx-2 seems ideal, but I'd need good dictionary files.


Be careful what you use as the trigger, or else you won't be able to use the words "House" or "Computer" in any conversation while at home without the house thinking you are trying to command it, and starting the dishwasher or something. I suppose you could always name your house something sci-fi-ish, or fantasy-ish that would never come up in conversations, like "Malthikar." For extra points, establish some sort of visual avatar piped to your TV or something so you can see him while you talk to him.

As for implementation, Mac OS X comes with some sample code for dictionary-based untrained speech recognition. Should do exactly what you want. Since you can give a list of all possible words (the various valid commands) it works better than free-form recognition for general text input. And, you don't have to train it, so anybody who knows the right things to say could work your house. That just leaves having your app do the commands once they are recognized. I'm completely unfamiliar with that end of things, but I know there are home automation doodads which presumably shouldn't be that hard to access from a program.
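A toy illustration of why a closed command list helps (this is not the Mac OS X sample code, which I'm not reproducing here): if the recognizer hands back slightly wrong text, you can snap it to the nearest valid command. The sketch uses Python's standard `difflib`; the command list, the `snap_to_command` name and the 0.6 cutoff are assumptions for the example.

```python
import difflib

# Hypothetical closed vocabulary of valid commands.
COMMANDS = ["lights on", "lights off", "start dishwasher",
            "play music", "stop music"]


def snap_to_command(heard, commands=COMMANDS, cutoff=0.6):
    """Return the valid command closest to what was heard, or None."""
    matches = difflib.get_close_matches(heard.lower(), commands,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

A slightly garbled "lites on" still lands on "lights on", while something unrelated falls below the cutoff and is dropped instead of triggering the dishwasher.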

Tea. Earl Grey. Hot. (0)

Anonymous Coward | more than 6 years ago | (#22064868)

should cover most needs ;-)

Ugh... why is MacSpeech doing this? (0)

Anonymous Coward | more than 6 years ago | (#22064338)

MacSpeech is scum that has been selling absolute shit for years.

Their iListen product was absolute unusable garbage, but that didn't stop them from marketing it as the Mac's equivalent of Dragon. Complete with "30 day money back guarantee" that meant you could get your money back only if you tried it full time for 30 days, and you convinced them that you had tried it full time for 30 days, and you had bought the special (shitty) microphone the software "requires" (I think they actually just had a marketing agreement with the manufacturer, because this TELEX headset was very low quality AND expensive), and they decided to give you a return authorization number, and the moon was in the house of Jupiter, etc.

Basically, nobody got their money back. Because they're liars and thieves who are used to selling garbage.

Truly unfortunate that this slime got the contract.

Are cops using this now in Jail? (0)

Anonymous Coward | more than 6 years ago | (#22064382)

Because I helped an illiterate person write a letter in jail and spelled everything out for him verbally... then I thought it was like he led me right through a training program for a speech recognition program.

?

Naturally Speaking (1)

iMac Were (911261) | more than 6 years ago | (#22064412)

Shouldn't it be 'Naturally Lisping"?

MST3K (1)

Ethanol-fueled (1125189) | more than 6 years ago | (#22064440)

The writers must have been using that software when they wrote this [ytmnd.com] song!

you fail 1t (-1, Redundant)

Anonymous Coward | more than 6 years ago | (#22064460)

Dragon is a NIGHTMARE. (3, Informative)

Caspian (99221) | more than 6 years ago | (#22064614)

I've worked with Nuance's server product in the Dragon NaturallySpeaking line as a developer. Their API is confusing, their speech recognition SUCKS, and their software bugs out in bizarre ways. It's also slow as a dog, and advanced functionality (like recognizing from wav files, as opposed to from a live audio stream) is so poorly implemented as to seem bolted on.

And the worst part? Nuance has a virtual monopoly in realistically priced (read: "in a budget that a normal small-to-medium-sized business can afford") general-purpose speech recognition systems. If I recall correctly, they bought out Lernout and Hauspie's speech recognition products and IBM's old consumer-level speech-recognition stuff. So you can't take your business elsewhere; there is no "elsewhere".

I loathe those guys.

Accessibility (3, Insightful)

Selanit (192811) | more than 6 years ago | (#22064900)

Five minutes training for most people, but not everyone. My boss uses Dragon NaturallySpeaking, and it took him nearly two weeks to complete the five-minute training due to some complications.

Namely, he's blind. He cannot read the training phrases off the screen, because he can't see them. Instead he had to have a screen reader (JAWS in this case) read the phrases aloud to him so that he can repeat them back. But of course, Dragon was not expecting to hear audio input from anything other than the user, so that confused things. There were problems even using a headset. And since he can't actually use the program at all without having the screen reader running, it was pretty awful trying to get the training done. I'm not even sure how he finally managed to do it - I suspect he probably got a sighted friend to help. Thankfully the training files can be copied from one computer to another so you don't need to retrain it on each different installation.

Once the training was finally finished, it worked well. He has poor fine motor control as a result of leukemia treatments - he can type, but only slowly and with a high error rate. His speech is slightly slurred as well, which reduces the accuracy of the transcription. Even so, the Dragon transcriptions are definitely better than manual typing. It's helped him a lot.

I just wish that the Dragon programmers would come up with a more easily accessible training routine. There aren't a whole lot of users with the same disabilities as my boss, but for the few like him having good, well-trained dictation software is vital. With it, he can control his computer reasonably well, if rather more slowly than a sighted person with normal motor control. Without it, using the computer is basically impractical. When he can't use Dragon, sending a single rather short email can take upwards of an hour.

Re:Accessibility (1)

jfim (1167051) | more than 6 years ago | (#22065670)

Have you tried the speech recognition in Windows Vista? I haven't tried it with the screenreader at the same time, but it seemed to work semi-decently and I'm curious as to how people with actual disabilities think of it. I was quite surprised by how well it seemed to be integrated with the OS-bundled applications.

Compare like with like! (0)

Anonymous Coward | more than 6 years ago | (#22064948)

A lot of the comments here say something along the lines of "I tried it years ago and it was rubbish" - yeah well things have moved on!

It's like saying all cars still have wood framing and carburettors like they used to - a contemporary vehicle is steel and has fuel injection - oh and an iPod dock...

Dragon 9 whilst still not perfect is really very good - and MacSpeech will build their product on that technology.

Oh and by the way the Google speech recognition is by Nuance too and ViaVoice whilst distributed by Nuance is an old separate IBM product....

But will it run on Linux.? (1)

tiluki (74844) | more than 6 years ago | (#22065096)

Seriously though, is it just me or is speech recognition support still sadly lacking under all current distros?

Based on the fact there are no leading edge projects out there. I mean, apart from IBM's ViaVoice a few years back (and now no more), and the CMU Sphinx project http://cmusphinx.sourceforge.net/html/cmusphinx.php [sourceforge.net] is there any other Linux/FOSS solution?

Can it write software? (1)

tgd (2822) | more than 6 years ago | (#22065260)

Understanding 99% of what I say correctly after 5 minutes is a lot better than the developers do...

First announced eleven years ago (1, Funny)

Anonymous Coward | more than 6 years ago | (#22065262)

This was first announced eleven years ago. It's about time. Maybe Pogue will stop using Windows now?

I actually use it... (1)

lanzek (948620) | more than 6 years ago | (#22065468)

And it's amazing. I find that it's much more natural and fluid for language to go from thought to speech than from thought to typing. Also, the accuracy is better than typing, (including spelling) and it comes with a headset that is more than adequate. Give it a try folks, and forget about carpal tunnel forever...

Re:I actually use it... (1)

jfim (1167051) | more than 6 years ago | (#22065620)

Give it a try folks, and forget about carpal tunnel forever...
Beware of straining your voice though. Also, while speech recognition is actually pretty good for natural text, it is pretty awful for programming due to the tediousness of entering punctuation and variable names, which aren't dictionary words most of the time.

It's about accessibility... (2, Insightful)

Tibor the Hun (143056) | more than 6 years ago | (#22065548)

This is fantastic news for those who need extra accessibility features.
It may be fine for you or me to hit any key, but there are many other folks with various disabilities for whom such a task is not an easy one. So it may make more sense for them to use their voice and move on.

If any of us were to lose fingers or hands in an accident, I bet we'd all be using something like Dragon to continue our work, rather than try to become a tap dancer.

And let's not forget about accessibility in the workplace. This is great news for Mac shops, as now there is one less reason for having to support a rogue Windows machine...

Re:It's about accessibility... (1)

timftbf (48204) | more than 6 years ago | (#22065802)

Thank you. It winds me up seeing the product getting a slamming because it's "only" 99% accurate, or because "it sucks - so much better to type". While they might be marketing it at people who are too lazy to type, or who think it's cool to talk to their computer, it's an absolute boon for people who really *can't* type.

My wife has been through bouts of severe RSI, and while a lot of the time she can now manage with a specialist keyboard, Dragon kept her able to work and to communicate through a long bad stretch after the initial onset, and is an ongoing help. Spending 5% of the time typing to go back and make manual corrections still means 95% less time spent painfully using the keyboard.

Now I have a real chance of getting her off the PC and onto a Mac - no more Windows support for me! :)

Urgh!! Wrong PLATFORM!!!! (3, Interesting)

wonkavader (605434) | more than 6 years ago | (#22065602)

It's fine to port this to the Mac. Fine. Good. Whoopie.

But they are so DROPPING THE BALL. They have the best voice-rec platform. (You can think it's not good enough, but it's still the best.) What they need is to port it to Linux. Duh! Wake UP!

No, I'm not just saying the usual "Does it run on Linux?" bit. Linux is the now (and coming even more) obvious OS for small devices. When you want to talk to ANY device in your home or car, or your cell phone or PDA, you'll be talking to LINUX. THAT'S where we need a great voice-rec system. We need it ported to Linux and opened up with an API. This will catapult this annoying desktop app into a present-on-almost-everything type of software in a matter of a couple of years -- as low-power devices provide enough oomph to do what the heavy machines of a few years ago did.

It's a good thing, too. (3, Informative)

benmhall (9092) | more than 6 years ago | (#22065634)

My wife needed voice dictation software a year or two ago. She had been a Linux user. I gave her my PowerBook and bought iListen for her. It was terrible. And it was a resource hog. It used the Philips engine and, even with extensive training, was the pits. We even tried several high-quality mics to no avail.

She went from my G4/1.5GHz/1.25GB RAM PowerBook running iListen to Dragon NaturallySpeaking 8 on an IBM ThinkPad T23. (P3 1GHz, 768MB RAM, WinXP.) The difference was night and day. Not only did Dragon run much faster on the lowly P3, but the quality of speech recognition was _much_ better. As a result of this, she's now back to being a Windows user with Dragon.

At least it looks like our iListen purchase won't be a complete waste, as we can use it to upgrade to NaturallySpeaking for Mac. I'm glad that MacSpeech has killed iListen. It needed it. It was an embarrassment compared to Dragon.

Speech recognition has been a big hole in the Mac's software line-up. It looks like that is finally coming to an end. Now if only someone would release something that works for Linux.* I know that we'd have paid $200 for something approaching Dragon 8's capabilities.

----
*Yes, I know about IBM ViaVoice. Good luck getting that to work on any recent distribution. I also know about Sphinx. Unfortunately, it seems to be a perpetual research tool rather than an end-user program.

Correction: Dragon develops for Mac Again (1)

JoeCommodore (567479) | more than 6 years ago | (#22065900)

Dragon had a Mac product once before [thefreelibrary.com] - Dragon Power Secretary. It was tied to specific apps. Didn't get much updating or new versions after the initial release and died an agonizing death.

How about a Linux version? (1)

Michael Ross (599789) | more than 6 years ago | (#22065936)

Here's hoping they support Linux next.

Haven't tried the recognition, but... (1)

Pedrito (94783) | more than 6 years ago | (#22066124)

I have used their speech synthesis products and they're quite impressive. I used one of the voices to dictate a textbook into an MP3 file so that I could then do a book-on-tape type thing to play my textbook in my car. The pronunciation was generally pretty good. I had to define the pronunciation of a few words here and there (it had problems with some of the less common geek words, like "macromolecular"). But after giving it the proper pronunciations, it was quite excellent. The voice sounded natural a good portion of the time.