The Future: Voice Command Computing



barryfx
March 1st, 2011, 03:47 PM
The time is right for a new paradigm in computing. Speech recognition technology has advanced to the point where we should be talking to our computers and they should be talking back. The technology is available for at least a fixed command structure computing environment. I would like to welcome a discussion on how it should work and how it would best be implemented technically.

My concept is to have the computer in a room with a large screen display hanging on the wall. An omni-directional microphone will receive spoken commands. The computer will respond or ask questions with speech. The computer will display requested information if appropriate. The computer has a fixed command structure. The option exists to use a keyboard for direct input like we do today. The system will start in voice command mode but for keyboard input, a desktop will be displayed. Initially single room operation is envisioned.

Some possible applications:

Computer Options
- Put computer to sleep
- Shut down computer
- Volume control

Time
- Local or city of choice

Weather
- Current temperature
- Local weather or another city
- Display radar, etc.

Traffic
- Show traffic map

News Headlines
- Reads headlines in selected category
- Option to read story behind headline

Email
- Read emails
- Send emails

Instant Messenger (Voice Based)
- Send/Receive

Web
- Go to web site by spoken name
- Navigate the web via voice
- Fill in forms via voice
- Select options via voice

RSS Feeds
- Reads items, then optionally content

Read Documents
- TBD

Take dictation for a document

How do you think it should work and how should it be implemented (hardware, OS, desktop, application level)? NOTE: I'm not suggesting artificial intelligence, which, other than Watson, isn't quite there. I'm talking about a fixed command structure user interface that I think could be built today.
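
To make the "fixed command structure" idea concrete, here is a minimal sketch of what the application level might look like. It assumes a speech recognizer has already turned the audio into text; the wake word "computer", the phrases, and the handler names are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict

WAKE_WORD = "computer"  # assumed wake word, not part of the original proposal


def sleep_computer() -> str:
    return "Going to sleep."


def show_traffic() -> str:
    return "Displaying traffic map."


def read_headlines() -> str:
    return "Reading news headlines."


# The fixed command structure: each exact phrase maps to one handler.
COMMANDS: Dict[str, Callable[[], str]] = {
    "put computer to sleep": sleep_computer,
    "show traffic map": show_traffic,
    "read headlines": read_headlines,
}


def normalize(utterance: str) -> str:
    """Lowercase and replace punctuation with spaces, then collapse whitespace."""
    cleaned = "".join(
        ch if ch.isalnum() or ch.isspace() else " " for ch in utterance.lower()
    )
    return " ".join(cleaned.split())


def handle(utterance: str) -> str:
    """Return the spoken response, or an empty string if not addressed."""
    words = normalize(utterance)
    if not words.startswith(WAKE_WORD + " "):
        return ""  # speech not addressed to the computer is ignored
    phrase = words[len(WAKE_WORD) + 1:]
    handler = COMMANDS.get(phrase)
    if handler is None:
        return "Sorry, I do not know that command."
    return handler()
```

For example, `handle("Computer, show traffic map.")` would call the traffic handler, while the same words without the wake word would be ignored. Exact-phrase lookup is the simplest possible design; a real system would probably need some tolerance for recognition errors.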

aeiah
March 1st, 2011, 04:12 PM
This will be useless until a domestic computer can understand natural language and context. Watson is a step in that direction.

Even with voice commands, it's still easier to do some commands with keys and whatnot, especially if we're talking about doing things outside of web browsing and playing media.

having a computer recognise key words will only get you as far as superficial control - the kind of things you can do with a remote control, or a mouse.

Bazon
March 1st, 2011, 04:43 PM
Oh, Bill Gates was talking about this all the time in the nineties. According to him, it should have been here long ago.

I see no use in it. Clicking is faster, and I don't like talking to my computer, which won't understand me unless I adapt my language exactly to its needs.

Swagman
March 1st, 2011, 05:27 PM
So it's 02:00 and everyone except you is asleep. Are you going to whisper to the computer, or will you just arrogantly wake everyone up by talking to it?

MaxIBoy
March 1st, 2011, 05:53 PM
You are at your desk, working at your computer. A friend walks in who you haven't seen for two years. You shake his hand, tell him to sit down, and then say, "So? Tell me everything that's happened in two years!" Your computer starts babbling constantly in its robotic voice, reading off all the system logs it has from within the past two years, then locks up altogether and you have to switch to a virtual terminal and kill the process you've spawned.

I do agree that speech-to-text would be great for taking dictation (because speech runs at roughly 150 words per minute, faster than most people type), as long as you had a way of handling formatting. And in the future, we may use Skype instead of IM and Mumble instead of IRC and computerized voicemail instead of email. But as a primary way to use all software? Voice is a really, really bad user interface. The only reason it got popular on cell phones is that the old keypad-based menus were even worse.

RiceMonster
March 1st, 2011, 07:19 PM
I'm picturing sitting in an office where everyone is talking to their computers. Now, imagine not only the interference, but the distraction.

KiwiNZ
March 1st, 2011, 07:31 PM
Set up hidden speakers all around the office, sit back with a switch box and microphone, and let the chaos begin :P

MaxIBoy
March 1st, 2011, 07:41 PM
Actually, User Friendly did a series on this. March 16 through March 24, 2001.
First one: http://ars.userfriendly.org/cartoons/?id=20010316
http://www.userfriendly.org/cartoons/archives/01mar/uf002860.gif
http://www.userfriendly.org/cartoons/archives/01mar/uf002862.gif

handy
March 1st, 2011, 10:15 PM
I had a look at voice command computing years ago. It is a pain in the neck. Far slower than using your mouse & keyboard.

I can see benefits for people that are using X10 or similar to control their house so they could turn lights on/off, open/close blinds & other simple things via voice command.

Telling your entertainment system what to do would only be useful for people who keep losing their remote control devices.

Irihapeti
March 1st, 2011, 10:29 PM
Now how long is it going to take a toddler to figure out how to have the whole place utterly under his/her control?

And as far as a toddler is concerned, chaos is a perfectly good substitute.

(Oh right, they've got that figured already...)

oldos2er
March 2nd, 2011, 02:11 AM
Had that in 1996 with OS/2 Warp 4 VoiceType. Worked pretty well too.

Austin25
March 2nd, 2011, 04:18 AM
Yeah, I'm going to stick with my interface of conglomerated desktop environment and command line. :P

aschwerin.moses
March 2nd, 2011, 06:54 AM
Well, what's wrong with Voice Computing? I don't understand why so many are against it.

Okay, maybe not for every use. Keyboard and mouse are here to stay for a long, really long time, and they should be. However, I don't see any harm in having voice commands as well.

Imagine Star Trek:
- You come back from work and, while freshening up, you just say "Computer, play some jazz music"
- You settle on your bed/sofa to watch a movie, you say "Computer, play the movie Inception"
- Your friends come over, you say "Computer, slideshow the pictures from the album Singapore Trip"
- You would like to know something about Alfred Hitchcock, you say "Computer, display the details of Alfred Hitchcock" ... it displays the document from Wikipedia
- Blah blah blah...

What's wrong with this? Maybe integrate this feature into a media center application which can take voice, remote and keyboard/mouse commands. What's wrong with this approach?

piquat
March 2nd, 2011, 09:36 AM
Voice commands? No.

I do see a day where we might all put on a few sensors and just control the machine through bio-feedback.

Nobody wants to hear me talking to my computer.

Jimmey
March 2nd, 2011, 12:46 PM
Well, what's wrong with Voice Computing? I don't understand why so many are against it.

I agree. Many of the objections here are based on scenarios where it really shouldn't be applied anyway or limitations in current technology that we may be close to overcoming.

I think that the main hurdle is the point aeiah raised: a level of artificial intelligence would be crucial to making this technology successful.

If we think about the computer as a human, and then address every area where the computer functions less adequately than one, Voice Computing will soon enough become an attractive, viable option.



Now how long is it going to take a toddler to figure out how to have the whole place utterly under his/her control?

Think of it this way: you are placing some of the functions of your house in the hands of SOMETHING else. If it were a butler, how would the butler know to ignore everything the toddler expresses toward these functions? Now, how can we make the computer "think" that way too?


You are at your desk, working at your computer. A friend walks in who you haven't seen for two years. You shake his hand, tell him to sit down, and then say, "So? Tell me everything that's happened in two years!" Your computer starts babbling constantly in its robotic voice, reading off all the system logs it has from within the past two years, then locks up altogether and you have to switch to a virtual terminal and kill the process you've spawned.

You are placing some of the functions of your computer in the hands of SOMETHING else. If it were a secretary, how would the secretary know when you've stopped addressing them, and when you're then addressing your friend? Now, how can we make the computer "think" that way too?

Everybody wants a computer like Jarvis in Iron Man!

slackthumbz
March 2nd, 2011, 01:16 PM
I can open a terminal and type a command faster than I can speak it.

I have keyboard shortcuts for all of my favourite programs, all of which are faster than saying their names aloud.

I also tend to mutter quietly to myself when coding, could be problematic if the computer is trying to interpret my self-absorbed ramblings as commands. On top of that I listen to very loud music a lot of the time. Good luck interpreting voice commands over the sound of some heavy digital hardcore.
/thread.

Jimmey
March 2nd, 2011, 01:38 PM
I can open a terminal and type a command faster than I can speak it.

Can you do that while you're washing up?



I have keyboard shortcuts for all of my favourite programs, all of which are faster than saying their names aloud.

I think you're thinking more of replacing the interface to the computer with the computer operating as-is. Voice commands are going to be useful when we have AI in place to interpret commands like this:

Computer, what bus will take me into town in the shortest time, and what time is the next bus due?


I also tend to mutter quietly to myself when coding, could be problematic if the computer is trying to interpret my self-absorbed ramblings as commands.

Again, think about AI in this situation. If you had somebody sat there typing everything you said, and you started mumbling, they might automatically know through the tone of your voice, through the direction of your head, through your facial expressions etc that what you're saying doesn't need recording. If the computer doesn't interpret these signals like a human would, voice commands will be a nightmare. But if it does, they're a benefit.

slackthumbz
March 2nd, 2011, 01:45 PM
When someone comes up with an AI that can understand tone and context, we can talk more, but until then this is pie in the sky.

Blutkoete
March 2nd, 2011, 01:55 PM
Voice Command Computing will come for things that only need short commands. In all other cases, talking is slower than typing or clicking.

"Coffee please" (or "Earl Grey, hot") might turn on the coffee machine, but for a direct computer experience just browse the web for ten minutes and say everything you do in the browser before doing it and measure the time.

What most people underestimate: speaking is comfortable, but speaking is slow. And it's still impossible for a machine to distinguish between your husband saying "And then he said 'Coffee please' as if I'm his coffee maker!" and your husband ordering the machine to make coffee via a simple "Coffee please."

Even in Star Trek they typed things they wanted to be done fast. There is a big difference between speech recognition and what-I-want-you-to-do recognition.

honeybear
June 13th, 2011, 03:15 AM
Even the new consoles have that now. Have you seen the latest presentations on TV?

It's really amazing what new consoles can do nowadays.

Also, btw, Windows 7 has that too. Voice recognition does not work well in general, but with a good mic and a simple set of voice commands, at least that works.


Why does one always have to wait years for advanced developments on Linux? I wish we had less beautiful desktops but more applications for office and productivity. Voice command is indeed the real future.

forrestcupp
June 13th, 2011, 03:26 AM
What about the people with throat cancer? Nobody ever thinks about them. :)

Why are you guys talking about this like it's the future? There are already a lot of things that do this. My 360 and Kinect are made to do this. Kinect is pretty good at taking voice commands and gestures.

Also, my Android phone does it, too. I can give voice commands and have it transcribe my dictations to text pretty well.

The future is here, folks! ;)

honeybear
June 13th, 2011, 03:34 AM
Check this video:

That's the future made reality, and working: https://www.youtube.com/watch?v=M38mqXbAuLE&feature=player_embedded

manzdagratiano
June 13th, 2011, 03:52 AM
Maybe I am really naive, but isn't voice computing already a possibility today? On my Android, I can hit `voice commands' and it asks me to `Say a command'. I can tell it to `Call <X>' or `Play <X>' and it does that. Extending this to other commands does not seem that hard... at least in principle. And then you can polish up the welcome to make the machine seem more `human'.
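
A rough sketch of how such `Call <X>` / `Play <X>` templates could be extended, assuming the recognizer already returns plain text. The "open" template and the `parse` function are hypothetical additions for illustration, not how Android actually implements it.

```python
import re
from typing import Dict, Optional, Pattern, Tuple

# Each template is "<verb> <X>", where <X> is captured as free text.
# "open" is an invented example of extending the scheme to another command.
TEMPLATES: Dict[str, Pattern[str]] = {
    "call": re.compile(r"^call\s+(?P<x>.+)$", re.IGNORECASE),
    "play": re.compile(r"^play\s+(?P<x>.+)$", re.IGNORECASE),
    "open": re.compile(r"^open\s+(?P<x>.+)$", re.IGNORECASE),
}


def parse(utterance: str) -> Optional[Tuple[str, str]]:
    """Return (action, argument) for a recognized template, else None."""
    text = utterance.strip()
    for action, pattern in TEMPLATES.items():
        match = pattern.match(text)
        if match:
            return action, match.group("x")
    return None
```

Adding a command is then a matter of adding one template and one piece of code that acts on the `(action, argument)` pair. The hard part, as others in this thread point out, is everything that does not fit a template.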

desktorp
June 13th, 2011, 03:56 AM
Having a computer that recognizes simple commands is great, but in Star Trek (heeere we go, right?) you could basically have an intellectual conversation with a computer. I think the title of this should be more like "The Future: Conversational Computers" ..

honeybear
June 13th, 2011, 04:10 AM
There is a good working Howto using sphinx2 for basic commands:

http://knoppmyth.net/phpBB2/viewtopic.php?p=110384&sid=0d0495792032e37715560a9269f12c06

It works with Openbox or Fluxbox too (not only GNOME)

--
Link for simple linux voice control : http://ubuntuforums.org/showthread.php?t=1781084

Bandit
June 13th, 2011, 06:07 AM
Voice commands are already reasonable and have been around for a while. Even my older Mac, which I have already pushed into the closet, had voice command software that handled many tasks like opening websites, printing, playing music and so on. You have to spend a good afternoon setting it up, but it worked great when you did. I could say "Mac.. Open VLC, Play Rammstein." It would open VLC and play all my music by the artist Rammstein. Which may seem silly, but it's really nice when your hands are full or you're too darn lazy to get off the sofa or out of bed.

honeybear
June 13th, 2011, 08:27 AM
Perlbox could do it.

So we can too, with a simple installation using sphinx2.

I would like to activate/deactivate the speech-listening server via IrDA, use a high-quality USB microphone, and give it a few simple commands to run in my Openbox session... I am working on it and would need help analysing the WAV files:
http://ubuntuforums.org/showthread.php?p=10933809#post10933809

Alternatively, how to install pocketsphinx:
http://pkgs.org/ubuntu-10.04/ubuntu-universe-i386/pocketsphinx-utils_0.5.1+dfsg1-0ubuntu1_i386.deb.html