Sunday, April 07, 2013

The path from recognizing spoken digits to virtual assistants killing astronauts

Do you recognize this?

It is HAL 9000, a fictional character in Arthur C. Clarke’s Space Odyssey series. HAL is a computer capable of speech, speech recognition, facial recognition and natural language processing.

In the first film of the series, HAL controls a spaceship whose crew decide to shut it down mid-flight after concluding that it is malfunctioning. Noticing the crew's plan, HAL kills most of them, leaving a single astronaut, Dave, alive outside the ship.

Here you see Dave asking HAL to “open the pod bay doors” to let him back into the ship. This whole introduction was really a lead-up to that dialogue, because today Siri, our most popular virtual assistant to date, might answer in a similar way if you insist enough.

Brief history of speech recognition

Clarke thought we would have such technology in the 1990s, though all we actually had by then was lousy dictation software. I was impressed by Clarke's vision, since the film dates from 1968, but became less impressed after reading a bit more about the history of speech recognition. People started working on the topic in the early 1950s, although at that point it was only about recognizing digits spoken by a single voice. In 1962 IBM demonstrated a machine that could recognize 16 words, and in 1976 Carnegie Mellon's Harpy system reached 1011 words. There was progress in the 1980s too, but the systems still required the user to talk slowly and separate the words. There were no groundbreaking changes in the 2000s until Google came up with its Google Voice Search application for iOS, followed by Siri. Google and Apple made the technology popular, showing that it is finally mature enough to use in our daily lives.

Siri

Siri can post to your Facebook or Twitter account, send email or SMS messages, set reminders and timers, start applications, check the weather, stock prices and game scores, set up meetings, and even provide directions using Apple Maps. You can also ask her about movies and restaurants, then buy tickets or make a reservation. She can search the web for information she does not have, and thanks to her Wolfram Alpha integration she can answer many different kinds of questions in English. While she's mostly passive and answers only when you ask something, her location-based reminders feature brings some form of proactivity. For example, you can ask her to remind you of something when you leave the office. She's not just a cold virtual assistant; she also has her own sense of humor. She knows how to answer when you say you love her, or when you ask where you can hide a body. Apple made her available to car manufacturers through the Eyes Free feature, so she can serve you safely while you drive. While it is not officially possible to integrate Siri with other platforms, people find ways to use her even to open their garage doors.

Google Voice Search / Google Now

Google Voice Search can be seen as Siri's competitor, since it answers questions the way Siri does. Because it gives you access to the same functions you reach by typing into Google, it is basically Google's voice-based interface. It is hard to say which of the two answers questions better, but maybe an assistant should be something more than a system that answers questions. You may find Google Now the better assistant, as it gets you just the right information at just the right time.
It tells you today’s weather before you start your day, how much traffic to expect before you leave for work, when the next train will arrive as you’re standing on the platform, or your favorite team's score while they’re playing. And the best part? All of this happens automatically. Cards appear throughout the day at the moment you need them.
It runs in the background, checking your location, your emails, your calendar and even your search history on mobile and desktop in order to serve you. This sounds like Big Brother is watching you, but you can get addicted pretty fast to how easy its widget makes finding the information you need.

Getting more into our lives

I'm sure you remember the Back to the Future series, which predicted that we would have flying cars and hoverboards. We're still far from those, as well as from food hydrators, but we're pretty close to having voice-controlled home appliances, thanks to those who made this technology mature enough.
Today voice recognition technology is expanding to browsers, TV remotes, alarm clocks, watches, games consoles, and soon to glasses. It looks like we'll have always-on, voice-activated devices at home and real-time translators [from Microsoft, Docomo, Clarity, Vocre], and maybe this technology will always be behind the scenes, listening to us and offering help. So what do you think: will Siri open the pod bay doors for us, or kill us all?