The voice-activated computer has been a sci-fi staple for years, a ubiquitous sign of “the future,” much like the flying car. We may not be flying to work yet, but thanks to voice assistants such as Siri, Cortana, and Alexa, we all have access to a little bit of the future on our smartphones, our computers, and in our homes.
Voice assistants are easy to use: you push a button or say an activation phrase, then ask a question or give a command, such as “Hey Siri, what’s the weather like today?” or “Alexa, turn the lights on,” and your assistant responds accordingly. You ask it to remind you of something, and an entry appears in whichever task and to-do list application you use. But how does that happen? How does your spoken request get translated by your device? How does it know what you mean? Two key programming concepts behind all digital assistants allow them to translate our commands and act on them: natural language processing and machine learning.
Natural language processing is the result of decades of advancement in speech recognition and speech synthesis: a complex and varied set of algorithms that help computers understand how people actually speak and how to mimic those patterns. Human speech does not come naturally to computers; it is complicated and subtle, and the actual words spoken often matter less than the order in which they are spoken. Early speech-recognition programs were like word processors: they could transcribe the words spoken, in order, without gleaning any of their nuance or meaning. Modern systems can identify keywords based on their position in a sentence and take actions based on those keywords; they can even give a verbal response, prompting for more information or confirming your request.
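To make the keyword idea concrete, here is a deliberately tiny sketch of keyword-based intent matching. The intent names, keyword lists, and function are all invented for illustration; real assistants use full statistical language models rather than simple substring matching.

```python
# Toy keyword-based intent parser. Every intent name and phrase below is
# an illustrative assumption, not how any real assistant is configured.
INTENTS = {
    "weather": ["weather", "forecast", "temperature"],
    "lights_on": ["lights on", "turn on the lights"],
    "reminder": ["remind", "reminder"],
}

def parse_intent(utterance: str) -> str:
    """Return the first intent whose keyword appears in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENTS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "unknown"

print(parse_intent("Hey, what's the weather like today?"))  # weather
print(parse_intent("Please turn the lights on"))            # lights_on
```

Even this crude version shows why keyword spotting works at all: a handful of anchor words is often enough to map a sentence to an action, while everything around them is filler the program can safely ignore.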
Machine learning is a much younger aspect of artificial intelligence, one that focuses on creating machines that can learn from experience. This allows your digital assistant to pick up on patterns in your usage and eventually anticipate your actions without being prompted. For instance, if you use your assistant to order a car service to your favorite restaurant on Friday evenings a few times, your assistant may suggest that restaurant the next time you order a car service on a Friday night. This is how your digital assistant knows to give you location- or context-based information, providing an experience that is that much closer to interacting with a person.
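The Friday-night restaurant example can be sketched as simple frequency counting over usage history. The class, its threshold, and the restaurant name below are all hypothetical; production systems learn from far richer signals than a weekday counter.

```python
from collections import Counter

class HabitLearner:
    """Toy habit learner: counts (weekday, destination) pairs and
    suggests a destination once a pattern repeats often enough.
    The threshold of 3 is an arbitrary illustrative choice."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.history = Counter()

    def record_ride(self, weekday: str, destination: str) -> None:
        self.history[(weekday, destination)] += 1

    def suggest(self, weekday: str):
        """Return the most frequent destination for this weekday,
        or None if no destination has crossed the threshold yet."""
        candidates = {dest: n for (day, dest), n in self.history.items()
                      if day == weekday and n >= self.threshold}
        if not candidates:
            return None
        return max(candidates, key=candidates.get)

learner = HabitLearner()
for _ in range(3):
    learner.record_ride("Friday", "Luigi's Trattoria")  # hypothetical place
print(learner.suggest("Friday"))  # Luigi's Trattoria
print(learner.suggest("Monday"))  # None
```

The design point is the threshold: the assistant stays quiet until a behavior has repeated enough times to look like a habit rather than a coincidence, which is why suggestions only appear after "a few times."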
There are, of course, other algorithms and processes that go into making a working digital assistant, but this has been a glimpse at two of the biggest. Understanding these concepts, even in a general sense, can help you better understand how your assistant works and improve your experience.