For marketers and researchers, voice assistants are the biggest opportunity for personalized engagement, brand building, and consumer empowerment that has, paradoxically, gone untouched except by the few with true vision. Understanding voice assistants, what they offer, and which best align with your marketing and research goals is key to making that first investment in voice assistants a success.
A Brief History of Voice Assistants
To say that voice assistants have the potential to dramatically shift how you live, work, and play is no overstatement. The first time you ask your TV to play your favorite show using just your voice, you realize you’ll never go back to that brick of a remote again.
Today, voice assistants help us to do everything from ordering our favorite products to controlling our homes and appliances. There are few things that cannot be done faster and easier today using just your voice and a voice assistant.
While voice assistants as we know them today are new, the technologies behind them are not. The first voice recognition system, for example, was created in 1952 by Bell Laboratories with the introduction of “Audrey,” arguably the grandmother of Siri, Alexa, and Google Assistant.
Though limited, Audrey was able to recognize spoken numbers. It would be another 10 years before IBM gave the world “Shoebox,” capable of understanding and responding to 16 words in English. Jump ahead another 10 years and, in the 70s, DARPA and the US Department of Defense developed the Speech Understanding Research (S.U.R.) program, eventually making it possible for Carnegie Mellon’s “Harpy” speech system to understand over 1,000 words - about the same as your three-year-old child.
It wouldn’t be until the 1990s that we finally experienced the first consumer-accessible speech technology, developed by Dragon: “DragonDictate” and, shortly after, “Dragon NaturallySpeaking.” Together these offerings significantly advanced the ability to understand human speech and captured the imagination of many consumers. But this was all still more speech recognition than “voice assistance.”
In 1996, however, BellSouth started the voice response revolution with the introduction of VAL. No, not HAL (that doesn’t come until much later) but VAL or Voice Activated Portal. This was the birth of IVR, or interactive voice response, which started to create the response side of the voice assistant equation.
Many advances followed through the years, but it wasn’t until Apple’s launch and native integration of Siri in 2011 that voice assistants gained mainstream attention and gave the first hint of their practical, real-world utility. Suddenly millions of people had access to voice-based interactivity in the palm of their hands via iPhone. Siri, at launch, was exciting, but we soon learned it wasn’t yet practical for daily use by consumers. Speech understanding was spotty, and what you could do with voice was limited mostly to asking for the weather and setting a timer.
Jump forward a few years and something very special starts to happen. Siri learns new tricks. Microsoft introduces Cortana, and a few months later Amazon introduces Alexa into our homes. Two years later, Google Assistant comes home too. This becomes a pivotal time for voice assistants as they begin their meteoric journey to become fixtures in our lives - our homes, appliances, automobiles, and yes, still on our phones.
The utility of voice assistants has grown enormously since the introduction of Siri - today you can use voice to perform a dizzying array of tasks, play games, listen to books, get guidance, and more. It truly is amazing.
The Growth of Voice Assistants
We can’t talk about voice assistants without covering their explosive growth in adoption. Today, voice assistants, including Alexa, Google Assistant, Samsung Bixby, and Siri, are used by over 3.2 billion[1] people across a multitude of devices including smart speakers, appliances, automobiles, mobile phones, tablets, and more - even toilets!
In the US, smart speakers alone are found in roughly 90 million homes, with millions of consumers engaging voice assistants daily. If you account for voice assistants beyond smart speakers - in automobiles, phones, televisions, tablets, showers, and toilets - a staggering 150 million-plus consumers, approaching half the US population, engage with voice assistants every single day!
Statista's research estimates that consumer voice assistants (or virtual digital assistants) will reach an astonishing 8 billion devices, and a considerable share of the world's population, by 2023. These numbers translate to a forecast consumer market value of about $150 billion by 2023. These are pre-COVID estimates and, since April 2020, we’ve only seen user engagement amplify and accelerate, a trend that will continue to drive voice-first engagement to the forefront via voice assistants.
Voice Assistants Today
When we talk about voice assistants today, a few recognizable names should pop into your head. Specifically:
- Amazon Alexa
- Google Assistant
- Apple Siri
- Samsung Bixby
There are quite a few specialty assistants as well:
- Hey Mercedes, part of the Mercedes-Benz User Experience (MBUX) technology
- IBM Watson
- Microsoft Cortana, now found as part of the Windows operating system and Xbox
- Hound from SoundHound, built on its Houndify platform
- Dragon from Nuance Communications
Some of these voice assistants have very specific applications and capabilities. For marketers and researchers, navigating the voice assistant ecosystem and knowing which, how, when, and why to leverage them in your projects can easily become a confusing experience.
While there are a multitude of voice assistants out in the world, only a handful are capable of adding true value for marketers and researchers today. These are the voice assistants with enough consumer reach and usage to be worth investing in. Let’s have a look at the top assistants and what they offer.
Let's start with Apple Siri. While arguably the first mainstream voice assistant to reach consumers at scale, Apple Siri, like the rest of the Apple ecosystem, has a very deliberate and controlled set of use cases. As of this writing, unlike other voice assistants, Apple Siri does not support a conversational experience.
Except in a very limited scope, it’s more of a request-action-via-voice experience than anything else. Get the weather, turn off the lights, play some music, launch an app. That kind of stuff. For marketers and researchers, the ways you can leverage Apple Siri for your campaigns are somewhat limited. As of this writing, there are no provisions for creating what we at True Reply refer to as Conversational Voice Applications. You can’t create a pure-voice experience that is not part of or anchored to a native iOS app.
You can, however, develop for Apple Siri as part of a mobile iOS app experience. This takes the form of deep-link-like interactions via voice. For example, if you have a ride-sharing app, you can integrate voice and allow Siri to accept voice commands like booking a ride.
Unlike Apple Siri, Amazon Alexa, Google Assistant and Samsung Bixby all support voice-first, Conversational Voice Apps.
With Amazon Alexa, you can create Alexa Skills that allow consumers to engage via any Alexa-powered device. This includes the entire line of Amazon Echo devices and millions of additional third-party devices, appliances, automobiles, and more that have Amazon Alexa built in. Today, this includes offerings from Facebook, Kohler, BMW, Ford, Audi, and many more brands you'd quickly recognize.
Alexa Skills can create gaming experiences, tell stories, check in on loved ones, collect consumer opinions, control home products, and so much more.
Something of a hybrid of Amazon Alexa and Siri, Google Assistant supports both voice-first apps and voice apps that amplify mobile app functionality, but you need to be very intentional in your experience design to leverage both at once. Voice apps for Google are called Google Actions. While the footprint of OEM products is not as large for Google Assistant as it is for Amazon Alexa, the potential reach of Google Actions is considerably larger given the more than 500 million smartphones that currently have Google Assistant native to their operating system.
Google Actions support features and functionality comparable to Alexa Skills, so the types of experiences you can create via Google Actions are very similar.
Arguably the new kid on the block, Samsung Bixby is Samsung's answer to Amazon Alexa and Google Assistant.
Designed with a voice-first vision from the start, Samsung Bixby is the future of voice for Samsung's wide array of smart devices, phones, and appliances. Similar to Google Assistant, Samsung Bixby can be strategically deployed either as a voice-first app experience for all supporting Samsung devices or as a voice app that is tethered to a mobile app on Samsung phones. Conversational by design, voice apps for Samsung Bixby, or Capsules, are capable of supporting Conversational Experiences on par with Amazon Alexa and Google Assistant.
While the footprint of Samsung Bixby is considerably smaller than that of Amazon Alexa and Google Assistant, given Samsung's drive for innovation in consumer electronics, the future is bright for Samsung Bixby. Something to keep an eye on in the years to come.
The Future of Voice Assistants
Voice assistants still have a little way to go before they become a truly ubiquitous technology. I’d argue that voice assistants today misunderstand us about as much as, if not less than, our significant others. That being said, thanks to advancements in machine learning, voice assistants’ ability to understand your every word will get even better very fast.
Within True Reply, we saw an improvement in Google's speech recognition between 2018 and 2019 of nearly 10% - that is huge considering how good it already was when we started. Improvements in speech recognition by all providers will continue at an exponential rate.
Looking internationally, the technology needs to advance in how it understands different accents, dialects, nuanced speech, and so on. To me, this is less about speech recognition and more about speech understanding or, more accurately, Natural Language Understanding (NLU).
I like to argue that NLU is not a computational science but a linguistic science, one in which we strive to derive meaning as much from context as from intent and situation. That understanding is then bestowed on voice assistants through engineering marvels.
While the technology is not perfect and still has a way to go - I mean, what technology doesn’t? - the future, the very near future, of voice assistants feels pretty obvious to me. It's right there from the moment you ask Alexa to continue your binge of The Queen's Gambit on Netflix.
The future of computing and interacting with our digital systems and environment will be built on a firm pillar of voice assistants and voice-first interfaces. Mark Cuban is continuously quoted by the voice community as saying “There is no future that doesn’t have some form of ambient voice interface in it.”
There is no future that doesn’t have you talking to your environment to get what you need and do what you need done faster and easier than any other way. For marketers and researchers, this means there is no future that doesn't involve you leveraging voice assistants for your marketing and research efforts.