2017 is already going down in history as the year that voice computing went mainstream.
Amazon leads the pack, having sold well over eight million Echos and Dots in just two years and leveraging its Alexa Voice Service (AVS) platform to get Alexa into everything from refrigerators to dancing robots to Ford F-150s (the #1 selling vehicle in the US for 40 years).
Add other voice computing products like Google Home and the potential launch of an Apple Siri speaker this summer, and it’s not out of the question that over 25 million more voice devices will ship this year. Despite this growth, voice computing is already showing some core problems in user retention and discovery. According to a new study by VoiceLabs, new skills and actions lose 97% of their users in just two weeks, while fewer than a third of the 10,000 Alexa skills have more than one review. But this isn’t because voice computing is failing. It’s because voice is only one part of the coming ambient computing revolution.
“Ambient computing” refers to making the capabilities of a place, such as a home, directly accessible to anyone present, without the need for an intermediate device like a mobile phone or computer. If you have ever stood in your kitchen and asked Alexa to play music or turn on the lights, you’ve used ambient computing. (Incidentally, these are the two most common uses of Alexa, each accounting for 30% or more of Alexa requests.) If you’ve ever had lights with motion sensors turn off when you aren’t in the room, or armed your security system using a wall keypad, you have also used ambient computing. Voice computing is just one of many ways that you can interact directly with your environment.
Voice computing works well for direct interactions when you know exactly what you want, such as asking for a weather forecast, but it falls critically short in other interactions, such as choosing from a list of options, reviewing information, or discovering what capabilities are available. General-purpose ambient computing devices will have a range of interfaces adapted to relevant…