In an April 25th keynote address at the World Wide Web Conference in Lyon, France, Ruhi Sarikaya, who heads the Alexa Brain Team, spoke about Amazon’s objective to enhance the way we engage with the ever-growing number of connected devices, with specific reference to Alexa.
“Our goal is to enable more natural interaction with all of these IoT devices, and for these devices to more proactively engage with us,” Sarikaya wrote in a blog post.
He said his team’s primary focus is to develop Alexa into a smarter and more naturally interactive voice assistant than it currently is; to make it simpler for users to find and engage with the more than 40,000 skills that developers have created for the voice assistant; and to improve Alexa’s ability to track context and memory within and across dialog sessions.
Toward that end, the Alexa Brain team has been working on three new capabilities: skills arbitration, context carryover, and memory, all to be made available to Alexa users in the near future.
Skills arbitration, in this case, uses machine learning to let Alexa users “automatically discover, enable and launch skills using natural phrases and requests” – a feature that will roll out to U.S. customers in the coming weeks.
Citing an example of this “friction-free” interaction capability, Sarikaya said that when he recently asked the voice assistant, “Alexa, how do I remove an oil stain from my shirt?”, she replied, “Here is Tide Stain Remover,” and guided him through the entire stain-removal process – something that would previously have required him to find and enable the skill on his own.
It may have been one example, but it does give you an idea of how the new ‘skills arbitration’ capability will allow customers direct and unimpeded access to, and communication with, third-party skills.
“We’re excited about what we’ve learned from our early beta users and will gradually make this capability available to more skills and customers in the U.S.,” Sarikaya wrote in the blog post.
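To make the idea concrete, here is a deliberately simplified sketch of what “arbitration” means in principle: routing a natural-language request to the best-matching skill. This toy version scores skills by keyword overlap; it is not Amazon’s implementation, which the article says relies on machine-learned models, and the skill names and keyword sets below are invented for illustration.

```python
# Toy skill arbitrator: route an utterance to the skill whose keyword
# set overlaps it the most. A hypothetical stand-in for the learned
# ranking models Amazon actually uses -- illustration only.
SKILLS = {
    "Tide Stain Remover": {"remove", "stain", "oil", "shirt", "clothes"},
    "Weather": {"weather", "forecast", "rain", "temperature"},
}

def arbitrate(utterance: str) -> str:
    # Normalize crudely: lowercase and strip a trailing question mark.
    words = set(utterance.lower().replace("?", "").split())
    # Pick the skill with the largest keyword overlap.
    return max(SKILLS, key=lambda skill: len(SKILLS[skill] & words))

print(arbitrate("how do I remove an oil stain from my shirt?"))
# -> Tide Stain Remover
```

A real arbitrator must also decide *whether* to hand off at all, and rank tens of thousands of candidate skills, which is why a learned model rather than keyword matching is needed at scale.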
The “context carryover” capability will support what Sarikaya calls “multi-turn utterances,” replacing the existing “two-turn interactions with explicit pronoun references.”
In practice, this means customers will be able to interact with Alexa more naturally – as if conversing with another person rather than a machine. You won’t have to address Alexa by name for every follow-up question or command, and context will carry over across domains.
Sarikaya explains the concept with some examples that make it easier to understand how this new capability will make a difference.
Here’s an example of the existing “two-turn” way of communicating with Alexa.
“Alexa, what was Adele’s first album?” “Alexa, play it.”
Now, here’s how the “multi-turn utterances” capability will work.
“Alexa, how is the weather in Seattle?” → “What about this weekend?”
The difference is obvious.
And, here’s an example of context across domains or ‘context carryover.’
“Alexa, how’s the weather in Portland?” → “How long does it take to get there?”
Notice the shift from the weather in Portland to the time it would take to reach the city, which also indirectly involves traffic conditions.
“We are providing this more natural way of engaging with Alexa by adding deep learning models to our spoken language understanding (SLU) pipeline that allows us to carry customers’ intent and entities within and across domains (i.e., between weather and traffic),” Sarikaya writes.
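The Portland example above can be sketched as a tiny dialog-state tracker: entities (“slots”) recognized in one turn are kept in context, so a follow-up with no explicit city still resolves against the earlier mention, even when the intent shifts from weather to travel time. This is a conceptual illustration only – the intent names and rule-based matching below are invented, whereas Amazon’s SLU pipeline uses deep learning models.

```python
# Toy context-carryover sketch: carry recognized entities across turns
# so a follow-up like "get there" inherits the city from the prior turn.
class DialogContext:
    def __init__(self):
        self.slots = {}  # entities remembered across turns

    def understand(self, utterance: str) -> dict:
        frame = {}
        # Hypothetical rule-based intent detection (stand-in for a model).
        if "weather" in utterance:
            frame["intent"] = "GetWeather"
        elif "how long" in utterance:
            frame["intent"] = "GetTravelTime"
        # Entity recognition over a tiny fixed gazetteer.
        for city in ("Seattle", "Portland"):
            if city in utterance:
                frame["city"] = city
        # Context carryover: reuse the last city if none was mentioned.
        if "city" not in frame and "city" in self.slots:
            frame["city"] = self.slots["city"]
        self.slots.update(frame)
        return frame

ctx = DialogContext()
print(ctx.understand("how's the weather in Portland"))
# -> {'intent': 'GetWeather', 'city': 'Portland'}
print(ctx.understand("how long does it take to get there"))
# -> {'intent': 'GetTravelTime', 'city': 'Portland'}
```

Note how the second frame keeps `city: Portland` even though the follow-up never names the city and the intent has crossed from the weather domain into traffic.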
This capability will initially be made available to all Alexa users in the U.S., U.K., and Germany.
Another handy characteristic that is expected to enhance the Alexa experience is the “memory” feature, soon to be made available in the U.S.
As the name suggests, this capability will allow Alexa to remember information that you are likely to forget and retrieve it later for you when you actually need it.
So, if you are someone who tends to forget important events – your wife’s birthday, say, or your wedding anniversary – then worry not, because Alexa will dig deep into her memory bank to save that marriage of yours.
All you will need to do is tell Alexa to remember the information you think you are likely to forget.
“Alexa, remember that Jane’s birthday is June 20th.” Alexa will reply: “Okay, I’ll remember that Jane’s birthday is June 20th.”
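At its simplest, the remember/recall exchange above can be modeled as a fact store keyed by the subject of the sentence. The sketch below is a toy illustration under that assumption – the parsing rule (“subject is value”) and the phrasing triggers are invented, not Amazon’s design.

```python
# Toy memory feature: store facts phrased as "<subject> is <value>"
# and recall them later by subject. Illustration only.
memory = {}

def handle(utterance: str) -> str:
    if utterance.startswith("remember that "):
        fact = utterance[len("remember that "):]
        subject = fact.split(" is ")[0]  # naive parse of the subject
        memory[subject] = fact
        return f"Okay, I'll remember that {fact}"
    if utterance.startswith("when is "):
        subject = utterance[len("when is "):].rstrip("?")
        return memory.get(subject, "I don't know that yet.")
    return "Sorry, I didn't understand."

print(handle("remember that Jane's birthday is June 20th"))
# -> Okay, I'll remember that Jane's birthday is June 20th
print(handle("when is Jane's birthday?"))
# -> Jane's birthday is June 20th
```

The hard part in a real assistant is not the storage but the language side: recognizing that an arbitrary utterance is a “remember this” request and later matching a differently worded question back to the stored fact.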
“This memory feature is the first of many launches this year that will make Alexa more personalized,” said Sarikaya.
“It’s early days, but with this initial release we will make it easier for customers to save information, as well as provide a natural way to recall that information later,” he added.
All of the above are, indeed, great enhancements to Alexa’s capabilities, but many areas for improvement remain. Most of them have been identified, and the Alexa Brain team is working to address them.
“We have many challenges still to address, such as how to scale these new experiences across languages and different devices, how to scale skill arbitration across the tens of thousands of Alexa skills, and how to measure experience quality,” says Sarikaya.
“Additionally, there are component-level technology challenges that span automatic speech recognition, spoken language understanding, dialog management, natural language generation, text-to-speech synthesis, and personalization,” he adds.