Jeff Bezos-led Amazon is teaching Alexa to talk more like a human. Alexa can now whisper, take a breath to pause for emphasis, adjust the rate, pitch and volume of her speech, and more.
These new tools were provided to Alexa app developers in the form of a standardized markup language called Speech Synthesis Markup Language, or SSML, which will let them code Alexa’s speech patterns into their applications. This will allow for the creation of voice apps – “Skills” on the Alexa platform – where developers can control the pronunciation, intonation, timing and emotion of their Skill’s text responses.
According to Amazon, “Speech Synthesis Markup Language, or SSML, is a standardized markup language that allows developers to control pronunciation, intonation, timing, and emotion. SSML support on Alexa allows you to control how Alexa generates speech from your skill’s text responses. You can add pauses, change pronunciation, spell out a word, add short audio snippets, and insert speechcons (special words and phrases) into your skill. These SSML features provide a more natural voice experience.”
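To make Amazon's description concrete, here is a minimal Python sketch of how a skill might return SSML instead of plain text. It assumes the standard Alexa response JSON, where `outputSpeech` has `type` set to `"SSML"` and the markup is wrapped in `<speak>`. The helper names (`pause`, `spell_out`, `speechcon`, `build_ssml_response`) and the sample sentence are illustrative only and not part of any Amazon SDK, and the speechcon "bravo" is assumed to be available in the skill's locale.

```python
import json
from xml.sax.saxutils import escape


def pause(duration: str) -> str:
    """Insert a pause, e.g. pause("500ms") or pause("1s")."""
    return f'<break time="{duration}"/>'


def spell_out(word: str) -> str:
    """Ask Alexa to read the text letter by letter."""
    return f'<say-as interpret-as="spell-out">{escape(word)}</say-as>'


def speechcon(phrase: str) -> str:
    """Speak a special word or phrase more expressively (a speechcon)."""
    return f'<say-as interpret-as="interjection">{escape(phrase)}</say-as>'


def build_ssml_response(ssml_body: str) -> dict:
    """Wrap an SSML fragment in a skill response (hypothetical helper)."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {
                "type": "SSML",
                "ssml": f"<speak>{ssml_body}</speak>",
            },
            "shouldEndSession": True,
        },
    }


# Illustrative sentence only; the content is made up for this sketch.
body = (
    "Welcome back."
    + pause("500ms")
    + " Your code is "
    + spell_out("XK7")
    + ". "
    + speechcon("bravo")
    + "!"
)
response = build_ssml_response(body)
print(json.dumps(response, indent=2))
```

Escaping the text with `escape` keeps characters such as `&` or `<` in a response from breaking the markup.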
Here’s what you’ll start to hear:
Substitutions (Alexa will say something different from the word that’s written)
Emphasis (this affects Alexa’s speech rate and volume)
“Prosody” (controls volume, pitch, and speech rate)
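For developers, the features listed above map onto SSML tags. The Python sketch below builds them as plain strings: `<sub>` for substitutions, `<emphasis>` for emphasis, `<prosody>` for volume, pitch and rate, and `<amazon:effect name="whispered">` for whispering. The function names and sample text are hypothetical, chosen for illustration, and the attribute values shown are a small subset of what the tags accept.

```python
from xml.sax.saxutils import escape, quoteattr


def substitute(written: str, spoken: str) -> str:
    """Alexa says `spoken` in place of the `written` text."""
    return f"<sub alias={quoteattr(spoken)}>{escape(written)}</sub>"


def emphasize(text: str, level: str = "strong") -> str:
    """level is one of: strong, moderate, reduced."""
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def prosody(text: str, rate: str = "medium", pitch: str = "medium",
            volume: str = "medium") -> str:
    """Adjust speech rate, pitch and volume for a span of text."""
    return (
        f'<prosody rate="{rate}" pitch="{pitch}" volume="{volume}">'
        f"{escape(text)}</prosody>"
    )


def whisper(text: str) -> str:
    """Apply Amazon's whispered effect to the text."""
    return f'<amazon:effect name="whispered">{escape(text)}</amazon:effect>'


# Illustrative response text; not taken from any real skill.
ssml = "<speak>" + " ".join([
    "My favorite element is " + substitute("Al", "aluminum") + ".",
    emphasize("Really"),
    prosody("I can slow down and get louder.", rate="slow", volume="loud"),
    whisper("Or I can keep a secret."),
]) + "</speak>"
print(ssml)
```

Keeping each tag in its own small function makes it easy to combine effects in one response while leaving the spoken text readable in the skill's code.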
Amazon recently introduced the Echo Look, a $200 device that includes a camera and computer vision tech to recommend outfits. Alexa can help you look your best: using just your voice, you can take full-length photos and short videos with a hands-free camera. The new home assistant answers to commands like “Alexa, take a picture” and “Alexa, take a video” – for the latter, users spin around to get shot from all sides, taking selfies while keeping their hands free. Videos shot with the hands-free selfie stick can be recorded or viewed in real time. (Image: Amazon)