The problem is that Alexa is just a consumer voice command line. UI discoverability is impossible, and everything that you can do is just a utility that does something else. There are no native apps because everything that it can do is just an IO to something else. If they actually get conversational, this will change, but until then, it’s just a command line with no man page.
> The problem is that Alexa is just a consumer voice command line. UI discoverability is impossible, and everything that you can do is just a utility that does something else.
In a way, I think it's even worse than that. I've used Alexa since the Echo was a relatively new product. Back then, I experimented with new phrases and commands often, but was frequently greeted with wrong answers or "Sorry, I don't know how to help with that." Over time, I stopped trying those commands. Skip forward to today--the backend has been improving for years, and many of those commands now work, but it's too late. Their users have already been taught that they don't, so folks stop trying to use those features. Not only can you not discover new commands easily, you might mentally blacklist useful commands permanently.
Perhaps more frustratingly, the "What's New" emails they send out don't help with this. They never say "Hey, we know you tried to ask your Echo to report its volume level before and it didn't work, but it does now." They always say "Ask Alexa to tell you an Arbor Day joke!" -_-
These devices have a huge marketing problem. Like the Alexa commercial that shows somebody pausing a Prime TV show to order something from the Amazon store. Cool, I guess--except I can already order stuff while watching TV by using my smartphone, so the value proposition is completely absent.
Discoverability is solvable. Amazon has chosen not to focus on solving it; instead they are focused on sheer volume of skills and on making them easy to build. A lot of Alexa skills can be made in a day.
I'd argue this is the wrong thing to solve for. The best skills take a long time to make and require privileged access. Sonos, Spotify and others have this, and they work amazingly well. It's the large mass of Alexa skills that were quickly made and that don't have privileged access that are dragging the whole experience down.
So, just like every app I use that doesn't put effort into making me aware of their new features (but probably worse, as it likely takes more effort to find those features).
It's humbling (and annoying) when you think of yourself as a power user of some application to find out from some semi-new user of it that there's a new feature that would have made your life much easier if you had only known it was added a year or two ago.
There's only so much time to learn about the tools you use, much less the changes in them over time. How do I find out if GNU grep has some interesting new feature in the version that ships with the next version of the distro I use? Or rsync? Or tmux/screen?
This problem definitely seems worse with smaller tools, since there are a lot of them and they are often packaged together by some other party. Not every project is as large, or used by such a large population, as Firefox, which arguably does a very good job of advertising new features. But you can definitely tell it takes them a lot of time and effort to do so, and a smaller project may not find much traction if it tried the same strategy.
>> "Hey, we know you tried to ask your Echo to report its volume level before and it didn't work
This makes me shudder. Such functionality would mean that they stored your failed request in some sort of database, a database they later used to send you a personalized marketing email. A machine cataloging our voices for later inspection is a very dark future. Alexa should delete and scrub any iota of voice that it doesn't instantly understand.
"Oh, remember last week when I though you were asking me to buy you a pot plant? I now realize you were asking me to buy you some pot from Canada. It will be arriving in two days."
"I didn't understand it at the time, but I now realize that you were yelling at your husband Alex, not Alexa. Your social credit score has been adjusted to reflect this negative interaction."
A great step that could be taken before full-on conversational ability would be the simple ability for the assistant to ask questions and seek basic clarification when encountering ambiguity, and then use the user's answer to learn so clarification won't be needed next time. This way, over time the voice assistant would refine itself and acclimate itself to its user's way of communicating.
This alone would make for a much more usable experience, and yet no voice assistant has implemented it yet, which flabbergasts me. I don’t have much knowledge in this particular realm so maybe I’m missing some big blocker that prevents it from being possible, but to me it seems like such an obvious thing to do.
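To make the loop concrete, here's a toy sketch of what I have in mind (all names hypothetical, and it hand-waves the actual speech/NLU plumbing):

```python
# Toy sketch: clarify when ambiguous, then remember the answer per user.
# nlu() and ask_user() are hypothetical stand-ins for the real plumbing.

learned = {}  # utterance -> intent the user confirmed earlier

def nlu(utterance):
    # Pretend NLU: (intent, confidence) candidates, best first.
    return [("play_music", 0.52), ("play_podcast", 0.48)]

def ask_user(question):
    print(question)
    return input("> ")

def handle(utterance):
    if utterance in learned:
        return learned[utterance]            # no need to ask again
    (best, c1), (runner_up, c2) = nlu(utterance)[:2]
    if c1 - c2 < 0.1:                        # too close to call: clarify
        choice = ask_user(f"Did you mean {best} or {runner_up}?")
        learned[utterance] = choice          # acclimate to this user
        return choice
    return best
```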
> The problem is that Alexa is just a consumer voice command line. UI discoverability is impossible, and everything that you can do is just a utility that does something else
This is a really important point, and both a strength and a weakness.
The iPhone began as essentially a front end to existing services (with visual discovery, which a voice interface inherently lacks). I used to name mine "FEP" (as in Front End Processor) -- a front end to a subset of "real" computing or as I think of it these days: multiple windows into a shared computing space.
A watch (like the apple one) is really a crappy general UI device; discoverability is pretty bad because of the limited area and speed. But it's great in the role the phone had: subsetted interface to a limited number of "real" computing tasks (yes, it has a few of its own tricks too but mainly as data collection for apps on your phone).
Thompson captured this issue by talking about devices and software in terms of "the task it was hired to do". The problem is the voice assistants haven't figured that out yet.
The thing about this approach is it creates pressure to make more functionality available at the edge.
Alexa and google home tried to jump right to the edge in one go, which skips too much phylogeny.
Apple seems to understand this, but gets it wrong in the opposite direction: the iPad hasn't moved far beyond being "most of an iPhone but with a larger screen". And if you have an Apple speaker, a phone, and an iPad and call out "hey Siri, set an alarm for 10 minutes", you may get three devices chiming in 10 minutes. They don't act like a single device.
Sheer volume of commands is a pretty good solution to the lack of discoverability - if it does enough, you're probably going to try things without knowing if it works. I got a Google Home recently and have been impressed at the complexity and range of commands it supports. Obviously there's a limit, but it's a lot "fuzzier" than a command line.
My Echo was an interesting novelty at first. But a couple years in now, it is essentially (in order of usefulness):
1. An alarm clock / kitchen timer
2. A thing that tells me the weather report during morning coffee, so I know what to wear
3. A DJ that my kids yell at to play pop music
If it died tomorrow, I'd probably just go back to using my phone for these 3 things rather than buy a new one. I can't imagine getting into it enough to explore third-party "skills".
Pretty much the same for me, although I'll add (I have Google Homes but pretty much the same thing):
1. Metric / imperial conversions, especially in the kitchen - this is one feature that is miles better than using a phone or computer if my hands are dirty cooking something and I want to know how many grams 12 ounces is or something like that.
2. Intercom between my 2 google homes, one in my kitchen and one in our converted attic playroom - it's so much nicer to use Google home to call my kids down for dinner vs. screaming up two flights of stairs.
3. Making quick phone calls
4. Finding my phone - I lose my phone constantly, and it's super handy that I can get my google home to make it ring even if it's on silent.
4. Shopping List.
And that's a killer feature for me; it's so damn convenient to use a voice interface to add items to the shopping list as you run out of items that I'd probably replace it just for that.
Same, although I'd say a timer and weather machine that I talk to every day is a runaway hit. There are few other things I own besides my bed, phone, clothes and a couple of major appliances that I use every single day.
I came to say the same. For my 15mo old son, I prefer asking Alexa to play music on the Echo because I don't have to ignore him (in a way) to go tapping on my phone -- which makes him very curious about what I'm doing.
As a plus, he babbles at the Echo when he wants to listen to music.
For an elderly relative of mine who no longer has the physical dexterity they used to, this has been a killer feature of voice assistants for them. No longer do they need to fiddle with awkward buttons to change the radio station or find out what the weather is, they just... ask and it happens.
I can turn lights off without getting off my bed, and damn, has it made my life crazy simple. I was in India for a few months last year and I realized a large part of my getting up early was just to switch on the geyser, open the door for the cook and turn the lights on. I would get back to bed after and check Facebook till the water got hot. Just the fact that Echo can do these things without making me get off the bed is a crazy value addition to my current, decently privileged life.
Find the phone, unlock the phone, find and open the app, find and click the control. That requires some effort.
Saying commands, having physical buttons at the expected location in your home, NFC tags.. there's a bunch of more convenient ways to do extremely repetitive tasks that would otherwise take you more time to do.
I bought an Echo speaker when they first came out. The sound quality is impressive (to my untrained ear), but the development experience is not.
The first thing I wanted to do was to add a feature so I could add a task to my to-do list software which is not supported by Alexa. It turns out that you cannot construct a sentence along the lines of "Tell Asana to add a task: <task>". You can't actually have a 'slot' which contains a freeform piece of text, even if it is the last piece of text in the sentence.
The Alexa API differs between regions, so the North America version of the API supports this but the EU version doesn't. It was removed from the NA version for a while but placed back after a bit of an uproar.
I think you could develop some more useful stuff using Alexa if only this feature was consistently available. I cannot think of a good reason why it isn't.
I now just mainly have it as a Spotify speaker, and occasionally I use it as an expensive egg timer. I normally have it on mute because I find it activates and starts recording private conversations. I don't see it getting better.
This issue has killed my interest in basically all smart-home products I can find on the market, even ignoring their massive privacy issues.
The most useful 'basic' behavior I could think of for a smart home device was to check the weather and trigger my alarm earlier in bad weather or heavy traffic. I knew IFTTT was capable of running scripts when a trigger happened, checking the weather and traffic, and setting off an alarm, so it seemed obvious. I literally wanted "if this (or this), then that"!
No such luck. The basic IFTTT setup couldn't do it at all, no existing app could do it, and the developer program was invite-only. I got in after quite a long time, and even then it wasn't obvious. IFTTT wanted to treat 'check weather' as a script output exclusively, which I couldn't feed into any other system. The best I could do was be told the weather when I woke up. So, I stuck with the phone that could already do that.
That specific situation might have improved, but the general ecosystem doesn't seem much better. The only smart-home features I see available that I would use are list-making, music/media playing, and quick reference. But the features I would actually value are dense integration between apps, floating scripts like the one you describe, and non-user-triggered events to turn active tasks into passive ones. Those seem to be the features which are least available, even when they would be easy to implement.
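To be clear about how little logic I was asking for, here's a sketch of the whole rule (get_forecast and get_commute_minutes are hypothetical stand-ins for whatever weather/traffic source you can actually reach):

```python
from datetime import datetime, timedelta

def get_forecast() -> str:
    return "snow"          # hypothetical weather lookup

def get_commute_minutes() -> int:
    return 55              # hypothetical traffic lookup

def alarm_time(normal: datetime) -> datetime:
    # "if this (or this), then that": ring 30 minutes earlier when the
    # weather is bad or the commute looks slow.
    if get_forecast() in ("snow", "storm") or get_commute_minutes() > 45:
        return normal - timedelta(minutes=30)
    return normal

print(alarm_time(datetime(2019, 1, 7, 7, 0)))  # -> 2019-01-07 06:30:00
```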
This is the correct solution. The Alexa Skills Kit contains the AMAZON.SearchQuery slot, which meets your needs, but Lex doesn't. This is one of the two Alexa slots missing from Lex. If you read articles about Lex slots you might get inaccurate info, as they are slightly differing products.
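For reference, here's a sketch of how that slot is wired into an interaction model (shown as a Python dict of the JSON; the intent and slot names are made up, but the AMAZON.SearchQuery type and the need for a carrier phrase around the slot match the docs):

```python
# Sketch of the interaction-model JSON (as a Python dict) for capturing
# freeform text with AMAZON.SearchQuery. Intent/slot names are invented.
interaction_model = {
    "interactionModel": {
        "languageModel": {
            "invocationName": "my task helper",
            "intents": [
                {
                    "name": "AddTaskIntent",
                    "slots": [
                        {"name": "task", "type": "AMAZON.SearchQuery"},
                    ],
                    # SearchQuery slots need a carrier phrase around
                    # them; a bare "{task}" sample won't build.
                    "samples": [
                        "add a task {task}",
                        "add {task} to my list",
                    ],
                },
            ],
        }
    }
}
```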
I have several, along with Google Home too. They all suffer from the same issue: it's awkward to use with third party skills. Unless they are the highly integrated ones (Spotify, Logitech Harmony) in which case it's great.
It's the same for Siri, the API is so restrictive.
The whole "tell XX to YY" isn't convenient.
One workaround on Google Home is to use IFTTT, and in that case you can customize the whole phrase and response, and then it's pretty nice, even though the latency is high.
"-Okay Google, open the blinds -Sure thing Commander" never gets old.
You don't have to do that - the system can usually infer what skill you want from your utterance [0]. And the "open the blinds" thing can be done on Alexa with a custom routine (Alexa-specific IFTTT).
[0] https://developer.amazon.com/blogs/alexa/post/c870fd31-4f91-...
Disclaimer - interned on the team that built this
Absolutely. Until the voice assistants become a bit more intelligent with the language used to activate them, it feels so rigid and unnatural communicating with them.
I don't see a great future for voice assistants anytime soon, not until we truly solve the problem of intent (in a natural way) and can respond accordingly.
Despite voice assistants often seeming like a thin veneer of natural language processing on top of a search engine, I'm sure most users think of them as, or are willing them to be, a general AI agent. They don't want to have to care what app/skill/web-scrape is involved in enabling a response.
The whole skills ecosystem feels like an awkward stopgap on the path to AI. The language required to invoke them feels particularly clunky - "Alexa, ask ThingFinder about a thing" - and then, is the user supposed to be talking to ThingFinder now, or Alexa? She still sounds like Alexa, but she doesn't seem quite herself.
As a developer, choosing an invocation name is fraught with difficulty. There are only so many natural-sounding names for something which does a particular thing, without incongruously inserting some invented branding word into it - "Alexa, open Tidy Tide Tables". If someone's already using the most natural name you are free to use exactly the same one, but then who knows whose skill will be launched? And you'd better make sure your skill's name doesn't clash with anything else in the world at large, like the entire history of music for example: "Alexa, play Wicked Game". It's all a bit of a mess.
Yeah, that's a good point. People are good at mentally compartmentalizing behaviors with personalities. If Alexa doesn't allow each agent to have a personality, people are going to have a hard time keeping the behaviors straight.
Kinda reminds me of Neuromancer, where a superintelligent AI was broken into two pieces to avoid detection. One part was good at personality, and the other part had to mimic people in order to communicate. It was very unnerving for people to talk to an AI that was copying the personality of someone they knew.
Alexa is just an input/output device. It is limited, which is OK but that means there will not be a 'runaway hit'. There's just not enough of a surface area for something so comprehensive.
It's like saying 'There are 80k mouse enabled apps and no runaway hit'.
What it can do is have an app for nearly anything, though, that makes sense for the form factor. It's on its way. The next step of its evolution is to look at what apps work and what conventions can be pulled from those as a general standard. Users will be much happier when they can nearly instantly download an Alexa-enabled interface for an app and have it work intuitively. And this is doubly important for Alexa because there isn't deep feedback like you have with a mouse, where you can see the things you aren't doing to gather hints at what's possible -- you just have to know, or at least know how to find the answer, like a command line.
Which gives me an idea: 'man for Alexa'. At least we can standardize a help menu.
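There's a seed of this already: skills are expected to handle the built-in AMAZON.HelpIntent, so a bare-bones "man page" could be as simple as every skill answering it with its supported phrases. A sketch of a minimal handler response (plain Alexa response JSON as a Python dict; the COMMANDS list is hypothetical):

```python
# Sketch of a skill answering the built-in AMAZON.HelpIntent with its
# command list; the COMMANDS here are hypothetical.
COMMANDS = ["add a task ...", "read my list", "what's new"]

def handle_help_intent() -> dict:
    speech = "You can say: " + "; or, ".join(COMMANDS)
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": False,   # keep listening for a command
        },
    }
```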
But the mouse does have a runaway hit: Microsoft Windows.
That being said, it took 22 years to get from Engelbart's original mouse to the first version of Windows that really took off (3.0), so perhaps we're just too early on in this product cycle for the hit to have emerged yet.
I'd say Alexa is more like the birth of the personal computer. Computers weren't very good in the mid 1970's.
Now that companies realize there's a product here, there will be an arms race. Look for big improvements in the next decade. Lots of people and billions of dollars are about to go into making these products better.
> Alexa is just an input/output device. It is limited, which is OK but that means there will not be a 'runaway hit'.
I like the way you say that. My Google Home is a remote control for my mouth. It's like licking a keyboard one key at a time in the dark. Simple queries are easy, but any non-trivial query just ain't gonna happen at the moment.
I don't believe your comparison really works. A mouse is useless on its own. An Alexa device can operate on its own.
I do agree with your latter sentiment. It does feel like a command line at times. More so like trying to figure out a text adventure game. Zork would be very difficult if you didn't know the basic functions/words. That's what Alexa feels like most of the time for me. I'd love a help menu.
That's the thing about USPs...most are idealized possibilities not yet realized.
I mean, look at VR and AR and, hell, even AI.
But AI is not useless even though it hasn't reached its generalized-intelligence promise. It is adding tremendous value even though it has landed in the limited middle.
I don't mind learning Alexa's syntax and what it expects of me. I get value from what it can do well. As long as that's true, I think it can miss its more grandiose promises and still be a huge success.
Anecdotally, a developer friend of mine strongly suspects most people are like him and not actually using these home speakers very much.
I've been working on an audio app for iOS/Android that reads any article to you, and planned to bring it to the Echo, but he seems to suggest my efforts are totally wasted. Although their market reach seems to be massive, likely related to their cheap price, I'm not surprised by this article that their actual usage is incredibly low. Most people seem to buy them and then forget about them.
As a shameless self plug, if you would like to check out my app that reads articles to you using beautiful sounding AI/ML, find it here: https://articulu.com
Pretty much the same; the Wink integration lets us turn lights on/off and do other things like “Turn on Movie Theater”, which dims certain lights and turns off all others.
We use Alexa to play music when we just want something going in the background as well. Additionally we use timers and the grocery list.
Wife uses the daily news rundown.
Kids ask Alexa questions (what’s the fastest bird), have it make fart sounds, and that’s about it.
I got a pair of buzzers to play the quiz game; it was so horribly janky they've been used twice. I scan the apps list and nothing strikes me as worth even trying.
I use the flash briefing thing every day. I suspect most people don't know about it. I wake up, stumble into the shower and say "Alexa, play the news" on the way. By the time I finish brushing my teeth I've got my curated news read to me. Pretty nice.
As a counterpoint, thanks to an Echo in most rooms of my house and Hue lights everywhere I almost never touch a light switch. Likewise I generally use the Echo timers instead of oven timers for cooking, and I like using them as Spotify speakers. And I can't remember the last time I left my house before I asked the Echo about the weather. Sometimes my wife even uses it as an intercom if I'm in my office, but I concede that feature is a bit janky.
Whenever my wife and I go away we usually remark to each other that it suddenly feels very backwards to have to do things without the Echo. That might not be the best word, but it definitely feels like we're missing something integral to our home life when we don't have them around.
That being said, it's expensive to set up Hue lights everywhere and an Echo in every room. My setup might be one of the reasons I use it so much.
My favorite behavior was setting up a lamp near the downstairs landing with a smart bulb, and a motion sensor in the upstairs hallway/landing (through SmartThings). When someone approaches the stairs from either direction, the lamp flips on with a low brightness (in red), allowing easy navigation of the stairway.
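The rule itself is trivial once the sensor and bulb are reachable; a generic sketch (set_lamp and on_motion are hypothetical, standing in for whatever the SmartThings automation actually exposes):

```python
# Hypothetical event handler: approach the stairs from either end and
# the landing lamp comes on dim and red.
def set_lamp(on: bool, brightness: int = 100, color: str = "white"):
    print(f"lamp on={on} brightness={brightness}% color={color}")

def on_motion(sensor_id: str):
    if sensor_id in ("upstairs_landing", "downstairs_landing"):
        set_lamp(on=True, brightness=10, color="red")

on_motion("upstairs_landing")  # -> lamp on=True brightness=10% color=red
```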
I think the problem with your app on Alexa is the interaction model - if your app could push to my Alexa so I can open an article on my phone and then say "Alexa read this article", or use the iOS share menu and share to your app on my Alexa that'd be pretty nice. But I don't see myself saying "Alexa, tell Articulu to read ...." if that's the interaction model it'd force.
Playing songs/podcasts from Spotify on Echo while controlling it from my phone/computer is my favourite use of the speaker now. Ironically this action involves no voice commands whatsoever except occasionally telling the speaker to pause/resume/adjust volume when I'm away from my phone/computer.
I use the same features in google home but long for proper integration with Google Keep.
I had used Home to populate my shopping list, but now that has disappeared from the Home app and you have to go to shoppinglist.google.com.
I don't understand why I can't tell Home to add an item to a specific Keep list.
I agree with you -- it's a fad product with limited utility.
The inability of any of these solutions to identify the user in a meaningful way or interact usefully without involving the whole room kneecaps the utility of the products beyond actions that you take in a public space.
Turning on/off lights, wireless speaker and replacing the landline are long term probably the killer apps. Not trivial, but no smartphone either.
I am surprised there isn't a demo available on the site, at least there wasn't an obvious one I saw. I imagine your conversion rate would improve from that landing page if customers could hear a sample!
Yeah, I really wonder if these speakers are a sort of fad that folks feel is super cool and gravitate to ... and pretty quickly find there isn't a lot of use for.
Alexa needs a PageRank and an 'I'm feeling lucky' that is successful the majority of the time.
I would like it to work as a room of human experts works with a moderator.
Alexa, how long will it take me to cycle to Netto?
Alexa sends this parsed query out to all apps it thinks can answer.
They respond with a confidence level and answer / follow up question.
Alexa decides which app to choose (the 'PageRank') based on a number of factors like has the query been seen before and how did each app perform for it, the app's answer success rate, has the user been given a response from this app before etc. etc.
The user should not have to install apps. Alexa should know all of the apps in the room and what they're good/bad at and select one to respond.
Surely discovery is the wrong way to think about this problem and Alexa needs to be elevated from a good speech parser.
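A toy sketch of that moderator, just to make the idea concrete (everything here is hypothetical; trusting self-reported confidence, security, and latency are exactly the hard parts it glosses over):

```python
from dataclasses import dataclass, field

@dataclass
class App:
    name: str

    def bid(self, query: str):
        # Each app self-scores the query: (confidence, answer).
        return 0.7, f"{self.name}: about 12 minutes"

@dataclass
class Moderator:
    # (app name, query) -> how well that app did on it before
    history: dict = field(default_factory=dict)

    def route(self, query: str, apps):
        bids = []
        for app in apps:
            confidence, answer = app.bid(query)
            prior = self.history.get((app.name, query), 0.5)
            bids.append((confidence * prior, app.name, answer))
        return max(bids)  # the 'PageRank': best combined score wins

m = Moderator()
print(m.route("how long to cycle to Netto", [App("maps"), App("cycling")]))
```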
The “canfulfill” API for skills is available: https://developer.amazon.com/docs/custom-skills/implement-ca... Alexa scientists are constantly working on improving the arbitration experience, but it's a very difficult problem, especially due to security concerns.
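Roughly, a skill opts into arbitration by answering a CanFulfillIntentRequest before any routing happens; a simplified sketch of the response shape from those docs (the intent name here is a made-up example):

```python
# Simplified shape of a CanFulfillIntentRequest response per the docs
# above; "AddTaskIntent" is a hypothetical supported intent.
def handle_can_fulfill(request: dict) -> dict:
    intent = request.get("request", {}).get("intent", {})
    can = "YES" if intent.get("name") == "AddTaskIntent" else "NO"
    return {
        "version": "1.0",
        "response": {
            "canFulfillIntent": {
                "canFulfill": can,
                "slots": {
                    name: {"canUnderstand": can, "canFulfill": can}
                    for name in intent.get("slots", {})
                },
            }
        },
    }
```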
> They never say "Hey, we know you tried to ask your Echo to report its volume level before and it didn't work, but it does now."
"Hey, 3 months ago you asked for the forecast according to the ECMWF weather model, and I couldn't answer you. But now I can, so go ahead and try!".
> It's the large mass of Alexa skills that were quickly made and that don't have privileged access that are dragging the whole experience down.
And this is entirely Amazon's doing.
> More so like trying to figure out a text adventure game.
Edit: there's also a third-party skill that will let you play old-school adventure/interactive fiction games. https://www.amazon.com/Vitaly-Lishchenko-Interactive-Fiction...
* Setting multiple alarms (why can't apple do this??)
* Conversions
* Food questions
I think it would be rad if I could feed it a recipe and then have it read me the ingredients and instructions for each step. Maybe that exists?
I use it for:
- timers (if I have to leave in an hour but only need 30 minutes to get ready, a 30-minute timer)
- pizza / food
- short naps
- learning
And quite often.
We also use Audible heavily. The kids love the Boxcar Children readings.
> You can't actually have a 'slot' which contains a freeform piece of text, even if it is the last piece of text in the sentence.
What about AMAZON.SearchQuery?
> I normally have it on mute because I find it activates and starts recording private conversations.
What?!
I can't imagine Amazon would hand the "Alexa, order more bread" keyword over to Instacart/Wal-Mart/whoever without a fight.
> It's like licking a keyboard one key at a time in the dark.
Ewww. Good analogy, but eww.
> An Alexa device can operate on its own.
Only if you conveniently ignore the AWS services it uses.
Can it?
I don't know how much I'd use Alexa if it weren't for Spotify. There are some native apps but its real value is in the 3rd party apps it connects to.
Those are by far the killer features for me: why I use it and why I like it.
I haven't looked at the Amazon skill store, as I don't see any reason to.
> As a shameless self plug, if you would like to check out my app that reads articles to you using beautiful sounding AI/ML, find it here: https://articulu.com
EDIT: Is there a way to buy the app, or do I really have to rent it for $60 a year?