I think the thing they want to communicate here is the training method. Rather than a pick and place tool path, this is just trained on videos of someone using a coffee machine, and imitating that.
Which… now that I think about it… is probably drawing a skeleton over top of the video Kinect-style, and… drawing a tool path…
But at least it seems to understand when it’s made a mistake, which is interesting.
Right I mean, for B2B industrial robotic arm applications this might actually be a big improvement / breakthrough.
I just find it annoying how many of these AI companies are pumping goofy B2C applications.
Like they make it look as though they are building a breakthrough humanoid robotic assistant or something, but really that is still very very far away and not at all related to the "video trained robotic arm" breakthrough they are trying to communicate.
I don't find this particularly impressive either. Anything except a pod machine would likely present the robot with tasks it can't handle, e.g. handling actual beans or ground coffee and filling water. And most household tasks involve this kind of not-entirely-straightforward step, because your house and its contents aren't designed with limited robots in mind.
This perfectly queued-up Keurig use case is fundamentally unimpressive because I'd expect my 5-year-old nephew to pass it after 1 minute of verbal direction or watching me do it once, not 10 hours of deep learning.
Now expand the Keurig use case to the real world.
Present it with a machine:
* filled / not filled with water
* used pod bin empty / used pod bin filled
* pods available / pods in a box on counter / pods in a box in the cabinet
* cup on counter / cup in cabinet / cup in dishwasher / cup in sink
We now have ~50 permutations of even the simple Keurig use case and let me know how many years of training to accomplish even this. All of this to make fairly bad coffee.
We haven't even gotten into your example of actually making real coffee.
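The back-of-envelope permutation count above can be checked with a quick enumeration. The labels below are just the states from the bullet list, not anything the company has published:

```python
from itertools import product

# State dimensions from the list above (illustrative labels only)
water = ["filled", "not filled"]
pod_bin = ["empty", "full"]
pods = ["on counter", "boxed on counter", "boxed in cabinet"]
cup = ["on counter", "in cabinet", "in dishwasher", "in sink"]

# Cartesian product of the independent dimensions
scenarios = list(product(water, pod_bin, pods, cup))
print(len(scenarios))  # 2 * 2 * 3 * 4 = 48 distinct starting states
```

So "~50 permutations" is right: 48, before you even vary the machine model or the counter layout.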
From 10 hours of video, it learned to pick up a pod carefully placed on a counter in front of it, insert said pod into a Keurig machine also carefully placed on the counter in front of it, and push the button.
From the hype, I was expecting it to pass the full Woz test:
1. Walk into a strange house and find the kitchen.
2. Find whatever coffee is available, including knowing likely places to look. Choose the best among multiple options based on the best available information about the drinker's likely preferences.
3. Find whatever coffee maker is available, get it out without screwing anything else up, assess its condition, clean it if-and-only-if necessary, plug it in, figure out its controls.
4. Similarly find filters or other supplies if necessary.
5. Grind the coffee if necessary to match the coffee maker and the desired results.
6. "Make the coffee" in the sense these guys are trying to claim is a breakthrough.
7. Serve it, or at least put it in a cup and tell the user.
8. Put everything away and clean up.
9. Take any available feedback on making it better next time.
I did a quick search for anything at all with more detail than the Twitter post, and couldn't find anything. They sure do have a ton of hip marketing imagery. But they also seem to have some clued-up engineers. So who really knows.
To me, there's only one metric that matters for humanoid robots: Can it earn its keep?
A $30K robot with a 10-year useful lifetime, financed with a 10-year 5% mortgage, costs about $10.40 per day. Minimum wages range from $7 to $15, so the US national average is roughly that daily cost per hour. If your robot can do 1 hour of minimum-wage labour per day, people will buy it.
This is a pretty low bar. Light cleaning (dusting, wiping down surfaces, vacuum), loading the dishwasher, doing laundry, tidying up after kids, etc. These tasks don't demand AGI, agility, adeptness, speed, accuracy, etc.
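The amortization figure above can be sanity-checked with the standard annuity formula, assuming monthly payments and monthly compounding (the parent comment doesn't specify the compounding terms, so this is one reasonable reading):

```python
def daily_cost(principal, annual_rate, years):
    """Amortized daily cost of a fixed-rate loan with monthly payments."""
    r = annual_rate / 12            # monthly interest rate
    n = years * 12                  # total number of payments
    monthly = principal * r / (1 - (1 + r) ** -n)
    return monthly * 12 / 365       # spread annual payments over the year

print(round(daily_cost(30_000, 0.05, 10), 2))  # -> 10.46
```

About $10.46/day, so the $10.40 figure holds up, and the "one hour of minimum wage per day" bar follows.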
Do keep in mind that a comparison like that doesn't consider the null case, where someone dusts their own home for free, loads the dishwasher, and trains the kid-AI to clean up its own messes with thoughtfully deployed apple-sauce-snack-based positive reinforcement training.
I do agree these tasks don't demand AGI. I don't really want my "robot that walks around and does $11 worth of tasks I don't want to deal with" to feel emotions.
Disappointed. The robot is not "making coffee". It is feeding a capsule to a machine and pressing a button.
I expected the robot would be grinding beans, filling the coffee holder, tamping, fitting the holder to a steam machine and pulling an espresso for a certain time.
I could even understand fitting a coffee filter to a holder, dropping ground coffee and dripping hot water until a certain level is reached.
Those things require a lot more decisions than "open machine, drop capsule, close machine, press button".
The robot is not "making coffee".
It is placed in front of a Keurig, coffee cup already in place, water already filled, powered on, with a pod in front.
It is doing essentially 3 things - pick&place pod, secure latch, press go.
You could queue up a dozen kitchen use cases that individually look impressive if you don't think about them at all.
But they are just rhyming variations of "pick&place, then hit the button".
Can it truly learn from video of humans performing an action and replicate it?
During “training”, what feedback does it use, and what is the goal function to signify success?