I don't think he touched the scroll wheel in the keynote. Why is it there? What does it do? I found it awkward that he scrolled by touching the screen, reaching right across the damn scroll wheel. Just use it. At least in the keynote.
And then there are so many questions about the hardware capabilities of this device:
- where is the inference running? I don't believe it's on the device. And if it's in the cloud, why make the claim it's under 500ms? Is that just a "don't you guys have low-latency 4G at home?" moment?
- how is that tiny camera capable of parsing a small-text table with 100% accuracy? The optics just don't allow it.
- what's the battery life with that kind of usage? If inference is running on device it must be very low. Considering that my GPU pulls 50-100W on average (with spikes to 200W) just to suggest code, I still don't think rabbit is doing anything on device. If it's cloud based, then 4G is also a battery destroyer. Maybe that's why the device is so big: a huge battery inside.
The "teach" session was definitely cool. But at this point it must be magic because there's no way that thing browsed to a discord server, authenticated with hallucinated credentials and it just worked.
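The battery-life point above can be sanity-checked with rough arithmetic. Every figure below (cell capacity, per-component power draw) is an assumption for illustration, not a known spec of the device:

```python
battery_wh = 1.0 * 3.7     # assumed 1000 mAh cell at 3.7 V ≈ 3.7 Wh

local_w = 4.0 + 1.0 + 0.5  # assumed on-device inference: busy SoC + radio + screen
cloud_w = 0.3 + 1.5 + 0.5  # assumed cloud inference: idle SoC + busy 4G modem + screen

# Runtime is just capacity divided by draw.
print(f"on-device inference: {battery_wh / local_w:.1f} h")
print(f"cloud inference:     {battery_wh / cloud_w:.1f} h")
```

Under these assumptions neither mode gets past a couple of hours of continuous use, which is why a physically large battery would make sense.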
From
> rabbit OS operates apps on our secured cloud, so you don’t have to. Log into the apps you’d like rabbit to use on your system through the rabbit hole to relay control. You only need to do this once per app.
it seems that this device doesn't run anything locally, everything is in the cloud.
> it seems that this device doesn't run anything locally, everything is in the cloud.
For apps w/o a web interface, presumably they have an Android emulator running somewhere, and they're controlling it with their LAM and feeding the results back to end users. They can even feed sensor data from the Rabbit device (e.g. GPS) to the emulator.
It is a brilliant solution to the problems they are facing, but their cost model must be extreme. "No subscription required" means I am wondering what their "pay as you go" pricing will be!
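If they really are relaying sensor data into a hosted Android emulator, the plumbing for that already exists: the emulator console accepts injected state, e.g. position via the real `adb emu geo fix` command. A minimal sketch of one relay step (the relay architecture itself is speculation; this just builds the command line):

```python
def geo_fix_command(longitude: float, latitude: float) -> list[str]:
    # The Android emulator console takes "geo fix <longitude> <latitude>";
    # "adb emu ..." forwards a console command to the running emulator.
    return ["adb", "emu", "geo", "fix", str(longitude), str(latitude)]

# One step of the hypothetical relay: the handheld reports a position,
# the backend injects it into the emulator before the LAM acts.
cmd = geo_fix_command(-122.084, 37.422)
print(" ".join(cmd))  # adb emu geo fix -122.084 37.422
```

In a real relay loop you would run this with `subprocess.run(cmd)` each time the handheld reports a new fix.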
>I don't think he touched the scroll wheel in the keynote. Why is it there?
Same vibe as Humane not using the projector for almost anything and then explicitly saying "You don't have to use it" in an interview. Like, why add it then?
It's using a MediaTek Helio P35, a SoC from 2018. It's slower than an iPhone 6S.
There's no 5G support either, unless they've purchased an external modem. So even if you do live in an area with low-latency 5G, you aren't gonna get it.
I wouldn't be surprised if this turns out to be running just a cheap Android fork that's locked in kiosk mode with a single app (or webapp).
Thank you, this is an important detail. Teenage Engineering isn't mentioned anywhere on the linked page, so I was confused about the HN title. I figured Rabbit was a new venture by TE.
I think the hardware is just a gimmick. It gives the proposition more 'body' and is great for marketing.
But really, this is screaming to be an app. There is literally nothing this thing can do that the phone in all of our pockets can't. Especially integrated with our smartwatches for ease of use. So why have a separate device for it? It makes no sense.
I think the hardware is just there to make it not 'just another app' in the press, really. And that the app version will come soon after the physical product releases and will take over 99% of the installed base.
And of course if this takes off they'll be quickly acquired by Apple, Microsoft, Google or Samsung. I bet that's what they're aiming for.
I don't think the hardware is the main product. I think the AI is, but they didn't want to be "just an app"; they want to be the first OS for the new way of computing. So they designed a new device. I wouldn't be surprised if they open up to OEMs to start making all kinds of devices.
I’m extremely bearish on this app. But Pininfarina basically saved Ferrari, revived Ferrari’s little brother Maserati and gave life back to Volvo. If done right (clearly defined roles, collaboration, design maturity etc) it works well.
I'm not sure people want an LLM assistant that can actually do things like spend their money.
A minor convenience when it works well. A major inconvenience when it doesn't. (And it doesn't take a lot of imagination to come up with nightmare scenarios.)
Love the gadget, though. I feel like I want to eat it or build a Lego castle around it, or just look at it more.
How do you know the LLM is going to do what you confirmed?
There's a fundamental tension here: either you limit the LLM to a set of fixed actions that a user can individually understand and confirm. Or you let it figure out what to do and how to do it given a higher-level goal.
In the first case, it's limited and not really better than a well-designed site or app. In the second it's powerful but can run amok.
E.g., did it really fill out that four-page order form for that e-bike you asked it to order? Maybe it used the debit card instead of your credit card and now your checking account is overdrawn. Maybe it set the delivery address to your brother's address. Maybe it ordered two bikes, or ten, or the wrong model or the wrong size.
OR
It asks/confirms each step of the way so there's little chance for mistakes.
This is an inherent problem with any kind of delegation. You either micro manage the details or trust your agent to get them right.
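The "asks/confirms each step" end of that tradeoff can be sketched as a gate in the agent's execution loop. The `Action` shape and the approval callback here are invented for illustration:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Action:
    description: str
    irreversible: bool

def run_plan(actions: list[Action], approve: Callable[[str], bool]) -> list[str]:
    executed = []
    for a in actions:
        # Reversible steps run freely; anything irreversible needs a yes.
        if a.irreversible and not approve(a.description):
            break  # refuse one step, stop the whole plan
        executed.append(a.description)
    return executed

plan = [
    Action("fill in delivery address", irreversible=False),
    Action("pay $1,499 with card ending 4242", irreversible=True),
]
print(run_plan(plan, approve=lambda d: False))  # stops before the payment
```

The micromanagement cost is visible right in the signature: someone has to sit behind `approve` for every consequential step, which is exactly the convenience the agent was supposed to remove.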
$199 with no subscription? You must have to bring your own data SIM card. Or can it get its connection from your phone? And is the AI stuff running locally on the device? Impressive if so. Suspicious if not; there's no way the service remains free forever.
Pretty compelling price, and I'm certain the vision of AI agents that can use any existing app or website to take actions on your behalf is the future of computing. But there's no room in my pocket for a second device. I don't see how a device is going to succeed when an equivalent app for existing phones seems around the corner.
It's indeed suspicious. You're sending your voice samples, your various service accounts, your location and more private data to some proprietary black box in some public cloud. Sorry, but this is a privacy nightmare. It should be open source and self-hosted like Mycroft (https://mycroft.ai) or Leon (https://getleon.ai) to be trustworthy.
It's not a matter of "catching on". There are smaller models and they have been looking into putting useful models on device from day one.
It's just the fact that the models that can currently run on a mobile device are not effective enough. It is about the most memory and compute intensive type of application ever. In particular, models that can reason or follow instructions reliably and for general purpose are too big to run quickly on mobile devices.
They are putting models on phones, but they do not have general purpose assistant capabilities.
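A back-of-envelope calculation shows the size problem. Taking 7B parameters as a rough floor for reliable instruction-following (an assumption, not a measurement), the weights alone come to:

```python
params = 7e9  # assumed 7B-parameter model
for bits, label in [(16, "fp16"), (8, "int8"), (4, "int4")]:
    gib = params * bits / 8 / 2**30  # bits per weight -> bytes -> GiB
    print(f"{label}: {gib:.1f} GiB")
```

Even aggressively quantized to 4 bits that is over 3 GiB of weights, before activations or KV cache, on a phone that also has to run everything else in the same RAM.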
> the industry is only starting to catch onto on-device language models.
I mean, there are technical limitations and tradeoffs to running LLM-size models locally. It doesn't help to ascribe it to lack of foresight when it is a known Hard Problem.
In the linked keynote, Jesse Lyu mentions that LLMs won't help us actually do tasks - there are currently no so-called "agents" that can do something as simple as booking a flight; the best way to do it is still to click the buttons yourself.
Rabbit means to solve that by creating a "LAM", a "Large Action Model", which is a service by Rabbit that will click interfaces for you. I'm not sure this is the right approach - if it is successful, it will lead to more centralisation around Rabbit.
I agree this is a problem, but I feel a better approach would be a market of agents that, for a small fee, actually handle the whole transaction for you. So there might be multiple parties that say they can buy Delta flight DL101 tomorrow at 21:10, at various prices - some might be a service like the Rabbit LAM, others might be booking platforms, and there might even be airlines themselves. And now an agent-concierge that you choose once at the start will look at all the parties, then pick and buy the right flight for you. This turns it into a problem of an open market, where good, speedy service is promoted and prices keep falling. And if the Rabbit LAM gets outcompeted by an even better, speedier solution, that would be a good thing. (This would also let us move away from our current dreaded attention-based economy, where e.g. a booking website tries to exploit your required presence during waiting times - the LAMs would also solve that, but, like I said, let's not move towards more centralisation.)
> Rabbit means to solve that by creating a "LAM", a "Large Action model", which is a service by Rabbit that will click interfaces for you. I'm not sure this is the right approach - if it is successful, it will lead to more centralisation around Rabbit.
The LAM is a genius hack to get around the thousands of walled gardens that apps have created.
It also may have been easier than teaching an LLM how to make tons of API calls, and if done right, I presume their LAM adapts to UI changes, vs. writing integrations against breaking/deprecated APIs.
90% of use cases will be covered by an official API.
They’ll cover the other 10% with “teaching” - essentially you telling the AI what the lazily written markup actually means. Then they save it into an automation template. QA teams have been doing that for the better part of three decades.
I know a company that employs a building of 1,000 people doing nothing but performing one click. So they put a human in the scraping/automation loop so they don't violate the site/service's TOS.
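The "teach once, save as template" idea can be sketched as recorded steps with parameter slots that get filled in at replay time. The selectors and template format here are invented for illustration:

```python
# A recorded "teach" session: (operation, CSS selector, value-or-None),
# with {placeholders} left where the user's parameters go.
search_flight = [
    ("fill",  "input#origin",      "{origin}"),
    ("fill",  "input#destination", "{destination}"),
    ("click", "button.search",     None),
]

def render(template, **params):
    # Substitute the user's parameters into the recorded steps.
    return [(op, sel, val.format(**params) if val else val)
            for op, sel, val in template]

steps = render(search_flight, origin="SFO", destination="JFK")
print(steps[0])  # ('fill', 'input#origin', 'SFO')
```

A driver like Selenium or Playwright would then execute each rendered step against the live page; the template itself stays dumb and replayable.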
But if we're not careful this will circle back to apps/silos.
What I'd like to see is the Smalltalk approach: data providers that are able to send/receive messages, and can be connected together to achieve a goal. Even better if the connecting is done by the "machine" after I issue a command.
It's been such a long year. I still remember the month of gpt... what was it, not gpt4all... gpt... ah, whatever. The "running an LLM in a loop will solve it" approach. I'm not a big fan; I'd need to see something truly transformative.
> Rabbit means to solve that by creating a "LAM", a "Large Action model", which is a service by Rabbit that will click interfaces for you.
https://openadapt.ai is an open source app that runs on your local machine and clicks interfaces for you - but only for repetitive tasks that you show it how to do.
QA teams have been doing this sort of stuff for decades. With a little know-how and an hour, you could record a user doing something in the DOM and play it back. There's no magic here.
This is exactly what Siri wanted to be. Compare with the original Siri keynote https://vimeo.com/5424527
I can't find the exact ~2010 article from before being bought by Apple. I remember in an interview they were talking about making a web agent that could operate and perform tasks on any website, to avoid being locked out by APIs.
I'm very interested in what Apple does with LLMs on iDevices.
They have the right hardware for it and they have all the motivation, with their focus on on-device processing. OTOH they also have a pretty bad history with their AI assistant.
I've been surprised that Apple hasn't done more to keep pushing beyond the app boundary. The primary pitch for Rabbit from the keynote is basically "it's a layer that sits atop the broken model of modern phones." I think we all agree that the 'evolved state' of phones and apps is disappointing compared to where it could be / where we expected it would go.
They (Apple) are now in the position of being seen as laggards, caught with their pants down by companies releasing products whose core conceits are built atop inefficiency of their "core" models.
For both Humane and Rabbit, it's hard for me to imagine that there's enough "there" there for these to be beyond niche products that don't get starved out by the border-expansion of Apple / Android over the next few years … but I would have also guessed that A+A would have been further out front of this.
What's the advantage of this vs. a smartphone? Realistically, are you going to carry two gadgets with you? I feel like a lot of people don't like the smartphone because, being a general-purpose computer, it takes away the excuse to buy all sorts of different gadgets.
Having a purpose-built device reduces the number of clicks and other friction. Currently, phones are hostile to AI interaction (not that I'm entertaining this specific device).
This is a product sold by Rabbit - and they hired TE to help as a design agency.
https://www.theverge.com/2024/1/9/24030667/rabbit-r1-ai-acti... (via https://news.ycombinator.com/item?id=38933819, but no comments there)
https://techcrunch.com/2024/01/09/can-a-striking-design-set-... (via https://news.ycombinator.com/item?id=38932575, but no comments there)
https://www.tomsguide.com/reviews/rabbit-r1
> the models that can currently run on a mobile device are not effective enough

This may change in the next few years though.
Actually, the only device with such features is a recent smartphone. Wouldn't this be better as an app?
Ah, I already have ChatGPT installed; that'll do :-).
> 90% of use cases will be covered by an official API.

Good luck with that.
https://github.com/joaomdmoura/crewAI
This seems to be a Langchain wrapper, where the Langchain is a prompt + retrieval based on a few documents.
ex. `https://github.com/joaomdmoura/crewAI-examples/tree/main/sto...`

```
BrowserTools.scrape_and_summarize_website,
SearchTools.search_internet,
CalculatorTools.calculate,
SECTools.search_10q,
SECTools.search_10k
```
It’s the dedicated appliance that’s the genius part and what makes it feel like magic.