For all the reasons that this might not take off, what a thrill that people are trying something new--and it looks really nicely designed too.
I think this is easy to dismiss at first glance, but I genuinely believe they're trying to think about a new mode of interaction. The idea that "the computer will disappear" is probably accurate in the long term. Except for content delivery (reading, photos, movies), most tasks we achieve via computers and phones do not strictly require a screen. It's probably a good thing if computers did a better job of getting out of the way and stopped so loudly disrupting human interactions.
Whether this will be the solution is unclear; the privacy/creepiness angle is still real with an outwards-facing camera. Latency and battery life limitations might be too significant. The cost will be a non-starter for many (it is for me).
But I'm still impressed because there was a vision here. The conversational interface has never worked before for many reasons, but that does not mean it cannot work in principle, or that the ideal implementation would not be spellbinding. I'm glad they're trying. Also, the laser display is neat!
First, I’m really excited people are trying new things, but I won’t be buying this just based on the demo.
> The conversational interface has never worked before for many reasons, but that does not mean it cannot work in principle […] I'm glad they're trying. Also, the laser display is neat!
So I did a lot of work over the years researching voice UI/UX, and I'm very skeptical about this, even with the LLM stuff. I think an LLM was the piece missing from the Siri/Alexa era to transform it from "audio CLI" to "chat interface", but there are a few other reasons it didn't catch on.
The information density and linearity of chat, voice especially, is a big problem.
When you look at a screen, your eyes can move in two dimensions. You can have sidebars, text organized into paragraphs, buttons, bars, etc. Not so with chat: when you add linearity (you can only listen to or read one thing at a time; a conversation can only present one list at a time), it becomes really slow to navigate any sort of decision or menu tree. Mobile-first design has simplified this, of course, but it's not enough. TTS read-aloud is even slower for finding the info you care about. Voice has found a place for simple controls (smart home, media, timers, etc.) and simple information retrieval (weather, announcing the doorbell, reading the last text). Then there's the obvious problem of talking out loud in public, false activations, etc., which are necessary evils of a voice UI.
I think the best hope for a voice device like this is to (as they've done) focus on simple experiences like "what did I miss recently" and hope an AI can do a good enough job.
The laser display might help with presenting a full menu at once (media controls being an easy example), but it probably will end up being a pain to use (eg like a worse smartwatch).
Honestly though, my biggest hesitation (which could end up great) is the “pin” design. It’s novel, especially with the projector, but how heavy is it and how will that impact the comfort of my clothes? What about when wearing a jacket or scarf? Will this flop around while walking? Etc.
There is also a lack of serendipity or explorability with voice: how do you know what's possible? There is a reason a GUI menu is called a menu. It not only gives you access to multiple options but also, at a glance, an overview of what options are available, like a restaurant menu.
It'll flop everywhere, not just while walking. Boom boom.
But yeah I've been thinking that too. "Oh, put my coat on - better spend 30 seconds messing around with my pin" [...] "Ahhh back in the office. There goes another thirty seconds moving the pin so it can film me looking at a screen for four hours"
And yeah, I feel like the weight would definitely pull my jumper or t-shirt out of shape, and make things like my collar/neckline look out of whack. Maybe they'll bring out a range of clothes suitable for it, or suggest you wear a coat indoors like the woman in the video is doing.
Linear conversation is a big problem for anything beyond simple, casual usage. It is the reason that YouTube is a terrible research platform. Is the information you want inside that 3-hour video? Possibly, but with text I can search an article for content or skim sections to determine if it's worth a deeper read.
Let's not forget the value of non-linear input. Good search terms are often constructed rather than spilled forth. Sometimes I enter search terms, read them back, and realize they're likely to return unrelated results and need modifying. By the time I realize this while speaking to an AI, it's already spitting out the wrong information.
This leads to a need for altered interfaces that allow these scenarios to be accommodated. This is v1.0. Let's see where it goes.
If a science fiction author were writing it, the need for stiffer fabrics to support chest cameras would synergize with a neo-Victorianism in Generation Alpha. (Formal button-up shirts and higher necklines for enforced modesty.)
IMO, with LLMs we won't really need information density except for certain classes of people.
Even now - clicking through some insurance company's website hierarchy to find something out is insanely painful.
But even for researching things that we should probably care about enough to do ourselves, correlating different sources of information, or working through abstract/ambiguous problems... the vast majority of ordinary people will 100% take the easy way out and let LLMs do most of the thinking for them. Even with free GPT-3, people are unflinchingly having LLMs solve problems they don't want to think about too deeply. What they give up in occasional inaccuracy is more than offset by convenience.
> most tasks we achieve via computers and phones do not strictly require a screen.
X (doubt). There are unfortunately only five senses through which our brains can interact with the outside world, and vision is the most information-dense and the easiest to utilize. The screen isn't going away anytime soon.
Projectors to me are the same as screens; they've been around just as long.
Though I do look forward to direct computer-brain interface, like introducing a 6th sense.
It's funny, I see people cover up the webcam on their laptops all the time, but not their phones. They forget that there's a camera on both sides of the phone.
And especially since we can now make cameras small enough that you'd never know they were there. The OVM6948, roughly the size of a "grain of sand", is commercially available.
I've always said that privacy is an illusion, the usual example I give is: "You're lying in bed with the curtains drawn, you see a shadow fall across the curtains that looks like a person standing outside. Do you, or do you not have privacy?"
If the shadow turns out to be a person peeking through the curtains, then you don't. If it turns out to be your primal brain plus a tree's shadow, then you do. Schrödinger style.
Privacy is probably best described (as it sometimes is) as a "sense" of privacy I guess.
> It's probably a good thing if computers did a better job of getting out of the way and stopped so loudly disrupting human interactions.
And that is not this. Talking out loud every few moments to give a device verbal commands is way more annoying than someone looking at and typing on a phone.
That said, I agree with you that at a glance it's neat. In reality, though, I think it's a poor idea given how often people need to give a verbal command.
The talking out loud, I agree, is problematic. The Bluetooth functionality and increasingly good audio passthrough give me hope for a simple earphone in one ear, and eventually... this: https://x.com/ruohanzhang76/status/1720525179028406492
Also bullish on hand gesture control. Maybe most stuff will eventually become jutsu level fancy hand movements lol.
What a time to be alive. It is easy to remain grateful in this age of rapid progress.
The problem is the voice-based approach: it won't work reliably in loud environments, it won't be usable in a doctor's waiting room, libraries and other quiet environments, and some people simply don't like voice UIs.
If you want the computer to disappear, why not a better smartwatch? Or glasses, this time without the sci-fi gadget look? Both could support the exact same feature set, but with a screen.
> The idea that "the computer will disappear" is probably accurate in the long term.
Why, though? A computer requires attention, which pretty much rules out doing something else while using it, except perhaps when passively listening to a podcast (which doesn't really qualify as computer use). Even though we may see new mediums, the mode of interaction will remain similar to that of a book.
Haha in the demo he asks "when is the next solar eclipse and where is the best place to see it?" - The AI responds correctly that it's on April the 8th 2024, but then clearly hallucinates like crazy and says "the best places to see it are in Exmouth, Australia and East Timor" which is totally incorrect - this eclipse will be visible only in North America, and invisible in Australia and East Timor. Good job he didn't ask it to book flights to Australia on the 7th of April.
You'd think your tech demo would check to see if your AI was hallucinating!
I think people are very eager to believe AI is useful in ways it often isn’t. Kind of like the crypto hype. People were willfully ignorant to how little sense it made in so many contexts. The common thread here is “yeah, but money!”
I’m not an AI detractor. I use it and really like it. I just don’t like it for information like this. Anything where the response needs to be verified yet is very brief makes no sense to bounce off of an AI, in my opinion.
Biggest issue is that people hate talking to computers in public.
Alexa came closest to achieving significant usage, since you can use it within the privacy of your home.
For voice UIs, the unclear boundaries of what it can or cannot do are also a huge hurdle. After you get a couple of "sorry, I cannot do that" responses, you stop using it.
Yeah, unless the utility of this device is large enough to override existing cultural norms, there are actually very few venues where it feels "comfortable" to voice-interact with a device.
I went through this exercise with GPT voice. It's an awesome capability, but other than perhaps walking outside, or sitting in my office, there's no other space where it feels "ok" to just spontaneously talk to something.
A grey area is when you perhaps have headphones in / on and it looks like you're in a phone conversation with somebody, then it kinda feels ok, but generally you're not going to take a phone conversation in a public area without distancing yourself from others.
There's a reason most casual communication these days is text rather than voice or video calls.
> it looks like you're in a phone conversation with somebody
Even though everyone's seen AirPods by now, in those rare occasions when I'm on the phone in public, I feel compelled to have my phone out and vaguely talking at it, so it's clear I'm on a phone call and not a crazy person.
I'm curious if we would see similar usage with the pin, where voice commands in public are always performed with the hand up for the projection screen (it will still prompt looks, but hopefully be clear in context, "oh they're doing some tech thing").
Of course, at this price point, it's highly dubious that we'll see anywhere near the ubiquitous market penetration of AirPods (which garner understandable complaints about price even below $200, and that's with a clear value prop).
The weirdness is caused by the incantation all these things have. Once you can just talk to the AI without doing anything, just talk to it, it'll catch on very easily.
I agree, BUT I think it's going to get a lot better soon. E.g., I loathe Siri because it felt like there was always some incantation I had to remember. Like a very terrible CLI. LLMs, though, even if we never get intelligence right, can help this area significantly, I think.
Combine that with areas like GPT Vision, (GPT?) Whisper, etc., and it'll start feeling a lot more natural here very soon, I suspect.
TBH I'm surprised Apple isn't pushing this much harder. They tout Siri so hard, but it's just worthless to me. It feels like Apple could make an AI Pin like this, but from the public side I have zero indication that they're even working in this space. It feels like they purposefully watched the boat sail away.
edit: Sidenote, Pin + AirPods would be a nice way to interface more quietly too.
Google Assistant has been years ahead of Siri and Alexa for a good while now. I've been able to give it really loose, sloppy commands, even stuttering or backtracking on my sentences, and it does a competent job of figuring out what I want. In my experience Siri is much more dependent on keywords and certain phrasing, and doesn't integrate as deeply into one's life, because Apple doesn't play Google's game of slurping up all your personal data and all the public data on the internet.
These next gen AI voice assistants are still a solid improvement over Google's current offerings, but they'll feel like a massive jump into the future for folks that have been stuck in Apple's ecosystem, and that's probably where the biggest opportunity lies.
Agreed. I have the new Meta Ray-Ban glasses, and I've been pleasantly surprised with how softly I can speak since the mics are so close to my mouth, but I still don't enjoy doing it in public.
Well, I hated talking to Siri in public because about 70% of the time it did what I wanted, and 30% of the time it made me feel like a fool for even trying. That 30% is what killed it for me after giving it a serious go around the time Apple was rolling out Shortcuts.
After watching the presentation, I am now curious about Humane’s thing though, but I’m still going to hold off for a bit because I want to see the failure modes first and I also don’t want to rush out and be one of the first to buy the brand new 3Com Audrey.
The only reason people don't like talking to computers in public is that it's distinguishable in an awkward way from talking to humans in public. That's not going to be an issue for much longer. ChatGPT voice mode is about 99% of the way there. The only remaining issue is the cadence of the conversation -- you can't interrupt ChatGPT naturally, you have to press a button.
The issue is that your private communications are now audible by the people around you. It’s one thing when it’s to another person and you can whisper and share social context, it’s another when it’s at a good volume and contextless.
> The only reason people don't like talking to computers in public is that ...
It does not seem right to speak of a single reason. There are probably multiple. So, IMHO it would be more productive to come up with a list and put some weights on the options if you want to dissect this matter.
IMHO one very strong factor / important reason (one that you ignore) is the social context. I.e., the reaction of others in the same physical space as you start talking out loud, seemingly unmotivated.
Humans are social animals, and so the reaction of others to the actions you do tend to be very important to a large fraction of the population. What is acceptable in one context simply isn't in another. Also, the exact tolerances tend to differ with the local culture (here "local" is used in the sense "geographically/physically local")
It's not just about not annoying others here. In this case it's also about a thing as imprecise as "perceived self image". Some people (I'd argue, most people) dislike having the perception that others perceive them to be mentally unstable or rude. Most people need some kind of social acceptance for the actions they do.
One significant trait of some mental instabilities (as well as some drug-induced behavioral changes) is that those affected will spontaneously start talking in public. You will probably know of Tourette's syndrome, and of alcoholics rambling about, because these cases often involve quite rude and offensive verbiage and/or loud volume, but they are not the only cases.
People in general are well adept at detecting such anomalous behaviour, as it is part of our instincts trained through evolution. Also, the uncomfortable feelings that observing this type of behaviour produces will lead many to react with a "confront or escape" (aka "fight or flight") response (a stress signal), which is not beneficial to social interaction in general.
TL;DR: If you speak out in public without a very clear and socially valid reason (speaking to an object is not that) you are not only rude to others, but you also cause them stress... and you will have to face the social stigma of being perceived as insane.
I just find voice control too outward to use in public. I don't want people to know what I'm doing, even if it's something totally innocent, plus it would also be super annoying to be on a train full of people going "blah blah blah" to their devices.
If we could subvocalise with throat or other microphones/bone speaker then maaaybe, but I feel like it's better left to a brain interface and we should really just stick to touchscreen/typing interaction for now.
Same with SwiftKey: it can handle whispered speech to some extent.
Still, I would guess Meta glasses or AirPods would handle such a whispered mode better, since the microphones are so much closer. It would be interesting if AirPods had some contact mic that could pick up whispered sound inside your mouth.
Maybe the holy grail is to have something inside your mouth so you don't even have to voice anything: the device figures out what you want to say from your mouth and tongue movements. Smart tooth braces, anyone? :)
If people can speak more naturally, maybe they'll be okay with it. I am constantly encountering people who are laughing or talking to themselves out in public nowadays. Of course, they're probably on phone calls with Airpods in, but it doesn't seem to be awkward in a way it used to in the 'Bluetooth headset' days.
This is easy to fix IMHO. Pair a small screen in the future for typing or have a cuff link mic for whispering. You will see accessories like these pop up in the near future.
The can and cannot do problem reminds me of writing Applescript. I just want to call a function not figure out where to sprinkle in random a/the/of modifiers!
How well does whispering do with these things? I've found that I can reliably write sentences and set alerts when holding the mic fairly close on my Pixel 6.
Believe it or not, here in Ottawa, Canada, I was just reading a post on Reddit where people complain about those who talk or do video calls on the street.
I think this will be a matter of culture, and the barrier will shrink as soon as the devices are "smarter" and stop making you repeat yourself many times or misunderstanding what you're asking.
This strikes me as a less-functional Apple Watch that you wear on your shirt instead of your wrist.
(Yes, Siri is not great today, but that will change very quickly with Apple working hard on their own LLMs.)
Cool project, but not something I imagine most people will want. Like Google Glass.
They even did the cringey stunt Google Glass tried and featured it on the runway during Fashion Week, as if that instantly makes something fashionable.
Yeah, I just realized this is an Apple Watch competitor, but one that requires an odd new paradigm of interactivity that seems much worse than that of the Watch. Lifting your wrist and having a small screen you can look at and talk to seems intuitive in a way that the Humane widget doesn't.
Think of the simple interaction of wanting to issue a voice command in public. Watch: Bring it close to your mouth, maybe cover both with the other hand to be even less audible to others. Humane: Smoosh your shirt up to your face?
(Also: I live in one of the sunniest places on earth — I simply don't trust that I'll be able to see light projections onto my hand when I'm outside.)
Anyway. All in favor of exploration and new ideas. Very willing to be proven wrong on the form factor. But I also feel like we've kind of solved the wearable computing interface problem — a couple hundred years ago, turns out — and so it's going to take a lot of convincing.
Indeed. It just screams "comm badge", which makes the product idea obvious, and makes me surprised they somehow managed to make zero references to Star Trek in the entire godawfully long landing page.
Watches & phones don't have the optical & audio "visibility" of the Humane AI Pin -- which, incidentally, looks an awful lot like the Axon body-worn cameras for police.
If you really want to Always Be Surveilling, wouldn't a better solution be a tiny cam/mic accessory that pairs with your phone/watch? You could use the same magnetic battery idea, but in a much smaller form factor.
This thing (the Humane AI Pin) is aiming to be a phone replacement, which seems like a really steep challenge given its limitations--how could it replace any of the things I use my phone for on the subway to work?
It's also more expensive: I pay 10/month for a dedicated watch, and I can still make 3rd-party apps for it. I can't do that with Humane as far as I can tell, and I don't really want to put it on my shirt like this.
The only real differentiator is maybe the real-time translation, but that's not a frequent use case, and I think I can take my phone out for that with Google Translate as needed.
It's too bad; I love new hardware, but this isn't it for me, at least at that price and with that functionality.
The problem with voice interfaces in public is you look like a tosser while using them, and that's if they actually work. Also, you may need the device to communicate privately with you too...
"Hey Humane, add a meeting next Tuesday at 2pm."
"I'm sorry Dave, I'm afraid I can't do that. You have a doctor's appointment about your haemorrhoids."
> I don't understand the insistence on using voice as the main interaction and ditching the screen.
There was a Google I/O talk a few years back where they talked about users wanting multi-modal interaction. An example: they ask for restaurant recommendations by voice, then get a list they can view on their device. Both query and results are presented in their easiest modality, and humans will naturally switch between them.
This thing seems dead on arrival. Who wants to hold their hand up like that? Who wants to look at an uneven "screen"? Can you use it while walking, or handle the movement of a vehicle (car, bus, subway)?
I agree with you on all points except one. Arguably the uneven "screen" problem can be solved with a depth camera and warping the projection to match the contours of your hand. Since they already support hand gestures on the target hand it's possible they already have the equipment built-in to do this.
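For what it's worth, the core of such a correction is conceptually simple. Here's a toy sketch (my own illustration, not anything from Humane's actual pipeline) of pre-warping a projected image against a depth map, under a crude one-axis parallax model where the required horizontal pixel shift is `baseline * focal / depth`:

```python
import numpy as np

def prewarp(image, depth, baseline=1.0, focal=10.0):
    """Pre-shift each projected pixel horizontally to compensate for
    surface depth, so the image appears undistorted on an uneven
    surface. Toy parallax model: shift (pixels) = baseline * focal / depth."""
    h, w = depth.shape
    out = np.zeros_like(image)
    # per-pixel disparity induced by the surface's depth at that point
    disparity = np.round(baseline * focal / depth).astype(int)
    for y in range(h):
        for x in range(w):
            src = x + disparity[y, x]  # sample the source image at the offset
            if 0 <= src < w:
                out[y, x] = image[y, src]
    return out

# On a flat surface the correction is a uniform shift; on a curved palm
# the shift varies per pixel, which is what straightens the projection.
img = np.arange(16).reshape(4, 4)
flat = np.full((4, 4), 5.0)   # constant depth -> disparity of 2 everywhere
warped = prewarp(img, flat)
```

A real system would do full 3D reprojection with calibrated camera/projector intrinsics rather than this single-axis shift, but the principle is the same: sample the source image at an offset that cancels out the surface geometry.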
I repeat: it's a combadge. It solves the self-evident problem of there not being combadges available and in use.
Or, at least, it's almost a combadge. A good qualitative jump forward, but with plenty of unwanted features like subscription (I guess this could work for a Ferengi combadge), screen, wake words, etc. A combadge doesn't need to be an image projector, nor does it need rich tactile controls. But I guess you can improve the product-problem fit by ignoring those features.
People are thinking about the form factor after the cell phone. Apple is busy training everyone to use hand gestures with the new Apple Watch and upcoming Apple Vision. Humane is going down the path of projecting on the hand and touch.
"What comes next" is interesting as a problem formulation insofar as it encourages solution-based thinking ("Here's the solution I think is next, for a problem still to be identified, other than that it is what comes next.")
> People are thinking about the form factor after the cell phone.
That presumes there is one. There's not yet a "form factor after the car" for example. Just refinement of the same basic 4-wheeled template, with a few oddball vehicles for niche uses.
A possible indicator here is the apparent lack of demand for small screen phones. To me it suggests that screen real estate is more valuable than portability for most people.
This looks like a cool toy that high-level members of an organization will buy, and nobody else.
It can’t compete in the consumer space, because it doesn’t let you waste time on social media. It can’t compete in the corporate world because it doesn’t have a screen — no email, no spreadsheets, no collaborative chat application we’ve all grown used to. And it can’t even be great for photography, since you need another device to view the photos and videos this thing takes.
If this thing takes off for its impressive AI capabilities, smartphone makers can pump R&D into their AI, and give us this for free as a software update. But right now, the only people who will use this are folks whose job involves scheduling meetings and firing off quick text messages to colleagues and clients.
This thing is great for old people who can't see the screen. It is like a life alert on steroids that can order pizza. It is also great for kids for obvious reasons.
The guy had a Ted talk a while back going into his motivations. I believe the main one was he didn't like how phones get between you and the world, and take you out of the moment. This was an attempt to make tech that isn't a distraction in your life but that fades into the background. That was his driving principle, I believe.
I can see something like this filling a niche with the elderly population as like an external memory. (Or even just for forgetful adhd folks like myself, having something I can ask "wait, what did my wife just say to me 5 minutes ago?" ;) )
The elderly example is actually an extremely good/thought provoking idea. I can imagine my Grandparents getting huge use out of this, including with smart home functionality, if it got to where it needs to get to.
Not just Apple. Any smart watch or ear buds with Google or Amazon AI. I think ear buds paired to a phone are already the perfect form factor for this kind of thing. My Pixel Buds are already pretty good at this and I absolutely never use it.
Part of me just wants to get rid of my phone if I can get a device that does the actually useful things: get info about something, check and send messages in a smart way, check the bus.
Most of the other stuff is just idling. I don't expect I would idle in the same way with an actually good assistant that respects me.
But then I'd prefer an open source Wikipedia/Wikimedia like organisation behind it.
The route is usually shown in great detail on your car's display. You also get a voice prompt just before you need to start thinking about turning. Is this a genuine problem?
A bit tongue in cheek: I said "instantly destroy" because if the main selling point is an AI voice assistant, then people would just use what's already built into their phone/watch/AirPods instead of paying $600, were Apple to implement a better LLM for Siri.
I'm skeptical of the usefulness of the hand projection vs. a watch. And I think anyone who wants to carry a camera would be far better served by an iPhone (or any phone).
The "laser ink display" looks a bit like the totally bunk display tech of the Cicret Bracelet "product" that VFX videomaker Captain Disillusion did a comprehensive takedown of a couple of years ago https://www.youtube.com/watch?v=KbgvSi35n6o.
While it looks like there are a few videos of apparent actual demos, I haven't seen one yet where the device (and more importantly, the recording camera's settings) are controlled by an impartial reviewer, and I'm extremely sceptical that this is usable in the real world. There's a demo by the founder where one of the inputs is to tilt your palm up, and even in the demo the projection struggles to compete with the indoor lights, nevermind the sun https://youtu.be/CwSeUV3RaIA?t=205.
The pitch of this seems to be "no more distracting screens, and no need to download and manage lots of apps and services". Except there is a (very poor) screen, it's your hand. And you're limited to just one service and set of apps, the one that comes with the device.
It's all well and good saying that the AI can do everything you want, but the real world (sadly) has copyright restrictions and content licensing agreements which an out-of-the-box service by a legit company will have to abide by. If the song I want to listen to isn't available on whatever music service this product is partnered with, could I transfer music files from my computer to this device? There's a lot of use cases like this where you very quickly start to want an actual screen, and actual methods of input more precise and domain-specific than conversational voice commands.
> If the song I want to listen to isn't available on whatever music service this product is partnered with, could I transfer music files from my computer to this device?
What a weird example. They say they've partnered with Tidal, which would have 999 out of 1000 songs people look for, maybe more.
> If the song I want to listen to isn't available on whatever music service this product is partnered with, could I transfer music files from my computer to this device?
Unfortunately, "nobody" has music files any more. Spotify forever.
The Humane AI Pin Launches Its Campaign to Replace Phones - https://news.ycombinator.com/item?id=38207656 - Nov 2023 (130 comments)
I think this is easy to dismiss at first glance, but I genuinely believe they're trying to think about a new mode of interaction. The idea that "the computer will disappear" is probably accurate in the long term. Except for content delivery (reading, photos, movies), most tasks we achieve via computers and phones do not strictly require a screen. It's probably a good thing if computers did a better job of getting out of the way, and stop so loudly disrupting human interactions.
Whether this will be the solution is unclear; the privacy/creepiness angle is still real with an outwards-facing camera. Latency and battery life limitations might be too significant. The cost will be a non-starter for many (it is for me).
But I'm still impressed because there was a vision here. The conversational interface has never worked before for many reasons, but that does not mean it cannot work in principle, or that the ideal implementation would not be spellbinding. I'm glad they're trying. Also, the laser display is neat!
> The conversational interface has never worked before for many reasons, but that does not mean it cannot work in principle, …. I'm glad they're trying. Also, the laser display is neat!
So I did a lot of work over the years to research voice UI/UX and I’m very skeptical about this, even with the LLM stuff. I think an LLM was missing from the Siri/alexa era to transform it from “audio cli” to “chat interface” but there’s a few reasons besides that it didn’t catch on.
The information density and linearity of chat, voice especially, is a big problem.
When you look at a screen, your eyes can move in 2 dimensions. You can have sidebars, you can have text fields organized in paragraphs and buttons and bars etc. Not so with chatting - when you add linearity (you can only listen to or read one thing at a time, conversation can only present one list at a time) it becomes really slow to navigate any sort of decision or menu trees. Mobile-first have simplified this of course, but it’s not enough. Reading TTS becomes even slower to find the info you care about. It’s found a place for simple controls (smarthome, media, timers, etc) and simple information retrieval (weather, announce doorbell, read last text). Then there’s the obvious problem of talking out loud in public, false response recognition etc which are necessary evils of a voice UI.
I think the best hope for a voice device like this is to (as they’ve done) focus on simple experiences like “what’s I miss recently” and hope an AI can do a good enough job.
The laser display might help with presenting a full menu at once (media controls being an easy example), but it will probably end up being a pain to use (e.g. like a worse smartwatch).
Honestly though, my biggest hesitation (which could end up great) is the “pin” design. It’s novel, especially with the projector, but how heavy is it and how will that impact the comfort of my clothes? What about when wearing a jacket or scarf? Will this flop around while walking? Etc.
But yeah I've been thinking that too. "Oh, put my coat on - better spend 30 seconds messing around with my pin" [...] "Ahhh back in the office. There goes another thirty seconds moving the pin so it can film me looking at a screen for four hours"
And yeah, I feel like the weight would definitely pull my jumper or t-shirt out of shape, and make things like my collar/neckline look out of whack. Maybe they'll bring out a range of clothes suitable for it, or suggest you wear a coat indoors like the woman in the video is doing.
Let's not forget the value of non-linear input. Good search terms are often constructed rather than spilled forth. Sometimes I enter search terms, read them back, and realize they're likely to return unrelated results and need modifying. By the time I realize this while speaking to an AI, it's already spitting out the wrong information.
This leads to a need for altered interfaces that accommodate these scenarios. This is v1.0. Let's see where it goes.
If a science fiction author were writing it, the need for stiffer fabrics to support chest cameras would synergize with a neo-Victorianism in generation alpha. (Formal button-up shirts and higher necklines for enforced modesty.)
Even now - clicking through some insurance company's website hierarchy to find something out is insanely painful.
But even for researching things that we should probably care about enough to do ourselves - correlating different sources of information or working through abstract/ambiguous problems - the vast majority of ordinary people will 100% take the easy way out and let LLMs do most of the thinking for them. Even with free GPT-3, people are unflinchingly having LLMs solve problems they don't want to think about too deeply. What they pay in occasional inaccuracy is more than offset by convenience.
X (doubt). There are unfortunately only five senses through which our brains can interact with the outside world, and vision is the most information-dense and the easiest to utilize. The screen isn't going away anytime soon.
Projectors, to me, are the same as screens - they've been around just as long.
Though I do look forward to a direct computer-brain interface, like introducing a sixth sense.
I don't think you're wrong, but it's funny that we aren't as concerned about everyone walking around with outwards-facing phone cameras.
I myself never felt like taping over my camera; I feel like if someone pwned my system I'd be much more worried about the leaked audio.
I've always said that privacy is an illusion, the usual example I give is: "You're lying in bed with the curtains drawn, you see a shadow fall across the curtains that looks like a person standing outside. Do you, or do you not have privacy?"
If the shadow turns out to be a person peeking through the curtains, then you don't. If it turns out to be your primal brain plus a tree's shadow, then you do. Schrödinger-style.
Privacy is probably best described (as it sometimes is) as a "sense" of privacy I guess.
The next generation of devices that incorporate some of these features might be more successful.
And this is not that. Talking out loud to a device with verbal commands every few moments is way more annoying than someone looking at and typing on a phone.
That said, I agree with you at a glance it's neat. I think in reality though it's a poor idea given how often people need to give a verbal command.
Also bullish on hand gesture control. Maybe most stuff will eventually become jutsu level fancy hand movements lol. What a time to be alive. It is easy to remain grateful in this age of rapid progress.
If you want the computer to disappear, why not a better smartwatch? Or glasses, this time without the sci-fi gadget look? Both could support the exact same featureset but with a screen.
Why though? A computer requires attention, which pretty much rules out doing something else while using it, except perhaps passively listening to a podcast (which doesn't really qualify as computer use). Even though we may see new mediums, the mode of interaction will remain similar to that of a book.
You'd think your tech demo would check to see if your AI was hallucinating!
Can't believe they left this stuff in.
Gotta at least make it seem good in the commercial, this ended up being the opposite of a sizzle reel
I’m not an AI detractor. I use it and really like it. I just don’t like it for information like this. Anything where the response needs to be verified yet is very brief makes no sense to bounce off of an AI, in my opinion.
You’d think they’d have learned their lesson after Google Bard’s hallucinated demo!
(1) https://www.space.com/33784-solar-eclipse-guide.html
Alexa was the closest to achieve significant usage since you can use it within the privacy of your home.
For voice UIs, the unclear boundaries of what you think it can or cannot do are also a huge hurdle. After you get a couple of “sorry, I cannot do that” responses, you stop using it.
I went through this exercise with GPT voice. It's an awesome capability, but other than perhaps walking outside, or sitting in my office, there's no other space where it feels "ok" to just spontaneously talk to something.
A grey area is when you perhaps have headphones in / on and it looks like you're in a phone conversation with somebody, then it kinda feels ok, but generally you're not going to take a phone conversation in a public area without distancing yourself from others.
There's a reason most casual communication these days is text rather than voice or video calls.
Even though everyone's seen AirPods by now, in those rare occasions when I'm on the phone in public, I feel compelled to have my phone out and vaguely talking at it, so it's clear I'm on a phone call and not a crazy person.
I'm curious if we would see similar usage with the pin, where voice commands in public are always performed with the hand up for the projection screen (it will still prompt looks, but hopefully be clear in context, "oh they're doing some tech thing").
Of course at this price point, it's highly dubious that we'll see anywhere near the ubiquitous market penetration of AirPods (which garner understandable complaints about the price point sub-$200, and that's with a clear value prop).
Combine that with areas like GPT Vision, Whisper, etc., and it'll start feeling a lot more natural here very soon, I suspect.
TBH I'm surprised Apple isn't pushing this much harder. They tout Siri so hard but it's just worthless to me. It feels like Apple could make an AI Pin like this, but from the public side I have zero indication that they're even working in this space. It feels like they purposefully watched the boat sail away.
edit: Sidenote, Pin + Airpods would be a nice way to interface more quietly too.
These next-gen AI voice assistants are still a solid improvement over Google's current offerings, but they'll feel like a massive jump into the future for folks who have been stuck in Apple's ecosystem, and that's probably where the biggest opportunity lies.
After watching the presentation, I am now curious about Humane’s thing though, but I’m still going to hold off for a bit because I want to see the failure modes first and I also don’t want to rush out and be one of the first to buy the brand new 3Com Audrey.
It does not seem right to speak of a single reason. There are probably multiple. So, IMHO it would be more productive to come up with a list and put some weights on the options if you want to dissect this matter.
IMHO one very strong factor / important reason (one that you ignore) is the social context. Ie the reaction of others in the same physical space, as you start talking out loud, seemingly unmotivated.
Humans are social animals, and so the reaction of others to the actions you do tend to be very important to a large fraction of the population. What is acceptable in one context simply isn't in another. Also, the exact tolerances tend to differ with the local culture (here "local" is used in the sense "geographically/physically local")
It's not just about not annoying others here. In this case it's also about a thing as imprecise as "perceived self image". Some people (I'd argue, most people) dislike having the perception that others perceive them to be mentally unstable or rude. Most people need some kind of social acceptance for the actions they do.
One significant trait of some mental instabilities (as well as some drug-induced behavioral changes) is that those affected will spontaneously start talking in public. You probably know of Tourette's syndrome, and of the alcoholic rambling about, because these cases often involve quite rude and offensive verbiage and/or loud volume, but they are not the only ones.
People in general are well adept at detecting such anomalous behaviour, as it is part of our instincts trained through evolution. The uncomfortable feelings that observing this type of behaviour produces will lead many to react with a "confront or escape" (aka "fight or flight") response (a stress signal), which is not beneficial to social interaction in general.
TL;DR: If you speak out in public without a very clear and socially valid reason (speaking to an object is not that) you are not only rude to others, but you also cause them stress... and you will have to face the social stigma of being perceived as insane.
(edit: grammar/typos)
If we could subvocalise with throat or other microphones/bone speaker then maaaybe, but I feel like it's better left to a brain interface and we should really just stick to touchscreen/typing interaction for now.
Still, I would guess the Meta glasses or AirPods would be better at handling such a whispered mode, since the microphones are so much closer. It would be interesting if AirPods had some contact mic that could pick up whispered sound inside your mouth.
Maybe the holy grail is to have something inside your mouth so you don't even have to make a sound - the device would figure out what you want to say from your mouth and tongue movements. Smart tooth braces, anyone? :)
Talking to my cuff isn't going to make this better
(Yes, Siri is not great today, but that will change very quickly with Apple working hard on their own LLMs.)
Cool project, but not something I imagine most people will want. Like Google Glass.
They even did the cringey stunt Google Glass tried and featured it on the runway during Fashion Week, as if that instantly makes something fashionable:
https://images.fastcompany.net/image/upload/w_1200,c_limit,q...
Think of the simple interaction of wanting to issue a voice command in public. Watch: Bring it close to your mouth, maybe cover both with the other hand to be even less audible to others. Humane: Smoosh your shirt up to your face?
(Also: I live in one of the sunniest places on earth — I simply don't trust that I'll be able to see light projections onto my hand when I'm outside.)
Anyway. All in favor of exploration and new ideas. Very willing to be proven wrong on the form factor. But I also feel like we've kind of solved the wearable computing interface problem — a couple hundred years ago, turns out — and so it's going to take a lot of convincing.
Though from the reviews I've seen (and as with so many Bluetooth devices), it's unusably terrible, and the battery only lasts a few hours.
A touch more seriously, the Narrative Clip:
https://en.wikipedia.org/wiki/Narrative_Clip
https://thenextweb.com/news/narratives-clip-2-wearable-camer...
This thing (the Humane AI Pin) is aiming to be a phone replacement, which seems like a really steep challenge given its limitations--how could it replace any of the things I use my phone for on the subway to work?
The only real differentiator is maybe the real-time translation, but that's not a frequent use case, and I think I can take my phone out for that with Google Translate as needed.
It's too bad - I love new hardware, but this isn't it for me, at least at that price and with that functionality.
"Hey Humane, add a meeting next Tuesday at 2pm."
"I'm sorry Dave, I'm afraid I can't do that. You have a doctor's appointment about your haemorrhoids."
Why wouldn't I use my existing watch/phone/earbuds/pods instead of paying $600 plus a subscription for this?
I don't understand the insistence on using voice as the main interaction and ditching the screen.
At least Google Glass/AR lets me read.
There was a Google I/O talk a few years back where they talked about users wanting multi-modal interaction. An example: they ask for restaurant recommendations by voice, then get a list they can view on their device. Both query and results are presented in their easiest modality, and humans will naturally switch between them.
This thing seems dead on arrival. Who wants to hold their hand up like that? Who wants to look at an uneven "screen"? Can you use it while walking, or handle the movement in a vehicle (car, bus, subway)?
Is this just a big sunk cost fallacy launch?
It's a combadge.
I repeat: it's a combadge. It solves the self-evident problem of there not being combadges available and in use.
Or, at least, it's almost a combadge. A good qualitative jump forward, but with plenty of unwanted features like subscription (I guess this could work for a Ferengi combadge), screen, wake words, etc. A combadge doesn't need to be an image projector, nor does it need rich tactile controls. But I guess you can improve the product-problem fit by ignoring those features.
People are thinking about the form factor after the cell phone. Apple is busy training everyone to use hand gestures with the new Apple Watch and upcoming Apple Vision. Humane is going down the path of projecting on the hand and touch.
Apple's implementations are for 1 hand operation. You can operate the watch's touch screen while holding a steering wheel for example.
What's the difference between the objectively not great screen that is my hand, and the oled watch that doesn't require both my hands for operation?
EDIT Heck this requires one hand just to see anything. I can look at my watch without any hands!
That presumes there is one. There's not yet a "form factor after the car" for example. Just refinement of the same basic 4-wheeled template, with a few oddball vehicles for niche uses.
A possible indicator here is the apparent lack of demand for small screen phones. To me it suggests that screen real estate is more valuable than portability for most people.
It can’t compete in the consumer space, because it doesn’t let you waste time on social media. It can’t compete in the corporate world because it doesn’t have a screen — no email, no spreadsheets, no collaborative chat application we’ve all grown used to. And it can’t even be great for photography, since you need another device to view the photos and videos this thing takes.
If this thing takes off for its impressive AI capabilities, smartphone makers can pump R&D into their AI, and give us this for free as a software update. But right now, the only people who will use this are folks whose job involves scheduling meetings and firing off quick text messages to colleagues and clients.
Most of the other stuff is just idling. I don't expect I would idle in the same way with an actually good assistant that respects me.
But then I'd prefer an open source Wikipedia/Wikimedia like organisation behind it.
Simple example: which way do I go at the next intersection?
Or if I'm driving, GPS is displayed on a giant screen.
I'm skeptical of the usefulness of the hand projection vs a watch. And I think anyone who wants to bring a camera would be far better served by an iphone (or any phone).
While it looks like there are a few videos of apparent actual demos, I haven't seen one yet where the device (and more importantly, the recording camera's settings) are controlled by an impartial reviewer, and I'm extremely sceptical that this is usable in the real world. There's a demo by the founder where one of the inputs is to tilt your palm up, and even in the demo the projection struggles to compete with the indoor lights, nevermind the sun https://youtu.be/CwSeUV3RaIA?t=205.
The pitch of this seems to be "no more distracting screens, and no need to download and manage lots of apps and services". Except there is a (very poor) screen, it's your hand. And you're limited to just one service and set of apps, the one that comes with the device.
It's all well and good saying that the AI can do everything you want, but the real world (sadly) has copyright restrictions and content licensing agreements which an out-of-the-box service by a legit company will have to abide by. If the song I want to listen to isn't available on whatever music service this product is partnered with, could I transfer music files from my computer to this device? There's a lot of use cases like this where you very quickly start to want an actual screen, and actual methods of input more precise and domain-specific than conversational voice commands.
What a weird example. They say they've partnered with Tidal, which would have 999 out of 1000 songs people look for, maybe more.
Unfortunately, "nobody" has music files any more. Spotify forever.
(Of course readers here are the exception.)