Have to say, I was thoroughly impressed by what Apple showed today with all this Personal AI stuff. And it proves that the real power of consumer AI will be in the hands of the platform owners where you have most of your digital life in already (Apple or Google for messaging, mail, photos, apps; Microsoft for work and/or life).
The way Siri can now perform actions based on context from emails and messages like setting calendar and reservations or asking about someone’s flight is so useful (can’t tell you how many times my brother didn’t bother to check the flight code I sent him via message when he asks me when I’m landing for pickup!).
I always saw this level of personal intelligence to come about at some point, but I didn’t expect Apple to hit it out of the park so strongly. Benefit of drawing people into their ecosystem.
Nevermind all the thought put into private cloud, integration with ChatGPT, the image generation playground, and Genmoji. I can genuinely see all this being useful for “the rest of us,” to quote Craig. As someone who’s taken a pessimistic view of Apple software innovation the last several years, I’m amazed.
One caveat: the image generation of real people was super uncanny and made me uncomfortable. I would not be happy to receive one of those cold and impersonal, low-effort images as a birthday wish.
> I always saw this level of personal intelligence to come about at some point, but I didn’t expect Apple to hit it out of the park so strongly. Benefit of drawing people into their ecosystem.
It's the benefit of how Apple does product ownership. In contrast to Google and Microsoft.
I hadn't considered it, but AI convergence is going to lay bare organizational deficiencies in a way previous revolutions didn't.
Nobody wants a GenAI feature that works in Gmail, a different one that works in Messages, etc. -- they want a platform capability that works anywhere they use text.
I'm not sure either Google or Microsoft are organizationally-capable of delivering that, at this point.
"AI convergence is going to lay bare organizational deficiencies in a way previous revolutions didn't`'
Your quote really hit me. I trust Apple to respect my privacy when doing AI, but the thought of Microsoft or Google slurping up all my data to do remote-server AI is abhorrent. I can't see how Microsoft or Google can undo the last 10 years to fix this.
Ironically, I feel like Apple might have lost me as a customer today. It won't matter to Apple, obviously, but so much of what they showed today I just felt was actively pushing me out of the ecosystem.
I first bought some devices for myself, then those devices got handed off to family when I upgraded, and now we're at a point where we still use all of the devices we bought to date - but the arbitrary obsolescence hammer came down fairly hard today with the intel cut-off and the iPhone 15+ requirement for the AI features. This isn't new for Apple, they've been aging perfectly usable devices out of support for years. We'll be fine for now, but patch support is only partial for devices on less-than-latest major releases so I likely need to replace a lot of stuff in the next couple of years and it would be way too expensive to do this whole thing again. I'll also really begrudge doing it, as the devices we have suit us just fine.
Some of it I can live without (most of the AI features they showed today), but for the parts that are sending off to the cloud anyway it just feels really hard to pretend it's anything other than trying to force upgrades people would be happy without. OCLP has done a good job for a couple of homework Macs, I might see about Windows licenses for those when they finally stop getting patches.
I'd feel worse for anyone that bought the Intel Mac Pro last year before it got taken off sale (although I'm not sure how many did). That's got to really feel like a kick in the teeth given the price of those things.
It’s ironic how the one company that is WAY over the top wrt secrecy — not only to the public, but also and especially internally (they’ve even walled the hardware team from the software team while developing the iPhone!) — is at the same time the one company that really nails integration.
Well tbf I’m not sure Google does project ownership… I was shocked how many seemingly important conversations ended with “well, I guess these days that functionality is owned by the community of relevant stakeholders…” (aka: owned by nobody at all). I think they’re only able to do what they’ve done through the sheer concentration of brilliant overpaid engineers, in spite of such “innovation”.
Totally agree on the AI points. Google may have incredible research, but Apple clearly is playing to their strengths here.
Anyway, while I see all of your points, none of the things I've read in the news make me excited. Recapping meetings or long emails or suggesting how to write are just...not major concerns to me at least.
Microsoft is trying and I feel they are in a much stronger position than Google. The same advantage that Apple has for personal docs and images Microsoft has across business content. Seamless AI integration across teams and outlook and sharepoint and other office products offer huge platform benefits.
For Google in particular, this was honestly something they could've done far earlier. They had the Pixel phones, they had the Tensor stuff, and then Gemini came along.
But for some reason, they decided to just stick to feature tidbits here and there and chose not to roll out quality-of-life UI features to make Gemini use easier on normal apps and not just select Google apps. And then it's also limited by various factors. They were obviously testing the waters and were just as cautious, but imho it was a damn shame. Even summarization and key points would've been nice if I could invoke it on any text field.
But yeah, this is truly the ecosystem benefit in full force here for Apple, and they're making good use of it.
Neither is Apple unless one buys wholly into the Apple ecosystem. I want open ai tools that I can truly use with all my text. But I'm not holding my breath.
It's the potential for the model. Everyone else is hoovering the internet to model everything and Apple is sticking with their privacy message and saying 'how can I model your stuff to help you.'
Apple Intelligence stuff is going to be very big. iOS is clearly the right platform to marry great UX AI with. Latching LLMs onto Siri have allowed the Siri team to quickly atone for its sins.
I think the private compute stuff to be really big. Beyond the obvious use the cloud servers for heavy computing type tasks, I suspect it means we're going to get our own private code interpreter (proper scripting on iOS) and this is probably Apple's path to eventually allowing development on iPad OS.
Not only that, Apple is using its own chips for their servers. I don't think the follow on question is whether it's enough or not. The right question to ask is what are they going to do bring things up to snuff with NVDIA on both the developer end and hardware end?
There's such a huge play here and I don't think people get it yet, all because they think that Apple should be in the frontier model game. I think I now understand the headlines of Nadella being worried about Apple's partnership with OpenAI.
I do believe much of what they showed was impressive. It actually seems to realize the "personal digital secretary" promise that personal computing devices throughout the decades were sold on.
The most important question to me is how reliable it is. Does it work every time or is there some chance that it horribly misinterprets the content and even embarrasses the user who trusted it.
Yeah, reliability is the crucial bit. Like that example he showed where it checked whether he can make an appointment (by checking driving times), a lot can go wrong there and if the assistant tells you "Yes, you can" but you cannot then I can see lots of people getting angry and not trusting it for anything.
For this reason, I really hope we can self-host our "private cloud" for use with apple devices. That would truly, properly allow end to end privacy. I don't trust Apple given the legislation you've just linked to, both claims obviously can't be correct.
legitimately good voice recognition would probably be the "killer feature" to get me to switch from android to iOS after all this time. I'm so frustrated with the current state of voice recognition in android keyboards, but ChatGPT's recent update is amazing at voice recognition. I type primarily by voice transcribing and I would be so happy if I could go from 70% voice 30% I need to type to 95% voice 5% I need to type.
One really powerful use case they demoed was that of meeting conflicts.
"Can you meet tonight at 7?"
Me "oh yes"
Siri "No you can't, your daughter's recital is at 7"
It's these integrations which will make life easier for those who deal with multiple personas all through their day.
But why partner with an outside company ? Even though it's optional on the device etc, people are miffed about the partnership than being excited by all that Apple has to offer.
The image generation is dalle 2.5 level and feels really greasy to me, beyond that I think the overall launch is pretty good! I also congratulate rabbit r1 for their timely release months before WWDC
https://heymusic.ai/music/apple-intel-fEoSb
> The way Siri can now perform actions based on context
Given that this will apparently drop... next year at the earliest?... I think it's simply quite a tease, for now.
I literally had to install a keyboard extension to my iPhone just to get Whisper speech to text, which is thousands of times better at dictation than Siri at this point, which seems about 10 years behind the curve
> the platform owners where you have most of your digital life
Yup! The hardest part of operationalizing GenAI has been, for me, dragging the "ring" of my context under the light cast by "streetlamp" of the model. Just writing this analogy out makes me think I might be putting the cart before the horse.
> but I didn’t expect Apple to hit it out of the park so strongly.
No-one is hitting anything out of the park, this is just Apple the company realising that they're falling behind and trying to desperately attach themselves to the AI train. Doesn't matter if in so doing they're validating a company run by basically a swindler (I'm talking about the current OpenAI and Sam Altman), the Apple shareholders must be kept happy.
I kind of feel like their walled garden and ecosystem might just have created the perfect environment for an AI integrated directly to the platform to be really useful.
I’m encouraged, but I am already a fan of the ecosystem…
I have no confidence this will work as intended. The last MacOS upgrade had the horrible UX of guessing which emoji you want and being wrong 95% of the time. I don't expect this to be any better. Demos are scripted.
I also expect it to fail miserably on names (places, restaurants, train stations, people), people that are bilingual, non-English, people with strong accents from English not being their first language, etc.
It just writes the content, it doesn't actually send anything.
We'll find out later if there's an API to do something like that at all or are external communications always behind some hard limit that requires explicit user interaction.
> The way Siri can now perform actions based on context from emails and messages like setting calendar and reservations
I can't think of something less exciting than a feature that Gmail has supported for a decade.
Overall there's not a single feature in the article that I find exciting (I don't use Siri at all, so maybe it's just me), but I actually see that as a good thing. The least they add GenAI the better.
The difference is that this is on-device and private. Gmail just feeds your emails to Google's servers and they do the crunching. And in the meanwhile train their systems to be better using your content.
It is really “the app” that has to die in order for AI to show its potential in user interfaces. If you want to, say, order from a restaurant, your personal agent should order it for you, any attempt for the restaurant to “own” the consumer by putting an app in his face has to end.
I don’t think I am understanding what you mean, but isn’t one of the potential use cases of AI to say: “Siri order me the thing I always get from _restaurant_ and it navigates the app for you in the background? Potentially this can be done without API integration; the AI synthetically operates the app. Maybe it “watches” how you used the app before (which options you choose, which you dismiss, etc) to learn your preferences and execute the order with minimal interaction with the user. This way annoying, bad, UI can be avoided. AI “solves” UI in this way?
Are you saying this type of scenario kills the app, or are you saying the app needs to die, replaced by an API that AIs can interact with, thus homogenizing the user experience, and avoiding the bad parts of Apps?
An interesting consequence: I started to think about how I'll be incentivized to take more pictures of useful information, and might even try setting up a Proton Mail proxy so I can use the iOS Email app and give Siri more context
Google is doing this as well but they are doing it on single app like gmail assuming all info is there and also across websites with agents but not cross apps like apple is doing across mails, messages, maps etc.
Not to mention tied into their underlying SDK API that basically the whole system is based on, and seems they are using those same API's for the internal integrations so they can feel whats missing themselves as well.
"brother didn’t bother to check the flight code I sent him via message when he asks me when I’m landing for pickup"
Yeah but what about people going to the wrong airport, or getting scammed by taking fake information uncritically? "Well it worked for me and anyway AI will get better.". Amen.
I will believe it when siri isn’t the stupidest decade old idea ever. I’m sorry if I sound anything but snarky, but they have had Star Trek abilities this whole time, nerfed for “safety” and platform product integrity —from my iPhone
The AI/Cartoony person being sent as a birthday wish was super cringey, like something my boomer father would send me. I'm a fan of genmoji. That looks fun. Less a fan of generated clip art and "images for the sake of having an image here", and way, way less into this "here, I made a cornball image of you from other images of you that I have" feature. It's as lame as Animoji but as creepy as deepfakes.
What do you mean into hands of platform owners? The point of having an Apple device is that you can run stuff on your device. The user is in control, not any platforms.
I think what they're getting at is that the platform owners have power because they can actually leverage the data that users give them to be useful tools to those users.
I would contrast this with the trend over the last year of just adding a chatbot to every app, or Recall being just a spicy History function. It's AI without doing anything useful.
I take it as 3rd party alternatives will have a much harder time because they have to ask the user to share their data with them. Apple / Google already have that established relationship and 3rd parties will unlikely have the level of integration and simplicity that the platformers can deliver.
Aside from the search and Siri improvements, I'm really not sure about the usefulness of all the generative stuff Apple is suggesting we might use here.
If you spend an hour drawing a picture for someone for their birthday and send it to them, a great deal of the value to them is not in the quality of the picture but in the fact that you went to the effort, and that it's something unique only you could produce for them by giving your time. The work is more satisfying to the creator as well - if you've ever used something you built yourself that you're proud of vs. something you bought you must have felt this. The AI image that Tania generated in a few seconds might be fun the first time, but quickly becomes just spam filling most of a page of conversation, adding nothing.
If you make up a bedtime story for your child, starring them, with the things they're interested in, a great deal of the value to them is not in the quality of the story but... same thing as above. I don't think Apple's idea of reading an AI story off your phone instead is going to have the same impact.
In a world where you can have anything the value of everything is nothing.
I've got a fairly sophisticated and detailed story world I've been building up with my kid, it always starts the same way and there are known characters.
We've been building this up for some time, this tiny universe is the most common thing for me to respond to "will you tell me a story?" (something that is requested sometimes several times a day) since it is so deeply ingrained in both our heads.
Yesterday, while driving to pick up burritos, I dictated a broad set of detailed points, including the complete introductory sequence to the story to gpt-4o and asked it to tell a new adventure based on all of the context.
It did an amazing job at it. I was able to see my kid's reaction in the reflection of the mirrors and it did not take away from what we already had. It actually gave me some new ideas on where I can take it when I'm doing it myself.
If people lean on gen ai with none of their own personal, creative contributions they're not going to get interesting results.
But I know you can go to the effort to create and create and create and then on top of that layer on gen AI--it can knock it out of the park.
In this way, I see gen AI capabilities as simply another tool that can be used best with practice, like a synthesizer after previously only having a piano or organ.
That's a very valid rebuttal to my comment. I think this kind of "force multiplier" use for AI is the most effective one we have right now; I've noticed the same thing with GPT-4 for programming. I know the code well enough to double check the output, but AI can still save time in writing it, or sometimes come up with a strategy that I may not have.
Maybe the fact that you did the dictation together with your child present is also notable. Even though you used the AI, you were still doing an activity together and they see you doing it for them.
You could say the same thing for sending a Happy Birthday text, versus a hand written letter or card. Nothing is stopping a person from sending the latter today, and yes they are more appreciated, but people also appreciate the text. For example, if you're remote and perhaps don't have that deep of a relationship with them
If a friend of mine sent me some AI generated slop for my birthday I'd be more offended than if they just sent me a text that only contains the letters "hb"
The value of a gift isn't solely on how much you worked on it or what you spent on it. It can also be in picking out the right one, if you picked something good.
Context will be more important when the gift itself is easy.
I sometimes think the physical world has been going through a similar time, where most of what we own and receive is ephemeral, mass-produced, lacking in real significance. We have a lot more now but it often means a lot less.
having been bombarded with forwards of "good morning" image greetings from loved ones on a daily basis, i can definitely attest to this sentiment.
ai spam, especially the custom emoji/stickers will be interesting in terms of whether they will have any reusability or will be littered like single-use plastic.
LOL that image you painstakingly created is also forgotten not long after being given to most people, just because you know the effort that went in doesn't mean the receiving person does 99.9% of the time.
Same thing for your kid, the kid likes both stories, gives 0 shit that you used GenAI or sat up for 8 hours trying to figure out the rhyme, those things are making YOU feel better not the person receiving it.
I think it would be clear that the picture was drawn for the person - I imagine most people would explicitly say something like "I drew this for you" in the accompanying message. And I don't know what kind of kids you've been hanging around, but my daughter would definitely appreciate a story that I spent some time thinking up rather than "here's something ChatGPT came up with". I guess that assumes you're not going to lie to kids about the AI-generated being yours, but that's another issue entirely.
> those things are making YOU feel better not the person receiving it
I don't think this is true at all. Love is proportional to cost; if it costs me nothing, then the love it represents is nothing.
When we receive something from someone, we estimate what it cost them based on what we know of them. Until recently, if someone wrote a poem just for us, our estimation of that would often be pretty high because we know approximately what it costs to write a poem.
In modern times, that cost calculation is thrown off, because we don't know whether they wrote it themselves (high cost) or generated it (low/no cost).
I don't truly agree with your take here, but let's assume you are correct and creating real things in your life only benefits you and no-one else. If you create a painting or story or piece of furniture, others prefer the more professional AI or mass-produced version.
In that scenario certainly there'll be times when using the AI option will make more sense, since you usually don't have hours to spare, and you also want to make the stories that your kid likes the most, which in this scenario are the AI ones.
But even then there's still that benefit to yourself from spending time on creating things, and I'd encourage anyone to have a hobby where they get to make something just because they feel like it. Even if it's just for you. It's nice to have an outlet to express yourself.
Their demos looked like how I imagined AI before ChatGPT ever existed. It was a personalized, context aware, deeply integrated way of interacting with your whole system.
I really enjoyed the explanation for how they planned on tackling server-enabled AI tasks while making the best possible effort to keep your requests private. Auditable server software that runs on Apple hardware is probably as good as you can get for tasks like that. Even better would be making it OSS.
There was one demo where you could talk to Siri about your mom and it would understand the context because of stuff that she (your mom) had written in one of her emails to you... that's the kind of stuff that I think we all imagined an AI world would look like. I'm really impressed with the vision they described and I think they honestly jumped to the lead of the pack in an important way that hasn't been well considered up until this point.
It's not just the raw AI capabilities from the models themselves, which I think many of us already get the feeling are going to be commoditized at some point in the future, but rather the hardware and system-wide integrations that make use of those models that matters starting today. Obviously how the experience will be when it's available to the public is a different story, but the vision alone was impressive to me. Basically, Apple again understands the UX.
I wish Apple the best of luck and I'm excited to see how their competitors plan on responding. The announcement today I think was actually subtle compared to what the implications are going to be. It's exciting to think that it may make computing easier for older people.
Until this gets into reviewers' hands, I think it's fair to say that we really have no idea how good any of this is. When it comes to AI being able to do "all kinds of things," it's easy to demo some really cool stuff, but if it falls on its face all the time in the real world, you end up with the current Siri.
I think too many people assumed that because ChatGPT is a conversation interface that that's how AI should be designed, which is like assuming computers would always be command lines instead of GUIs. Apple has done a good job of providing purpose-built GUIs for AI stuff here, and I think it will be interesting to watch that stuff get deeper.
> There was one demo where you could talk to Siri about your mom and it would understand the context because of stuff that she (your mom) had written in one of her emails to you... that's the kind of stuff that I think we all imagined an AI world would look like.
We're really just describing an on-device search tool with a much better interface. It's only creepy if you treat it like a person, which Apple is pretty careful not to do too much.
This something else is it pushes people to even more heavily dive into the ecosystem, if it works how they show you really want it to understand your life, so you'll want all your devices able to help build that net of data to provide your context to all your devices for answering about events and stuff, meaning hey maybe i should get an appletv instead of a chromecast so that siri knows about my shows and stuff too.
I'm just unhappy that this will mostly end up to make the moat larger and the platform lock-in more painful either way. iPhones have been going up in price, serious compute once you're deep in this will be simply extortion, as leaving the apple universe is going to be nigh impossible.
Also no competitor is going to be as good at integrating everything, as none of those have as integrated systems.
i'd be skeptical of the marketing used for the security/privacy angle. won't be surprised if there is subopena-able data out of this in some court case.
i might have missed it but there has not been much talk about guardrails or ethical use with their tools, and what they are doing about it in terms of potential abuse.
It sounds like the app creators need to built in the support using SiriKit and app intentions. If they're using either already a fair bit of integration will be automatic.
Gotta say, from a branding point of view, it's completely perfect. Sometimes things as "small" as the letters in a companies name can have a huge impact decades down the road. AI == AI, and that's how Apple is going to play it. That bit at the end where it said "AI for the rest of us" is a great way to capture the moment, and probably suggests where Apple is going to go.
imo, apple will gain expertise to serve a monster level of scale for more casual users that want to generate creative or funny pictures, emojis, do some text work, and enhance quality of life. I don't think Apple will be at the forefront of new AI technology to integrate those into user facing features, but if they are to catch up, they will have to get into the forefront of the same technologies to support their unique scale.
Was a notable WWDC, was curious to see what they would do with the Mac Studio and Mac Pro, and nothing about the M3 Ultra or M4 Ultra, or the M3/M4 Extreme.
I also predicted that they would use their own M2 Ultras and whatnot to support their own compute capacity in the cloud, and interestingly enough it was mentioned. I wonder if we'll get more details on this front.
Apple have a long antagonist relationship with NVIDIA. If anything it is holding Apple back because they don’t want to go cap in hand to NVIDIA and say “please sir, can I have some more”.
We see this play out with the ChatGPT integration. Rather than hosting GPT-4o themselves, OpenAI are. Apple is providing NVIDIA powered AI models through a third party, somewhat undermining the privacy first argument.
I see what they did here and it is smart, but can bring chaos. On one side it is like saying "we own it", but on the other hand it is putting a brand outside of their control. Now I only hope people will not abbreviate it with ApI, because it will pollute search results for API :P
Yeah I feel like we are getting the crumbs for a future hardware announcement, like M4 ultra. They’ll announce it like “we are so happy to share our latest and greatest processor, a processor so powerful, we’ve been using it in our private AI cloud. We are pleased to announce the M4 Ultra”
It was speculated when the M4 was released only for the iPad Pro that it might be out of an internal need on Apple's part for the bulk of the chips being manufactured. This latest set of announcements gives substantial weight to that theory.
I remain skeptical until I see it in action. On the one hand, Apple has a good track record with privacy and keeping things on device. On the other, there was too much ambiguity around this announcement. What is the threshold for running something in the cloud? How is your personal model used across devices - does that mean it briefly moves to the cloud? How does its usage change across guest modes? Even the phrase "OpenAI won’t store requests" feels intentionally opaque.
I was personally holding out for a federated learning approach where multiple Apple devices could be used to process a request but I guess the Occam's razor prevails. I'll wait and see.
> Apple has a good track record with privacy and keeping things on device.
Apple also has a long track record of "you're holding it wrong". I don't expect an amazing AI assistant out of them, I expect something that sometimes does what the user meant.
> Apple also has a long track record of "you're holding it wrong".
And yet this was never said.
Closest was this:
> Just don't hold it that way.
Or maybe this:
> If you ever experience this on your iPhone 4, avoid gripping it in the lower left corner in a way that covers both sides of the black strip in the metal band, or simply use one of many available cases.
I get the sense there's still a lot of work to be done over the next few months, and we may see some feature slippage. The betas will be where we see their words in action, and I'll be staying far away from the betas, which will be a little painful. I think ambiguity works in their favor right now. It's better to underpromise and overdeliver, instead of vice versa.
I see no real difference between 2 and 3. Once the data has left your device, it has left your device. There is no getting it back and you no longer have any control over it.
This #2, so-called "Private Cloud Compute", is not the same as iCloud. And certainly not the same as sending queries to OpenAI.
Quoting:
“With Private Cloud Compute, Apple Intelligence can flex and scale its computational capacity and draw on larger, server-based models for more complex requests. These models run on servers powered by Apple silicon, providing a foundation that allows Apple to ensure that data is never retained or exposed.“
“Independent experts can inspect the code that runs on Apple silicon servers to verify privacy, and Private Cloud Compute cryptographically ensures that iPhone, iPad, and Mac do not talk to a server unless its software has been publicly logged for inspection.”
“Apple Intelligence with Private Cloud Compute sets a new standard for privacy in AI, unlocking intelligence users can trust.”
You do realise that already happens though? If you read apple's privacy policy they send a lot of what you do to their servers.
Furthermore how private do you think Siri is? Their privacy policy explicitly states they send transcripts of what you say to them. That cannot be disabled.
Certainly there's a difference. You are right that the jump is big between 1 and 2, but it is negligent to say that Apple, a company which strives for improved privacy and security, and ChatGPT have the same privacy practices.
Apple has demonstrated to be relatively trustworthy about privacy while most AI companies have demonstrated the opposite, so I do see a significant difference.
#2 is publicaly auditable, 100% apple controlled and apple hardware servers, tied to your personal session (probably via the ondevice encryption), i'd imagine ephemeral docker containers or something similar for requests that just run for each request or some form of Encrypted AI Lambdas.
Lvl 3 is supposed to support other models and providers in the future too. I hope it will support every server with simple, standard API so I can run self-hosted LLama 3 (or whatever will be released in next 6-12 months).
It sounded like 3 is meant for non-personal stuff. Basically like a search engine style feature. When you want to look up things like say sports records and info, or a movie and info about it, etc.
The problem is they don't explicitly define when 1 can pass to 2 and whether we can fully and categorically disable it. As far as I know, 1 can pass to 2 when governments ask for some personal data or when Apple's ad model needs some intimate details for personalization.
That was my sense as well. I would have appreciated some clarification on where the line between 1 and 2 was, although I am sure a YouTuber will deep dive on it as soon as they have it in their hands
I'm skeptical of the on-device AI. They crave edge compute but I'm doubtful their chips can handle a 7B param model. Maybe ironically with Microsoft's phi 3 mini 4k you can run this stuff on a cpu but today it's no where near good enough.
I don't know how they are going to square the privacy circle when at worst its a RAG based firehose to OpenAI, and at best you can just ask the model to leak your personal info.
Said this in the other thread, but I am really bothered that image generation is a thing but also that it got as much attention as it did.
I am worried about the reliability, if you are relying on it giving important information without checking the source (like a flight) than that could lead to some bad situations.
That being said, the polish and actual usefulness of these features is really interesting. It may not have some of the flashiest things being thrown around but the things shown are actually useful things.
Glad that ChatGPT is optional each time Siri thinks it would be useful.
My only big question is, can I disable any online component and what does that mean if something can't be processed locally?
I also have to wonder, given their talk about the servers running the same chips. Is it just that the models can't run locally or is it possibly context related? I am not seeing anything if it is entire features or just some requests.
I wonder if that implies that over time different hardware will run different levels of requests locally vs the cloud.
Regarding image generation, it seems the Image Playground supports three styles: Animation, Illustration, or Sketch.
Notice what's missing? A photorealistic style.
It seems like a good move on their part. I'm not that wild about the cartoon-ification of everything with more memes and more emojis, but at least it's obviously made-up; this is oriented toward "fun" stuff. A lot of kids will like it. Adults, too.
There's still going to be controversy because people will still generate things in really poor taste, but it lowers the stakes.
I noticed that too, but my conclusion is that they probably hand-picked every image and description in their training data so that the final model doesn’t even know what the poor taste stuff is.
> I am worried about the reliability, if you are relying on it giving important information without checking the source (like a flight) than that could lead to some bad situations.
I think it shows the context for the information it presents. Like the messages, events and other stuff. So you can quickly check if the answer is correct. So it's more about semantic search, but with a more flexible text describing the result.
> I wonder if that implies that over time different hardware will run different levels of requests locally vs the cloud.
I bet that’s going to be the case. I think they added the servers as a stop-gap out of necessity, but what they see as the ideal situation is the time when they can turn those off because all devices they sell have been able to run everything locally for X amount of time.
> I am worried about the reliability, if you are relying on it giving important information without checking the source (like a flight) than that could lead to some bad situations.
I am worried at the infinite ability of teenagers to hack around the guardrails and generate some probably not safe for school images for the next 2 years while apple figures out how to get them under control.
They said the models can scale to "private cloud compute" based on Apple Silicon which will be ensured by your device to run "publicly verifiable software" in order to guarantee no misuse of your data.
I wonder if their server-side code will be open-source? That'd be positively surprising. Curious to see how this evolves.
Anyway, overall looks really really cool. If it works as marketed, then it will be an easy "shut up and take my money". Siri seems to finally be becoming what it was meant to be (I wonder if they're piggy-backing on top of the Shortcuts Actions catalogue to have a wide array of possible actions right away), and the image and emoji generation features that integrate with Apple Photos and other parts of the system look _really_ cool.
It seems like it will require M1+ on Macs/iPads, or an iPhone 15 Pro.
You don't even have to buy a new device since it's backwards compatible with A17 Pro and M1, M2, M3 and M4. It feels like the integration of the services are using existing models and integrating the API used traditionally originally from AppleScript but, extending it to LLM or stable diffusion systems. It seems that they want the M4 as soon as possible though for the gaming and cloud pushes.
For those curious, there is in fact a ChatGPT integration.
The way it works is that when the on-device model decides "this could better be answered by chatgpt" then it will ask you if it should use that. They described it in a way which seems to indicate that it will be pluggable for other models too over time. Notably, ChatGPT 4o will be available for free without creating an OpenAI account.
I don't think that 4o will actually be available for free. It seemed like they were quite careful in choosing their words. My guess is 3.5 is free without an account, and accessing 4o requires linking your OpenAI account.
I'm really curious about this. Framing it as "running a large language model in the cloud" is almost burying the lede for me. Is this saying that in general the client will be able to cryptographically ascertain somehow the code that the server is running? That sounds incredibly interesting and useful outside of this.
It seems like this is an orchestration layer that runs on Apple Silicon, given that ChatGPT integration looks like an API call from that. It's not clear to me what is being computed on the "private cloud compute"?
If I understand correctly there's three things here:
- on-device models, which will power any tasks it's able to, including summarisation and conversation with Siri
- private compute models (still controlled by apple), for when it wants to do something bigger, that requires more compute
- external LLM APIs (only chatgpt for now), for when the above decide that it would be better for the given prompt, but always asks the user for confirmation
The way Siri can now perform actions based on context from emails and messages like setting calendar and reservations or asking about someone’s flight is so useful (can’t tell you how many times my brother didn’t bother to check the flight code I sent him via message when he asks me when I’m landing for pickup!).
I always saw this level of personal intelligence to come about at some point, but I didn’t expect Apple to hit it out of the park so strongly. Benefit of drawing people into their ecosystem.
Nevermind all the thought put into private cloud, integration with ChatGPT, the image generation playground, and Genmoji. I can genuinely see all this being useful for “the rest of us,” to quote Craig. As someone who’s taken a pessimistic view of Apple software innovation the last several years, I’m amazed.
One caveat: the image generation of real people was super uncanny and made me uncomfortable. I would not be happy to receive one of those cold and impersonal, low-effort images as a birthday wish.
It's the benefit of how Apple does product ownership. In contrast to Google and Microsoft.
I hadn't considered it, but AI convergence is going to lay bare organizational deficiencies in a way previous revolutions didn't.
Nobody wants a GenAI feature that works in Gmail, a different one that works in Messages, etc. -- they want a platform capability that works anywhere they use text.
I'm not sure either Google or Microsoft are organizationally-capable of delivering that, at this point.
Your quote really hit me. I trust Apple to respect my privacy when doing AI, but the thought of Microsoft or Google slurping up all my data to do remote-server AI is abhorrent. I can't see how Microsoft or Google can undo the last 10 years to fix this.
I first bought some devices for myself, then those devices got handed off to family when I upgraded, and now we're at a point where we still use all of the devices we bought to date - but the arbitrary obsolescence hammer came down fairly hard today with the intel cut-off and the iPhone 15+ requirement for the AI features. This isn't new for Apple, they've been aging perfectly usable devices out of support for years. We'll be fine for now, but patch support is only partial for devices on less-than-latest major releases so I likely need to replace a lot of stuff in the next couple of years and it would be way too expensive to do this whole thing again. I'll also really begrudge doing it, as the devices we have suit us just fine.
Some of it I can live without (most of the AI features they showed today), but for the parts that are sending off to the cloud anyway it just feels really hard to pretend it's anything other than trying to force upgrades people would be happy without. OCLP has done a good job for a couple of homework Macs, I might see about Windows licenses for those when they finally stop getting patches.
I'd feel worse for anyone that bought the Intel Mac Pro last year before it got taken off sale (although I'm not sure how many did). That's got to really feel like a kick in the teeth given the price of those things.
Totally agree on the AI points. Google may have incredible research, but Apple clearly is playing to their strengths here.
Anyway, while I see all of your points, none of the things I've read in the news make me excited. Recapping meetings or long emails or suggesting how to write are just...not major concerns to me at least.
But for some reason, they decided to just stick to feature tidbits here and there and chose not to roll out quality-of-life UI features to make Gemini use easier on normal apps and not just select Google apps. And then it's also limited by various factors. They were obviously testing the waters and were just as cautious, but imho it was a damn shame. Even summarization and key points would've been nice if I could invoke it on any text field.
But yeah, this is truly the ecosystem benefit in full force here for Apple, and they're making good use of it.
That's a little premature, let's try not to be so suckered by marketing.
They really hammered in the fact that every bit is going to be either fully local or publicly auditable to be private.
There's no way Google can follow, they need the data for their ad modeling. Even if they anonymise it, they still want it.
That's tangibly different.
I think the private compute stuff to be really big. Beyond the obvious use the cloud servers for heavy computing type tasks, I suspect it means we're going to get our own private code interpreter (proper scripting on iOS) and this is probably Apple's path to eventually allowing development on iPad OS.
Not only that, Apple is using its own chips for their servers. I don't think the follow on question is whether it's enough or not. The right question to ask is what are they going to do bring things up to snuff with NVDIA on both the developer end and hardware end?
There's such a huge play here and I don't think people get it yet, all because they think that Apple should be in the frontier model game. I think I now understand the headlines of Nadella being worried about Apple's partnership with OpenAI.
Are we sure there is a Siri team in Apple? What have they been doing since 2012?
The most important question to me is how reliable it is. Does it work every time or is there some chance that it horribly misinterprets the content and even embarrasses the user who trusted it.
https://www.theguardian.com/us-news/2024/apr/16/house-fisa-g...
Two features I really want:
“Position the cursor at the beginning of the word ‘usability’”
“Stop auto suggesting that word. I never use it, ever”
"Can you meet tonight at 7?" Me "oh yes" Siri "No you can't, your daughter's recital is at 7"
It's these integrations which will make life easier for those who deal with multiple personas all through their day.
But why partner with an outside company ? Even though it's optional on the device etc, people are miffed about the partnership than being excited by all that Apple has to offer.
Just randomly sprinkled eyes on the sides. I wonder why they chose to showcase that.
Given that this will apparently drop... next year at the earliest?... I think it's simply quite a tease, for now.
I literally had to install a keyboard extension to my iPhone just to get Whisper speech to text, which is thousands of times better at dictation than Siri at this point, which seems about 10 years behind the curve
Yup! The hardest part of operationalizing GenAI has been, for me, dragging the "ring" of my context under the light cast by "streetlamp" of the model. Just writing this analogy out makes me think I might be putting the cart before the horse.
Apple products tend to feel thoughtful. It might not be a thought you agree with, but it's there.
With other companies I feel like im starving, and all they are serving is their version of grule... Here is your helping be sure to eat all of it.
https://assets.horsenation.com/wp-content/uploads/2014/07/dw...
No-one is hitting anything out of the park, this is just Apple the company realising that they're falling behind and trying to desperately attach themselves to the AI train. Doesn't matter if in so doing they're validating a company run by basically a swindler (I'm talking about the current OpenAI and Sam Altman), the Apple shareholders must be kept happy.
I kind of feel like their walled garden and ecosystem might just have created the perfect environment for an AI integrated directly to the platform to be really useful.
I’m encouraged, but I am already a fan of the ecosystem…
I also expect it to fail miserably on names (places, restaurants, train stations, people), people that are bilingual, non-English, people with strong accents from English not being their first language, etc.
I did not see the announcement. Can Siri also send emails? If so then won't this (like Gemini) be vulnerable to prompt injection attacks?
Edit: Supposedly Gemini does not actually send the emails; maybe Apple is doing the same thing?
We'll find out later if there's an API to do something like that at all or are external communications always behind some hard limit that requires explicit user interaction.
- Proofread button in mail.
- ChatGPT will be available in Apple’s systemwide Writing Tools in macOS
I expect once you'll get used to it, it'll be hard to go without it.
I can't think of something less exciting than a feature that Gmail has supported for a decade.
Overall there's not a single feature in the article that I find exciting (I don't use Siri at all, so maybe it's just me), but I actually see that as a good thing. The least they add GenAI the better.
Are you saying this type of scenario kills the app, or are you saying the app needs to die, replaced by an API that AIs can interact with, thus homogenizing the user experience, and avoiding the bad parts of Apps?
Which at the backend means unifying necessary data from different product silos, into organized and usable sources.
Yeah but what about people going to the wrong airport, or getting scammed by taking fake information uncritically? "Well it worked for me and anyway AI will get better.". Amen.
Dead Comment
I would contrast this with the trend over the last year of just adding a chatbot to every app, or Recall being just a spicy History function. It's AI without doing anything useful.
But it runs in their cloud.
If you spend an hour drawing a picture for someone for their birthday and send it to them, a great deal of the value to them is not in the quality of the picture but in the fact that you went to the effort, and that it's something unique only you could produce for them by giving your time. The work is more satisfying to the creator as well - if you've ever used something you built yourself that you're proud of vs. something you bought you must have felt this. The AI image that Tania generated in a few seconds might be fun the first time, but quickly becomes just spam filling most of a page of conversation, adding nothing.
If you make up a bedtime story for your child, starring them, with the things they're interested in, a great deal of the value to them is not in the quality of the story but... same thing as above. I don't think Apple's idea of reading an AI story off your phone instead is going to have the same impact.
In a world where you can have anything the value of everything is nothing.
We've been building this up for some time, this tiny universe is the most common thing for me to respond to "will you tell me a story?" (something that is requested sometimes several times a day) since it is so deeply ingrained in both our heads.
Yesterday, while driving to pick up burritos, I dictated a broad set of detailed points, including the complete introductory sequence to the story to gpt-4o and asked it to tell a new adventure based on all of the context.
It did an amazing job at it. I was able to see my kid's reaction in the reflection of the mirrors and it did not take away from what we already had. It actually gave me some new ideas on where I can take it when I'm doing it myself.
If people lean on gen ai with none of their own personal, creative contributions they're not going to get interesting results.
But I know you can go to the effort to create and create and create and then on top of that layer on gen AI--it can knock it out of the park.
In this way, I see gen AI capabilities as simply another tool that can be used best with practice, like a synthesizer after previously only having a piano or organ.
Maybe the fact that you did the dictation together with your child present is also notable. Even though you used the AI, you were still doing an activity together and they see you doing it for them.
Context will be more important when the gift itself is easy.
ai spam, especially the custom emoji/stickers will be interesting in terms of whether they will have any reusability or will be littered like single-use plastic.
Deleted Comment
Same thing for your kid, the kid likes both stories, gives 0 shit that you used GenAI or sat up for 8 hours trying to figure out the rhyme, those things are making YOU feel better not the person receiving it.
I don't think this is true at all. Love is proportional to cost; if it costs me nothing, then the love it represents is nothing.
When we receive something from someone, we estimate what it cost them based on what we know of them. Until recently, if someone wrote a poem just for us, our estimation of that would often be pretty high because we know approximately what it costs to write a poem.
In modern times, that cost calculation is thrown off, because we don't know whether they wrote it themselves (high cost) or generated it (low/no cost).
In that scenario certainly there'll be times when using the AI option will make more sense, since you usually don't have hours to spare, and you also want to make the stories that your kid likes the most, which in this scenario are the AI ones.
But even then there's still that benefit to yourself from spending time on creating things, and I'd encourage anyone to have a hobby where they get to make something just because they feel like it. Even if it's just for you. It's nice to have an outlet to express yourself.
I really enjoyed the explanation for how they planned on tackling server-enabled AI tasks while making the best possible effort to keep your requests private. Auditable server software that runs on Apple hardware is probably as good as you can get for tasks like that. Even better would be making it OSS.
There was one demo where you could talk to Siri about your mom and it would understand the context because of stuff that she (your mom) had written in one of her emails to you... that's the kind of stuff that I think we all imagined an AI world would look like. I'm really impressed with the vision they described and I think they honestly jumped to the lead of the pack in an important way that hasn't been well considered up until this point.
It's not just the raw AI capabilities from the models themselves, which I think many of us already get the feeling are going to be commoditized at some point in the future, but rather the hardware and system-wide integrations that make use of those models that matters starting today. Obviously how the experience will be when it's available to the public is a different story, but the vision alone was impressive to me. Basically, Apple again understands the UX.
I wish Apple the best of luck and I'm excited to see how their competitors plan on responding. The announcement today I think was actually subtle compared to what the implications are going to be. It's exciting to think that it may make computing easier for older people.
Remember this ad? https://www.youtube.com/watch?v=sw1iwC7Zh24 12 years ago, they promised a bunch of things that I still wouldn't trust Siri to pull off.
I can't but feel all of this super creepy.
I remember vividly the comment on Windows Recall that said if the same was done by Apple it would be applauded. Here we are.
Also no competitor is going to be as good at integrating everything, as none of those have as integrated systems.
Deleted Comment
i might have missed it but there has not been much talk about guardrails or ethical use with their tools, and what they are doing about it in terms of potential abuse.
Dead Comment
imo, apple will gain expertise to serve a monster level of scale for more casual users that want to generate creative or funny pictures, emojis, do some text work, and enhance quality of life. I don't think Apple will be at the forefront of new AI technology to integrate those into user facing features, but if they are to catch up, they will have to get into the forefront of the same technologies to support their unique scale.
Was a notable WWDC, was curious to see what they would do with the Mac Studio and Mac Pro, and nothing about the M3 Ultra or M4 Ultra, or the M3/M4 Extreme.
I also predicted that they would use their own M2 Ultras and whatnot to support their own compute capacity in the cloud, and interestingly enough it was mentioned. I wonder if we'll get more details on this front.
We see this play out with the ChatGPT integration. Rather than hosting GPT-4o themselves, OpenAI are. Apple is providing NVIDIA powered AI models through a third party, somewhat undermining the privacy first argument.
I was personally holding out for a federated learning approach where multiple Apple devices could be used to process a request but I guess the Occam's razor prevails. I'll wait and see.
Apple also has a long track record of "you're holding it wrong". I don't expect an amazing AI assistant out of them, I expect something that sometimes does what the user meant.
And yet this was never said.
Closest was this:
> Just don't hold it that way.
Or maybe this:
> If you ever experience this on your iPhone 4, avoid gripping it in the lower left corner in a way that covers both sides of the black strip in the metal band, or simply use one of many available cases.
To be fair, this was just the keynote -- details will be revealed in the sessions.
They repeated this so many times they've made it true.
I mean they have great PR, but in terms of privacy, they extract more information from you than google does.
Google is an ad company, they have a full model of what you like and dont like at different states of your life built.
What does Apple have that's even close?
Not saying you're wrong, I'm just curious what sources or info you're using to make that claim.
1. On-device AI
2. AI using Apple's servers
3. AI using ChatGPT/OpenAI's services (and others in the future)
Number 1 will pass to number 2 if it thinks it requires the extra processing power, but number 3 will only be invoked with explicit user permission.
[Edit: As pointed out below, other providers will be coming eventually.]
This #2, so-called "Private Cloud Compute", is not the same as iCloud. And certainly not the same as sending queries to OpenAI.
Quoting:
“With Private Cloud Compute, Apple Intelligence can flex and scale its computational capacity and draw on larger, server-based models for more complex requests. These models run on servers powered by Apple silicon, providing a foundation that allows Apple to ensure that data is never retained or exposed.“
“Independent experts can inspect the code that runs on Apple silicon servers to verify privacy, and Private Cloud Compute cryptographically ensures that iPhone, iPad, and Mac do not talk to a server unless its software has been publicly logged for inspection.”
“Apple Intelligence with Private Cloud Compute sets a new standard for privacy in AI, unlocking intelligence users can trust.”
Furthermore how private do you think Siri is? Their privacy policy explicitly states they send transcripts of what you say to them. That cannot be disabled.
Those that won’t use those won’t use this either.
I think also a bunch of people will trust Apple’s server more (but not completely) than other third parties.
I am worried about the reliability, if you are relying on it giving important information without checking the source (like a flight) than that could lead to some bad situations.
That being said, the polish and actual usefulness of these features is really interesting. It may not have some of the flashiest things being thrown around but the things shown are actually useful things.
Glad that ChatGPT is optional each time Siri thinks it would be useful.
My only big question is, can I disable any online component and what does that mean if something can't be processed locally?
I also have to wonder, given their talk about the servers running the same chips. Is it just that the models can't run locally or is it possibly context related? I am not seeing anything if it is entire features or just some requests.
I wonder if that implies that over time different hardware will run different levels of requests locally vs the cloud.
Notice what's missing? A photorealistic style.
It seems like a good move on their part. I'm not that wild about the cartoon-ification of everything with more memes and more emojis, but at least it's obviously made-up; this is oriented toward "fun" stuff. A lot of kids will like it. Adults, too.
There's still going to be controversy because people will still generate things in really poor taste, but it lowers the stakes.
I think it shows the context for the information it presents. Like the messages, events and other stuff. So you can quickly check if the answer is correct. So it's more about semantic search, but with a more flexible text describing the result.
I bet that’s going to be the case. I think they added the servers as a stop-gap out of necessity, but what they see as the ideal situation is the time when they can turn those off because all devices they sell have been able to run everything locally for X amount of time.
If you have a M6 MacBook/ipad pro it’ll run your AI queries there if you’re on the same network in two-four years.
I am worried at the infinite ability of teenagers to hack around the guardrails and generate some probably not safe for school images for the next 2 years while apple figures out how to get them under control.
This can be never. LLMs fail fast as you move away from high resourced languages.
They said the models can scale to "private cloud compute" based on Apple Silicon which will be ensured by your device to run "publicly verifiable software" in order to guarantee no misuse of your data.
I wonder if their server-side code will be open-source? That'd be positively surprising. Curious to see how this evolves.
Anyway, overall looks really really cool. If it works as marketed, then it will be an easy "shut up and take my money". Siri seems to finally be becoming what it was meant to be (I wonder if they're piggy-backing on top of the Shortcuts Actions catalogue to have a wide array of possible actions right away), and the image and emoji generation features that integrate with Apple Photos and other parts of the system look _really_ cool.
It seems like it will require M1+ on Macs/iPads, or an iPhone 15 Pro.
The way it works is that when the on-device model decides "this could better be answered by chatgpt" then it will ask you if it should use that. They described it in a way which seems to indicate that it will be pluggable for other models too over time. Notably, ChatGPT 4o will be available for free without creating an OpenAI account.
Deleted Comment
Deleted Comment
- on-device models, which will power any tasks it's able to, including summarisation and conversation with Siri
- private compute models (still controlled by apple), for when it wants to do something bigger, that requires more compute
- external LLM APIs (only chatgpt for now), for when the above decide that it would be better for the given prompt, but always asks the user for confirmation
Deleted Comment
Deleted Comment