Other commenters here are correct that the LiDAR is too low-resolution to be used as the primary source for the depth maps. In fact, iPhones use four-ish methods (that I know of) to capture depth data, depending on the model and camera used. Traditionally these depth maps were only captured for Portrait photos, but apparently recent iPhones capture them for standard photos as well.
1. The original method uses two cameras on the back, taking a picture from both simultaneously and using parallax to construct a depth map, similar to human vision. This was introduced on the iPhone 7 Plus, the first iPhone with two rear cameras (a 1x main camera and a 2x telephoto camera). Since the depth map depends on comparing the two images, it will naturally be limited to the field of view of the narrower lens. (See the sketch after this list for the basic parallax-to-depth relation.)
2. A second method was later used on iPhone XR, which has only a single rear camera, using focus pixels on the sensor to roughly gauge depth. The raw result is low-res and imprecise, so it's refined using machine learning. See: https://www.lux.camera/iphone-xr-a-deep-dive-into-depth/
3. An extension of this method was used on an iPhone SE that didn't even have focus pixels, producing depth maps purely based on machine learning. As you would expect, such depth maps have the least correlation to reality, and the system could be fooled by taking a picture of a picture. See: https://www.lux.camera/iphone-se-the-one-eyed-king/
4. The fourth method is used for selfies on iPhones with Face ID; it uses the TrueDepth camera's 3D scanning to produce a depth map. You can see this with the selfie in the article; it has a noticeably fuzzier, lower-res look.
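A rough sketch of the geometry behind method 1 (my own illustration, not Apple's actual pipeline): for a rectified stereo pair, depth falls out of how far a feature shifts between the two views.

    // Classic rectified-stereo relation: depth = focal length * baseline / disparity.
    // focalLengthPixels and disparityPixels are in pixels, baseline in meters -> depth in meters.
    // Real pipelines layer calibration, matching confidence, and ML cleanup on top of this.
    func stereoDepth(disparityPixels: Double, focalLengthPixels: Double,
                     baselineMeters: Double) -> Double {
        return focalLengthPixels * baselineMeters / disparityPixels
    }

(The maps these photos embed are typically disparity, i.e. proportional to 1/distance, rather than metric depth.)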
You can also see some other auxiliary images in the article, which use white to indicate the human subject, glasses, hair, and skin. Apple calls the subject one the portrait effects matte and the glasses/hair/skin ones semantic segmentation mattes; all of them are produced using machine learning.
I made an app that used the depth maps and portrait effects mattes from Portraits for some creative filters. It was pretty fun, but it's no longer available. There are a lot of novel artistic possibilities for depth maps.
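If you want to poke at these maps yourself, here's a minimal sketch using Image I/O and AVFoundation (the file URL and error handling are placeholders; it only returns something for photos that actually embed the auxiliary data):

    import AVFoundation
    import ImageIO

    // Pull the disparity/depth map and the portrait effects matte out of a photo file.
    // Some photos store kCGImageAuxiliaryDataTypeDepth instead of ...Disparity, and the
    // skin/hair/glasses mattes live under the kCGImageAuxiliaryDataTypeSemanticSegmentation* keys.
    func readAuxiliaryMaps(from url: URL) throws -> (AVDepthData?, AVPortraitEffectsMatte?) {
        guard let source = CGImageSourceCreateWithURL(url as CFURL, nil) else { return (nil, nil) }

        var depth: AVDepthData?
        if let info = CGImageSourceCopyAuxiliaryDataInfoAtIndex(
                source, 0, kCGImageAuxiliaryDataTypeDisparity) as? [AnyHashable: Any] {
            depth = try AVDepthData(fromDictionaryRepresentation: info)
        }

        var matte: AVPortraitEffectsMatte?
        if let info = CGImageSourceCopyAuxiliaryDataInfoAtIndex(
                source, 0, kCGImageAuxiliaryDataTypePortraitEffectsMatte) as? [AnyHashable: Any] {
            matte = try AVPortraitEffectsMatte(fromDictionaryRepresentation: info)
        }

        return (depth, matte)   // depth?.depthDataMap and matte?.mattingImage are CVPixelBuffers
    }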
> but apparently recent iPhones capture them for standard photos as well.
Yes, they will capture them from the main photo mode if there’s a subject (human or pet) in the scene.
> I made an app that used the depth maps and portrait effects mattes from Portraits for some creative filters. It was pretty fun, but it's no longer available
What was your app called? Is there any video of it available anywhere? Would be curious to see it!
I also made a little tool, Matte Viewer, as part of my photo tool series - but it's just for viewing/exporting them, no effects bundled: https://apps.apple.com/us/app/matte-viewer/id6476831058
I'm sorry for neglecting to respond until now. The app was called Portrait Effects Studio and later Portrait Effects Playground; I took it down because it didn't meet my quality standards. I don't have any public videos anymore, but it supported background replacement and filters like duotone, outline, difference-of-Gaussians, etc., all applied based on depth or the portrait effects matte. I can send you a TestFlight link if you're curious.
I looked at your apps, and it turns out I'm already familiar with some, like 65x24. I had to laugh -- internally, anyway -- at the unfortunate one-star review you received on Matte Viewer from a user that didn't appear to understand the purpose of the app.
One that really surprised me was Trichromy, because I independently came up with and prototyped the same concept! And, even more surprisingly, there's at least one other such app on the App Store. And I thought I was so creative coming up with the idea. I tried Trichromy; it's quite elegant, and fast.
Actually, I feel we have a similar spirit in terms of our approach to creative photography, though your development skills apparently surpass mine. I'm impressed by the polish on your websites, too. Cheers.
> Yes, they will capture them from the main photo mode if there’s a subject (human or pet) in the scene.
One of the example pictures on TFA is a plant. Given that, are you sure iOS is still only taking depth maps for photos that get the "portrait" icon in the gallery? (Or have they maybe expanded the types of possible portrait subjects?)
Cool article. I assume these depth maps are used for the depth of field background blurring / faux bokeh in "Portrait" mode photos. I always thought it was interesting you can change the focal point and control the depth of field via the "aperture" after a photo is taken, though I really don't like the look of the fake bokeh. It always looks like a bad photoshop.
I think there might be a few typos of the file format?
- 14 instances of "HEIC"
- 3 instances of "HIEC"
I think the reason it looks fake is that they actually have the math wrong about how optics and apertures work; they make some (really bad) approximations that, from a product standpoint, can please 80% of people.
I could probably make a better camera app with the correct aperture math. I wonder if people would pay for it, or if mobile phone users just wouldn't be able to tell the difference and wouldn't care.
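For reference, the blur a real lens produces has a simple geometric form, so "correct aperture math" has to at least reproduce this (textbook thin-lens result, not anyone's shipping code):

    // Circle-of-confusion (blur spot diameter on the sensor) for a point at objectDist
    // when the lens is focused at focusDist. All distances and focalLength in meters;
    // fNumber is the N in f/N. Ignores diffraction and field-dependent aberrations.
    func cocDiameter(objectDist: Double, focusDist: Double,
                     focalLength: Double, fNumber: Double) -> Double {
        let aperture = focalLength / fNumber                         // entrance pupil diameter
        let magnification = focalLength / (focusDist - focalLength)
        return aperture * magnification * abs(objectDist - focusDist) / objectDist
    }

Note the asymmetry: blur grows without bound for points in front of the focus plane but approaches a finite limit far behind it, and defocused foreground objects should bleed over the in-focus subject rather than being matted around it.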
There are a few projects now that simulate defocus properly to match what bigger (non-phone camera) lenses do - I hope to get back to working on it this summer but you can see some examples here: https://x.com/dearlensform
Those methods come from the world of non-realtime CG rendering though - running truly accurate simulations with the aberrations changing across the field on phone hardware at any decent speed is pretty challenging...
Most people just want to see blurry shit in the background and think it makes the photo look professional. If you really want to see it fall down, put things in the foreground and set the focal point somewhere in the middle. It'll still get the background blurry, but it gets the foreground all wrong. I'm guessing the market willing to pay for "better" faked shallow depth of field would be pretty small.
Would it be possible to go into more detail about where Apple gets the math wrong and which inaccurate approximations they use? I'm genuinely curious and want to learn more about it.
I'm pretty happy with the results my Pixel produces (apart from occasional depth map errors). Is Google doing a better job than Apple with the blurring, or am I just blissfully ignorant? :-)
If it's all done in post anyway, then it might be a lot simpler to skip building a whole camera app and just give people a way to apply more accurate bokeh to existing photos. I would pay for that.
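A quick-and-dirty version of that is already possible with Core Image if all you want is depth-weighted blur; to be clear, this is the naive approach being criticized in this thread, not a real lens simulation (a sketch; the photo URL is a placeholder):

    import CoreImage

    // Variable blur driven by the photo's embedded disparity map.
    // Naive: no occlusion handling, no aperture shape; in practice you'd first remap the
    // disparity around your chosen focus plane (e.g. abs(d - dFocus)) and scale the mask
    // to match the photo's dimensions.
    func naiveDepthBlur(photoURL: URL, maxRadius: Double = 12.0) -> CIImage? {
        guard let image = CIImage(contentsOf: photoURL),
              let disparity = CIImage(contentsOf: photoURL, options: [.auxiliaryDisparity: true])
        else { return nil }

        let blur = CIFilter(name: "CIMaskedVariableBlur")!
        blur.setValue(image, forKey: kCIInputImageKey)
        blur.setValue(disparity, forKey: "inputMask")         // brighter mask areas get more blur
        blur.setValue(maxRadius, forKey: kCIInputRadiusKey)
        return blur.outputImage
    }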
On the point of fake bokeh, as a photographer I can't stand it. It looks horribly unnatural and nothing like bokeh from a good lens. Honestly astounding people think it looks good.
If you want a pretty portrait, just buy/borrow a cheap DSLR and the resulting image will be 100x better.
I have a camera with some primes and love those portraits. That said, the fake bokeh keeps getting better. With the iPhone 16 Pro it's now good enough that I no longer find it to be an issue.
There’s Reality Composer for iOS, which has a LiDAR-specific feature that lets you capture objects. I was bummed to find out that on non-LiDAR Apple devices it does not, in fact, fall back to photogrammetry.
Just in case you were doing 3d modeling work or photogrammetry and wanted to know, like I was.
I've had the most success doing 3d scanning with Heges. The LiDAR works pretty well for large objects (like cars), but you can also use the Face ID depth camera to capture smaller objects.
I did end up getting the Creality Ferret SE (via TikTok for like $100) for scanning small objects, and it's amazing.
I've also heard good things about Canvas (requires LiDAR) and Scaniverse (LiDAR optional).
I'd be fine with paying for it, but it's clear that they want to employ basic dark patterns and false advertising.
I've had pretty good success with https://3dscannerapp.com - it's mostly intended for people with access to iDevices with LiDAR and an Apple Silicon Mac; in that combination it can work completely offline, capturing on the iDevice and doing the processing on the Mac (using the system photogrammetry API). AFAIK there are also options for using just photos without LiDAR data, and for cloud processing, but I've never tried those.
Yes, those depth maps + semantic maps are pretty fun to look at - and if you load them into a program like TouchDesigner (or Blender or Cinema 4D or whatever else you want) you can make some cool little depth effects with your photos. Or you can use them for photographic processing (which is what Apple uses them for, ultimately).
As another commenter pointed out, they used to be captured only in Portrait mode, but on recent iPhones they get captured automatically pretty much whenever a subject (human or pet) is detected in the scene.
I make photography apps & tools (https://heliographe.net), and one of the tools I built, Matte Viewer, is specifically for viewing & exporting them: https://apps.apple.com/us/app/matte-viewer/id6476831058
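For a DIY version of the viewing/exporting part, Core Image can dump the embedded maps straight to grayscale PNGs that you can drop into TouchDesigner, Blender, etc. (a sketch; paths are placeholders, and photos without the auxiliary images are simply skipped):

    import CoreImage

    // Export the embedded disparity map and portrait effects matte as grayscale PNGs.
    func exportAuxImages(from photoURL: URL, to folder: URL) throws {
        let context = CIContext()
        let gray = CGColorSpaceCreateDeviceGray()
        let auxImages: [(CIImageOption, String)] = [
            (.auxiliaryDisparity, "disparity.png"),
            (.auxiliaryPortraitEffectsMatte, "matte.png"),
        ]
        for (option, name) in auxImages {
            guard let image = CIImage(contentsOf: photoURL, options: [option: true]) else { continue }
            try context.writePNGRepresentation(of: image,
                                               to: folder.appendingPathComponent(name),
                                               format: .L8, colorSpace: gray)
        }
    }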
I might be missing something here, but the article spends quite a bit of time discussing the HDR gain map. Why is this relevant to the depth maps? Can you skip the HDR gain map processing but retain the depth maps?
FWIW I personally hate the display of HDR on iPhones (they make the screen brightness higher than the maximum user-specified brightness) and in my own pictures I try to strip HDR gain maps. I still remember the time when HDR meant taking three photos and then stitching them together while removing all underexposed and overexposed parts; the resulting image doesn't carry any information about its HDR-ness.
I thought the same about the article and assumed I had just missed something - it seemed to have a nice overview of the depth maps but then covered mostly the gain maps and some different file formats. Good article, just a bit of a meandering thread
That only does it in the Photos app. What about online in WebViews? What about third party apps like Instagram? The only surefire way of turning it off everywhere is low power mode.
Just wondering if depth maps can be used to generate stereograms or SIRDS. I remember playing with stereogram generation starting from very similar greyscale images.
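The naive version is just a repeating random pattern whose horizontal period varies with depth. Proper SIRDS algorithms resolve the left/right-eye constraints symmetrically, but here's a row-by-row sketch of the basic idea:

    // depth values in 0...1, 1 = nearest. Nearer pixels get a shorter repeat period,
    // which reads as "closer" when the image is free-viewed.
    func stereogramRow(depth: [Double], basePeriod: Int = 90, maxShift: Int = 30) -> [UInt8] {
        var row = [UInt8](repeating: 0, count: depth.count)
        for x in 0..<depth.count {
            let period = basePeriod - Int(depth[x] * Double(maxShift))
            row[x] = x < period ? UInt8.random(in: 0...255) : row[x - period]
        }
        return row
    }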
They do. The UI to do this is apparently only included in the visionOS version of the Photos app. But you can convert any photo in your album to "Spatial Format" as long as it has a depth map, or is high enough resolution for the ML approximation to be good enough.
It also reads EXIF to "scale" the image's physical dimensions to match the field of view of the original capture, so wide-angle photos are physically much larger in VR-Space than telephoto.
In my opinion, this button and feature alone justifies the $4000 I spent on the device. Seeing photos I took with my Nikon D7 in 2007, in full 3D and at correct scale, triggers nostalgia and memories I'd forgotten I had for many years. It was quite emotional.
Apple is dropping the ball by not making this the primary selling point of Vision Pro. It's incredible.
Aha! I wonder if Apple uses this for their “create sticker” feature, where you press a subject on an image and can extract it to a sticker, or copy it to another image.
Definitely not; that works on any image, no matter the source. Also, depth probably wouldn't help you that much: practically everything would include the floor or table it was sitting on in that case.
That would be a machine-learning-only approach: semantic segmentation.
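For the curious, a minimal sketch of that kind of ML segmentation using Vision's person-segmentation request (an illustration of the general approach; Apple's actual subject-lifting handles arbitrary objects, not just people):

    import CoreVideo
    import Vision

    // Returns a soft grayscale matte for people in an ordinary photo -- no depth data needed.
    func personMatte(for cgImage: CGImage) throws -> CVPixelBuffer? {
        let request = VNGeneratePersonSegmentationRequest()
        request.qualityLevel = .accurate                  // slower, higher-resolution matte
        request.outputPixelFormat = kCVPixelFormatType_OneComponent8
        let handler = VNImageRequestHandler(cgImage: cgImage, options: [:])
        try handler.perform([request])
        return request.results?.first?.pixelBuffer
    }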