Google AI Edge – On-device cross-platform AI deployment

Anybody have any experience with this? I just spent a while contorting a custom PyTorch model to get it to export to Core ML, and it was full of this, that, and the other not being supported, or segfaulting, and all sorts of silly errors. I'd love it if someone could say this isn't full of sharp edges too.

smeej · 3 months ago

I got it all set up and tested out Gemma3 1B on a Pixel 8a. That only took a few minutes, which was nice.

But it was garbage. It barely parsed the question, didn't even attempt to answer it, and replied in what was barely serviceable English. All I asked was how it was small enough to run locally on my phone. It was bad enough for me to abandon the model entirely, which is saying a lot, because I feel like I have pretty low expectations for AI work in the first place.

throwaway314155 · 3 months ago

> All I asked was how it was small enough to run locally on my phone

Bit off-topic, but did you expect to see a real or honest answer about itself? I see many people under the impression that models know information about themselves that isn't in the system prompt. Couldn't be further from the truth. In fact, those questions specifically lead to hallucinations more often, resulting in an overconfident assertion with a "reasonable" answer.

The information the model knows (offline - no tools allowed) stops weeks if not months if not years prior to when the model is done training. There is _zero_ information about its inception, how it works, or anything similar in its weights.

Sorry, this is mostly directed at the masses - not you.

DrSiemer · 3 months ago

Why would you ask a 1B model anything? Those are only useful for rephrasing output at best.

More information here: https://ai.google.dev/edge/mediapipe/solutions/guide

(It seems to be open source: https://github.com/google-ai-edge/mediapipe)

I think this is a unified way of deploying AI models that actually run on-device ("edge"). I guess a sort of "JavaScript of AI stacks"? I wonder who the target audience is for this technology?

wongarsu · 3 months ago

Some of the mediapipe models are nice, but mediapipe has been around forever (or 2019). It has always been about running AI on the edge, back when the exciting frontier of AI was visual tasks.

For stuff like face tracking it's still useful, but for some other tasks like image recognition the world has changed drastically.

babl-yc · 3 months ago

I would say the target audience is anyone deploying ML models cross-platform, specifically ones that would require supporting code beyond the TFLite runtime to make it work.

LLMs and computer vision tasks are good examples of this.

For example, a hand-gesture recognizer might require:
- Pre-processing of input image to certain color space + image size
- Copy of image to GPU memory
- Run of object detection TFLite model to detect hand
- Resize of output image
- Run of gesture recognition TFLite model to detect gesture
- Post processing of gesture output to something useful

Shipping this to iOS+Android requires a lot of code beyond executing TFLite models.
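To make the steps above concrete, here is a minimal sketch of that kind of pipeline as a chain of stages in plain Python. Everything in it is hypothetical: the stage names (`preprocess`, `detect_hand`, `crop_to_hand`, `classify_gesture`, `postprocess`), the thresholds, and the fake "models" are stand-ins I made up for illustration, not MediaPipe's or TFLite's API. The GPU copy step is left out since it has no plain-Python equivalent.

```python
from typing import Any, Callable, Dict, List

# A "frame" is just a dict handed from stage to stage (an assumption for this sketch).
Frame = Dict[str, Any]
Stage = Callable[[Frame], Frame]


def preprocess(frame: Frame) -> Frame:
    # Stand-in for colour-space conversion and resizing: scale pixels to [0, 1].
    return {**frame, "input": [p / 255.0 for p in frame["pixels"]]}


def detect_hand(frame: Frame) -> Frame:
    # Placeholder for the hand-detection model: "hand found" if the
    # mean intensity exceeds a made-up threshold.
    values = frame["input"]
    return {**frame, "hand_found": sum(values) / len(values) > 0.5}


def crop_to_hand(frame: Frame) -> Frame:
    # Stand-in for resizing the detector output for the next model.
    if not frame["hand_found"]:
        return frame
    return {**frame, "crop": frame["input"][:4]}


def classify_gesture(frame: Frame) -> Frame:
    # Placeholder for the gesture-recognition model.
    if not frame["hand_found"]:
        return {**frame, "scores": {}}
    crop = frame["crop"]
    open_score = sum(crop) / len(crop)
    return {**frame, "scores": {"open_palm": open_score, "fist": 1.0 - open_score}}


def postprocess(frame: Frame) -> Frame:
    # Turn raw scores into something an app can use.
    scores = frame["scores"]
    label = max(scores, key=scores.get) if scores else "none"
    return {**frame, "gesture": label}


def run_pipeline(stages: List[Stage], frame: Frame) -> Frame:
    # Run each stage in order, feeding its output to the next one.
    for stage in stages:
        frame = stage(frame)
    return frame


GESTURE_PIPELINE: List[Stage] = [
    preprocess, detect_hand, crop_to_hand, classify_gesture, postprocess,
]
```

Calling `run_pipeline(GESTURE_PIPELINE, {"pixels": [...]})` returns the final frame with a `"gesture"` key. Only two of the five stages would be actual model calls, which is the point: the rest is glue you would otherwise write once per platform.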

The Google Mediapipe approach is to package this graph pipeline, and shared processing "nodes" into a single C++ library where you can pick and choose what you need and re-use operations across tasks. The library also compiles cross-platform and the supporting tasks can offer GPU acceleration options.

One internal debate Google likely had was whether it was best to extend TFLite runtime with these features, or to build a separate library (Mediapipe). TFLite already supports custom compile options with additional operations.

My guess is they thought it was best to keep TFLite focused on "tensor based computation" tasks and offload broader operations like LLM and image processing into a separate library.

salamo · 3 months ago

Really happy to see additional solutions for on-device ML.

That said, I probably wouldn't use this unless mine was one of the specific use cases supported[0]. I have no idea how hard it would be to add a new model supporting arbitrary inputs and outputs.

For running inference cross-device I have used ONNX, which is low-level enough to support whatever weights I need. For a good number of tasks you can also use transformers.js, which wraps ONNX and handles things like decoding (unless you really enjoy implementing beam search on your own). I believe an equivalent link to the above would be [1], which is just much more comprehensive.

[0] https://ai.google.dev/edge/mediapipe/solutions/guide

[1] https://github.com/huggingface/transformers.js-examples
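For a sense of what "implementing beam search on your own" involves, here is a toy sketch in plain Python. It is not how transformers.js does it; the `beam_search` function, the `toy_model` callback, and the probability table are all made up for illustration. A real decoder would call the model for next-token log-probabilities instead of looking them up in a dict.

```python
import math
from typing import Callable, Dict, List, Tuple

Hypothesis = Tuple[List[str], float]  # (tokens so far, summed log-probability)

# Made-up next-token probabilities, keyed by the last token.
TABLE: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.9, "dog": 0.1},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}


def toy_model(tokens: List[str]) -> Dict[str, float]:
    # Stand-in for a real model: log-probabilities for each possible next token.
    return {tok: math.log(p) for tok, p in TABLE.get(tokens[-1], {}).items()}


def beam_search(
    next_log_probs: Callable[[List[str]], Dict[str, float]],
    start: str,
    end: str,
    beam_width: int = 2,
    max_len: int = 5,
) -> Hypothesis:
    beams: List[Hypothesis] = [([start], 0.0)]
    finished: List[Hypothesis] = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates: List[Hypothesis] = []
        for tokens, score in beams:
            for token, logp in next_log_probs(tokens).items():
                candidates.append((tokens + [token], score + logp))
        if not candidates:
            break
        # Keep only the best `beam_width` expansions.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_width]:
            if tokens[-1] == end:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    # Hypotheses that never reached `end` still compete on score.
    finished.extend(beams)
    return max(finished, key=lambda c: c[1])
```

With `beam_width=1` this degenerates to greedy decoding, which commits to "the" and ends up with a less likely sentence than `beam_width=2` finds via "a". Even this toy ignores length normalization, batching, and KV caching, which is why having a library handle decoding is attractive.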

arbayi · 3 months ago

https://github.com/google-ai-edge/gallery

A gallery that showcases on-device ML/GenAI use cases and allows people to try and use models locally.

ricardobeat · 3 months ago

This is a repackaging of TensorFlow Lite + MediaPipe under a new “brand”.

echelon · 3 months ago

The same stuff that powers this?

https://3d.kalidoface.com/

It's pretty impressive that this runs on-device. It's better than a lot of commercial mocap offerings.

AND this was marked deprecated/unsupported over 3 years ago despite the fact it's a pretty mature solution.

Google has been sleeping on their tech or not evangelizing it enough.

pzo · 3 months ago

My take: tensorflow lite + mediapipe was great, but google really neglected it in the last 3 years or so. Mediapipe didn't have many meaningful updates in the last 3 years. A lot of models today are outdated or slow. TF Lite supported NPUs (like the Apple ANE) but mediapipe never did. They also made too much of a mess with different branding: MLKit, Firebase ML, TF lite, LiteRT.

These days it's probably better to stick with onnxruntime via the hugging face transformers or transformers.js library, or wait until executorch matures. I haven't seen any SOTA model officially released with an official port to tensorflow lite / liteRT for a long time: SAM2, EfficientSAM, EdgeSAM, DFINE, DEIM, Whisper, Lite-Whisper, Kokoro, DepthAnythingV2 - everything is pytorch by default, but still with big communities for ONNX and MLX.

yeldarb · 3 months ago

Is this a new product or a marketing page tying together a bunch of the existing MediaPipe stuff into a narrative?

Got really excited then realized I couldn’t figure out what “Google AI Edge” actually _is_.

Edit: I think it’s largely a rebrand of this from a couple years ago: https://developers.googleblog.com/en/introducing-mediapipe-s...

6gvONxR4sf7o · 3 months ago

davedx · 3 months ago

hatmanstack · 3 months ago

Played with this a bit and from what I gathered it's purely a re-arch of pytorch models to work as .tflite models, at least that's what I was using it for. It worked well with a custom finbert model, with negligible size reduction. It also converted a quantized version, but the outputs were not close. From what I remember of the docs it was created for standard pytorch models, like "torchvision.models", so maybe with those you'd have better luck. Granted, this was all ~12 months ago. Sounds like I might have dodged a pack of Raptors?
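For anyone wanting to check "close" on their own conversion, a rough sketch of the kind of comparison I mean, in plain Python. The `compare_outputs` helper, the tolerance, and the logits below are all made up for illustration; in practice the two lists would come from running the original model and the converted one on the same input.

```python
from typing import Dict, Sequence, Union


def compare_outputs(
    reference: Sequence[float],
    candidate: Sequence[float],
    atol: float = 0.05,
) -> Dict[str, Union[float, bool]]:
    # Compare two output vectors (e.g. classifier logits) from the same input.
    if len(reference) != len(candidate) or not reference:
        raise ValueError("outputs must be non-empty and the same length")
    max_diff = max(abs(r - c) for r, c in zip(reference, candidate))
    ref_top = max(range(len(reference)), key=lambda i: reference[i])
    cand_top = max(range(len(candidate)), key=lambda i: candidate[i])
    return {
        "max_abs_diff": max_diff,            # worst-case element-wise error
        "same_top_class": ref_top == cand_top,  # does the prediction survive?
        "close": max_diff <= atol,           # within the chosen tolerance
    }


# Made-up logits for a 3-class sentiment head (not real finbert outputs).
float_logits = [2.10, -0.35, 0.40]
converted_logits = [2.09, -0.34, 0.41]   # plausible float conversion
quantized_logits = [0.90, 1.20, 0.10]    # a conversion that went wrong

float_report = compare_outputs(float_logits, converted_logits)
quant_report = compare_outputs(float_logits, quantized_logits)
```

The `same_top_class` flag is worth tracking separately from the raw error: for a classifier, a quantized model can drift numerically and still be usable as long as the predicted class rarely changes.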