> CoreNet evolved from CVNets, to encompass a broader range of applications beyond computer vision. Its expansion facilitated the training of foundational models, including LLMs.
It looks like a mid-level implementations of training and inference. You can see in their "default_trainer.py"[1] that the engine uses Tensors from torch but implements its own training method. They implement their own LR scheduler and optimizer; the caller can optionally use Adam from torch.
It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.
The interface looks very Apple as well. Looks like you create a config file, and you already have a model in mind with the hyperparameters and it provides a simple interface. How useful is this to researchers trying to hack the model architecture?
Not much. But if you just want to adapt/optimize hyperparams, this is a useful approach. So I can certainly see a possible, less technical audience. If you actually want to hack and adapt architectures it's probably not worth it.
> It looks like a mid-level implementations of training and inference
I’m not familiar with how any of this works but what does state of the art training look like? Almost no models release their training source code or data sets or pre processing or evaluation code. So is it known what the high level implementation even is?
> It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.
It smells of a somewhat panicked attempt to prepare for WWDC to me. Apple has really dropped the ball on AI and now they're trying to catch up.
I don’t get the idea that Apple dropped the ball on AI. They were fairly early with adding neural engine hardware to their chips and have been using ML extensively on-device for a long time now
They haven’t put an LLM assistant out there. But they don’t make their own search engine either so I don’t think “online LLM assistant” is something they’ll ever put much effort into unless it’s part of a bigger effort to launch their own AI-based search engine as well.
As for generative AI I don’t think the quality is up to a level that would be reasonable for Apple.
The only area where i would expect Apple to keep up is the kind of Copilot integration Microsoft is working on. And we know Apple is working on on-device AI assistant, and probably have for a long time. It’ll be launched when they can get good quality results on-device. Something nobody else has achieved anyway, so we can’t say that they’re behind anyone yet.
Wouldn’t WWDC-related endeavors be more product-facing? I’m not so sure this has to do with their efforts to incorporate ai into products, and tbh I would say their ai research has been pretty strong generally speaking.
It's interesting that Apple also actively develops https://github.com/apple/axlearn, which is a library on top of Jax. Seems like half the ML teams at Apple use PyTorch, and the other half uses Jax. Maybe its split between Google Cloud and AWS?
In my experience, this is pretty normal in large companies like Apple. Coordination costs are real. Unless there's a good reason to standardize on a single tool, its usually easier for teams to just pick whichever tool makes the most sense based on the problem they're solving and what the team has experience with.
I don’t know as haven’t worked there, but have always heard Apple described more as a series of companies/startups than one coherent entity like Meta or whatever. Each is allowed a large degree of autonomy from what I’ve heard.
It looks like MLX is a part of this initiative. https://github.com/apple/corenet lists "MLX examples" as one of the components being released in April.
As mentioned in the "mlx_examples/open_elm": "MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware."
"MLX examples demonstrate how to run CoreNet models efficiently on Apple Silicon. Please find further information in the README.md file within the corresponding example directory."
> mlx_example/clip: ... an example to convert CoreNet's CLIP model implementation to MLX's CLIP example with some customized modification.
- FP16 Base variant: 60% speedup over PyTorch
- FP16 Huge variant: 12% speedup
> mlx_example/open_elm: ... an MLX port of OpenELM model trained with CoreNet. MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware.
Seems like an advantage is extra speedups thanks to specialization for Apple Silicon. This might be the most power-efficient DNN training framework (for small models) out there. But we won't really know until someone benchmarks it.
I would love an LLM agent that could generate small api examples (reliably) from a repo like this for the various different models and ways to use them.
Would such a capability (training) be useful for anything other than small scale experimentation? Apple doesn’t make server products anymore and even when they did, they were overpriced. Unless they have private Apple silicon based servers for their own training needs?
> Unless they have private Apple silicon based servers for their own training needs?
Id be SHOCKED if so. Its been 15 years, but I was there when xserve died. Priorities were iphone > other mobile devices >>> laptops > displays & desktops >>> literally anything else. When xserve died we still needed osx for OD & similar. Teams moved on to 3P rack mount trays of mac minis as a stop gap. Any internal support/preference for server style hardware was a lolwut response. Externally I see no reason to suspect thats changed.
> CoreNet evolved from CVNets, to encompass a broader range of applications beyond computer vision. Its expansion facilitated the training of foundational models, including LLMs.
We can expect it to have grown from here: https://apple.github.io/ml-cvnets/index.html
It looks like a mid-level implementations of training and inference. You can see in their "default_trainer.py"[1] that the engine uses Tensors from torch but implements its own training method. They implement their own LR scheduler and optimizer; the caller can optionally use Adam from torch.
It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.
The MLX examples seem to be inference only at this point. It does look like this might be a landing ground for more MLX specific implementations: e.g. https://github.com/apple/corenet/blob/5b50eca42bc97f6146b812...
It will be interesting to see how it tracks over the next year; especially with their recent acquisitions:
Datakalab https://news.ycombinator.com/item?id=40114350
DarwinAI https://news.ycombinator.com/item?id=39709835
1: https://github.com/apple/corenet/blob/main/corenet/engine/de...
One example: https://github.com/apple/corenet/tree/main/projects/clip#tra...
I’m not familiar with how any of this works but what does state of the art training look like? Almost no models release their training source code or data sets or pre processing or evaluation code. So is it known what the high level implementation even is?
This is probably a good baseline to start thinking about LLM training at scale.
Deleted Comment
Deleted Comment
It smells of a somewhat panicked attempt to prepare for WWDC to me. Apple has really dropped the ball on AI and now they're trying to catch up.
They haven’t put an LLM assistant out there. But they don’t make their own search engine either so I don’t think “online LLM assistant” is something they’ll ever put much effort into unless it’s part of a bigger effort to launch their own AI-based search engine as well.
As for generative AI I don’t think the quality is up to a level that would be reasonable for Apple.
The only area where i would expect Apple to keep up is the kind of Copilot integration Microsoft is working on. And we know Apple is working on on-device AI assistant, and probably have for a long time. It’ll be launched when they can get good quality results on-device. Something nobody else has achieved anyway, so we can’t say that they’re behind anyone yet.
Apple put a neural engine on-die in the A11 back in 2017:
* https://en.wikipedia.org/wiki/Apple_A11#Neural_Engine
The A-derived M-series chips had them from the beginning in 2020:
* https://en.wikipedia.org/wiki/Apple_M1#Other_features
Seems like they've been doing machine learning for a while now.
> CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
This is the first I’m hearing of that, and the link seems broken.
curious to know how fast catlip is. the above using openai clip is already fast.
Dead Comment
Is this meant for training MLX models in a distributed manner? Or what is its purpose?
> mlx_example/clip: ... an example to convert CoreNet's CLIP model implementation to MLX's CLIP example with some customized modification.
> mlx_example/open_elm: ... an MLX port of OpenELM model trained with CoreNet. MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware.Seems like an advantage is extra speedups thanks to specialization for Apple Silicon. This might be the most power-efficient DNN training framework (for small models) out there. But we won't really know until someone benchmarks it.
https://github.com/CarperAI/OpenELM (ELM = Evolution through Large Models)
This repo has a lot of handy utilities but also a bunch of clean implementations of common models, metrics, etc.
In other words, this is more for writing new models rather than inference.
https://www.apple.com/mac-pro/
Id be SHOCKED if so. Its been 15 years, but I was there when xserve died. Priorities were iphone > other mobile devices >>> laptops > displays & desktops >>> literally anything else. When xserve died we still needed osx for OD & similar. Teams moved on to 3P rack mount trays of mac minis as a stop gap. Any internal support/preference for server style hardware was a lolwut response. Externally I see no reason to suspect thats changed.
If your product runs on an iPhone or iPad, I’m sure this is great.
If you only ever want to run on 4090s or other server stuff, yeah this probably isn’t that interesting.
Maybe it’s a good design for the tools or something, I have no experience to know. Maybe someone else can build off it.
But it makes sense Apple is releasing tools to make stuff that works better on Apple platforms.