CoreNet: A library for training deep neural networks

> Relationship with CVNets

> CoreNet evolved from CVNets, to encompass a broader range of applications beyond computer vision. Its expansion facilitated the training of foundational models, including LLMs.

We can expect it to have grown from here: https://apple.github.io/ml-cvnets/index.html

It looks like a mid-level implementation of training and inference. You can see in their "default_trainer.py"[1] that the engine uses Tensors from torch but implements its own training method. They implement their own LR scheduler and optimizer; the caller can optionally use Adam from torch.
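
To make "their own LR scheduler" concrete, here is a minimal sketch of what a standalone scheduler can look like: a pure function from step number to learning rate, using linear warmup followed by cosine decay. The function name, parameters, and default values are hypothetical and chosen for illustration; this is not CoreNet's actual code or API.

```python
import math


def cosine_lr(step, max_steps, max_lr=1e-3, min_lr=1e-5, warmup_steps=0):
    """Learning rate for `step` (hypothetical helper, not CoreNet's API)."""
    if warmup_steps > 0 and step < warmup_steps:
        # Linear warmup from min_lr toward max_lr.
        return min_lr + (max_lr - min_lr) * step / warmup_steps
    # Cosine decay from max_lr down to min_lr over the remaining steps.
    decay_steps = max(1, max_steps - warmup_steps)
    progress = min(1.0, (step - warmup_steps) / decay_steps)
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Print the schedule at a few points of a 100-step run with 10 warmup steps.
for step in (0, 5, 10, 55, 100):
    print(step, cosine_lr(step, max_steps=100, warmup_steps=10))
```

In a PyTorch training loop, a value like this would typically be written into each entry of `optimizer.param_groups` under the `"lr"` key before the optimizer step, which is why a custom scheduler can sit alongside a stock torch optimizer.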

It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.

The MLX examples seem to be inference only at this point. It does look like this might be a landing ground for more MLX specific implementations: e.g. https://github.com/apple/corenet/blob/5b50eca42bc97f6146b812...

It will be interesting to see how it tracks over the next year; especially with their recent acquisitions:

Datakalab https://news.ycombinator.com/item?id=40114350

DarwinAI https://news.ycombinator.com/item?id=39709835

1: https://github.com/apple/corenet/blob/main/corenet/engine/de...

error9348 · a year ago

The interface looks very Apple as well. Looks like you create a config file, and you already have a model in mind with the hyperparameters and it provides a simple interface. How useful is this to researchers trying to hack the model architecture?

One example: https://github.com/apple/corenet/tree/main/projects/clip#tra...

sigmoid10 · a year ago

Not much. But if you just want to adapt/optimize hyperparams, this is a useful approach. So I can certainly see a possible, less technical audience. If you actually want to hack and adapt architectures it's probably not worth it.
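
A rough sketch of the trade-off, assuming a registry-plus-config design of the kind described above. Every name here (`MODEL_REGISTRY`, `register_model`, `TinyMLP`, `build_model`) is hypothetical and is not taken from CoreNet:

```python
from dataclasses import dataclass

# Hypothetical registry that maps a config name to a model class.
MODEL_REGISTRY = {}


def register_model(name):
    """Class decorator that records a model class under `name`."""
    def decorator(cls):
        MODEL_REGISTRY[name] = cls
        return cls
    return decorator


@register_model("tiny_mlp")
@dataclass
class TinyMLP:
    # Stand-in for a real network; only the hyperparameters are modeled.
    hidden_dim: int = 64
    num_layers: int = 2
    dropout: float = 0.0


def build_model(config):
    """Instantiate the model named in config["model"] with its hyperparameters."""
    model_cfg = dict(config["model"])  # copy so the caller's config is untouched
    name = model_cfg.pop("name")
    if name not in MODEL_REGISTRY:
        raise KeyError(f"unknown model: {name!r}")
    return MODEL_REGISTRY[name](**model_cfg)


# Changing a hyperparameter is a one-line config edit.
config = {"model": {"name": "tiny_mlp", "hidden_dim": 128}}
model = build_model(config)
print(model)
```

Tuning `hidden_dim` only touches the config, while a new architecture means writing and registering a new class, which is where this style stops saving effort.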

zitterbewegung · a year ago

What you say about the project is true, but PyTorch works on Macs, and TensorFlow was ported to Macs by Apple.

_aavaa_ · a year ago

They were originally available only as binaries, have they released the code changes required or upstreamed them yet?

blackeyeblitzar · a year ago

> It looks like a mid-level implementation of training and inference

I’m not familiar with how any of this works but what does state of the art training look like? Almost no models release their training source code or data sets or pre processing or evaluation code. So is it known what the high level implementation even is?

spott · a year ago

https://github.com/NVIDIA/Megatron-LM

This is probably a good baseline to start thinking about LLM training at scale.

big-chungus4 · a year ago

They don't implement their own stuff; their optimizers just inherit from PyTorch optimizers.

davedx · a year ago

> It's an interesting (maybe very Apple) choice to build from the ground up instead of partnering with existing frameworks to provide first class support in them.

It smells of a somewhat panicked attempt to prepare for WWDC to me. Apple has really dropped the ball on AI and now they're trying to catch up.

audunw · a year ago

I don’t get the idea that Apple dropped the ball on AI. They were fairly early with adding neural engine hardware to their chips and have been using ML extensively on-device for a long time now.

They haven’t put an LLM assistant out there. But they don’t make their own search engine either so I don’t think “online LLM assistant” is something they’ll ever put much effort into unless it’s part of a bigger effort to launch their own AI-based search engine as well.

As for generative AI I don’t think the quality is up to a level that would be reasonable for Apple.

The only area where I would expect Apple to keep up is the kind of Copilot integration Microsoft is working on. And we know Apple is working on an on-device AI assistant, and probably has been for a long time. It’ll be launched when they can get good quality results on-device. Something nobody else has achieved anyway, so we can’t say that they’re behind anyone yet.

throw0101c · a year ago

> Apple has really dropped the ball on AI and now they're trying to catch up.

Apple put a neural engine on-die in the A11 back in 2017:

* https://en.wikipedia.org/wiki/Apple_A11#Neural_Engine

The A-derived M-series chips had them from the beginning in 2020:

* https://en.wikipedia.org/wiki/Apple_M1#Other_features

Seems like they've been doing machine learning for a while now.

pizza · a year ago

Wouldn’t WWDC-related endeavors be more product-facing? I’m not so sure this has to do with their efforts to incorporate AI into products, and tbh I would say their AI research has been pretty strong generally speaking.

What's the advantage of using this over something like Huggingface Transformers, possibly with the MPS backend?

pshc · a year ago

"MLX examples demonstrate how to run CoreNet models efficiently on Apple Silicon. Please find further information in the README.md file within the corresponding example directory."

> mlx_example/clip: ... an example to convert CoreNet's CLIP model implementation to MLX's CLIP example with some customized modification.

  - FP16 Base variant: 60% speedup over PyTorch
  - FP16 Huge variant: 12% speedup

> mlx_example/open_elm: ... an MLX port of OpenELM model trained with CoreNet. MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware.

Seems like an advantage is extra speedups thanks to specialization for Apple Silicon. This might be the most power-efficient DNN training framework (for small models) out there. But we won't really know until someone benchmarks it.

HarHarVeryFunny · a year ago

OpenELM (ELM = Efficient Language Models) has an unfortunate name clash with another LLM-related open source project.

https://github.com/CarperAI/OpenELM (ELM = Evolution through Large Models)

upbeat_general · a year ago

The implementation seems to be pretty clean and modular here, where transformers (and diffusers) aren’t, unless you take their modules standalone.

This repo has a lot of handy utilities but also a bunch of clean implementations of common models, metrics, etc.

In other words, this is more for writing new models rather than inference.

jaimex2 · a year ago

Nothing, it's basically PyTorch with an Apple logo.

gbickford · a year ago

ipsum2 · a year ago

It's interesting that Apple also actively develops https://github.com/apple/axlearn, which is a library on top of Jax. Seems like half the ML teams at Apple use PyTorch, and the other half use Jax. Maybe it's split between Google Cloud and AWS?

josephg · a year ago

In my experience, this is pretty normal in large companies like Apple. Coordination costs are real. Unless there's a good reason to standardize on a single tool, it's usually easier for teams to just pick whichever tool makes the most sense based on the problem they're solving and what the team has experience with.

tomComb · a year ago

Big companies like Apple yes, but not Apple

te_chris · a year ago

I don’t know, as I haven’t worked there, but I have always heard Apple described more as a series of companies/startups than one coherent entity like Meta or whatever. Each is allowed a large degree of autonomy from what I’ve heard.

flawn · a year ago

aka Google some years ago (don't know about now...)

coder543 · a year ago

They also mention in the README:

> CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data

This is the first I’m hearing of that, and the link seems broken.

simonw · a year ago

The link should go here I think: https://github.com/apple/corenet/tree/main/projects/catlip

seanvelasco · a year ago

somewhat related, i came across this, mlx examples for openai clip: https://github.com/ml-explore/mlx-examples/tree/main/clip

curious to know how fast catlip is. the above using openai clip is already fast.

huac · a year ago

cat's out of the bag, too early?

mxwsn · a year ago

Built on top of pytorch.

leodriesch · a year ago

How does this compare to MLX? As far as I understand MLX is equivalent to PyTorch but optimized for Apple Silicon.

Is this meant for training MLX models in a distributed manner? Or what is its purpose?

It looks like MLX is a part of this initiative. https://github.com/apple/corenet lists "MLX examples" as one of the components being released in April.

reader9274 · a year ago

As mentioned in the "mlx_examples/open_elm": "MLX is an Apple deep learning framework similar in spirit to PyTorch, which is optimized for Apple Silicon based hardware."

dagmx · a year ago

Just skimming the README, it looks like it’s a layer above MLX. So it looks like a framework around it to ease ML.

It's a layer on top of PyTorch, and it has code to translate PyTorch models into MLX.

miki123211 · a year ago

jn2clark · a year ago

I would love an LLM agent that could generate small api examples (reliably) from a repo like this for the various different models and ways to use them.

buildbot · a year ago

Does this support training on Apple silicon? It’s not very clear unless I missed something in the README.

Would such a capability (training) be useful for anything other than small scale experimentation? Apple doesn’t make server products anymore and even when they did, they were overpriced. Unless they have private Apple silicon based servers for their own training needs?

jjtheblunt · a year ago

Isn’t the current Mac Pro available in rack mount form?

https://www.apple.com/mac-pro/

donavanm · a year ago

> Unless they have private Apple silicon based servers for their own training needs?

I'd be SHOCKED if so. It's been 15 years, but I was there when Xserve died. Priorities were iPhone > other mobile devices >>> laptops > displays & desktops >>> literally anything else. When Xserve died we still needed OS X for OD & similar. Teams moved on to 3P rack mount trays of Mac minis as a stop gap. Any internal support/preference for server style hardware was met with a lolwut response. Externally I see no reason to suspect that's changed.

MBCook · a year ago

There are an insane number of Apple Silicon devices out there.

If your product runs on an iPhone or iPad, I’m sure this is great.

If you only ever want to run on 4090s or other server stuff, yeah this probably isn’t that interesting.

Maybe it’s a good design for the tools or something, I have no experience to know. Maybe someone else can build off it.

But it makes sense Apple is releasing tools to make stuff that works better on Apple platforms.

zmk5 · a year ago

I believe the MLX examples allow for it. Seems like a general purpose framework rather than a Mac specific one.

I couldn't find any training code in the MLX examples.