Omni uses a policy-based approach to model selection thanks to a model router called Arch-Router from Katanemo.
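Conceptually, a policy-based router maps each incoming request to a named route policy, and each policy to a model. Here is a minimal sketch with hypothetical policy and model names; note that Arch-Router itself matches prompts to policies with a small LLM rather than with keyword rules:

```python
# Minimal sketch of policy-based model routing. Policy and model names
# are hypothetical; the real Arch-Router uses a compact LLM to match a
# prompt against natural-language policy descriptions, not keywords.
POLICIES = {
    "code_generation": {
        "description": "writing or debugging source code",
        "keywords": ("code", "function", "bug", "compile"),
        "model": "provider/code-model",
    },
    "summarization": {
        "description": "condensing long documents",
        "keywords": ("summarize", "tl;dr", "condense"),
        "model": "provider/fast-model",
    },
}
DEFAULT_MODEL = "provider/general-model"

def route(prompt: str) -> str:
    """Return the model assigned to the first matching policy."""
    text = prompt.lower()
    for policy in POLICIES.values():
        if any(keyword in text for keyword in policy["keywords"]):
            return policy["model"]
    return DEFAULT_MODEL
```

The point of the policy layer is that routing preferences stay declarative: you change which model handles which kind of request by editing the policy table, not the application code.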
Under the hood, Xet now powers 5M AI models & datasets on the Hugging Face Hub, which see hundreds of terabytes of uploads and downloads every single day.
What makes Xet so powerful is that it massively speeds up data transfer and reduces its cost thanks to techniques like content-defined chunking (CDC). Instead of treating a file as an indivisible unit, CDC breaks it into variable-sized chunks, using the data itself to define the boundaries. Because boundaries are content-defined, inserting a few bytes into a file only changes the chunks around the edit; the rest of the chunks stay identical and never need to be re-uploaded.
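To make CDC concrete, here is a toy sketch of boundary detection with a cheap shift-and-add hash over recent bytes. The function name, hash, and size parameters are illustrative assumptions; production systems like Xet use tuned rolling hashes and different thresholds:

```python
# Toy sketch of content-defined chunking (CDC). The hash below is a
# simple shift-and-add accumulator, reset at each boundary; real CDC
# implementations use stronger rolling hashes (e.g. gear/Rabin style).
def chunk_boundaries(data: bytes,
                     min_size: int = 2048,
                     avg_size: int = 8192,
                     max_size: int = 65536) -> list[int]:
    """Return the end offsets of each chunk in `data`.

    A boundary is declared when the low bits of the hash are all zero,
    which happens roughly once every `avg_size` bytes on random data.
    """
    mask = avg_size - 1          # avg_size must be a power of two
    boundaries: list[int] = []
    start = 0
    h = 0
    for i, byte in enumerate(data):
        h = ((h << 1) + byte) & 0xFFFFFFFF   # old bytes shift out of range
        if i - start + 1 < min_size:
            continue                          # enforce a minimum chunk size
        if (h & mask) == 0 or i - start + 1 >= max_size:
            boundaries.append(i + 1)          # boundary defined by content
            start = i + 1
            h = 0
    if start < len(data):
        boundaries.append(len(data))          # final partial chunk
    return boundaries
```

Only the bytes near a boundary decision influence it, which is what lets two slightly different versions of a file share most of their chunks, and what makes chunk-level deduplication across uploads possible.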
That's what allows Hugging Face to offer a platform to 10 million open-source AI builders at a fraction of the cost.
Open weights mean models you can truly own, so they'll never get nerfed or taken away from you!
So we're excited to announce a new partnership today to:
- reduce Hugging Face model & dataset upload and download times through Vertex AI and Google Kubernetes Engine, thanks to a new gateway for Hugging Face repositories that caches directly on Google Cloud
- offer native support for TPUs for all open models sourced through Hugging Face
- provide a safer experience through Google Cloud's built-in security capabilities

Ultimately, our intuition is that the majority of cloud spend will be AI-related and based on open source (rather than proprietary APIs), as all technology builders become AI builders, and we're trying to make this easier.
Questions, comments, feedback welcome!