Google just released Gemma 2 2B and you can run it on Mac (and other devices), 100% local, powered by llama.cpp!
1. `brew install llama.cpp`
2. `llama-cli --hf-repo google/gemma-2-2b-it-GGUF --hf-file 2b_it_v2.gguf -p "Write a poem about cats as a labrador" -cnv`
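If you'd rather hit the model from code instead of the interactive CLI, llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP API (port 8080 by default). A minimal sketch, assuming you've started the server with the same GGUF file:

```python
# Start the server first (same flags as the CLI command above):
#   llama-server --hf-repo google/gemma-2-2b-it-GGUF --hf-file 2b_it_v2.gguf
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a poem about cats as a labrador"}
        ],
        "temperature": 0.7,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```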
With this, I built a local RAG knowledge and question-answering system with Google Gemma 2 2B and Marqo. Check it out: https://github.com/ellie-sleightholm/marqo-google-gemma2
Link also in the comments!
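The repo has the full pipeline; purely as an illustrative sketch of the retrieve-then-generate loop (the index name and documents here are made up, and I'm assuming a local Marqo instance on its default port 8882 plus the `llama-server` endpoint from the snippet above, which is not necessarily how the repo wires it up):

```python
import marqo
import requests

# Assumes Marqo is running locally on its default port; index name is illustrative
mq = marqo.Client(url="http://localhost:8882")
mq.create_index("gemma-knowledge-base")

mq.index("gemma-knowledge-base").add_documents(
    [{"title": "Gemma 2", "text": "Gemma 2 2B is a small open-weights model from Google."}],
    tensor_fields=["text"],
)

question = "What is Gemma 2 2B?"
hits = mq.index("gemma-knowledge-base").search(question)["hits"]
context = "\n".join(h["text"] for h in hits[:3])

# Hand the retrieved context to the local Gemma 2 2B server for the final answer
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={"messages": [{"role": "user", "content": prompt}]},
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```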
1. You will need to request access to the model in HuggingFace and accept the license. Head to https://huggingface.co/google/gemma-2-2b-it-GGUF/tree/main and there should be an option to request access. This will be approved almost immediately and you will receive an email saying you've been granted access.
2. Create a User Access Token in HuggingFace to download the model. Visit https://huggingface.co/settings/tokens and create a new token. Then, set the token in your environment by running: `export HF_TOKEN=<your_huggingface_token>`
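If you'd rather pull the GGUF file down yourself (instead of letting `llama-cli` fetch it), here's a quick sketch using the `huggingface_hub` Python package with the token you just exported:

```python
import os
from huggingface_hub import hf_hub_download

# Uses the HF_TOKEN exported above; the filename matches the one in the post
model_path = hf_hub_download(
    repo_id="google/gemma-2-2b-it-GGUF",
    filename="2b_it_v2.gguf",
    token=os.environ.get("HF_TOKEN"),
)
print(model_path)  # local path to the downloaded GGUF
```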
Hope that helps! If you run into any further issues, feel free to reply to this comment and I'd be happy to help.