Google just released Gemma 2 2B and you can run it on Mac (and other devices), 100% local, powered by llama.cpp!
1. `brew install llama.cpp`
2. `llama-cli --hf-repo google/gemma-2-2b-it-GGUF --hf-file 2b_it_v2.gguf -p "Write a poem about cats as a labrador" -cnv`
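If you'd rather hit the model from code instead of the interactive CLI, llama.cpp also ships `llama-server`, which exposes an OpenAI-compatible HTTP API (port 8080 by default). A minimal sketch, assuming you've started the server with the same GGUF file:

```python
# Start the server first (same flags as the CLI command above):
#   llama-server --hf-repo google/gemma-2-2b-it-GGUF --hf-file 2b_it_v2.gguf
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "messages": [
            {"role": "user", "content": "Write a poem about cats as a labrador"}
        ],
        "temperature": 0.7,
    },
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```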
With this, I built a local RAG knowledge and question-answering system with Google Gemma 2 2B and Marqo. Check it out: https://github.com/ellie-sleightholm/marqo-google-gemma2
Link also in the comments!
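The repo has the full pipeline; purely as an illustrative sketch of the retrieve-then-generate loop (the index name and documents here are made up, and I'm assuming a local Marqo instance on its default port 8882 plus the `llama-server` endpoint from the snippet above, which is not necessarily how the repo wires it up):

```python
import marqo
import requests

# Assumes Marqo is running locally on its default port; index name is illustrative
mq = marqo.Client(url="http://localhost:8882")
mq.create_index("gemma-knowledge-base")

mq.index("gemma-knowledge-base").add_documents(
    [{"title": "Gemma 2", "text": "Gemma 2 2B is a small open-weights model from Google."}],
    tensor_fields=["text"],
)

question = "What is Gemma 2 2B?"
hits = mq.index("gemma-knowledge-base").search(question)["hits"]
context = "\n".join(h["text"] for h in hits[:3])

# Hand the retrieved context to the local Gemma 2 2B server for the final answer
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={"messages": [{"role": "user", "content": prompt}]},
    timeout=120,
)
print(resp.json()["choices"][0]["message"]["content"])
```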
1. You will need to request access to the model in HuggingFace and accept the license. Head to https://huggingface.co/google/gemma-2-2b-it-GGUF/tree/main and there should be an option to request access. This will be approved almost immediately and you will receive an email saying you've been granted access.
2. Create a User Access Token in HuggingFace to download the model. Visit https://huggingface.co/settings/tokens and create a new token. Then, set the token in your environment by running: `export HF_TOKEN=<your_huggingface_token>`
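If you'd rather pull the GGUF file down yourself (instead of letting `llama-cli` fetch it), here's a quick sketch using the `huggingface_hub` Python package with the token you just exported:

```python
import os
from huggingface_hub import hf_hub_download

# Uses the HF_TOKEN exported above; the filename matches the one in the post
model_path = hf_hub_download(
    repo_id="google/gemma-2-2b-it-GGUF",
    filename="2b_it_v2.gguf",
    token=os.environ.get("HF_TOKEN"),
)
print(model_path)  # local path to the downloaded GGUF
```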
Hope that helps! If you run into any further issues, feel free to reply to this comment and I'd be happy to help.