It's a balancing game: how slow a token generation speed can you tolerate? Would you rather get an answer quickly, or wait a few seconds (or sometimes minutes) for reasoning?
For quick answers, Gemma 3 12B is still good. GPT-OSS 20B is pretty quick when reasoning is set to low; at that setting it usually doesn't think for more than a sentence. I haven't gotten much use out of Qwen3 4B Thinking (2507), but at least it's fast while reasoning.
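If you want to put numbers on how slow is tolerable, llama.cpp ships a benchmarking tool that reports prompt-processing and token-generation speed. A minimal sketch, assuming the models are already downloaded as GGUF files (the paths here are placeholders):

```
# Report prompt processing (pp) and token generation (tg) tokens/sec for a model.
# The file paths are placeholders - point them at whatever GGUFs you actually have.
llama-bench -m ~/models/gemma-3-12b.gguf

# Run it against a smaller model too, to see the speed side of the trade-off directly.
llama-bench -m ~/models/qwen3-4b-thinking.gguf
```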
Luckily, llama.cpp has come a long way and is now at a point where I can easily recommend it as the open source option instead. My main uses:
* General Q&A
* Programming questions - mostly Python and Go.
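For this kind of day-to-day use, a minimal way to serve a model with llama.cpp is llama-server, which exposes an OpenAI-compatible HTTP API. A rough sketch (the model path and the prompt are placeholders):

```
# -ngl 99 offloads all layers to the GPU (Metal on macOS), -c sets the context size.
# The model path is a placeholder; adjust the port if 8080 is taken.
llama-server -m ~/models/your-model.gguf -ngl 99 -c 8192 --port 8080

# Any OpenAI-style client (or plain curl) can then talk to it:
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Write a Go function that reverses a slice of strings."}]}'
```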
I had forgotten the command, but it's this sysctl, which lets macOS allocate more of its RAM to the GPU for use with LLMs (I allowed maybe 28 GB on my machine):
sudo sysctl iogpu.wired_limit_mb=184320
Source: https://github.com/ggml-org/llama.cpp/discussions/15396
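A couple of notes on that command, as I understand it: the value is in megabytes (184320 MB is 180 GB, so scale it to your machine - roughly 28 GB would be 28672), and the setting doesn't persist across reboots, so it has to be re-applied. A quick sketch:

```
# Check the current GPU wired-memory limit in MB (0 generally means the OS default).
sysctl iogpu.wired_limit_mb

# Allow roughly 28 GB instead of the 180 GB value above (28 * 1024 = 28672 MB).
sudo sysctl iogpu.wired_limit_mb=$((28 * 1024))
```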