Which I've been using with Qwen3 Coder. As long as infill is supported, that should work. I'll try later today.
FWIW my feeling is positive as far as the core meaning being conveyed; I just feel like I'm missing out on something by not understanding the format.
For example, simple tasks CAN be handled by Devstral 24B or Qwen3 30B A3B, but they often fail at tool use (especially the quantized versions), and you find yourself wanting something bigger, at which point the speed drops a bunch. Even something like Z.ai's GLM 4.6 (through Cerebras, as an example of a bigger cloud model) isn't good enough for certain kinds of refactoring or for writing certain kinds of scripts.
So either you use smaller local models that are hit or miss, or you need a LOT of expensive hardware locally, or you just pay for Claude Code, OpenAI Codex, Google Gemini, or something like that. Even Cerebras Code, which gives me a lot of tokens per day, isn't enough for all tasks, so you'll most likely need a mix - but running stuff locally can sometimes bring the costs down.
For autocomplete, the one thing where local models would be a nearly perfect fit, there just isn't good software: Continue.dev's autocomplete sucks and is buggy (with Ollama), there don't seem to be VS Code plugins good enough to replace Copilot (e.g. its smart edits, where you change one thing in a file and similar changes are needed 10, 25, and 50 lines down), and many aren't even trying - KiloCode had some vendor-locked garbage with no Ollama support, and Cline and RooCode don't attempt autocomplete at all.
And not every model out there supports FIM properly (Qwen3 doesn't), so for a bit I had to fall back to Qwen2.5 Coder, meh. Then when plugins do come out, they're all pretty new and you don't know what supply chain risks you're dealing with. It's the one use case where local models could be good, but... they just aren't.
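To illustrate what "supports FIM properly" means: Qwen2.5 Coder documents dedicated fill-in-the-middle tokens, and you can send them straight to Ollama's generate endpoint in raw mode so no chat template gets in the way. A rough sketch in Python - the model tag and code snippet are just placeholders:

    import requests

    # Qwen2.5 Coder's documented FIM tokens; other models use different
    # ones (or none at all), which is exactly the compatibility headache.
    prefix = "def add(a, b):\n    "
    suffix = "\n\nprint(add(1, 2))\n"
    prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

    resp = requests.post(
        "http://localhost:11434/api/generate",  # default Ollama endpoint
        json={
            "model": "qwen2.5-coder",  # assumes you've pulled this tag
            "prompt": prompt,
            "raw": True,               # skip templating, send tokens as-is
            "stream": False,
            "options": {"num_predict": 64},
        },
        timeout=60,
    )
    print(resp.json()["response"])  # the infilled middle, e.g. "return a + b"

Point that at a model without those tokens and you get garbage back, which is why the editor plugin has to know each model's FIM dialect.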
For all of the billions going into AI, someone should have paid a team of devs to create something that's open (any provider) and doesn't fucking suck. Ollama is cool for the ease of use. Cline/RooCode/KiloCode are cool for chat and agentic development. OpenCode is a bit hit or miss in my experience (copied lines getting pasted individually), but I appreciate the thought. The rest is lacking.
It's what I did when I got a new opener. Works fine in HomeAssistant.
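In case anyone wants to script theirs: once the opener shows up as a cover entity, triggering it from outside Home Assistant is a single REST call. A minimal sketch, assuming the standard REST API with a long-lived access token (the URL, token, and entity id below are made up):

    import requests

    HA_URL = "http://homeassistant.local:8123"  # hypothetical host
    TOKEN = "YOUR_LONG_LIVED_ACCESS_TOKEN"      # create one in your HA profile

    resp = requests.post(
        f"{HA_URL}/api/services/cover/open_cover",  # built-in cover service
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"entity_id": "cover.garage_door"},    # hypothetical entity id
        timeout=10,
    )
    resp.raise_for_status()

Same pattern works for close_cover / toggle; for automations you'd normally just do it inside HA itself.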
We were also looking for DDR4 memory for some older machines, and that has shot up 2x as well.
Hate this AI timeline.
I picked up 32GB (2x16GB) DDR4 (CMK32GX4M2E3200C16) last September for $55. Now it's $155.
I've got two R640s so I can live migrate, and an R720XD running TrueNAS (democratic-csi for K8s persistence). QSFP+ (40GbE) for the TrueNAS / R720XD box, and SFP+ (10GbE) for the R640s, all linked to a Brocade ICX 6610.
So I can update the hosts and K8s nodes with zero downtime. Do I need it? No, but I learned a lot and had / have fun deploying and maintaining it.
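For anyone curious what the zero-downtime dance looks like on the K8s side, it's basically cordon + evict before touching the host (in practice `kubectl drain` does all of this for you). A rough sketch with the official Python client - the node name is made up and retry/error handling is omitted:

    from kubernetes import client, config

    config.load_kube_config()
    v1 = client.CoreV1Api()
    node = "k8s-node-1"  # hypothetical node name

    # Cordon: stop new pods from being scheduled onto the node.
    v1.patch_node(node, {"spec": {"unschedulable": True}})

    # Evict everything except DaemonSet pods (those are pinned to the node).
    pods = v1.list_pod_for_all_namespaces(field_selector=f"spec.nodeName={node}")
    for pod in pods.items:
        owners = pod.metadata.owner_references or []
        if any(o.kind == "DaemonSet" for o in owners):
            continue
        v1.create_namespaced_pod_eviction(
            name=pod.metadata.name,
            namespace=pod.metadata.namespace,
            body=client.V1Eviction(
                metadata=client.V1ObjectMeta(
                    name=pod.metadata.name, namespace=pod.metadata.namespace
                )
            ),
        )

    # ...update/reboot the host, then uncordon with unschedulable=False.

With two hosts you live migrate the VMs first, drain the node, patch, and bring it back.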
I don't notice any difference, other than that I now have a pile of useless Lightning cables (good riddance). Honestly it's kind of a relief, as I liked the 12 just fine. Phones seem like solved tech these days - about as exciting to upgrade as my Brother laser printer.
It had the most bizarre solution: airplane mode, set the time to one year in the future, reboot, wait a few minutes, set the time to 6 months in the future, reboot, wait a few minutes, set the time back to now, reboot. Went from 200GB used down to like 15GB. Was ridiculous.
(For anyone looking at this and considering doing it: you also need to make sure iMessage retention is set to Forever, otherwise the iPhone will think your messages are a year old and delete them.)