As for your bonus question, that is the model you want. In general I'd choose the largest quantized version you can fit on your system. I'm personally running the 8-bit version on my M3 Max MacBook Pro and it runs great!

Performance is unfortunately a loaded word when it comes to LLMs, because it can mean tokens per second or it can mean perplexity (i.e. how well the LLM responds). In terms of tokens per second, quantized models usually run a little faster: memory bandwidth is the constraint, and you're moving less memory around. In terms of perplexity, different quantization strategies work better or worse. I really don't think there's much reason for anyone to use a full fp16 model for inference; you're not gaining much there. I think most people use the 4-bit quants because they're a nice balance. But really it's just a matter of playing with the models and seeing how well they work. For example, some models perform okay when quantized down to 2 bits (I'm shocked that's the case, but people have reported it in their testing), but Mixtral is not one of those models.
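To make the "largest quant that fits" rule of thumb concrete, here's a rough back-of-the-envelope sketch. It assumes ~47B total parameters for Mixtral 8x7B and a simple bits-per-weight model that ignores the KV cache and runtime overhead, so treat the numbers as ballpark only:

```python
# Rough memory estimate for quantized model weights.
# Assumption: ~47B total parameters for Mixtral 8x7B (all experts);
# KV cache and runtime overhead are ignored, so real usage is higher.

def approx_model_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in gigabytes."""
    return n_params * bits_per_weight / 8 / 1e9

MIXTRAL_PARAMS = 46.7e9  # approximate total parameter count

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4), ("2-bit", 2)]:
    print(f"{label:>5}: ~{approx_model_size_gb(MIXTRAL_PARAMS, bits):.0f} GB")
```

That works out to roughly 93 GB for fp16, 47 GB for 8-bit, 23 GB for 4-bit, and 12 GB for 2-bit, which is roughly why an 8-bit quant fits comfortably on a higher-memory M3 Max while fp16 is a stretch even there.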
I would say I care a lot more about perplexity than pure T(okens)PS… it’s good to be able to verbalize that distinction.
For one thing, and granted this is my own experience, that model is much better at coding than any of the others I've tried.
But going beyond that, if I need to do anything complicated that might hit the baked-in filters on these other models, I don't have to worry about it with Mixtral. I'm not doing anything illegal, btw; it's just that I'm an adult and don't need to use the bumper lane when I go bowling. I also approach any interaction with the thing knowing not to trust it 100% and to verify anything it says independently.
Bonus question if you have the time: there's a release by TheBloke for this on HuggingFace (TheBloke/Mixtral-8x7B-Instruct-v0.1-GGUF), but I thought his models were usually "quantised" - does that kneecap the performance at all?
But, people aren’t robots whose movements are controlled by an on-off switch. The government introduced the means for people to arrive and work, and so the people arrived. They are continuing to arrive because the policies have not been updated yet. How is it the immigrants’ fault? Why the hate and the attacks on their dignity & humanity?
The nonstop online vitriol hurts me deeply to read - nowhere is “safe” - Reddit, HN, Instagram… the hate spewers seemingly spend all their time spewing on these platforms to manipulate opinions and tap into the fundamental atavistic psychological flaws of the human mind.