fine_tune (u/fine_tune)

fine_tune commented on Top Secret: Automatically filter sensitive information thoughtbot.com/blog/top-s... · Posted by u/thunderbong

fine_tune · 9 days ago

I'm no ruby expert, so forgive my ignorance, but it looks like a small "NER model" packaged as a string convince wrapper named `filter` that tries to filter out "sensitive info" on input strings.

I assume the NER model is small enough to run on CPU at less than 1s~ per pass at the trade off of storage per instance (1s is fast enough in dev, in prod with long convos - that's a lot of inference time), generally a neat idea though.

Couple questions;

- NER doesn't often perform well in different domains, how accurate is the model?

- How do you actually allocate compute/storage for inferring on the NER model?

- Are you batching these `filter` calls or is it just sequential 1 by 1 calls

fine_tune commented on Gemini Embedding: Powering RAG and context engineering developers.googleblog.com... · Posted by u/simonpure

criddell · a month ago

Is RAG how I would process my 20+ year old bug list for a piece of software I work on?

I've been thinking about this because it would be nice to have a fuzzier search.

fine_tune · a month ago

Yes and no, for human search - its kinda neat, you might find some duplicates, or some nearby neighbour bugs that help you solve a whole class of issues.

But the cool kids? They'd do something worse;

They'd define some complicated agentic setup that cloned your code base into containers firewalled off from the world, give prompts like;

Your expert software dev in MY_FAVE_LANG, here's a bug description 'LONG BUG DESCRIPTION' explore the code and write a solution. Here's some tools (read_file, write_file, ETC)

You'd then spawn as many of these as you can, per task, and have them all generate pull requests for the tasks. Review them with an LLM, then manually and accept PR's you wanted. Now your in the ultra money.

You'd use RAG to guide an untuned LLM on your code base for styles and how to write code. You'd write docs like "how to write an API, how to write a DB migration, ETC" and give that as tool to the agents writing the code.

With time and effort, you can write agents to be specific to your code base through fine tuning, but who's got that kind of money?

fine_tune commented on Gemini Embedding: Powering RAG and context engineering developers.googleblog.com... · Posted by u/simonpure

stillpointlab · a month ago

> Embeddings are crucial here, as they efficiently identify and integrate vital information—like documents, conversation history, and tool definitions—directly into a model's working memory.

I feel like I'm falling behind here, but can someone explain this to me?

My high-level view of embedding is that I send some text to the provider, they tokenize the text and then run it through some NN that spits out a vector of numbers of a particular size (looks to be variable in this case including 768, 1536 and 3072). I can then use those embeddings in places like a vector DB where I might want to do some kind of similarity search (e.g. cosine difference). I can also use them to do clustering on that similarity which can give me some classification capabilities.

But how does this translate to these things being "directly into a model's working memory'? My understanding is that with RAG I just throw a bunch of the embeddings into a vector DB as keys but the ultimate text I send in the context to the LLM is the source text that the keys represent. I don't actually send the embeddings themselves to the LLM.

So what is is marketing stuff about "directly into a model's working memory."? Is my mental view wrong?

fine_tune · a month ago

RAG is taking a bunch of docs, chunking them it to text blocks of a certain length (how best todo this up for debate), creating a search API that takes query (like a google search) and compares it to the document chunks (very much how your describing). Take the returned chunks, ignore the score from vector search, feed those chunks into a re-ranker with the original query (this step is important vector search mostly sucks), filter those re-ranked for the top 1/2 results and then format a prompt like;

The user ask 'long query', we fetched some docs (see below), answer the query based on the docs (reference the docs if u feel like it)

Doc1.pdf - Chunk N Eat cheese

Doc2.pdf- Chunk Y Dont eat cheese

You then expose the search API as a "tool" for the LLM to call, slightly reformatting the prompt above into a multi turn convo, and suddenly you're in ze money.

But once your users are happy with those results they'll want something dumb like the latest football scores, then you need a web tool - and then it never ends.

To be fair though, its pretty powerful once you've got in place.

fine_tune commented on Resurrecting a dead torrent tracker and finding 3M peers kianbradley.com/2025/06/1... · Posted by u/k-ian

fine_tune · 2 months ago

You bought a house that had a murder X years ago and are wondering if your guilty for the murder, probably not - aslong as you don't do more murder in it.

I suppose real life is more interesting though, the guy who picked up the domain to stop the global ransomware crisis was picked up after Defcon if memory serves.

Ironically your probably at more risk from the GDPR for leaking those IP addresses that connected to the box via your blog post.

I'm not a lawyer/solicitor though, don't take my advise.

fine_tune commented on DeepMind program finds diamonds in Minecraft without being taught nature.com/articles/d4158... · Posted by u/Bender

fine_tune · 5 months ago

Attempting to train this on a real workload I converted over the weekend after, "step" 8M~ so far and rarely scores above 5% and most are 0% but has scored 60% once 7M~ steps ago.

Adding more than 1 GPU didn't improve speed but that's pretty standard as we don't have fancy interconnect. Bit annoying they didn't use tensorboard for logging, but overall seems like a pretty cool lib - will leave it a few days and see if it can learn (no other algo has so I dont have much hope).

fine_tune commented on Fine-tune Google's Gemma 3 unsloth.ai/blog/gemma3... · Posted by u/tomdekan

danielhanchen · 5 months ago

Oh fill in the middle is definitely smart especially for codebases!!

fine_tune · 5 months ago

Love unsloth btw, use it for some other stuff at work, GRPO stuff was fun :)

I know its coming but "mUlTi GpU PlZ" :pleading: <3

fine_tune commented on Fine-tune Google's Gemma 3 unsloth.ai/blog/gemma3... · Posted by u/tomdekan

smokel · 5 months ago

I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.

RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.

How much effort is required to turn code into something one can use for fine-tuning?

fine_tune · 5 months ago

> I'm interested to know if anyone is using fine-tuning to train a model on proprietary or in-house codebases and documentation.

I've done it, 1/2 the team though it was great 20% of the time, 1/2 the team hated it from day 0. I used roughly 500K lines of code.

> How much effort is required to turn code into something one can use for fine-tuning?

Very little to moderate, less than 200 lines of python, QWEM FIM, HF, LLAMA.CPP, LLAMA.CPP code extension.

> RAG solutions seem to have their limitations, and fine-tuning might be a more effective approach.

The only problem either way is keeping the information up to date, RAG just adds more cost to the inference process (which at my dev speed is pretty important).

> How much effort is required to turn code into something one can use for fine-tuning?

Fine tuning "fill in the middle" process is the process of taking a file, cutting out a some text in the middle and asking AI to guess what was there - there is a hugging face example that will have you doing it in an hour or less - your OPs team saying "No you cant litreally copy all code to a single folder" is probably the biggest hurdle (advise them you'll do it in CI and then they can stand up a FIM training endpoint that accepts a csv, pretty easy)