holomorphiclabs commented on Ask HN: Most efficient way to fine-tune an LLM in 2024?    · Posted by u/holomorphiclabs
dhouston · a year ago
QLoRA + axolotl + a good foundation model (Llama/Mistral/etc., usually instruction fine-tuned) + RunPod works great.

A single A100 or H100 with 80GB of VRAM can fine-tune 70B open models (and obviously scaling out to many nodes/GPUs is faster, or you can use much cheaper GPUs for fine-tuning smaller models).
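
An illustrative sketch of this kind of QLoRA setup, using the Hugging Face transformers + peft + bitsandbytes stack directly (axolotl drives a similar stack from a YAML config); the model id and hyperparameters below are placeholders, not recommendations:

    # Hedged sketch: 4-bit QLoRA fine-tuning setup; only the small LoRA adapters are trained.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    model_id = "meta-llama/Llama-2-70b-hf"  # placeholder base model; swap for Mistral etc.

    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,                      # quantize the frozen base weights to 4-bit NF4
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, quantization_config=bnb_config, device_map="auto"
    )
    model = prepare_model_for_kbit_training(model)

    lora_config = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # only a small fraction of parameters are trainable

The memory saving comes from holding the frozen base weights in 4-bit and training only the adapters, which is what makes a single 80GB card plausible for a 70B model.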

The localllama Reddit sub at https://www.reddit.com/r/LocalLLaMA/ is also an awesome community for the GPU poor :)

holomorphiclabs · a year ago
Thank you! And yes, huge fan of r/localllama :)
holomorphiclabs commented on Ask HN: Most efficient way to fine-tune an LLM in 2024?    · Posted by u/holomorphiclabs
blissfulresup · a year ago
holomorphiclabs · a year ago
Thank you, we have been exploring this.
holomorphiclabs commented on Ask HN: Most efficient way to fine-tune an LLM in 2024?    · Posted by u/holomorphiclabs
dvt · a year ago
I think you may be misunderstanding what fine-tuning does. It does not teach the model new knowledge. In fact, Meta has a paper out arguing that you only need a data set of about 1,000 examples[1] to achieve pretty good alignment (fine-tuning) results. (100M is way overkill.) For knowledge retrieval, you need RAG (usually using the context window).

[1] https://arxiv.org/pdf/2305.11206.pdf
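
The retrieval approach mentioned above usually boils down to embedding a corpus, pulling the top-k most similar chunks for each query, and placing them in the prompt. A minimal sketch under assumed choices (sentence-transformers for embeddings, a toy in-memory corpus, a hypothetical prompt template):

    # Hedged sketch of a bare-bones RAG loop; embedding model, corpus, and prompt are illustrative.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

    corpus = [
        "Document chunk one ...",
        "Document chunk two ...",
    ]
    corpus_emb = embedder.encode(corpus, normalize_embeddings=True)

    def retrieve(query: str, k: int = 2) -> list[str]:
        # With normalized embeddings, cosine similarity is just a dot product.
        q = embedder.encode([query], normalize_embeddings=True)[0]
        scores = corpus_emb @ q
        return [corpus[i] for i in np.argsort(-scores)[:k]]

    def build_prompt(query: str) -> str:
        context = "\n\n".join(retrieve(query))
        return f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}\nAnswer:"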

holomorphiclabs · a year ago
Our findings are that RAG does not generalize well when critical understanding is spread across a large corpus of information. We do not think it is a question of either context length or retrieval. In our case, it is very clearly about capturing understanding within the model itself.
holomorphiclabs commented on Ask HN: Most efficient way to fine-tune an LLM in 2024?    · Posted by u/holomorphiclabs
Redster · a year ago
What LLM are you hoping to use? Have you considered using HelixML? If I am reading you right, the primary concern is compute costs, not human-time costs?
holomorphiclabs · a year ago
We are finding there is a trade-off between model performance and hosting costs post-training. The optimal outcome is a model that performs well on next-token prediction (and some other in-house tasks we've defined) and that we can host on the lowest-cost provider rather than be locked in. I think we'd only go the proprietary-model route if the model really was that much better. We're just trying to save ourselves weeks/months of benchmarking time and cost if there is already an established option in this space.
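
For the next-token-prediction side of that evaluation, held-out perplexity is a common proxy; a rough sketch of measuring it is below, with a placeholder model and toy evaluation texts standing in for the in-house benchmark:

    # Hedged sketch: held-out perplexity as a next-token-prediction metric.
    # Model id and evaluation texts are placeholders, not the poster's actual benchmark.
    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_id = "mistralai/Mistral-7B-v0.1"  # assumed candidate model
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16, device_map="auto")
    model.eval()

    eval_texts = ["Held-out document one ...", "Held-out document two ..."]

    total_nll, total_tokens = 0.0, 0
    with torch.no_grad():
        for text in eval_texts:
            enc = tokenizer(text, return_tensors="pt").to(model.device)
            # With labels == input_ids, the model returns mean cross-entropy over the
            # n-1 next-token predictions (labels are shifted internally).
            out = model(**enc, labels=enc["input_ids"])
            n = enc["input_ids"].shape[1] - 1
            total_nll += out.loss.item() * n
            total_tokens += n

    print("perplexity:", math.exp(total_nll / total_tokens))
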
holomorphiclabs commented on Ask HN: Most efficient way to fine-tune an LLM in 2024?    · Posted by u/holomorphiclabs
tdba · a year ago
What's your measure of performance?

There's no one-size-fits-all answer yet, but if you just want to test it out, there are many commercial offerings on which you should be able to get some results for under $10k.

holomorphiclabs · a year ago
Are there any that you'd recommend? Honestly, we would rather not share data with any third-party vendors. It's been a painstaking process to curate it.

u/holomorphiclabs

Karma: 29 · Cake day: April 4, 2024
About
Currently building out Holomorphic Labs.

We are a research lab interested in next-generation model and agent architectures.

Hiring for Engineers and Applied Scientists: careers [at] holomorphic [dot] ai

General Info: info [at] holomorphic [dot] ai
