QLoRA + Axolotl + a good foundation model (Llama/Mistral/etc., usually instruction fine-tuned) + RunPod works great.
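For anyone curious what that stack boils down to under the hood, here's a minimal QLoRA sketch using the Hugging Face transformers/peft/bitsandbytes libraries (Axolotl wraps a similar workflow behind a YAML config). The model name and LoRA hyperparameters below are placeholders, not recommendations:

    # Minimal QLoRA setup: 4-bit base weights + small trainable low-rank adapters.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    base = "mistralai/Mistral-7B-Instruct-v0.2"  # placeholder; any instruction-tuned base

    bnb = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")
    model = prepare_model_for_kbit_training(model)  # freezes base weights, prepares norms/inputs for k-bit training

    lora = LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora)
    model.print_trainable_parameters()  # only the adapters train; the 4-bit base stays frozen
    # From here, train with your preferred loop (e.g. trl's SFTTrainer, or Axolotl's config-driven one).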
A single A100 or H100 with 80GB VRAM can fine-tune 70B open models (and obviously scaling out to many nodes/GPUs is faster, or you can use much cheaper GPUs for fine-tuning smaller models).
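Rough back-of-the-envelope on why 80GB is enough for a 70B model with QLoRA (4-bit base weights; only the small LoRA adapter gets gradients and optimizer state; activation memory on top of this depends on batch size and sequence length):

    # Very rough VRAM estimate for QLoRA on a 70B model; real numbers vary with
    # adapter rank, sequence length, batch size, and framework overhead.
    params = 70e9
    base_weights_gb = params * 0.5 / 1e9        # 4-bit quantized base: ~35 GB
    adapter_params = 0.005 * params             # LoRA adapters are typically well under 1% of params
    adapter_gb = adapter_params * 2 / 1e9       # bf16 adapter weights
    optimizer_gb = adapter_params * 12 / 1e9    # Adam moments + fp32 master copy, adapters only
    print(base_weights_gb + adapter_gb + optimizer_gb)  # ~40 GB, leaving headroom on an 80 GB card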
I think you may be misunderstanding what fine-tuning does. It does not teach the model new knowledge. In fact, Meta has a paper out arguing that you only need a dataset of about 1,000 examples[1] to achieve pretty good alignment (fine-tuning) results. (100M is way overkill.) For knowledge retrieval, you need RAG (usually by injecting retrieved documents into the context window).
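To make that concrete, "RAG using the context window" is basically: embed your corpus, retrieve the chunks closest to the question, and stuff them into the prompt. A minimal sketch (the documents, embedding model, and prompt format are all placeholders):

    # Embed a tiny corpus, retrieve the top-k most similar chunks, build a prompt.
    import numpy as np
    from sentence_transformers import SentenceTransformer

    docs = [
        "Our refund policy allows returns within 30 days.",
        "Support is available Monday through Friday, 9am-5pm.",
        "Enterprise plans include a dedicated account manager.",
    ]
    embedder = SentenceTransformer("all-MiniLM-L6-v2")
    doc_vecs = embedder.encode(docs, normalize_embeddings=True)

    def retrieve(query, k=2):
        # Cosine similarity (vectors are normalized, so a dot product suffices).
        q = embedder.encode([query], normalize_embeddings=True)[0]
        scores = doc_vecs @ q
        return [docs[i] for i in np.argsort(scores)[::-1][:k]]

    question = "How long do customers have to return a product?"
    context = "\n".join(retrieve(question))
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    # `prompt` then goes to whatever LLM you're using; the model only ever sees the
    # retrieved chunks, never the full corpus.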
Our finding is that RAG does not generalize well when critical understanding is spread across a large corpus of information. We do not think it is a question of either context length or retrieval quality. In our case it is very clearly about capturing understanding within the model weights themselves.
What LLM are you hoping to use? Have you considered using HelixML? If I'm reading you right, the primary concern is compute costs, not human-time costs?
We are finding there is a trade-off between model performance and hosting costs post-training. The optimal outcome is a model that performs well on next-token prediction (and some other in-house tasks we've defined) and that we can ultimately host on the lowest-cost provider rather than being locked in. I think we'd only go the proprietary-model route if the model really was that much better. We're just trying to save ourselves weeks/months of benchmarking time and cost if there's already an established option in this space.
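FWIW, the next-token-prediction part of that benchmark is cheap to run yourself: perplexity of each candidate model on a held-out slice of your own corpus. A rough sketch with Hugging Face transformers (model name and eval texts are placeholders):

    # Perplexity of a candidate model on held-out domain text (lower is better).
    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "mistralai/Mistral-7B-v0.1"  # swap in each candidate you're comparing
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.bfloat16, device_map="auto"
    )
    model.eval()

    held_out = ["Replace with held-out text from your own corpus."]

    losses = []
    with torch.no_grad():
        for text in held_out:
            enc = tok(text, return_tensors="pt", truncation=True, max_length=2048).to(model.device)
            # Passing input_ids as labels gives mean token-level cross-entropy.
            loss = model(**enc, labels=enc["input_ids"]).loss
            losses.append(loss.item())

    print("perplexity:", math.exp(sum(losses) / len(losses)))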
There's no one-size-fits-all answer yet, but if you just want to test it out, there are many commercial offerings on which you should be able to get some results for under $10k.
The localllama Reddit sub at https://www.reddit.com/r/LocalLLaMA/ is also an awesome community for the GPU poor :)