You'll probably get a lot of replies saying that this model is just a fine-tune, or suggesting that we disregard LoRAs, as if we didn't know about them. The reality is that we have thousands of them running on our platform. Sadly, there's only so much a LoRA or a fine-tune can do before you run into issues that can't be solved without more advanced techniques, such as curated post-training runs (including reinforcement-learning-based techniques such as Diffusion-PPO[1]) or even large-scale pre-training.
Sadly, with SAI going effectively bankrupt, things changed. Their rushed 3.0 model was broken beyond repair, and the later 3.5 was just unfinished or something (the API version is remarkably better): gens were full of errors and artifacts, even though the good ones looked great. It turned out to be hard to finetune as well.
In the meantime, Flux got released, but that model can be fried (as in, one concept trained in) but not finetuned (this Krea Flux is not based on the open-weights Flux). Add to that the fact that, as models got bigger, training and finetuning now cost an arm and a leg. So here we are: a year after Flux got released, a good finetune is celebrated as the next new thing :)
As others have said, you can fine-tune any model with a pretty small data set of images and captions, so that your generations neither look like 'AI' nor all look the same.
Here's one I made a while back trained on Sony HVS HD video demos from the 80s/90s -- https://civitai.com/models/896279/1990s-analog-hd-or-4k-sony...
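For anyone wondering why a small data set can be enough for this kind of fine-tune, here is a rough sketch of the LoRA idea: instead of updating a full weight matrix W, you train two thin matrices B and A and use W + B·A. The layer size and rank below are hypothetical, picked only for illustration; they are not taken from Flux, Krea, or any specific model.

```python
# Rough parameter-count comparison: full fine-tune vs. LoRA on one linear layer.
# All sizes are hypothetical and chosen only to illustrate the ratio.

def full_finetune_params(d_out: int, d_in: int) -> int:
    """Trainable weights when the whole d_out x d_in matrix is updated."""
    return d_out * d_in

def lora_params(d_out: int, d_in: int, rank: int) -> int:
    """Trainable weights for a LoRA update B @ A,
    where B is d_out x rank and A is rank x d_in."""
    return rank * (d_out + d_in)

d_out, d_in, rank = 3072, 3072, 16  # assumed layer shape and LoRA rank

full = full_finetune_params(d_out, d_in)
lora = lora_params(d_out, d_in, rank)

print(f"full fine-tune: {full:,} trainable weights")
print(f"LoRA (rank {rank}): {lora:,} trainable weights")
print(f"LoRA trains 1/{full // lora} of the layer's weights")
```

With so few trainable weights per layer, a few dozen or a few hundred captioned images can plausibly shift the style, which is also why this approach has limited capacity for deeper changes to the model.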
(Disclaimer: I am the Krea cofounder and this is based on a small sample size of results I've seen).