AFAIU SlimPajama is about 627B tokens, and StarCoder:
> approximately 250 Billion tokens.
Ed: I see TFA says:
> Combined Dataset Size - Around 950B tokens
> Total Tokens During Training - 3 trillion (slightly more than 3 epochs/1430k steps)
... but I'm not seeing how one becomes three. Isn't that more like 1 trillion than 3 trillion tokens?
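For what it's worth, the quoted figures do roughly line up if "total tokens during training" means dataset size times number of epochs — that's my reading of the "slightly more than 3 epochs" line, not something TFA spells out:

```python
# Figures as quoted above (my assumption: total = combined size x epochs)
slimpajama = 627e9  # SlimPajama, ~627B tokens
starcoder = 250e9   # StarCoder, ~250B tokens
combined = 950e9    # TFA's "Around 950B tokens" (close to 627B + 250B)
epochs = 3          # "slightly more than 3 epochs"

total = combined * epochs
print(f"{total / 1e12:.2f}T tokens")  # 2.85T — roughly the stated 3 trillion
```

So the dataset itself is ~1T tokens, but each token is seen about three times.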
I spent a while trying to add torch.compile support to A1111, fixing some graph breaks locally, but it was too much. Some other things, like ML compilation backends, are also basically impossible.
While it has a fraction of the features found in stable-diffusion-webui, it has the best out-of-the-box UI I've tried so far. The way it enqueues tasks and renders the generated images beats anything I've seen in the various UIs I've played with.
I also like that you can easily write plugins in JavaScript, both for the UI and for server-side tweaks.
Thanks to both people who cleared up my mistake. Under my mistaken view of how it worked, it had always seemed they had much stronger coverage than they should.