Very cool model, but the post is a caricature of AI writing. "Okay, let's get into the nitty-gritty. What makes this little beast tick? These aren't just bullet points on a GitHub README; these are the specs that will fundamentally redefine what you thought was possible with local AI." Sure.
This is strictly true but misleading. LLMs were trained on human-written text, but they were then post-trained to generate text in a particular style, and that style does have some recognizable patterns.
Indeed the blurb is absurd and very off-putting. It's not a big deal that "It clocks in at under 25MB with just 15 million parameters", because text-to-speech is a long-solved problem; the Texas Instruments Speak & Spell from 1978 (nearly half a century ago FFS) solved it with LPC synthesis and a few hundred kilobits of ROM, a good deal less than 25MB.
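For scale, here's a back-of-the-envelope sketch of what 15 million parameters cost on disk at a few common precisions (the dtypes are assumptions; the post doesn't say which one the model uses):

    # Rough on-disk size of a 15M-parameter model at common precisions.
    # Hypothetical dtypes; real checkpoints also carry some metadata.
    params = 15_000_000
    for dtype, nbytes in [("fp32", 4), ("fp16", 2), ("int8", 1)]:
        print(f"{dtype}: {params * nbytes / 1e6:.0f} MB")
    # fp32: 60 MB, fp16: 30 MB, int8: 15 MB

"Under 25MB" lands between fp16 and int8, which suggests at least some of the weights are quantized.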
I've re-upped that thread to the same position the previous discussion (this one) was at.
Here is the link to our repo: https://github.com/KittenML/KittenTTS
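For anyone who wants to try it, here's a minimal usage sketch based on the repo's README (the model id, voice name, and sample rate are taken from there and may have changed since):

    from kittentts import KittenTTS
    import soundfile as sf

    # Model id and voice name as listed in the README at the time of posting.
    m = KittenTTS("KittenML/kitten-tts-nano-0.1")
    audio = m.generate("This high quality TTS model works without a GPU",
                       voice="expr-voice-2-f")

    # The README writes the output as a 24 kHz wav.
    sf.write("output.wav", audio, 24000)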
This is an ouroboros that will continue.
(Not saying this one is or isn't AI-written; just that these claims are rampant across a huge number of posts and seem to be growing.)
No human comments on meta formatting like that outside the deepest trenches of Apple/FB corporate stuff.