pruthvishetty · 2 years ago
Which is the best one so far? ElevenLabs? OpenAI apparently has something in the works too (going by today's Spotify podcast updates).
smusamashah · 2 years ago
Google SoundStorm had the best demo so far. It takes a few seconds of original audio and continues it with the same voices. Just listening to those examples, you won't be able to tell where the original ended and the generated audio started.
narrationbox · 2 years ago
Yeah, neural codecs are pretty amazing. The most incredible part is that they can do compression well across the temporal domain, something which has been non-trivial.
airstrike · 2 years ago
I'm also curious. A review of what's state-of-the-art today would be a great idea for a blog post. Just don't post it on medium.com please
robga · 2 years ago
Still Azure Speech in my experience.
radicalriddler · 2 years ago
ElevenLabs is better quality-wise, but it's vastly more expensive. Azure Speech hits a really good price-to-quality ratio.
akshayys · 2 years ago
Hey everyone - I'm the founder of Fluxon. Just saw that we were on HN today - sorry for the late replies.

The app you're using right now is essentially an alpha version - even the website was only made this week. Sorry about the somewhat broken experience so far.

I'll try to get back to everything here in the next couple hours but if I miss something / you have other questions, please ping me on akshay@fluxon.ai

pjmq · 2 years ago
Hey there! Just FYI your APIs are reporting address not found: Error: getaddrinfo ENOTFOUND app.fluxon.dev
shrisukhani · 2 years ago
Sorry about that! Docs are fixed now.

(Other cofounder of Fluxon)

fuddle · 2 years ago
"Ultrarealistic AI Voice Generator" - I initially read the title as "Unrealistic AI Voice Generator". I'd suggest adding a space to "Ultrarealistic".
stavros · 2 years ago
A hyphen, please.
airstrike · 2 years ago
I also read the same thing
sterlind · 2 years ago
the prosody seems a little robotic, and kind of jarring. maybe I'm spoiled by Bark, even in its rough and slow state, but is this really that much of a step up from Tacotron2?
mynegation · 2 years ago
Every time I see one of these, as a big fan of TV crime dramas, I cannot help but think that voice recordings as proof are going to be a thing of the past very soon.
g_xing · 2 years ago
The price is lower than the competitors... I wonder how good it actually is though. Guessing they just sacrifice quality
BeetleB · 2 years ago
Frankly, these don't sound any better than Google Cloud's TTS, and they are orders of magnitude more expensive.
ameliaquining · 2 years ago
I think the killer feature here is supposed to be voice cloning, which IIUC Google Cloud offers only as a custom enterprise thing that takes weeks (which suggests that it's not fully automated).
BeetleB · 2 years ago
Cloning is nice, but what's the point if it doesn't sound natural? Will people really pay 100-1000x (if not more) just to get their preferred voice if it won't sound anything like that person when speaking?
vigneshv59 · 2 years ago
All these TTS services look cool but I don’t know how any of them are different from each other…