Another Text to Speech API fluxon.ai/...

Which is the best one so far? ElevenLabs? OpenAI apparently has something in the works too (going by today's Spotify podcast updates).

smusamashah · 2 years ago

Google SoundStorm had the best demo so far. It takes a few seconds of original audio and continues it with the same voices. Just hearing those examples, you won't be able to tell where the original ended and the generated audio started.

narrationbox · 2 years ago

Yeah, neural codecs are pretty amazing. The most incredible part is that they can do compression well across the temporal domain, something which has been non-trivial.

airstrike · 2 years ago

I'm also curious. A review of what's state-of-the-art today would be a great idea for a blog post. Just don't post it on medium.com please

robga · 2 years ago

Still Azure Speech in my experience.

radicalriddler · 2 years ago

ElevenLabs is better quality-wise, but it's vastly more expensive. Azure Speech hits a really good price:quality ratio.

akshayys · 2 years ago

Hey everyone - I'm the founder of Fluxon. Just saw that we were on HN today - sorry for the late replies.

The app you're using right now is essentially an alpha-version - even the website was only made this week. Sorry about the somewhat-broken experience so far.

I'll try to get back to everything here in the next couple of hours, but if I miss something / you have other questions, please ping me at akshay@fluxon.ai

pjmq · 2 years ago

Hey there! Just FYI your APIs are reporting address not found: Error: getaddrinfo ENOTFOUND app.fluxon.dev
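For anyone hitting the same thing: ENOTFOUND from getaddrinfo means the hostname failed DNS resolution, so the request never reached a server. A minimal sketch for checking that from Python, assuming only the standard library; `resolves` is a hypothetical helper name for illustration, not part of any Fluxon API.

```python
import socket


def resolves(hostname: str, port: int = 443) -> bool:
    """Return True if the system resolver finds an address for hostname."""
    try:
        # getaddrinfo raises socket.gaierror when the lookup fails,
        # which is the same failure Node reports as ENOTFOUND.
        socket.getaddrinfo(hostname, port)
        return True
    except socket.gaierror:
        return False


if __name__ == "__main__":
    # Hostname copied from the error message above.
    print(resolves("app.fluxon.dev"))
```

If this prints False, the problem is the hostname itself (or local DNS) rather than anything in the request.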

shrisukhani · 2 years ago

Sorry about that! Docs are fixed now.

(Other cofounder of Fluxon)

fuddle · 2 years ago

"Ultrarealistic AI Voice Generator" - I initially read the title as "Unrealistic AI Voice Generator". I'd suggest adding a space to "Ultrarealistic".

stavros · 2 years ago

A hyphen, please.

airstrike · 2 years ago

I also read the same thing

sterlind · 2 years ago

the prosody seems a little robotic, and kind of jarring. maybe I'm spoiled by Bark, even in its rough and slow state, but is this really that much of a step up from Tacotron2?

mynegation · 2 years ago

Every time I see one of those, as a big fan of TV crime dramas, I cannot help but think that voice recordings as proof are going to be a thing of the past very soon.

g_xing · 2 years ago

The price is lower than the competitors... I wonder how good it actually is though. Guessing they just sacrifice quality

BeetleB · 2 years ago

Frankly, these don't sound any better than Google Cloud's TTS, and they are orders of magnitude more expensive.

ameliaquining · 2 years ago

I think the killer feature here is supposed to be voice cloning, which IIUC Google Cloud offers only as a custom enterprise thing that takes weeks (which suggests that it's not fully automated).

BeetleB · 2 years ago

Cloning is nice, but what's the point if it doesn't sound natural? Will people really pay 100-1000x (if not more) just to get their preferred voice when it won't sound anything like that person speaking?

vigneshv59 · 2 years ago

All these TTS services look cool but I don’t know how any of them are different from each other…