Readit News
hammadh commented on Launch HN: Play.ht (YC W23) – Generate and clone voices from 20 seconds of audio    · Posted by u/hammadh
SergeAx · 3 years ago
Great product, giving it a try. Here you are saying that 20 seconds is enough, but on the "clone" page there is an instruction about 30 minutes for a better result. Is there any kind of instruction on how to create a good sample of the voice? For example, should I speak English, or will any language do? Do you have some stats on the correlation between sample length and generation quality? Thank you!
hammadh · 3 years ago
Thanks. What we've shared here is a demo tool to show our new speech model, which can clone a voice from a few seconds of audio. You can try that with English or non-English recordings, but the generated voice can only speak English at the moment. If you are looking for high-fidelity cloning, you can sign up and try it in our app here - https://play.ht/voice-cloning/

High-fidelity cloning requires at least 20 mins of good quality audio. The more the better.

tanepiper · 3 years ago
"Trusted by 7000+ users and teams of all sizes" [posts a bunch of company logos]

You've just launched in beta, how can you claim this? I'm always very suspicious of this (I take this from the position of being a tech lead at a multi-billion euro retailer whose logo you'll never be able to use)

Is this one developer? A team? Or is this just marketing bullshit for VCs who somehow don't verify if this is true or not?

hammadh · 3 years ago
We launched the playground.play.ht in beta to share the new speech model we are working on. We've been operating play.ht for a while and have teams from these companies using the platform.
Natfan · 3 years ago
This is already being used for scams.

https://playground.play.ht/listen/1079 (https://archive.ph/HKjue)

How exactly do you expect to combat this type of content?

hammadh · 3 years ago
The intention for this playground was to let people try the model. We actually have auto-moderation on the user-facing platform (https://play.ht/): malicious text gets blocked and the user gets flagged.
h1fra · 3 years ago
Congrats on launching. People have already given a lot of feedback on the product itself so I'll keep mine.

Just a few notes on the UX:

- Recording your own voice should contain a script too, that could help increase the quality of the sampling because I struggled to say anything relevant.

- Recording again, there is no timer so it's hard to say when it's okay to stop

- You enforce the checkbox "not [...] to generate any sexual content" yet you have a filter to display only NSFW

- It doesn't work at all with non-English voices, maybe you can add a warning or a way to fine-tune depending on the language?

- There is no way to delete a voice nor an account, that's a huge red flag especially when dealing with PII like this.

- Another person has said it already, but generated voices are identified by an auto-increment ID, making it easy to access the PII of another person. I would recommend at the very least a random string or a UUID

- All generated voices are public and there is no way to delete them
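
To illustrate the auto-increment point above, here is a minimal sketch of generating non-sequential identifiers in Python. Nothing here is Play.ht's actual code: the function names and the `playground.example` URL layout are hypothetical, modelled only on the `/listen/<id>` links seen in this thread.

```python
import secrets
import uuid


def new_voice_id() -> str:
    """Return a random, non-sequential identifier (UUID version 4)."""
    return str(uuid.uuid4())


def new_share_token(num_bytes: int = 16) -> str:
    """Return a URL-safe random token, an alternative to a UUID."""
    return secrets.token_urlsafe(num_bytes)


def listen_url(voice_id: str) -> str:
    # Hypothetical URL layout (assumption), not a real endpoint.
    return f"https://playground.example/listen/{voice_id}"


if __name__ == "__main__":
    # Unlike /listen/1079 -> /listen/1080, the next ID cannot be guessed.
    print(listen_url(new_voice_id()))
    print(listen_url(new_share_token()))
```

Random IDs only make links hard to enumerate; they do not by themselves make a clip private or deletable, so the other points in the list would still need separate fixes.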

hammadh · 3 years ago
Thanks, we intended the playground to be merely a testing tool for the new model we're building. We'll improve based on your feedback!
MattRix · 3 years ago
I think this tech is super cool, but why is the API priced with subscription tiers rather than just some per-word rate? It would make it easier to develop with and budget for if the cost was based on actual usage (like the OpenAI API is, for example).
hammadh · 3 years ago
Yes, we are working on making the API pay as you go soon. Thanks for the feedback!
jameshiew · 3 years ago
I would mention something to that effect in the modal because it wasn't clear to me why it was asking for card details at that point for "$0.00/mo" (though I guessed the reason). Maybe something like "To prevent abuse, we require card details, but you won't be charged", but worded better.
hammadh · 3 years ago
Thank you. Will fix this.
juliennakache · 3 years ago
Looks great. I've been waiting for a service like this ever since Microsoft released their paper on speech synthesis using voice samples. Feature requests:

- make the voice generation available via API so devs can embed it in their apps

- expose a streaming API like Polly so we can feed it text in real time and get the voice as an audio stream

- make it HIPAA compliant and have plans that offer signing a BAA

I'll be your first customer if you do this! You can get in touch with me at @juliennakache

hammadh · 3 years ago
We have an API - https://docs.play.ht/reference/api-getting-started

We have a beta streaming endpoint but the latency is not real time yet (something we're working on) and are adding an endpoint to create voices.

skybrian · 3 years ago
You may get your wish. The FTC posted an article about this a week ago. [1]

> The FTC Act’s prohibition on deceptive or unfair conduct can apply if you make, sell, or use a tool that is effectively designed to deceive – even if that’s not its intended or sole purpose.

It seems like an awfully broad rule? But they probably could go after this startup if they noticed it.

There are some kinds of businesses where making sure the regulators like what you’re doing is pretty much a prerequisite. On the other hand, plenty of companies got where they are today by pushing the limits.

[1] https://www.ftc.gov/business-guidance/blog/2023/03/chatbots-...

hammadh · 3 years ago
Thanks for sharing this ^

u/hammadh

Karma: 247 · Cake day: October 1, 2016