Readit News
8bitsrule · 2 years ago
They've done well with the reading -voices-, and enunciations. But the readings themselves are, well, hilariously revealing in many ways. Odd pauses like 'Winnie. The poo...', word-bunchings, mispronunciations, sudden unexpected volume changes, etc. become a constant distraction. All parties in a fictional conversation sound much alike ... who's talking now?

In short, voice-animation pros should not be too worried. Yet.

narrationbox · 2 years ago
Their sentence segmentation heuristics were not configured correctly. It's not an inherent limitation of the technology itself.

The newer transformer-based generators are a bit better in this regard, since they can maintain a longer context window instead of working only on short snippets.
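To illustrate what a segmentation heuristic can get wrong, here is a minimal sketch in Python. It is not the pipeline used for these audiobooks; the function names and the tiny abbreviation list are hypothetical. It compares a splitter that breaks after every period with one that refuses to break after a known abbreviation.

```python
import re

# Hypothetical abbreviation list, for illustration only.
# A real system would need a far larger one, or a trained segmenter.
ABBREVIATIONS = {"mr.", "mrs.", "dr.", "st."}


def naive_split(text):
    """Break after every '.', '!' or '?' that is followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def guarded_split(text):
    """Like naive_split, but rejoin pieces that end in a known abbreviation."""
    sentences = []
    carry = ""
    for piece in naive_split(text):
        candidate = f"{carry} {piece}".strip()
        last_word = candidate.split()[-1].lower()
        if last_word in ABBREVIATIONS:
            # Not a real sentence end: keep accumulating.
            carry = candidate
        else:
            sentences.append(candidate)
            carry = ""
    if carry:
        sentences.append(carry)
    return sentences


if __name__ == "__main__":
    sample = "Mr. Toad went to St. Ives. He was late."
    print(naive_split(sample))
    print(guarded_split(sample))
```

If each returned piece is synthesized separately, the naive version would put a full-stop pause after "Mr." and "St.", which is the kind of odd pause described above.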

GeekyBear · 2 years ago
I haven't looked at much of Project Gutenberg's source text recently, but in the past I found a lot of formatting issues. Strange line breaks and spacing artifacts from OCR were pretty common, and may be the source of some of the weird pauses that were in some works, but not others.
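As a rough sketch of the kind of cleanup that might help before feeding such text to a speech engine, the Python below joins hard-wrapped lines within each paragraph and collapses runs of spaces. It assumes paragraphs are separated by blank lines, which is a guess about the input rather than a statement about any particular Project Gutenberg file, and the function name is made up for this example.

```python
import re


def unwrap_paragraphs(raw):
    """Join hard-wrapped lines within each paragraph and collapse extra spaces.

    Assumes blank lines separate paragraphs (an assumption about the input).
    """
    paragraphs = re.split(r"\n\s*\n", raw.strip())
    cleaned = []
    for para in paragraphs:
        # Replace the line breaks inside a paragraph with single spaces.
        text = " ".join(line.strip() for line in para.splitlines())
        # Collapse any remaining runs of whitespace.
        text = re.sub(r"\s{2,}", " ", text).strip()
        if text:
            cleaned.append(text)
    return cleaned


if __name__ == "__main__":
    raw = "Winnie\nthe   Pooh sat\ndown.\n\n\nHe was  hungry."
    for paragraph in unwrap_paragraphs(raw):
        print(paragraph)
```

This does not handle words hyphenated across line breaks or OCR misreadings; those need proofreading of the kind described below.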

I'm a fan of Standard eBooks, where they take the extra effort to proofread and clean up the OCR results. They make works available as well-formatted epub files with attractive typography, metadata, and cover art.

https://standardebooks.org/

RecycledEle · 2 years ago
I have not noticed any volume changes.
8bitsrule · 2 years ago
I meant, not in the audio level, but some words are said much louder than those around them. 'Misplaced emphasis' for long!
Brajeshwar · 2 years ago
The article talked about and linked to so many URLs, but not the topic in question -- where are the audiobooks? What's the logic behind this type of writing, which is so prolific these days?

What about getting to the point, then supplementing that with the stories? Or start with the story that led to the point.

Edit: Someone updated in the comments; here it is https://marhamilresearch4.blob.core.windows.net/gutenberg-pu...

senkora · 2 years ago
There is also a separate project of humans reading public domain works, called LibriVox: https://librivox.org/
RecycledEle · 2 years ago
I've been leeching these, and it looks like the total will be 1.25 TB. You can always tell a rush job when the audio files are much larger than they need to be.
orasis · 2 years ago
It’s not as good as ElevenLabs and probably not worth most listeners' time. I would wait a year or so for further improvements.
mvidal01 · 2 years ago
This is making me nostalgic for the text to speech that was part of my Amiga.
tnel · 2 years ago
How does this compare to Librivox.org?