xrd · 2 years ago
When I read Rikki Tikki Tavi to my 8 yr old daughter, we play a game. She asks me to change one of the words in the page and she tries to listen and see if she can figure it out. It is mentally taxing at the end of a long day to do that on the fly without pausing to figure out the word to slip in. And, my daughter is very sharp and catches them.

I listened to a few of these. The voice sounds muted at times, as if the reader has a stuffy nose. H.G. Wells was read with a pause in between each period because it "thinks" that each letter boundary is a sentence change, which drove me batty. And, there is zero life in the stories. It might be a good thing to put in front of a kid to put them to sleep, maybe? But, it would not put me to sleep because it is just aggravating to listen to these stories stripped of all life by AI.

Like Louis CK said: "Everything is amazing and no one is happy." I know it is incredible that AI can take in a transcript and produce something that most people would not be able to distinguish from a real human. But, we should ask if you would want to hang out with the voice actor at a party.

ecshafer · 2 years ago
> Like Louis CK said: "Everything is amazing and no one is happy."

Everything is not amazing. Sure, things are amazing from a technical perspective. But I think most tech advancements in the last 30 years or so have been harmful to society. It's awesome that computers are so powerful and we have awesome video and photos and can share things so easily. But technology should better lives, not cheapen them, which it often does. Tech is being used to try to replace essential human lived experiences, to inject advertising into them and extract money.

Technology cannot replace the human; it's impossible. No matter how good the AI is at reading the book, it will never replace sitting next to your parent and them reading it. No matter how easy it is to share a video or a photo, it will never replace sitting next to someone and them showing you photos, or better yet being there when the photo was taken.

jzb · 2 years ago
I forget the exact quote but the thing I've seen making the rounds sums it up pretty well: Computers were supposed to do the work so people could make art and write poetry. Now the computers are making art and writing poetry and I still have to have a job.

In another life I'd love to do voice over work. (I even have a face for radio!) But, instead, technology is being used to avoid even having humans do that type of work. Sure, today it's PG, but they're definitely doing this with an eye to replacing actual voiceover actors.

Every advance in AI is "how can we replace people and save money?" and not "how can people have better lives and work less?" And it's going to continue until it's "what the fuck do we do with all these jobless people who've been replaced?"

OfSanguineFire · 2 years ago
> it will never replace sitting next to someone and them showing you photos

It definitely does replace that. It sucks so much to be trapped next to someone showing you their photo album or vacation slides, when you don’t really care, that this became a stock scene in 20th-century comedy TV series and films. Nowadays when people are sharing their photos online, that gives their peers the choice of whether to look or whether to ignore, and that is immensely freeing.

karmelapple · 2 years ago
I'd suggest reading How to Do Nothing by Jenny Odell [1]. I think it addresses some of the concerns you have.

1. https://www.goodreads.com/en/book/show/42771901

tcmb · 2 years ago
Please don't dissect an aphorism by a standup comedian like it was a Ph.D. thesis.

janekm · 2 years ago
Elevenlabs is a lot closer to compelling audiobook narration (needs a better way to deal with multiple characters in a story without manual use of multiple voices): https://pub-a24da573c61f4b2d905bdebb2d0ecf88.r2.dev/ElevenLa... (an H.G.Wells example I just asked it to read).
tkgally · 2 years ago
I was going to mention ElevenLabs, too. Their samples are very impressive in how the intonation and word stress are varied based on the text’s meaning. Their pricing is a bit high for personal use, though.

(The link you posted seems to have been truncated. Can you try posting it again?)

satvikpendem · 2 years ago
Any open source alternatives?

lynx23 · 2 years ago
Now you have a pretty good idea how blind people must feel. Yes, a good audiobook should be read by a human. But if you don't have that, speech synthesis is the best or even only thing you can get. And then, many years later, you read a post like yours. And you realize that man is spoiled.

Signed, a blind man

prepend · 2 years ago
> But, we should ask if you would want to hang out with the voice actor at a party.

I think the question is really “Will I be able to enjoy great books I otherwise would not have experienced?”

For me, it’s not that these are superior or equivalent to parents reading to their kid or to voice actors; it’s whether I’ll listen for free to a book that I wouldn’t be able to afford at $10-30.

therealdrag0 · 2 years ago
Plus lots of books don’t have audiobooks. I’ve a few sitting on my to read list for years on end just cuz there’s no audio. Being able to make one myself with AI would be awesome.
low_tech_love · 2 years ago
Audible is $7.95 a month and you can listen to whatever book you want (like Spotify). If you’re not willing to go even with that in order to listen to an actual human, then maybe yeah, you can try AI.
barrenko · 2 years ago
We're living through the Great Enshittification.
fuzztester · 2 years ago
And it is living through us, or on us.
gnutrino · 2 years ago
Seth Godin did a whole Akimbo podcast that was written by ChatGPT, and the audio was AI generated. The voice was spot on, but the content and delivery were dead. I almost fell asleep listening to it, which is NEVER the case for any other episode of Akimbo I've listened to.
Pxtl · 2 years ago
> H.G. Wells was read with a pause in between each period because it "thinks" that each letter boundary is a sentence change

This is why I'm a firm "two spaces after the period" guy. Makes it unambiguous the difference between the abbrevs. period and the sentence-end period. Otherwise you get sentences like "Let's not forget that Dr. Principal does not care about this." which can be read in two valid ways.
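To make the two-spaces idea concrete, here is a minimal sketch in Python, assuming the input text really does use two spaces after every sentence-ending period. The function names `split_naive` and `split_two_space` and the sample string are made up for illustration; this is not how any particular TTS system segments text.

```python
import re

def split_naive(text: str) -> list[str]:
    # Treats every period followed by whitespace as a sentence end,
    # so abbreviations like "Dr." and initials like "H.G." cause false breaks.
    return [s for s in re.split(r"(?<=\.)\s+", text) if s]

def split_two_space(text: str) -> list[str]:
    # Only breaks where ., ! or ? is followed by two or more spaces.
    return [s.strip() for s in re.split(r"(?<=[.!?]) {2,}", text) if s.strip()]

# Hypothetical sample that follows the two-space convention.
sample = (
    "Let's not forget that Dr. Principal does not care about this.  "
    "H.G. Wells wrote it.  The end."
)

for sentence in split_two_space(sample):
    print(sentence)
```

On this sample the naive splitter yields five fragments while the two-space splitter yields the three intended sentences. The approach only helps while the double spaces survive; anything that collapses whitespace or wraps lines removes the signal.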

bloak · 2 years ago
Of course some style guides would tell you not to put a dot after "Dr" because "r" is the last letter of "Doctor". Similarly, the abbreviation of "Saint" would be "St", while the abbreviation of "Street" would be "St.", according to those style guides.

Meanwhile the GB military style guide says never to use a dot after any abbreviation, I think.

Also, the style guides I'm familiar with prescribe "H. G. Wells", rather than "H.G. Wells", but "H.G.W." if you're abbreviating all of the words.

None of this is of much interest to anyone who isn't an editor but I thought I'd mention it anyway.

cxr · 2 years ago
This trick doesn't work across linebreaks (unless you adopt a rule like "treat the spaces in the nouns as non-breaking and do not permit a linebreak for anything that isn't a sentence boundary").
ta988 · 2 years ago
Sidenote, I asked ChatGPT about where to put the comma and how it would change the meaning of the sentence. It got it right.
jtbayly · 2 years ago
How does it feel to have websites and books and newspapers and practically every other place silently ignore your double spaces and treat them as a single space?
didntcheck · 2 years ago
See also: Librivox [1], for public-domain audio books read by actual humans

[1] https://librivox.org/

bunderbunder · 2 years ago
Lit2Go is also good: https://etc.usf.edu/lit2go/

The sibling poster is right, the quality varies. But the upper end of the quality range is really quite good. One of the best-read audiobooks I've ever heard was a Lit2Go edition of Pygmalion. And, for that matter, one of the worst-read audiobooks I've ever heard was an edition of an extremely famous and commercially successful book that I bought on Audible.

low_tech_love · 2 years ago
Are you a specist? Why should we value more an audiobook that's read by a human?

Sorry, just joking. But here's a reason: these things were not quality-checked at all. Click on Moon Voyage by Jules Verne and be greeted with a very human-like voice reading a numbered list of "other works by the author" in an extremely awkward fashion that's probably caused by how the .TXT file is organized.

yorwba · 2 years ago
Most likely the AI was actually trained on LibriVox, potentially even on largely the same books...

EDIT: The first book on the list https://marhamilresearch4.blob.core.windows.net/gutenberg-pu... is "100%: the Story of a Patriot" and the LibriVox version is at https://librivox.org/100-the-story-of-a-patriot-by-upton-sin...

DanielleMolloy · 2 years ago
Not a good idea then. The librivox experience turned me away from professionally read audiobooks for far too long.

Amateur readers are hit & miss. A lot of professional readers are actors or have a lot of experience. There is a reason people do pay for professionally read books instead of electronic reading or librivox only.

JKCalhoun · 2 years ago
Thanks, I'm happy this exists. I think I'll start contributing this fall.
tmountain · 2 years ago
I was kind of hoping this was going to be human beings contributing read-aloud versions of Gutenberg content. Since it’s not, I’ll propose a cool project. Raise money to enlist high quality voice actors to create audiobooks from Gutenberg. Release these audiobooks to the world for free. Which books come first could be voted upon. As someone who has used TTS a lot in recent projects, I’m not excited about listening to AI read a book to me. It feels soulless.
dirtyid · 2 years ago
>As someone who has used TTS a lot in recent projects, I’m not excited about listening to AI read a book to me. It feels soulless.

AI TTS is still uncanny valley enough to distract. I prefer even more soulless traditional TTS, which sounds "neutral" after habituation, to the point where my brain can start layering on characterization as if I were reading. AI TTS feels like listening to a mediocre voice actor, where it's hard to overwrite their creative choices, so I'm just left disappointed and annoyed.

IggleSniggle · 2 years ago
I agree completely! I kinda like the neutral tone of a soulless robot when it knows how to stay out of the way. Far better than a bad AI _or_ a poor human reading.
crop_rotation · 2 years ago
I have used TTS in the past and in the last few years there has been a quantum leap in TTS quality. A similar such leap in the next few years and it will dominate the audiobook scene for good or bad.
hedora · 2 years ago
AI might dominate, but it would be a loss. Here’s a tutorial explaining modern audio fiction:

https://www.drabblecast.org/2018/07/30/inside-drabblecast-au...

(In audio format, of course; roughly 1.5 hours)

————

This episode takes you inside Drabblecast audio production. Ever wonder how we produce an episode of the Drabblecast? Wonder no more!

We dig into all the technical aspects like voice acting, sound editing and mixing, foley effects, music and more.

Preproduction? Reading? Acting? Yeah, it’s all here folks, all the blood sweat and tears that go into every production of the Drabblecast.

totetsu · 2 years ago
Does anyone know of any TTS available now that doesn't completely muck up foreign words? I know you can make custom pronouncing dictionaries to use with some of the open source ones, but I wonder if any of the more modern systems are good for this. I have been listening to the English news podcast from a Japanese newspaper that is made with TTS, and it gets its one job, pronouncing Japanese names and places, completely and jarringly wrong.
j3d · 2 years ago
Shameless plug - if you download lots of audiobooks and need help organizing them and figuring out which to listen to next, check out Audiobook Locker: https://gitlab.com/fonner/audiobook-locker. It's a desktop app (built with Tauri) that scans your audiobook folder and lets you sort, search and tag your audiobooks.
Borrible · 2 years ago
Sounded better than the handful of random corresponding Librivox recordings I listened to in order to compare them. To be honest, a lot of people go to great lengths to make Librivox recordings without having the skills to read aloud.

Which is a pity, but nonetheless.

jawns · 2 years ago
I agree. I love the idea of Librivox, but the volunteers vary widely in quality.

Some are non-native English speakers, some have lisps or other articulation problems, some have other marks of fluency deficiencies, some have under- or over-dramatic intonation, etc.

And even if they're perfect voice actors, often their microphone setups are sub-par, and it comes through in the recording.

Frankly, these AI voices are now at a level where the few mistakes they make are easier to forgive than some of those issues from human readers.

That said, the final hurdle -- giving them the brains to know when to skip or resolve hiccups in the source material, such as typos, formatting issues, or text not intended to be read aloud -- is going to be very hard to overcome.

everybodyknows · 2 years ago
> hiccups in the source material

From Joyce's Ulysses (capitalization possibly wrong): "nes. yo."

Good luck with that!

Borrible · 2 years ago
Absolutely.

So, will your books be available with an audio section as a free encore in the future? :)

dirtyid · 2 years ago
It's getting more passable. As someone who has listened to a lot of TTS at high speed for years, I eventually adapted my brain to it, and now it feels similar to physical reading with subvocalization, where I can adjust the voices of characters. It's occasionally even preferable, e.g. for the many overproduced podcasts these days, where I just TTS the transcript.
The_suffocated · 2 years ago
Great news. It seems there is still much room for improvement, though. E.g. in “A Short History of the World” by H. G. Wells, the AI reads Darius I and Charles V as “Darius Eye” and “Charles Vee”. Open and closed brackets in sentences are not read out. The intonation is also a bit unnatural. But it is intelligent enough to parse 1,782 as 1782 rather than two numbers.
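The "Darius Eye" problem is the kind of thing a text-normalization pass in front of the synthesizer could catch. Below is a rough Python sketch, assuming a caller-supplied set of ruler names (the `ruler_names` parameter, the helper names, and the ordinal table are all hypothetical, not taken from the project being discussed). The name list is there so that the pronoun "I" after a capitalized word is left alone.

```python
import re

ROMAN_VALUES = {"I": 1, "V": 5, "X": 10}
ORDINAL_WORDS = [
    "First", "Second", "Third", "Fourth", "Fifth", "Sixth", "Seventh",
    "Eighth", "Ninth", "Tenth", "Eleventh", "Twelfth", "Thirteenth",
    "Fourteenth", "Fifteenth", "Sixteenth",
]

def roman_to_int(numeral: str) -> int:
    # Standard subtractive rule: a smaller value before a larger one is subtracted.
    # No validation of malformed numerals is attempted in this sketch.
    total = 0
    for i, ch in enumerate(numeral):
        value = ROMAN_VALUES[ch]
        if i + 1 < len(numeral) and ROMAN_VALUES[numeral[i + 1]] > value:
            total -= value
        else:
            total += value
    return total

def expand_regnal_numbers(text: str, ruler_names: set[str]) -> str:
    # Rewrites "Name <Roman numeral>" as "Name the <Ordinal>" for known names only.
    def replace(match: re.Match) -> str:
        name, numeral = match.group(1), match.group(2)
        number = roman_to_int(numeral)
        if name not in ruler_names or not 1 <= number <= len(ORDINAL_WORDS):
            return match.group(0)  # leave anything uncertain untouched
        return f"{name} the {ORDINAL_WORDS[number - 1]}"

    return re.sub(r"\b([A-Z][a-z]+) ([IVX]+)\b", replace, text)

print(expand_regnal_numbers("Darius I and Charles V", {"Darius", "Charles"}))
```

The obvious weakness is that someone has to maintain the name list; without it, a sentence like "Then I went home" would be mangled.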

Another problem is that the audio clips are not broken into sessions. There is no way to locate the beginning of a chapter, for instance.

low_tech_love · 2 years ago
I noticed the same, especially when some kind of structured text (like a bullet point or numbered list) comes in.
hedora · 2 years ago
Some podcasts embed chapter start timestamps into the rss (atom? m3u?) metadata.

That way, you get one file (and gapless playback), but most players have a chapter skip button that will do what you are asking for.
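For a single-file audiobook, a related option is chapter markers stored in the audio file itself rather than in the feed. This is a different mechanism from the RSS metadata mentioned above. The Python sketch below builds an FFmpeg metadata file (the `;FFMETADATA1` text format) from chapter titles and durations. The function names and the sample chapter list are hypothetical, and it assumes the per-chapter durations are already known.

```python
def escape_value(value: str) -> str:
    # The FFmpeg metadata format requires '=', ';', '#', '\' and newlines
    # to be escaped with a backslash.
    for ch in ("\\", "=", ";", "#", "\n"):
        value = value.replace(ch, "\\" + ch)
    return value

def build_ffmetadata(chapters: list[tuple[str, float]]) -> str:
    # chapters: (title, duration in seconds), in playback order.
    lines = [";FFMETADATA1"]
    start_ms = 0
    for title, duration_s in chapters:
        end_ms = start_ms + int(round(duration_s * 1000))
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",  # START and END are in milliseconds
            f"START={start_ms}",
            f"END={end_ms}",
            f"title={escape_value(title)}",
        ]
        start_ms = end_ms
    return "\n".join(lines) + "\n"

# Hypothetical chapter list for illustration.
metadata = build_ffmetadata([("Chapter 1", 60.0), ("Chapter 2", 30.5)])
print(metadata)
```

The resulting text can be saved to a file and passed to ffmpeg as a second input with `-map_metadata 1` and stream copy; whether the chapters survive depends on the output container, so treat this as a starting point rather than a recipe.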

davidzweig · 2 years ago
Oh, snap, we've been working on importing Gutenberg to LR:

https://www.languagereactor.com/m/t_en_-

We're ranking them using the download count, and also this prompt to chatGPT (it's primarily for language learners):

"Is this text engaging and interesting for a modern reader, someone not into fine literature? Rate the text excellent, good, ok or poor. I don't want crusty, flowery, contorted language, talking about buttons and mannerisms and the hue of the sky etc."

Then, we're rewriting the ~1000 most popular books using chatGPT to modernise/simplify the text.

We're using some Markdown as an internal format, drawing from the Gutenberg plain text and HTML formats; this will go to a GitHub repo shortly.

There are translations, and then we need to look at the current best TTS voices.

hedora · 2 years ago
Standard Ebooks has the ability to filter books by reading level.

That seems much better for people trying to learn English.

https://standardebooks.org/

They carefully curate and copy-edit their books, and go for quality over quantity. I think that is probably the right choice. We already have free access to an effectively infinite amount of mediocre content on the internet.

davidzweig · 2 years ago
I did check that out. The thing is, the criteria are a little different. This is for learners who need to practice reading English. The scripts surface a lot of material in Gutenberg that otherwise would be time-consuming to find: 1950s science fiction that still reads well, or story books for children. Someone could certainly call these books 'mediocre', but I think that's a bit adjacent to what we're trying to do here. Whether this idea is worth the effort relative to other sources of material is still undecided.
letmevoteplease · 2 years ago
Just for fun, here's what happens to Pride and Prejudice:

User: Rewrite and simplify the following text for a modern audience: "It is a truth universally acknowledged, that a single man in possession of a good fortune must be in want of a wife. However little known the feelings or views of such a man may be on his first entering a neighbourhood, this truth is so well fixed in the minds of the surrounding families, that he is considered as the rightful property of some one or other of their daughters."

ChatGPT: "People generally believe that a rich single guy must be looking for a wife. Even if we don't really know what he's thinking when he moves to a new area, everyone assumes he's up for grabs by one of the local girls."

bstpierre · 2 years ago
This is a cool project. I'm learning French and am always on the lookout for sources of books -- and especially audio -- at the right level.

To others questioning the need for rewriting: language changes over the course of a century! It's no problem for me, as a native English speaker, to read English from around the turn of the 20th Century. However, it increases the degree of difficulty for me to try to read French or Spanish from the same time period. Also, to get through an old book I have to learn a bunch of useless words that have fallen into disuse; it's much more effective for me to learn the version of the language that is relevant in today's world.

It looks like the French TTS isn't working properly yet? At least on the couple of titles I tried. The selection of titles you have listed looks great.

e12e · 2 years ago
If you've already determined the text is interesting to the modern reader, why rewrite?

Wouldn't it make more sense to look for texts that are thematically relevant, but with inaccessible language - and rewrite those?

I still shudder to think how this system will handle something like Shakespeare's sonnets...

davidzweig · 2 years ago
It's for students learning English. The top 1000 will be rewritten and be available in addition to the original version, and about another 10,000 available in original only (selected using vocabulary frequency analysis and the chatGPT query).

Midsummer Night's Dream:

THESEUS Now, fair Hippolyta, our nuptial hour Draws on apace; four happy days bring in Another moon: but, O, methinks, how slow This old moon wanes! she lingers my desires, Like to a step-dame or a dowager Long withering out a young man revenue.

HIPPOLYTA Four days will quickly steep themselves in night; Four nights will quickly dream away the time; And then the moon, like to a silver bow New-bent in heaven, shall behold the night Of our solemnities.

Rewritten:

THESEUS Now, dear Hippolyta, our wedding day is approaching quickly. In just four more happy days, a new moon will arrive. But oh, it feels like this current moon is taking forever to disappear! It's holding back my excitement, like an old, slow stepmother or a widow who keeps a young man's money for far too long.

HIPPOLYTA Don't worry, Theseus. Four days will pass by in no time, and then four nights will also quickly slip away. After that, the new moon will shine brightly in the sky, like a freshly bent silver bow, and it will witness the night of our special celebration.

Hmm. Shakespeare is the extreme case. It does make it more accessible. The style is lost. shrug

floren · 2 years ago
It's amazing that a decade ago I'd have called you an exceptionally demented individual for doing this, but these days there's so much stupid unconscionable shit going on with AI that it hardly stands out.
davidzweig · 2 years ago
floren: In epochs bygone, a mere decade hence, I would have proffered thee the epithet of an egregiously unhinged denizen, were thou to embark upon such an endeavor. However, in these current times, wherein the domain of artificial intelligence doth teem with preposterous and morally indefensible machinations, thy conduct emerges as naught but a faint ember in the vast tapestry of perplexities that now envelop us.

davidzweig: Hark! Methinks 'tis the act of rewriting, not the sifting, that doth wound thy pride most deeply, and affront thee grievously. Set thine eyes upon the instance presented hence, to glean a clearer understanding of our noble mission, one that doth strive to render these splendid volumes more reachable:

Original:

The other person was a man named O’Brien, a member of the Inner Party and holder of some post so important and remote that Winston had only a dim idea of its nature. A momentary hush passed over the group of people round the chairs as they saw the black overalls of an Inner Party member approaching.

O’Brien was a large, burly man with a thick neck and a coarse, humorous, brutal face. In spite of his formidable appearance he had a certain charm of manner. He had a trick of resettling his spectacles on his nose which was curiously disarming—in some indefinable way, curiously civilized. It was a gesture which, if anyone had still thought in such terms, might have recalled an eighteenth-century nobleman offering his snuffbox.

Winston had seen O’Brien perhaps a dozen times in almost as many years. He felt deeply drawn to him, and not solely because he was intrigued by the contrast between O’Brien’s urbane manner and his prize-fighter’s physique. Much more it was because of a secretly held belief—or perhaps not even a belief, merely a hope—that O’Brien’s political orthodoxy was not perfect. Something in his face suggested it irresistibly.

Rewritten:

The other person was a man named O'Brien. He belonged to a group called the Inner Party and had a very important job that Winston didn't fully understand. When people saw O'Brien, they became quiet, because he was a high-ranking member of the Inner Party.

O'Brien was a big, strong man with a thick neck and a rough, funny, and tough-looking face. Despite his intimidating appearance, he had a certain charm in the way he acted. He had a habit of adjusting his glasses on his nose, which was oddly disarming, and in some way, it seemed polite and refined, like an old-fashioned nobleman offering his snuffbox.

Winston had seen O'Brien about a dozen times over the past few years. He felt a strong connection to him, not just because he was curious about the difference between O'Brien's polite behavior and his tough appearance. It was more because Winston secretly hoped that O'Brien wasn't completely loyal to the government's beliefs. There was something about his face that made Winston believe this, even though it was just a feeling.

---> Orwell already has a clear style that I very much admire. The somewhat minor changes here would make for easier parsing by students, without changing the message much. I can't imagine Orwell would have objected.

I think your message is misplaced and pretty rude.

hedora · 2 years ago
If you are looking for short stories, I strongly recommend audio fiction podcasts.

Escape Artists is one of my favorite production houses. The recordings are creative commons licensed, and the authors (and other artists) get paid professional rates:

https://escapeartists.net/

Other sites to check out (all are donation-supported and pay authors):

https://www.drabblecast.org/

https://www.lightspeedmagazine.com/podcasting/

https://www.asimovs.com/more-stuff/podcasts/

To give you an idea of what’s available, “Money in the Bank” by John Kessel and Bruce Sterling will likely sit well with the HN crowd:

https://www.lightspeedmagazine.com/fiction/money-in-the-bank...

I could have picked literally hundreds of other stories; this one wins due to recency bias and the authors’ cyberpunk roots.

jmspring · 2 years ago
For horror short stories - I'm a fan of the "Horrorbabble" podcast - https://www.google.com/search?client=firefox-b-1-d&q=horrorb...