I'm borrowing your podcast prompts, and one issue I hit is that Bob bounces between being an asker of questions and a co-announcer. I'm currently adding a bullet point to your instructions, which seems to be working so far:
- Only Alice has read the facts beforehand; Bob should never mention anything Alice hasn't said first.
I'm playing with this for popular science articles and am using a two-stage process: one prompt to extract the factual claims from the article, and another to rank them for inclusion in the podcast. I found that just "summarize this" boiled down the wrong things: that there were discoveries, but not what the discoveries actually were, for instance.
Please make a bulleted list of all factual claims in this article, do not summarize or include any opinions or non-objective claims.
and then
Combine similar or duplicate facts. Rate the facts by importance, objectivity, and checkability, and pick the top six for inclusion into a podcast for already semi-informed viewers.
When fed an article about the JWST, it produced these, among others:
The two ancient galaxies were found billions of light-years behind a giant galaxy cluster called Abell 2744.
- Importance: High, the location of the galaxies helps to understand the structure of the universe
- Objectivity: High, this is a factual claim that can be confirmed by the telescope's observations
- Checkability: High, the location of the galaxies can be verified through scientific data and observations.
The two galaxies existed just 350 to 450 million years after the Big Bang.
- Importance: High, this information tells us about the timeline of the universe and when certain galaxies formed
- Objectivity: High, this is a factual claim that is supported by scientific data and research
- Checkability: High, the age of the galaxies can be verified through scientific observations and data.
So far I'm just using the rankings myself to manually pick the facts to discuss but I'm going to prompt it to discuss them itself in context.
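For the manual step, a minimal sketch of how the second stage's ratings could be consumed programmatically (the three-level scale, equal weighting, and the placeholder facts below are my assumptions; the real ranking above is done by the model itself in the prompt):

```python
# Score facts by their "Importance / Objectivity / Checkability" ratings
# and pick the top N. The numeric scale and equal weights are assumptions.
SCALE = {"High": 3, "Medium": 2, "Low": 1}

def score(fact):
    """Combined rating for one fact (sum of the three axes)."""
    return sum(SCALE[fact[k]] for k in ("importance", "objectivity", "checkability"))

def top_facts(facts, n=6):
    """Return the n best facts, highest combined rating first."""
    return sorted(facts, key=score, reverse=True)[:n]

# Hypothetical parsed output from the ranking prompt:
facts = [
    {"claim": "Fact A", "importance": "High", "objectivity": "Medium", "checkability": "High"},
    {"claim": "Fact B", "importance": "High", "objectivity": "High", "checkability": "High"},
]
print(top_facts(facts, n=1)[0]["claim"])  # prints "Fact B"
```

Python's `sorted` is stable, so facts the model rated identically keep their original article order.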
It happens quite often with TorToiSe that it collapses in this way, especially for unseen tokens that wouldn't have appeared in the training data, which likely consisted of a lot of transcribed material and read speech like audiobooks. Trying to make it laugh by prompting it with "hahaha" (which you won't really see in that kind of data) often gets you demon and zombie noises.
That generation uses tortoise-tts. Play.ht has a model called Peregrine; I've taken to using a script to call out to them. Super cool company and API. I just haven't had time to get my next-gen version out.
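The script idea can be sketched roughly like this. Note the endpoint path, header names, payload fields, and voice IDs below are all placeholders for illustration, not play.ht's documented API; check their docs for the real shapes:

```python
# Sketch: turn a two-speaker podcast script into one TTS request per line.
# Everything API-specific here is a hypothetical stand-in.

def build_tts_request(line, voice, api_key):
    """Build one hypothetical TTS request for a line of dialogue."""
    url = "https://api.play.ht/v1/convert"          # hypothetical endpoint
    headers = {"Authorization": f"Bearer {api_key}"}  # hypothetical auth scheme
    payload = {"content": line, "voice": voice}
    return url, headers, payload

def script_to_requests(dialogue, voices, api_key):
    """Map each (speaker, line) turn to a request, preserving order."""
    return [build_tts_request(line, voices[speaker], api_key)
            for speaker, line in dialogue]

dialogue = [("Alice", "Welcome back to the show."),
            ("Bob", "So what did the telescope actually find?")]
voices = {"Alice": "voice-a", "Bob": "voice-b"}  # placeholder voice IDs
reqs = script_to_requests(dialogue, voices, "API_KEY")
```

Keeping one request per dialogue turn makes it easy to stitch the returned audio clips back together in script order.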
Imagine if we actually focused on papers being reproducible, and deleted papers that were not reproducible, or at least flagged research that can't currently be reproduced.
LLMs trained on reproducible research across domains would be super human.
https://github.com/yacineMTB/scribepod/blob/master/lib/proce...
> Make the dialogue about this as long as possible.
https://github.com/yacineMTB/scribepod/blob/master/playht.ts...
Also interesting that play.ht allows you to clone others' voices.