Readit News
ilamont · 7 years ago
But, he says, instead of being limited by how quickly we can process information by listening, we’re likely limited by how quickly we can gather our thoughts. That’s because, he says, the average person can listen to audio recordings sped up to about 120%—and still have no problems with comprehension.

Some years ago I worked on an accessibility project for an app and website designed for people with disabilities. One of the team members had low vision, and used a screen reader that must have been set to 3x or even higher. I usually listen to YouTube and podcasts at 1.5-2x and I could barely understand the audio. He seemed surprised, which indicated to me that 3x+ was the norm for people in his circle.

I wonder if his ability was trained through years of using fast screen readers, vs. a lower visual processing load leads to better audio processing, or some other explanation.

ahicks · 7 years ago
I'm the blind dev who refactored a huge chunk of the Rust compiler [0]. I'm at roughly 800 words a minute with a synth, with the proven ability to top out at 1219. 800 or so is the norm among programmers. In order to get it we normally end up using older synths which sound way less natural because modern synthesis techniques can't go that fast. There's a trade-off between natural sounding and 500+ words a minute, and the market now strongly prefers the former because hardware can now support e.g. concatenative synthesis.

1219 is a record as far as I know. We measured it explicitly by getting the screen reader to read a passage and dividing. I spent months working up from 800 to do it and lost the skill once I stopped (there was a marked level of decreased comprehension post 1000, but I was able to program there; still, in the end, not worth it). When I try to describe the required mental state it comes out very much like I'm on drugs. Most of us who reach 800 or so stay there, though not always that fast for e.g. pleasure reading (I do novels at about 400). It's built up slowly over time, either more or less explicitly. I did it because I was in high school doing muds and got tired of not being able to keep up; it took about 6-8 months of committing to turn the synth faster once a week no matter what, keeping it there and dealing with a day or two of mild headaches. Note that for most blind people these days, total synthesis time per day is around 10+ hours; this stuff replaces the pencil, the novel, etc. Others just seem to naturally do it. You have little choice, it's effectively a 1 dimensional interface, so from time to time you find a reason to bump the knob. And that's enough.

Whether and how much the skill transfers to normal human speech, or even between synths, is person-specific. I can't do Youtube at much beyond 2x. Others can. It's definitely a learned skill.

0: https://ahicks.io/posts/April%202017/rust-struct-field-reord...

ahicks · 7 years ago
And as a followup to that--because really this is the weird part--some circles of blind people (including mine) talk faster between ourselves. That's not common, but it happens. I still sometimes have to remember that other people can't digest technical content at the rate I say it and remember to slow down. A good way to bring it out is to have me try to explain a technical concept that I understand really well. I have the common problem in that situation of not being able to talk as fast as I think, but I also seem to have the ability to assemble words faster in a sort of tokenize/send to vocal cords sense once I know what I want to say.

To me, the fact that this does in fact seem to be bidirectional at least some is more interesting than that I can listen fast.

tomp · 7 years ago
Has anyone tried overlapping words instead of speeding them up? Like so:

  How
   are
    you
     doing?
I often wondered if this, or at least sped up speech, should be the default robotic interface... it would make sense to optimize for efficiency/speed (while maintaining legibility) if we can do so.

taneq · 7 years ago
Wow, that's incredible. Do you find it frustrating talking to actual humans now? I'd imagine it feels like they're speaking in slow motion.

Edit: Hah, just saw your post on talking faster to other people who have the same audio skills.

josephpmay · 7 years ago
I can't find any recordings of 800 WPM synths. Would it be possible for you to make one? I'm curious what it sounds like.
aepiepaey · 7 years ago
> Whether and how much the skill transfers to normal human speech, or even between synths, is person-specific. I can't do Youtube at much beyond 2x. Others can. It's definitely a learned skill.

I find that the maximum understandable rate varies a lot between speakers. For some speakers 2.5x is possible, but just 1.5x for others.

One advantage synths have is that they can more easily control the speed at which words are spoken and the pauses between words independently. When watching or listening to pre-recorded content I often find that I'd want to speed up the pauses more than the words (because speeding up everything until the pauses are sufficiently short makes the words unintelligible).

If someone knows of a program or algorithm that can play back audio/video using different rates for speech and silence, please share.
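
One way to approach this, as a rough sketch rather than a reference to any existing tool: classify short frames as speech or silence by their energy, then give each run its own playback rate. The function names, the 20 ms frame size, the 0.02 threshold and the 1.5x/4x rates below are all assumptions chosen for illustration, and the sketch only plans the rates; actually rendering them without pitch shift would need a time-stretching step that is not shown.

```python
import math
from itertools import groupby


def frame_rms(samples, frame_len):
    """Return the RMS energy of each consecutive frame of `frame_len` samples."""
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    return [math.sqrt(sum(x * x for x in f) / len(f)) for f in frames]


def plan_rates(samples, sample_rate, frame_ms=20, threshold=0.02,
               speech_rate=1.5, silence_rate=4.0):
    """Split audio into speech/silence runs and assign each run a playback rate.

    Returns a list of (start_sec, end_sec, rate) tuples. All defaults are
    illustrative assumptions, not tuned values.
    """
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    energies = frame_rms(samples, frame_len)
    segments = []
    index = 0
    # Group adjacent frames that fall on the same side of the energy threshold.
    for is_speech, run in groupby(energies, key=lambda e: e >= threshold):
        count = len(list(run))
        start = index * frame_len / sample_rate
        end = min((index + count) * frame_len, len(samples)) / sample_rate
        segments.append((start, end, speech_rate if is_speech else silence_rate))
        index += count
    return segments


def output_duration(segments):
    """Total playback time once every segment is played at its own rate."""
    return sum((end - start) / rate for start, end, rate in segments)


# Synthetic demo signal: 1 s tone, 1 s silence, 1 s tone, sampled at 8 kHz.
sr = 8000
tone = [0.5 * math.sin(2 * math.pi * 220 * n / sr) for n in range(sr)]
audio = tone + [0.0] * sr + tone
segments = plan_rates(audio, sr)
print(segments)
print(round(output_duration(segments), 3))
```

A fixed energy threshold is the simplest possible detector; real recordings with background noise would probably need an adaptive threshold or a proper voice activity detector.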

tripzilch · 7 years ago
Are old speech synths not harsh on the ears to listen for longer periods? Or maybe I'm just familiar with the super robotic ones (I like them for music production).

If so, have you considered using an EQ plugin to maybe turn down the harsher high frequencies a few notches? Just a thought.

knzhou · 7 years ago
I've known a lot of people that push podcasts, videos, and audiobooks to extreme speed. I knew a guy who'd turn video speed up to 8x so he could binge watch a season of generic anime in an hour flat. I knew a girl who'd get through paperback romance novels by scanning each page diagonally, in 10 seconds each. And here in this thread we have a lot of people bragging along the same lines.

I just don't get the point. If you can process content much faster than it was meant to be played, it doesn't mean you're learning much faster than you could, it means the novel information density is low. Any content that can be sped up that much without loss is not worth listening to in the first place. You're just skipping the trite cliches, filler, and obvious facts.

I can read fast, and I typically go through fluffy NYT bestseller nonfiction at 600 WPM. But when I do this I constantly have a sneaking suspicion that I'm just wasting my time. When I read a good book full of new ideas, I barely go at 150 WPM, but the time always feels well-spent.

dredmorbius · 7 years ago
Exceedingly slow narration, particularly what's normal for audiobooks, is annoying to me because it's slower than I process words. It's like walking with someone whose pace is far slower than your natural gait -- it takes more energy and concentration to slow down. It's why slow-talkers are so annoying.

This isn't "how fast can I go through this" but "what is a comfortable pace"?

So I bump the speed up, though usually fairly modestly: 1.25x - 1.5x is generally enough.

I've noticed that preferred speeds vary tremendously with the quality of the work and speaker -- high-density information and an exceedingly good speaker, and I'll slow down. Slapdash redundant content and poor speaker, I'll speed up.

The degree of polish in the production matters tremendously. I've listened to CGP Grey's YouTube videos (highly polished) and podcasts (a lot of chit-chat with his co-host). The videos work well at normal speed, or perhaps slightly sped up. The podcasts I find nearly unlistenable, though they improve at much higher speeds (1.75x - 2x).

dfhdshfff · 7 years ago
I used to watch a lot of lectures at a high speed. But I've come to realize that the faster I watch something, the faster I forget it.

It is like the information doesn't have the time to settle in my memory, despite me understanding it.

It's maybe because when things are slow, I can use the dead time to think about the implication/corner cases of what's being said.

jacobolus · 7 years ago
People tend to enjoy genre fiction because it is full of predictable filler. Not all entertainment has to be surprising or life changing.

Just spending time in the moment with an enjoyable story is not wasting it.

the_duke · 7 years ago
After doing speed reading exercises a few years ago I initially also started to use the techniques for novels, but that was a bad mistake in my opinion.

While I could get my comprehension percentage quite high with a bit of training, I lost all connection to the characters and story, stopped imagining the scenes and felt like reading the book was a waste of time.

Novels should be read at a natural pace to give room to your imagination and dive into the story. You can still quickly scan over boring/repetitive filler text, but I did that without caring about WPM already.

With other things like textbooks / articles / reports cranking up your WPM and applying your attention more selectively by focusing on or re-reading critical parts is a very helpful skill though.

zik · 7 years ago
Maybe "speed readers" still receive knowledge at around 39bps - they just filter out a lot more.
liability · 7 years ago
> If you can process content much faster than it was meant to be played, it doesn't mean you're learning much faster than you could, it means the novel information density is low.

I 'read' your comment using TTS at 3x. What does that say about the information density of your comment?

(Little to nothing. TTS at that speed is still marginally slower than I normally read with my eyes. Human speech is generally much slower than is necessary to be understood.)

mLuby · 7 years ago
I imagine it's a compromise between cutting through the fluff and using the primary source material.

You could just read the plot synopsis or watch the highlights, but sometimes those don't convey build-up, suspense, or other data that are hard to losslessly compress.

Being comfortable with the "boilerplate" of a given medium or genre usually lets you skim or skip it to jump right into the good stuff.

markussss · 7 years ago
I listen to a lot of podcasts and audiobooks while doing other things; walking, cleaning, cooking, traveling, playing games, etc. Every time I try speeding up, even just to 1.25x, I don't enjoy it as much, as it feels rushed and stressful. I think it could be interesting to learn to listen and read at extremely high speeds, but nothing more than interesting, and I'm even doubting the usefulness of it.
js2 · 7 years ago
As I used to tell my boss: a man page read once thoroughly is worth more than ten-times skimming it quickly.
jlebar · 7 years ago
> Any content that can be sped up [by 8x] without loss is not worth listening to in the first place.

"That's just, like, your opinion, man."

Obviously the people who do it are getting something out of it, otherwise they wouldn't do it?

"Don't yuck my yum."

saagarjha · 7 years ago
> I can read fast, and I typically go through fluffy NYT bestseller nonfiction at 600 WPM. But when I do this I constantly have a sneaking suspicion that I'm just wasting my time. When I read a good book full of new ideas, I barely go at 150 WPM, but the time always feels well-spent.

I do the same, but with Hacker News comments :)

40acres · 7 years ago
In my experience the best books routinely stop me dead in my tracks. I just started Invisible Man and every paragraph is littered with really deep themes, I can normally finish a 500 page book in a few days (a majority of my reading is done during my commute) but this will definitely be a slow burn.
alperakgun · 7 years ago
I think it depends on the speaker. I watch most videos at 1.5x speed, and some slow speakers with gaps at 2x.

floatingatoll · 7 years ago
Fluff has value. Fresh cotton candy should be eaten rapidly. It’s fluff. It’s allowed :)
yowlingcat · 7 years ago
For me, when I'm reading something dense, my WPM fluctuates. It could be 300WPM in the easy areas, and down to 30WPM in the conceptually challenging areas.
jimmaswell · 7 years ago
I can watch videos like Linus Tech Tips up to 2x speed and get just as much out of it as otherwise.
emilga · 7 years ago
Here’s a blind programmer using Visual Studio with a ridiculously fast TTS: https://youtu.be/94swlF55tVc
saagarjha · 7 years ago
I'm not blind, but I've tested some of my apps for VoiceOver and it's just utterly unusable with a "reasonable" speed. You have to pretty much set it to your reading speed for it to be useful, and that happens to be significantly faster than most people are comfortable speaking.
daenz · 7 years ago
This makes me emotional.
ilamont · 7 years ago
Yes, it was like this video (scroll ahead to 1 minute mark).
netcraft · 7 years ago
Thank you for sharing that, was super interesting.
vlovich123 · 7 years ago
Yes. Took me a while but I can comfortably understand 2x speed and now 1x podcasts seem weird like they're talking super slow. I would imagine it's something you just train even more out of necessity.
liability · 7 years ago
I typically use TTS near 2.5x (I turn it up when I'm alert and down when I'm tired.) It's definitely a learned skill; a few years ago I started at 1x and struggled even with that.

Every couple of months, take a moment to reflect on your comprehension. Is it currently easy for you to understand the audio? If yes, then crank it up a little bit until it's noticeably more difficult. Repeat this process periodically over a year or so and before you know it, it'll be set pretty damn quick.

dual_basis · 7 years ago
I can watch YouTube content at 3x without issue. I did this without much intentional effort - I simply downloaded an extension which allows me to speed up the video in increments of 0.1x using a keyboard shortcut. Whenever it felt slow I would speed it up, and whenever it felt too fast I would slow it down. Without paying much attention to the actual numbers I had reached over 3x within a month or so.
im3w1l · 7 years ago
I tried this right now with the trick below. 2x was no issue but 3x... that was a big step. It sounded like word salad. As if my brain was decoding the words out of order and was unable to assemble them into sentences.
gonehome · 7 years ago
Is there a trick to get it above 2x (which is the cap in the UI)? It'd be nice when watching videos of Congress.
colechristensen · 7 years ago
It depends on the speaker. Some speakers are particularly slow with lots of long pauses and others are faster.

I think anyone can get to 3x but it takes some time to adjust to faster and faster speeds. It also depends on what you are doing while listening. Distractions or listening while doing something else (driving for example) lowers my ability to comprehend. For example on the interstate without much traffic I'll listen to audiobooks at 3x, but in a city or a crowded highway I have to slow it down.

Swizec · 7 years ago
Overcast has a great feature where it cuts out pauses and otherwise leaves the speed intact. It easily speeds up podcasts by 30% and more
tfha · 7 years ago
If you close your eyes listening comprehension goes way up. I top out around 2x if I have to look but with my eyes closed I can get full comprehension at 3x+

If it's a technical talk or something I'll still pause often to reflect on what was said, but I can hear full sentences just fine at >3x with my eyes closed.

devnulloverflow · 7 years ago
Well I'm sure the ability requires training, but I wonder if it is specific to screen readers.

Consider what you quote: > we’re likely limited by how quickly we can gather our thoughts

Now the amount of relevant info on a screen is typically small enough that a sighted person can zero in on it at a glance and perhaps just click a button without thinking.

I.e. the amount of info that deserves "gathering our thoughts" is typically very small. So if that is the bottle-neck, your colleague can keep cranking up the audio speed until low-level audio processing becomes the bottle-neck, which is a regime that sighted people never deal with, not even the nerds who speed up their Joe Rogan podcasts.

kccqzy · 7 years ago
It's easy to train yourself to do that though. Just find your favorite audiobook and listen to it daily. First listen to it at 1.5x, then adjust to 2x after a few days, then 2.5x after a few more days etc. You'd be surprised how fast your brain can actually process the information.

Personally, when I did this I felt irritated when I spoke, because my sped-up audiobooks had conditioned me into thinking I should be speaking at that rate, but it's just not possible for my mouth and tongue to move that fast physically.

dexterdog · 7 years ago
I recommend 0.2 step increases, as 0.5 is too much. I also recommend silence skipping if your player supports it.
DoctorOetker · 7 years ago
>But, he says, instead of being limited by how quickly we can process information by listening, we’re likely limited by how quickly we can gather our thoughts. That’s because, he says, the average person can listen to audio recordings sped up to about 120%—and still have no problems with comprehension.

The quoted deduction does not follow: speeding up audio recordings to 120% pushes the auditory system as well as the language and thought systems (or any other potential bottleneck) to speed up proportionally, since it's a pipeline.

Similarly the posted article (I have yet to read the original one) states in the title that "human speech" has a universal transmission rate, but the research tested reading not speech, so this may or may not be true.

Perhaps the bottleneck is human speech, with the side effect that listening is never trained beyond the typical speech rate limit. (in this case the higher speed syllable languages would be easier to pronounce fast, and the lower speed ones harder to pronounce fast)

Perhaps the bottleneck was in the visual burden of reading, a language that encodes more bits per syllable implies more types of syllables, which irrespective of size or number of characters puts a classification demand on the visual system (classifying a symbol coming from a set of only 2 symbols will be easier, but will require more classification instances than classifying from a large set of characters but with fewer classification instances).

Perhaps the bottleneck was again in speech during reading by subconscious vocalizing of the text.

Perhaps the bottleneck was in the auditory "speech to syllable" classification.

Perhaps the bottleneck was in parsing text.

Perhaps the bottleneck was in "accessing thoughts" etc.

So it is rather hard to identify where the bottleneck is located without having a means of detecting where in the brain the "incoming queue is full" vs "incoming queue is waiting" during speaking, listening, reading. And which of these 3 causes this universal bottleneck (since I gave 2 examples of how an apparent bottleneck in reading could stem from not being trained beyond a possible universal bottleneck in speaking rate...)
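
The pipeline argument can be made concrete with a toy model. This is only a sketch: the stage names and capacity numbers are hypothetical, in arbitrary units, and are not measurements of anything. It shows two pipelines with the bottleneck in different stages that respond identically to sped-up input, so a listening test alone cannot say which stage is the limit.

```python
def pipeline_throughput(stage_capacities):
    """A serial pipeline can go no faster than its slowest stage."""
    return min(stage_capacities.values())


def keeps_up(stage_capacities, normal_rate, speedup):
    """True if every stage can handle input at `speedup` times the normal rate."""
    return normal_rate * speedup <= pipeline_throughput(stage_capacities)


# Hypothetical capacities in arbitrary units per second (illustration only).
hearing_limited = {"auditory": 50, "parsing": 80, "thought": 90}
thought_limited = {"auditory": 90, "parsing": 80, "thought": 50}

normal_rate = 39  # the rate reported in the article, used here as the baseline
for name, stages in [("hearing-limited", hearing_limited),
                     ("thought-limited", thought_limited)]:
    print(name,
          keeps_up(stages, normal_rate, 1.2),   # 120% playback
          keeps_up(stages, normal_rate, 1.5))   # 150% playback
```

Both pipelines cope at 1.2x and fail at 1.5x, even though the slow stage is a different one in each.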

Iv · 7 years ago
That quote seems to imply that they have not measured the maximum speed to receive information but the speed at which we are comfortable outputting it.

There is no shortage of people training to receive a lot of information at once, and 39 bits per second seems to me on the lower end of what some video games require but in terms of constructed, linguistic output? They may be on to something there.

Fast chatters are not faster thinkers. I have yet to see people exchanging thought at a higher rate than usual.

soulofmischief · 7 years ago
> I wonder if his ability was trained through years of using fast screen readers, vs. a lower visual processing load leads to better audio processing, or some other explanation.

While I'm sure his visual cortex picked up some slack, I'm willing to bet it's mostly just through training. We just aren't trained for faster communication. I've known blind people and they are the same way with their readers.

ssalka · 7 years ago
For me I imagine the bottleneck would rather be how quickly I can translate my thoughts into speech. Oftentimes I will start out talking to myself to explain a topic, only to eventually digress into my "mental monologue" because I start to process thoughts faster than I can say them.
thrwayxyz · 7 years ago
I'm fully sighted but I use espeak tts at around 1000wpm for fluff text like Economist articles and 300wpm for heavy going text like the text sections of math books.

I also watch most video at 2-3 times the speed since the skills seem transferable.

sabujp · 7 years ago
what flags do you use for espeak, just -s 1000 seems incomprehensible to me
samstave · 7 years ago
I listen to youtube videos at 200%

What do i win???

mlang23 · 7 years ago
I am blind, but I am not a primary speech synthesis user, I prefer tactile braille. However, I know a number of people who are using their speech synthesizers at rates similar to what you described above.

My theory/experience with this phenomenon is that a speech synthesizer never makes any errors. When it pronounces a word, it will do so exactly the same way every time the same word comes up. So the learning effect after a while is a bit higher than when you listen to a human. Humans will always have slight variation in how they pronounce the same word. So, as I understand it, you can "learn" to listen to your speech synthesizer at a fast rate more effectively than you would be able to listen to a fast human speaker.

And yes, I also listen to YouTube talks and audiobooks at about 1.5-2x rate. So I guess 80 bits per second is relatively easily doable for the receiver.
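
The arithmetic behind that estimate, as a small sketch. It assumes the 39 bits per second average from the article as the baseline and assumes that the information rate scales linearly with playback speed while comprehension stays intact; `effective_rate` is a name made up for this illustration.

```python
BASELINE_BPS = 39  # average rate reported in the article under discussion


def effective_rate(speed, baseline=BASELINE_BPS):
    """Bits per second received, assuming the rate scales linearly with speed."""
    return baseline * speed


for speed in (1.5, 2.0):
    print(f"{speed}x -> {effective_rate(speed):.1f} bits/s")
```

At 2x this gives 78 bits per second, which is where the round figure of 80 comes from.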

microcolonel · 7 years ago
Depends on the natural speed of the speaker, but I listen to most podcasts and YouTube commentary/narrative at 2x. Podcasts sometimes in the 3x range.

Sometimes it's worth slowing down to 1.5x to give myself a bit of time to process the ideas, though slowing below that sometimes hurts comprehension.

Side note: I find that YouTube in Chrome has the best pitch-preserving time stretching filter, and I've neglected all this time to figure out what exactly they use to accomplish that. I'd love to add that to mpv, if it's not already there.

not_a_cop75 · 7 years ago
Probably this is a healthy reminder of how the brain optimizes and uses sections of itself. Without the need for vision, those cranial areas can be better used for other things.
lenepp · 7 years ago
Apologies for being harsh, but this kind of thing is the phrenology of our time. I know it's utterly conventional to think this way about language in some circles that present themselves as doing legitimate science, but the view that you can calculate the amount of information in human speech, except in a super-technical sense that doesn't match any of the reporting on this study or the way people are interpreting it, has to be called out for the total nonsense that it is. It doesn't bear a moment's honest reflection.

And yes, I know information theory. It's language that these folks - many of them prominent and celebrated within their utterly normalized professions, just like in the days of phrenology - are fundamentally mistaken about. What quantity of information do you think there is in the word "trump," for instance? Is it the same over time, to bring up just one feature of how this funny thing called context informs human speech?

Wittgenstein's Philosophical Investigations is a good place to start if anyone's interested in understanding this issue.

bagacrap · 7 years ago
They aren't talking about the semantic information of the word "trump". They explain the methodology for calculating information, and it's per syllable (based on the number of distinct syllables that are part of the language's phonetics). So, for English speakers, 'trump' has exactly 7 bits in it. That exact syllable may or may not exist in another language, but if so the same singly syllabic word "trump" would have a different number of bits to a speaker of that language. Maybe next time RTA?
cortesoft · 7 years ago
In other words, they aren't factoring in compression.
EpicEng · 7 years ago
>Maybe next time RTA?

I think it's you that has missed the point. Syllables have a very loose correlation to information. So great; we can stream out 39bits worth of syllables / second. In what way does that describe how information dense those syllables are? Context matters here.

bagacrap · 7 years ago
s/exactly 7/a little more than 7/
subroutine · 7 years ago
You're saying phonology is the new phrenology?

Jokes aside, I agree that estimating the average absolute information content of a syllable seems pretty absurd.

However, if the primary goal here was to determine whether some languages convey more information per unit time than other languages, I think the authors did fine. To this end, they needn't define information per syllable in anything other than p.d.u. - procedurally defined units. If average Vietnamese speech has 2x the number of syllables/min as German, but it takes the same amount of time to recite War and Peace in both Vietnamese and German, it suggests that both languages convey the same high-level information 'per unit time', but not 'per syllable'.

And basically that's all they did... "We computed the ratio between the number of syllables [in the text passage] and the duration [it took to recite the passage]"
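
A minimal sketch of that comparison. The two languages, the syllable counts, the durations and the bits-per-syllable values are hypothetical numbers picked so the arithmetic is easy to follow; they are not figures from the study.

```python
def syllable_rate(num_syllables, duration_sec):
    """Syllables per second: the ratio quoted above."""
    return num_syllables / duration_sec


def info_rate(num_syllables, duration_sec, bits_per_syllable):
    """Information per second = syllables per second * information per syllable."""
    return syllable_rate(num_syllables, duration_sec) * bits_per_syllable


# Hypothetical recitations of the same passage (illustration only).
# Language A: many syllables, each carrying little information.
# Language B: half as many syllables, each carrying twice as much.
rate_a = info_rate(num_syllables=1200, duration_sec=200, bits_per_syllable=6.5)
rate_b = info_rate(num_syllables=600, duration_sec=200, bits_per_syllable=13.0)

print(syllable_rate(1200, 200), syllable_rate(600, 200))
print(rate_a, rate_b)
```

The syllable rates differ by a factor of two, yet both recitations take the same time, so the information per unit time comes out the same while the information per syllable does not.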

canjobear · 7 years ago
What do you see as the contradiction between Wittgenstein and information theory?
anoncake · 7 years ago
> And yes, I know information theory.

You clearly don't know linguistics though because the idea that a word conveys a constant quantity of information is hilarious.

codeulike · 7 years ago
Early on when Information Theory was emerging, there were attempts to measure the bandwidth of consciousness. They reckoned about 18 bits per second or less, which sounds very low.

Tor Norretranders' book, The User Illusion, mentions some of the research:

W R Garner and Harold W Lake "The Amount of Information in Absolute Judgements" - Psychological Review 58 (1951) - they attempted to measure people's ability to distinguish stimuli (such as light and sound) in bits. Result: 2.2 to 3.2 bits per second.

W E Hick "On the Rate of Gain of Information" - Quarterly Journal of Experimental Psychology 4 (1952) - this experiment measured how much information a person could pass on if they acted as a link in a communication channel. That is, faced with a series of flashing lights, subjects had to press the right keys. Result: 5.5 bits per second.

Henry Quastler "Studies of Human Channel Capacity" - Information Theory, Proceedings of the Third London Symposium (1956). Measured how many bits of information are expressed by a pianist while pressing keys on a piano. Result: 25 bits per second.

J R Pierce "Symbols, Signals and Noise" (Harper 1961) - used experiments involving letters and symbols. Result: 44 bits per second.

Discussion of the research, Tor Norretranders' book, and what the research may have missed here:

http://memebake.blogspot.com/2008/08/straw-dogs-and-bandwidt...

Mathnerd314 · 7 years ago
> instead of being limited by how quickly we can process information by listening, we’re likely limited by how quickly we can gather our thoughts. That’s because, he says, the average person can listen to audio recordings sped up to about 120%—and still have no problems with comprehension. “It really seems that the bottleneck is in putting the ideas together.”

Glad this paragraph was in the article, clears up their methodology. I wonder if it applies to writing too, or if skilled writers work faster.

bonoboTP · 7 years ago
Okay, but the experimental subjects didn't put any thoughts together but read out some text aloud...
thunderrabbit · 7 years ago
The same text, at that. So the text has N bits of information and it was, according to the article, spoken at different speeds per language. So N bits at different speeds per language, exactly the opposite of their claim.
Aperocky · 7 years ago
Really depends on language, if you're writing java, you'll be putting out a lot more than that due to how stupidly verbose it is.
jefftk · 7 years ago
> if you're writing java, you'll be putting out a lot more than that due to how stupidly verbose it is

Being "verbose" means that each letter you type communicates fewer bits of information. If the bottleneck is putting ideas together then you would expect someone writing in a more verbose language to type more letters per minute but still take a similar amount of time to communicate the idea.

In practice most Java programmers are using IDEs with good auto-completion, though, so aren't actually needing to type as many letters as you'd think.

airstrike · 7 years ago
So you're saying you're not a fan of RequestProcessorFactoryFactory.StatelessProcessorFactoryFactory?

http://ws.apache.org/xmlrpc/apidocs/org/apache/xmlrpc/server...

jolmg · 7 years ago
I also think that in writing the bottleneck is in how fast and accurately your hand can move. I would agree that English is faster than Spanish because Spanish is more verbose.
n1231231231234 · 7 years ago
This is really cool. I am working in a related area and I think most of us have assumed that on average, the information rate is 'about the same' for the languages across the world. So it's exciting to see that their results confirm this assumption.

Two qualifying remarks.

1) The 'about the same' is important. Even in their data, there is still quite some variance. They found an average of 39bits, with a stdev of 5. That means that about 1/3 of the data falls outside of the range of 34-44bits.

2) Which brings me to the uniform information density (UID) hypothesis. According to the UID, the language signal should be pretty smooth wrt how information is spread across it. For many years, the UID was thought to be pretty absolute: Even across a unit like a sentence, it was thought that information will spread pretty evenly. Now, there is an increasing amount of research that shows that esp. in spontaneous spoken language, there is a lot more variance within the signal, with considerable peaks and troughs spread across longer sequences.
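
The "about 1/3" figure in remark 1 can be checked quickly, under the assumption (mine, for illustration) that the per-language rates are roughly normally distributed with mean 39 and standard deviation 5:

```python
from statistics import NormalDist

# Assumed normal distribution with the mean and stdev quoted in remark 1.
dist = NormalDist(mu=39, sigma=5)

inside = dist.cdf(44) - dist.cdf(34)   # share within one stdev of the mean
outside = 1 - inside                   # share outside the 34-44 range

print(f"inside: {inside:.3f}, outside: {outside:.3f}")
```

Under that assumption a little under a third of the values (roughly 32%) fall outside one standard deviation, which matches the rule of thumb used above.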

godelski · 7 years ago
Why did everyone assume it would be the same on average? This seems weird to me.

Also, can you explain more about how the information density was calculated? Anything at the bit level seems crazy small to me. Words convey a lot of information. They cause your brain to create images, sounds, emotions, smells, etc. I guess we're calling language a compression of that? But even still, bits seems small.

n1231231231234 · 7 years ago
> Why did everyone assume it would be the same on average? This seems weird to me.

(see edit below; but i leave this up; it might be interesting, also) you mean that even for smaller sequences, the UID holds, right? the assumption was that even for a single sentence, there are a lot of ways to reduce or increase information density so that you get a smoother signal. e.g.: "It is clear that we have to help them to move on.", you could contract it to "it's clear we gotta help them move on" and contract it even further in the actual speech signal ('help'em'). or you could stretch it: "it is clear to us that we definitely have to help them in some way to move on", or alike. the assumption was that such increases / decreases would even be done to 'iron out' the very local peaks and troughs, particularly in speech.

bits: yeah, that took me a while to get used to, as well. the authors used (conditional) entropy as a way to measure information density (which is a good measure in this instance imv). and bits is just per definition the unit that comes out of information theoretical entropy: https://en.wikipedia.org/wiki/Entropy_(information_theory) . btw: while technically possible, i don't think that the comparison in the summary article between 39 bits in language and a xy bit modem is a helpful comparison. bits in the context of entropy are all about occurrence and expectation in a given context. bits of a modem/in CS, they represent a low level information content for which we do not check context and expectation.
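
to make the unit concrete, here is a small sketch of plain (unconditional) shannon entropy over a list of syllables. note the simplification: the authors used conditional entropy, which also takes the preceding context into account, and the toy syllables below are made up for the illustration.

```python
import math
from collections import Counter


def entropy_bits(symbols):
    """Shannon entropy in bits per symbol, from observed relative frequencies."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Made-up toy data: four syllables used equally often vs. one dominant syllable.
uniform = ["ba", "di", "ku", "mo"] * 4
skewed = ["ba"] * 13 + ["di", "ku", "mo"]

print(entropy_bits(uniform))            # equals log2(4) for 4 equally likely symbols
print(round(entropy_bits(skewed), 3))   # lower: a predictable symbol carries less
```

with N equally likely syllables each one carries log2(N) bits; as soon as some syllables are more expected than others, the average drops, which is the "occurrence and expectation" part.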

edit: ah, i realise you are asking why most in our community assumed that this universal rate applied across languages, right?

i guess the intuition was that all of us humans, no matter what language we speak, use the speech signal to transmit and receive information and that all of us have the same cognitive abilities. so the rate at which we convey information should be about the same. sure, there are probably differences according to some factors (spoken vs written language, differences in knowledge between speakers, etc.). but when the only factor that differs is English vs Hausa, esp. in spontaneous spoken language, then the information rate should be about the same.

anonytrary · 7 years ago
I think there is a distinction between "flux of incoming information" and "net knowledge gained by human as a result of incoming information".
cwackerfuss · 7 years ago
After a few cocktails, once or twice, I've wondered with friends whether some "fuzzy" information rate constant might be a reference by which our brain understands the passage of time. In other words: if there is a fundamental processing rate of x/time, then theoretically, wouldn't our brains subconsciously use that for all kinds of neat reasons?

And the rate wouldn't have to be the exact same value for each individual, so long as the brain can attune its specific value to other reference points to time in nature.

SXX · 7 years ago
So here is my own experience. I've been an avid audiobook fan for the last 3 years, and a while ago some guy on reddit told me how he listens to books on Audible using a high-speed option like 2.x. I had never tried that before last summer, since at higher speeds speech became incomprehensible to me.

What this guy told me is that it just takes time to adjust to it. So I basically started listening to books at a slightly higher speed. Then I gradually increased it, and in a few days I could handle 2.0x speed no problem while listening to really complex fantasy (Malazan Book of the Fallen [1]). After two weeks I could handle 2.5x without a problem.

In the beginning it was harder to comprehend at high speed while walking or crossing the street since I lost attention, but in a few months I could do anything while listening without missing any information or any of the narrator's emotions.

To give an example of how far this can go: this spring I was listening to The Expanse audiobook [2] at 4.0x speed. With some effort I could go even faster, to something like 5.x for these particular books, but obviously can not keep up for long.

I still usually listen to books at 2.0-3.0x, depending on the narrator and the quality of the audio, and this skill doesn't go away even if I take an extended break between books, like a month or so.

[1] https://www.audible.com/pd/Reapers-Gale-Audiobook/B00M4LRBY6

[2] https://www.audible.co.uk/pd/Abaddons-Gate-Audiobook/B00T6NZ...

UPD: Edit. s/can keep up/can not keep up/

colechristensen · 7 years ago
One thing I'd also like to develop / wish was integrated into audible and the like is silence trimming. Some speakers leave outsized pauses in their narration, which can be significantly shortened, effectively increasing speed with less distortion.
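As a rough illustration of the idea, here is a minimal sketch of silence trimming on raw audio samples. It assumes mono samples as floats in [-1, 1]; the function name, the amplitude threshold and the maximum pause length are all made up for the example. Real players presumably work on windowed energy and cross-fade the cuts, which this sketch does not attempt.

```python
def trim_silence(samples, sample_rate, threshold=0.02, max_pause_s=0.25):
    """Shorten every run of quiet samples to at most max_pause_s seconds.

    samples: sequence of floats in [-1, 1] (mono audio, assumed).
    threshold: amplitude below which a sample counts as "quiet" (assumed value).
    """
    max_quiet = int(max_pause_s * sample_rate)  # quiet samples kept per pause
    out = []
    quiet_run = 0
    for s in samples:
        if abs(s) < threshold:
            quiet_run += 1
            if quiet_run > max_quiet:
                continue  # drop the excess part of a long pause
        else:
            quiet_run = 0
        out.append(s)
    return out


# Toy input: a 5-sample pause gets cut down to 2 samples, a 1-sample pause is kept.
audio = [0.5, 0.0, 0.0, 0.0, 0.0, 0.0, 0.5, 0.0, 0.5]
print(trim_silence(audio, sample_rate=4, max_pause_s=0.5))
```

Speech itself is left untouched, so the time saved comes only from pauses, which is why it adds less distortion than raising the playback rate.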

I have the opposite problem where I have trouble paying attention to an audiobook at 1x. I get bored in between words and my mind wanders making it very difficult to keep track of what is being said (as in I hear individual words but have trouble keeping sentences in memory when everything comes too slow)

I wish I had realized this in university and had been able to somehow record and playback lectures at 2x. I always got so little out of lectures because the information wasn't coming in fast enough for me to process correctly.

thegranderson · 7 years ago
Overcast (a podcasting app) has great features to optimize the high speed listening experience. They have variable speed, a great silence trimmer, and a voice boost that makes speech clearer.
SXX · 7 years ago
> One thing I'd also like to develop / wish was integrated into audible and the like is silence trimming.

I don't really use Audible, but if you're looking for a good audio player on Android, here is one that can do this:

https://play.google.com/store/apps/details?id=de.ph1b.audiob...

https://github.com/PaulWoitaschek/Voice

bitwize · 7 years ago
Blind people use screenreader software sped up so fast as to be indecipherable to untrained ears. The screenreader can give them near instantaneous feedback about where on the screen they are and what's there when it's so sped up, and with a bit of practice perceiving the sped-up speech imposes no burden at all.
SXX · 7 years ago
I can tell from experience that when I lie down in bed with my eyes closed I can comprehend speech at a much higher speed than I can while walking on the street. No surprise blind people can handle it better, even though I have no clue how exactly it works in relation to the brain.

I was always curious to do actual research / write a paper on this kind of thing, but as a non-scientist I simply have no time to do so. So I'm happy someone is actually doing it.

bitexploder · 7 years ago
Side question: I wonder if anyone has actually finished the entire Malazan series. It takes some serious dedication. I would be curious if the story still makes sense to you by the end when listening at that speed.
SXX · 7 years ago
> Side question: I wonder if anyone has actually finished the entire Malazan series. It takes some serious dedication.

I've only finished Malazan Book of the Fallen, the first two Tales books and all of The Path to Ascendancy books. I also started Forge of Darkness, but was too preoccupied with my life to finish it.

Honestly, Esslemont's books are just weaker overall. The Path to Ascendancy was much better, but the 3rd book is just too rushed.

> I would be curious if the story still makes sense to you by the end when listening at that speed.

Speed has no effect on the story at all. Basically, after you practice it for a bit you even get every emotion the narrator is trying to put into his speech.

As for the story in general, it makes more and more sense the closer you get to the end. It's a masterfully crafted world with a great theme of compassion, and even though I finished it more than a year ago I still have a flashback or two from time to time since I loved some of the characters. Malazan is certainly one of my favorite book series.

Yet keep in mind there is an abundance of information and events, as well as unreliable narrators, which can confuse your view of the story lines.

kovrik · 7 years ago
Not OP, but I am currently on Reaper's Gale.

Malazan quickly became my favourite book series (and I am not even a fan of fantasy). It was hard initially. But it gets better.

However, I think that a re-read is a must if you want to fully grasp the whole thing.

Causality1 · 7 years ago
Why would you want to do that though? Isn't the experience of listening to it the point? If not, why listen to it at all instead of reading a detailed summary?
SXX · 7 years ago
> Why would you want to do that though?

Because I don't listen to books just for the enjoyment of the process itself. I love complex stories with hundreds of characters and plot lines across many books. Reading something like Malazan or Wheel of Time is like a journey into another world for me, and I get deeply immersed in these worlds while exploring them. Yet the amount of free time I have is limited, so getting more information in a short period of time is very convenient.

I totally get it when some people just love to read books slowly while enjoying their coffee or looking at nature, but I'm into books for the stories, and the fast-paced audio format is fine for me.

> Isn't the experience of listening to it the point? If not, why listen to it at all instead of reading a detailed summary?

I feel like you imply that by listening at high speed I miss some part of the experience. Yet other than the voices being just slightly distorted (after some practice it's the same voices, but faster), I get exactly the same experience as any person who listens to or reads the unabridged book.

On the other hand, detailed summaries are not the same thing the author designed, but someone else's retelling, which is usually far from perfect.

tkfu · 7 years ago
I'm a bit confused, here. (I went and looked at the original paper.) They estimated information density for each of the subject languages as a whole, on average:

> In parallel, from independently available written corpora in these languages, we estimated each language’s information density (ID) as the syllable conditional entropy to take word-internal syllable-bigram dependencies into account.

But the experiment uses the same text translated into each language! Why introduce this extra variable (and source of error) of estimated language-wide information density, if you are controlling your experiment such that you have the exact same information encoded in each language? That is to say, why use an _estimated_ information density when you could measure it exactly for the texts that are being spoken? Or, conversely, why go to all the trouble of having the speakers read the same text translated into each language, if you aren't going to make use of that symmetry?

canjobear · 7 years ago
Information depends on probability. If something is very probable then it doesn’t have much information (because you already saw it coming). If something is improbable then it has a lot of information.

In the paper they want to know how much information is in a syllable in context. To do that they need to know the probability of each syllable given the previous syllable. To estimate that probability distribution, you need to look at a lot of text, much more than just the passages that the authors used to measure speech rate.
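To make that concrete, here is a minimal sketch of estimating syllable conditional entropy from bigram counts. It is not the paper's actual pipeline: the function names are made up, the "corpus" is a toy list of syllable strings, and it uses plain maximum-likelihood counts with no smoothing or word-boundary handling.

```python
import math
from collections import Counter


def surprisal(p):
    """Information content, in bits, of an event with probability p."""
    return -math.log2(p)


def conditional_entropy(syllables):
    """Estimate H(next syllable | previous syllable) in bits from bigram counts."""
    pairs = list(zip(syllables, syllables[1:]))
    pair_counts = Counter(pairs)
    prev_counts = Counter(prev for prev, _ in pairs)
    total = len(pairs)
    h = 0.0
    for (prev, nxt), n in pair_counts.items():
        p_pair = n / total                       # P(prev, next)
        p_next_given_prev = n / prev_counts[prev]  # P(next | prev)
        h += p_pair * surprisal(p_next_given_prev)
    return h


# Fully predictable sequence: knowing the previous syllable tells you the next one.
print(conditional_entropy(["ba", "na", "ba", "na", "ba", "na"]))
# Less predictable sequence: after "ba" either "ba" or "na" can follow.
print(conditional_entropy(["ba", "ba", "na", "ba", "ba", "na", "ba"]))
```

With only a handful of syllables the bigram counts are far too sparse to mean anything, which is the reason for estimating from a large corpus rather than from the read passages.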

rocqua · 7 years ago
Good question!

I suppose that the experiment wants to capture the actual 'information density' of the language, and hence looks at the full language. Then, they want to avoid any modification in speech rate due to the semantics of the spoken text.

This does not make sense for a hypothesis where the actual bit-rate of speech tends towards 39 b/s. That is, when your text happens to convey more bits, you slow down.

However, for an alternative hypothesis, this design does make sense. The idea here is that a language naturally converges to a speech-rate that gives 39 b/s. The idea here is that the actual speech-rate is much more constant, and just drops until it becomes too fast. For that, I'd argue you don't want the mean bit-rate but something like the 90th percentile bit-rate. Because it seems to me that speech-rate that is 'too fast' more than 10% of the time would not really be natural.
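A small sketch of the mean vs. 90th percentile distinction, under loudly stated assumptions: the per-window bits-per-syllable values and the constant syllable rate below are invented purely for illustration (they are not from the paper), and the percentile uses the simple nearest-rank definition.

```python
from statistics import mean


def percentile_nearest_rank(values, q):
    """q-th percentile (integer q in 1..100) using the nearest-rank method."""
    ordered = sorted(values)
    rank = (q * len(ordered) + 99) // 100  # integer ceil(q * n / 100)
    return ordered[max(rank, 1) - 1]


# Hypothetical numbers: a constant speech rate and per-window information density.
syllables_per_second = 6.0
bits_per_syllable = [5.0, 6.0, 7.0, 6.0, 5.0, 8.0, 6.0, 7.0, 6.0, 9.0]

# Bit rate per window = speech rate (syllables/s) * density (bits/syllable).
bit_rates = [syllables_per_second * b for b in bits_per_syllable]

print("mean bit rate:", mean(bit_rates))
print("90th percentile bit rate:", percentile_nearest_rank(bit_rates, 90))
```

With a constant speech rate, all the variation in bit rate comes from the text, so the 90th percentile sits well above the mean; that gap is what the choice between the two statistics is about.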

ShinyObject · 7 years ago
The researchers obviously have to keep the scope narrow in order to get numbers at all.

That said, we should be aware that a tech nerd audience will find simple answers to complex non-tech questions appealing, and we should not over-estimate our understanding here just because we have a number.

There is a large amount of data transmitted through sub-communication and context, particularly during an in-person interaction, which is what people are wired for. Overall tone, body language, eye contact, and various social cues make up the bulk of data being transferred in many interactions. There's a reason why talking to some people feels exhausting and others invigorating, and it's not just the transcript.

knzhou · 7 years ago
We can avoid reading too much into the study by just remembering the error bars. It's not like 39 is a universal constant. It's more like 39 with a standard deviation of 6. That's a wide spread, but it's less wide than the spread you get from syllable rate alone, and that's all the study quantitatively tells us.
d-sc · 7 years ago
What are the things that make the difference between invigorating and exhausting?