One more vote for terrible audio mixes and one for actors who mumble their dialog.
Whenever I watch movies with my wife, we constantly have to fiddle with the remote control. The loud parts (including explosions and music) are too loud for her, so we have to turn the volume down during those parts, but the dialog is too quiet and mumbly for me to hear, so we turn the volume up during those parts. We now just keep the volume low and turn on subtitles--easy but shitty solution.
Sound engineers need to stop needlessly flexing all that dynamic range that modern tools give them, and just produce a watchable movie. Actors need to go do some stage acting and learn how to e n u n c i a t e and project their voices.
There's an article that's been floating around that notes this phenomenon started after Nolan's movies became the style to emulate, and there are various sound engineers who state they would like to do what you want but they have to produce what the paying entity asks for. Someone should feel free to link it here - I'm mobile at the moment so can't dig it up.
But anyway, I just wanted to jump in and say... all the time on this site we see programmers complaining about management/etc forcing things to happen that you don't want happening. Sound engineers may be in a similar boat here. ;P
I had this issue recently watching Rings of Power, so much so that I have to keep the remote nearby to change every few minutes. Dialogue volume is so low you have to turn it up, then action scenes happen and the house is shaking.
Wait, isn’t it because 5.1 became the norm for recording movies but most people have a 2.0 or a 2.1 setup at most? I thought that was the conventional wisdom.
I'm very much outside of the age range mentioned in the article but +100 for crappy audio mixing.
When my family was going through all the Marvel movies at first I tried to control the volume by making it higher during the dialog and lower during the battles etc but it was an exercise in futility, as I'm only human and there would be some delay in volume change and we would still miss what the guy/gal said or my house would start noticeably shaking from the unexpected explosion happening during a scene where a guy was intensely whispering his rescue plan to his compatriots or whatever. Plus it started feeling like I'm working instead of enjoying the movie. So lower volume and subtitles it is.
I have home theater with 5.2 and 5 speakers, all it does is that when stuff in the movie goes boom my house starts to shake.
> when stuff in the movie goes boom my house starts to shake.
That’s how it’s supposed to sound. Explosions are loud.
This is not a problem with the source material, it’s a problem with the setup of the playback equipment. The source contains all the original data, it’s the playback equipment’s job to render that information in the appropriate way considering the space and circumstances where it’s played.
The ‘problem’ is that the sound is mixed to a cinema-level dynamic range. If you don’t want the full dynamic range you can compress it at playback. They store it at the full range since you can’t exactly un-compress it and the exact amount of compression depends on your setup and situation anyway.
Any half-decent A/V receiver has this capability. If your receiver has Audyssey you may want to look into the “dynamic eq” and “dynamic volume” settings.
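For anyone wondering what "compress it at playback" means in numbers, here is a minimal Python sketch of the static curve of a downward compressor. The threshold, ratio, and the two example levels are made-up values for illustration; they are not settings from Audyssey or any real receiver.

```python
def compress_db(level_db, threshold_db=-30.0, ratio=4.0, makeup_db=0.0):
    """Static curve of a downward compressor, working entirely in dB.

    Levels at or below the threshold pass through unchanged. Above it,
    every `ratio` dB of extra input yields only 1 dB of extra output.
    `makeup_db` is a fixed gain applied afterwards to everything.
    """
    if level_db <= threshold_db:
        out_db = level_db
    else:
        out_db = threshold_db + (level_db - threshold_db) / ratio
    return out_db + makeup_db


# Hypothetical levels for a quiet line of dialog and an explosion.
dialog_db = -35.0
explosion_db = -5.0

gap_before = explosion_db - dialog_db
gap_after = compress_db(explosion_db) - compress_db(dialog_db)
print(f"gap before: {gap_before:.2f} dB, gap after: {gap_after:.2f} dB")
```

With these assumed numbers the explosion ends up much closer to the dialog, and the makeup gain (or just the volume knob) can then bring everything up together. Going the other way is not possible, which is the argument for shipping the full range and compressing at the playback end.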
Apple TV had to introduce a feature that does exactly that for you. Apparently the problem is that systematic. Movie makers ruin the sound to the point that the streaming device has to correct it.
When I had a 5.1 setup, I had the center channel volume set higher than the side/rear channels and it made speech understandable while explosions weren't overpowering. Maybe the receiver you have allows the same?
Things became worse in the past 10 or something years (I am a 40 something btw.).
As the article states, we used subtitles in movies to learn the language and to understand thick accents. However, over time, while we got better and better at comprehension, our use of subtitles stayed about the same overall. But it's very uneven too.
Old movies (not yet seen) and movies for children are never a problem. Also some properly made movies. The atmosphere and sound effects are fine as well as the dialogs.
But it became too much of a trend to make muttery and mushy dialog where people are seemingly unable to articulate. To a level never heard in real life (where muttering may be more common due to the more casual nature of conversations, affected by mood or situation, unlike the carefully created and heavily controlled ones in movies). Also it is uneven. Some movies are still fine, but others (perhaps the newest Batman comes to mind, unsure, I've mostly forgotten that movie, it being so obscure) are a struggle, not fun at all. Aggravated by the combo of loud bangs and muttery dialog.
A sign that this is a systematic and intentionally manufactured problem, that sound engineers and directors lost their touch completely, is the existence of a feature in the newest Apple TV for equalizing the sound level in a movie. Streaming hardware manufacturers had to come up with a solution to correct what sound engineers ruin or are unable to get right nowadays.
Older movies are almost never a problem. New movies are, uncomfortably often.
I use that feature on Apple TV (normalize audio levels) and it's STILL too dramatic for watching unless you want to wake everyone in the house up. I have a 5.1 setup with the center channel even increased (to try and boost talking) and I still have to play the volume game at times.
The sound mixes today are so far off a pleasant experience that the people who make them should feel ashamed - How many actually have a home cinema room with soundproofing and 20 speakers?
Most people watch movies with a headset / a TV / 2.1 / 5.1 surround system and they may even be limited by other factors like neighbours and, you know, real life..
And the voice levels are obnoxiously low and way (negatively) beyond what I would call a good experience.
The slurred dialog is obnoxious and a sign of laziness on everyone's part. The miking, the post processing, the overdubbing (if there's any at all - recent film makers don't seem to know that this is a thing). The actor's speech pattern, where they direct their speech while acting, even how they stand and where. The choice of location and additional preparation of the stage so the speech can be picked up clearly.
It's not your hearing, it's the mix. Source: I'm a mastering engineer with a monitoring setup worth six figures and I use closed captions too.
Speakers in TVs used to be tailored to making speech intelligible. Nowadays it's... well whatever it is, it's not really meant to do anything. Not even the more high-end popular speaker systems are meant to do anything at all, really. They all just kinda sorta make whatever is coming out of them sound flashy and whooshy. That's another reason why movies are even more difficult to understand.
And let's be honest - a lot of it nowadays is just not _worth_ hearing... Staple TV shows from the 70s that fell into obscurity for being extremely pedestrian had better writing than nearly all big-budget movies nowadays. I remember when Pig came out recently, it was the only good movie during a span of maybe 3 months. That's how dire it is nowadays.
> The sound mixes today are so far off a pleasant experience that the people who make them should feel ashamed - How many actually have a home cinema room with soundproofing and 20 speakers?
Also, digital camera manufacturers should feel ashamed for cameras producing photos that don't even fit on my monitor. My 24MP Sony camera outputs photos at 6000x4000 pixel resolution! Who the hell has a monitor that big? I can only see a small part of the photo at one time.
Oh wait, no, that's absolutely crazy. Because maybe some people do have a monitor that high-res, or maybe they want to print it or edit it or.., or... And guess what, there is no need to have the photos be smaller to display them, you can simply use a viewer that scales them to fit on your display. There is absolutely no need to throw away that extra information and you might want it in the future.
Same goes for audio in a movie. If it doesn't fit the dynamic range suitable for your setup, then scale it to whatever you need it to be. You may not have a setup to enjoy that, but others might, and they shouldn't have to deal with sub-par mixes because other people don't know how to configure their playback equipment.
Christopher Nolan is one of the biggest offenders in this regard, often having barely audible dialog drowned out by music or background noise by stylistic choice. I understand the intent is to let it wash over you but I can’t help but struggle to understand and it ends up just making me feel frustrated and taking me out of the film.
I feel the same way, it's strangely mixed. You look at movies that are 30 years old and the music is very quiet or in different bands than speech, very carefully done. You can still hear speech in Jurassic Park or The Matrix. A Marvel Movie? Good luck...
Maybe it's a lost art. I have found that turning off subtitles and forcing myself to get used to their accents does improve my listening comprehension, but it's a much easier choice when you can't otherwise understand it just due to mixing.
I don't know that it helps me focus, if anything I have to focus harder when they're talking so I can hope that their lips will be enough to figure it out. But it does mean that if I'm watching something in a place with roommates I can use a much lower volume and figure out what is going on by combining the text with the low SNR audio I hear.
Listening at a low volume should mean better overall hearing over time too, right? My default volume went from like -20dB to -45dB on the same system over the last decade, I think either I'm being polite to roommates or just more careful with my hearing.
The Apple TV has a setting called "reduce loud sounds" which helps with this problem. I'm sure you can find it on other devices too. My Sony TV also has a sound setting for enhanced dialogue which might help as well.
Can't really help with the actors thing, though I think I'd rather characters that are truer to life than sound like they're acting on stage, personally.
> I think I'd rather characters that are truer to life than sound like they're acting on stage, personally.
You can have both though. A great example is Patrick Stewart as Captain Jean-Luc Picard in Star Trek: The Next Generation. He’s never mumbled a line in his life and you can understand everything he says during quiet dialog. You can be true to life, quiet AND understandable.
> The Apple TV has a setting called "reduce loud sounds"
This does help, but at some point Apple increased the compressor’s release time so that it’s far too quiet even 10-15s after the loud transient. I wish they would make it behave more like a brickwall limiter.
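To make the release-time complaint concrete, here is a rough Python sketch of how a compressor's gain reduction recovers after a loud transient. It is a generic one-pole attack/release smoother with made-up parameters and a made-up test signal; I have no idea what Apple actually does, so treat it purely as an illustration of the knob being discussed.

```python
import math


def gain_reduction_trace(levels_db, threshold_db=-30.0, ratio=4.0,
                         attack_s=0.01, release_s=5.0, rate_hz=100):
    """Smoothed gain reduction in dB (>= 0) for levels sampled at rate_hz.

    The static curve asks for a target reduction; the applied reduction
    chases that target quickly (attack) when it has to increase and
    slowly (release) when it is allowed to fall back.
    """
    attack = math.exp(-1.0 / (attack_s * rate_hz))
    release = math.exp(-1.0 / (release_s * rate_hz))
    reduction = 0.0
    trace = []
    for level in levels_db:
        over = max(0.0, level - threshold_db)
        target = over - over / ratio  # reduction the static curve asks for
        coeff = attack if target > reduction else release
        reduction = coeff * reduction + (1.0 - coeff) * target
        trace.append(reduction)
    return trace


def seconds_until_recovered(trace, start_index, rate_hz=100, within_db=1.0):
    """Time after start_index until the reduction is back within within_db."""
    for i in range(start_index, len(trace)):
        if trace[i] <= within_db:
            return (i - start_index) / rate_hz
    return None


# Hypothetical signal: 1 s of dialog, a 1 s explosion, then 30 s of dialog.
RATE_HZ = 100
levels = [-35.0] * RATE_HZ + [-5.0] * RATE_HZ + [-35.0] * (30 * RATE_HZ)
explosion_end = 2 * RATE_HZ

slow = gain_reduction_trace(levels, release_s=5.0)
fast = gain_reduction_trace(levels, release_s=0.2)
print("slow release:", seconds_until_recovered(slow, explosion_end), "s")
print("fast release:", seconds_until_recovered(fast, explosion_end), "s")
```

With the assumed 5 s release, the dialog after the explosion stays turned down for on the order of 15 seconds, while the 0.2 s release recovers in well under a second. A limiter-style behaviour roughly corresponds to a very high ratio combined with short time constants like the second case.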
That option actually works great for resurrecting the speech in a movie, but makes all the music utterly flat and boring. With this option on you might as well throw out your expensive speaker setup and use the built in TV speaker.
If only there was a hotkey to toggle it on and off, instead of hidden 4 levels down the touch-and-swipe menu.
I have the same issue but instead of using subtitles I've embraced not knowing what's going on with the show. If it was important that I know for my enjoyment then I'm probably not watching the show anymore. Like if what the actor said was unintelligible I'm going to go with that's part of the plot, a puzzle for me to figure out on the show subreddit or something, but I'm not going to deal with the shitty blinding subtitles some shows display on blinding white over pitch OLED black, it's impossible.
That’s how I watch movies in general. Maybe it’s a symptom of ADHD or lack of interest but I find it hard to remember people’s names and plot reveals that require me to remember specific locations or details. I just watch movies and enjoy them “generally”.
Just for reference: if you need Subtitles on an OLED, e.g. on Netflix you can set the subtitles to grey (only in the Web app, but is used on all other apps). Had that same problem with the OLED
While we are teaching actors to enunciate can we do the same with musicians? When I listen to music in German, I can almost completely understand it despite speaking German at a kindergarten level. When I listen to music in English, despite being fluent, it is almost impossible to discern the lyrics they are mumbling.
I suspect that a large part of this problem is Autotune. Singers need no training to hit the right pitches anymore, so they receive no training at all for any of the other skills as well.
AFAIK, there's only one audio mix done with theaters in mind, and studios don't do separate mixes for home viewing. It's down to the distributor to make sure that the average home viewer gets a good experience. That's why it boggles my mind that big players like Netflix don't have any dynamic range compression and surround downmixing options in their settings. Also the way most TVs handle surround downmixing is straight up broken. I watch my movies from mkv files on a PC, so for me it's not a problem because there's tons of options available for that in players like mpv, VLC and MPC-HC, but I do feel bad for your average person that has no business knowing about this stuff. They should just have a simple DRC setting available with maybe 3 levels and a clear explanation of what it does.
If you use MPV, you can pass it ffmpeg filters, and ffmpeg has many ways of compressing the dynamic range. See [1] for the set of audio filters. Here are the settings I use [2]. If you can't run your video through ffmpeg, I guess you could run your audio signal through a physical compressor if you use external speakers, but that can get quite involved.
I always thought it's the same problem as modern web design: razor thin fonts, "stylistically" low contrast ratios, hieroglyphic icons that don't show properly on low resolution screens, slow loading times... because the designers and developers that made the thing are using maxxed-spec iMacs with 5k IPS color-accurate displays, so they never experience the "real world usage". I imagine the sound engineers have the "maxxed-spec" equivalent mixing studio, and the directors imagine everyone will experience the film in a theater or in a home theater, and not on the built-in Hisense TV speakers from across their living room.
When I was working at a transcription company I had to build my test kit with spare parts from the Bin. The Bin was full of old discard parts, early Celeron and Pentium machines with PS/2 KB/M.
Everything had to compile and run on that machine before I could ship builds to any of our transcriptionists. Our website needed to be able to load on it and so on.
What you say rings true, sound engineers are also mixing audio at 24 bits uncompressed, and while DVDs and Blu-Ray have the ability to play it back, I doubt the streaming services are sending it in full bit depth and also that most people buy hardware capable of decoding it if they did.
I know I'm in the minority here, but let me explain why it's not terrible audio mixes or bad actors.
The overall trend in TV and movies has been towards greater realism in all aspects. Less stage-audience sitcoms, more single-cam. Less "look at this actor" and more "look at this real person".
Actors aren't mumbling, they're speaking how real people speak. Trust me, screen actors have more stage training than ever -- they know how to enunciate but if they do, the director tells them to stop because they seem like they're giving a fake theatrical performance rather than being a slice of real life.
And because more and more people have huge TV's with great contrast, 5.1 surround systems, or are watching with AirPods with spatial audio... TV is becoming more film-like both in range of brightness and loudness. For many people (like myself) this is a godsend. It's like I'm at the cinema every time I watch an hour-long drama. It's amazing.
We could go back to low-contrast everything and overly enunciated actors, the way sitcom TV was, designed for small screens and terrible speakers. But that just feels so... backwards and limited and fake.
HOWEVER, I do firmly believe that modern TV content should be made more accessible, and that it's high time to build adjustable "end-user compression" into all TV's and video players. Let the user apply automatic volume equalization so quiet parts are just as loud as the loud parts when you're watching TV in your living room with lots of activity around. (And with 5.1 signals, you can even always keep dialog louder than sound effects and music.) And brightness equalization so you can see what's happening in the dark scenes of Game of Thrones when you're in a sunlit room.
> Actors aren't mumbling, they're speaking how real people speak. Trust me, screen actors have more stage training than ever -- they know how to enunciate but if they do, the director tells them to stop because they seem like they're giving a fake theatrical performance rather than being a slice of real life.
But if it translates into me not being able to understand them, that’s less realistic, because in a real-world scenario I’d be able to understand them, since our ears are adapted for the real world. A real world slurrer is about as easy to understand as an enunciator through movie speakers — at least, with how the meth-addled mixers do it now.
Bottom line: speech being incomprehensible is not realistic, and is not accomplishing some wonderfully artistic gritty realism.
Edit: To put it a different way: they may be speaking the way real people speak, but once you put it through a speaker, it will translate into something less comprehensible than that speech would be in a real-world scenario, and thus less true to how that situation would be in real life.
I'm sorry but that is just not true. That kind of naturalistic acting has been the dominant trend in movies since Brando. TV may have been more stage like but definitely not movies.
I don't watch a movie to get confused as I may in real life. The director is confused if they don't understand the fundamental rules of cinema - speak clearly and face the camera.
No matter the size of the screen, bad acting is bad.
> Actors need to go do some stage acting and learn how to e n u n c i a t e and project their voices.
Watch some German shows (notably: not Netflix’s Dark, as that was atypical; I mean shows actually produced for the German market). It’s horrible. Everything sounds super fake. Yes, everything is easy to understand, but I (a German) can’t watch anything because nothing sounds real.
It’s usually claimed this is because German actors do more stage acting, but as Dark (or other German actors in US productions where they don’t sound fake) show, it goes much deeper.
I live in Germany and honestly it just seems to me from speaking to Germans that they have the worst mixture of a lack of sense of quality and also an uncaring-ness/sense of futility about it. I think it is kind of like the "poor service at restaurants" thing. It is a negative point that has become accepted and almost expected by Germans, so there is literally no desire to change it, which would take a lot of effort for very little gain.
Tangentially, it also reminds me of going back to the UK last week to visit a conference. The trains were just so much different to those in Germany, even the crappy "Stansted Express", those here in Germany feel very utilitarian and unfriendly (particularly the Deutsche Bahn "Bombardier Double-deck Coach") while in the UK similarly priced train services are more explicitly friendly and caring in their design.
I do think it is a systematic or cultural "problem", but also it's kind of one of the things I like about Germany. The emphasis is on things that are completely tangible to everyone rather than more removed concerns like aesthetics and friendliness. And, I think this bleeds into the cultural arts too. I have noticed the same thing with a lot of popular German music too. It just lacks the sense of "coolness" that there is in other countries, and feels almost like a cringey imitation of other countries' popular music.
Dynamic range compression is called different things on different platforms, "midnight mode", "reduce loud sounds", on Windows it's "loudness equalization".
> Sound engineers need to stop needlessly flexing all that dynamic range that modern tools give them, and just produce a watchable movie.
And not just sound engineers! So many TV shows these days are filmed in nearly pitch black. If there are any lights on in the room (or it's still bright on a northern summer night and the living room doesn't have blackout curtains), I can't see the action. At first I thought I was just being cranky and old, but then we watched The Neverending Story with my kid and lo and behold, they film the "dark scenes" with torches etc. so you can still see the actors well enough to read their lips! I'm sure the new stuff would be fine if I was watching it in a proper theater, but it's a frikkin TV show and who even goes to the cinema anymore?
I know Sonos are not loved here, and other companies probably do it too, but Sonos sound bars allow you to enhance speech and also go into night mode. Night mode reduces the dynamic range - louder quiet bits and quieter loud bits.
I like the tv on quietly relative to other people I know, and continually messing with the volume is tedious.
> "The loud parts (including explosions and music) are too loud for her, so we have to turn the volume down during those parts, but the dialog is too quiet [..]"
On YouTube the ads are the explosions. Ads are so loud compared to the regular videos that I often just turn the volume down and the captions on.
If you know a foreign language, an option that works is to watch the dubbed version. I sometimes do this, especially for action movies: the foreign dubbed version is often easier to understand than the original english, because people doing the dubbing want to be understood.
French versions are always so clear (friends and family watch movies on laptops and phones). But I know enough English to notice that the movement of the actor's mouth does not correspond to the sound.
After watching The Boys season 3 I thought I was losing my listening comprehension of English. Glad to hear even native speakers struggle with modern movies. Karl Urban is great, but he just grumbles into his beard.
Whenever I'm watching a modern movie, sticking a compressor on the audio is a must. Open VLC settings, search for "compressor". It's programmer UI, but it does exactly what it says.
What kind of speakers do you have? Let me say that voices become greatly more intelligible with traditional speakers with a full midrange than modern speakers with exaggerated treble and bass.
What may be happening is that your speakers have a scooped profile, and audio mixes often have a scooped profile too. The sound effects and music tend to be bass and treble heavy, so you hear a lot of that.
But the lack of midrange means voices are hard to understand. You need that range to distinguish vowels, etc.
I don’t understand how anyone can watch a movie with iPad or iPhone speakers. They are missing the exact part of the spectrum where important info in voices live. It’s just painful for me to pay attention with those speakers.
My parents were doing this, and I got them a simple Bluetooth speaker with midrange, and they immediately noticed the difference. Their faces lit up.
Judging by the comments here, it does seem there is a "double" race to the bottom of more treble and bass -- on the audio mix side and on the speaker side. So it's a compounding effect, which would explain a lot. So the least you can do is get some better speakers.
Or honestly if you have an EQ, you can just experiment with turning up the midrange and the treble/bass down, although with some speakers this will make things sound bad.
----
Also I just skimmed the original article!! Isn't it obvious that the problem is iPad and iPhone speakers ????
I see young people listening to MUSIC on these things and it boggles my mind
They can't understand dialogue out of these crappy tinny speakers, so they turned on closed captioning ?? Makes total sense.
Or are they also using them with ear buds, which shouldn't have the problem? I notice more people NOT using ear buds (annoying people on the train), which makes me think the problem is the speakers
10 years after I bought some pairs of wireless headphones to avoid waking our newborn baby while watching a movie, they still get daily use for exactly this reason. Having the drivers right by our ears with individual volume controls gives us each the best audio, but it is unfortunate we need to resort to it.
Agreed. Background levels higher than dialog are a chronic problem. Actors turned away from the camera, so you don't get any facial cues. Yelled or hurried or mumbled lines, which don't get re-shot because, cost.
Add all that to the usual nodding actors - they don't understand what they're supposed to be feeling so they just nod as they talk. Sorry to call that out, now you will never be able to un-see it again.
And whispering as a substitute for emotion! What the hell? Character is supposed to be feeling intense emotion, so the actor whispers. The director should slap them every time, and re-shoot.
That difference in loudness between voice and effects often happens when the stereo material is derived from badly mixed-down 5.1 material. Having control of the six channel levels before mixdown may help to eliminate it; otherwise some sort of compression/limiting is necessary.
Many TVs have the option to set the audio output level to automatic, which doesn't mean the volume can't be changed but that around a given volume the dynamics are less extreme so that faint sounds are amplified and loud ones are attenuated.
I don't think this is a new phenomenon. It annoyed my Dad so much in the 1980s that he built an audio compressor for the TV which quietened the loud parts and amplified the quiet parts.
> Sound engineers need to stop needlessly flexing all that dynamic range that modern tools give them
Most of this is a result of surround sound and people not setting it up. There are two main speakers, a center speaker, and various surround speakers depending on the setup.
Dialog is primarily mixed to come out of the center speaker. If you turn the center volume up and the main speakers down you can fix the problem where explosions/music are too loud but dialog is too quiet.
My old AV receiver had a "night mode" (I believe it was called) that did some really great, natural-sounding dynamic range compression. I ended up just keeping it on all the time, for the reasons the parent lists. Alas, the entire unit stopped working several years ago, and I learned it would cost more to repair it than replace it... and I didn't do great research when buying its replacement, so "night mode" is no more.
You accurately diagnose symptoms, but you blame the wrong people. Engineers and actors don't make creative choices, they merely obey directors and producers.
That's actually close to another very important point. Most movies come with a 5.1 mix nowadays, and people watch them on a stereo system. Most players use a very simple formula when converting 5.1 to stereo, something like say Lout = 1/2 Lin + 1/3 SLin, Rout = 1/2 Rin + 1/3 SRin, and leave out the center channel completely. This is also done inside of TVs etc. The center channel gets dropped completely! And sometimes dialogue is only in the center channel - not all of it, but there are some lines every now and then that just only ever come out of the center channel. Those are lost.
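For anyone who wants to see what that does in numbers, here is a small Python sketch comparing the formula as quoted in the comment above with a downmix that folds the center channel into both sides. The 0.707 coefficients follow the commonly cited ITU-R BS.775-style downmix, but take them as an assumption here, as are the function names and the sample values. LFE is ignored in both versions to keep it short.

```python
def downmix_without_center(l, r, c, sl, sr):
    """The formula from the comment above: the center channel is dropped."""
    left = 0.5 * l + sl / 3.0
    right = 0.5 * r + sr / 3.0
    return (left, right)


def downmix_with_center(l, r, c, sl, sr, center_gain=0.707, surround_gain=0.707):
    """Fold the center equally into both outputs (assumed coefficients).

    A real player would also need to scale or limit the sum so it cannot
    clip; that step is left out of this sketch.
    """
    left = l + center_gain * c + surround_gain * sl
    right = r + center_gain * c + surround_gain * sr
    return (left, right)


# One hypothetical sample where a line of dialog lives only in the center.
sample = dict(l=0.0, r=0.0, c=0.8, sl=0.0, sr=0.0)

print(downmix_without_center(**sample))  # the line of dialog is gone
print(downmix_with_center(**sample))     # the line survives in both channels
```

Raising `center_gain` relative to the other coefficients is the stereo equivalent of the "turn the center speaker up" advice elsewhere in the thread.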
I have a home theater (do they still call it that?) with 5.1, 5 speakers and all that jazz. I tried everything but with many movies (esp. all the superhero stuff) the only option is to lower the volume and turn on the subtitles to be able to follow the dialog. I wish there was an option to make the loud parts less loud and the quiet parts less quiet, but I couldn't find it (it's an older Onkyo).
I'm going to guess that most folks nowadays go with a single soundbar that acts as the whole sound system; maybe there is still a center channel in the bar? Whether it's for price, a small living space, or a partner who doesn't want speakers everywhere, it's a convenient space saver that sits under the TV, mostly out of sight.
I observed this in The Walking Dead: in the first seasons I could clearly hear the dialogue, but for the last 4-5 seasons I've needed subtitles for everything. Zombie noises from 200m away are louder than main characters speaking next to the camera.
>Sound engineers need to stop needlessly flexing all that dynamic range that modern tools give them, and just produce a watchable movie.
Audio component manufacturers have actually responded to this tendency by adding in-line compressors to the audio channels that can be enabled/disabled. Sometimes this is labelled as the "Dynamic Range" feature in AV systems. Essentially it squashes the loudest sounds, which in turn brings up the average volume of the quietest sounds, thus evening the overall volume and allowing for a single volume setting to be used throughout the movie.
I think some of it is surround sound that modern streaming and televisions use without actual surround sound speakers. I've found if a movie/show is in surround sound, changing audio to non-surround sound helps a lot.
We do the same thing: loud noises, quiet dialogue, and I'm constantly adjusting the sound. I've tried the audio settings where you can normalize or do night mode, it just doesn't work well. I have kids trying to sleep. We recently started hooking up a bluetooth speaker that sits between us and on loud scenes we just flip it down into the couch. We both try to work on stuff while watching so subtitles aren't something we can just turn on.
You've got this backwards. Overly compressed dynamics are far more common than "excessive" dynamics. It's definitely not a case of sound engineers needlessly flexing.
Your receiver will have a dynamic compression feature of some sort. Make sure that's maxed and it'll help. But frankly, an action movie isn't supposed to have explosions at the same sound level as dialog.
> ...Whenever I watch movies...turn on subtitles--easy but shitty solution...
Yep, I've been doing this now for several years. I guess I got used to it from watching (and enjoying!) many films that are not in English, but 99.9% of the time it's because of actors mumbling, or some background noise (in the film) making it hard to discern what was said.
This is why I sometimes choose the German-dubbed version of a movie, even though I prefer the original one in English. In the original they either mumble so that it's hard to understand, or have an accent, or it's just a bad mix where the dialog is disturbed by noises. Dubbing reduces those noises and removes the mumbling and the accents.
My wife is also very sensitive to loud or harsh sounds and we have to do a similar thing. Sometimes I set up a compressor in VLC, but it's a pain to get the settings right. Really we need something more like a limiter but I don't think VLC has that. It's not very fun to watch a movie on the TV nowadays
> Sound engineers need to stop needlessly flexing all that dynamic range that modern tools give them, and just produce a watchable movie. Actors need to go do some stage acting and learn how to e n u n c i a t e and project their voices.
I am all for a ruthless dialog loudness war, like the loudness war in the 1990s-2010s.
There's a lot of podcasts with egregiously bad audio mixing. Which...c'mon guys it's a podcast. You should get that right. I've wondered if there's a way to use an algorithm to make people's voices a little easier to understand, like slightly exaggerating consonants or something.
Shout out to the ex-Cracked podcasts; a lot of those people -- presumably because of their experience with being part of actual productions -- have sound engineers for their podcasts.
I thought it was my tv having completely broken audio… Or the broadcaster. I noticed that, the volume goes up when commercials kick in, but also explosions, shootings, etc. the volume has to be adjusted every 5 minutes and we are watching 90% of content with captions on since I can remember…
I also suspect the hardware they master for is not the hardware most consumers have. My cheap soundbar isn't going to produce the sounds they want, and I'm not going to hear dialog. But when I visit a theater, I have no problems hearing the dialog.
> The loud parts (including explosions and music) are too loud for her...
Exact same problem at home... it was so bad on Interstellar that after that movie, she never went to a cinema with me again, or watched any movies together at all :(.
There seems to be a wave of naturalistic acting in TV series nowadays. Actors who simply play out the script verbatim as if in pre-premiere. No idea when this started. I'm thinking of the new she-hulk series.
Agreed, mixing at ‘reference level’, which is deafening, isn’t doing anyone any favours. Very few people have a home theatre set at reference level when watching a film or show.
To be fair the consumer audio stack is crap in the average home. I bet a CRT TV would be a big improvement in audio quality compared to most flat screens on the market.
Don't forget watching on tiny laptop speakers, cheap earbuds, or little bluetooth speakers. Even older movies are an issue when the sound quality isn't good.
I've never seen an implementation of that that actually did what it claimed, even on a modestly high end Denon receiver it just ends up making all audio sound harsh.
>Actors need to go do some stage acting and learn how to e n u n c i a t e and project their voices.
Hard disagree. I want realism. The sophistication of modern audiences is now so high that the level of enunciation and clarity in older movies now sounds comical to modern audiences. Believe it or not, the old style of speech you hear in older movies was literally artificial and made for acting: https://www.youtube.com/watch?v=Gpv_IkO_ZBU
Modern audiences want realism. They want a certain level of believability that matches reality. And a dialect artificially designed for "clarity" on radio just doesn't work.
Clarity takes a hit in the name of realism... but who cares? I have closed captions for that. Literally I don't see what's wrong with it.
> The sophistication of modern audiences is now so high that the level of enunciation and clarity in older movies now sounds comical to modern audiences.
My mind immediately went to that scene in idiocracy - "You talk like a fag, and your shit's all retarded"
The fact that there's a claimed generational gap here speaks more to the values of the generations re: passive vs active entertainment. I could also ask: "Why do all these 60-Somethings have the TV on as background noise while doing other tasks?"
To me it seems closed-caption usage is correlated with actually paying attention to television consumption.
People who have their TV on at all hours, as background entertainment to support their lifestyle, tend to not use CC. Why should they? The words literally don't matter, it's just an aesthetic.
People who actually desire an immersive experience, who deliberately pay attention to the shows they watch, tend to care about CC since it complements the audio & visual nicely. I don't have any evidence for this but I'd wager that plot synthesis and comprehension of television shows is greatly improved by CC. Or maybe it's that people who use CC tend to value and perform better at synthesis and comprehension? Regardless of causality, CC seems related to an individual's desire for more active entertainment.
I wish there was a CC mode for “I’m not deaf, I just don’t catch the quiet mumbly dialog sometimes so give me captions only of the dialog.”
Bonus if it’s delayed by exactly 2 seconds, so that it doesn’t spoil jokes, and I can quickly reference it if I missed something someone said (helping me avoid fixating on the captions.)
Because it’s the CC descriptions of music swells and things like “[laughs] and “[crying]”, etc that are distracting for me, and not needed.
What’s worse, sometimes they ruin comedic timing by putting the punchline on the screen before the setup has finished (or logical equivalent of a punchline… comedy is often found in timing) and the captions ruin the timing.
There is on some shows. For those the choice is between "English" and "English [CC]" or similar; the former omits descriptions of noises. Most don't have this, probably because it's not provided by the producers.
Agreed on timing issues. This also seems to vary by show/producer.
> Bonus if it’s delayed by exactly 2 seconds, so that it doesn’t spoil jokes, and I can quickly reference it if I missed something someone said (helping me avoid fixating on the captions.)
I think that'd ruin me more than not having subtitles at all, because I just get stuck at reading the subtitles all the time if they are on (which, I have on 99% of the time). I go so far to avoid content that doesn't have subtitles.
What I'd love to have, would be a "speed-reading" version of subtitles, that just show each word individually as it's spoken (or maybe two/three) instead of the full sentence. That way it can still be in sync, while not ruining punchlines. Would be a real hassle to actually write/make the subtitles though, but with all the AI/ML around today, there is probably a way to automate it.
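A crude version of the word-at-a-time idea above doesn't even need ML. The Python sketch below just divides a cue's time span evenly among its words. That even split is an assumption made for illustration; timing that actually follows the speech would need real word-level alignment. `split_cue` is a hypothetical helper name.

```python
def split_cue(start_ms, end_ms, text):
    """Split one subtitle cue into per-word cues.

    Assumption: every word gets an equal share of the cue's duration.
    Returns a list of (start_ms, end_ms, word) tuples.
    """
    words = text.split()
    if not words:
        return []
    duration = end_ms - start_ms
    count = len(words)
    cues = []
    for i, word in enumerate(words):
        # Integer arithmetic keeps the boundaries contiguous and exact.
        word_start = start_ms + i * duration // count
        word_end = start_ms + (i + 1) * duration // count
        cues.append((word_start, word_end, word))
    return cues


for cue in split_cue(0, 3000, "I am Groot"):
    print(cue)
```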
That’s the difference between subtitles and closed captions: one is for people who don’t speak that language but can hear in general, the other is for people who can’t hear at all.
There usually is (but not always). It is something like English vs English [for hearing impaired].
As a non-native English speaker I’m used to watching most television and movies with subtitles. The extra sound descriptions don’t bother me too much actually. Sometimes it is kind of funny actually, it is a running joke in my home to count the times a show has [ominous music].
> Because it’s the CC descriptions of music swells and things like “[laughs] and “[crying]”, etc that are distracting for me, and not needed.
This drives me nuts too, especially when they're being subjective and I'm asking myself things like 'Was that music really "ominous"? Didn't seem that way...'
Another thing that bugs me is when captions reveal a character's name to indicate who is speaking when the name of that character hasn't actually been revealed in the story yet.
> Bonus if it’s delayed by exactly 2 seconds, so that it doesn’t spoil jokes, and I can quickly reference it if I missed something someone said (helping me avoid fixating on the captions.)
This is the exact reason why I vastly prefer playing media on my own media player. I have it set to always delay subtitles by exactly 2 seconds. I even have the font overridden to use a nice geometric sans (Josefin Sans, to be exact); subtitles with bitmap fonts (like they are on DVD and Blu-ray) are disabled outright.
Every time there's a new person watching something with me, they comment on the subtitle delay. It usually starts with a complaint that the subs are off-sync. Never had someone (yet) stick to that complaint though, after this explanation. But, nobody I've met has previously recognised a need for such a delay, either. Until they watch something with me, of course.
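If your player has no subtitle delay setting, the same effect can be baked into the file. Below is a minimal Python sketch that moves every timestamp in an SRT file later by a fixed amount. It assumes standard `HH:MM:SS,mmm` timestamps, and `shift_srt` is a made-up helper name. One caveat: it would also shift any dialog text that happens to look like a timestamp.

```python
import re

# Matches SRT timestamps such as 00:01:02,500
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(text, delay_ms=2000):
    """Return SRT text with every timestamp moved later by delay_ms."""

    def shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        total = ((h * 60 + m) * 60 + s) * 1000 + ms + delay_ms
        total = max(0, total)  # a negative delay must not go below zero
        h, rest = divmod(total, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    return TIMESTAMP.sub(shift, text)


sample = "1\n00:00:58,500 --> 00:01:01,000\nWhat did he say?\n"
print(shift_srt(sample))
```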
> [CC] complements the audio & visual nicely. I don't have any evidence for this but I'd wager that plot synthesis and comprehension of television shows is greatly improved by CC
As you say, you don't have any evidence for this, so I'll just add that my subjective impression is the absolute opposite. I find it impossible to concentrate on what is happening visually on screen in a film when closed captions are on.
I associate CC with the one place I ever see them - at the laundromat, where they are always on because you can't hear the TV clearly above the background noise. Hard to think of a setting where "background entertainment" is a better descriptor.
As a non-native English speaker, I've watched so much stuff with captions that it's second nature for me.
But for any language that I understand well I only turn them on when the accent is so thick I can't get what the actors are saying.
Similarly, I always turn on all subtitles in video games because it's easy to miss something if, say, an in-game explosion or other sound effect covers the voice, or it is just too quiet (say, an NPC talking a bit far away).
But the article's claim that it helps multitasking seems weird: how would having to look at the screen help when multitasking?
>I find it impossible to concentrate on what is happening visually on screen in a film when closed captions are on.
This is a skill. I would have agreed with you 20 years ago but at some point I started putting them on because I could only afford a shitty audio setup and dramas got really mumbly. Took a while, but I learned to not be distracted by the subtitles.
I'm from Europe, so I've been used to subtitles from an early age. My wife is Canadian, and for the first little while of our dating she was bewildered by subtitles.
Now though, literally all my Canadian in-laws use subtitles, intergenerationally - they're used to them, and addicted to knowing what the heck is going on :)
This aligns pretty well with McLuhan's concept of cool and hot media [0]. The younger generations prefer it cool - perhaps their hotness is found in newer forms like video games.
For context, I'm in my mid-30s and grew up playing video games. When doing chores I listen to podcasts or audio books.
I really like to think of media consumption in terms of attentional pressure. How long can you not pay attention without "missing something". For movies and TV shows it seems like 15 seconds, plus or minus depending on the content itself. Chatty podcasts give you 30 seconds or a minute.
I enjoy the fast video games I play (shoutout to Deep Rock Galactic, killer PVE FPS with a non-toxic player base!) because they modulate their attentional pressure and ratchet it up intensely at moments. It feels great to handle a situation where even a subsecond lapse in attention would result in failure.
From wikipedia: Cool media are those that require high participation from users, due to their low definition (the receiver/user must fill in missing information). Since many senses may be used, they foster involvement. Conversely, hot media are low in audience participation due to their high resolution or definition.
Seems like it's hard to place fast video games? You don't have to fill in much of anything (hot) but they require full participation (cool).
This is an insightful observation but also feels incomplete and outdated. They use hot and cold to distinguish between "one sense" and "all senses", for some reason, which doesn't quite match modern media.
But I definitely agree that there is something to it, and video games occupy a different niche of entertainment than movies for me. Video games are a pastime and a hobby, movies and shows are not unlike books: I don't want to miss a single piece of the action.
When I have dinner I watch something off Youtube. Movie time is after dinner when I can give my full attention to it.
I'm a native English speaker and grew up mostly overseas in non-English speaking countries. Subtitles helped me understand American slang and conversational English that I wasn't used to hearing at home. I've also always struggled with distinctly understanding dialogue in films. It never had anything to do with active vs passive viewing. Another reason I always had them on was because I grew up in apartments and was always mindful of my noise levels for my neighbors. Using subtitles let me actively consume the film without worrying about bothering people around me. Now as an adult subtitles are my default for all media when available and I prefer it.
They mix sound differently now as well. I can’t follow speech in many shows and movies because voice volume is low and the background crap is too high.
This whole framing is bull. People who grew up without closed captions are just used to watching a show without it. For them, it’s distracting to constantly have words to look down and read when they are trying to pay attention to what is happening on the screen. Like trying to read the words on every single road sign when you drive somewhere.
You may notice that in movie theaters—the most focused and immersive watching experience there is—they don’t show captions except maybe when translating foreign language.
Or you may not, if you only watch movies at home on your TV.
> You may notice that in movie theaters—the most focused and immersive watching experience there is—they don’t show captions except maybe when translating foreign language.
Well, that just tells us you come from an English-speaking country. For the rest of us, subtitles in a movie theatre are among the most normal things in the world.
> You may notice that in movie theaters—the most focused and immersive watching experience there is
Movie theaters have the same influence on the mind as any other experience of collective attention. It keeps your attention aligned with the attention of the crowd. It may be a case of herd behavior, or it may be something special, I don't know, but it works.
I believe that this collective attention is the main reason why movie theaters are still profitable. But not just that: the surroundings associated with attention on a screen, and all the rituals like eating popcorn, also do their part as stimuli leading to a learned response. You get highly focused attention without any conscious effort. You need conscious effort to fight it and divert your attention from the screen.
But none of that applies when you try to watch a movie or a lecture at home. Despite all your experience of collective attention, it is much harder to keep your attention focused.
No, that's not it. I grew up with literally everything in English being subtitled. Always hated it, but all I can do is make an effort to not have my eyes there.
Even some DVDs did not have the option to turn off subtitles. I never had subtitles on for anything that gave me the option. (I even bought DVDs from overseas occasionally to not get subtitles)
But now every other show is frequently incomprehensible.
Either they've put lead in the drinking water, making us all not understand the spoken word, or they've made dialog audio shit.
I think it's not as universal as you believe. I personally use closed captioning as a crutch when I'm half-ass watching something, because if my mind wanders and I miss hearing a line I can usually just read it without having to pause and rewind.
If I'm actually fully immersed in the movie or whatever, and I can understand the dialogue, then I don't need the captions and I leave them off.
In the past, that line needed to come in loud and clear, or people would just miss it, probably forever. Now, we can rewind, or of course turn on subtitles, so there's not as much pressure on the audio finishing to be super clear.
With television you turn it on and there are a few mediocre choices you can tune out. On streaming platforms you have to specifically select a program to watch. When Gen Z watches/listens to something in the background it's much more likely to be a long YouTube video or a podcast.
I use CC even in my native language especially when I background watch. It's easy to work while watching a tv show (only works with relatively boring work) if I know I can quickly glance at the subtitles if I missed anything.
Your tone is a little smug here. If I’m paying attention to something I’m watching on TV, I can hear the dialogue. I want to be able to actually watch the performances of the actors, or whatever else is on the screen, instead of reading captions.
Sounds right - my parents (early 70s) don't pay attention to what they are watching even tho watching TV is the only thing they are doing at that time. They both even regularly fall asleep watching TV/movies at night and literally don't care that they missed most of the show/movie they were "watching." They aren't interested in watching the parts they slept through in the morning.
Well sure, if the actors mumble all the time trying to appear enigmatic, then comprehension is surely greatly improved by captions. I'm not American, and I often find it very hard to make out what is said in TV series/movies. This is a new trend in acting btw; older TV was not like that.
Older people are more likely to be alone, and the TV helps to break the silence. TV is usually TV shows where people speak properly, so that helps, I guess.
>To me it seems closed-caption usage is correlated with actually paying attention to television consumption.
The alternative is that the new generation has no ability to focus on anything and needs the same information piped to their brain through both their eyes and ears. Up next: smellovision captions.
Closed captioning only very recently became not crap. It used to be mainly served as those black bars with white text coming in about 30 seconds too late and with ridiculous typos.
31 but yeah, I've had subs on for almost a decade now? Maybe only 6-7 years?
For me I find I miss things without the subs on. Whispered/background dialog that I'm unsure how anyone is supposed to hear/understand being top of list but also sound mixing seems to be terrible by default and it's hard to hear/understand characters even in a quiet room sometimes. I read quickly so subs have never been an issue to me, I can scan and parse the subs and be looking back at the video in no time at all.
For foreign content (like Anime) I use dubs + subs (the subs are of the initial English translation before the dubs were done) because it gives 2 passes at any given line. I find it very interesting to see how they change things between the two and it sometimes paints a fuller picture.
> For me I find I miss things without the subs on. Whispered/background dialog that I'm unsure how anyone is supposed to hear/understand being top of list but also sound mixing seems to be terrible by default and it's hard to hear/understand characters even in a quiet room sometimes.
So many video sources coming with only surround mixes, or lazy stereo mixes that were made automatically from surround mixes, while very few homes have a surround system at all, and even fewer have one that's at least decently-calibrated, has been absolute hell for being able to tell WTF anyone's saying in movies. I suspect many VHS releases had better audio for most people's actual system even today, than the blu-ray of the same movie does.
That's interesting - I have subtitles on at all times (I am Swedish, we never dub adult content so I have used subtitles for a very long time), but when I do watch children's movies with my kids, they are dubbed, and then I find having subtitles on is extremely annoying.
I cannot stand having subtitles in the same language as the dialogue, but which differs in phrasing/content. It is very distracting to me.
I've moved to Germany a couple of years ago and this is a HUGE issue for me. I try to improve my German by watching German shows/movies in their native language. All German audio vs subtitles differ, constantly, and not in a minor way. It's practically never been an issue with English for the past two decades.
> For foreign content (like Anime) I use dubs + subs (the subs are of the initial English translation before the dubs were done) because it gives 2 passes at any given line. I find it very interesting to see how they change things between the two and it sometimes paints a fuller picture.
Anime is an interesting case, because — remember, a lot of these threads are about poor enunciation and mixing? — anime tends to be really well enunciated, and the characters speak like they're on stage. It's easy to follow along. I keep subs on, because I don't understand literally everything, but usually I don't read them.
Surprise surprise, this doesn't reduce my immersion at all.
Also if you speak some Japanese and listen to the Japanese audio while reading the subs it makes for some interesting food for thought, often the translations aren’t exactly what the characters are actually saying so you get a feel for how the languages and cultures differ.
1) The reason TikTok etc. have closed captions on is that audio is off by default. If you want people to listen to your words but audio is off, put it into closed captions.
2) With closed captions you can watch a movie on low volume without disturbing others and still understand every word, even if the girl in the corner of the room barely utters a word.
> The reason TikTok etc. have closed captions on is that audio is off by default. If you want people to listen to your words but audio is off, put it into closed captions.
I can't believe that whole dramatic article missed this very obvious reason.
It didn't. When I downloaded it the other week to see what all the hubbub was about (spoiler alert: I killed it with fire lol), audio was on by default and I had to change that setting. I'd wager that OP forgot that they set it themselves.
TikTok audio is not off by default. But the idea is right that some people will have the audio off, and CC means you can consume the content in either case.
I have a hard time believing this. Captions demand more attention if anything. I can passively "watch" a video by only listening to the audio, but I can't passively read.
I think that the phenomenon described in the article may actually be a symptom of a much deeper social change. Listening and auditory comprehension were critical skills for communication and preservation of knowledge in prehistory. Spoken word is inherently ephemeral. As civilizations developed or adopted writing systems, and the population became increasingly literate, text supplanted spoken communication and oral history in many areas. There are many obvious benefits to that change, but I believe that we also inadvertently sacrificed our listening and auditory comprehension skills in the process over many generations.
Text messaging/SMS is increasingly preferred to phone calls, with many of the younger generations experiencing high levels of anxiety if they're required to call someone.
This is completely anecdotal, but I've also observed several others who are unable to follow verbal navigation instructions - either spoken from another person or even live step by step instructions from a navigation app. They only feel confident if there's a visual representation.
I think we've mistakenly classified that behavior as having a "visual learning style", when it is more accurately the result of our species losing its ability to process auditory language.
I think you're right in absolute terms that captions generally require more attention than audio, but that's only true if you're choosing one or the other, when most people are doing both.
With audio, if you missed a word, because you were focussing on something else for a moment, you have to rewind. If you have captions on, you can glance quickly at the screen, read the word or two that you missed, allowing you to 'recover'.
That allows you to pay even less active attention to the audio, because you know you can always 'error-correct' later.
> I have a hard time believing this. Captions demand more attention if anything. I can passively "watch" a video by only listening to the audio, but I can't passively read.
I do this. I can parse what is happening on screen and read the captions in a fraction of the time the thing actually plays out, then I got a few seconds to do something else. Usually I would be reading HN or something while I tune out the video for a few seconds, before glancing at it again to catch the next bit.
Listening to a video and reading something else at the same time doesn't work for me. When I do that, I usually forget what I was reading or miss something in the video. Interleaving works much better for me.
Also I don't normally do that, just when there's some particularly boring part that I don't want to skip but which doesn't demand my undivided attention.
> I do this. I can parse what is happening on screen and read the captions in a fraction of the time the thing actually plays out, then I got a few seconds to do something else. Usually I would be reading HN or something while I tune out the video for a few seconds, before glancing at it again to catch the next bit.
That kind of context switching really does sound terrible to me. In my case, I speed up the content itself by 3 to 5x speed so that I can process more information at once and any boring bits are basically sped through. It's helped me retain a lot more information than simply watching at 1x speed.
This is an interesting take actually.. If we can assume that younger generations spend more time communicating electronically via some form of text messaging, at least the majority so, are you saying that that just becomes the "default" form of communication naturally?
Whereas older generations would have spent far more time communicating verbally..
That makes some sense to me anyway. Would be super interesting to see some further studies done. And to philosophise about what the results could mean.
I was thinking along those lines, yeah. Extrapolating that out, I can imagine a future where writing is the only form of language. It'd make for some interesting dystopian fiction, if nothing else.
I'd be interested in some formal studies as well. My thoughts aren't well researched or anything. Just ideas.
I can read considerably faster than people speak. I can keep up with a scene by glancing at the image and reading the text while focused on something else primarily. I can’t do that with audio.
> Captions demand more attention if anything. I can passively "watch" a video by only listening to the audio, but I can't passively read.
If the dialogue is 60dB below the explosions (or even the music at times) and you're doing something like cooking with a 70dB noise floor and 80dB peaks, then you have zero chance of getting the dialogue without blowing out your windows.
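To put numbers on that, here is a back-of-the-envelope Python sketch. It assumes all the figures are sound pressure levels on the same dB scale and that dialog only needs to match the noise floor to be audible, which is generous. The function names are made up for illustration.

```python
def db_to_pressure_ratio(db):
    """Convert a level difference in dB to a sound pressure ratio (20*log10 rule)."""
    return 10 ** (db / 20)


def required_peak_db(noise_floor_db, dialog_to_peak_gap_db, margin_db=0.0):
    """Peak level needed so that dialog sits `margin_db` above the room noise,
    given that the loudest sounds are `dialog_to_peak_gap_db` above the dialog."""
    return noise_floor_db + margin_db + dialog_to_peak_gap_db


print(db_to_pressure_ratio(60))   # pressure ratio between explosions and dialog
print(required_peak_db(70, 60))   # peak level needed over a 70 dB kitchen
```

A 60 dB gap is a factor of about 1000 in sound pressure, and lifting the dialog just to the level of a 70 dB kitchen puts the explosions at 130 dB, so the windows comment is not much of an exaggeration under these assumptions.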
I won't pretend I'm fully multitasking, but sometimes I want a comfort show on in the background that I've seen before while I play a game on my phone/iPad. In those cases it's nice to be able to look up and see what was said if you just missed part of it. But I also leave subs on 100% the time I'm normally watching for the obvious reason of not missing anything, it's not just for background TV.
It's simpler than this. If you're watching a movie/show on a phone or laptop in a shared area, it'll probably be at low volume, so you can't hear the dialog. I've watched entire series with CC for exactly this reason.
With social media videos, CC is even more important cause nobody wants their phone making noise in public, and any wise app will mute by default anyway.
I disagree, I have a harder time paying attention to people speaking than reading. Reading is faster, much faster than waiting for people to finish a sentence.
Then again, maybe you're right, I don't process auditory language and when I was in university, I skipped classes to instead read the textbook back to back without a teacher distracting me by yammering around.
I can browse my phone/watch instagram videos with a NFLX show playing. If something interesting happens I can look up and quickly catch up on the dialogue by reading the caption.
It's not so much the verbal aspect rather than the memory aspect. Written instructions are more soothing because you can refer to them verbatim and they stand no risk of being forgotten.
Not mentioned: The increase in poor audio mixing that sounds like it was done by a monkey, in addition to the prevalence of devices with speakers crammed in odd directions inside of enclosures that are in no way, shape, or form big enough to contain them. If there was one advantage of the days of big tube TVs, it was that you had plenty of space to fit big speakers in too.
Even on my nice 2.1 setup on my PC, I always put captions on.
It's because re-reading a word I didn't understand is much faster and easier than re-hearing (in my head) a word I didn't understand.
It's even happened with my spouse, in the same room, speaking the same dialect of American English as me, with the same accent. I mis-hear one word, and it bleeds into the next word, and I need to think for a few seconds or ask them to repeat before the sentence makes any sense to me.
Perhaps we, and the TV actors, just don't enunciate like we used to?
I think mishearing a word causes an effect that feels something like a mental mispredict leading to a pipeline flush. There's definitely a delay to refill that pipeline.
I recently graduated to the next step beyond closed captions: I've started enabling audio descriptions for TV shows too.
It's SO interesting.
They are generally really well put together. And they often highlight things that I would have missed - mentioning the name of a character who joins a scene who I don't instantly recognize, for example.
Even more fun: audio descriptions of the title sequences for a TV show. Three that I particularly enjoyed were Severance, Moon Knight and Star Trek: Picard - in all three cases the titles are rich with visual metaphors relating to the show, which the audio description then carefully spells out for you!
I find Amazon X-Ray similarly useful. It displays the current actors on screen, their screen names, and the current song playing in the background of any scene. As far as I know, no other streaming service has this feature.
I have a home theater with 5.2 and 5 speakers; all it does is that when stuff in the movie goes boom, my house starts to shake.
That’s how it’s supposed to sound. Explosions are loud.
This is not a problem with the source material, it’s a problem with the setup of the playback equipment. The source contains all the original data, it’s the playback equipment’s job to render that information in the appropriate way considering the space and circumstances where it’s played.
The ‘problem’ is that the sound is mixed to a cinema-level dynamic range. If you don’t want the full dynamic range you can compress it at playback. They store it at the full range since you can’t exactly un-compress it and the exact amount of compression depends on your setup and situation anyway.
Any half-decent A/V receiver has this capability. If your receiver has Audyssey you may want to look into the “dynamic eq” and “dynamic volume” settings.
"Reduce Loud Sounds": https://support.apple.com/en-gb/guide/tv/atvba773c3c9/tvos
Things have gotten worse over the past 10 years or so (I'm 40-something, btw). As the article states, we used subtitles in movies to learn the language and to understand thick accents. However, while our comprehension got better and better over time, our overall use of subtitles stayed the same. It is very uneven, too.
Old movies (not yet seen) and movies for children are never a problem. Also some properly made movies. The atmosphere and sound effects are fine as well as the dialogs.
But it has become too much of a trend to make muttery and mushy dialog where people are seemingly unable to articulate, to a level never heard in real life (where muttering may be more common, since conversations are more casual and affected by mood or situation, unlike the carefully created and heavily controlled ones in movies). It is also uneven. Some movies are still fine, but others (perhaps the newest Batman comes to mind; I'm unsure, having mostly forgotten that movie, it being so obscure) are a struggle, not fun at all. This is aggravated by the combination of loud bangs and muttery dialog.
A sign that this is a systematic and intentionally manufactured problem, and that sound engineers and directors have lost their touch completely, is the existence of a feature in the newest Apple TV for equalizing the sound level in a movie. Streaming hardware manufacturers had to come up with a solution to correct what sound engineers ruin or are unable to get right nowadays.
Older movies are almost never a problem. New movies are, uncomfortably often.
The sound mixes today are so far from a pleasant experience that the people who make them should feel ashamed. How many people actually have a home cinema room with soundproofing and 20 speakers?
Most people watch movies with a headset, a TV, or a 2.1 / 5.1 surround system, and they may even be limited by other factors like neighbours and, you know, real life.
And the voice levels are obnoxiously low, way beyond what I would call a good experience.
It's not your hearing, it's the mix. Source: I'm a mastering engineer with a monitoring setup worth six figures and I use closed captions too.
Speakers in TVs used to be tailored to making speech intelligible. Nowadays it's... well whatever it is, it's not really meant to do anything. Not even the more high-end popular speaker systems are meant to do anything at all, really. They all just kinda sorta make whatever is coming out of them sound flashy and whooshy. That's another reason why movies are even more difficult to understand.
And let's be honest - a lot of it nowadays is just not _worth_ hearing... Staple tv shows from the 70s that fell into obscurity for being extremely pedestrian had better writing than nearly all big-budget movies nowadays. I remember when Pig came out recently, it was the only good movie during a span of maybe 3 months. That's how dire it is nowadays.
Edit: oh, and I forgot about the ubiquitous 5.1-to-stereo downmixing bug: https://news.ycombinator.com/item?id=32883588
Also, digital camera manufacturers should feel ashamed for cameras providing photos that don't even fit on my monitor. My 24MP Sony camera outputs photos at 6000x4000 pixel resolution! Who the hell has a monitor that big? I can only see a small part of the photo at one time.
Oh wait, no, that's absolutely crazy. Because maybe some people do have a monitor that high-res, or maybe they want to print it or edit it or..., or... And guess what, there is no need to have the photos be smaller to display them, you can simply use a viewer that scales them to fit on your display. There is absolutely no need to throw away that extra information and you might want it in the future.
Same goes for audio in a movie. If it doesn't fit the dynamic range suitable for your setup, then scale it to whatever you need it to be. You may not have a setup to enjoy that, but others might, and they shouldn't have to deal with sub-par mixes because other people don't know how to configure their playback equipment.
Maybe it's a lost art. I have found that turning off subtitles and forcing myself to get used to their accents does improve my listening comprehension, but it's a much easier choice when you can't otherwise understand it just due to mixing.
I don't know that it helps me focus, if anything I have to focus harder when they're talking so I can hope that their lips will be enough to figure it out. But it does mean that if I'm watching something in a place with roommates I can use a much lower volume and figure out what is going on by combining the text with the low SNR audio I hear.
Listening at a low volume should mean better overall hearing over time too, right? My default volume went from like -20dB to -45dB on the same system over the last decade, I think either I'm being polite to roommates or just more careful with my hearing.
Can't really help with the actors thing, though I think I'd rather characters that are truer to life than sound like they're acting on stage, personally.
You can have both though. A great example is Patrick Stewart as Captain Jean-Luc Picard in Star Trek: The Next Generation. He’s never mumbled a line in his life and you can understand everything he says during quiet dialog. You can be true to life, quiet AND understandable.
This does help, but at some point Apple increased the compressor’s release time so that it’s far too quiet even 10-15s after the loud transient. I wish they would make it behave more like a brickwall limiter.
If only there was a hotkey to toggle it on and off, instead of hidden 4 levels down the touch-and-swipe menu.
> It's hard to bargle nawdle zouss with all these marbles in my mouth.
(German band, lots of English language covers with "wait THAT'S what they're saying?!?" moments.)
[1]: https://ffmpeg.org/ffmpeg-filters.html#acompressor [2]: https://github.com/ruuda/dotfiles/blob/fff8d1d905767e85b57d7...
https://github.com/slhck/ffmpeg-normalize
Everything had to compile and run on that machine before I could ship builds to any of our transcriptionists. Our website needed to be able to load on it and so on.
What you say rings true, sound engineers are also mixing audio at 24 bits uncompressed, and while DVDs and Blu-Ray have the ability to play it back, I doubt the streaming services are sending it in full bit depth and also that most people buy hardware capable of decoding it if they did.
The overall trend in TV and movies has been towards greater realism in all aspects. Less stage-audience sitcoms, more single-cam. Less "look at this actor" and more "look at this real person".
Actors aren't mumbling, they're speaking how real people speak. Trust me, screen actors have more stage training than ever -- they know how to enunciate but if they do, the director tells them to stop because they seem like they're giving a fake theatrical performance rather than being a slice of real life.
And because more and more people have huge TV's with great contrast, 5.1 surround systems, or are watching with AirPods with spatial audio... TV is becoming more film-like both in range of brightness and loudness. For many people (like myself) this is a godsend. It's like I'm at the cinema every time I watch an hour-long drama. It's amazing.
We could go back to low-contrast everything and overly enunciated actors, the way sitcom TV was, designed for small screens and terrible speakers. But that just feels so... backwards and limited and fake.
HOWEVER, I do firmly believe that modern TV content should be made more accessible, and that it's high time to build adjustable "end-user compression" into all TV's and video players. Let the user apply automatic volume equalization so quiet parts are just as loud as the loud parts when you're watching TV in your living room with lots of activity around. (And with 5.1 signals, you can even always keep dialog louder than sound effects and music.) And brightness equalization so you can see what's happening in the dark scenes of Game of Thrones when you're in a sunlit room.
But if it translates into me not being able to understand them, that’s less realistic, because in a real-world scenario I’d be able to understand them, since our ears are adapted for the real world. A real world slurrer is about as easy to understand as an enunciator through movie speakers — at least, with how the meth-addled mixers do it now.
Bottom line: speech being incomprehensible is not realistic, and is not accomplishing some wonderfully artistic gritty realism.
Edit: To put it a different way: they may be speaking the way real people speak, but once you put it through a speaker, it will translate into something less comprehensible than that speech would be in a real-world scenario, and thus less true to how that situation would be in real life.
No matter the size of the screen, bad acting is bad.
Watch some German shows (notably: Not Netflix’s Dark as that was untypical, I mean actually produced for the German market shows). It’s horrible. Everything sounds super fake. Yes, everything is easy to understand, but I (a German) can’t watch anything because nothing sounds real.
It’s usually claimed this is because German actors do more stage acting, but as Dark (or other German actors in US productions where they don’t sound fake) show, it goes much deeper.
Tangentially, it also reminds me of going back to the UK last week to visit a conference. The trains were just so much different to those in Germany, even the crappy "Stansted Express", those here in Germany feel very utilitarian and unfriendly (particularly the Deutsche Bahn "Bombardier Double-deck Coach") while in the UK similarly priced train services are more explicitly friendly and caring in their design.
I do think it is a systematic or cultural "problem", but also it's kind of one of the things I like about Germany. The emphasis is on things that are completely tangible to everyone rather than more removed concerns like aesthetics and friendliness. And, I think this bleeds into the cultural arts too. I have noticed the same thing with a lot of popular German music too. It just lacks the sense of "coolness" that there is in other countries, and feels almost like a cringey imitation of other countries' popular music.
And not just sound engineers! So many TV shows these days are filmed in nearly pitch black. If there are any lights on in the room (or it's still bright on a northern summer night and the living room doesn't have blackout curtains), I can't see the action. At first I thought I was just being cranky and old, but then we watched The Neverending Story with my kid and lo and behold, they film in "dark scenes" with torches etc so you can still see the actors well enough to read their lips! I'm sure it would be fine if I was watching it in a proper theater, but it's a frikkin TV show and who even goes to the cinema anymore?
I like the tv on quietly relative to other people I know, and continually messing with the volume is tedious.
On YouTube the ads are the explosions. Ads are so loud compared to the regular videos that I often just turn the volume down and the captions on.
What may be happening is that your speakers have a scooped profile, and audio mixes often have a scooped profile too. The sound effects and music tend to be bass and treble heavy, so you hear a lot of that.
But the lack of midrange means voices are hard to understand. You need that range to distinguish vowels, etc.
I don’t understand how anyone can watch a movie with iPad or iPhone speakers. They are missing the exact part of the spectrum where the important info in voices lives. It’s just painful for me to pay attention with those speakers.
My parents were doing this, and I got them a simple Bluetooth speaker with midrange, and they immediately noticed the difference. Their faces lit up.
Judging by the comments here, it does seem there is a "double" race to the bottom of more treble and bass -- on the audio mix side and on the speaker side. So it's a compounding effect, which would explain a lot. So the least you can do is get some better speakers.
Or honestly if you have an EQ, you can just experiment with turning up the midrange and the treble/bass down, although with some speakers this will make things sound bad.
----
Also I just skimmed the original article!! Isn't it obvious that the problem is iPad and iPhone speakers ????
I see young people listening to MUSIC on these things and it boggles my mind
They can't understand dialogue out of these crappy tinny speakers, so they turned on closed captioning ?? Makes total sense.
Or are they also using them with ear buds, which shouldn't have the problem? I notice more people NOT using ear buds (annoying people on the train), which makes me think the problem is the speakers
Add all that to the usual nodding actors - they don't understand what they're supposed to be feeling so they just nod as they talk. Sorry to call that out, now you will never be able to un-see it again.
And whispering as a substitute for emotion! What the hell? Character is supposed to be feeling intense emotion, so the actor whispers. The director should slap them every time, and re-shoot.
Most of this is a result of surround sound and people not setting it up. There are two main speakers, a center speaker, and various surround speakers depending on the setup.
Dialog is primarily mixed to come out of the center speaker. If you turn the center volume up and the main speakers down you can fix the problem where explosions/music are too loud but dialog is too quiet.
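To make the center-channel trick concrete, here is a rough sketch of what a stereo fold-down with an adjustable center level might look like. The function names and the two gain parameters are made up for illustration, and the -3 dB fold-down coefficient is just a commonly used value, not what any particular receiver does. It also ignores the LFE channel and does nothing to prevent clipping.

```python
import math


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20.0)


def downmix_to_stereo(fl, fr, c, sl, sr, center_gain_db=0.0, main_gain_db=0.0):
    """Fold one 5.1 sample frame (LFE omitted) down to a stereo pair.

    center_gain_db and main_gain_db are hypothetical user controls:
    raise the first and lower the second to favor dialog.
    """
    k = math.sqrt(0.5)  # about -3 dB, a common fold-down coefficient
    center = c * db_to_gain(center_gain_db)
    main = db_to_gain(main_gain_db)
    left = main * fl + k * center + k * sl
    right = main * fr + k * center + k * sr
    return left, right


# Loud music in the mains, quiet dialog in the center (made-up sample values).
frame = dict(fl=0.5, fr=0.5, c=0.1, sl=0.0, sr=0.0)
plain = downmix_to_stereo(**frame)
boosted = downmix_to_stereo(**frame, center_gain_db=6.0, main_gain_db=-6.0)
print(plain, boosted)
```

With the center up 6 dB and the mains down 6 dB, the dialog makes up a much larger share of each output channel even though the overall level drops.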
Audio component manufacturers have actually responded to this tendency by adding in-line compressors to the audio channels that can be enabled/disabled. Sometimes this is labelled as the "Dynamic Range" feature in AV systems. Essentially it squashes the loudest sounds, which in turn brings up the average volume of the quietest sounds, thus evening the overall volume and allowing for a single volume setting to be used throughout the movie.
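For anyone curious how that squashing works, here is a minimal sketch of a static compressor curve operating on levels in dB. The threshold, ratio, and makeup gain values are arbitrary assumptions picked for illustration, and real receivers also smooth the gain over time (attack and release), which this leaves out.

```python
def compress_db(level_db, threshold_db=-30.0, ratio=4.0, makeup_db=15.0):
    """Static compressor curve on levels in dB (0 dB = full scale)."""
    if level_db > threshold_db:
        # Above the threshold, every `ratio` dB of input yields 1 dB of output.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    # Makeup gain lifts everything, which is what brings the quiet parts up.
    return level_db + makeup_db


explosion_db = 0.0   # hypothetical peak level of an explosion
dialog_db = -40.0    # hypothetical level of whispered dialog

gap_before = explosion_db - dialog_db
gap_after = compress_db(explosion_db) - compress_db(dialog_db)
print(f"gap before: {gap_before:.1f} dB, gap after: {gap_after:.1f} dB")
```

The loud peak gets pulled down while the dialog only receives the makeup gain, so the gap between them shrinks and one volume setting can cover both.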
I avoid stuff without captions. Sorry, your shite is almost certainly mixed horribly.
Your receiver will have a dynamic compression feature of some sort. Make sure that's maxed and it'll help. But frankly, an action movie isn't supposed to have explosions at the same sound level as dialog.
Yep, I've been doing this now for several years. I guess I got used to watching (and enjoying!) many films that are not in English, but 99.9% of the time it's because of actors mumbling, or some background noise (in the film) not making it easy to discern what was stated.
I am all for a ruthless dialog loudness war, like the loudness war in the 1990s-2010s.
The difference is that adverts are very heavily compressed so the quiet bits are much much louder making the whole thing seem loud.
Exact same problem at home... it was so bad on Interstellar that after that movie, she never went to a cinema with me again, or watched any movies together at all :(.
Turns out I’m also half deaf in one ear.
Anyway I’d recommend both a centre channel with adjustable volume and a hearing test.
A pretty common complaint is that developers only test things on their latest generation i9, with unlimited ram, and 8k displays.
Guy could barely speak English, but listen to the background audio when he screams that. Great example.
Hard disagree. I want realism. Modern audiences are now so sophisticated that the level of enunciation and clarity in older movies sounds comical to them. Believe it or not, the old style of speech you see in older movies was literally artificial and made for acting: https://www.youtube.com/watch?v=Gpv_IkO_ZBU
Modern audiences want realism. They want a certain level of believability that matches reality. And a dialect artificially designed for "clarity" on radio just doesn't work.
Clarity takes a hit in the name of realism... but who cares? I have closed captions for that. Literally I don't see what's wrong with it.
My mind immediately went to that scene in idiocracy - "You talk like a fag, and your shit's all retarded"
To me it seems closed-caption usage is correlated with actually paying attention to television consumption.
People who have their TV on at all hours, as background entertainment to support their lifestyle, tend to not use CC. Why should they? The words literally don't matter, it's just an aesthetic.
People who actually desire an immersive experience, who deliberately pay attention to the shows they watch, tend to care about CC since it complements the audio & visual nicely. I don't have any evidence for this but I'd wager that plot synthesis and comprehension of television shows is greatly improved by CC. Or maybe it's that people who use CC tend to value and perform better at synthesis and comprehension? Regardless of causality, CC seems related to an individual's desire for more active entertainment.
Bonus if it’s delayed by exactly 2 seconds, so that it doesn’t spoil jokes, and I can quickly reference it if I missed something someone said (helping me avoid fixating on the captions.)
Because it’s the CC descriptions of music swells and things like “[laughs]” and “[crying]”, etc. that are distracting for me, and not needed.
What’s worse, sometimes they ruin comedic timing by putting the punchline on the screen before the setup has finished (or logical equivalent of a punchline… comedy is often found in timing) and the captions ruin the timing.
Agreed on timing issues. This also seems to vary by show/producer.
I think that'd ruin me more than not having subtitles at all, because I just get stuck at reading the subtitles all the time if they are on (which, I have on 99% of the time). I go so far to avoid content that doesn't have subtitles.
What I'd love to have, would be a "speed-reading" version of subtitles, that just show each word individually as it's spoken (or maybe two/three) instead of the full sentence. That way it can still be in sync, while not ruining punchlines. Would be a real hassle to actually write/make the subtitles though, but with all the AI/ML around today, there is probably a way to automate it.
As a non-native English speaker I’m used to watching most television and movies with subtitles. The extra sound descriptions don’t bother me too much actually. Sometimes it is kind of funny actually, it is a running joke in my home to count the times a show has [ominous music].
This drives me nuts too, especially when they're being subjective and I'm asking myself things like 'Was that music really "ominous"? Didn't seem that way...'
Another thing that bugs me is when captions reveal a character's name to indicate who is speaking when the name of that character hasn't actually been revealed in the story yet.
This is the exact reason why I vastly prefer playing media on my own media player. I have it set to always delay subtitles by exactly 2 seconds. I even have the font overridden to use a nice geometric sans (Josefin Sans, to be exact); subtitles with bitmap fonts (like they are on DVD and Blu-ray) are disabled outright.
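If your player doesn't have a delay setting, a fixed offset can also be baked into the subtitle file itself. Below is a minimal sketch that shifts the timestamps of SubRip (.srt) text; the function name and the sample cue are made up, and it assumes the usual `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines.

```python
import re
from datetime import timedelta

# Matches SubRip timestamps such as 00:01:02,345
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_srt(text, delay_seconds=2.0):
    """Return SubRip text with every cue shifted by delay_seconds."""
    delay = timedelta(seconds=delay_seconds)

    def shift(match):
        h, m, s, ms = (int(g) for g in match.groups())
        t = timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms) + delay
        t = max(t, timedelta(0))  # never go below zero for negative delays
        total_ms = t // timedelta(milliseconds=1)
        h, rest = divmod(total_ms, 3_600_000)
        m, rest = divmod(rest, 60_000)
        s, ms = divmod(rest, 1000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    shifted = []
    for line in text.splitlines():
        # Only touch timing lines, so timestamps quoted in dialog stay intact.
        if "-->" in line:
            line = TIMESTAMP.sub(shift, line)
        shifted.append(line)
    return "\n".join(shifted)


sample = "1\n00:00:59,500 --> 00:01:01,000\nHello there.\n"
print(shift_srt(sample))
```

Doing it in the player is still nicer, since you can change your mind without rewriting files.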
Every time there's a new person watching something with me, they comment on the subtitle delay. It usually starts with a complaint that the subs are off-sync. Never had someone (yet) stick to that complaint though, after this explanation. But, nobody I've met has previously recognised a need for such a delay, either. Until they watch something with me, of course.
As you say, you don't have any evidence for this, so I'll just add that my subjective impression is the absolute opposite. I find it impossible to concentrate on what is happening visually on screen in a film when closed captions are on.
I associate CC with the one place I ever see them - at the laundromat, where they are always on because you can't hear the TV clearly above the background noise. Hard to think of a setting where "background entertainment" is a better descriptor.
But for any language that I understand well I only turn them on when the accent is so thick I can't get what the actors are saying.
Similarly I always turn on all subtitles in video games because it's easy to miss something if say ingame explosion or other sound effect covers the voice, or it is just too silent (say NPC talking that's a bit far)
But the article's claim that it helps with multitasking seems weird - how would having to look at the screen help when multitasking?
Never watched anything in a foreign language? It's just familiarity, after a while you get used to reading and watching at the same time.
My 26 year-old son watches everything with captions, so there’s always a battle because I switch them off and he switches them back on again.
This is a skill. I would have agreed with you 20 years ago but at some point I started putting them on because I could only afford a shitty audio setup and dramas got really mumbly. Took a while, but I learned to not be distracted by the subtitles.
I'm from Europe so I've been used to subtitles from an early age. My wife is Canadian, and for the first little while of our dating she was bewildered by subtitles.
Now though, literally all my Canadian in-laws use subtitles, intergenerationally - they're used to them, and addicted to knowing what the heck is going on :)
[0] https://mediawiki.middlebury.edu/MIDDMedia/Hot_versus_cool_m...
I really like to think of media consumption in terms of attentional pressure. How long can you not pay attention without "missing something". For movies and TV shows it seems like 15 seconds, plus or minus depending on the content itself. Chatty podcasts give you 30 seconds or a minute.
I enjoy the fast video games I play (shoutout to Deep Rock Galactic, killer PVE FPS with a non-toxic player base!) because they modulate their attentional pressure and ratchet it up intensely at moments. It feels great to handle a situation where even a subsecond lapse in attention would result in failure.
From wikipedia: Cool media are those that require high participation from users, due to their low definition (the receiver/user must fill in missing information). Since many senses may be used, they foster involvement. Conversely, hot media are low in audience participation due to their high resolution or definition.
Seems like it's hard to place fast video games? You don't have to fill in much of anything (hot) but they require full participation (cool).
But I definitely agree that there is something to it, and video games occupy a different niche of entertainment than movies for me. Video games are a pastime and a hobby, movies and shows are not unlike books: I don't want to miss a single piece of the action.
When I have dinner I watch something off Youtube. Movie time is after dinner when I can give my full attention to it.
Ditto with excessive darkness.
You may notice that in movie theaters—the most focused and immersive watching experience there is—they don’t show captions except maybe when translating foreign language.
Or you may not, if you only watch movies at home on your TV.
If there is bullshit framing then it is the very notion of assumptions based on age, which is caked into the headline.
Well, that just tells us you come from an English-speaking country. For the rest of us, subtitles in a movie theatre are among the most normal things in the world.
Movie theaters have the same influence on the mind as any other experience of collective attention. It keeps your attention aligned with the attention of the crowd. It may be a case of herd behavior, or it may be something special, I don't know, but it works.
I believe that this collective attention is the main reason why movie theaters are still profitable. But not just that: the surroundings associated with attention on a screen, and all these rituals like eating popcorn, also do their part as stimuli leading to a learned response. You get highly focused attention without any conscious effort. You need conscious effort to fight it and to divert attention from the screen.
But none of that applies when you try to watch a movie or a lecture at home. Despite all your experience of collective attention, it is much harder to keep your attention focused.
Even some DVDs did not have the option to turn off subtitles. I never had subtitles on for anything that gave me the option. (I even bought DVDs from overseas occasionally to not get subtitles)
But now every other show is frequently incomprehensible.
Either they've put lead in the drinking water, making us all not understand the spoken word, or they've made dialog audio shit.
If I'm actually fully immersed in the movie or whatever, and I can understand the dialogue, then I don't need the captions and I leave them off.
In the past, that line needed to come in loud and clear, or people would just miss it, probably forever. Now, we can rewind, or of course turn on subtitles, so there's not as much pressure on the audio finishing to be super clear.
I'm a background watcher for sure, and this trend with CC suggests to me that the whole "2nd screen" thing isn't playing with the younger generation.
Any insights as to why that may be?
To give more context, I was always a "background media" person. Even as a kid I'd draw while in front of the TV, I was never a focused participant.
By contrast I always pay attention when CC is off.
Given how much younger people use their phones, I’m gonna say my hypothesis is plausible.
Older people are more likely to be alone and the TV helps to break the silence. What they watch is usually TV shows where people speak properly, so that helps, I guess.
The alternative is that the new generation has no ability to focus on anything and needs the same information piped to their brain through both their eyes and ears. Up next: smellovision captions.
For me I find I miss things without the subs on. Whispered/background dialog that I'm unsure how anyone is supposed to hear/understand being top of list but also sound mixing seems to be terrible by default and it's hard to hear/understand characters even in a quiet room sometimes. I read quickly so subs have never been an issue to me, I can scan and parse the subs and be looking back at the video in no time at all.
For foreign content (like Anime) I use dubs + subs (the subs are of the initial English translation before the dubs were done) because it gives 2 passes at any given line. I find it very interesting to see how they change things between the two and it sometimes paints a fuller picture.
So many video sources coming with only surround mixes, or lazy stereo mixes that were made automatically from surround mixes, while very few homes have a surround system at all, and even fewer have one that's at least decently-calibrated, has been absolute hell for being able to tell WTF anyone's saying in movies. I suspect many VHS releases had better audio for most people's actual system even today, than the blu-ray of the same movie does.
I cannot stand having subtitles in the same language as the dialogue, but which differs in phrasing/content. It is very distracting to me.
Anime is an interesting case, because — remember, a lot of these threads are about poor enunciation and mixing? — anime tends to be really well enunciated, and the characters speak like they're on stage. It's easy to follow along. I keep subs on, because I don't understand literally everything, but usually I don't read them.
Surprise surprise, this doesn't reduce my immersion at all.
2) With closed captions you can watch a movie on low volume without disturbing others and still understand every word, even if the girl in the corner of the room barely utters a word.
Source: I'm a "20-something".
I can't believe that whole dramatic article missed this very obvious reason.
I have a hard time believing this. Captions demand more attention if anything. I can passively "watch" a video by only listening to the audio, but I can't passively read.
I think that the phenomenon described in the article may actually be a symptom of a much deeper social change. Listening and auditory comprehension were critical skills for communication and preservation of knowledge in prehistory. Spoken word is inherently ephemeral. As civilizations developed or adopted writing systems, and the population became increasingly literate, text supplanted spoken communication and oral history in many areas. There are many obvious benefits to that change, but I believe that we also inadvertently sacrificed our listening and auditory comprehension skills in the process over many generations.
Text messaging/SMS is increasingly preferred to phone calls, with many of the younger generations experiencing high levels of anxiety if they're required to call someone.
This is completely anecdotal, but I've also observed several others who are unable to follow verbal navigation instructions - either spoken from another person or even live step by step instructions from a navigation app. They only feel confident if there's a visual representation.
I think we've mistakenly classified that behavior as having a "visual learning style", when it is more accurately the result of our species losing its ability to process auditory language.
With audio, if you missed a word, because you were focussing on something else for a moment, you have to rewind. If you have captions on, you can glance quickly at the screen, read the word or two that you missed, allowing you to 'recover'.
That allows you to pay even less active attention to the audio, because you know you can always 'error-correct' later.
I do this. I can parse what is happening on screen and read the captions in a fraction of the time the thing actually plays out, then I got a few seconds to do something else. Usually I would be reading HN or something while I tune out the video for a few seconds, before glancing at it again to catch the next bit.
Listening to a video and reading something else at the same time doesn't work for me. When I do that, I usually forget what I was reading or miss something in the video. Interleaving works much better for me.
Also I don't normally do that, just when there's some particularly boring part that I don't want to skip but which doesn't demand my undivided attention.
That kind of context switching really does sound terrible to me. In my case, I speed up the content itself by 3 to 5x speed so that I can process more information at once and any boring bits are basically sped through. It's helped me retain a lot more information than simply watching at 1x speed.
Whereas older generations would have spent far more time communicating verbally.
That makes some sense to me anyway. Would be super interesting to see some further studies done. And to philosophise about what the results could mean.
I'd be interested in some formal studies as well. My thoughts aren't well researched or anything. Just ideas.
If the dialogue is 60dB below the explosions (or even the music at times) and you're doing something like cooking with a 70dB noise floor and 80dB peaks, then you have zero chance of getting the dialogue without blowing out your windows.
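To put numbers on that, here is a back-of-the-envelope sketch. The function and parameter names are made up, the 60 dB and 70 dB figures are the ones from the comment above, and it treats "dialog just reaches the noise floor" as the bare minimum for audibility, which is a simplifying assumption.

```python
def required_peak_db(noise_floor_db, dialog_margin_db, dialog_below_peak_db):
    """Peak level needed so that dialog sits dialog_margin_db above the noise."""
    dialog_db = noise_floor_db + dialog_margin_db
    return dialog_db + dialog_below_peak_db


# Dialog 60 dB below the explosions, cooking noise floor at 70 dB,
# and dialog only just level with that noise floor (0 dB margin).
peak = required_peak_db(noise_floor_db=70, dialog_margin_db=0, dialog_below_peak_db=60)
print(peak)

# A 60 dB gap expressed as a linear amplitude ratio.
amplitude_ratio = 10 ** (60 / 20)
print(amplitude_ratio)
```

So under these assumptions the explosions would have to hit 130 dB just for the dialog to match the kitchen noise, which is well past the point where your windows start to complain.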
With social media videos, CC is even more important cause nobody wants their phone making noise in public, and any wise app will mute by default anyway.
Then again, maybe you're right, I don't process auditory language and when I was in university, I skipped classes to instead read the textbook back to back without a teacher distracting me by yammering around.
It's because re-reading a word I didn't understand is much faster and easier than re-hearing (in my head) a word I didn't understand.
It's even happened with my spouse, in the same room, speaking the same dialect of American English as me, with the same accent. I mis-hear one word, and it bleeds into the next word, and I need to think for a few seconds or ask them to repeat before the sentence makes any sense to me.
Perhaps we, and the TV actors, just don't enunciate like we used to?
It's SO interesting.
They are generally really well put together. And they often highlight things that I would have missed - mentioning the name of a character who joins a scene who I don't instantly recognize, for example.
Even more fun: audio descriptions of the title sequences for a TV show. Three that I particularly enjoyed were Severance, Moon Knight and Star Trek: Picard - in all three cases the titles are rich with visual metaphors relating to the show, which the audio description then carefully spells out for you!
But as a 'go to', that's like reading the comments before the article.
Edit: Not exactly the same : I was referring to "Directors commentary" audio tracks.
Edit 2: I'm a bit worried you need the environment and expressions explained to you...
"John enters the room, looking furious. He's holding some money. Simon stares at David."
So at least in Britain, it's available on normal, broadcast TV.