They say the models were trained on a bunch of books and that they learned the use of the dash from there. That's fine, no one is denying that humans have always used dashes in their books.
But where I bet you would rarely see a dash is something like a short product review, a YouTube comment or a WhatsApp message. In these contexts the dashes can and do seem out of place.
The ship has sailed, unfortunately. Obviously humans use em-dashes too. But more and more people's first reaction to an em-dash would be "haha got you, AI!"
Imagine you're an artist designing a character with 6 fingers today.
The situation is really sad. People who have the proper skills have to change how they work just to avoid "witch hunting" (for lack of a better term). What's next? If GPT-5.5 uses a lot of ellipses, are we going to stop using them? Semicolons? Will humans be using only the most watered-down subset of English at some point?
If em-dashes become a turn-off, GPT users will simply replace them with hyphens, and using em-dashes will become a sign of real professionalism or, alternatively, a sign of being completely clueless.
> They say the models were trained on a bunch of books…
Yeah, it's where I learned to use em-dashes as well.
> In these contexts the dashes can and do seem out of place.
Hmmm… For sure I use em-dashes in HN comments. I am not sure that I mentally differentiate as to whether I am in one scenario or another. (But to be sure I am not likely to leave an Amazon review though — so perhaps those contexts you called out self-select.)
I use em dashes in my comments too, but this is Hacker News. I also prefer using my own rsync setup to signing up for Dropbox; that doesn't mean my eyebrows wouldn't rise if all my friends and family suddenly started sharing command line tips and tricks. It's self-selection, like you say.
But my point about the article not being convincing is just this: I can share my anecdotal evidence, you can too, we all go in a circle and it gets us nowhere. What I was expecting when I clicked the link was some actual data on dash prevalence in casual writing such as YouTube comments and a conclusion based on that data. What I got was more "Well if you look at this very particular kind of writing then extrapolate that to cover all writing then my point is made."
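For what it's worth, the kind of measurement I mean is not hard to sketch. Below is a minimal Python example, assuming you already have the comments as a list of strings; the function name and the two sample comments are made up for illustration and are not real data. It only counts the em dash character (U+2014) per 1,000 words and says nothing about who or what wrote the text.

```python
import re

EM_DASH = "\u2014"  # the em dash character, written as an escape


def dashes_per_thousand_words(comments):
    """Return em dash occurrences per 1,000 words across all comments."""
    total_words = 0
    total_dashes = 0
    for text in comments:
        total_dashes += text.count(EM_DASH)
        # \w+ matches runs of word characters, so a dash glued between
        # two words still leaves them counted as two separate words.
        total_words += len(re.findall(r"\w+", text))
    if total_words == 0:
        return 0.0
    return 1000 * total_dashes / total_words


# Hypothetical sample, only to show the call; not a real corpus.
sample = [
    "great product, works fine",
    "I use it daily \u2014 no complaints so far",
]
rate = dashes_per_thousand_words(sample)
print(f"{rate:.1f} em dashes per 1,000 words")
```

Run over a real set of YouTube comments or product reviews, and then over book text, this would at least give two numbers to compare instead of anecdotes.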
Yeah, I remember Word doing that, and I manually did it when writing things like my honours thesis (which I typeset in LaTeX) or when I was writing HTML where the – and — would be liberally used.
> no one is denying that humans have always used dashes in their books.
I am. Em-dashes, like all punctuation, were invented at some point. Even the space didn't always exist, and the em-dash is a lot more recent than that.
And if it was such a vital part of punctuation, it would have been on our typewriters and therefore on our modern keyboards.
> And if it was such a vital part of punctuation, it would have been on our typewriters and therefore on our modern keyboards.
Typewriters were monospaced, which gives you extremely limited scope for distinguishing hyphens and em dashes. Small wonder that they didn’t bother attempting a distinction, and then that provided the inertia for us to never get such a thing now.
Typewriters are a lowest-common-denominator sort of thing. They lacked all kinds of widely-used stuff, and some of it they killed by their omission. Accented letters you mostly couldn’t do at all, and the rest of the time could only do by a terrible hack.
There’s a similar story in the final death of the letter thorn (þ) in English <https://en.wikipedia.org/wiki/Thorn_(letter)#Middle_and_Earl...>: imported fonts lacked the character, so people substituted it with y which looked most similar, and that substitution became ubiquitous, and now most people think the first word in “Ye Olde Curiositie Shoppe” is pronounced /jiː/ (“ye”), whereas it was actually just how they spelled “the”, so it was /ðiː/.
It’s a general rule in such technologies: although they make many new things possible, they also damage what was there before.
I am fairly confident the majority of my LinkedIn network are not experienced writers and don't know what em dash means. All make regular posts with em dashes in them. Their excessive use, combined with a certain presentation style, tells me it's ChatGPT. When I ask them they confirm it's ChatGPT.
I wasn't using the em dash, but I appreciate looking into what seems like pedantry. It's about semantics after all, and having the right syntax is key. So I realized I'd like to be more thorough and use em dashes, en dashes and hyphens correctly.
My point is that if you/we treat things "statically" we're missing the point. It's not just tech that's changing, it's society changing as a result of tech (always has been).
> It's not just tech that's changing, it's society changing as a result of tech (always has been).
True, and it goes both ways. As the cultural backlash to AI grows (see terms for it like Generative Slop, Bullshit Oracles, Regression Engines, etc) so too does people's desire to both identify and differentiate themselves from AI content and/or content that appears AI-esque.
So just know there is a significant subsection of the population that will clock such writing styles and will immediately dismiss and/or react negatively to your messages not on merit, but on "smell".
Rather, human typesetters of professionally printed material have always (well, since it was invented) used the em-dash. Handwritten dashes rarely fall cleanly into categories that match one typographical dash or another, and until relatively recently, before proportional display fonts with large character sets and fancy input methods, typed text (whether on a typewriter or a computer) was unlikely to contain typographical dashes directly, though some systems (especially publishing/typesetting toolchains!) had system-specific, ad hoc means of representing them.
OTOH, as long as user-interactive web content has existed—so “always” in a context of a particular view of the online world—em-dashes have been part of it, because the facilities that make it easy to use (whether automatic replacement, or various keyboard input modifying mechanisms) have been sufficiently common that a robust minority of users have regularly used one or more of them.
> I speak of the elegant, elongated hyphen, the gentle friend and ally of all writers, used to set off a chunk of text within a sentence.
There's nothing elegant about a punctuation mark firmly glued to the words on either side, making a sequoia-sized typographic log that typically gets wrapped in its entirety to the next line, leaving a half mile or so of white space just hanging in space before the wrap.
If you're gonna use the em dash, make sure your software can break a line on either side of one.
I totally agree: em-dashes simply do not belong glued to the words on either side, style guides be damned! They look awful and wrap badly. There are no other punctuation marks that conceptually separate words without spaces! Every other punctuation mark that connects two words brings them syntactically closer together than a single space. And/or, hyphen-style, person@place, A&W, foo.bar -- they're all creating a closer association between the words than a space would. Why should the em dash stand alone in making a more distant association -- essentially a lower-precedence operator -- while removing spaces? It's nonsense. Put spaces around your em-dashes and fuck the style guides!
I have been wondering about this for a while. It looks weird to me, as in German a space between the word and the em dash is mandatory. (At least it was some decades ago.)
I frequently am accused of using LLMs to write my prose, something that I not only eschew, but also believe is morally corrupt and intellectually dishonest.
I’m not above spellcheck, grammar checkers, or even LLM driven evaluation of articles, but my thoughts, word choices, and structure are always of my own design.
I use the em-dash where it is appropriate.
I find that people accusing writers of using AI typically disagree with the premise of the text, and use the “AI” character assault as a method of dehumanising the author and dismissal of their work. The assertion is very rarely made in good faith, but rather is used as a weak attempt to discredit an idea without actually refuting the premise or even examining the argument.
Shame on whoever argues in this way, it’s weak, unproductive, and intellectually lazy. It’s fine to disagree, but if you aren’t willing to act in good faith, just keep your thoughts to yourself. You’re only going to discredit your own point of view if you touch the keyboard.
For lack of an easy way to type it on my computer I tend to use parentheses (which effectively serve the same purpose) but will opt for an em dash more often when typing on my phone at the risk of bookish messages and notes.
Coworkers have emailed me before suggesting a certain course of action which I can tell is heavily influenced by an LLM. "I think we should X because Y" to which I just think "Is this really what you know and believe?". If I wanted an LLM to answer I could have asked it myself. But I don't accuse — I ask for more evidence or a better argument because if I'm forced to work with an LLM by proxy I am going to reflect the burden of dealing with one back to the author.
I'm not a professional writer except of software, but both there and in my non-professional writing, I'm a lot more likely to use semicolons than em-dashes.
At least, that’s what I’m doing.
(/s)
Word will insert emdashes for you for example, but it's not like the reddit comment box does.
It works the same way on a Mac (key repeat off) or by pressing option+shift+hyphen (key repeat on).
But nothing I type in a web form would have them.
Personally, I'm more prone to excessive semicolon usage, which seems to aggravate editors.