mmooss · a year ago
Here's an easy, if not always precise way to remember:

* Hyphens connect things, such as compound words: double-decker, cut-and-dried, 212-555-5555.

* EN dashes make a range between things: Boston–San Francisco flight, 10–20 years: both connect not only the endpoints, but define that all the space between is included. (Compare the last usage with the phone number example under Hyphens.)

* EM dashes break things, such as sentences or thoughts: 'What the—!'; A paragraph should express one idea—but rules are made to be broken.

Unicode has the original ASCII hyphen-minus (U+002d), as well as a dedicated hyphen (U+2010), other functional hyphens such as soft and non-breaking hyphens, and a dedicated minus sign (U+2212), and some variations of minus such as subscript, superscript, etc.

There's also the figure dash "‒" (U+2012), essentially a hyphen-minus that's the same width as numbers and used aesthetically for typesetting, afaik. And don't overlook two-em-dashes "⸺" and three-em-dashes "⸻" and horizontal bars "―", the latter used like quotation marks!
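
For anyone who wants to check these characters on their own machine, here is a minimal Python sketch that prints the official Unicode name of each dash-like character mentioned above. The list `DASH_LIKE` and the helper `describe` are illustrative names, not part of any library; only `unicodedata.name` is a real standard-library call.

```python
import unicodedata

# Dash-like characters mentioned in the comment above.
DASH_LIKE = [0x002D, 0x2010, 0x2212, 0x2012, 0x2013, 0x2014, 0x2E3A, 0x2E3B, 0x2015]

def describe(code_point: int) -> str:
    """Return 'U+XXXX <char> <official Unicode name>' for one code point."""
    ch = chr(code_point)
    name = unicodedata.name(ch, "<unnamed>")
    return f"U+{code_point:04X} {ch} {name}"

for cp in DASH_LIKE:
    print(describe(cp))
```

The names make the intended function explicit even when a font draws several of these glyphs almost identically.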

lxgr · a year ago
> EM dashes break things, such as sentences or thoughts

Some style guides recommend "space, en dash, space" for this, and I prefer that myself – mainly because some software doesn't treat em dashes correctly as word separators for double click selection purposes.

For example, I'm pretty sure that at least some Kindle models would highlight both the word before and after the em dash when selecting one of them, which makes using the dictionary very annoying.

krick · a year ago
It's actually only your post that made me realize people don't normally put spaces around em dash. In French, Russian and a bunch of other languages proper typesetting is to use em dash as a standard dash character, and you always put spaces around them. So I did it in English as well, for many years now.

(I also now looked up and found out that in Spanish, apparently, you are supposed to put space only on one side of the dash, when used as a direct speech separator.)

rahimnathwani · a year ago
I grew up in the UK, and have always used space, minus, space.

The first keyboard I used was my dad's typewriter, and I don't recall it having any 'dash' other than the minus sign.

opello · a year ago
> Some style guides recommend "space, en dash, space" for this

The last paragraph of the article also addressed the subjective nature of spacing around the em dash:

> Spacing around an em dash varies. Most newspapers insert a space before and after the dash, and many popular magazines do the same, but most books and journals omit spacing, closing whatever comes before and after the em dash right up next to it.

As far as the selection detail, did you mean that you replace an em dash used like a comma or parenthesis with spaces and an en dash for specific highlight performance issues? Surely the spaces and an em dash would alleviate the selection highlight behavior and not muddy the waters of when to use an em vs. an en dash?

mmooss · a year ago
The AP Style Manual, a/the leading source for US journalism at least, says

  <word> <space> <dash> <space> <word>

Outside of journalism, usually there is no padding, only,

  <word> <dash> <word>

I'm with you: For searches, the spaces make the words easier to parse. Those rules predate computers, I would guess.

cyrillite · a year ago
I have been doing this for purely aesthetic reasons my whole life. Style guides be damned, I hate connected em dashes.

KPGv2 · a year ago
> Some style guides recommend "space, en dash, space" for this

Which one does that? I threw up a little in my mouth and wish to avoid such style guides in the future!

divbzero · a year ago
I prefer the dedicated minus (U+2212) over the hyphen-minus (U+002d) for mathematical use because they look different in most font faces.

Are there cases where the dedicated hyphen (U+2010) is preferred over the hyphen-minus?

LegionMammal978 · a year ago
G. Branden Robinson swears by U+2010 for hyphens in groff's Unicode output [0], but I see it as a hypercorrection. The most common convention by far (among authors who use Unicode and care about dashes) is to use U+002D for hyphens and U+2212 for minus signs. Not even the Unicode Consortium uses U+2010 for hyphens in its documents, and I'm not aware of any major organization that does.

As far as appearance goes, almost all fonts I've looked at make U+2010 identical to U+002D (i.e., they don't put any 'minus' into the 'hyphen-minus'), but a few make U+2010 a smidgeon shorter.

[0] https://news.ycombinator.com/item?id=38121765

wruza · a year ago
Intl.NumberFormat also prefers it, but then you can't paste negative numbers into most financial software, calculators, spreadsheets. Even back into inputs on the same webpage, if it does custom number parsing. Even though <input type=number> accepts U+2212 as a minus, it turns it into a regular minus when you spin it down to -2.

It looks much better though and more visible: −1 vs -1. I wish hyphen was a separate symbol from the ascii start, or that monospace fonts didn't tend to shorten "-" cause it makes little sense in monospace anyway.
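
The paste problem described above can be worked around on the parsing side. This is a minimal sketch, assuming the only issue is a minus look-alike in the first position; as far as I know, Python's built-in `float()` does not accept U+2212 as a sign. `parse_number` and `MINUS_LIKE` are hypothetical names for illustration.

```python
# Characters that commonly stand in for a leading minus sign.
MINUS_LIKE = {
    "\u2212",  # MINUS SIGN
    "\u2010",  # HYPHEN
    "\u2012",  # FIGURE DASH
    "\u2013",  # EN DASH
}

def parse_number(text: str) -> float:
    """Parse a number, treating a minus-like first character as ASCII '-'."""
    cleaned = text.strip()
    if cleaned and cleaned[0] in MINUS_LIKE:
        cleaned = "-" + cleaned[1:]
    return float(cleaned)

print(parse_number("\u22121"))   # typographic minus followed by 1
print(parse_number("-2.5"))      # plain ASCII input still works
```

Only the leading character is rewritten, so a dash inside the string (for example in a range like 10–20) still fails loudly instead of being silently reinterpreted.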

layer8 · a year ago
It has two potential benefits:

— In the context of automatic text processing, it unambiguously indicates the function of a hyphen, as opposed to a minus

— Fonts can choose to make the hyphen-minus a bit wider than a regular hyphen, to accommodate the usage as a minus sign. In that case, U+2010 would be typographically more appropriate for a hyphen, similar to how U+2212 usually is typographically more appropriate for a minus sign.

zajio1am · a year ago
Visual style of hyphen-minus depends on font. Some fonts display it more like a minus, others like a hyphen. So if you care about distinguishing hyphen and minus, it makes sense to use the dedicated hyphen and minus, and not use hyphen-minus at all.
mproud · a year ago
A regular hyphen arguably looks better when used as a hyphen and not a minus.
BoumTAC · a year ago
I'm not a native English speaker, but don't you use the ";" in English?

To me, it feels like it is the same purpose as the EM dashes.

And I discovered the em dash through ChatGPT; I had never seen it before.

layer8 · a year ago
A semicolon connects, whereas an em-dash creates more of a pause and therefore separates. In addition, em-dashes can be used in pairs to create a parenthesis, which semicolons can’t. I think with time you will appreciate the difference.

https://thenarrativearc.org/blog/2020/2/4/epic-grammar-battl...

OJFord · a year ago
Dashes surround a sub-clause - something like this - which is like a parenthetical addition to a sentence that could stand alone without it; semi-colons (';') connect a further sentence or part of one where perhaps a full-stop and additional word could have been. They also sometimes separate list items following a colon, especially if the things listed are longer sentences perhaps themselves containing commas that'd otherwise be ambiguous.
grey413 · a year ago
Em dashes are very similar to semicolons. You use em dashes if your related sentence is in the middle of another sentence, and semicolons if it's at the end.

They're frequently used in skilled and professional grade writing.

mmooss · a year ago
Many people don't use semicolons (;) in English but many do, and they are certainly part of correct grammar.

Semicolons are generally alternatives to periods, when you want more connection between the two sentences. Like periods, semicolons must have two full sentences—that is, what could be full sentences—on either side of them; the potential 'full sentences' are properly called independent clauses. (A dependent clause needs the rest of the sentence to form valid grammar; it can't function on its own. For example, in this paragraph's first sentence, when you want more connection between the two sentences is a dependent clause. Often they follow commas.)

Another use of semicolons is for lists in a paragraph where one of the list items has a comma in it (similar to the parsing problem for CSVs where some records contain commas): I only like wine; beer, but only ales; and orange juice.

dspillett · a year ago
> Unicode has the original ASCII hyphen-minus (U+002d), as well as a dedicated hyphen (U+2010), other functional hyphens…

Which can be fun when parsing CSV files from various sources. I've hit numbers with U+2010 or others where you would expect a hyphen-minus to be. Presumably someone² has copied a negative number from a document where one of the alternate symbols was used, and pasted it into everyone's favourite data-mangler¹ which interpreted it as a string, and so on down the chain.

--------

[1] Excel. Sometimes a joy, sometimes the bane of my existence.

[2] It is surprising, horrifying even, how much manual manipulation of data goes on in banking, where you might naturally assume everything is more automated these days. Sometimes a laborious manual process done regularly is seen as cheaper than paying for it to be automated…
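
A quick way to catch the CSV problem described above before it propagates is to scan incoming files for dash look-alikes. This is only a sketch: the `SUSPECT` set and `find_suspect_dashes` are hypothetical names, and the sample data is made up for illustration.

```python
import csv
import io
import unicodedata

# Dash look-alikes that should not appear where a hyphen-minus is expected.
SUSPECT = {"\u2010", "\u2011", "\u2012", "\u2013", "\u2014", "\u2212"}

def find_suspect_dashes(csv_text: str):
    """Yield (row, column, Unicode name) for each dash look-alike found."""
    reader = csv.reader(io.StringIO(csv_text))
    for row_no, row in enumerate(reader, start=1):
        for col_no, cell in enumerate(row, start=1):
            for ch in cell:
                if ch in SUSPECT:
                    yield row_no, col_no, unicodedata.name(ch)

# Made-up sample: one cell uses U+2212, another uses U+2010.
sample = "account,balance\nA-1,\u2212120.50\nB-2,\u201075.00\n"
for hit in find_suspect_dashes(sample):
    print(hit)
```

Reporting the position rather than silently fixing it keeps a human in the loop, which seems appropriate for financial data.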

neycoda · a year ago
Bring in fonts that don't distinguish these different characters well and now it's hard to tell how messy this has become.
energy123 · a year ago
The em dash is now a GPT-ism and is not advisable unless you want people to think your writing is the output of an LLM.
sho_hn · a year ago
My advice is to take pleasure and have confidence in good writing, over misspent energy worrying about things like this.

If you practice your skills, you will reap the rewards.

alt187 · a year ago
The letter 'm' is now a GPT-ism and is not advisable unless you want people to think your writing is the output of an LLM.
xanderlewis · a year ago
No, thanks—I’ll keep using them as I always have.
mmooss · a year ago
Someone else said the same. How can that be when most word processors, and at least some phone keyboards, automatically insert em dashes?
grey413 · a year ago
It's infuriating that people are drawing this conclusion. LLMs pick up on em dash usage because professional and skilled writers use em dashes. They're a consistently useful, if niche, part of the literary toolkit.

But, no, now it's a problem because the majority of people's experience with writing is graded essays. And because LLMs emulate professionals, it's now a red flag if students write too much like professionals. What a joke.

phlakaton · a year ago
Emily Dickinson wept—
nkotov · a year ago
Recently ran into this. Didn't realize it was that obvious.
windward · a year ago
And you'd better not 'delve' into anything

docmars · a year ago
EN dashes are also great for date ranges: 1/1/2025–3/28/2025
raverbashing · a year ago
You are right of course

However this is the kind of rule that "existed" for a while and most likely will go away as most people can't be bothered with the difference and it all looks similar anyway

Or maybe who knows, it will keep going on because chatgpt knows it

econ · a year ago
I've always wanted an array or object with range keys like: arr[0–2] = 123; if(arr[1.5555]>122){}
yesbabyyes · a year ago
That doesn't seem to be an array at all, if the idea is to check whether a number is within a range. Seems like an interesting data type though, a combination of a range data type and a map/associative array.
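
The range-plus-map idea can be sketched in a few lines of Python. This is one possible design, not an existing data type: `RangeMap` is a hypothetical class, ranges are treated as half-open `[start, end)`, and overlapping ranges are assumed never to be inserted.

```python
import bisect

class RangeMap:
    """Map half-open numeric ranges [start, end) to values (hypothetical type)."""

    def __init__(self):
        self._starts = []  # sorted range starts
        self._items = []   # (start, end, value), kept parallel to _starts

    def set(self, start, end, value):
        # Assumes ranges never overlap; no check is made here.
        i = bisect.bisect_left(self._starts, start)
        self._starts.insert(i, start)
        self._items.insert(i, (start, end, value))

    def get(self, key, default=None):
        # Find the last range whose start is <= key, then check its end.
        i = bisect.bisect_right(self._starts, key) - 1
        if i >= 0:
            start, end, value = self._items[i]
            if start <= key < end:
                return value
        return default

ranges = RangeMap()
ranges.set(0, 2, 123)
print(ranges.get(1.5555) > 122)
```

Lookups use binary search, so they stay fast even with many ranges; inserts are linear because of the list insertion.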

paulddraper · a year ago
In Python it’s a colon.
hilbert42 · a year ago
"There's also the figure dash…"

Re last paragraph: dashes, etc. are confusing for perhaps most of us who aren't, say, typesetters, myself included. I use EM dashes a lot usually without a space between words and sometimes with spaces when I think the typography calls for it—or for extra emphasis.

Essentially, most of us guess the rules and often this doesn't matter much but it can in certain circumstances.

For example, in say machine conversion/transliteration. The ASCII dash is often used as a substitute for Unicode minus sign because it's easy to select [it's my usual practice], and anyway many don't know there is an actual difference. Whilst a human will usually know the difference by its use or context a machine may take the literal interpretation which could lead to say a numerical calculation error.

This problem has annoyed me for a long while. Why is it that wordprocessors and editors do not highlight these characters and query whether the usage is correct? Surely this ought not to be that difficult.

Another example is Roman numerals. The average person will enter say an uppercase 'I' for the Roman numeral one. Here's a typical example which is incorrect:

WWII

Here I entered the normal ASCII 'I' because it was too involved to find the correct Unicode character for Roman numeral one.

I'd like to know what others who are in typography, machine learning etc. think about this, and why WP programs and editors don't have simple ergonomics that allow for easy selection of the correct character.

† On a related matter, you'll note I've used single quotes whereas mmooss uses double quotes. This tells me that mmooss is likely in the US whereas I'm not. Again, this is not really a major problem for humans but it can be in transliteration, etc. Also, it's unclear (at least to me) what the default is for quoting quotes, i.e.: "" versus "' (right, I've refrained from using triple quotes).

Again, this seems country specific with I believe the US favoring double followed by single. Even when these rules are defined do people strictly adhere to them?
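
On the Roman numeral example above: Unicode does have dedicated characters such as U+2161 ROMAN NUMERAL TWO, and compatibility normalization maps them back to plain letters. A small Python sketch, using only the standard `unicodedata` module (the variable names are just for illustration):

```python
import unicodedata

dedicated = "WW\u2161"  # W, W, then U+2161 ROMAN NUMERAL TWO
plain = unicodedata.normalize("NFKC", dedicated)

print(unicodedata.name("\u2161"))
print(len(dedicated), len(plain))
print(plain == "WWII")
```

So software that normalizes with NFKC before comparing text would treat both spellings as the same string, which is one way a machine can tolerate the "incorrect" ASCII form.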

mproud · a year ago
A Figure Dash is perfect for phone numbers (especially when working with tabular numbers).
st_goliath · a year ago
Also, not to be confused with "一", which is a different thing entirely……
mortos · a year ago
This one is U+4E00, CJK Unified Ideograph-4E00. So it's a common character between Chinese, Japanese, and Korean. This should be "one" in all three. And it does technically look a little different than a dash: https://unicodeplus.com/U+4E00

A_D_E_P_T · a year ago
AFAIK most computer keyboards don't have em dashes. Rather than hit ALT+0151 every time, I've always just strung along two hyphens, like: --

Absolutely proper and correct use of em dashes, en dashes, and hyphens is, to me, the most obvious tell of the LLM writer. In fact, I think that you can use it to date internet writing in general. For it seems to me that real em dashes were uncommon pre-2022.

BalinKing · a year ago
This test feels biased by the fact that, like others have said, macOS provides keyboard shortcuts. For example, I'm only Gen Z and yet have tried for many years to use the proper dash characters in the right places, which is made much easier by virtue of being on a Mac.

Of course, I guess it's entirely possible—even accounting for OS—that this test remains statistically useful. It makes me kinda sad that my (very much human-generated) writing fails the Turing test....

MrJohz · a year ago
The compose key, for those who use it, also makes it very easy to do em/en dashes, and I use them quite regularly as a result.
pests · a year ago
Windows does too now via Windows+. which opens the "emoji keyboard" but you can switch to the "symbols" tab to see unicode. It does have multiple dashes in the quick access bar at the top or you can search.
Freak_NL · a year ago
That has nothing to do with being on a Mac. Em-dashes and the compose-key work fine on Linux, and Android has them under the '-' of the on-screen keyboard when long-pressed.

(Windows probably has some way, but those are rarely discoverable.)

harrall · a year ago
On iPhone, type two hyphens to make an em dash:

-- into —

If OP wrote their post on an iPhone, they would have inadvertently appeared as an LLM by their own test.

ogurechny · a year ago
Just install a proper keyboard layout with proper typography support once.

It is maddening that the whole world uses typewriter keyboards with some facelift in the era of Unicode and even blasphemous full color emoji font rendering. What has changed in decades? Windows logo key, power keys, media keys, IE and Outlook logo keys — all Microsoft's fancies.

So initially IBM made some ad hoc decisions on what keys would be suitable for a single user office computer (as opposed to data input and admin terminals they had). Then everyone copied that, because sending unexpected scan codes could lead to bad things (random BIOS and program code couldn't care less about your ideas of forward compatibility). Then Windows became the “basic system” installed on most computers. Microsoft really pushed forward the internationalisation at the time, making a lot of national layouts and code pages (sometimes contradicting the national standards, for better or for worse). Then everyone copied what they decided. What's more important, even single byte code pages had the basic typographic symbols, anyone could've been using them for three decades, but they were not added to most physical keyboard layouts.

I wonder if that was because they wanted Word to seem more sophisticated than it was, and to make people think it was a requirement for “proper documents”, or because programmers still treated all non-ASCII symbols as free data markup constants that would “never appear in a regular text”.

mmooss · a year ago
> So initially IBM made some ad hoc decisions on what keys would be suitable for a single user office computer

Didn't it match ASCII and possibly typewriter keyboards?

n2d4 · a year ago
Alt+hyphen or alt+shift+hyphen is an endash/emdash. You may not have been aware of it because it's so subtle, but many people (including myself) used emdashes long before 2022

(edit: apparently only on Mac, see reply below)

jml7c5 · a year ago
I believe that's only on MacOS.
lxgr · a year ago
That's one of my favorite features of macOS keyboard layouts, but it's so close to one of my least favorite ones – option + space inserting a non-breaking space.

I almost never want that, and when typing "space, en dash, space", it happens quite easily and is usually impossible to tell visually.

IsTom · a year ago
Works here on Linux too, so not just Macs.
tshaddox · a year ago
I've been Googling "em dash" and copypasting from the Google results for a solid 15 years now. Long before LLMs.
hiccuphippo · a year ago
I modify the keymap to use AltGr+dash as em dash. Very easy in Linux with xmodmap, bit more complicated in Windows with the Keyboard Layout Creator.

jbverschoor · a year ago
Just use the Raycast emojipicker, it's very good. Better and faster than the macOS one
maegul · a year ago
Certain corners of the world have absolutely cared about and employed the proper use of all the “dashes” well before but all the way up to 2022. I’d imagine LLMs have just consumed some of that material.
dragonwriter · a year ago
Pretty much everything professionally edited and typeset does, and those will generally be retained in Unicode text (obviously, not if it gets converted to ASCII). It’s less common in internet fora because not all users either know the use of dashes or have easy access to them on the devices they are using, and if it’s not both familiar and easy, people are going to skip it in quick messages.
dml2135 · a year ago
I used to intern for a literary magazine and I can confirm that half my copy-editing was enforcing proper use of em-dashes. This was well before 2022.
mkehrt · a year ago
I always use an em dash when possible when I should, and double en dash when I can't, just because I'm that kind of nerd. But it is the case that a double en dash on iOS autocorrects to an em dash, so I'm suspicious of the claim that em dashes are a tell for LLM writing.
nextts · a year ago
Most editors should automatically change a double dash into an em dash. I think Google Docs does, for example.
mmooss · a year ago
Why not a double hyphen, which has the same result?
grumbel · a year ago
In Linux/Xorg with a compose/multi-key one can do:

<Multi_key> <minus> <minus> <period> : "–" U2013 # EN DASH

<Multi_key> <minus> <minus> <minus> : "—" U2014 # EM DASH

More in /usr/share/X11/locale/en_US.UTF-8/Compose

oneeyedpigeon · a year ago
I've tried to use real hyphens and dashes since learning a bit about typography roughly 10–15 years ago. macOS makes it really easy with just alt and hyphen for en-dash, shift+alt and hyphen for em-dash. Definitely not an "obvious tell" of an LLM!
apt-apt-apt-apt · a year ago
Thanks for the '⇧⌥<dash>' tip— from 2022–2025, I have been using macOS en's thinking they were em's.

(Side note: GPT says apostrophes should be used for pluralizing only for single letters to avoid confusion, but this seems more readable than "ens and ems" IMO.)

account-5 · a year ago
I recently got accused of using AI for some writing I submitted because I regularly use both en-dashes and em-dashes, and have for years. I said in another thread recently they are second and third, to semi-colons, as my favourite punctuation marks.

I was able to demonstrate my long use of them, prior to LLMs. And since I write in quarto markdown I don't need keyboard shortcuts.

wil421 · a year ago
The engineers of various AIs are probably reading your comment and making adjustments.

Or we are both just AIs, as a portion of HN comments are, commenting back and forth about other AIs.

nextts · a year ago
As are super pedantic humans.
akshayshah · a year ago
Em- and en-dashes have been well-supported by LaTeX, the smartypants family of Markdown extensions, and plain HTML for more than 20 years.

In your support, though, calling the extension “smartypants” really hints at the target audience :)

kbenson · a year ago
Automatic conversions have been happening for a long time. In fact, a few years ago there was some combination of settings on my terminal locale settings and man (well, troff/groff most likely) was converting hyphens in param definitions to some sort of dash character, meaning I couldn't copy and paste out of the man page. I think it also affected perldoc for the same reason.

I don't doubt there are publishing platforms that do it automatically as well, so I wouldn't count on seeing them as an indicator of generated output, even if it may be processed in some manner.

tedunangst · a year ago
This is because the original was written using the wrong markup. When the output was ascii, nobody noticed, but it matters when the output is unicode.
blueflow · a year ago
Context: https://lists.debian.org/debian-devel/2023/10/msg00085.html

Money quote:

  This issue does indeed have a history of provoking unhinged lunacy.

necovek · a year ago
While this is true, this is an amazingly silly omission.

Serbian and Croatian XKB keyboard layouts have had em- and en-dashes since early 2000s even if they were not standardized: AltGr (right Alt) + hyphen (to the left of right Shift) produces an em-dash, and press Shift on top, and you get an en-dash.

This is how long I've had them easily accessible on any keyboard (I even have them converted to MacOS keyboard layouts for use with Karabiner).

http://srpski.org/dunav/raspored-c.html

edflsafoiewq · a year ago
It's &mdash; in HTML or Markdown.

If you use eg. a Japanese IME, you can also get it by typing a normal hyphen and selecting the em dash from the picker.

starfezzy · a year ago
The lack of em dash usage in popular culture speaks more about typical people than it does about whether a text's author was an LLM. In fact, the average person has never even noticed—let alone considered—that the em dash exists. If they've read for 20+ years, they've seen at LEAST hundreds of them.

Imagine being an NPC (a human bot), flattering yourself with the thought that people who understand the language are language bots...

ryandrake · a year ago
21% of adults in the US are illiterate in 2024 and 54% of adults have a literacy below a 6th-grade level[1]. “The average person” isn’t really a high bar, unfortunately.

1: https://www.thenationalliteracyinstitute.com/post/literacy-s...

A_D_E_P_T · a year ago
Not at all. It's just inconvenient for most of the Windows-using world, as the characters are not accessible. It's ALT+[whatever] or Google-it-and-ctrl+V. Hence an awful lot of internet writing didn't really use any of that stuff properly.

See, e.g., Boss Szabo's blog: https://unenumerated.blogspot.com/2018/03/the-many-tradition...

Two chained hyphens, as was pretty much the norm back then.

And did you just call me an NPC?!? It's not a matter of "understanding the language" at all. It's a matter of convenience and of a sort of evolved convention.

PartiallyTyped · a year ago
On mac it's very easy to get an em-dash, just alt+shift+`-`. Though I do concur that it's more likely to come from an LLM, I don't think it should be considered a tell — I find it more of a predictor of the writer's age.
layman51 · a year ago
That’s interesting to note. I have usually taken the time to properly use en-dashes when it seems appropriate because I frequently deal with strings that represent academic years. At least where I live, these span two calendar years. I have noticed that a lot of college websites tend to use the en-dash properly (e.g. on their academic calendar webpages).
globular-toast · a year ago
(La)TeX would typeset -- as an en dash. --- gets you an em dash.

I, of course, used proper dashes in typeset documents, at least after I'd learnt about them in Knuth's The TeXbook. I have found myself occasionally using them in ASCII contexts just as ---. But I've never sought out the proper Unicode character.
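
The TeX convention above is easy to mimic for plain text. A minimal sketch, assuming you simply want `---` and `--` turned into the Unicode characters; `tex_dashes` is a hypothetical helper, not part of TeX or any library.

```python
def tex_dashes(text: str) -> str:
    """Replace TeX-style '---' and '--' with em and en dashes."""
    # Order matters: handle three hyphens first, or '---' would become
    # an en dash followed by a stray hyphen.
    text = text.replace("---", "\u2014")  # three hyphens: em dash
    return text.replace("--", "\u2013")   # two hyphens: en dash

print(tex_dashes("pages 10--20"))
print(tex_dashes("wait---what"))
```

A single hyphen is left alone, so ordinary compound words pass through unchanged.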

WhyNotHugo · a year ago
Just configure something like RightAlt to work as a compose key:

Compose--- produces —

Compose--. produces –

Lots of other characters like áăǎ°±€ are available through compose: https://whynothugo.nl/journal/2024/07/12/typing-non-english-...

> Absolutely proper and correct use of em dashes, en dashes, and hyphens is, to me, the most obvious tell of the LLM writer.

Or just someone who likes to use the right characters. There was a report a few months back about how writing from autistic kids keeps getting mislabelled as LLM simply because they use the correct specific terms.

Please stop associating being precise with being an LLM.

scelerat · a year ago
The Mac has had them as part of the standard keyboard layout since 1984. Using Apple kit since then, they have long been burned into my muscle memory:

option-[-] for en dash –

shift-option-[-] for em dash —

simondotau · a year ago
The option key is IMHO the most underrated feature of the Mac platform. Having another modifier for character input is insanely handy, and I know where to find numerous characters like trademark™, divide (÷), pound (£), degrees (°), pi (π) and so on.
thomasfromcdnjs · a year ago
Someone should parse HN api and figure out total dash usage and see if there is a spike in recent times aha

I write poems a fair bit and use em dash a lot. (maybe too much and incorrectly)
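
The counting step of that idea could look roughly like this in Python. It is a sketch under assumptions: each comment is a dict with a Unix `time` and a `text` field, fetching the comments is left out entirely, and the sample records are invented for illustration. `em_dash_rate_by_year` is a hypothetical name.

```python
from collections import Counter
from datetime import datetime, timezone

EM_DASH = "\u2014"

def em_dash_rate_by_year(comments):
    """Return {year: fraction of comments containing an em dash}."""
    total, with_dash = Counter(), Counter()
    for c in comments:
        year = datetime.fromtimestamp(c["time"], tz=timezone.utc).year
        total[year] += 1
        if EM_DASH in c.get("text", ""):
            with_dash[year] += 1
    return {y: with_dash[y] / total[y] for y in sorted(total)}

# Invented sample records, not real comments.
comments = [
    {"time": 1577836800, "text": "plain - dash"},
    {"time": 1577836800, "text": "real \u2014 dash"},
    {"time": 1704067200, "text": "another \u2014 one"},
]
print(em_dash_rate_by_year(comments))
```

Using the share of comments per year rather than a raw count avoids mistaking overall site growth for a spike in dash usage.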

beejiu · a year ago
If em dashes were uncommon pre-2022, they wouldn't have ended up in the LLM training sets.
metaphor · a year ago
I use --- to represent em dash in prose here, e.g. [1][2]. The behavior is just a residual of long time exposure to TeX.

[1] https://news.ycombinator.com/item?id=41833665

[2] https://news.ycombinator.com/item?id=41774199

PhunkyPhil · a year ago
iPhones will autocorrect two hyphens to an em dash.
unleaded · a year ago
What I've been using: Install https://github.com/samhocevar/wincompose and you can then press AltGr then three hyphens to insert one. Or, if you're on Linux, just search for "compose key".
kingo55 · a year ago
More sophisticated clients require we use dashes correctly. I first encountered it pre-pandemic, so in professional contexts it's not a sure-fire signal of LLM use — should you see em dashes correctly used in the Hacker News comments or on Reddit, for that matter, then it's a pretty reliable tell... Usually. ;)
lxgr · a year ago
I'd like to have the record show that I've been using them since before LLMs :)

Not sure when I started; my guess is that I got into the habit of using them in LaTeX when writing my thesis, and then at some point realized that they are easily reachable on standard macOS keyboard layouts (via "option" + "-").

necovek · a year ago
As I mentioned above, I've had them easily accessible with a keyboard layout for >20 years on all the systems I've used — the only caveat is that I find it really ugly with no spaces around em-dashes, which is usually recommended for English.
op00to · a year ago
My LLM prompts all have “don’t use em dashes or semicolons ever” when I send the output to someone else. ;)
lxgr · a year ago
I get not using em/en dashes, but semicolons don't really have an alternative in many cases (other than rephrasing), do they?
mmooss · a year ago
Most word processing applications auto-substitute EM dashes as appropriate - some do it for two consecutive hyphens, iirc. I don't know if they substitute EN dashes automatically ... I don't know if there's a logic for that without understanding the text.
phlakaton · a year ago
I've been using real em- and en-dashes for decades, in more or less the way M-W describes. MacOS and iOS make it easy to do, and growing up Mac kindled a life of typographical nerdage.
AstralSerenity · a year ago
"Windows" + "." brings up symbols, and at the very top were em dashes. I've been using that since it was added.

On my Linux laptop, I confess to manually Googling them every time.

dadoum · a year ago
I use a compose key on Linux to write those. By default you should have these compositions available: --- → — || --. → –
tomrod · a year ago
I wrote for a magazine during college days a few decades ago that uses the Chicago manual of style. I still use em dashes, en dashes, and hyphens regularly. They don't show up as such in markdown, but they are effectively: one dash for hyphen, two for em-dash, and one with spaces surrounding it for en dash.
thesauri · a year ago
On Macs:

Hyphen -: -

En Dash –: alt -

Em Dash —: alt shift -

alabastervlog · a year ago
The default US English Mac keyboard is so extremely good, and has been the way it is for so long, that I remain baffled that other platforms haven't simply copied it. I came to it relatively late in life and it's one of the reasons I wish I'd started using Macs sooner.
shmerl · a year ago
Compose - - - works for M-Dash (KDE / Linux).

For other combos — see /usr/share/X11/locale/en_US.UTF-8/Compose

See also: System Settings > Keyboard > Key Bindings > Position of Compose key

ubermonkey · a year ago
Word and Outlook have replaced "hyphenhyphen" with an Em dash for decades.

Or, I mean, it does SOMETHING. I've never checked, and just always assumed I was getting the em dash.

QuantumGood · a year ago
I have typed Alt+0151 almost every day for decades—and now with some annoyance I am limiting their use due to the "that's how LLMs write."
psunavy03 · a year ago
It's pretty bonkers (and mildly depressing, really) to imply that correct grammar and usage is a reason to accuse someone of using an LLM.

I mean if it's an obvious break from their normal style, sure. But by itself? Every time I hear this argument, it just seems like sour grapes from poor writers.

Quailman84 · a year ago
For a while, em dashes were really popular among LLM enthusiasts because of the idea that it would encourage the LLM to draw from training data that contained em dashes—which typically were higher quality training data written by a professional writer or somebody with a professional editor. Subjectively, I think it worked. I suspect that the LLMs trained to be used as chatbots were finetuned to use the em dash liberally for that reason. Now, after a few generations of these models, I think that the em dash is starting to have the effect of drawing from "slop" training data that was written by other LLMs rather than well-written human data.
heyjamesknight · a year ago
I disagree—LLMs don't use them properly. They always put a space between the words before – and after – the dashed part.
Freak_NL · a year ago
Using spaces is not wrong. Typographically, a hair space or another thinner than usual space is usually used, but in plain text a space is often preferred. Style guides vary of opinion on this, but newspapers often space them. Without a space they end up looking like elongated hyphens joining the words on both sides. That's not their function.
CRConrad · a year ago
LLMs do use dashes "properly"; it could just as well be argued that you don't: The very article at the start of this discussion mentions that, while they don't use spaces, using spaces is a valid alternative.
toss1 · a year ago
As a diligent user of ALT+0151 for many years on Windows systems, I can contradict the claim that it is a sign of LLM writing — perhaps in combination with other factors it can be used to increase the likelihood of LLM authorship, but alone, nope.
ryoshu · a year ago
I'm married to an editor and friends with an editor at work. They both use em dashes appropriately—even with informal writing. I've now learned the keyboard shortcut just to confuse people in the age of AI slop.
y1zhou · a year ago
A few years back a journal editor meticulously reviewed all dashes in our manuscript and pointed out places where em dashes should have been used. Since then I started noticing different dashes everywhere around the internet.
salynchnew · a year ago
Most obvious tell of the former/current Stripe employee, imho.
jsheard · a year ago
What's the significance of Stripe here?
Starlevel004 · a year ago
I refuse to care about this. A single dash is all I will ever use. I see no possible reason to use the other two.
lioeters · a year ago
That's the comment I was looking for to rally behind. I use the same character `-` for all purposes: minus, hyphen, em/en dash. It's easy to type and it makes practically no difference in meaning or legibility. I refuse to waste my time differentiating between multiple variations of a short horizontal line with a few pixels more or less. Ain't nobody got time for that.
gabeidx · a year ago
By the same logic, why bother with capital letters then?
theelous3 · a year ago
Throwing my hat in here. The sub millimeter difference in the length of a dash conveys no additional meaning or clarity. It is impossible to argue me out of this position.

It's not like you can reliably write these consistently by hand either without going over the top in length to make it extremely obvious.

california-og · a year ago
Here's some examples where the en dash could make things more clear:

-5--2°C

post-war-pre-digital era

See sections 10-O-15-Q

Try Our New York-London Flight Connection!

harrall · a year ago
Em dashes don’t convey much meaning or clarity for me.

Rather, seeing too short of a dash is like putting two clashing colors together or wearing two pieces of clothes that don’t match. It just looks instantly off.

It’s just not aesthetically pleasing for me.

miltonlost · a year ago
Length of breath/pause with a longer dash. Read some -- Emily Dickinson poems – you'll find a world ––– of meaning ––– in the millimeter.
MindBeams · a year ago
This sort of anti-intellectualism is the perfect antidote for those who claim that improper grammar is nothing more than evidence of language "evolving."
grey413 · a year ago
En dashes, I'll grant you, are pointless. Those can go away.

However, em dashes are a different case. The main reason why it's desirable to use em dashes (beside convention) is for clarity of purpose. The hyphen is already a very overloaded character; they're extensively used to denote ranges and link compound words. Importantly, both of those usages do not correspond to pauses in spoken language. If you're voicing a hyphen you're supposed to barrel on through it. An em dash is much closer to a parenthesis, comma, or semicolon. It's a meaningful break in the sentence, in the way that a hyphen isn't.

Now, if it were up to me I'd choose a different character to replace em dashes (maybe underscores), but that's a separate argument.

krupan · a year ago
Just use two dashes. Or like you said, use parentheses, commas, or semi-colons
hydrogen7800 · a year ago
I was going to post basically this. There is only one dash, and it's the one for which my keyboard has a key. Minus sign, hyphen, or any other use case. When MS word autocorrects to something else, I always angrily undo it, because I don't know or care what it's doing.

-proud dash luddite

Hnrobert42 · a year ago
I don’t care about the length of the mark, but I did find this idea useful. Prone to excessive detail, I often find myself with a parenthetical inside of parenthetical. The developer in me insists on 2 closing parentheses. But it looks weird and nerdy. Although, using an em dash instead is probably just as nerdy.

> Dashes are used inside parentheses, and vice versa, to indicate parenthetical material within parenthetical material. ...

> The bakery’s reputation for scrumptious goods (ambrosial, even—each item was surely fit for gods) spread far and wide.

LinuxAmbulance · a year ago
Long live the parenthetical!

I wish it was more popular, it neatly indicates meaning so very well.

zamalek · a year ago
This is coming from someone who can only speak English: what a stupid language. How is having 3 symbols that are discernible only by their, almost identical, length a good idea? How would one grade a paper for correct usage, especially if handwritten?

I agree with you completely.

jeroenhd · a year ago
I take this advice like "do not use a preposition to end a sentence with" and "pay close attention to 'much' and 'many'". Personal preferences from the 1800s taken as gospel by grammatical extremists, to the point where they're taken as some kind of solid rule in a vain attempt to forcefully shape language to a personal preference.

There are cases when you want to follow certain guidelines, for sure. If you write for a publication that adheres to Merriam-Webster, you'd better stay consistent and figure out the right AltGr code to type the right dashes. However, for 99.99% of written media today, none of that matters.

MindBeams · a year ago
"Much" and "many" are not interchangeable:

"I have too many water in the cup."

"How much people are in attendance?"

These sound obviously incorrect.

Starlevel004 · a year ago
> Personal preferences from the 1800s taken as gospel by grammatical extremists, to the point where they're taken as some kind of solid rule in a vain attempt to forcefully shape language to a personal preference.

This is also true of "less" and "fewer". I use "less" everywhere.

milesrout · a year ago
Ending sentences with prepositions is and has always been fine. It has never been a serious rule of grammar that you may not end a sentence with a preposition. It does sometimes make a sentence sound better to rewrite it so that it doesn't end with one though. For example, "do not use a preposition to end a sentence with" sounds awkward to my ears, probably because you deliberately crafted the sentence to end with a preposition even though that is not naturally what you'd end that sentence with. (The previous sentence doesn't sound awkward to me, interestingly.)

Getting "much" and "many" right is completely different. They mean different things. Confusing them makes you sound stupid. Less vs fewer is the same. It often doesn't matter but in some cases it really grates on the ears (eg "there wasn't much people there" just sounds awful).

Dashes are not in the same category. They are orthographical conventions. They aren't really grammar. They are more like spelling. You can spell things wrong and say it doesn't matter because spelling is arbitrary and you can use the wrong dashes too, but it makes you look either uncaring or ignorant. If you want to give a good first impression, learn the basic conventions of written English and follow them.

account42 · a year ago
Real monsters use a single dash but with a wider font.
quanloh · a year ago
me too, I do not think it makes a difference in actual writing, like handwriting.
tejohnso · a year ago
Yeah, trying to get people to take Em vs En vs Hyphen seriously is a fool's errand. Only typography nerds would take it seriously and there just aren't enough of them to make a difference. I'd guess that the vast majority of people have never even heard of these distinctions.
knallfrosch · a year ago
And that is why no one will remember your name.
fernandotakai · a year ago
uh, really?

i really like using em dashes -- for some reason, it feels "better" in my head than using something like a comma or a semi-colon.

RandallBrown · a year ago
Then why didn't you use an em dash?
milesrout · a year ago
i refuse to care about this lowercase letters are all i will ever use i see no possible reason to use the other symbols

Suit yourself, but if you refuse to learn basic grammar you will be treated like you are stupid and uneducated. Like it or not, presentation matters. Getting the basics right, including things like spelling, grammar, etc, shows a basic attention to detail without which your services will likely do more harm than good.

handoflixue · a year ago
> etc,

actually it's "etc."

(I wouldn't usually be a pedant, but if you think the difference between "--" and "—" matters, you should probably try to get the basics right too.)

mvdtnz · a year ago
The various dashes are not "basic grammar"; they are for pedants to argue amongst one another while the rest of the world just gets things done.
sandbach · a year ago
Robert Bringhurst¹ prefers the en dash in the context of setting off phrases:

"The em dash is the nineteenth-century standard, still prescribed in many editorial style books, but the em dash is too long for use with the best text faces. Like the oversized space between sentences, it belongs to the padded and corseted aesthetic of Victorian typography.

"Used as a phrase marker – thus – the en dash is set with a normal word space either side."

¹https://archive.org/details/isbn_9780881791327/page/80/mode/...

tkcranny · a year ago
Presently re-reading this book, The Elements of Typographic Style. It’s one of the few books I’ve gone out of my way to get a physical copy of – it’s just beautiful.

And I totally agree, space-set en dashes are vastly superior to em. I dislike the way it connects the word more closely to the word in the next clause than the phrase itself.

E.g. He left—no explanation. Vs. He left – no explanation.

To me, left—no feels more like a weird gluing together than a separator for a different section.

munificent · a year ago
Because I am exactly the kind of person to obsess about this sort of thing, when I was working on my last book, I spent a lot of time deciding how I wanted to style dashed subordinate clauses.

Personally, I think en dashes are too small and look like a mistaken use of a hyphen. I really only use them in their Chicago Manual of Style recommended uses like date ranges.

But I agree that em dashes without spaces around them look wrong. They glue the adjoining words together when the whole point is that the clause is secondary and should be set aside from the surrounding text.

I ended up using em dashes with a little blob of CSS to put a tiny amount of space on either side.

fsckboy · a year ago
"Used as a phrase marker – thus – the en dash is set with a normal word space either side."

"Used as a phrase marker—thus—the em dash is set without normal word spaces."

>the em dash is too long for use

above, the em-dash without spaces is smaller, at least in this typeface

I've taken to using dash offsets—just as an aside—in many places where I formerly used parentheses; I find it "less interrupts" the flow of the sentence.

CRConrad · a year ago
Nope. Without the spaces, the dash doesn't set anything aside, it glues together. And with them, the m is of course longer than the n.
asplake · a year ago
I think of that as “British” style (as opposed to American). I think it’s more common here and I certainly prefer it
ibaikov · a year ago
So much this. Two weeks ago I learned that en dashes are used for numbers, but I thought they are what em dashes are for. Em dashes for me are too long and ugly.
7bit · a year ago
That's how you use them in Germany. N-dash with spaces around, instead of an m-dash, as Americans do.
milesrout · a year ago
Mr Bringhurst is wrong. Em dashes have nothing to do with Victorian aesthetics.
nayuki · a year ago
Additionally:

* Use the minus sign /−/ (U+2212) when formatting numbers, because the default hyphen-minus /-/ (U+002D) just looks wrong: "It is −1 °C vs. -1 °C." Moreover, the correct minus has the same width as plus (− vs. +).

* Rare, but use the figure dash /‒/ (U+2012) or figure space / / (U+2007) if you need a placeholder character that is the same width as a single digit. For example, "Guess the PIN: 1‒34."
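
A minimal Python sketch of both points, in case it helps. The helper names `format_celsius` and `mask_digit` are made up for illustration, and whether the figure dash really matches digit width depends on the font:

```python
MINUS_SIGN = "\u2212"   # dedicated minus sign
FIGURE_DASH = "\u2012"  # figure dash, meant to be as wide as a digit

def format_celsius(value: int) -> str:
    """Format an integer temperature with U+2212 instead of hyphen-minus."""
    sign = MINUS_SIGN if value < 0 else ""
    return f"{sign}{abs(value)} °C"

def mask_digit(pin: str, index: int) -> str:
    """Replace the digit at `index` with a figure dash placeholder."""
    return pin[:index] + FIGURE_DASH + pin[index + 1:]

print(format_celsius(-1))
print("Guess the PIN:", mask_digit("1234", 1))
```

Building the sign separately avoids relying on `str(value)`, which always emits the ASCII hyphen-minus for negative numbers.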

bangaladore · a year ago
Somewhat off topic, however, I'm thoroughly convinced that there is a very high probability something is AI generated when I see Em dashes. Anyone else noticing this?

ChatGPT for example almost always uses them. I'm sure they are more common in academic writing, but it's now super common on boards like Reddit.

alabastervlog · a year ago
I've been employing em-dashes extensively since I went on a JD Salinger binge circa 2002. Also, "incidentally", for the same reason. I use "Nb" a lot, from reading a bunch of DFW years ago. Oh, and that very-precise construction he does with "which" all the time, I stole that.

Before LLMs, I think em-dashes mostly signaled that you read books and paid attention to details, to the extent they signaled anything.

arduanika · a year ago
To generalize your point: A lot of the "brown m&ms" that we've walked around with for detecting a writer's status, education, etc., are less useful in an age of LLMs.[1]

We might even be entering some waves of counter-signaling.

[1] They'll never totally nail all of DFW's mannerisms, though.

abyssin · a year ago
What is this very precise construction?
arduanika · a year ago
So you're saying that when you see an Em dash in someone's prose, it's a big minus?
bangaladore · a year ago
As I said in another comment, it depends highly on the context and previous / alternative knowledge of the source.
pavlov · a year ago
It’s largely the Baader-Meinhof phenomenon. You’ve started noticing it because you just learned about it.
bangaladore · a year ago
I feel this is a broad oversimplification.

When looking at the context of a given text, use of certain words or punctuation, can very well indicate AI use.

The "original" example was delve. There is no doubt that AI (did, or still does) use this word at a significantly higher frequency than the average person. I would say the same about em dashes.

When browsing a Reddit thread about a video game, if you encounter numerous comments written perfectly, especially those containing indicators like em dashes, the word delve, or similar language, it certainly can raise the question: am I genuinely seeing comments from users who write this way in this specific context, or is this content more likely produced by an LLM?

citrus1330 · a year ago
No, it's not. AI uses em dashes far more frequently than the average human.
jeroenhd · a year ago
It depends. Em dashes in news articles and written publications? Definitely expected. Em dashes on social media or reddit? Either someone who works in typesetting, or an LLM. Most likely an LLM, given the dying nature of printed media.

Only typography nerds and professional printers care about things like these. Popular media, even modern professional media, hasn't been paying all that much attention.

arduanika · a year ago
Plausible. But apparently per TFA it's actually spelled Baader–Meinhof, with an en-dash not a hyphen.
dkdcwashere · a year ago
yep. been using them for years. others have too. it’s not weird

same thing happened with “delve” — these are just words and grammar, people use them

there is no accurate way to tell whether text came out of a neural network or not

dskhatri · a year ago
There are regular folk who tend to be pedantic with their writing. I'm not sure this is a good test of whether text is generated by LLM. Consider that some may use LLMs to correct spelling or grammar, and the LLMs may often edit an en dash to em dash.
bangaladore · a year ago
To be clear, it's essentially impossible to know if a given text is autonomously LLM generated (a bot on social media for example) or is the result of revision of real human effort.

To what extent that distinction matters, I'm not sure.

nilkn · a year ago
I've encountered and used em dashes regularly for the last 20 years. If most of your reading and writing are associated with social media, I could see the trend you're describing appearing real within that limited context. But em dashes are not new and have been a feature of high quality writing for many decades.
encypherai · a year ago
Yes, several of the most popular (and even lesser-popular but newly open-sourced models such as Gemma 3 27b) overuse Em dashes. Even when prompting them to not use dashes, they almost can't help themselves and include them occasionally anyways as it must be part of their learned stylometry. It's just not a common symbol to use at all as most people generally use commas for the same purpose. I can't even remember learning about Em dashes in my college English classes.
nextos · a year ago
I submitted an application which I typeset using LaTeX, and some people thought it was AI-generated because of en and em dashes. I have been using these since forever.
kbenson · a year ago
If it's posted through a publishing platform (not just a comment on one or on a public site), it's very possible they do an automatic conversion of some of the common cases. That could also be filtering down to comment boxes and stuff, I'm not sure.

That's not to say that generated content doesn't use them, just that using them as an indicator might require a bit of nuance based on where you're seeing them.

vanschelven · a year ago
There is a special kind of irony in the fact that habits that used to set one apart from the unwashed masses (like the proper use of punctuation) now serve as a signal for being non-human.
mychaelangelo · a year ago
I’ve noticed this, too. ChatGPT especially overuses them relative to other models. It’s an easy tell-sign that something is probably LLM-written.
zimpenfish · a year ago
I saw a reel the other day where some Young People(tm) were talking about "the ChatGPT hyphen" (an em-dash.) There was much wailing and gnashing of (false) teeth from Old People(tm) in the comments.
culi · a year ago
Everyone I know that writes a lot, especially for copy or product design, seems to use em dashes more heavily. I've even seen a Drake format meme where he is shaking his head at parentheses, commas, and colons but—finally—nodding in approval at the em dash.

I wonder if it's a more recent phenomenon.

jyunwai · a year ago
Em and en dash usage is officially part of style guides such as The Chicago Manual of Style [1], so it's often a work requirement for many writers and editors to use them in writing. This is why these kinds of dashes are everywhere in newspaper and magazine articles.

Eventually, people learn to include them out of habit—especially as most people see them as aesthetically nicer than a simple hyphen (-).

[1] https://www.chicagomanualofstyle.org/qanda/data/faq/topics/H...

im3w1l · a year ago
I saw this comment a day ago but it only clicked today. The way we tell it's AI is the use of too formal grammar. I think that means they now pass the Turing test. Or at most a hair's breadth from passing.
gukov · a year ago
Yep, definitely been noticing it, especially on Reddit. It almost always makes me navigate away from the post, unless the author mentions that they’re using AI.
keybored · a year ago
I’m bored with y’alls keyboard habits.

Not all though. Many people on HN use em-dashes and other proper punctuation.

arduanika · a year ago
Hold on, I'm coming back to this thread, I think I've cracked it guys. Some real alpha for you right here:

If the em dash has spaces around it -- as seen in AP style -- it was probably written by a real human, because that's how it comes out most conveniently on a word processor.

But if the em dash has no spaces around it--Chicago style--there's a good chance you're looking at LLM slop.

Anon1096 · a year ago
The only people still using em-dashes are those who think it's somehow a signal of high intellect rather than being (extremely) behind the times. Case in point: this exact comment section where you see it with ~10000x the frequency of standard human writing, or even the average HN thread.

Just makes me roll my eyes really seeing a human use an em-dash. We're in the age of informality, and at least for me personally I've definitely filed the em-dash away as "a near guarantee the text was written by a machine". No matter how much and perhaps especially because HN commentators are coming out of the woodwork to insist they've been using it daily for years.

MindBeams · a year ago
This level of thinly veiled insecurity is just projection on your part.
medstrom · a year ago
Maybe you're projecting? Not everyone has an agenda beyond just thinking it looks good.
awestley · a year ago
Yes! It's a tell-tale sign something is written by AI.

dkdcwashere · a year ago
it is not

rsch · a year ago
Today in “typesetting before we had typewriters”: …

At least we have dedicated O/0, and l/1 keys now. But we still see a lot of "straight" quotes instead of “those smart quotes Microsoft Word likes to generate”. And dashes. Did you know there is a dedicated ellipsis character? This is often set with slightly more space between dots than ..., and it by definition never wraps across a line between those dots. You still see (C) instead of ©.

It is one of those things that doesn’t really matter for readability, but although they can’t necessarily put a finger on why, people may still notice that some documents or pages appear to be set with more care for details than others.

(edit: I guess if you don’t have to search on Google what the hell a ‘Microsoft Word’ is, then you’re officially old)

thangalin · a year ago
> dedicated O/0, and l/1 keys now

And the 1 and 8 aren't next to each other anymore, either. (See typewriters from the "18"00s.)

> those smart quotes

Fixing straight quotes is a hard problem[0]. My FOSS text editor, KeenWrite[1], includes my library, KeenQuotes[2], for replacing them at build time. It's not perfect, but can typeset my ~400 page novel without any errors.

> Did you know there is a dedicated ellipsis character?

Yes! Here's where it gets parsed:

https://gitlab.com/DaveJarvis/KeenQuotes/-/blob/main/src/mai...

Then emitted:

https://gitlab.com/DaveJarvis/KeenQuotes/-/blob/main/src/mai...

Then transformed into an HTML entity:

https://gitlab.com/DaveJarvis/KeenQuotes/-/blob/main/src/mai...

When typesetting Markdown, KeenWrite first converts the document to XHTML (i.e., XML), then invokes ConTeXt to convert XML into TeX macros. One of those macros handles the ellipses by converting it to \dots{}:

https://gitlab.com/DaveJarvis/keenwrite-themes/-/blob/main/x...

This renders as the Unicode character in the final document: …

> set with more care for details

Some of us old folks care about these details. ;-)

[0]: https://stackoverflow.com/a/73466438/59087

[1]: https://keenwrite.com/

[2]: https://whitemagicsoftware.com/keenquotes

keybored · a year ago
People have approximated ellipsis by using `. . .`.

I use ellipsis. Which ironically is way too short when viewed in monotype…

kps · a year ago
I use ellipses & dashes… perhaps the former will convince people I am human.
knallfrosch · a year ago
I hate smart quotes because it's super weird to use the «French» and „German“ quotation marks.
CRConrad · a year ago
What's so weird about it? It's the appropriate way to do it when writing in those languages.

And really easy to do on an Android phone, I've found: Switch the input to French or German, and the on-screen keyboard offers the appropriate quote marks for that language in the same place as usual.

vanschelven · a year ago
for em dashes and ellipsis at least it's trivial to convert before displaying them... which I do in my own markdown-to-publication toolchain (but not here on HN).
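
Roughly like this, as a sketch rather than the actual toolchain code. It assumes the TeX/SmartyPants-style convention (`---` becomes an em dash, `--` an en dash, `...` an ellipsis), and `typographize` is a made-up name:

```python
import re

EM_DASH = "\u2014"
EN_DASH = "\u2013"
ELLIPSIS = "\u2026"

def typographize(text: str) -> str:
    """Swap ASCII stand-ins for their typographic characters."""
    # The lookarounds skip longer runs, so a "-----" rule line is left alone.
    text = re.sub(r"(?<!-)---(?!-)", EM_DASH, text)
    text = re.sub(r"(?<!-)--(?!-)", EN_DASH, text)
    text = re.sub(r"(?<!\.)\.\.\.(?!\.)", ELLIPSIS, text)
    return text

print(typographize("pages 10--20, wait---what..."))
```

It deliberately ignores code spans and quotes; straight-quote conversion is the genuinely hard part and is not attempted here.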
a3w · a year ago
> spans pages 128–34.

Who omits the 1 from the second number?! That is awful!

crazygringo · a year ago
Who keeps the 1?

You write pages 1,003–4, instead of typing out 1,003–1,004 which is just unnecessary.

Works the same with two digits, or even three: pp. 1,899–902.

This is standard practice and arguably clearer.

I've only ever seen it done with page ranges, though. I'm not sure if it's done with year ranges? E.g. 1984–5? Or 1989–92? You work with page ranges constantly in academia, I just don't see year ranges much in any form.

lucgommans · a year ago
Literally never seen this (wish I could grep all comments I've ever replied to) and I do not understand what makes you say that it's clearer when it's dropping information, making it relative rather than a fully qualified number

In speech, it's common, and misunderstandings are usually not a problem (if you're not monologuing on a recording) because someone will just ask; but in writing it looks like the range is the wrong way around. Maybe I expect more care in writing because the feedback loop is longer, or maybe it's just habit and I think it's wrong in writing because I never see it?

MindBeams · a year ago
It's definitely standard, but in what way is it clearer? An abbreviation is never more clear than the full thing it abbreviates.

EDIT: I saw your explanation below, and you make a very good point.

a3w · a year ago
copy/paste, "print", paste into the "from page" and "to page" fields

Result:

> print pages in range from: 1, 003

> print pages in range to 4

Now I have two errors to fix: page 1003 to page 1004. Not nice. Who formats like this?!

-------------------

Also, some RPG books or encyclopedias I own have chapters that span like this:

p. 630 to p. 70 (book 2)

To me, it is now unclear: is that 70 with a reset page count, or 670 for book 2?

Since I just now learned that a quotation standard somewhere outside Germany exists that omits leading numbers, I now need to manually check where it ends.

TL;DR:

Don't make me think, and allow for automation. So just write one more number.
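
For what it's worth, undoing the abbreviation can be automated. A minimal Python sketch, assuming the elided leading digits are always copied from the first number, the separator is an en dash, and commas are only thousands separators (`expand_page_range` is a made-up name):

```python
EN_DASH = "\u2013"

def expand_page_range(text: str) -> tuple[int, int]:
    """Expand an abbreviated range such as '128<en dash>34' to (128, 134)."""
    start_text, end_text = text.replace(",", "").split(EN_DASH)
    start_text, end_text = start_text.strip(), end_text.strip()
    # Borrow the missing leading digits from the start of the range.
    if len(end_text) < len(start_text):
        end_text = start_text[: len(start_text) - len(end_text)] + end_text
    return int(start_text), int(end_text)

for example in ("128\u201334", "1,003\u20134", "1,899\u2013902"):
    print(example, "->", expand_page_range(example))
```

This cannot resolve the "p. 630 to p. 70 (book 2)" case, because it always assumes the page count did not reset.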

aio2 · a year ago
closest thing we have on hn to being a reddit like comment/remark lol
rossant · a year ago
When I was editing an academic book published by a well-known university press, we were all asked to do that for the references. (And my colleagues, all doctors and lawyers, only knew Word and entered the references manually.)
mkehrt · a year ago
What if it's 124 to 127? would you really type 124–127, or 124–7?
eCa · a year ago
> would you really type 124–127?

Yes, every time. The clarity for the reader is more important than the time I save by leaving out '12'.

wavemode · a year ago
> would you really type 124–127

literally yes

rossant · a year ago
The latter, I believe.