I Prefer RST to Markdown (2024)

Arainach · 8 days ago

Previously discussed: https://news.ycombinator.com/item?id=41120254

Copying my thoughts from there which haven't changed:

>To which I say, are you really going to avoid using a good tool just because it makes you puke? Because looking at it makes your stomach churn? Because it offends every fiber of your being?"

Yes. A thousand times yes. Because the biggest advantage of Markdown is that it's easy to read, and its second-biggest advantage is that it's easy to write. How easy it is to parse doesn't matter. How easy it is to extend is largely irrelevant.

Markdown may or may not be the best tool for writing a book, but Markdown is the best tool for what it does - quickly writing formatted text in a way that is easy to read even for those who are not well versed in its syntax.

I don't want to write a book. If I did I'd use LaTeX before RST. I want something to take notes, make quick documentation and thread comments.

freehorse · 7 days ago

I would argue that being harder to extend is actually an advantage of markdown, because it helps it stay simple and keep a relatively agreed-upon standard form, instead of getting lost in the complexities of different ways to extend it and the different standards this would bring. Being hard to extend means that it is easier to settle on a local optimum rather than keep exploring the syntax space.

Moreover, simple, human-readable parsing rules help a lot with reducing the cognitive load of the form and focusing on the content. Extending a syntax necessarily brings abstractions and more complex parsing rules which would conflict with that goal. In some contexts minimalism and simplicity are features in themselves.

For me, I often want to spend my time writing down the stuff I need to write and not play with extensions/logic/configs. I like that it forces me to actually not be able to do something more complex, because I am pretty sure that if I was incentivised to extend it instead, I would end up spending my time on that instead of writing.

Markdown is not good for stuff where complex logical structure in the content is important to be represented in the form. In the article it is beyond clear to me why the author did not use markdown for their book, I would be more interested in why they chose RST instead of latex or another language that is more towards the complex end than the minimalistic end. I guess what the author needed was some point in-between, and they found it in RST.

blenderob · 7 days ago

>>To which I say, are you really going to avoid using a good tool just because it makes you puke? Because looking at it makes your stomach churn? Because it offends every fiber of your being?"

> Yes. A thousand times yes.

Your comment comes off as if it makes an opposing point to the article. My apologies if it wasn't meant that way.

But I want to note that the author agrees with you! The next sentence from the author which you didn't include in your quote says:

> Okay yeah that's actually a pretty good reason not to use it. I can't get into lisps for the same reason. I'm not going to begrudge anybody who avoids a tool because it's ugly.

chrismorgan · 8 days ago

> How easy it is to parse doesn't matter.

How easy it is to parse does matter, because there’s a definite correlation between how easy it is to parse for the computer and for you. When there are bad corner cases, you either have to learn the rules, or keep on producing erroneous and often-content-destructive formatting.

> How easy it is to extend is largely irrelevant.

If you’re content with stock CommonMark, it is irrelevant to you.

If you want to go beyond that, you’re in for a world of pain and mangled content, content that you often won’t notice is mangled until much later, because there’s generally no meaningful way of sanity-checking stuff.

As soon as you interact with more than one Markdown engine—which is extremely likely to happen, your text editor is probably not using the parser your build tool uses, configured as it is configured—it matters a lot. If you have ever tried migrating from one engine to another on anything beyond the basics, you will have encountered problems because of this.

Arainach · 8 days ago

It's miserable to parse C++ and that's fine, because only a few people have to write a parser while 5 orders of magnitude more have to read and write it. Same thing with markdown - the user experience is what matters.

Edge cases largely don't matter, because again I'm not trying to make a book. I don't care if my table is off by a few pixels. 50% of the time I'm reading markdown it's not even formatted, it's just in raw format in an editor.

thiht · 7 days ago

> there’s a definite correlation between how easy it is to parse for the computer and for you

I’m not sure that’s true tbh. Exhibit A: natural language. Exhibit B: Polish notation.

noosphr · 8 days ago

TeX is a typesetting language, not a writing language. LaTeX inherits this. Unless you know ahead of time the exact dimensions you will be displaying your book at you shouldn't use either. ReST, on the other hand, can be resized to your heart's content, which is what you need for digital publishing.

blenderob · 7 days ago

> Unless you know ahead of time the exact dimensions you will be displaying your book at you shouldn't use either.

This is incorrect. You can sure write LaTeX that is intricately dependent on the output dimensions. But you can just as easily write LaTeX that is independent of output dimensions.

Case in point is compiling a LaTeX doc to HTML, which you'd admit is easily resizable.

Case in point is also writing LaTeX docs for journals or publication where you can easily resize the document to match the publisher's style guide and dimensions by changing the documentclass.

kazinator · 7 days ago

Anything can be resized without touching the document, if you don't care whether it looks like shit.

The TeX philosophy rejects that. When TeX can't format a paragraph beautifully, it emits diagnostics like "overfull \hbox".

That's totally incompatible with being able to dictate a width, and expecting things to fit without having to get involved.

yawaramin · 7 days ago

Technically, plain text can be resized to your heart's content. You don't need ReST for that. But in practice if you're writing a serious technical book and you need a serious markup language, you will likely end up with DocBook XML for its flexibility and large range of outputs.

JohnKemeny · 8 days ago

LaTeX cannot be resized?

bravesoul2 · 8 days ago

Yes. I think he prefers a car to walking. But there are few trips where you would think "should I drive, or walk?".

He should compare it to HTML or XML or Haml

kazinator · 7 days ago

I can write something better than RST for my use in an afternoon of Lisp coding.

My personal resume is a Lisp thing (now well over 20 years old). There is a kind of markup language, and CLOS-driven back ends for producing different output formats.

humanfromearth9 · 6 days ago

Nowadays you'd use Typst if producing a PDF is OK.

jeroenhd · 7 days ago

I can see the advantages RST offers in terms of HTML generation, but whenever I've needed to work with custom blocks like that, I've always just written HTML.

I'm not sure if <img src="file.jpg" alt="alt text"/> is less readable than

    .. image:: file.jpg
       :alt: Alt text

HTML5 allows for leaving certain tags unclosed (such as <li>, or <head> or even <p>) to such an extent that I find many template languages to not be worth the effort of their complex syntax.

Sure, there are three or four lines here that you can omit using RST or markdown:

    <!doctype html>
    <html lang="en">
    <head>
    <title>My blog page</title>
    <body>
    <h1>Welcome to my blog</h1>
    <p>This is a bunch of text.
    Feel free to stuff newlines here.
    <p>This is also a bunch of text
    <p>Here's a list just for fun:
    <ol>
      <li>This is the first item!
      <li>This is the second one!
      <li>Boom, a third!
    </ol>
    <p>Have an image: <img src="filename.jpg" alt="alt text goes here">

But is having to wrap a list in <ol> and closing the <title> really that bad?

Automatically generating an index and such is nice, but five lines of Javascript can do the same. Plus, you don't need to run a second tool to "process" your input.
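To make the "a few lines can build an index" point concrete, here is a minimal sketch of the same idea in Python rather than JavaScript, using only the standard library's `html.parser`. The names `HeadingCollector` and `build_index` are made up for illustration, and the sketch assumes headings are not nested inside each other.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None   # level of the heading currently open, if any
        self._parts = []     # text fragments seen inside that heading

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so "h1".."h6" is enough.
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._parts = []

    def handle_data(self, data):
        if self._level is not None:
            self._parts.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._parts).strip()))
            self._level = None


def build_index(html):
    """Return one indented line per heading, in document order."""
    collector = HeadingCollector()
    collector.feed(html)
    collector.close()
    return ["  " * (level - 1) + text for level, text in collector.headings]
```

Because the parser only reacts to heading tags, the unclosed `<p>` and `<li>` elements in the page above do not get in its way.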

I generally use Markdown as a standardised way to format text that will probably be read in plaintext by other people, but when it comes to formatting documents, I don't see the point of most complex template languages.

tpoacher · 7 days ago

Same. I have a couple of nice html templates (with locally-defined css and mathjax styling), and I now take all my notes directly in html in nano.

Once you've written a couple of documents, the usual tags become muscle memory and are no more of a bother to write than markdown. I've even created a couple of nano macros to automate some of the process.

"But it's not readable like markdown" you might say. Well. This might be true of 'some' html, especially autogenerated stuff, but the stuff I write is totally readable. Once you settle on some meaningful indentation and tag presentation conventions, readability is not a problem. We're talking about plain html documents, after all, not complex websites. The subset of html tags you'll need is generally very small and largely unintrusive.

I could even go a step further and say, my HTML is as readable as this guy's rST, but this guy's generated HTML code is far worse than how my direct HTML would have looked.

aitchnyu · 6 days ago

These formats must have made sense with Notepad (no autocomplete balanced tags, indentation, syntax highlight) and custom parsers in C.

Eric Raymond, in his 2003 book, advocates terse text formats in chapters 5 and 18: https://www.catb.org/esr/writings/taoup/html/graphics/taoup....

tambourine_man · 8 days ago

> Markdown is ubiquitous because it's lightweight and portable…

Markdown is ubiquitous because it’s easy for humans to read and write.

tpoacher · 7 days ago

Markdown is ubiquitous because it is easy for humans to read and write AND enough humans used it to make it so.

The second part is more important than the first. There could be far better systems which not enough humans used to make ubiquitous. And as far as we know, markdown could be one of the worse ones, but became ubiquitous because it became ubiquitous.

cf: MS Windows.

tambourine_man · 7 days ago

I don't think Windows is an apt comparison. One had huge market forces and distribution channels propelling it, the other is a description page, not even a standard, on Gruber's site.

What Gruber got right is that the syntax is beautiful to read, easy to write and powerful enough to be useful, with the optional inline HTML as an escape hatch. It may not seem much, but that's hard to get right.

jajuuka · 7 days ago

Agreed. Personally I really like asciidoc but hardly anything supports it. Markdown is just everywhere: in all the tools I use and all the most popular tools available. So it's far easier to use when it's so portable, and I only need to remember one set of operators to get the results I want. Even in systems where I don't know which syntax they support, there is a good chance Markdown will be one of them.

deafpolygon · 7 days ago

you're both wrong.

markdown is ubiquitous thanks to github.

lifthrasiir · 8 days ago

Granted, reST is more feature-complete and extension-friendly, but it is simply unusable for me because it wasn't designed for agglutinative languages like Korean. Markdown is much better in this case (though CommonMark has an annoying edge case [1]).

[1] https://talk.commonmark.org/t/foo-works-but-foo-fails/2528

thaumasiotes · 8 days ago

> reST is more feature-complete and extension-friendly, but it is simply unusable for me because it wasn't designed for agglutinative languages like Korean.

How does whether you think of the language as agglutinative affect the usability of reST?

The biggest problem that occurs to me is that there isn't really a conceptual difference between an "agglutinative" language in which you have very long words expressing complex meanings, and an "isolating" language in which the same syllables occur in the same order with the same meaning but are thought of on a Platonic level as being all independent words.

This is because an "agglutinative" language is one in which syntax markers are more or less independent of any other syntax markers that may apply to the same word†, which means it's always possible by definition to consider those markers to be "words" themselves.

Would your problems be solved if you viewed what you had considered "long" Korean words as instead being several short words in a row? What difficulties does agglutination present?

† Compare: https://glossary.sil.org/term/agglutinative-language

> An agglutinative language is a language in which words are made up of a linear sequence of distinct morphemes and each component of meaning is represented by its own morpheme.

https://glossary.sil.org/term/isolating-language

> An isolating language is a language in which almost every word consists of a single morpheme.

lifthrasiir · 7 days ago

> This is because an "agglutinative" language is one in which syntax markers are more or less independent of any other syntax markers that may apply to the same word†, which means it's always possible by definition to consider those markers to be "words" themselves.

I think SIL's definition is, while robust, not the usual definition because English can be regarded as agglutinative in this definition. This is particularly visible from the statement that most European languages are somewhat fusional [1], which is okay under their definitions but not the usual way we think of English.

In my understanding, the analyticity is a spectrum and highly analytic languages with most (but not necessarily all) words containing just one morpheme are said to be isolating. Words in agglutinative languages can be, but not necessarily have to be, analyzed as a main morpheme ("word") with dependent morphemes attached ("affixes"). Polysynthetic languages go further by allowing multiple main morphemes in one word. As languages tend to become synthetic (as opposed to analytic), the space-separated "word" is less useful [2] and segmentation gets harder and harder. reST's failure to support those languages is all about a bad assumption about segmentation.

[1] https://glossary.sil.org/term/fusional-language

[2] So much that several agglutinative languages---in which space-separated words can still be useful---don't even think about spacing, e.g. Japanese.

chrismorgan · 7 days ago

The key here is whether there’s a word separator, not agglutinativity or isolation. The term I find for this on a brief search is scriptio continua <https://en.wikipedia.org/wiki/Scriptio_continua>.

mattclarkdotnet · 7 days ago

These are descriptive terms though? It’s not like the language actually works that way

do_not_redeem · 8 days ago

What do you mean not designed for Korean? It's just unicode. If there's some situation where RST isn't parsing inline markup, you can write the role explicitly like this:

  this is **bold** text
  this is :strong:`bold` text

rune-space · 8 days ago

But you can’t say:

   thisis:strong:`bold`text

Whereas the equivalent is perfectly fine in markdown.

Falsehoods programmers believe about written language: whitespace is used to separate atomic sequences of runes.

lifthrasiir · 8 days ago

reST inline syntaxes are pretty much word-based, which doesn't work very well with agglutinative languages. For example if you want to apply a markup to "이 페이지" in "이 페이지는 ..." (lit. This page in This page is ...), you need to do `*이 페이지*\ 는 ...` AFAIK. That would happen every single time affixes are used, and affixes are extremely frequent in such languages.
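A rough sketch of what that workaround means for tooling: a small Python helper (the name `rst_emphasize` is hypothetical) that wraps a character range in reST inline markup and adds the backslash-space separator wherever the markup would otherwise touch a non-space character. It only checks for whitespace and string boundaries, so it is more conservative than the real reST recognition rules, which also accept some punctuation.

```python
def rst_emphasize(text, start, end, marker="*"):
    """Wrap text[start:end] in reST inline markup.

    A backslash-space separator is added on either side whenever the
    markup would otherwise be glued to a non-whitespace character.
    """
    before, span, after = text[:start], text[start:end], text[end:]
    # Simplification: anything other than whitespace or the edge of the
    # string gets a separator, even where reST would not strictly need one.
    lead = "" if not before or before[-1].isspace() else "\\ "
    trail = "" if not after or after[0].isspace() else "\\ "
    return f"{before}{lead}{marker}{span}{marker}{trail}{after}"


print(rst_emphasize("이 페이지는 ...", 0, 5))
print(rst_emphasize("thisisboldtext", 6, 10, "**"))
```

The first call reproduces the form shown above; the second shows that mid-word markup needs a separator on both sides.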

chrismorgan · 8 days ago

reStructuredText and Markdown both have a bad habit of clevernesses that fall down—just in different areas.

Both do at least some degree of only matching delimiters at word boundaries. I consider that to be a huge mistake.

reStructuredText falls for it, but has a universally-applicable workaround (backslash-space as a separator—note that it is not an escaped space, as you might reasonably expect: it’s special-cased to expand to nothing but a syntax separator).

Markdown falls for it inconsistently, which as a user of languages that divide words with spaces, is honestly worse. Its rules are more nuanced, which is generally a bad thing, because it makes it harder to build the appropriate mental model. It was also wildly underspecified, though that’s mostly settled now. For many years, Stack Overflow used at least two, I think three but I can’t remember where the third would have been, mutually-incompatible engines, and underscores and mid-word formatting were a total mess. Python in particular suffered—for many years, in comments it was impossible to get plain-text (i.e. not `-wrapped code) __init__.

In CommonMark, _abc_ and *abc* both get you an emphasized “abc”, but a*b*c gets you “abc” with the b emphasized, while a_b_c gets you the literal text a_b_c. That’s an admission of failure in syntax. Hmm… I hadn’t thought of this, but I suppose that makes _ basically untenable in languages with no word separator. Interesting argument against Prettier, which has a badly broken Markdown mode¹, and which insists on _ for emphasis, not *.
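For anyone who wants to poke at this, below is a minimal Python sketch of the CommonMark "flanking" checks as I understand them from the spec, restricted to a single-character delimiter at a known index. It is not a parser: it ignores longer delimiter runs, backslash escapes and code spans, and the function names are mine.

```python
import unicodedata


def _is_space(ch):
    # The start and end of the text count as whitespace for flanking.
    return ch == "" or ch.isspace()


def _is_punct(ch):
    # Assumption: "punctuation" means Unicode general categories P* and S*.
    return ch != "" and unicodedata.category(ch)[0] in "PS"


def _flanking(text, i):
    prev = text[i - 1] if i > 0 else ""
    nxt = text[i + 1] if i + 1 < len(text) else ""
    left = not _is_space(nxt) and (
        not _is_punct(nxt) or _is_space(prev) or _is_punct(prev))
    right = not _is_space(prev) and (
        not _is_punct(prev) or _is_space(nxt) or _is_punct(nxt))
    return left, right, prev, nxt


def can_open(text, i):
    """Can the '*' or '_' at text[i] open emphasis?"""
    left, right, prev, _ = _flanking(text, i)
    if text[i] == "*":
        return left
    # '_' additionally refuses to open in the middle of a word.
    return left and (not right or _is_punct(prev))


def can_close(text, i):
    """Can the '*' or '_' at text[i] close emphasis?"""
    left, right, _, nxt = _flanking(text, i)
    if text[i] == "*":
        return right
    # '_' additionally refuses to close in the middle of a word.
    return right and (not left or _is_punct(nxt))
```

With these rules, `can_open("a*b*c", 1)` is true but `can_open("a_b_c", 1)` is false, and the same goes for an underscore sitting between two Hangul syllables.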

In my own lightweight markup language I’ve been steadily making and using for my own stuff for the last five years or so, there’s nothing about word boundaries. a*b*c is “abc” with the b emphasized, and if a dialect² defined _ as emphasis, a_b_c would be the same.

Another example of the cleverness problem in reStructuredText is how hard wrapping is handled. https://docutils.sourceforge.io/docs/ref/rst/restructuredtex... is a good example of how badly wrong this can go. (Markdown has related issues, but a little more constrained. A mid-paragraph line starting with “1. ” or “- ”—both plausible, and the latter certain to occur eventually if you use - as a dash—will start a list.) The solution here is to reject column-based hard-wrapping as a terrible idea. Yes, this is a case where the markup language should tell people “you’re doing it wrong”, because otherwise the markup language will either mangle your content, or become bad; or more likely both.
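The Markdown half of that hazard is easy to check for mechanically. Here is a rough Python sketch (the function name is hypothetical) that flags hard-wrapped lines which directly follow a non-blank line and begin with a bullet or with “1.”, since under CommonMark-style rules those are the list markers that can interrupt a paragraph. It deliberately treats every non-blank previous line as paragraph text, so it will also flag the second item of a genuine list.

```python
import re

# A bullet ("-", "+", "*") or the ordered marker "1." / "1)", indented by at
# most three spaces and followed by at least one space and some content.
_LIST_START = re.compile(r"^ {0,3}(?:[-+*]|1[.)]) +\S")


def risky_wrapped_lines(markdown):
    """Return 1-based line numbers that would start a list mid-paragraph."""
    risky = []
    prev_blank = True  # treat the start of the document like a blank line
    for number, line in enumerate(markdown.splitlines(), start=1):
        if not prev_blank and _LIST_START.match(line):
            risky.append(number)
        prev_blank = not line.strip()
    return risky
```

Run over hard-wrapped prose, this gives a list of lines to re-wrap (or a reason to stop hard-wrapping).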

Meanwhile in Markdown, it tries to be clever around specific HTML tags and just becomes hard to predict.

—⁂—

¹ Prettier’s Markdown formatting is known to mangle content, particularly around underscores and asterisks, and they haven’t done anything about it. The first time I accidentally used it it deleted the rest of a file after some messy bad emphasis stuff from a WYSIWYG HTML → Markdown conversion. That was when I discovered .prettierignore is almost completely broken, too. I came away no longer just unimpressed with some of Prettier’s opinions, but severely unimpressed with the rest of it technically. Why they haven’t disabled it until such things are fixed, I don’t know.

² There’s very little fundamental syntax in it: line break, indent and parsing CSS Counter Styles is about it. The rest is all defined in dialects, for easy extension.

bluGill · 8 days ago

My only problem with rst is that several useful extensions are not updated. I have some great rst documentation, but part of that is me importing doxygen, dolphin, and other extensions that are useful but sadly not updated on the same schedule as the main tool. I end up many versions back just because it is all that is compatible.

still markdown just isn't powerful enough for anything non trivial.

lifthrasiir · 8 days ago

The original spirit of Markdown was to use HTML elements (or custom elements if you like) for whatever is missing from Markdown. That's surprisingly versatile in hindsight, but the specification didn't fully anticipate what happens to Markdown contents inside such elements. Some implementations supported them, some didn't, some used the `markdown` pseudo-attribute, and so on. And it was even less clear how block syntaxes work inside HTML elements. (CommonMark defines a very lengthy list of rules for them [1].) Markdown could have been extensible... if it had had a sensible specification from the beginning.

[1] https://spec.commonmark.org/0.31.2/#html-blocks

chipotle_coyote · 8 days ago

> still markdown just isn’t powerful enough for anything non trivial

I see this sentiment a lot, and my reaction is always, “Sure it is, with asterisks.” In the past decade I was the primary author of the RethinkDB documentation, a senior technical writer on Bixby’s developer documentation, and am now a contractor working on Minecraft’s developer docs. All of them were large, decidedly non-trivial, and Markdown. Microsoft’s entire learning portal, AFAICT, is in Markdown.

And the thing is, each of those systems used a different Markdown processor. My own blog uses one that’s different from all of those. According to HN, I should be spending virtually all my time fighting with all those weird differences and edge cases, but I’m not. I swear. The thing about edge cases is they’re edge cases. I saw a “Markdown torture” document the other day which contained a structure like this:

    [foo[bar(http://bar.com)](http://foo.com)

and proudly proclaimed that different Markdown processors interpret that construct differently. Yes, okay, and? Tell me a use case for that beyond “I want to see how my Markdown processor breaks on that.”

The asterisk is that almost any big docs (or even blogging) system built on Markdown has extensions in it, which are usually a function of the template system. Is that part of Markdown? Obviously not. Is it somehow “cheating”? I mean, maybe? At the end of the day, 99% of what I’m writing is still Markdown. I just know that for certain specific constructs I’m going to use {{brace-enclosed shortcodes}}, or begin an otherwise-typical Markdown block quote with a special tag like “%tip%” to make it into a tip block. Every system that proclaims it’s better than Markdown because it allows for extensions, well, if you take advantage of that capability, look at you adding site-specific customization just like I’m doing with (checks notes) Markdown.
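To give a feel for how thin that template layer can be, here is a minimal Python sketch of a shortcode pass that runs before the Markdown processor. Everything in it is hypothetical: the `version` shortcode, its output, and the function name are stand-ins, not any particular system's syntax or API.

```python
import re

# Hypothetical shortcode table; a real docs system defines its own names.
SHORTCODES = {
    "version": lambda: "1.2.3",
}

# Matches {{name}} with optional spaces inside the braces.
_SHORTCODE = re.compile(r"\{\{\s*([\w-]+)\s*\}\}")


def expand_shortcodes(source, table=SHORTCODES):
    """Replace each known {{name}} with its output; leave unknown ones as-is."""
    def replace(match):
        handler = table.get(match.group(1))
        return handler() if handler else match.group(0)

    return _SHORTCODE.sub(replace, source)


print(expand_shortcodes("Current release: **{{version}}**"))
```

The output is still plain Markdown, so whichever processor the site uses sees nothing unusual.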

If reStructured Text works better for you, or AsciiDoc, or Org Mode, great! Hell, do it all in DITA, if you’re a masochist. But this whole “this is obviously technically superior to Markdown, which surely no one would ever do real work in, pish tosh” nonsense? We do. It works fine. Sorry.

chrismorgan · 8 days ago

> {{brace-enclosed shortcodes}}

I haven’t checked if any of the details have changed any time recently, but Zola does this, and I had a rough time with it because of the interactions with Markdown rules around raw HTML and escaping and such. I have worked to forget the details. I reckon Zola bakes Markdown in too deeply, and it’s a pain. Especially because of indentation causing code blocks, that’s one of the biggest problems with extending Markdown by “just writing HTML”.

bluGill · 7 days ago

markdown is great for a single short page. It doesn't have good links to the middle of a page (some extensions do but not the popular ones), nor can it generate tables of contents, indexes, and the other things a large site should have. Rst will do all that, and because it is a site generator, if you reorganize, the links get fixed - or at least you get a warning on a dead link.

jeberle · 8 days ago

The RST parser is available in only one language, Python. I don't want my content tied to a single language stack, regardless of how good it might be. Markdown parsers exist in any language I care to use.

xigoi · 7 days ago

> Markdown parsers exist in any language I care to use.

Except each one actually parses a slightly different language.

https://git.sr.ht/~xigoi/markdown-monster/blob/master/monste...

kazinator · 7 days ago

So for instance, your git repo README.md that looks nice on your self-hosted CGIT site might not look right if someone clones your stuff to GitHub.

ajross · 8 days ago

Everyone who works seriously with editing and formatting documentation for presentation prefers RST.

Markdown is for the people, almost never full time doc jockeys, who need to WRITE that documentation.

acidburnNSA · 8 days ago

For books or significant document sets I definitely agree with the author on this. The builtin features for glossary and index are also nice. The extensibility is amazing. Some people are even doing formal requirements and lifecycle management in RST these days!!

https://www.sphinx-needs.com/

4b11b4 · 8 days ago

This looks kind of useful for creating good contexts about project requirements