Typst: A Programmable Markup Language for Typesetting [pdf]

tannhaeuser · 3 years ago

I really like the paper, though I'm not sure the world needs another Turing-complete document language however well-motivated ;)

As an SGML pedant, however, I can't resist commenting on the following:

> The second offspring of SGML is XML, specified by the World Wide Web Consortium (W3C) in 1998. It has a reduced feature set compared to SGML (for example, it forbids unclosed tags and concurrent markup). But it retains the most important aspect of SGML, one that HTML is lacking: The ability to define custom structural elements. This lets XML represent documents with much more semantic detail than HTML.

As the SGML vocabulary it was once envisioned to be, HTML itself doesn't need extensibility. When HTML is used as an SGML application, defining your own elements is as easy as declaring them in the "internal subset" or directly in a custom DTD. And if undeclared but well-formed elements are accepted rather than rejected (as with FEATURES IMPLYDEF ELEMENT from ISO 8879 Annex K), declarations are only necessary if you want to validate/infer custom content models, or use any of the other things markup declarations provide, such as custom SHORTREF syntax à la Markdown.

Arguably, HTML5's "custom elements" do provide a facility to define new elements, albeit an incredibly lousy one; i.e., custom elements can't have content model restrictions (see above), can't be used with tag omission/inference (important for customized elements), aren't integrated with DOM parsing, and need JavaScript for declaration - the latter point making them completely pointless as a markup feature.

chrismorgan · 3 years ago

But HTML never was “an SGML application” in practice, and I highly doubt it was ever actually envisioned as that. There may have been some tools out there that processed HTML as SGML, but none of the ones I know of did (most notably browsers).

And in fact, in practice you could just use your own custom elements without worrying about validity and it’d mostly just work. This wasn’t even particularly rare. (There was the whole “CSS doesn’t work on them until you call document.createElement("…")” bug in IE, but that’s the only problem I can think of, and it was easily worked around.)

tannhaeuser · 3 years ago

HTML the markup language was clearly intended as an SGML vocabulary - TBL himself said as much [1] and HTML also reused element names from the SGML spec/handbook as example/folklore vocabulary such as for paragraphs and headings.

What browsers made out of it isn't the matter here, but even if it were, the "practical, real-world HTML out there" argument is mostly used to pull up the ladder by an ad company/browser cartel made worse day-in day-out through an atrocious and absurdly voluminous HTML spec (and by CSS, of course).

Even though Ian Hickson, of WHATWG, wanted to capture HTML as it was understood by browsers, he couldn't help but add additional elements of his own - such as for marking up ads as "aside" lol, plus the alien sectioning elements concept that gave rise to the flawed "outline algorithm" and misuse of heading elements (and the earlier failure to understand SGML's RANK feature), a problem that was only fixed last year [2] by an incompatible change to HTML invalidating documents using hgroup as originally advised.

In practice, very few changes to the HTML syntax brought HTML outside SGML - for the most part, ad-hoc and basically unnecessary commenting rules for the script and style elements to keep legacy browsers from rendering JavaScript and CSS, resp., when those were introduced.

[1]: http://info.cern.ch/hypertext/WWW/MarkUp/MarkUp.html

[2]: https://github.com/w3c/htmlwg/issues/22

sebzim4500 · 3 years ago

>I really like the paper, though I'm not sure the world needs another Turing-complete document language however well-motivated ;)

What would you recommend instead? From my experience of the beta it is vastly nicer to use than latex, and I'm not really aware of any other competition.

Ghast · 3 years ago

This is exciting. I've been waiting for something like this to come along, as I'm typesetting an RPG book with LaTeX, and RPG books generally have rather complex layouts:

Random example:

https://external-content.duckduckgo.com/iu/?u=https%3A%2F%2F...

Calculating stat-blocks by hand is a nightmare for one author, so I need a typesetting language where I can type `\elf` and get a random elf, with all its derived stats (like Attack, Defence, etc.) generated correctly.

My example: https://i.redd.it/ng82unzqxru41.png
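To make the `\elf` idea concrete, here is a minimal Python sketch of what such a generator could look like: roll the base attributes once, derive everything else from them, and emit a table row for the typesetter. The attribute names, dice ranges, derivation formulas, and the `to_latex` helper are all hypothetical placeholders, not rules from any actual game or from the book above.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class StatBlock:
    name: str
    strength: int
    dexterity: int
    attack: int
    defence: int


def random_elf(rng: random.Random) -> StatBlock:
    """Roll base attributes, then derive the dependent stats from them."""
    strength = rng.randint(1, 6)
    dexterity = rng.randint(2, 7)  # hypothetical: elves roll 1-6 with +1 dexterity
    # Hypothetical derivation rules; substitute the game's real formulas.
    attack = strength + dexterity // 2
    defence = 10 + dexterity
    return StatBlock("Elf", strength, dexterity, attack, defence)


def to_latex(block: StatBlock) -> str:
    """Format the stat block as one row of a LaTeX tabular."""
    return (
        f"{block.name} & {block.strength} & {block.dexterity} & "
        f"{block.attack} & {block.defence} \\\\"
    )


if __name__ == "__main__":
    # A fixed seed keeps the 'random' elf identical across compiles.
    print(to_latex(random_elf(random.Random(2023))))
```

Seeding the generator is the important design choice here: the minimal and full versions of a book can then share the same elf instead of rerolling it on every build.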

Each book has different versions (minimal/ full) so getting reliable output has been a bit of a chore.

LaTeX is the only tool I know which can do the job, but it's still a bit of a nightmare at times. Ugly formatting means my code looks whacky, images can't float cleanly around a multicolumn environment, packages conflict, words overlap and it needs constant hand-holding.

Typst looks like just what I've been after - something like rmarkdown, with more power. Tables and layouts, and best of all - no packages.

What licence is typst under?

Will it be able to reference page numbers from a different book dynamically? (so if book-a and book-b are in the same directory, one can reference another's sections)

Will there be support for intelligent floating images?

Will it accept 'every-page' commands, so I can use that #rect command to show a chapter's name on side-tabs?

Will error messages tell me hbox underfull badness 10000 at least 3000 times per compile?

laurmaedje · 3 years ago

> What licence is typst under?

We will open source in March, probably with a permissive license, but that's not yet definitively decided.

> Will it be able to reference page numbers from a different book dynamically? (so if book-a and book-b are in the same directory, one can reference another's sections)

This is not currently implemented, but also shouldn't be fundamentally impossible.

> Will there be support for intelligent floating images?

There will be floating containers (also with text flowing around). We will probably keep the amount of "intelligence" low for more predictability. So you would specify top or bottom and it would be placed on the next page with free space.

> Will it accept 'every-page' commands, so I can use that #rect command to show a chapter's name on side-tabs?

Yes, every-page headers, footers, foregrounds, and backgrounds are already available and a way to query for the current chapter name is coming in a future update.

> Will error messages tell me hbox underfull badness 10000 at least 3000 times per compile?

No!

Ghast · 3 years ago

I hope it's FOSS, otherwise it won't be much use - there won't be any way to know if it will work in a couple of years.

> There will be floating containers (also with text flowing around). We will probably keep the amount of "intelligence" low for more predictability. So you would specify top or bottom and it would be placed on the next page with free space.

If it's not too 'intelligent', I hope there is - at a minimum - a floating image command that basically says 'place this floating image so it stays on the same page as this current line'. Currently I'm placing every float about 30 lines before where I want it and hoping for the best. It'd be a lot easier to just say 'place this float 50 lines back, typeset the page, and if the image is not in the right place, then start again 20 lines back', then loop until it's at no lines back or until it's with the right text.

That's ugly coding, but less ugly than the current hacks I do with LaTeX.
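The retry loop described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: `page_of_float` is a toy stand-in for a real compile (it pretends every float is deferred to the page after its anchor), pages hold a fixed 40 lines, and the function names and step size are made up for the example. Nothing here reflects how LaTeX or Typst actually place floats.

```python
from typing import Callable, Optional

LINES_PER_PAGE = 40  # assumption: fixed page height, for the toy model only


def page_of_line(line: int) -> int:
    """Page (0-based) on which a given text line ends up."""
    return line // LINES_PER_PAGE


def page_of_float(anchor_line: int) -> int:
    """Toy stand-in for a compile: the float lands on the page after its anchor."""
    return page_of_line(anchor_line) + 1


def find_float_offset(
    target_line: int,
    typeset: Callable[[int], int] = page_of_float,
    max_back: int = 50,
    step: int = 10,
) -> Optional[int]:
    """Try anchors from max_back lines before the target down to 0 lines back.

    Returns the first offset that puts the float on the target line's page,
    or None if no tried offset works.
    """
    wanted_page = page_of_line(target_line)
    for back in range(max_back, -1, -step):
        anchor = max(target_line - back, 0)
        if typeset(anchor) == wanted_page:
            return back
    return None


if __name__ == "__main__":
    print(find_float_offset(125))  # offset in lines, or None if nothing fits
```

Each call to `typeset` corresponds to a full recompile in practice, so the step size trades placement precision against build time.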

I hope the project goes well, and hope to see it in the standard repos before long.

vitorsr · 3 years ago

It is not my intention to overstep what you seem to be very engrossed in but have you tried LuaLaTeX? You can have Lua code that generates valid LaTeX and LaTeX macros that execute Lua code (see [1]). LuaLaTeX also supports TTF and OTF formats which plays well with other publishing software.

In relation to formatting, I cannot tell what the issue could be, but this does not match my or others' experience.

Typically you would typeset the book's main text areas with LaTeX, and then produce the preprint in Adobe InDesign, adding the graphical elements left out so far and fine-tuning the text layout. This is what publishers typically do, again in my and others' experience.

[1] http://mirrors.ctan.org/info/luatex/lualatex-doc/lualatex-do...

Ghast · 3 years ago

I've heard of it, but never seen anything that suits my exact needs. If it solves image float problems, it'll be worth it.

fermigier · 3 years ago

Impressive, as a Master's thesis, but also as a long-term project.

I've been recently working on a book project, and I've reviewed all the tools available. My current conclusion is that I found nothing that would combine the simplicity of Markdown or Asciidoc with the typographical control of LaTeX.

This project seems to hit the sweet spot.

Is the source code of the thesis available somewhere? Or of other works created with Typst? That would help in making a more educated bet.

Edit: this doesn't seem to be an open source project. So not what I'm looking for.

laurmaedje · 3 years ago

Hey, I'm the author of the thesis. Thanks for the kind words! Just wanted to let you know that while Typst isn't open source yet, we are planning to open source it in March. We will also make a free CLI tool available so as not to lock anybody into our web app.

There isn't much Typst source code out there yet, but you can find some examples and discussion on our Discord server [1] and in our documentation [2].

[1]: https://discord.gg/2uDybryKPe

[2]: https://typst.app/docs/

mgubi · 3 years ago

Have you considered TeXmacs (www.texmacs.org)? See here a short video describing its features: https://www.youtube.com/watch?v=H46ON2FB30U and the twitter feed for more examples: https://twitter.com/gnu_texmacs

hendrikrassmann · 3 years ago

Have you tried org-mode?

luisivan · 3 years ago

I've been working on something somewhat similar for producing legal documents, called Linked Markdown (https://linked.md). So the section about modules/importing and dynamic references really rang a bell. Very cool stuff, will keep track of progress.

breck · 3 years ago

I'm very interested in computational law. Drop me an email if you ever want to chat.

mjburgess · 3 years ago

As far as I can tell it compiles directly to pdf, which seems a non-starter for submitting to many journals which accept either latex or word documents.

This would need various compiler backends (perhaps via pandoc) to be that useful. Certainly it would help adoption if you could emit, eg., markdown/latex/etc. and others you're working with wouldn't need to adopt your tooling.

sebzim4500 · 3 years ago

Markdown is not close to powerful enough to produce all the stuff that Typst can produce, and emitting latex would be a massive technical effort. I think the best hope is that whatever journals currently only accept latex or .docx will start accepting typst files (or at least pdf).

blacklion · 3 years ago

It is known that TeX's algorithm for breaking a long text into lines to form a paragraph is globally optimal, but that breaking the stack of lines into pages is only locally optimal, due to memory constraints at the time of the original TeX implementation.

Does this system use a globally optimal algorithm for the second step (page composition)? Does anybody know of a globally optimal formatting system?
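For readers unfamiliar with the distinction, the difference between locally and globally optimal breaking can be shown on line breaking itself. The Python sketch below compares a greedy first-fit breaker with a dynamic-programming breaker that minimizes total squared leftover space. It is a deliberately simplified cost model (monospaced text, no stretchable glue, no hyphenation, last line penalized like any other), not TeX's actual algorithm, and the function names are invented for the example.

```python
from functools import lru_cache
from typing import List


def total_cost(lines: List[List[str]], width: int) -> int:
    """Sum of squared leftover space over all lines (simplified badness)."""
    return sum((width - len(" ".join(line))) ** 2 for line in lines)


def greedy_breaks(words: List[str], width: int) -> List[List[str]]:
    """Locally optimal: fill each line as far as possible, never look ahead."""
    lines, current = [], []
    for word in words:
        candidate = current + [word]
        if current and len(" ".join(candidate)) > width:
            lines.append(current)
            current = [word]
        else:
            current = candidate
    if current:
        lines.append(current)
    return lines


def optimal_breaks(words: List[str], width: int) -> List[List[str]]:
    """Globally optimal under the squared-slack cost, via dynamic programming."""
    if any(len(w) > width for w in words):
        raise ValueError("every word must fit on a line by itself")
    n = len(words)

    @lru_cache(maxsize=None)
    def best(i: int):
        # Returns (minimal cost of laying out words[i:], end index of first line).
        if i == n:
            return (0, n)
        options = []
        for j in range(i + 1, n + 1):
            length = len(" ".join(words[i:j]))
            if length > width:
                break  # adding more words only makes the line longer
            options.append(((width - length) ** 2 + best(j)[0], j))
        return min(options)

    lines, i = [], 0
    while i < n:
        j = best(i)[1]
        lines.append(words[i:j])
        i = j
    return lines


if __name__ == "__main__":
    words = "aaa bb cc ddddd".split()
    for breaker in (greedy_breaks, optimal_breaks):
        lines = breaker(words, 6)
        print(breaker.__name__, lines, total_cost(lines, 6))
```

On this input the greedy breaker packs "aaa bb" onto the first line and strands "cc" alone, for a cost of 17, while the global search accepts a looser first line and reaches a cost of 11. The same trade-off applies to pages: a greedy page builder commits to each page before seeing what follows.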

erk__ · 3 years ago

The program for it seems to be the one here https://typst.app/

sitkack · 3 years ago

Their supporting code is available on github and it is all in Rust.

https://github.com/typst/

sebzim4500 · 3 years ago

I'm in the beta and it's really nice. It's still missing some important features (references is the main thing for me) but once they arrive I doubt I will ever go back to latex.

amichail · 3 years ago

Have you tried TeXmacs?