Show HN: Lit – a modern literate programming tool

From the description in the README I feared that the entire source-code of the program would appear twice in the resulting document; first under the definition of the "<< * >>" macro and again wherever each code-fragment was defined. Looking at the contents of the "examples" directory, however, I can see that the "<< * >>" macro works more like a table-of-contents.

That's reasonable, in a minimalist kind of way, but it's a bit unfortunate that Lit syntax winds up unmodified in the output document; I'd wind up having to put a paragraph at the top of each document explaining what Lit was and why all the "<< >>" tokens throughout the code weren't actually part of the code.

Also, the resulting HTML doesn't actually validate: http://validator.nu/?doc=https%3A%2F%2Fraw.githubusercontent...

cdosborn · 11 years ago

The project is still early, and the html generation is rudimentary. Will be working soon to validate.

Not sure what you mean by appearing twice. I debated removing the "<< >>" syntax in the generated html, but I don't think it's bad. It's essentially the syntax for macros, which aid in providing context. It's useful in html, because you can quickly refer to the defn of a macro by following the anchor.

thristian · 11 years ago

The readme says:

lit only has two valid constructs: A macro definition: << ... >>= and a macro reference: << ... >>

...from my experience with other macro systems, I assumed that a "macro reference" would be replaced with the content of the macro definition, leading the code-block to appear at the top (under the star macro) and also in the macro definition. I'm pleased to see that's not the case.
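The substitution behaviour described above (each macro reference replaced by the body of its definition) can be sketched in a few lines of Python. This is only an illustration of the general tangle idea, not Lit's implementation: the layout rule (a definition line followed by body lines indented four spaces), the root name "*", and the function names parse_chunks, expand and tangle are all assumptions made for the example. It has no cycle detection.

```python
import re

# Assumed syntax: "<< name >>=" starts a definition, "<< name >>" alone on a
# line is a reference. The four-space body indent is a hypothetical rule.
DEF_RE = re.compile(r"^<<\s*(.+?)\s*>>=\s*$")
REF_RE = re.compile(r"^(\s*)<<\s*(.+?)\s*>>\s*$")


def parse_chunks(text):
    """Collect macro bodies keyed by name; prose lines are skipped."""
    chunks = {}
    current = None
    for line in text.splitlines():
        match = DEF_RE.match(line)
        if match:
            current = match.group(1)
            chunks.setdefault(current, [])
        elif current is not None and (line.startswith("    ") or not line.strip()):
            chunks[current].append(line[4:])
        else:
            current = None  # an unindented prose line ends the body
    for body in chunks.values():
        while body and not body[-1].strip():
            body.pop()  # drop blank lines trailing each body
    return chunks


def expand(name, chunks, indent=""):
    """Recursively replace each reference line with the body it names."""
    out = []
    for line in chunks[name]:
        match = REF_RE.match(line)
        if match:
            out.extend(expand(match.group(2), chunks, indent + match.group(1)))
        else:
            out.append(indent + line if line else line)
    return out


def tangle(text, root="*"):
    """Produce the source file by expanding the root macro."""
    return "\n".join(expand(root, parse_chunks(text))) + "\n"


SOURCE = """\
The program prints a greeting.

<< * >>=
    << define greet >>
    greet()

The helper is explained later in the narrative.

<< define greet >>=
    def greet():
        print("hello")
"""

if __name__ == "__main__":
    print(tangle(SOURCE), end="")
```

Note that the helper is defined after the root macro in the document but lands before the call in the tangled output, which is the ordering freedom literate programming is meant to give.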

I'd rather not have the "<< >>" syntax in the resulting HTML, because it's the syntax for Lit macros... if I'm writing a document in human language A to explain concept B in programming language C, that's already a lot of context, and requiring the reader to also be familiar with literate-programming-tool D is a drawback.

Linking to macro definitions is definitely a useful feature, but I'd rather those links were distinguished with a CSS class so that I could define their location and appearance in the stylesheet, rather than giving them specific text and markup.
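As a rough sketch of that suggestion, a weave step could turn each reference into a plain anchor carrying a CSS class and leave the presentation to the stylesheet. Everything named here is hypothetical: the class name "macro-ref", the slug scheme for anchor ids, and the function names are assumptions for illustration, not anything Lit produces.

```python
import html
import re

# Matches "<< name >>" but not a definition line ending in ">>=".
REF_RE = re.compile(r"<<\s*(.+?)\s*>>(?!=)")


def slug(name):
    """Turn a macro name into an anchor id (hypothetical scheme)."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-") or "root"


def link_references(code_line):
    """Escape a code line for HTML, rendering macro references as links."""
    parts = []
    last = 0
    for match in REF_RE.finditer(code_line):
        parts.append(html.escape(code_line[last:match.start()]))
        name = match.group(1)
        parts.append(
            '<a class="macro-ref" href="#{}">{}</a>'.format(slug(name), html.escape(name))
        )
        last = match.end()
    parts.append(html.escape(code_line[last:]))
    return "".join(parts)


if __name__ == "__main__":
    print(link_references("    << define greet >>"))
```

With markup like this, a stylesheet rule on a.macro-ref could restore the "<< >>" brackets via generated content, or hide them, without changing the document.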

cdosborn · 11 years ago

validator passes, thanks.

I really really want to like this. Org mode for emacs has really expanded what I could want from a literate programming tool, though. Specifically, being able to execute small segments and have the output immediately usable is rather nice. That you can also feed tables of data directly into the language is also nice.

Granted, I do feel I have a fair bit to learn in how to structure code for others to read. Myself included. I am learning the rather obvious point that keeping a narrative to the code is not easy. And, of course, in the couple of attempts I've done recently, I end with a fairly large dump of "and here is the boring stuff" at the end.[1]

Also, I can't recommend reading straight from Knuth's site highly enough. His programs are rather interesting by modern aesthetics, but they are all still runnable.

[1] http://taeric.github.io/DancingLinks.html

jostylr · 11 years ago

My literate-programming tool, https://github.com/jostylr/literate-programming, has this feature. It really is nice to be able to pipe bits of code into various other functions for compiling (or evaling).

The notion I am working towards is actually more of a literate-project. Something that can do all the grunt tasks, such as linting and testing code, importing data, etc. and weave together bits from multiple literate documents.

taeric · 11 years ago

Do check out org-mode, then. It is surprising how far that tool has been extended.

Many of these features are actually under the label "reproducible research" nowadays. There was a really good talk given on this at a pypy convention a couple of years ago.[1]

[1] https://www.youtube.com/watch?v=1-dUkyn_fZA

akkartik · 11 years ago

I've been playing with literate programming for a while, and I don't think having boring stuff is a bad thing. The strategy I've settled on is to try to show a cleaned up history of the evolution of a codebase -- and a key part of that is including tests in the literate program. http://akkartik.name/post/wart-layers

taeric · 11 years ago

Sorry, my point was that I didn't even attempt to explain the boring stuff other than a "this was all incidental stuff that I needed."

Even without literate programming, this is the harder part of the program to document, to me. To really give it credit here, I would split these out into different files. But, I have pretty low interest in really delving into some of the stuff there. Hence, "and this is the support code for the figures I did."

porker · 11 years ago

I don't know if this counts as literate programming, but one of my favourite outputs for learning about code is Docco: http://jashkenas.github.io/docco/

Sample output: http://backbonejs.org/docs/backbone.html

Literate programming requires that there is freedom in ordering content in the literate file. Otherwise, you are restricted to the execution order of the computer. The intent of literate programming is to circumvent that requisite. Otherwise the tool is just a pretty printer of sorts.

desireco42 · 11 years ago

Yep, he pretty much applied literate programming and first made it practical

I think the greatest motivation for literate programming is providing context. After five minutes of reading, you can have a very good high level understanding of a complex program. This just isn't achievable in the same amount of time otherwise.

In that vein, you should consider the ability to coordinate a project using literate programming. It is hard enough to understand a code-base in a single file, but when a project has several different directories with a variety of files in each, having a starting document that describes all this structure is extremely important. And one can write the compiler to use that starting document itself to generate that complex structure.

ppereira · 11 years ago

I love literate programming; however, my main issue with most tools like this one is that they do not properly support compiler/debugger error messages. This is a huge, often unstated caveat. It is very important to be able to read error messages and quickly locate the offending line in the literate source.

For languages that support file/line pragmas like C, literate programming works very well. Alternately, if the language supports goto and can sensibly unravel these statements, one can cobble a weave/tangle script to still have a 1-1 mapping between program and source line numbers.

Otherwise, I find that I need to program in a very functional style in order to force my tangled program to have the exact same line numbers as the literate source. This is possible, but it also negates the advantages of having natural language macro names, since they are essentially equivalent to the function names. In this case, tangle becomes an identity mapping.

This is a feature I'm looking to add. It was designed to be included. When I parse the lit file, I record the line no. which will be represented in comments in the generated source file. Hopefully, this will mitigate some of the extra workflow.

The granularity will be for every macro defn, so there will be some ambiguity.
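A minimal sketch of what per-definition line comments might look like, and of the ambiguity they leave. This is not Lit's code: the marker format "# lit: file:line", the four-space body indent, and the names annotate and locate are assumptions, and for brevity the bodies are emitted in source order without expanding references.

```python
import re

DEF_RE = re.compile(r"^<<\s*(.+?)\s*>>=\s*$")
MARK_RE = re.compile(r"^\s*# lit: (.+):(\d+)$")


def annotate(text, source_name):
    """Emit each macro body preceded by a comment naming its origin line."""
    out = []
    in_body = False
    for number, line in enumerate(text.splitlines(), start=1):
        if DEF_RE.match(line):
            out.append("# lit: {}:{}".format(source_name, number))
            in_body = True
        elif in_body and line.startswith("    "):
            out.append(line[4:])
        else:
            in_body = False  # prose or blank line ends the body
    return out


def locate(generated, error_line):
    """Map a 1-based line of the generated file to the nearest marker above."""
    for line in reversed(generated[:error_line]):
        match = MARK_RE.match(line)
        if match:
            return match.group(1), int(match.group(2))
    return None


SOURCE = """\
Intro prose.

<< setup >>=
    x = 1
    y = 2

More prose.

<< main >>=
    print(x + y)
"""

if __name__ == "__main__":
    generated = annotate(SOURCE, "demo.lit")
    print("\n".join(generated))
    print(locate(generated, 3))
```

An error reported on generated line 3 resolves only to the start of the enclosing definition (demo.lit line 3 here), so the reader still has to count down within the body. For C-like targets, a "#line" directive in place of the comment would let the compiler report literate-source positions itself.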

padator · 11 years ago

The main issue I had with LP in the past is that sometimes you want to modify the code directly, not the literate document and synchronize them. I made such a tool: https://github.com/aryx/syncweb

Neat. Do you find that you use that a lot? I have thought about implementing it in the literate-programming tool I wrote, but so far I have found a split screen editor (vim for me) with search to be fairly effective.

On the opposite side of things, I rather enjoy mucking about with the compiled code to diagnose it, and then recompiling to get rid of the diagnostic debris.

I use syncweb all the time. I can't go back to just using noweb. Many of the subprojects in pfff have a literate document that is synchronized with the code using syncweb (e.g. https://github.com/facebook/pfff/blob/master/h_visualization... ). When I have errors in my code, or when I debug my code, I do it on the generated code, not the tex document, so of course if there is a fix it's easier to fix the code directly and synchronize much later.

My mind was almost literally blown when I found that org-mode in emacs has this. I believe you have to be putting in the noweb comments, but there is a detangle function.

Do you have a link explaining this org-mode feature?

lindig · 11 years ago

Telling from the number of recent implementations, Literate Programming enjoys a bigger following than I had expected. Here is my implementation of a similar tool in OCaml: https://github.com/lindig/lipsum. I have some projects using it on GitHub like https://github.com/lindig/ocaml-hyphenate, which implements Knuth's algorithm for hyphenation. The README is basically the literate program of the project.

Question for the author: Are the spaces around `<<`, `>>`, and `>>=` mandatory?

No, they are not.

edmack · 11 years ago

Nice! What do people use literate programming for? I've yet to, but I'm sure I'm missing some cool benefits.

It really is a great way to introduce a narrative into the code. Too much of reading programs feels like taking a look at either just a plain convoluted mess, or an index that someone put together for another larger work. (This is especially true for pieces where someone makes a lot of single use functions. Sure, the pieces may be "self documenting," but it is almost akin to just seeing a bag of screws. You know what they do individually, but you don't know why they are there.)

If you have the time, take a look at any of Knuth's programs[1]. Obviously, they are not all immediately approachable, but render them to pdf and give them a try. You'll hopefully be surprised just how much you do pick up.

The only downside, to me, is that I become less concerned with modern trends of long variable names and abstractions that "self document." These are still great, of course, but they don't go nearly as far as knowing the narrative of why a piece of code was written.

[1] http://www-cs-faculty.stanford.edu/~uno/programs.html

noonat · 11 years ago

I personally have been using it to document the more complex code that I write -- that is, the code that I have the most trouble explaining to others. I'm quite happy with the results for my collision detection library. [1] The library itself serves as a basic tutorial on collision detection, and I've used it myself for reference when I step away from the code for a while.

[1] https://noonat.github.io/intersect/

As I said elsewhere in this thread, I've been exploring using it as a replacement for my hack of reading git logs to better understand codebases. Git logs are immutable, whereas a literate format can provide a cleaned-up history of the evolution of a codebase. http://akkartik.name/post/wart-layers