From Languages to Language Sets

One thing I'll note is we tend to use languages from different levels in different settings (front end, back end, systems) and we spend an awful lot of time writing glue code to get them to talk to each other.

A major advantage of the proposed approach is automated FFI and serialization/deserialization between languages in the same language set. RustScript would be able to accept a struct or enum from Rust or RustGC, and vice-versa. You could have a channel with different languages on either end.

You can also see that we _want_ something like this, e.g. we bolt TypeScript on top of JavaScript, and types onto Python. If JavaScript (or python) were designed so they could be more easily compiled (notably, no monkey patching) then they would support level 2 as well.

I have been thinking of level 2 or 1 languages that support the higher levels. This is a really good framing. (The problem with going the other way is the implementation decisions in the interpretter often constrain how the compiler can work, e.g. CPython is dominant because all the libraries that make use of the CPython FFI, and similarly for NodeJS. It is easier to interpret a constrained language than to compile a dynamic language designed with an interpretter in mind).

sitkack · 5 months ago

Back when I did some high perf Python, I’d define my data at C structs and bump allocate those structs in a list using the cffi.

It is not unlike defining your data model for SQL so that you can have sane data access.

zahlman · 5 months ago

>bump allocate

The term isn't familiar to me, and when I try to look it up I get almost exclusively Rust-related results. I guess you mean https://en.wikipedia.org/wiki/Region-based_memory_management , which I grew up calling "pool allocation".

eterps · 5 months ago

This a 100%. It's madness that languages are effectively siloed from each other.

> One language could combine the 2nd and 3rd level though. A language that can be interpreted during development for fast iteration cycle, but compiled for better performance for deployment. There isn’t such a language popular today though.

I'm not sure if Dart counts as "popular", but it otherwise fits this bill. It has a JIT and can startup pretty quickly and interpret on the fly. You can also hot reload code changes while a program is running. And it can ahead-of-time compile to efficient machine code when you're ready to ship.

layer8 · 5 months ago

Many interpreted languages use an intermediate representation and/or JIT compilation internally, like for example Python with its .pyc files. And Java as a level-2 language only compiles to bytecode (class files) which by default is then interpreted, and typically only JIT-compiled for “hot” code. The distinction between levels 3 and 2 is more about how the application is distributed for execution, in source-code form vs. in some compiled binary form.

munificent · 5 months ago

> Many interpreted languages use an intermediate representation and/or JIT compilation internally, like for example Python with its .pyc files.

Yes, but Python, Ruby, Lua, etc. are also all dynamically typed, which places them in level 4.

> And Java as a level-2 language only compiles to bytecode (class files) which by default is then interpreted, and typically only JIT-compiled for “hot” code.

Yes, but Java is generally only run in the JVM and is not often compiled ahead-of-time to a static executable. There are AOT compilers for Java, but the performance isn't great. Java was designed to run in a VM. Classloaders, static initializers, reflection, every-method-is-virtual all make it quite difficult to compile Java to a static executable and get decent performance.

Dart was designed to be a decent AOT target.

andyferris · 5 months ago

notarobot123 · 5 months ago

I think Peter Naur's description of levels of computation is a better one for considering an actual layering of levels of abstraction:

> Each level is associated with a certain set of operations and with a programming language that allows us to write or otherwise express programs that call these operations into action. In any particular use of the computer, programs from all levels are executed simultaneously. In fact, the levels support each other. In order to execute one operation of a given level, several operations at the next lower level will normally have to execute. Each of these operations will in their turn call several operations at the still lower level into execution.

The old term "problem-oriented languages" seems to still be quite useful. Programming languages are always focused on allowing the programmer to solve a set of problems and their features hide irrelevant details.

These language sets seem like a helpful grouping of features that suit particular problem domains but I don't think it works as a taxonomy of levels of abstraction.

There is something to be said for the three languages (“levels”) to actually look sufficiently different from each other, so that when looking at some code it’s immediately clear which one it’s in. Making them too similar increases the likelihood of mistaking which one you’re in, and applying the mindset of one to the other.

Another reason to do that is that the different levels are amenable to different affordances, and have different trade-offs in their design. For example, at level 4 you may want to go for a more BASIC-like syntax, without semicolons, and commands without argument-list parentheses.

teaearlgraycold · 5 months ago

I’m a strong supporter of adding an automatic GC to Rust. Although it seems difficult to justify as RustGC code wouldn’t be trivial to convert to traditional Rust. But going in the opposite direction should be trivial.

tmtvl · 5 months ago

A language which can be either interpreted or compiled doesn't exist? Nobody tell the Common Lispers, they'd vanish in a puff of logic.

emidln · 5 months ago

This is one of the reasons I like Clojure. There are very useful dialects with broad overlap between:

Browser / JavaScript environments -> ClojureScript

General Purpose (JVM) -> Clojure

Fast Scripting -> Babashka (although I've used ClojureScript for this in the past)

C/C++ Interop (LLVM-based) -> Jank (new, but progressing rapidly and already useful)

I can largely write the same expressive code in each environment, playing to the platform strengths as needed. I can combine these languages inside the same project, and have libraries that have unified APIs across implementation. I can generally print and read EDN across implementations, provided I register the right tag handlers for custom types (this is one area jank still has to catch up). Reader conditionals allow implementation-specific code as needed.

I'm really excited about Jank giving me a good alternative to JNI/JNA/Panama when I need my Clojure to touch OS parts the JVM hasn't wrapped.

rickcarlino · 5 months ago

This is a better taxonomy of what a language is rather than the dated concept of “High-level” vs. “Low-level”.