I don't get why people are scared of GC. Having worked in embedded software, mostly in C, I found it evident from profiling that C programs spend a lot of time allocating and releasing memory.
And those programs usually release memory inline, right on the hot path, unlike GC languages where the release is deferred, done in parallel, and with time-boxed pauses. Most "systems" programs use worse memory management strategies than what a modern GC actually offers.
Sure, some devices require using static memory allocations or are quite restricted. But a lot of other "system programming" targets far more capable machines.
I was stunned by how many engineers I met at Google who didn't know memory allocation takes a ton of time, didn't fundamentally believe it when you told them, and just sort of politely demurred on taking patches for it - first get me data, then walk me through the data, then show me how to get the data myself, then silence - because profiling ain't trivial.
There's a 3x speedup sitting in a library: it's a sea of carefully optimized functions manipulating floats, and someone made a change that extracted a function returning an array.
It was such a good Chesterton's fence: the change seemed straightforward and obvious, and it improved readability. But it had cascading effects that essentially led to 9 on-demand array allocations in the midst of about 80 LOC of pure math.
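To make the failure mode concrete, here's a rough sketch of that shape of refactor (in Rust, with made-up names - not the actual code): the extracted helper returns a fresh array, so what used to be pure math now hits the allocator on every call.

    // The shape of the refactor: a helper that returns a fresh array
    // forces a heap allocation per call...
    fn scaled(xs: &[f64], k: f64) -> Vec<f64> {
        xs.iter().map(|&x| x * k).collect() // allocates every time
    }

    // ...whereas the original inline math (or a write-into-caller-buffer
    // helper) never touches the allocator.
    fn scale_in_place(xs: &mut [f64], k: f64) {
        for x in xs {
            *x *= k;
        }
    }

    fn main() {
        let data = vec![1.0, 2.0, 3.0];
        let copy = scaled(&data, 2.0);     // allocation
        let mut buf = data.clone();
        scale_in_place(&mut buf, 2.0);     // no allocation
        assert_eq!(copy, buf);
    }

Multiply that by a handful of helpers inside an 80-line kernel and you get the 9 allocations described above.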
In my experience, there's a hangover from two ideas:
- Android does garbage collection and Android is slower than iOS
- You can always trade CPU time for memory and vice versa (e.g., we can cache function results, which uses more memory but less CPU; a quick sketch of this follows below)
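For example (a toy sketch, nothing to do with Android specifically - the function and constants are made up): memoizing an expensive function trades a growing cache for avoided CPU work.

    use std::collections::HashMap;

    // Stand-in for some expensive pure computation.
    fn expensive(n: u64) -> u64 {
        (0..n).map(|i| i.wrapping_mul(2654435761)).sum()
    }

    // Classic memory-for-CPU trade: the cache grows without bound
    // (the memory side), repeated calls become lookups (the CPU side).
    fn expensive_cached(n: u64, cache: &mut HashMap<u64, u64>) -> u64 {
        *cache.entry(n).or_insert_with(|| expensive(n))
    }

    fn main() {
        let mut cache = HashMap::new();
        let a = expensive_cached(1_000_000, &mut cache); // computes and stores
        let b = expensive_cached(1_000_000, &mut cache); // pure lookup
        assert_eq!(a, b);
    }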
> Android does garbage collection and Android is slower than iOS
There are a lot of things going on that make this the case.
Android was built with unrestrained multitasking; there's a lot of restraint now, but there's probably still a ton of stuff going on from unrelated processes while your foreground app is chugging along. iOS locks down background execution, and I suspect system apps have limited background execution as well. Less task switching and less cache pollution help a lot.
iOS only runs on premium phones. Even Apple's lower priced phones are premium phones. Android isn't necessarily quick on premium phones, but it can be much worse on phones designed to a $50 price point.
IMHO, a large issue is that the official UI toolkits are terribly slow. I'm more of a backend person, but I put together a terrible weather app, and first paint is very slow, even though it starts with placeholder data (or serialized data from a previous run); using an HTML view was much faster, because it only creates one toolkit object, and more capable, because CSS layout lets you wrap text around an image nicely. Maybe there's some trick I'm not aware of to make 'native UI' not terrible on Android... but that experience helped me understand why everything is so slow to start.
> Android does garbage collection and Android is slower than iOS
Thing is, not only has the Android runtime witnessed several JIT and GC overhauls since Dalvik was created (originally it was worse than Sony's and Nokia's J2ME implementations, regardless of Google's marketing otherwise), there are also many other factors contributing to Android's perceived slowness: badly coded UIs, sloppy programming, cheapskate OEMs shipping low-performance chips, and so on.
On the iOS side, on the other hand, due to the lack of GC and of OS paging, when memory runs out, applications just die.
Depends how you define "a ton of time". Good general-purpose memory allocation algorithms, like the TLSF algorithm [1], guarantee an O(1) bounded response time suitable for real-time use. As to your example, though: if someone introduces extra computation into math-heavy hot-loop code, that's just sloppy development, because they're adding extra computation. That the extra computation happens to be memory allocation is tangential.
[1] https://www.researchgate.net/publication/4080369_TLSF_A_new_...
> that C programs spend a lot of time allocating and releasing memory
...then this code didn't get the point of manual memory management, which is to minimize allocations and move them out of the hot path. Unfortunately a lot of C and C++ code is still stuck in a 1990's OOP model where each individual object is explicitly constructed and destroyed, and each construction and destruction is linked to an alloc/free call. If you do this (or reach for reference counting, which isn't much better), a GC is indeed the better choice.
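As a concrete (made-up) illustration of "move them out of the hot path": allocate one scratch buffer up front and reuse it on every iteration, instead of building a fresh array per call.

    // One scratch buffer, reused across iterations, instead of one Vec per call.
    fn smooth_into(samples: &[f32], out: &mut Vec<f32>) {
        out.clear(); // keeps the existing capacity
        out.extend(samples.windows(3).map(|w| (w[0] + w[1] + w[2]) / 3.0));
    }

    fn main() {
        let samples: Vec<f32> = (0..8).map(|i| i as f32).collect();
        let mut scratch = Vec::new(); // allocated once, outside the loop
        for _frame in 0..4 {
            smooth_into(&samples, &mut scratch); // no allocation after warm-up
        }
        println!("{:?}", scratch);
    }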
Moving the allocations/releases out of the hot path already looks a lot like a GC. Also, not all programs can be easily optimized by moving the allocations: databases are an example; usually you don't know how many rows you are going to touch until you process the query.
If a "systems" program can afford GC, then I would consider it mischaracterized. The whole premise of "systems" programming IMO is that you need the code to behave exactly in the way specified, without unexpected preemption (i.e., preemption can only occur at controlled locations and/or for controlled durations). I think a lot of "embedded" software really isn't "systems" software, though the two get conflated a lot.
Note that GC cannot possibly promise "time-boxed pauses" in the general case; either allocations are allowed to outpace collections, in which case a longer collection pause will eventually be required to avoid memory exhaustion, or else allocations must be throttled, which just pushes an unbounded pause forward to allocation time.
This is a common trend among D, Chapel, Vale, Hylo, ParaSail, Haskell, OCaml, Swift, Ada, and now June.
While Rust made Cyclone's type system more manageable for mainstream computing, everyone else is trying to combine the benefits of linear/affine type systems with the productivity of automated resource management.
Naturally it would be interesting to see if some of those attempts can equally feed back into Rust's ongoing design work.
* Carp: https://github.com/carp-lang/Carp
* Nim: https://nim-lang.org/
* Zig: https://ziglang.org/
* Austral: https://borretti.me/article/introducing-austral
The problem for systems/low-level programming is that you want high performance and manual control of resource management. As such, automated resource management can often look like a problem instead of a feature. I think there is a deeper disconnect here between the language designers and the programmers in the trenches.
Stuff like memory pools, arenas, and slab allocators has been in widespread use in C/C++ systems programming for decades. It looks like the designers of hip languages are reinventing that stuff in compilers that try to protect you from yourself.
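For anyone who hasn't bumped into these: a minimal, hand-rolled sketch of the arena idea (indices into one backing Vec, everything freed in one go) - roughly the pattern those languages are now formalizing in the type system.

    // A minimal arena-style pool: allocate nodes into one Vec and pass
    // around indices; everything is freed at once when the arena is dropped.
    struct Arena<T> {
        items: Vec<T>,
    }

    impl<T> Arena<T> {
        fn new() -> Self {
            Arena { items: Vec::new() }
        }
        fn alloc(&mut self, value: T) -> usize {
            self.items.push(value); // amortized O(1), no per-object free
            self.items.len() - 1
        }
        fn get(&self, id: usize) -> &T {
            &self.items[id]
        }
    }

    struct Node {
        value: i32,
        next: Option<usize>, // index of the next node in the same arena
    }

    fn main() {
        let mut arena = Arena::new();
        let tail = arena.alloc(Node { value: 2, next: None });
        let head = arena.alloc(Node { value: 1, next: Some(tail) });
        assert_eq!(arena.get(arena.get(head).next.unwrap()).value, 2);
    } // both nodes are freed here in one go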
Agreed. (Though I wouldn't exactly call Haskell a 'systems programming language'. At least not in the sense the article uses it.)
Yes, linear / affine types and uniqueness types can give you a lot of control over resources, while also allowing you to preserve functional semantics.
I would like to see a version of Haskell that doesn't just track IO, but also tracks totality. I.e., unless your function is annotated with a special tag, it has to be guaranteed to return a value in finite time, i.e. it has to be total. If you tag it with e.g. 'Partial', you can rely on laziness.
That's very similar to how functions in Haskell can't do any IO, unless they are tagged with 'IO'.
(I know that Haskell doesn't see 'IO' as a tag or annotation. But it behaves like one in the sense I am using here.)
“Finite time” isn’t much of a guarantee. A million years is finite. Totality checking is useful in languages for doing proofs, where you need a guarantee that a function always returns a value even though you never run it.
In languages for doing practical calculations, a progress dialog and a cancel button are more useful than a totality guarantee. It should be easier to make complex, long-running calculations safely cancellable.
(Still true with laziness, though it changes a bit. At some point you will ask for a value to be calculated.)
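For illustration, a bare-bones sketch of cooperative cancellation (no particular framework assumed; the names and timings are invented): the long-running loop polls a shared flag, so a cancel button only has to flip it.

    use std::sync::atomic::{AtomicBool, Ordering};
    use std::sync::Arc;
    use std::thread;
    use std::time::Duration;

    fn main() {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = Arc::clone(&cancelled);

        // The "long-running calculation": checks the flag between chunks of work.
        let worker = thread::spawn(move || {
            let mut chunks = 0u64;
            while !flag.load(Ordering::Relaxed) {
                chunks += 1; // stand-in for one chunk of real work
            }
            chunks
        });

        thread::sleep(Duration::from_millis(10)); // pretend the user hit cancel
        cancelled.store(true, Ordering::Relaxed);
        println!("cancelled after {} chunks", worker.join().unwrap());
    }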
> Though I wouldn't exactly call Haskell a 'systems programming language'. At least not in the sense the article uses it.
I don't disagree, but...
Except for executable sizes and maybe unpredictable performance (performance unpredictability in Haskell has the same shape as UB in C, in that it creeps in, but it's way easier to keep away), there's actually no feature missing.
You can argue for C-like speed (instead of Java-like), but the language in the article doesn't have that either.
Haskell is my favourite language ever, and the (vanilla) type system is a joy to use. Inevitably, though, I find myself wishing for some dependent typing. Agda and Idris aren't quite there yet imho, but there's definitely a need for something that's at least as easy to work with as Haskell but a little more powerful (and ideally not some arcane GHC megahack).
Same here, using Scala in a Haskell-like way, and while Scala has more dependent typing than Haskell (ignoring the Liquid Haskell extension), it's still not sufficient or practical. I think we are quite far from having fully featured dependent type systems in a mainstream language. Maybe TypeScript comes closest so far.
I think the resource management issue is mostly solved in practice when using pure functional programming.
However that comes with performance drawbacks (or at least unpredictable/unreliable performance) which creates the need for languages like Rust. It's great to see the progress in those languages as well.
> I think the resource management issue is mostly solved in practice when using pure functional programming.
Well, in the sense that garbage collectors solve memory management. But pure functional programming in the Haskell sense doesn't really manage file handles for you, or database connections.
You could have more predictable performance in Haskell, if you didn't have to deal with mandatory laziness for some functions.
(Basically, in this hypothetical variant of Haskell, by default functions would be safe to be evaluated in any order, strict or lazy or whatever. If you want functions that need to be evaluated lazily, then you'd need to declare that; just like today you already need to declare that your function might have side effects.)
The compiler would then be free to re-order evaluation a lot more, and pick predictable, fast performance.
    struct Node<'a, 'b, 'c> {
        data1: &'a Data,
        data2: &'b Data,
        data3: &'c Data,
    }
Wow. It's like teaching C++ and starting from SFINAE. Or C# and starting from type parameter constraints.
Please think of real-world examples when teaching this stuff. I am very eager to see the program a beginner would need to write that requires: 1) references in a struct; 2) three separate lifetime parameters on the same struct.
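To be fair, here is a minimal sketch (mine, not from the article) of the kind of program where references in a struct do show up - a zero-copy view over an input string. Note that a single lifetime parameter suffices, which is rather the point: three separate lifetimes is a strange place to start.

    // A zero-copy "view" into a line of input: the struct borrows from the
    // buffer instead of copying each field into its own String.
    struct Record<'a> {
        name: &'a str,
        value: &'a str,
    }

    fn parse(line: &str) -> Option<Record<'_>> {
        let (name, value) = line.split_once('=')?;
        Some(Record {
            name: name.trim(),
            value: value.trim(),
        })
    }

    fn main() {
        let input = String::from("retries = 3");
        if let Some(rec) = parse(&input) {
            println!("{} -> {}", rec.name, rec.value);
        }
    }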
Effect systems strike again! They've come up a few times recently on HN, and region-based memory management is another problem they can solve. This paper describes a type system that region-based memory management falls out of as a special case: https://dl.acm.org/doi/10.1145/3618003
I was quite fascinated by Koka's use of refcounting during compilation to do June's 'recycle' trick automatically (i.e., if you consume or discard the last reference to something during an operation that returns a new 'something', it reuses the memory of the now-defunct one).
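My mental model of that trick, written out by hand (a Rust sketch of the idea, not how Koka or June actually implement it): because the operation takes the node by value, it owns the allocation and can update it in place instead of freeing it and allocating a replacement.

    struct Node {
        value: i32,
        next: Option<Box<Node>>,
    }

    // Increment every value in the list, reusing each Box instead of
    // allocating fresh nodes - the reuse that Koka's analysis automates.
    fn bump_all(list: Option<Box<Node>>) -> Option<Box<Node>> {
        list.map(|mut node| {
            node.value += 1; // mutate in place: same allocation
            node.next = bump_all(node.next.take());
            node // hand the (reused) Box back
        })
    }

    fn main() {
        let list = Some(Box::new(Node {
            value: 1,
            next: Some(Box::new(Node { value: 2, next: None })),
        }));
        let list = bump_all(list);
        assert_eq!(list.as_ref().unwrap().value, 2);
    }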
> Rust's focus on embedded and system's development is a core strength. June, on the other hand, has a lean towards application development with a system's approach. This lets both co-exist and offer safe systems programming to a larger audience.
I think this is a mistake, both on June's part and on Rust's. All low-level languages (by which I mean languages that offer control over all/most memory allocation) inherently suffer from low abstraction, i.e. there are fewer implementations of a particular interface or, conversely, more changes to the implementation require changes to the interface itself or to its client. This is why even though writing a program in many low-level languages can be not much more expensive than writing the program in a high-level language (one where memory management is entirely or largely automatic), costs accrue in maintenance.
This feature of low-level programming isn't inherently good or bad -- it just is, and it's a tradeoff that is implicitly taken when choosing such a language. It seems that both June and Rust try to hide it, each in their own way, Rust by adopting C++'s "zero-cost abstraction approach", which is low abstraction masquerading as high abstraction when it appears as code on the screen, and June by yielding some amount of control. But because the tradeoff of low-level programming is real and inescapable, ultimately (after some years of seeing the maintenance costs) users learn to pick the right tradeoff for their domain.
As such, languages should focus on domains that are most appropriate for the tradeoffs they force, while trying to aim for others usually backfires (as we've seen happen with C++). Given that ultimately virtually all users of a low level language will be those using it in a domain where the low-level tradeoff is appropriate -- i.e. programs in resource-constrained environments or programs requiring full and flexible control over every resource like OS kernels -- trying to hide the tradeoff in the (IMO) unattainable hope of growing the market beyond the appropriate domain, will result in disappointment due to a bad product-market fit.
Sure, it's possible that C++'s vision of broadening the scope of low-level programming was right and it's only the execution that was wrong, but I wouldn't bet on it on both theoretical (low abstraction and its impact on maintenance) and empirical (for decades, no low-level languages have shown signs of taking a significant market share from high-level languages in the applications space) grounds. Trying to erase tradeoffs that appear fundamental -- to have your cake and eat it -- has consistently proven elusive.
I beg to differ, as shown by the lineage of systems languages that started with the Pascal dialects, Mesa, Cedar, and the Modula variants, which in a way is what Zig is going back to, with a revamped syntax for the C crowd.
High-level systems languages that provide good programming comfort, while having the tools to go under the hood if so desired.
Being forced to deal with naked pointers and raw memory in every single line of code, like in C, is an anomaly only made possible by the industry's adoption of UNIX at scale.
But you're using a different definition of high and low level than I do, and are thus missing my point. I define a high-level language as one that trades off control over memory resources in exchange for higher abstraction (by relying on automatic memory management, AKA garbage collection -- be it based on a refcounting or a tracing algorithm -- as the primary means of memory management), while a low-level language makes the opposite tradeoff. By this definition, C++ is just as low-level as C, regardless of the use of raw vs. managed pointer.
The question I'm interested in here is not which features a low-level language should add to make it more attractive for low-level programming, but whether it should add features that are primarily intended to make it more attractive for application programming. The declining share of low-level languages (by my definition) for application programming over the last 30 years leads me to answer this question in the negative. This is a big difference between the approach taken by low-level languages like C++ and Rust, which try to appeal to application programming, so far unsuccessfully, and low-level languages like Zig, which don't. So far, neither Rust nor Zig has been able to gain a significant market share of low-level programming, let alone application programming, which makes judging the success of their approach hard, but C++ has clearly failed to gain significant ground in the application space despite achieving great success in the low-level space.
The reason I'm focusing on this question is that this article specifically calls out an attempt by the June language to appeal to application programmers, and I claim that C++/Rust's "zero cost abstraction" approach does the same -- it attempts to give the illusion of high abstraction (something that I believe isn't useful for low-level programmers, who make the low-abstraction tradeoff with their eyes open) without actually providing it (clients are still susceptible to internal changes in implementation).
C++ managed to take market share from C by being both as low-level and higher-level. It is true that this hasn't happened again since, as other, higher-level languages took market share from C++ for applications but didn't replace it for lower-level stuff. Still, it is conceivable that another language might do to C++ what it did to C.
I think that the way you use "high level" here is vague and so makes it hard to see what's going on. I define a high-level language as one that trades off control over memory resources in exchange for higher abstraction (by relying on automatic memory management, AKA garbage collection -- be it based on a refcounting or a tracing algorithm -- as the primary means of memory management), while a low-level language makes the opposite tradeoff. By this definition, C++ is just as low-level as C. We can argue over which C++ features made it more attractive than C in the low-level space, but it is clear that the overall market share of C + C++ has only declined over the past thirty years, and C++ has failed to make significant inroads in application programming over the long term (it had a short-lived rise followed by a drop). The question I focus on is whether a low-level language, by my definition, should have features specifically accommodating application programming. The obvious failure of low-level languages -- which include C++ according to my definition -- to take a significant market share of application programming leads me to answer that question in the negative.
Various features accommodating low-level programming -- those that may have helped C++ take market share away from C but, crucially, have not helped it gain market share in application programming (over the long term) -- are therefore irrelevant to this core question. It's one thing to make a low-level language more attractive for low-level programming, and a whole other thing to make it more attractive for application programming. C++ has succeeded in the former but failed in the latter.
I don't know why, but Rust's syntax just nails it for me. The more I use it, the more I appreciate it. I see many projects that diverge from Rust's syntax while being inspired by it. Why?
Related: I really like the look of Hare [1]; sadly, they don't seem to be interested in a cross-platform compiler. As I understand it, some of the design decisions have basically led it to be mostly a Linux/BSD language.
[1] https://harelang.org/
I personally love C. I think designing a language top-down is a poor approach overall; I prefer the bottom-up approach of the C-inspired systems languages that aim to fix C, rather than "this is how the world should beeee!"
The discussion of grouped lifetimes reminds me of the principles of Flow-based programming (without the visual part), where one main idea is that only one process owns a data packet (IP) at a time.
My own experience coding in this style [1] has been extremely reassuring.
You can generally, quite safely, reason about each process in isolation, since there aren't even any function calls between processes, only data being passed between them.
This meant, for example, that I could port a PHP application that I had been coding on for years, fighting bugs all over, into a flow-based Go application in two weeks, with development time perfectly linear in the number of processes. I just coded each process in the pipeline one by one, tested it, and continued with the next. There were never any surprises as the application grew, as the interactions between the processes are just simple data passing, which can't really cause that much trouble.
This is of course a radically different way of thinking and designing programs, but it really holds some enormous benefits.
[1] https://github.com/rdfio/rdf2smw/blob/master/main.go#L58-L15...
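For a rough idea of the structure (a hand-written sketch in this thread's lingua franca, Rust - not the actual Go code from [1]): each stage owns the packet it is currently processing and hands ownership downstream over a channel, so there are no shared calls between stages.

    use std::sync::mpsc;
    use std::thread;

    struct Packet {
        id: u32,
        payload: String,
    }

    fn main() {
        let (tx1, rx1) = mpsc::channel::<Packet>();
        let (tx2, rx2) = mpsc::channel::<Packet>();

        // Stage 1: transform each packet it owns, then hand ownership onward.
        let stage1 = thread::spawn(move || {
            for mut p in rx1 {
                p.payload = p.payload.to_uppercase();
                tx2.send(p).unwrap();
            }
        });

        // Stage 2: final consumer.
        let stage2 = thread::spawn(move || {
            for p in rx2 {
                println!("packet {}: {}", p.id, p.payload);
            }
        });

        for id in 0..3 {
            tx1.send(Packet { id, payload: format!("msg {id}") }).unwrap();
        }
        drop(tx1); // closing the channel lets the pipeline drain and shut down
        stage1.join().unwrap();
        stage2.join().unwrap();
    }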