Before arguing that garbage collection "defeats the point" of Rust, it's important to consider that Rust has many other strengths:
- Rust has no overhead from a "framework"
- Rust programs start up quickly
- The Rust ecosystem makes it very easy to compile a command-line tool without lots of fluff
- The strict nature of the language helps guide the programmer to write bug-free code.
In short: there are a lot of good reasons to choose Rust that have little to do with the presence or absence of a garbage collector.
I think having a working garbage collector at the application layer is very useful, even if, at a minimum, it only makes Rust easier to learn. I do worry about third-party libraries using garbage collectors, because garbage collectors tend to impose a lot of requirements, which is why a garbage collector is usually tightly integrated into the language.
You've just listed "Compiled language" features. Only the 4th point has any specificity to Rust, and even then it is vague in a way that could be misinterpreted.
Rust's predominant feature, the one that brings most of its safety and runtime guarantees, is borrow checking. There are things I love about Rust besides that, but the safety from borrow checking (and everything the borrow checker makes me do) is why I like programming in Rust. Now, when I program elsewhere, I'm constantly checking ownership "in my head", which I think is a good thing.
The heavyweight framework (and startup cost) that comes with Java and C# makes them challenging for widely-adopted lightweight command-line tools. (Although I love C# as a language, I find the Rust toolchain much simpler and easier to work with than modern dotnet.)
Just going to jump in here and say that there's another reason I might want Rust with a Garbage Collector: The language/type-system/LSP is really nice to work with. There have indeed been times that I really miss having enums + traits, but DON'T miss the borrow checker.
Also, the proposed garbage collector is opt-in: only pointers that are specifically marked as GC are garbage collected. This means that most allocations are still freed automatically when their owner goes out of scope, which greatly reduces the cost of GC compared to making all heap allocations garbage collected.
This isn't even a new concept in Rust. Rust already has a well-accepted Rc<T> type for reference-counted pointers. From a usage perspective, Gc<T> seems to fit in the same pattern.
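To make the parallel concrete, here's a minimal sketch. The Rc<T> half is real, standard-library Rust; the Gc<T> half is an assumption about what an Alloy-style API would look like, shown only in comments:

```rust
use std::rc::Rc;

fn main() {
    // Shared ownership via reference counting: plain access goes through
    // Deref and never touches the count; only clone and drop do.
    let a = Rc::new(vec![1, 2, 3]);
    let b = Rc::clone(&a); // strong count 1 -> 2
    println!("{}", a.len() + b.len());
    // Both handles drop here; the count reaches 0 and the Vec is freed.

    // A hypothetical Gc<T> would slot into the same pattern, except the
    // allocation is reclaimed by the collector at some later point rather
    // than the moment the last handle is dropped:
    //
    //     let a = Gc::new(vec![1, 2, 3]);
    //     let b = a; // in some designs Gc is Copy, so nothing to bump
}
```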
A language where most of the libraries don't use GC, but which has opt-in GC, would be interesting. For example, only your business-logic code would use GC (so you can write it more quickly), and the parts where you don't want GC would still be written in the same language, avoiding the complexity of FFI.
Add an opt-in JIT compilation mode for quick iteration during development and you don't need any other language. (Except for user scripts where needed.)
Also, assuming one can mix garbage collection with the borrow checker (is that what it's called in Rust?), one should be able to use GC for things that aren't called that much / aren't that important, and use the normal way for things that benefit from no GC interrupts etc.
In all honesty, there are three topics I try to refrain from engaging with on HN, often unsuccessfully: politics, religion, and Rust.
I don't know what you had to go through before reaching Rust's safe haven, but what you just said is true for the vast majority of compiled languages, which are legion.
It's the fledging of a new generation of developers. Every time I see one of these threads I tell myself, "you, too, were once this ignorant and obnoxious". I don't know any cure except letting them get it out of their system and holding my nose as they do.
Rust's choice of constructs also makes writing safe and performant code easy. Many other compiled languages lack proper sum and product types, and traits (type classes) offer polymorphism without many of the pitfalls of inheritance, to name a few.
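As a minimal sketch of what that combination buys you (the Shape/Area names are just illustrative):

```rust
// A sum type: a value is exactly one of these variants.
enum Shape {
    Circle { radius: f64 },
    Rect { w: f64, h: f64 },
}

// A trait provides polymorphism without an inheritance hierarchy.
trait Area {
    fn area(&self) -> f64;
}

impl Area for Shape {
    fn area(&self) -> f64 {
        // The compiler forces this match to handle every variant.
        match self {
            Shape::Circle { radius } => std::f64::consts::PI * radius * radius,
            Shape::Rect { w, h } => w * h,
        }
    }
}

fn main() {
    let shapes = [Shape::Circle { radius: 1.0 }, Shape::Rect { w: 2.0, h: 3.0 }];
    let total: f64 = shapes.iter().map(|s| s.area()).sum();
    println!("{total}");
}
```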
The problem with conventional garbage collection has very little to do with the principle or algorithms behind garbage collection and more to do with the fact that seemingly every implementation has decided to only support a single heap. The moment you can have isolated heaps almost every single problem associated with garbage collection fades away. The only thing that remains is that cleaning up memory as late as possible is going to consume more memory than doing it as early as possible.
What problem does that solve with GC, specifically? It also seems like that creates an obvious new problem: If you have multiple heaps, how do you deal with an object in heap A pointing to an object in heap B? What about cyclic dependencies between the two?
If you ban doing that, then you’re basically back to manual memory management.
BEAM (i.e. Erlang) is exactly that model: every lightweight process has its own heap. I don't see how you'd make that work in a more general environment that supports sharing pointers across threads.
Aren't Rust programs still considerably larger than their C equivalent because everything is statically linked? It's kind of hard to see that as an advantage.
But in practice it's more like there's an overhead for "hello world" but it's a fixed overhead. So it's really only a problem where you have lots of binaries, e.g. for coreutils. The solution there is a multi-call binary like Busybox that switches on argv[0].
C programs often seem small because you don't see the size of their dependencies directly, but they obviously still take up disk space. In some cases they can be shared but actually the amount of disk space this saves is not very big except for things like libc (which Rust dynamically links) and maybe big libraries like Qt, GTK, X11.
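For illustration, a minimal sketch of the multi-call pattern mentioned above: one binary, hard-linked or symlinked under several names, dispatching on the name it was invoked as (the applet names here are just placeholders):

```rust
use std::env;
use std::path::Path;

fn main() {
    // Figure out which name we were invoked as (argv[0]).
    let argv0 = env::args().next().unwrap_or_default();
    let name = Path::new(&argv0)
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or("");

    match name {
        "true" => std::process::exit(0),
        "false" => std::process::exit(1),
        "echo" => {
            let rest: Vec<String> = env::args().skip(1).collect();
            println!("{}", rest.join(" "));
        }
        other => eprintln!("unknown applet: {other}"),
    }
}
```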
Yes, all the Rust libraries a program depends on are statically compiled into the final binary. On one hand, it makes the binary size much larger; on the other, it makes it much easier to build an application that will "just work" without too much fuss.
In my personal projects with Rust, this ends up being very nice because it makes packaging easier. However, I've never been in a situation where binary size matters like in the embedded space, for example.
Rust isn't the only language with this approach, Go is another.
With data intensive Go applications you eventually hit a point where your code has performance bottlenecks that you cannot fix without either requiring insane levels of knowledge on how Go works under the hood, or using CGo and incurring a high cost for each CGo call (last I heard it was something like 90ns), at which point you find yourself regretting you didn't write the program in Rust. If GC in Rust could be made ergonomic enough, I think it could be a better default choice than Go for writing a compiled app with high velocity. You could start off with an ergonomic GC style of Rust, then later drop into manual mode wherever you need performance.
For those who are interested, I think that arena allocation is an underrated approach to managing lifetimes of interconnected objects that works well with borrow checking.
I was previously excited about this project, which proposed to support arena allocation in the language in a more fundamental way: https://www.sophiajt.com/search-for-easier-safe-systems-prog...
That effort was focused primarily on learnability and teachability, but it seems like more fundamental arena support could help even experienced devs if it made patterns like linked lists fundamentally easier to work with.
Thanks for those links. Have you tried using arenas that give out handles (sometimes indexes) instead of mutable references? It's less convenient and you're not leveraging borrow checking but I would imagine it supports Send well.
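For reference, here is a minimal, generic sketch of that handle-based style (not any particular crate; Arena, Handle, and Node are illustrative names). Handles are plain indices, so they are Copy and Send-friendly, at the cost of an extra indirection and no lifetime checking on the handles themselves:

```rust
// A minimal index-based arena: handles are indices into a Vec rather than
// references, which sidesteps borrow-checker fights over interconnected nodes.
struct Arena<T> {
    items: Vec<T>,
}

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct Handle(usize);

impl<T> Arena<T> {
    fn new() -> Self {
        Arena { items: Vec::new() }
    }
    fn alloc(&mut self, value: T) -> Handle {
        self.items.push(value);
        Handle(self.items.len() - 1)
    }
    fn get(&self, h: Handle) -> &T {
        &self.items[h.0]
    }
    fn get_mut(&mut self, h: Handle) -> &mut T {
        &mut self.items[h.0]
    }
}

// Doubly linked nodes become easy: the links are handles, not references.
struct Node {
    value: i32,
    prev: Option<Handle>,
    next: Option<Handle>,
}

fn main() {
    let mut arena = Arena::new();
    let a = arena.alloc(Node { value: 1, prev: None, next: None });
    let b = arena.alloc(Node { value: 2, prev: Some(a), next: None });
    arena.get_mut(a).next = Some(b);
    assert_eq!(arena.get(b).prev, Some(a));
    println!("{} -> {}", arena.get(a).value, arena.get(b).value);
}
```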
Async/await really desperately needs a garbage collector. (See this talk from RustConf 2025: https://youtu.be/zrv5Cy1R7r4?si=lfTGLdJOGw81bvpu and this blog: https://rfd.shared.oxide.computer/rfd/400)
Rust that uses standard techniques for asynchronous code, or is synchronous, does not. Async/await sucks all the oxygen from asynchronous Rust.
Async/await Rust is a different language, probably more popular, and worth pursuing (for somebody, not me). It already has a runtime and dreadful hacks like [pin](https://doc.rust-lang.org/std/pin/index.html) that are due to the lack of a garbage collector.
Hi -- I'm the one who presented the talk -- honored!
I'm curious how you got to "async Rust needs a [tracing] garbage collector" in particular. While it's true that a lot of the issues here are downstream of futures being passive (which in turn is downstream of wanting async to work on embedded), I'm not sure active futures need a tracing GC. Seems to me like Arc or even a borrow-based approach would work, as long as you can guarantee that the future is dropped before the scope exits (which admittedly isn't possible in safe Rust today [0]).
[0]: https://without.boats/blog/the-scoped-task-trilemma/
The difficulties with async/await seem to me to be with the fact that code execution starts and stops using "mysterious magic", and it is very hard for the compiler to know what is in, and what is out, of scope.
I am by no means an expert on async/await, but I have programmed asynchronously for decades. I tried using async/await in Rust, Typescript and Dart. In Typescript and Dart I just forget about memory and I pretend I am programming synchronously. Managed memory, runtimes, money in the bank, who is complaining? Not me.
\digression{start}
This is where the first problem I had with async/await cropped up. I do not like things that are one thing, and pretend to be another - personally or professionally - and async/await is all about (it seems to me) making asynchronous programming look synchronous. Not only do I not get the point - why? is asynchronous programming hard? - but I find it offensive. That is a personal quibble and not one I expect many others to find convincing
I guess I am complaining....
\digression{end}
In Rust I swiftly found myself jumping through hoops, and having to add lots and lots of "magic incantations" none of which I needed in the other languages. It has been a while, and I have blotted out the details.
Having to keep a piece of memory in scope when the scope itself is not in my control made me dizzy. I have not gone back and used async/await but I have done a lot of asynchronous rust programming since, and I will be doing more.
My push for Rust to bifurcate and become two languages is because async/await has sucked up all the oxygen. Definitely from asynchronous Rust programming, but it has wrecked the culture generally. The first thing I do when I evaluate a new crate is to look for "tokio" in the dependencies - and two out of three times I find it. People are using async/await by default.
That is OK, for another language. But Rust, as it stands, is the wrong choice for most of those things. I am using it for real time audio processing and it is the right choice for that. But (e.g) for the IoT lighting controller [tapo](https://github.com/mihai-dinculescu/tapo) it really is not.
I am resigned to my Cassandra role here. People like your good self (much respect for your fascinating talk, much envy for your exciting job) are going to keep trying to make it work. I think it will fail. It is too hard to manage memory like Rust does with a borrow checker with a runtime that inserts and runs code outside the programmer's control. There is a conflict there, and a lot of water is going under the bridge and money down the drain before people agree with me and do what I say...
Either that or I will be proved wrong
Lastly I have to head off one of the most common, and disturbing, counter (non) arguments: I absolutely do not accept that "so many smart people are using it it must be OK". Many smart people do all sorts of crazy things. I am old enough to have done some really crazy things that I do not like to recall, and anyway, explain Windows - smart people doing stupid things if ever
> Having acknowledged that pointers can be 'disguised' as integers, it is then inevitable that Alloy must be a conservative GC
C# / dotnet don't have this issue. The few times I've needed a raw pointer to an object, first I had to pin it, and then I had to make sure that I kept a live reference to the object while native code had its pointer. This is "easier done than said" because most of the time it's passing strings to native APIs, where the memory isn't retained outside of the function call, and there is always a live reference to the string on the stack.
That being said, because GC (in this implementation) is opt-in, I probably wouldn't mix GC and pointers. It's probably easier to drop the requirement to get a pointer to a Gc<T> instead of trying to work around such a narrow use case.
Also, Rust is not going to allow, in the long run, that pointers can in fact be disguised as integers. There is this thing called pointer provenance, and some day all pointers will be required to have provenance (i.e. a proof of where they came from) OR they will be required to admit that POOF, this is a pointer out of thin air, and you can't assume anything about the pointee. As long as there are no POOF magicians, the GC can assume that it knows every reference!
Even their so-called conservative assumption is insufficient.
> if a machine word's integer value, when considered as a pointer, falls within a GCed block of memory, then that block itself is considered reachable (and is transitively scanned). Since a conservative GC cannot know if a word is really a pointer, or is a random sequence of bits that happens to be the same as a valid pointer, this over-approximates the live set
Suppose I allocate two blocks of memory, convert their pointers to integers, then store the values `x` and `x^y`. At this point, no machine word points to the second allocation, and so the GC would consider the second allocation to be unreachable. However, the value `y` could be computed as `x ^ (x^y)`, converted back to a pointer, and accessed. Therefore, their reachability analysis would under-approximate the live set.
If pointers and integers can be freely converted to each other, then the GC would need to consider not just the integers that currently exist, but also every integer that could be produced from the integers that currently exist.
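A sketch of that scenario in Rust (only the address arithmetic is shown; actually dereferencing the reconstructed pointer is exactly the integer-to-pointer round trip that provenance rules restrict):

```rust
fn main() {
    // Two heap allocations a conservative GC would want to keep alive.
    let a = Box::new(1u64);
    let b = Box::new(2u64);

    let x = Box::into_raw(a) as usize; // address of the first block
    let y = Box::into_raw(b) as usize; // address of the second block

    // Store only x and x ^ y. A scanner looking for words that "look like
    // pointers" sees x (which hits block A), but x ^ y is just noise, so
    // block B appears unreachable even though its address is recoverable:
    let stored = [x, x ^ y];
    let recovered = stored[0] ^ stored[1]; // == y again
    assert_eq!(recovered, y);

    // Clean up via the original raw pointers.
    unsafe {
        drop(Box::from_raw(x as *mut u64));
        drop(Box::from_raw(y as *mut u64));
    }
}
```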
I find the idea of provenance a bit abstract so it's a lot easier to think about a concrete pointer system that has "real" provenance: CHERI. In CHERI all pointers are capabilities with a "valid" tag bit (it's out-of-band so you can't just set it to 1 arbitrarily). As soon as you start doing raw bit manipulation of the address the tag is cleared and then it can be no longer used as a pointer. So this problem doesn't exist on CHERI.
Also the problem of mistaking integers as pointers when scanning doesn't exist either - you can instead just search for memory where the tag bit is set.
What you're describing is not just a problem with GC, but pointers in general. Optimizers would choke on exactly the same scheme.
What compiler writers realized is that pointers are actually not integers, even though we optimize them down to be integers. There's extra information in them we're forgetting to materialize in code, so-called "pointer provenance", that optimizers are implicitly using when they make certain obvious pointer optimizations. This would include the original block of memory or local variable you got the pointer from as well as the size of that data.
For normal pointer operations, including casting them to integers, this has no bearing on the meaning of the program. Pointers can lower to integers. But that doesn't mean constructing a new pointer from an integer alone is a sound operation. That is to say, in your example, recovering the integer portion of y and casting it to a pointer shouldn't be allowed.
There are two ways in which the casting of integers to pointers can be made a sound operation. The first would be to have the programmer provide a suitably valid pointer with the same or greater provenance as the one that provided the address. The other, which C/C++ went with for legacy reasons, is to say that pointers that are cast to integers become 'exposed' in such a way that casting the same integer back to a pointer successfully recovers the provenance.
If you're wondering, Rust supports both methods of sound int-to-pointer casts. The former is uninteresting for your example[0], but the latter would work. The way that 'exposed provenance' would lower to a GC system would be to have the GC keep a list of permanently rooted objects that have had their pointers cast to integers, and thus can never be collected by the system. Obviously, making pointer-to-integer casts leak every allocation they touch is a Very Bad Idea, but so is XORing pointers.
Ironically, if Alloy had done what other Rust GCs do - i.e. have a dedicated Collect trait - you could store x and x^y in a single newtype that transparently recovers y and tells the GC to traverse it. This is the sort of contrived scenario where insisting on API changes to provide a precise collector actually gets what a conservative collector would miss.
[0] If you're wondering what situations in which "cast from pointer and int to another pointer" would be necessary, consider how NaN-boxing or tagged pointers in JavaScript interpreters might be made sound.
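Roughly, the two sound routes described above look like this with Rust's provenance APIs (these were stabilized only recently, and as a comment below notes, exposed provenance was unstable for a long time, so treat the exact names as toolchain-dependent):

```rust
fn main() {
    let boxed = Box::new(42u32);
    let p: *const u32 = &*boxed;

    // Route 1 (strict provenance): derive a new pointer from an existing,
    // valid one, so provenance is carried along explicitly.
    let addr = p.addr();       // the address, with no provenance attached
    let q = p.with_addr(addr); // reuse p's provenance at that address
    assert_eq!(unsafe { *q }, 42);

    // Route 2 (exposed provenance, the C/C++-style escape hatch): mark the
    // provenance as exposed when casting to an integer, then reconstruct
    // a pointer later from the bare integer.
    let exposed: usize = p.expose_provenance();
    let r: *const u32 = std::ptr::with_exposed_provenance(exposed);
    assert_eq!(unsafe { *r }, 42);
}
```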
A Gc<T> that can't give you a pointer inside seems almost unusable in the context of Rust. Pointers are not a narrow use case; references are pointers.
Rust APIs are largely built around references. If you were to put a Vec<T> (dynamic array) into a pointerless Gc<T>, you would be almost entirely unable to access its contents. The only way to access it would be to swap it with an empty Vec, access it, then swap it back, a la Cell. You wouldn't even be able to clone the Vec without storing a dummy version in its place during the call.
https://doc.rust-lang.org/stable/std/cell/struct.Cell.html#m...
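For anyone who hasn't run into it, this is the dance Cell already forces on non-Copy contents, and what every access through a pointerless Gc<Vec<T>> would look like (a minimal sketch):

```rust
use std::cell::Cell;

fn main() {
    // Cell<T> hands out no references to its contents, so the only way to
    // inspect a non-Copy value is to move it out and put something back.
    let slot: Cell<Vec<i32>> = Cell::new(vec![1, 2, 3]);

    // Swap in a dummy, use the real value, swap it back.
    let v = slot.replace(Vec::new()); // leave an empty Vec behind
    let sum: i32 = v.iter().sum();
    slot.set(v);                      // put the real Vec back

    println!("{sum}");
}
```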
You miss the point: I'm referring to cases where you pass pointers from one language into another. In that case, because GC is opt-in, it's the wrong approach for managing whatever you're passing into the non-Rust language.
In your case, do you need to get a pointer to a Gc<T> and use it within Rust? I haven't worked with Rust at that level yet, so perhaps I'm ignorant of a more common use case.
Worse, conservatism in a GC further implies it can't be a moving GC, which means you can't compact, use bump pointer allocation, and so on. It keeps you permanently behind the frontier.
I remain bitterly disappointed that so much of the industry is so ignorant of the advances of the past 20 years. It's like it's 1950 and people are still debating whether their cloth and wood airplanes should be biplanes or triplanes.
The thing I don't understand is why anyone would pass a pointer to a GC'ed object into a 3rd party library (that's in a different language) and expect the GC to track the pointer there?
Passing memory into code that uses a different memory manager is always a case where automatic memory management shouldn't be used. I.e., when I'm using a 3rd party library in a different language, I don't expect it to know enough about my language's memory model to be able to effectively clean up pointers that I pass to it.
Yea, but, this is Rust. How is a moving GC supposed to handle an untagged union? Or a person who uses the now-stable provenance api to read/write pointer bits to/from disk.
I don’t understand the desire to staple a GC into Rust.
If you want this, you might just…want a different language? Which is fine and good! Putting a GC on Rust feels like putting 4WD tyres on a Ferrari sports car and towing a caravan with it. You could (maybe) but it feels like using the wrong tool for the job.
Adding a GC to Rust might honestly be easier than getting the OCaml ecosystem to adopt something that works as well as cargo. It's tragic, but that's the world we live in.
Of all the things I'd change about OCaml, dune is very much down the list. It's flat out better than cargo in that it's an actual build system with build rules driven by file dependencies, not simply a glorified frontend for a compiler. Not great, but better.
Now, upstreaming OxCaml's unboxed types and stack allocations? That might actually take longer than adding a GC to Rust.
If I understand the article correctly it's for those cases where you want memory safety (i.e. not using "unsafe") but where the borrow checker is really hard to work with such as a doubly linked list, where nodes can point to each other.
A doubly linked list is not the optimal case for GC. It can be implemented with some unsafe code, and there are approaches that implement it safely with GhostCell (or similar facilities, e.g. QCell) plus some zero-overhead (mostly) "compile time reference counting" to cope with the invariants involved in having multiple references simultaneously "own" the data. See e.g. https://github.com/matthieu-m/ghost-collections for details.
Where GC becomes necessary is the case where even static analysis cannot really mitigate the issue of having multiple, possibly cyclical references to the same data. This is actually quite common in some problem domains, but it's not quite as simple as linked lists.
I just foresee it becoming irrevocably viral, as it becomes the "meh, easier" option, and then suddenly half your crates depend on it, and then you're losing one of the major advantages of the language.
Worth highlighting: library-level GC would not be convenient enough to use pervasively in Rust anyway. Library-level GC does not replace Rust's "point".
And there's a huge benefit in being able to use a GC narrowly. GCs can be useful in gamedev, but it's a terrible tradeoff to need to use a GC'd language to get them, because then everything is GC'd. Library-level GC lets you GC the handful of things that need to be GC'd, while the bulk of your program uses normal, efficient memory management.
This is a very important point: careful use of GCs for a special subset of allocations that, say, have tricky lifetimes for some reason and aren't performance-critical could have a much smaller impact on overall application performance than people might otherwise expect.
One clear use case for GC in Rust is for implementing other languages (eg writing a JS engine). When people ask why SpiderMonkey hasn't been rewritten in Rust, one of the main technical blockers I generally bring up is that safe, ergonomic, performant GC in Rust still appears to be a major research project. ("It would be a whole lot of work" is another, less technical problem.)
For a variety of reasons I don't think this particular approach is a good fit for a JS engine, but it's still very good to see people chipping away at the design space.
Would you plug Boehm GC into a first class JS engine? No? Then you're not using this to implement JS in anything approaching a reasonable manner either.
The mechanisms that Rust provides for memory management are varied. Having a GC as a library for use cases with shared ownership / handles-to-resources is not out of the question. The problem is that they have been hard to integrate with the language.
While you're of course correct, there's just something that feels off. I'd love it if we kept niche, focused-purpose languages once in a while, instead of having every language do everything. If you prioritize everything, you prioritize nothing.
Rust's memory safety guarantees do not ensure the absence of leaks. However, Rust's design does offer significant protection against leaks (relative to languages like C where all heap allocations must be explicitly freed).
https://doc.rust-lang.org/std/boxed/struct.Box.html#method.l...
The fact that anyone felt it necessary to add a "leak" function to the standard library should tell you something about how easy it is to accidentally leak memory.
Yeah; I wished they'd gone the other way, and made memory leaks unsafe (yes, this means no Rc or Arc). That way, you could pass references across async boundaries without causing the borrow checker to spuriously error out.
(It's safe to leak a promise, so there's no way for the borrow checker to prove an async function actually returned before control flow is handed back to the caller.)
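Two ways leaking happens in completely safe Rust, for anyone who wants to see it concretely (Box::leak is the standard-library function mentioned above; the Rc cycle is the classic accidental case):

```rust
use std::cell::RefCell;
use std::rc::Rc;

struct Node {
    next: RefCell<Option<Rc<Node>>>,
}

fn main() {
    // 1. Explicit leaking is a safe, first-class operation.
    let forever: &'static mut Vec<i32> = Box::leak(Box::new(vec![1, 2, 3]));
    forever.push(4);

    // 2. A reference-counted cycle also leaks, with no `unsafe` in sight.
    let a = Rc::new(Node { next: RefCell::new(None) });
    let b = Rc::new(Node { next: RefCell::new(Some(Rc::clone(&a))) });
    *a.next.borrow_mut() = Some(Rc::clone(&b));
    // a and b now keep each other alive; their memory is never reclaimed.
}
```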
Same as with GC, neither need be a fixed choice; having a GC library/feature in Rust wouldn't mean that everything will be and must be GC'd; and it's still possible to add unleakable types were it desired: https://rust-lang.github.io/keyword-generics-initiative/eval... while keeping neat things like Rc<T> available for things that don't care. (things get more messy when considering defaults and composability with existing libraries, but I'd say that such issues shouldn't prevent the existence of the options themselves)
No one seems to have called it out yet, but Swift uses a form of garbage collection yet remains relatively fast. I was against this at first, but the more I think about it, the more I think it has real potential to make lots of hard problems with ownership easier to solve. I think the next big step, or perhaps an alternative, would be to make changes to restrictions in unsafe Rust.
I think the pursuit of safety is a good goal and I could see myself opting into garbage collection for certain tasks.
Reference-counted pointers can dereference an object (via a strong pointer) without checking the reference count. The reference count is accessed only on operations like clone, destruction, and such. That being said, access via a weak pointer does require a reference count check.
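The Rust equivalent of that behavior, as a small sketch with Rc (Swift's ARC differs in the details, but the clone/drop-only accounting is the same idea):

```rust
use std::rc::Rc;

fn main() {
    let s = Rc::new(String::from("hello"));
    assert_eq!(Rc::strong_count(&s), 1);

    // Plain access goes through Deref and never looks at the count.
    for _ in 0..1_000 {
        let _len = s.len();
    }
    assert_eq!(Rc::strong_count(&s), 1);

    // Only creating and dropping handles touches the count.
    let t = Rc::clone(&s);
    assert_eq!(Rc::strong_count(&s), 2);
    drop(t);
    assert_eq!(Rc::strong_count(&s), 1);

    // A weak handle, by contrast, must check liveness on every upgrade.
    let w = Rc::downgrade(&s);
    assert!(w.upgrade().is_some());
}
```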
> Slows down every access to objects as reference counts must be maintained
Definitely not every access. Between an "increase refcount" and a "decrease refcount" you can access an object as many times as you want.
Also:
- static analysis can remove increase/decrease pairs.
- Swift structs are value types, and not reference counted. That means Swift code can have fewer reference-counted objects than similar Java code has garbage-collected objects.
It does perform slower than GC-ed languages or languages such as C and Rust, but it is easier to write [1] than Rust and C and needs less memory than GC-ed languages.
[1] The latest Swift is a lot more complex than the original Swift, but high-level code still can be reasonably easy.
While I'm not ideologically opposed to GC in Rust I have to note:
- the syntax is hella ugly
- GC needs some compiler machinery, like precise GC root tracking with stack maps, space for tracking visited objects, type info, read/write barriers, etc. I don't know how you would retrofit this into Rust without doing heavy-duty brain surgery on the compiler. You can do conservative GC without that, but that's kinda lame.
Building C (and C++) is often a nightmare.
Is there a real distinction between any of those?
They may be larger because they are doing more work; it depends on the program.
But no, they don't statically compile everything.
Yes, because it defeats borrow checking.
Unsafe Rust, used directly, works too
What a good idea
Creating pointers without provenance is safe, so a GC that wants to be sound can't assume that a program won't have them. This will always be an issue.
You can only freely convert integers to pointers with "exposed provenance" in Rust which is currently unstable.
https://doc.rust-lang.org/std/ptr/index.html#exposed-provena...
For the rest you'd still use non-GC rust.
It's useful to have when you have complex graph structures. Or when implementing language runtimes. I've written a bit about these types of use cases in https://manishearth.github.io/blog/2021/04/05/a-tour-of-safe...
Rust has loads of other advantages over C++, though.
Slows down every access to objects as reference counts must be maintained
Something weird that I never bothered with to enable circular references