Intentionally or not, this post demonstrates one of the things that makes safer abstractions in C less desirable: the shared pointer implementation uses a POSIX mutex, which means it’s (1) not cross platform, and (2) pays the mutex overhead even in provably single-threaded contexts. In other words, it’s not a zero-cost abstraction.
C++’s shared pointer has the same problem; Rust avoids it by having two types (Rc and Arc) that the developer can select from (and which the compiler will prevent you from using unsafely).
> the shared pointer implementation uses a POSIX mutex [...] C++’s shared pointer has the same problem
It doesn't. C++'s shared pointers use atomics, just like Rust's Arc does. There's no good reason (unless you have some very exotic requirements, into which I won't get into here) to implement shared pointers with mutexes. The implementation in the blog post here is just suboptimal.
(But it's true that C++ doesn't have Rust's equivalent of Rc, which means that if you just need a reference counted pointer then using std::shared_ptr is not a zero cost abstraction.)
The primary exotic thing I can imagine is an architecture lacking the ability to do atomic operations. But even in that case, C11 has atomic operations [1] built in. So worst case, the C library for the target architecture would likely boil down to mutex operations.
Unfortunately, for C++, thats not true. At least with glibc and libstdc++, if you do not link with pthreads, then shared pointers are not thread-safe. At runtime it will do a symbol lookup for a pthreads symbol, and based off the result, the shared pointer code will either take the atomic or non-atomic path.
I'd much rather it didnt try to be zero-cost and it always used atomics...
True, but that's a fault of the implementation, which assumes POSIX is the only thing in town & makes questionable optimization choices, rather that of the language itself
The number of times I might want to write something in C and have it less likely to crash absolutely dwarfs the number of times I care about that code being cross-platform.
Sure, cross-platform is desirable, if there's no cost involved, and mandatory if you actually need it, but it's a "nice to have" most of the time, not a "needs this".
As for mutex overheads, yep, that's annoying, but really, how annoying ? Modern CPUs are fast. Very very fast. Personally I'm far more likely to use an os_unfair_lock_t than a pthread_mutex_t (see the previous point) which minimizes the locking to a memory barrier, but even if locking were slow, I think I'd prefer safe.
Rust is, I'm sure, great. It's not something I'm personally interested in getting involved with, but it's not necessary for C (or even this extra header) to do everything that Rust can do, for it to be an improvement on what is available.
There's simply too much out there written in C to say "just use Rust, or Swift, or ..." - too many libraries, too many resources, too many tutorials, etc. You pays your money and takes your choice.
> As for mutex overheads, yep, that's annoying, but really, how annoying ?
For this use-case, you might not notice. ISTR, when examing the pthreads source code for some platform, that mutexes only do a context switch as a fallback, if the lock cannot be acquired.
So, for most use-cases of this header, you should not see any performance impact. You'll see some bloat, to be sure.
> There's simply too much out there written in C to say "just use Rust, or Swift, or ..." - too many libraries, too many resources, too many tutorials, etc.
There really isn't. Speaking as someone who works in JVM-land, you really can avoid C all the time if you're willing to actually try.
> Intentionally or not, this post demonstrates one of the things that makes safer abstractions in C less desirable: the shared pointer implementation uses a POSIX mutex, which means it’s (1) not cross platform, and (2) pays the mutex overhead even in provably single-threaded contexts. In other words, it’s not a zero-cost abstraction.
It's an implementation detail. They could have used atomic load/store (since c11) to implement the increment/decrement.
TBH I'm not sure what a mutex buys you in this situation (reference counting)
I'd think a POSIX mutex--a standard API that I not only could implement anywhere, but which has already been implemented all over the place--is way more "cross platform" than use of atomics.
To lift things up a level: I think a language’s abstractions have failed if we even need to have a conversation around what “cross platform” really means :-)
If you're targeting a vaguely modern C standard, atomics win by being part of the language. C11 has atomics and it's straightforward to use them to implement thread-safe reference counting.
Rust pays the cumbersome lifetime syntax tax even in provably single threaded contexts. When will Rust develop ergonomics with better defaults and less boilerplate in such contexts?
Tecnhically the mutex refcounting example is shown as an example of the before the header the author is talking about. We don't know what they've chosen to implement shared_ptr with.
A recent superpower was added by Fil aka the pizlonator who made C more Fil-C with FUGC, a garbage collector with minimal adjustments to existing code, turning it into a memory safe implementation of the C and C++ programming languages you already know and love.
Because about 99% of the time the garbage collect is a negligible portion of your runtime at the benefit of a huge dollop of safety.
People really need to stop acting like a garbage collector is some sort of cosmic horror that automatically takes you back to 1980s performance or something. The cases where they are unsuitable are a minority, and a rather small one at that. If you happen to live in that minority, great, but it'd be helpful if those of you in that minority would speak as if you are in the small minority and not propagate the crazy idea that garbage collection comes with massive "performance penalties" unconditionally. They come with conditions, and rather tight conditions nowadays.
IDK about Fil-C, but in Java garbage collector actually speeds up memory management compared to C++ if you measure the throughput. The cost of this is increased worst-case latency.
A CLI tool (which most POSIX tools are) would pick throughput over latency any time.
This feels like a misrepresentation of features that actually matter for memory safety. Automatically freeing locals and bounds checking is unquestionably good, but it's only the very beginning.
The real problems start when you need to manage memory lifetimes across the whole program, not locally. Can you return `UniquePtr` from a function? Can you store a copy of `SharedPtr` somewhere without accidentally forgetting to increment the refcount? Who is responsible for managing the lifetimes of elements in intrusive linked lists? How do you know whether a method consumes a pointer argument or stores a copy to it somewhere?
I appreciate trying to write safer software, but we've always told people `#define xfree(p) do { free(p); p = NULL; } while (0)` is a bad pattern, and this post really feels like more of the same thing.
It is surprisingly hard to find information about it, do you have any ? From what I can guess it's a new syntax but it's the feature itself is still an extension ?
The `[[attribute]]` syntax is new, the builtin ones in C23 are `[[deprecated]]`, `[[fallthrough]]`, `[[maybe_unused]]`, `[[nodiscard]]`, `[[noreturn]]`, `[[reproducible]]`, and `[[unsequenced]]`.
C++: "look at what others must do to mimic a fraction of my power"
This is cute, but also I'm baffled as to why you would want to use macros to emulate c++. Nothing is stopping you from writing c-like c++ if that's what you like style wise.
It's interesting to me to see how easily you can reach a much safer C without adding _everything_ from C++ as a toy project. I really enjoyed the read!
Though yes, you should probably just write C-like C++ at that point, and the result sum types used made me chuckle in that regard because they were added with C++17. This person REALLY wants modern CPP features..
> I'm baffled as to why you would want to use macros to emulate c++.
I like the power of destructors (auto cleanup) and templates (generic containers). But I also want a language that I can parse. Like, at all.
C is pretty easy to parse. Quite a few annoying corner cases, some context sensitive stuff, but still pretty workable. C++ on the other hand? It’s mostly pick a frontend or the highway.
No name mangling by default, far simpler toolchain, no dependence on libstdc++, compiles faster, usable with TCC/chibicc (i.e. much more amenable to custom tooling, be it at the level of a lexer, parser, or full compiler).
C’s simplicity can be frustrating, but it’s an extremely hackable language thanks to that simplicity. Once you opt in to C++, even nominally, you lose that.
I highly doubt (and some quick checks seem to verify that) any of the tiny CC implementations will support the cleanup extension that most of this post's magic hinges upon.
Yup. And I like the implication that Rust is 'cross platform', when it's 'tier 1' support consists of 2 architectures (x86 & arm64). I guess we're converging on a world where those 2 + riscv are all that matter to most people, but it's not yet a world where they are all that matter to all people.
In my experience most chips released in the past 10+ years ship with C++ compilers.
Quite frankly I'm not sure why you wouldn't given that most are using GCC on common architectures. The chip vendor doesn't have to do any work unless they are working on an obscure architecture.
How can anyone be this interested in maintaining an annex k implementation when it's widely regarded as a design failure, specially the global constraint handler. There's a reason why most C toolchains don't support it.
FWIW, it's heavily used inside Microsoft and is actually pretty nice when combined with all the static analysis tools that are mandatory parts of the dev cycle.
The problem with macro-laden C is that your code becomes foreign and opaque to others. You're building a new mini-language layer on top of the base language that only your codebase uses. This has been my experience with many large C projects: I see tons of macros used all over the place and I have no idea what they do unless I hunt down and understand each one of them.
Macros are simply a fact of life in any decent-sized C codebase. The Linux kernel has some good guidance to try to keep it from getting out of hand but it is just something you have to learn to deal with.
I think this can be fine if the header provides a clean abstraction with well-defined behaviour in C, effectively an EDSL. For an extreme example, it starts looking like a high-level language:
Nim is a language that compiles to C. So it is similar in principle to the "safe_c.h". We get power and speed of C, but in a safe and convenient language.
> It's finally, but for C
Nim has `finally` and `defer` statement that runs code at the end of scope, even if you raise.
> memory that automatically cleans itself up
Nim has ARC[1]:
"ARC is fully deterministic - the compiler automatically injects destructors when it deems that some variable is no longer needed. In this sense, it’s similar to C++ with its destructors (RAII)"
> automated reference counting
See above
> a type-safe, auto-growing vector.
Nim has sequences that are dynamically sized, type and bounds safe
> zero-cost, non-owning views
Nim has openarray, that is also "just a pointer and a length", unfortunately it's usage is limited to parameters.
But there is also an experimental view types feature[2]
> explicit, type-safe result
Nim has `Option[T]`[3] in standard library
> self-documenting contracts (requires and ensures)
Nim's assert returns message on raise: `assert(foo > 0, "Foo must be positive")`
> safe, bounds-checked operations
Nim has bounds-checking enabled by default (can be disabled)
> The UNLIKELY() macro tells the compiler which branches are cold, adding zero overhead in hot paths.
C++’s shared pointer has the same problem; Rust avoids it by having two types (Rc and Arc) that the developer can select from (and which the compiler will prevent you from using unsafely).
It doesn't. C++'s shared pointers use atomics, just like Rust's Arc does. There's no good reason (unless you have some very exotic requirements, into which I won't get into here) to implement shared pointers with mutexes. The implementation in the blog post here is just suboptimal.
(But it's true that C++ doesn't have Rust's equivalent of Rc, which means that if you just need a reference counted pointer then using std::shared_ptr is not a zero cost abstraction.)
I'd be interested to know what you are thinking.
The primary exotic thing I can imagine is an architecture lacking the ability to do atomic operations. But even in that case, C11 has atomic operations [1] built in. So worst case, the C library for the target architecture would likely boil down to mutex operations.
[1] https://en.cppreference.com/w/c/atomic.html
I'd much rather it didnt try to be zero-cost and it always used atomics...
(for reference, the person above is referring to what's described here: https://snf.github.io/2019/02/13/shared-ptr-optimization/)
Sure, cross-platform is desirable, if there's no cost involved, and mandatory if you actually need it, but it's a "nice to have" most of the time, not a "needs this".
As for mutex overheads, yep, that's annoying, but really, how annoying ? Modern CPUs are fast. Very very fast. Personally I'm far more likely to use an os_unfair_lock_t than a pthread_mutex_t (see the previous point) which minimizes the locking to a memory barrier, but even if locking were slow, I think I'd prefer safe.
Rust is, I'm sure, great. It's not something I'm personally interested in getting involved with, but it's not necessary for C (or even this extra header) to do everything that Rust can do, for it to be an improvement on what is available.
There's simply too much out there written in C to say "just use Rust, or Swift, or ..." - too many libraries, too many resources, too many tutorials, etc. You pays your money and takes your choice.
> We love its raw speed, its direct connection to the metal
If this is a strong motivating factor (versus, say, refactoring risk), then C’s lack of safe zero-cost abstractions is a valid concern.
For this use-case, you might not notice. ISTR, when examing the pthreads source code for some platform, that mutexes only do a context switch as a fallback, if the lock cannot be acquired.
So, for most use-cases of this header, you should not see any performance impact. You'll see some bloat, to be sure.
There really isn't. Speaking as someone who works in JVM-land, you really can avoid C all the time if you're willing to actually try.
It's an implementation detail. They could have used atomic load/store (since c11) to implement the increment/decrement.
TBH I'm not sure what a mutex buys you in this situation (reference counting)
Do you have a source for this? I couldn't find the implementation in TFA nor a link to safe_c.h
Deleted Comment
In any case, you could use the provided primitives to wrap the C11 mutex, or any other mutex.
With some clever #ifdef, you can probably have a single or multithreaded build switch at compile time which makes all the mutex stuff do nothing.
BTW don’t fight C for portability, it is unlikely you will win.
https://news.ycombinator.com/item?id=45133938
https://fil-c.org/
Deleted Comment
This is beautiful!
People really need to stop acting like a garbage collector is some sort of cosmic horror that automatically takes you back to 1980s performance or something. The cases where they are unsuitable are a minority, and a rather small one at that. If you happen to live in that minority, great, but it'd be helpful if those of you in that minority would speak as if you are in the small minority and not propagate the crazy idea that garbage collection comes with massive "performance penalties" unconditionally. They come with conditions, and rather tight conditions nowadays.
A CLI tool (which most POSIX tools are) would pick throughput over latency any time.
The real problems start when you need to manage memory lifetimes across the whole program, not locally. Can you return `UniquePtr` from a function? Can you store a copy of `SharedPtr` somewhere without accidentally forgetting to increment the refcount? Who is responsible for managing the lifetimes of elements in intrusive linked lists? How do you know whether a method consumes a pointer argument or stores a copy to it somewhere?
I appreciate trying to write safer software, but we've always told people `#define xfree(p) do { free(p); p = NULL; } while (0)` is a bad pattern, and this post really feels like more of the same thing.
Yes: you can return structures by value in C (and also pass them by value).
> Can you store a copy of `SharedPtr` somewhere without accidentally forgetting to increment the refcount?
No, this you can't do.
Have we? Why?
C23 didn't introduce it, it's still a GCC extension that needs to be spelled as [[gnu::cleanup()]] https://godbolt.org/z/Gsz9hs7TE
https://en.cppreference.com/w/c/language/attributes.html
https://en.cppreference.com/w/cpp/language/attributes.html
https://gcc.gnu.org/onlinedocs/gcc/Attributes.html
https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attribute...
https://thephd.dev/_vendor/future_cxx/technical%20specificat...
Discussion:
https://thephd.dev/_vendor/future_cxx/papers/C%20-%20Improve...
This is cute, but also I'm baffled as to why you would want to use macros to emulate c++. Nothing is stopping you from writing c-like c++ if that's what you like style wise.
Though yes, you should probably just write C-like C++ at that point, and the result sum types used made me chuckle in that regard because they were added with C++17. This person REALLY wants modern CPP features..
I like the power of destructors (auto cleanup) and templates (generic containers). But I also want a language that I can parse. Like, at all.
C is pretty easy to parse. Quite a few annoying corner cases, some context sensitive stuff, but still pretty workable. C++ on the other hand? It’s mostly pick a frontend or the highway.
C’s simplicity can be frustrating, but it’s an extremely hackable language thanks to that simplicity. Once you opt in to C++, even nominally, you lose that.
(Agree on your other points for what it's worth.)
It's choosing which features are allowed in.
[1] https://doc.rust-lang.org/beta/rustc/platform-support.html
Quite frankly I'm not sure why you wouldn't given that most are using GCC on common architectures. The chip vendor doesn't have to do any work unless they are working on an obscure architecture.
You'll just have to get used to the C++ community screaming at you that it's the wrong way to write C++ and that you should just use Go or Zig instead
Deleted Comment
https://github.com/rurban/safeclib/tree/master/include
https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1967.htm
A global constraint handler is still by far better than dynamic env handlers, and most of the existing libc/POSIX design failures.
You can disable this global constraint handler btw.
Macros are simply a fact of life in any decent-sized C codebase. The Linux kernel has some good guidance to try to keep it from getting out of hand but it is just something you have to learn to deal with.
https://www.libcello.org/
Discussed here https://news.ycombinator.com/item?id=22191790
Nim is a language that compiles to C. So it is similar in principle to the "safe_c.h". We get power and speed of C, but in a safe and convenient language.
> It's finally, but for C
Nim has `finally` and `defer` statement that runs code at the end of scope, even if you raise.
> memory that automatically cleans itself up
Nim has ARC[1]:
"ARC is fully deterministic - the compiler automatically injects destructors when it deems that some variable is no longer needed. In this sense, it’s similar to C++ with its destructors (RAII)"
> automated reference counting
See above
> a type-safe, auto-growing vector.
Nim has sequences that are dynamically sized, type and bounds safe
> zero-cost, non-owning views
Nim has openarray, that is also "just a pointer and a length", unfortunately it's usage is limited to parameters. But there is also an experimental view types feature[2]
> explicit, type-safe result
Nim has `Option[T]`[3] in standard library
> self-documenting contracts (requires and ensures)
Nim's assert returns message on raise: `assert(foo > 0, "Foo must be positive")`
> safe, bounds-checked operations
Nim has bounds-checking enabled by default (can be disabled)
> The UNLIKELY() macro tells the compiler which branches are cold, adding zero overhead in hot paths.
Nim has likely / unlikely template[4]
------------------------------------------------------------
[0] https://nim-lang.org
[1] https://nim-lang.org/blog/2020/10/15/introduction-to-arc-orc...
[2] https://nim-lang.org/docs/manual_experimental.html#view-type...
[3] https://nim-lang.org/docs/options.htm
[4] https://nim-lang.org/docs/system.html#likely.t%2Cbool