I was surprised to see this on the HN front page, after so many years. Thanks for sharing it!
Suffice to say: my opinions on this topic have shifted significantly. A decade+ more of programming-in-the-large, and I no longer pay much heed to written-in-prose style guides. Instead, I've found mechanistic "style" enforcement and close-to-live-feedback much more effective for maintaining code quality over time.
A subtext is that I wrote this during a period of work - solo programmer, small company - on a green-field power system microcontroller project; MODBUS comms, CSV data wrangling. I'd opted for C primarily for the appeal of having a codebase I could keep in my head (dependencies included!). There was much in-the-field development, debugging and redeployments, so it was really valuable to have a thin stack, and an easy build process.
So, other than one vendored third-party package, I had total control over that codebase's style. And so, I had the space to consider and evolve my C programming style, reflecting on what I considered was working best for that code.
My personal C code style has since shifted significantly, as well - much more towards older, more-conventional styles.
Still, opinionated, idiosyncratic documents like this - if nothing else - can serve as fun discussion prompts. I'm appreciating all the discussion here!
> Write correct, readable, simple and maintainable software, and tune it when you're done, with benchmarks to identify the choke points
If speed is a primary concern, you can't tack it on at the end, it needs to be built in architecturally. Benchmarks applied after meeting goals of read/maintainability are only benchmarking the limits of that approach and focus.
They can't capture the results of trying and benchmarking several different fundamental approaches made at the outset in order to best choose the initial direction. In this case "optimisation" is almost happening first.
Sometimes the fastest approach may not be particularly maintainable, and that may be just fine if that component is not expected to require maintaining, eg, a pure C bare-metal in a bespoke and one-off embedded environment.
The problem with all of these rules of thumb is that they're vague to the point of being vacuously true. Of course we all agree that "premature optimization is the root of all evil" as Knuth once said, but the saying itself is basically a tautology: if something is "premature", that already means it's wrong to do it.
I'll be more impressed when I see specific advice about what kinds of "optimizations" are premature. Or, to address your reply specifically, what counts as "doing something dumb" vs. what is a "micro-optimization". And, the truth is, you can't really answer those questions without a specific project and programming language in mind.
But, what I do end up seeing across domains and programming languages is that people sacrifice efficiency (which is objective and measurable, even if "micro") for a vague idea of what they consider to be "readable" (today--ask them again in six months).
What I'm specifically thinking of is people writing in programming languages with eager collection types that have `map`, `filter`, etc. methods, and they'll chain four or five of them together because it's "more readable" than a for-loop. The difference in readability is absolutely negligible to any programmer, but they choose to make four extra heap-allocated temporary arrays/lists and iterate over the N elements four or five times instead of once because it looks slightly more elegant (and I agree that it does).
Is it a "micro-optimization" to just opt for the for-loop so that I don't have to benchmark how shitty the performance is in the future when we're iterating over more elements than we thought we'd ever need to? Or is it not doing something dumb? To me, it seems ridiculous to intentionally choose a sub-optimal solution when the optimal one is just as easy to write and 99% (or more) as easy to read/understand.
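To make the trade-off concrete, here's a rough C rendering of the same pattern (illustrative code only, not from the comment above; `prices` and the 100.0 cutoff are made up): the pipeline style allocates a temporary and walks the data twice, while the fused loop does a single pass with no allocation.
    #include <stdlib.h>
    /* "filter then map" style: one heap-allocated temporary, two passes */
    double sum_discounted_pipeline(const double *prices, size_t n)
    {
        double *kept = malloc(n * sizeof *kept);
        if (kept == NULL)
            return 0.0;                      /* error handling elided */
        size_t m = 0;
        for (size_t i = 0; i < n; i++)       /* "filter" */
            if (prices[i] > 100.0)
                kept[m++] = prices[i];
        double sum = 0.0;
        for (size_t i = 0; i < m; i++)       /* "map" + reduce */
            sum += kept[i] * 0.9;
        free(kept);
        return sum;
    }
    /* fused loop: one pass, no temporary */
    double sum_discounted_fused(const double *prices, size_t n)
    {
        double sum = 0.0;
        for (size_t i = 0; i < n; i++)
            if (prices[i] > 100.0)
                sum += prices[i] * 0.9;
        return sum;
    }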
I don't know if this kind of embedded development is still alive. I'm writing firmware for an nRF BLE chip which is supposed to run from a battery, and their SDK uses an operating system. Absolutely monstrous chips with enormous RAM and Flash. Makes zero sense to optimize for anything, as long as the device sleeps well.
A little over 10 years ago I was doing some very resource-constrained embedded programming. We had been using a custom chip with an 8051-compatible instruction set (plus some special-purpose analogue circuitry) with a few hundred bytes of RAM. For a new project we used an ARM Cortex M0, plus some external circuitry for the analogue parts.
The difference was ridiculous - we were actually porting a prototype algorithm from a powerful TI device with hardware floating point. It turned out viable to simply compile the same algorithm with software emulation of floating point - the Cortex M0 could keep up.
Having said all that though: the 8051 solution was so much physically smaller that the ARM just wouldn't have been viable in some products (this was more significant because having the analogue circuitry on-chip limited how small the feature size for the digital part of the silicon could be).
Obviously that was quite a while ago! But even at the time, I was amazed how much difference the simpler chip actually made to the size of the solution. The ARM would have been a total deal breaker for that first project; it would just have been too big. I could certainly believe people are still programming for applications like that where a modern CPU doesn't get a look in.
Probably right in the broader sense, but there are still niches. Eg, for one: space deployments, where sufficiently hardened parts may lag decades behind SOTA and the environ can require a careful balance of energy/heat against run-time.
It's still alive, but pushed down the layers. The OS kernel on top of which you sit still cares about things like interrupt entry latency, which means that stack usage analysis and inlining management has a home, etc... The bluetooth radio and network stacks you're using likely have performance paths that force people to look at disassembly to understand.
But it's true that outside the top-level "don't make dumb design decisions" decision points, application code in the embedded world is reasonably insulated from this kind of nonsense. But that's because the folks you're standing on did the work for you.
if we can believe the datasheet, it's basically a pic12f clone (with 55 'powerful' instructions, most single-cycle) with 512 instructions of memory, a 4-level hardware stack, and 32 bytes of ram, with an internal 20 megahertz clock, 20 milliamps per pin at 5 volts, burning half a microamp in halt mode and 700 microamps at full speed at 3 volts
and it costs less than most discrete transistors. in fact, although that page is the sop-8 version, you can get it in a sot23-6 package too
there are definitely a lot of things you can do with this chip if you're willing to optimize your code. but you aren't going to start with a 30-kilobyte firmware image and optimize it until it fits
yeah it's not an nrf52840 and you probably can't do ble on it. but the ny8a051h costs 1.58¢, and an nrf52840 costs 245¢, 154 times as much, and only runs three times as fast on the kinds of things you'd mostly use the ny8a051h for. it does have a lot more than 154 times as much ram tho
for 11.83¢ you can get a ch32v003 https://www.lcsc.com/product-detail/Microcontroller-Units-MC... which is a 48 megahertz risc-v processor with 2 kilobytes of ram, 16 kilobytes of flash, a 10-bit 1.7 megahertz adc, and an on-chip op-amp. so for 5% of the cost of the nrf52840 you get 50% of the cpu speed, 1.6% of the ram, and 0% of the bluetooth
for 70¢, less than a third the price of the nrf52840, you can get an ice40ul-640 https://www.lcsc.com/product-detail/Programmable-Logic-Devic... which i'm pretty sure can do bluetooth. though it might be saner to hook it up to one of the microcontrollers mentioned above (or maybe something with a few more pins), you can probably fit olof kindgren's serv implementation of risc-v https://github.com/olofk/serv into about a third of it and probably get over a mips out of it. but the total amount of block ram is 7 kilobytes. the compensating virtue is that you have another 400 or so luts and flip-flops to do certain kinds of data processing a lot faster and more predictably than a cpu can. 19 billion bit operations per second and pin-to-pin latency of 9 nanoseconds
so my summary is that there's a lot of that kind of embedded work going on, maybe more than ever, and you can do things today that were impossible only a few years ago
I feel like I probably agree with about 80% of this. It also seems like this would apply fairly well to C++ as well.
One thing that I'll strongly quibble with: "Use double rather than float, unless you have a specific reason otherwise".
As a graphics programmer, I've found that single precision will do just fine in the vast majority of cases. I've also found that it's often better to try to make my code work well in single precision while keeping an eye out for precision loss. Then I can either rewrite my math to try to avoid the precision loss, or selectively use double precision just in the parts where it's needed. I think that using double precision from the start is a big hammer that's often unneeded. And using single precision buys you double the number of floats moving through your cache and memory bandwidth compared to using double precision.
I'm torn both ways on the double issue. On the one hand, doubles are much more widely supported these days, and will save you from some common scenarios. Timestamps are a particular one, where a float will often degrade on a time scale that you care about, and doubles not. A double will also hold any int value without loss (on mainstream platforms), and has enough precision to allow staying in world coordinates for 3D geometry without introducing depth buffer problems.
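A minimal sketch of the timestamp point, assuming elapsed seconds stored directly in a 32-bit float (numbers chosen purely for illustration): past about 2^20 seconds (~12 days), a float's spacing is already 0.125 s, so a millisecond tick simply disappears, while a double still tracks it.
    #include <stdio.h>
    int main(void)
    {
        float  tf = 1048576.0f;   /* 2^20 seconds, roughly 12 days of uptime */
        double td = 1048576.0;
        /* float spacing at this magnitude is 0.125 s, so +1 ms rounds away */
        printf("float:  %.6f\n", tf + 0.001f);   /* prints 1048576.000000 */
        printf("double: %.6f\n", td + 0.001);    /* prints 1048576.001000 */
        return 0;
    }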
OTOH, double precision is often treated as a panacea. If you don't know the precision requirements of your algorithm, how do you know that double precision will work either? Some types of errors will compound without anti-drift protection in ways that are exponential, where the extra mantissa bits from a double will only get you a constant factor of additional time.
There are also current platforms where double will land you in very significant performance problems, not just a minor hit. GPUs are a particularly fun one -- there are currently popular GPUs where double precision math runs at 1/32 the rate of single precision.
Are there C++ libs that use floating points for timestamps? I was under the impression that most stacks have accepted int64 epoch microseconds as the most reasonable format.
The one about not using 'switch' and instead using combined logical comparisons is terrible ... quite opinionated, but that is usually the case with this type of style guide.
As the author 10 years later, I agree. A hard-and-fast rule to ban switch, as that rule seems to advocate, is silly and terrible.
Switch has many valid uses. However, I also often see switch used in places where functional decomposition would've been much better (maintainable / testable / extensible). So I think there's still value in advocating for those switch alternatives, such as that rule's text covers. Not that I agree with everything there either. But, useful for discussion!
I think the fact that graphics cares a lot more about efficiency than about marginal accuracy qualifies as a specific reason. Aside from that and a few select areas like ML, almost any reason to use `float` by default vanishes.
There are popular embedded platforms like STM32 that don't have hardware double support, but do have hardware float support. Using double will cause software double support to be linked and slow down your firmware significantly.
OK, but if you're writing for that kind of platform, you know it. Don't use double there? Sure. "Don't use double on non-embedded code just because such platforms exist" doesn't make sense to me.
Sure, my code could maybe run on an embedded platform someday. But the person importing it probably has an editor that can do a search and replace...
I think your case comes under the "specific reason to use `float`"? If I am writing some code and I need floating point numbers, then without any more context, I will choose `double`. If I have context and the context makes it so `float`s are vastly better, then I will use `float`s.
The issue for me is that unlabeled constants are doubles and they can cause promotion where you don't expect it, leading to double arithmetic and rounding instead of single arithmetic. Minor issue, but hidden behavior.
What's even more annoying is that the *printf functions take in double, which forces you to cast all of the floats you pass in when using -Wdouble-promotion.
Funnily, the example used, i.e. `printf` of single-precision values, is very special. Under the hood, variadic arguments that are `float` are actually converted to `double`. See the `cvtss2sd` in [1].
[1]: https://godbolt.org/z/Yr7Kn4vqr
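A small illustration of both promotion paths (my own example; `report` and the 0.5 constant are made up). With GCC/Clang's -Wdouble-promotion, both lines get flagged:
    #include <stdio.h>
    void report(float ratio)
    {
        /* unsuffixed constant: 0.5 is a double, so the multiply happens in
           double and the result is rounded back to float (use 0.5f instead) */
        float scaled = ratio * 0.5;
        /* variadic call: default argument promotions convert float to double
           anyway -- this is where the cvtss2sd comes from */
        printf("%f\n", scaled);
    }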
Yeah, sometimes as a graphics programmer you don't even want the precision provided by built-in functions! As it has been pointed out though, be careful about error propagation
Try pasting a long URL into a comment describing a method/problem/solution and you’ll see immediately that it doesn’t fit 77 chars and you cannot wrap it. Then due to your hard limit you’ll invent something like “// see explained.txt:123 for explanation” or maybe “https://shrt.url/f0ob4r” it.
There’s nothing wrong with breaking limits if you do that reasonably, cause most limits have edge cases. It’s (Rule -> Goal X) most of the times, but sometimes it’s (Rule -> Issue). Make it (Solution (breaks Rule) -> Goal X), not (Solution (obeys Rule) -> not (Goal X)).
Agree. This 80 character limit stems from a time where terminals could only display comparatively few characters in a line, a limit we haven't had in decades as screen resolutions grew.
Another argument for shorter lines is that it is much harder for us to read any text when lines get too long. There's a reason why we read and write documents in portrait mode, not landscape.
But in sum, I don't think there's a need for creating a hard limit at the 80 character mark. Most code is not indented more than three or four times anyways, and most if not all languages allow you to insert newlines to make long expressions wrap. However, if you occasionally do need to go longer, I think that's completely fine and certainly better than having to bend around an arcane character limit.
> This 80 character limit stems from a time where terminals could only display comparatively few characters in a line, a limit we haven't had in decades as screen resolutions grew.
The 80 char rule has little to do with old monitors. Has to do with ergonomics, and is why any good edited and typeset book will have between 60 and 80 characters per line.
And at the very least, "80-characters-per-line is a de-facto standard for viewing code" has long been wrong. As the post even mentions, 100 and 120 columns have been other popular choices, and thus we don't really have any de-facto standard at all!
My opinion is that line width depends on identifier naming style.
For example, Java often prefers long, explicitly verbose names for classes, fields, methods, and variables.
Another approach is to use short names as much as possible. `mkdir` instead of `create_directory`, `i` instead of `person_index` and so on.
I think that max line length greatly depends on the chosen identifier naming style. So it makes sense to use 100 or 120 for Java and it makes sense to use 72 for Golang.
C code often uses a short naming style, so 72 or 80 should be fine.
They are talking about URLs in comments.
I agree with most of it, and most of the others I might quibble with, but accept.
However, the item to not use unsigned types is vastly stupid! Signed types have far more instances of UB, and in the face of 00UB [1], that is untenable.
It is correct that mixing signed and unsigned is really bad; don't do this.
Instead, use unsigned types for everything, including signed math. Yes, you can simulate two's complement with unsigned types, and you can do it without UB.
On my part, all of my stuff uses unsigned, and when I get a signed type from the outside, the first thing I do is convert it safely, so I don't mix the two.
This does mean you have to be careful in some ways. For example, when casting a "signed" type to a larger "signed" type, you need to explicitly check the sign bit and fill the extension with that bit.
And yes, you need to use functions for math, which can be ugly. But you can make them static inline in a header so that they will be inlined.
The result is that my code isn't subject to 00UB nearly as much.
[1]: https://gavinhoward.com/2023/08/the-scourge-of-00ub/
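For reference, a minimal sketch of that style (my own illustration, not the parent's actual code): "signed" arithmetic carried out on uint32_t, returning a wrap flag the caller can turn into a crash, plus the explicit sign extension mentioned above.
    #include <stdbool.h>
    #include <stdint.h>
    /* add two "signed" values stored in uint32_t; unsigned wrap-around is
       well-defined, and *overflow reports whether the signed interpretation
       overflowed */
    static inline uint32_t s32_add(uint32_t a, uint32_t b, bool *overflow)
    {
        uint32_t r = a + b;                          /* wraps mod 2^32, no UB */
        /* signed overflow iff the operands share a sign bit and the result's
           sign bit differs from it */
        *overflow = (~(a ^ b) & (a ^ r) & 0x80000000u) != 0;
        return r;
    }
    /* widen a "signed" 32-bit value to 64 bits: check the sign bit and fill
       the upper half with it explicitly */
    static inline uint64_t s32_to_s64(uint32_t a)
    {
        uint64_t r = a;
        if (a & 0x80000000u)
            r |= 0xffffffff00000000u;                /* manual sign extension */
        return r;
    }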
Author here, 10 years later -- I agree. I'd remove that rule wholesale in an update of this guide. Unsigned integer types can and should be used, especially for memory sizes.
I would still advocate for large signed types over unsigned types for most domain-level measurements. Even if you think you "can't" have a negative balance or distance field, use a signed integer type so that underflows are more correct.
Although validating bounds would be strictly better, in many large contexts you can't tie validation to the representation, such as across most isolation boundaries (IPC, network, ...). For example, you see signed integer types much more often in service APIs and IDLs, and I think that's usually the right call.
I think with those changes, my disagreement would become a mere quibble.
> I would still advocate for large signed types over unsigned types for most domain-level measurements. Even if you think you "can't" have a negative balance or distance field, use a signed integer type so that underflows are more correct.
I agree with this, but I think I would personally still use unsigned types simulating two's complement that gives the correct underflow semantics. Yeah, I'm a hard egg.
In the vast majority of cases, integer overflow or truncation when casting is a bug, regardless of whether it is undefined, implementation-defined or well-defined behavior. Avoiding undefined behavior doesn't buy you anything.
If you start to fuzz test with UBSan and -fsanitize=integer, you will realize that the choice of integer types doesn't matter much. Unsigned types have the benefit that overflowing the left end of the allowed range (zero) has a much better chance of being detected.
> Avoiding undefined behavior doesn't buy you anything.
This is absolutely false.
Say you want to check if a mathematical operation will overflow. How do you do it with signed types?
Answer: you can't. The compiler will delete any form of check you make because it's UB.
(There might be really clever forms that avoid UB, but I haven't found them.)
The problem with UB isn't UB, it's the compiler. If the compilers didn't take advantage of UB, then you would be right, but they do, so you're wrong.
However, what if you did that same check with unsigned types? The compiler has to allow it.
Even more importantly, you can implement crashes on overflow if you wish, to find those bugs, and I have done so. You can also implement it so the operation returns a bit saying whether it overflowed or not.
You can't do that with signed types.
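To illustrate the difference (assuming a typical optimizing compiler): the classic wrap-around test is UB in the signed case, so at -O2 it is commonly folded down to `b < 0` and the overflow it was meant to catch goes unnoticed; the unsigned version is well-defined and has to be kept.
    /* signed: overflow in a + b is UB, so the compiler may assume it never
       happens and simplify this test away */
    int wrapped_signed(int a, int b)
    {
        return a + b < a;
    }
    /* unsigned: wrap-around is defined behaviour, so this is a reliable test */
    int wrapped_unsigned(unsigned a, unsigned b)
    {
        return a + b < a;
    }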
> If you start to fuzz test with UBSan and -fsanitize=integer, you will realize that the choice of integer types doesn't matter much.
I do this, and this is exactly why I think it matters. Every time they report UB is a chance for the compiler to maliciously destroy your hard work.
> In the vast majority of cases, integer overflow or truncation when casting is a bug, regardless whether it is undefined, implementation-defined or well-defined behavior. Avoiding undefined behavior doesn't buy you anything.
With respect, this is nonsense. With UB, the compiler might remove the line of code entirely. With overflow/underflow/truncation, the results are well-defined and the compiler is not allowed to simply remove the offending line.
I used to agree with this, but I have moved away from compound literals entirely except for global statics/const definitions.
Having a variable and an explicit assignment per member leads to a better debug experience imo: you can set breakpoints, single-step each assignment, and have a name to put a watch on.
Hmm, what's the point of single-stepping over a simple data assignment though? And when the initialization involves function calls, the debugger will step into those anyway.
One advantage of initialization via compound literals is that you can make the target immutable, and you won't accidentially get any uninitialized junk in unlisted struct members, e.g.:
const vec3 vec = { .x = 1.0, .y = 2.0 };
...vec.z will be default-initialized to zero, and vec doesn't need to be mutable.
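For comparison, both styles side by side (a sketch; `vec3` assumed to be a plain struct of three floats):
    typedef struct { float x, y, z; } vec3;
    void example(void)
    {
        /* designated initializers: target can be const, and the unlisted
           member (z) is zero-initialized, but it's one "step" in a debugger */
        const vec3 a = { .x = 1.0f, .y = 2.0f };
        /* member-by-member assignment: each line is separately steppable
           and watchable, but the variable must stay mutable and nothing
           guards against a forgotten member */
        vec3 b;
        b.x = 1.0f;
        b.y = 2.0f;
        b.z = 0.0f;
        (void)a; (void)b;
    }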
> 80-characters-per-line is a de-facto standard for viewing code. Readers of your code who rely on that standard, and have their terminal or editor sized to 80 characters wide, can fit more on the screen by placing windows side-by-side.
This is one of the silliest practices to still be enforced or even considered in 2024. “Readers” should get a modern IDE/text editor and/or modern hardware.
On my 4K monitor, I use 4-5 vertical splits and 2-3 horizontal splits. The 80 column rule makes each of these splits readable, and allows me to see the full context of a chunk of kernel code or firmware at once. It has nothing to do with "modern" hardware or "modern" IDEs. It has everything to do with fitting the most amount of relevant information that I can on the screen at once, in context, and properly formatted for reading.
The 80 column rule may seem arbitrary, but it really helps analysis. I avoid open source code that ignores it, and I'll ding code that violates it during code review.
If I had code marching off the screen, or rudely wrapped around so it violated spacing, I'd have to reduce the number of splits I used to see it, and that directly impacts my ability to see code in context.
Modern IDEs don't reduce the need to see things in context. It's not a matter of organizing things in drop-down menus, smart tabs, font changes, or magic "refactor" commands. Verifying function contracts in most extant software -- which lacks modern tooling like model checking -- requires verifying these things by hand until these contracts can be codified by static assertions. This, in turn, requires examining function calls often 5-6 calls deep to ensure that the de facto specifications being built up don't miss assumptions made in code deep in the bowels of under-documented libraries.
I'd be terribly upset if I had to try to do this in code that not only missed modern tooling but that was written by a developer who mistakenly believed that "80 columns is for geezers." I freely admit that, at 43, I probably count as a "geezer" to many young developers. But, that doesn't change the utility of this rule.
Violations of contracts in software account for a large percentage of errors in software AND security vulnerabilities. Most of these violations are subtle and easy to miss unless you can see the call stack in context. No developer can keep hundreds of details from code that they did not write in their head with perfect clarity. It's incredibly nice to have uniform style and uniform maximum line lengths. By convention, 80 columns has shown itself to be the most stable of these limits.
Even FAANG companies like Google follow this rule.
Google also uses 100
I'm using a modern IDE and a 32" 4K display, yet I still support this rule. One example where it's particularly convenient is a 3-way merge. Also, if we're talking about IDEs, they often use horizontal space for things like the file tree (project explorer) and other tool windows.
And on a wide display it's very convenient to use the width to put useful ancillary content on there (e.g. docs, company chat, ...). I shouldn't waste half my display on nothing because you won't add line breaks to your code.
Annoyingly, lots of modern websites have very wonky breakpoints/detection and will serve nonsense mobile UIs at what I think are reasonable window widths. E.g. if you consider Bootstrap's "xl" to be desktop, then a UWQHD display (3440x1440) won't get a desktop layout with 3-column (to say nothing of 4-column) layouts, nor may smaller laptops (especially if they're zoomed somewhat).
Au contraire! Considering programming involves a lot of reading, it overlaps with (or even comes from) best practices from ye olde tradition of typesetting: https://en.m.wikipedia.org/wiki/Line_length#:~:text=Traditio.... Aside from books and print magazines and newspapers, we still respect that on web sites when reading is involved, so why should programming be exempt from ergonomics?
Is that true for an average developer, really? Yes, we read lots of manuals, snippets, stackoverflows. But code? One does mostly write code.
And when we do read code, it may lack good naming, structure, comments, clarity, and may be unnecessarily complex or hacky. Where it wraps is something one would care about only in perfect code, if at all. Most editors can smart-wrap and clearly indicate it anyway.
As soon as you collaborate with more people, 80 characters becomes a valid target for line width. Eventually you'll have someone reading your code in a manner that is hardly pleasant with lengths of 200 characters or more:
- Someone using a Braille display
- Someone with a vision impairment (i.e. a high scaling factor; a common occurrence during ageing)
- A group of people that doesn't sit close to the display
- Someone with a low-DPI (or small) display due to the normal workplace being unavailable
While you could, of course, disregard all these scenarios, the sheer amount of people profiting from or requiring a character limit on lines is usually grounds for a restrictive policy regarding this topic.
You might consider it silly, but as long as there is no reliable way to convert between these "styles of presentation" you will find that many people prefer to err on the safe side.
IMHO if the 80-column limit bothers you in C, you're writing bad C. Quoting the kernel docs, it is "warning you when you’re nesting your functions too deep. Heed that warning".
I remember reading this for the first time as a teenager: "if you need more than 3 levels of indentation, you’re screwed anyway, and should fix your program". Twenty years later, it seems like solid advice to me.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...
The rule is a bit silly, sure, but OTOH I typically have multiple editor tabs open side by side (I don't restrict myself to a hard 80-char line width though, but I have vertical rulers set at 80 and 120 characters in the editor as visual guidance).
This is probably a snarky reply, but here is the serious answer: proportional fonts, with appropriate kerning, are a lot more legible than monospaced fonts. There is a reason why the press moved in that direction once it was technically feasible. But the same people that bring up books as an example of why 80-character line lengths should be enforced would gag at the notion of using proportional fonts for development. It just goes to show that none of these things actually matter; it's just legacy patterns that remain in place from sheer inertia, with really very little relevancy today other than the inertia of the past.
What I was thinking too. There's something in here to offend everyone, and that's probably a good thing.
I'm more like: always use tabs, never use spaces. Code doesn't need to be "aligned"; it's not some ASCII-art masterpiece...
One tab means one indentation level, and if your taste is to have tabs pi chars wide, nice! But it won't mess up my code.