I was surprised to see `fmt/format.h` on that list, but I do have to admit that the objections seem reasonable. Perhaps because he(?) mentioned wanting to use -O0. Template code is almost useless without optimization. If -O0 is needed then I am surprised that all of the STL doesn't get pitched.
Ok, I was also surprised to see co-routines on the nice list, but I don't have direct experience there. I normally see complaints about them. I would like them to be good because some code is easier to express that way.
> I was surprised to see `fmt/format.h` on that list, but I do have to admit that the objections seem reasonable
The author talks about the code bloat, because of "an API that encourages custom formatter specification to live in a template". But at the end he mentions the standard solution to this problem:
> A preferable interface (I use, but also others AFAIK) is to check the type in a template (no choice there), and dispatch the formatting routine to somewhere that lives in a single translation unit.
So what prevents you from doing this with <format>? As I understand, the implementations of parse() and format() of std::formatter don't depend on the template parameters and can delegate to non-template functions residing in one CPP file. You can also provide additional wformat_parse_context/wformat_context overloads if you need wchar_t support.
I guess there’s some legitimate complaint about compile time, but the code bloat issue is simply crazy if you use a linker written since ~1995. And the “fix” is simply to move the common code into a .cc file which the author even mentions.
The alternatives are worse: un-type-checked printf or the horrible stream interpreter system (std::cout << “foo”) which was a cute but bad idea in 1985.
The author talked about its effect on compile times and I have to say, I agree with him. It's also why I dislike header-only libraries; they bloat compile times unnecessarily.
Don't get me wrong, the fmt library is very nice, but you can't deny its effect on compile times.
Wondering the same. Seems like you could provide your own implementation or use a third-party implementation. I'd be curious to see a write-up on the bloat and what exactly it looks like.
{fmt} doesn't encourage "custom formatter specification to live in a template". On the contrary, if you look at the docs in https://fmt.dev/latest/api.html#formatting-user-defined-type..., none of the examples is parameterized. One even demonstrates how to define your formatting code in a source file. And if your formatters are so big that they meaningfully impact build speed you are doing something wrong. fmt/core.h is heavily optimized for build speed so you can just use it as a type-safe replacement for *printf. That said, implementations of std::format (especially Microsoft's) may not be as optimized for build speed yet. This will likely improve now that the ABI can be stabilized.
Unfortunately, if you're using the standard library, you get this just by switching to the C++20 mode. For example, the committee decided to put tons of std::ranges-related stuff right in <algorithm>.
It isn't just a nicer API; it's also type-safe and much faster at runtime.
Since I rarely compile all my code at once (usually just a single file followed by a re-link), compile time doesn't matter much. And that's even though, while editing or writing code, I don't have all the slowdown bloat of an IDE, so compile time is more noticeable.
it's partly because of the engine code. there's even bigger stuff, especially if it's a company with any legacy codebase that's 10-20 years old or whatever (e.g. EA / Frostbite.) one i worked on took hours to compile the first time on a machine with 128gb of ram and a threadripper. the onboarding doc suggests getting some coffee at that point haha
a big part of working on them as a generalist ends up being the ability to know how to even navigate something like that (especially since they're often haphazardly documented)
(part of it is that most of the games "fork" the engine rather than using it as a standalone thing)
it's probably not everyone on the team building that whole thing each time, but yea. hundreds of solutions and millions of LOC isn't unusual
*i just did a quick check with unreal's source, it's ~20 million LoC (assuming I didn't mess up the filtering somehow)
The main time killer in day-to-day work on such big projects is usually the linker step, which is terribly slow with the MSVC linker and doesn't benefit from incremental or distributed compilation (not sure though how much the MSVC linker has improved in the last 5 years or so).
It's a lot more common than you'd think. I'm not in gamedev but somewhere similarly weird (multiple supported userspaces and OSes for an embedded device line); our "full build" is probably approaching 50M+ lines, and only quite recently do people do incrementals from build-server snapshots. No bazel or distributed ccache or anything.
Especially for games or OS development, you might have shifting toolchains and SDKs. Different teams may move out of sync because different teams want different things at a given time.
We've been around 1mloc for the Drakensang single-player and MMO games, and that was 15 to 5 years ago, with a relatively small team (up to 20 programmers), and budget-wise far away from what's considered an AAA production.
While I wouldn't want to work on a 10mloc C++ code base either, it sounds totally realistic to me.
most of the builds are incremental, but even incremental builds still take a while when it's that massive (linking, modifying a header / template code, etc)
True, but linker times still suck and don't parallelize well.
Also, sometimes you need to iterate on some very core .h file and touching any of those brings the whole house of cards down and triggers a full or nearly full rebuild.
I'm a junior in uni, and I hate it when I have to say "Yeah, we learned this technique in the C class, but it's UB in C++, so please rewrite that" when reviewing friends' code that does type-punning with unions.
So I'm also very happy with the 'std::bit_cast' in general.
BTW how about std::is_constant_evaluated()?
I assumed it would help folks who do heavy physics simulations, but looks like not listed in the article.
TBF, I have yet to see a C++ compiler where the union type-punning trick doesn't work; there would be a lot of broken code if real-world compilers changed the current behaviour, no matter what the standard says.
Of course now that std::bit_cast exists it's the safe thing to do (but then there's still C code that's compiled in C++ mode which was even recommended by Microsoft because the Visual Studio team couldn't be bothered to keep their C compiler in shape until a little while ago).
> I have yet to see a C++ compiler where the union type punning trick doesn't work
The problem isn’t that compilers won’t implement the feature (that would take more work); the problem is that it’s processor-specific.
The spec doesn't mandate many specific bit-ordering layouts; a few things are mandated (two's complement representation, which was only just added; &obj == &base; I think nullptr has to be 0; etc.), rather than trying to make everything a PDP-11.
Compared to C99, C++20 designated initialization allows no reordering of designators, no chaining of designated initializers, and no array indexing. All those limitations taken together make the C++ designated initialization feature pretty much useless except for the most trivial structs - while in C99, designated initialization really shines with complex, nested structs.
The funny thing is that none of those limitations would be required. Clang had supported full C99 designated init in C++ mode just fine for many years before C++20 appeared.
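A small illustration of what C++20 does and doesn't accept (Point and Rect are made-up types):

```cpp
// C++20 designated initializers: allowed, but only in declaration order,
// with no designator chaining and no array designators.
struct Point { int x, y; };
struct Rect  { Point tl, br; };

// Nesting works by opening a new braced list per member:
constexpr Rect r{ .tl = { .x = 0, .y = 0 }, .br = { .x = 10, .y = 20 } };

// These C99 forms are rejected in C++20:
//   Rect a{ .br = { .x = 1 }, .tl = { .x = 2 } };  // out of declaration order
//   Rect b{ .tl.x = 1 };                           // chained designator
//   int  c[3] = { [1] = 7 };                       // array designator

static_assert(r.br.x == 10 && r.tl.y == 0);
```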
Oh man, I've desperately wanted both of these features in basically every language I've used.
I think you can do a similar sort of array initialization in C#, but definitely not chained initializers. Those are both so useful, but I can see why they aren't included in C++20.
C++ isn't C and has different structure semantics. Members are initialized in the order defined, which means you can write:

    struct foo {
        int a = 0;
        int b = a+1;
    };
If the compiler just did the initialization in the order of declaration, regardless of the order in the initialization list, this would not do what you expect:
    struct obj {
        int a;
        int b;
    };

    int ival = 0;
    auto o = obj {.b = ++ival, .a = ival};
o.a would not equal o.b.
I would like to have the initialization syntax of C because then one could reorder elements (say for packing reasons) and the designated initialization would “just work”…except it wouldn’t.
C++ designated initialization does buy you two things: 1- documentation, but more importantly 2- if you do reorder a struct or class data members the compiler will warn you that your initialization lists are now invalid rather than silently failing. I don’t know how to even find them all in a large code base any other way!
I think an exception might be made for a plain "C-like" struct that doesn't initialize members or contain anything except basic types. In the specific example[0] the code is actually surrounded by extern "C" { ... } so I suppose that the compiler "knows" this is a plain C struct? (Does extern "C" change parsing rules? I will need to look at what GCC does)
Destructors will execute in the reverse of declaration order, so if initialization order doesn't match declaration order, and some members depend on each other somehow, things will break. At the very least, it could be surprising. Not a problem in C, where destructors don't exist.
I think, as usual, this was the compromise that the committee was able to agree on above all objections. There is still the possibility that the rules are relaxed if there is agreement, but somebody has to do the work to push it through standardization.
I also thought that the behaviour as standardized was useless, but recently I started writing more minimalist code eschewing constructors where aggregate initialisation would suffice, and I haven't really missed the ability to reorder initializers or skip them.
Initialization in C++ is already a mess. Making one of the core behaviours (members are initialized in declaration order) work subtly different for this case would make it even more difficult for the programmer to build a correct mental model.
From what I can tell, the snippet you posted would compile fine in C++20 mode.
> Personally, I find code that leverages ranges harder to read, not easier, because lambdas inlined in functions introduce new scopes that have a strong non-linearizing effect on the code. This isn’t a criticism of ranges per se, but certainly is a stylistic preference.
Does anyone know what “non-linearizing” means here?
I assume “code outside the lambda runs first, then code inside the lambda maybe runs later, maybe runs multiple times, maybe doesn’t run at all”.
It can especially create problems when the lambda captures a variable by reference which gets mutated and/or deallocated before the lambda runs, and the developer didn’t plan for mutation or deallocation.
Or (a problem with lambdas, but not “non-linearizing”), if the lambda captures a variable by value (copies the value) and mutates it, and the developer expected the mutation to persist outside the lambda.
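Both pitfalls in one tiny sketch (my own example, not from the article):

```cpp
// Capture by reference: the lambda sees later mutations of x (and would
// dangle if x were destroyed before the call). Capture by value: the
// lambda owns a private copy, so mutations don't escape.
inline int capture_demo() {
    int x = 1;
    auto by_ref = [&x] { return x; };
    auto by_val = [x]() mutable { return ++x; };  // mutates the copy only
    x = 2;
    int r = by_ref();   // sees the mutation: returns 2
    by_val();           // increments its private copy, not x
    return r * 10 + x;  // x is still 2, so this returns 22
}
```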
This was my first encounter with the three-way comparison operator (<=>). Can someone give a practical use case? There must be one for it to be included in the spec, but I'm not seeing it.
But the short answer is that all the other comparison operators are automatically generated from that one if it is defined. So it makes the code simpler. And for many types, <=> isn't much more complicated than the others.
> Signed overflow/underflow remain UB (and it’s understandable that changing this behavior would have dramatic consequences)
I think that the dramatic consequences are only understandable if you succumb to mimetic contagion.
The consequences are real but not dramatic and possibly not even measurable in many workloads.
It just means that you’ll have an extra sign extension (one of the cheapest ops the CPU has) in a subset of your loops, namely the ones that had a 32 bit signed induction variable and the compiler could reason about that variable but only if it also could assume no wrapping. That’s a lot of caveats.
Most loops will be unaffected by making signed integer overflow defined. Anything that’s not in a loop will almost certainly be unaffected by this change. If you use size_t as your indices then you’ll definitely be unaffected.
So yeah. “Dramatic consequences”. I wish folks stopped exaggerating. There’s nothing dramatic here. It’s a fraction of a percent of perf maybe.
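For context, a quick sketch (my example) of the asymmetry as it stands today: unsigned overflow is defined, signed overflow is the UB the optimizer leans on.

```cpp
#include <cstdint>
#include <limits>

// Unsigned arithmetic wraps modulo 2^32 by definition:
constexpr std::uint32_t wrapped = std::numeric_limits<std::uint32_t>::max() + 1u;
static_assert(wrapped == 0);

// The signed equivalent is UB, which a constant expression must reject:
//   constexpr int boom = std::numeric_limits<int>::max() + 1;  // ill-formed
```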
> a 32 bit signed induction variable and the compiler could reason about that variable but only if it also could assume no wrapping.
(Amateur C programmer silly question.) Do I understand it correctly: if we increment the variable (i+10) and use it in an if condition, with UB the compiler could skip that code altogether and assume it will never be reached?
The compiler has to assume that I+10 won’t overflow by virtue of I never being big enough. So, it’ll emit all of the code and UB won’t come into play.
It’s more like this. If you say A[I] where I is 32 bit signed and you’re on a 64 bit target, then this lowers to:
- sign extend I to get a 64-bit value
- multiply it by the size of A’s element type
- add that to A
- then do the access
The last three steps will be just one instruction in the common case on arm and x86. The first step will require a separate instruction on x86.
The compiler can kill the sign extend if it’s sure that the integer value cannot be negative. That’s hard to prove. But you can almost prove it if you see code like:
for (int i = 0; something; ++i)
It looks like i starts out as zero and only grows! So it has to be positive! So if you say A[i] then no sign extend needed!
But wait, what if ++i overflows?
With signed int UB, the compiler can just assume it won’t overflow. And then it can prove that i is nonnegative. And then it can kill the sign extend on those CPUs where it’s not free, like x86.
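Putting that example into compilable form (a hedged sketch; sum, A, and n are made-up names):

```cpp
// A 32-bit signed index into an array on a 64-bit target. Because ++i
// overflowing is UB, the compiler may assume i never wraps negative,
// prove i >= 0, and drop the per-iteration sign extension of i.
inline long sum(const long* A, int n) {
    long s = 0;
    for (int i = 0; i < n; ++i)
        s += A[i];   // i widened to 64 bits for the address computation
    return s;
}
```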
I’m a compiler writer. I know how valuable this optimization is. Namely, it’s the tiniest of benefits on some program/CPU combos. Modern languages like Java or Swift just give this well defined semantics and call it a day because this isn’t a good hill to die on. Fucking up the language isn’t worth 0.3% on some stupid benchmark, period.
Is it just me, or is the worst part of coroutines the lack of tooling around them? Whenever I get a crash in a coroutine, the "stacktrace" is totally useless and doesn't actually show where the crash happened, just some boilerplate code around executing some continuation which doesn't refer to real code that you wrote.
more or less agree, although this issue isn't even really unique to C++. in practice it's still worth it imo, since debugging callback heavy stuff isn't exactly fun either
You'd still have lines referring to a callback you've written. I found that with coroutines, not even a single stack frame refers to my code, other than the one starting the loop.
In my corner of the C++ world though, I am so, so excited for <format> in 6 years or however long it will take us to move to C++20.
https://www.reddit.com/r/cpp/comments/o94gvz/what_happened_w...
Is that really 'an average' for a modern AAA game?
Damn. That's an order of magnitude bigger than I'd imagined.
Just C++: 11,375,669 lines
Total (of everything): 31,379,114 lines
That’s fairly representative of just the tooling side of things for a AAA engine. That’s not counting the logic of the game itself.
GCC maintainers: Hold my Jolt
I hate this about C++! In C you can initialize struct fields in any order, and this allowed us to write nbdkit plugins in a very natural way:
where the order is not related to the order the fields appear in the struct (that has to be maintained for ABI reasons), and not all fields need to appear (the others are initialized with 0/NULL). For C++ we have to do this mess:
https://bugzilla.redhat.com/show_bug.cgi?id=1418328#c3
Anyway, my question is: why is this, C++ people?
We find it pretty useful even with the limitations. Certainly not “pretty much useless.”
[0] https://gitlab.com/nbdkit/nbdkit/-/blob/cd761c9bf770b23f678f...
This would still be a lot better for 99% of real world use cases than requiring the programmer to manually place the items in declaration order.
The problem is, someone thought this was a good idea, but the act of supporting it ruled out a lot of more-useful future improvements.