Sadly, this is representative of the new trend of C++ programmers.
I will get downvoted, hated, and laughed at, but I do prefer the original version of the code.
It is infinitely simpler and more readable, and the layers of abstraction added are of pretty much no value here.
If I stumbled across the latter version, I would just scratch my head. And I've seen sooo many projects just end up in a big code bloat just because some programmers wanted to add tons of "new" features like this that add nothing and just complexify the code for no reason.
It's simpler at the expense of being more bug prone. Notice that the "simpler" struct has no housekeeping code to "delete[]" the memory. The programmer now has to manually "manage" that cleanup somewhere.
>, and the layers of abstraction added are of pretty much no value here.
The "value" he's addressing from the "simpler" but more fragile code:
1) remove bug-prone code which requires manual synchronization (copy & paste) of pointer types between the lines defining the variables and the subsequent lines casting the pointers from a raw buffer: "mesh->positions = (Vector3 *)mesh->memory_block;"
2) remove brittle manual arithmetic of pointer offsets embedded inside contiguous arrays such as: "mesh->uvs = (Vector2 *)(mesh->memory_block + positions_size + indices_size);"
Calculating offsets (and/or enforcing logical boundaries) inside of arrays is very error prone; the OpenSSL Heartbleed bug was an example of that type of defect.
3) manage the release of memory using unique_ptr. Obviously, it is to help the programmer (and other coworkers using Mesh data structs) avoid memory leaks.
>features like this that add nothing and just complexify the code for no reason.
It doesn't look like "no reason" to me. The attempt was to reduce bugs, and add memory safety.
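For readers who skipped the article, the "original version" being debated is roughly this (a reconstruction from the quoted lines above; names like init_mesh and positions_size are my guesses, not the article's exact code):

#include <cstddef>
#include <cstdint>

struct Vector2 { float x, y; };
struct Vector3 { float x, y, z; };

struct Mesh {
    char*    memory_block;
    Vector3* positions;
    int32_t* indices;
    Vector2* uvs;
    int32_t  num_verts;
    int32_t  num_indices;
};

// Every pointer below has to be kept in sync with the member declarations
// and with the offset arithmetic by hand.
void init_mesh(Mesh* mesh, int32_t num_verts, int32_t num_indices)
{
    const size_t positions_size = num_verts * sizeof(Vector3);
    const size_t indices_size   = num_indices * sizeof(int32_t);
    const size_t uvs_size       = num_verts * sizeof(Vector2);

    mesh->num_verts    = num_verts;
    mesh->num_indices  = num_indices;
    mesh->memory_block = new char[positions_size + indices_size + uvs_size];

    mesh->positions = (Vector3 *)mesh->memory_block;
    mesh->indices   = (int32_t *)(mesh->memory_block + positions_size);
    mesh->uvs       = (Vector2 *)(mesh->memory_block + positions_size + indices_size);
    // ...and somebody still has to remember to delete[] memory_block later.
}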
I don't understand how you can say that with a straight face. The ArrayView class is incredibly simple and takes about 10 seconds to understand. The "complex" stuff is in the make_contiguous code, which in the end is simply adding integers taking into account alignment requirements (which you still need to understand if you want to write the first version in the first place).
And seriously, did you read the initial point? The problem is that the first approach must be fully rewritten for every new class with different members that exists in your code. Given how HN is butthurt over off-by-one and out-of-buffer errors, you can't say that the first example can really be debugged easily when you have 30 of them in your code. If you only had the Mesh class in your code and nothing else similar maybe it would not be worth it.. but only maybe.
The final example is incredibly simple to use, hides the complexity in one single point, and prevents you from ever making mistakes outside of it. And seriously, it is not that complex.
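For reference, an ArrayView in this spirit is just a pointer plus a length (a minimal sketch; the article's class may differ in details):

#include <cstddef>

// Non-owning view over a contiguous run of T: stores a pointer and a count,
// never allocates, never frees.
template <typename T>
class ArrayView {
public:
    ArrayView() : data_(nullptr), size_(0) {}
    ArrayView(T* data, size_t size) : data_(data), size_(size) {}

    T&       operator[](size_t i)       { return data_[i]; }
    const T& operator[](size_t i) const { return data_[i]; }

    T*       begin()       { return data_; }
    T*       end()         { return data_ + size_; }
    const T* begin() const { return data_; }
    const T* end()   const { return data_ + size_; }

    size_t size() const { return size_; }

private:
    T*     data_;
    size_t size_;
};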
Sorry, but again I have to disagree with you on every point. I may appear as an old grumpy programmer, but this kind of article exactly falls in what Zed Shaw calls "the expert", and I totally agree with him. (http://zedshaw.com/archive/the-master-the-expert-the-program...)
There are three wrong premises in this article (IMHO):
- It makes you think that using raw pointers (so called 'low level types' in the article) is more error prone than using combined types.
- It makes you think that adding abstraction removes complexity.
- It makes you think that you should combine/factor everything because repeating yourself is forbidden.
These assertions are straight false.
- Adding abstraction does NOT remove complexity. It ADDS complexity, and then HIDES it. The ArrayView class has absolutely no use whatsoever. Every C/C++ programmer is used to accessing arrays through pointers. Creating a class for that just makes a reader wonder "what the hell is this thing?". And then you have to crawl through the code to understand what this thing is. And of course, 3-5 years from now, this ArrayView class will have tripled in size. For something as simple as reading numbers in memory, I don't want any abstraction.
- There is nothing wrong with repeating yourself once in a while => if it improves readability! Factoring too much IS wrong. It makes reading the code a pain.
When I read code, I don't want to read an abstract version of what the code does. I want to grasp the maximum possible information with the minimum noise.
After reading the final version of this Mesh class, I have no idea of its memory layout. I see some ArrayView and a magical create_contiguous_memory function. I'm lost. I don't know where the memory comes from, whether or not I own it, or what will happen if I hand a pointer to it around. Everything is hidden, and that's a pure pain to understand.
Post author here. Sorry you feel that way. I'm surprised you don't think that moving out the messy code into make_contiguous is a win. Isn't it good to factor out clearly reusable code into functions, so you can reuse it? Would you really prefer writing out code like that in a dozen places for a dozen classes that needed contiguous storage, to writing one function, testing it, and then calling it from all those other places?
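(For what it's worth, the "messy" part being factored out is essentially arithmetic like the following. This is a sketch of the idea, treating Vector3/Vector2 as plain float structs; it is not the article's actual make_contiguous.)

#include <cstddef>
#include <cstdint>

// Round an offset up to the next multiple of `alignment` (a power of two).
inline size_t align_up(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

struct Layout {
    size_t positions_offset, indices_offset, uvs_offset, total_size;
};

// Lay out three arrays back to back in one block, padding each start
// to the alignment of the element type that follows.
inline Layout compute_layout(size_t num_verts, size_t num_indices)
{
    Layout l;
    size_t offset = 0;

    l.positions_offset = align_up(offset, alignof(float));     // Vector3
    offset = l.positions_offset + num_verts * 3 * sizeof(float);

    l.indices_offset = align_up(offset, alignof(int32_t));
    offset = l.indices_offset + num_indices * sizeof(int32_t);

    l.uvs_offset = align_up(offset, alignof(float));            // Vector2
    offset = l.uvs_offset + num_verts * 2 * sizeof(float);

    l.total_size = offset;
    return l;
}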
The 'functional people' are currently anti-reusable code. It gets in their way and complexifies their function graph. Which are bad things for them.
As a lifetime C++ guy I find this new paradigm (inline code a-la structured code) brittle and fragile. App assumptions are laced throughout; nothing can be changed without a rewrite. And my go-to code-reading techniques (follow a class through all its incarnations with grep etc) fail - they rarely subclass, just overriding existing classes locally to create callbacks etc.
So I'm not a fan of the new order. I see it's a productivity tool for the app writers out there. Especially with the one-off code that startups write, expecting to do a rewrite once they're funded. It really makes sense for them.
But I weep for the craft.
Hi, Another Game dev here.
Firstly, neat article, thanks for writing it.
I'm curious as to the problem you're exploring/solving here?
"Concatenated struct's in single chunk" is a neat trick. The use case seems a bit broken[1], but I'm assuming that it's from the original video, and your c++-ifying it? Or at least trying to find a nicer way to write this style of code?
If you're going this route, I'd be tempted not to use pointers at all, and just use offsets and accessors.[2]
* Lack of pointers means you can load in 1 read. (endian needs to be watched, and assumes you can get the entire size elsewhere)
* You can also shuffle it around in memory should you be doing something fancy. (defragging heaps might be useful if doing a sandbox)
* Better encapsulation? maybe..
But the basic point I'd like to make is, there is no nice way to write this type of code, because C and C++ give us no nice[3] way of expressing it.
[1] Verts and indices are generally uploaded to a GPU and then discarded, but this has the counts, which are needed CPU-side for draw calls, in the same chunk, so they would need to be copied out before freeing.
[2]
struct Mesh
{
    int32_t num_indices,
            num_verts;

    // inline to this struct..
    // assumes Vector3 is aligned 4 bytes.
    // and that Mesh has been alloced to same alignment.
    //
    // Vector3 positions[num_verts];
    // int32_t indices[num_indices];
    // Vector2 uvs[num_verts];

    inline const Vector3* PositionsBegin() const { return reinterpret_cast<const Vector3*>(this + 1); }
    inline const Vector3* PositionsEnd()   const { return PositionsBegin() + num_verts; }
    inline const int32_t* IndicesBegin()   const { return reinterpret_cast<const int32_t*>(PositionsEnd()); }
    inline const int32_t* IndicesEnd()     const { return IndicesBegin() + num_indices; }
    // etc
};
[3] Maintainable, fast at runtime, low mental friction to people other than the author, etc
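(Allocating one of the structs from [2] could then look roughly like this; the malloc-based strategy, and the assumption that Vector2/Vector3 are plain 4-byte-aligned float structs, are mine, not the comment author's.)

// Assumes the Mesh struct from [2] above.
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <new>

Mesh* create_mesh(int32_t num_verts, int32_t num_indices)
{
    const size_t payload = num_verts   * sizeof(Vector3)   // positions
                         + num_indices * sizeof(int32_t)   // indices
                         + num_verts   * sizeof(Vector2);  // uvs
    void* block = std::malloc(sizeof(Mesh) + payload);     // header + inline arrays
    if (!block)
        return nullptr;
    Mesh* mesh = new (block) Mesh;                          // placement-new the header
    mesh->num_verts   = num_verts;
    mesh->num_indices = num_indices;
    return mesh;                                            // release with std::free()
}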
I agree with you. While I don't code C++ now, it used to be my tool of choice (right back to when it was a C preprocessor) and every time I look in, the language is uglier and conceptually heavier. If there are two ways to solve a problem, C++ will provide three - none of which work on their own without gotchas, none of which play nicely with each other, and each of which will be required for working with some 3rd-party library you need.
I could not agree more with you. You would never write a class like that for a skinning system, it's full of bloat and template guff.
Also, at some point you're going to have to promote this to the GPU, and at that point the original code will have a closer affinity to the GPU code than the nonsense C++. I can forgive this though, as he states he's not a games programmer.
There is a third way... don't write ArrayView<> and put all the "C style" code in the constructor and destructor of a Mesh class. Then you get something you find intuitive and you get exception safety.
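A minimal sketch of that third way (the member names and the non-copyable choice are mine, and as with the original it assumes Vector2/Vector3 are trivial types):

#include <cstddef>
#include <cstdint>

struct Vector2 { float x, y; };
struct Vector3 { float x, y, z; };

class Mesh {
public:
    Mesh(size_t num_verts, size_t num_indices)
        : num_verts_(num_verts), num_indices_(num_indices),
          block_(new char[num_verts * sizeof(Vector3) +
                          num_indices * sizeof(int32_t) +
                          num_verts * sizeof(Vector2)])
    {
        // Same C-style layout as before, but confined to one place.
        positions_ = reinterpret_cast<Vector3*>(block_);
        indices_   = reinterpret_cast<int32_t*>(block_ + num_verts * sizeof(Vector3));
        uvs_       = reinterpret_cast<Vector2*>(block_ + num_verts * sizeof(Vector3)
                                                       + num_indices * sizeof(int32_t));
    }

    ~Mesh() { delete[] block_; }

    // One owner for the block; copying would double-free.
    Mesh(const Mesh&) = delete;
    Mesh& operator=(const Mesh&) = delete;

    Vector3* positions() { return positions_; }
    int32_t* indices()   { return indices_; }
    Vector2* uvs()       { return uvs_; }

private:
    size_t   num_verts_, num_indices_;
    char*    block_;
    Vector3* positions_;
    int32_t* indices_;
    Vector2* uvs_;
};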
Your comment that this is less readable is valid. Readability varies widely with background, skillset, consistency, and a host of other factors. If your team finds C-style code more intuitive, don't write Java style code. If your team finds malloc error prone, use smart pointers.
Perhaps the biggest advantage of C++ is its flexibility, at least compared to the other mainstream languages. This means everyone can write code the way they want and others can use it (a).
(a) OK. There is still a large list of gotchas around types, the preprocessor, header inclusions, namespacing, etc. that you need to know about to be a good citizen, at least until C++ can figure out how to support modules.
The current trend is for C++11 libraries to be header-only, lightweight and specifically without the hacks that have traditionally been required to work around all sorts of compiler issues.
The original version is terrible. C code written in C++. Basically a high-level assembly language. Write all your code this way and soon you'll stumble with all the nastiness of manual memory management.
Code like that is the main reason C++ is still considered an "unsafe" language, despite the fact that correct usage of it, although more verbose and with more abstractions, brings both the benefits of C-like speed and higher-level languages' safety.
>Write all your code this way and soon you'll stumble with all the nastiness of manual memory management.
See, this is what I never quite understand. I use C++ when I want to do C type things. Manage my own memory. Bit twiddle with speed. Avoid all dynamic allocations. Increase cache coherency. Call inline assembler. Use the CRTP to achieve static polymorphism. Design my own containers. Lock free containers with atomics. Etc. If I'm using C++ I want nastiness. I want to get down and dirty.
Why on earth would you ever want to use it as a higher level language? It sucks balls at that. Just go and use a proper, nice, well-designed high level language instead. If I want a language that holds my hand and protects me from the evil pointers, I'm not choosing C++. Using C# or Java (for example) is complete luxury compared to C++.
But apparently, there are a whole bunch of people that do attempt to use C++ for this purpose. Who strive to achieve a language where you never see a raw pointer, but you are still stuck with maybe 30% of the productivity you'd have in a proper higher level language. It boggles my mind. Congratulations, now you have no garbage collection, a crap IDE, you're still wrestling with forward declarations of classes and header files and the preprocessor and linker options and slow compilation and you've now buried the machine under a layer of abstraction. The absolute worst of both worlds.
>brings both the benefits of C-like speed and higher-level languages' safety.
It doesn't. You end up with neither!
The original version is not only bug prone. In some scenarios, both versions have a potential vulnerability. Can you spot it?
In a scenario where users can access elements of a joint array of requested size without limitations, they can access memory outside the array. Memory block's total requested size may overflow size_t. Fortunately, overflow prevention is easy to add to the generic version.
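(A sketch of the kind of guard meant here, assuming the helper accumulates a total byte count before allocating; this is not code from the article.)

#include <cstddef>
#include <limits>
#include <new>

// Grow `total` by count * elem_size, refusing to continue if the
// multiplication or the addition would wrap around size_t.
inline void add_array_bytes(size_t& total, size_t count, size_t elem_size)
{
    const size_t max = std::numeric_limits<size_t>::max();
    if (elem_size != 0 && count > max / elem_size)
        throw std::bad_alloc();        // count * elem_size overflows
    const size_t bytes = count * elem_size;
    if (total > max - bytes)
        throw std::bad_alloc();        // total + bytes overflows
    total += bytes;
}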
> both versions have a potential vulnerability. Can you spot it?
What the hell are you talking about? It's a game.
> In a scenario where users can access elements of a joint array of requested size without limitations, they can access memory outside the array
Welcome to modern computing! A process can access any part of its mapped address space without limitation.
Joke aside, are you suggesting a process should hide memory from itself?
Forget it, I can't even make sense out of your comment.
> Memory block's total requested size may overflow size_t
Are you serious? Then please suggest a fix for C, C++, Java, and whatnot, because nothing in pretty much any language will prevent you from asking for more memory than you can handle. This is not a _bug_, it's just expecting the programmer not to fuck around.
The second version is quite convoluted but it makes sense. But as you say, in this instance I don't think it is a problem that really needs to be fixed when the first one is quite obvious and simple to read.
And to me, the first version does look like someone used to C who comes to write C++ but writes their C++ like it is C. Wrapping it in the second version (as others have pointed out) is safer because it is reusable, and cleans itself up.
Regrettably this trend is all but new. Boost is god knows how old now and just look how popular it is in certain circles.
Quite. It's been going on from the outset. Basically they got it wrong and they've been hacking at it ever since in a vain attempt to get it right. They should have killed the C linking compatibility from the outset (header files are devil spawn), and they should have had a native string type as well.
D represents a much better attempt at what C++ ought to have been.
I watched the linked original video to understand the motivation, and read the article, and I think this is a misguided solution to a real problem.
These "joint allocations" seem to be "a small pool of heterogenous types." The motivation is that heap allocations are expensive, so we want to coalesce them. The only way this would matter at all is if we are creating lots of these Mesh objects.
The given solutions make one heap allocation per Mesh object, as opposed to three. Create two million Meshes and you get two million allocations instead of six million.
Instead, one pool should be created for each of the three array types: Vector2, int, and Vector3. Now, create six million Meshes and you can do somewhere between three and a few dozen allocations, depending on your pool's initial size and growth strategy.
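A sketch of that direction, using plain vectors as the simplest possible per-type "pools" and offsets instead of owning pointers (a real pool would allocate in chunks so growth never relocates existing data; the names here are mine):

#include <cstddef>
#include <cstdint>
#include <vector>

struct Vector2 { float x, y; };
struct Vector3 { float x, y, z; };

// One backing pool per element type, shared by every Mesh.
struct MeshPools {
    std::vector<Vector3> positions;
    std::vector<int32_t> indices;
    std::vector<Vector2> uvs;
};

// A Mesh stores offsets into the pools instead of owning memory.
struct Mesh {
    size_t positions_begin, uvs_begin, num_verts;
    size_t indices_begin, num_indices;
};

Mesh make_mesh(MeshPools& pools, size_t num_verts, size_t num_indices)
{
    Mesh m;
    m.num_verts       = num_verts;
    m.num_indices     = num_indices;
    m.positions_begin = pools.positions.size();
    m.indices_begin   = pools.indices.size();
    m.uvs_begin       = pools.uvs.size();
    pools.positions.resize(pools.positions.size() + num_verts);
    pools.indices.resize(pools.indices.size() + num_indices);
    pools.uvs.resize(pools.uvs.size() + num_verts);
    return m;
}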
I don't know as I'm not a game developer, but my guess is that individual Meshes may correspond to things that need to be created and destructed on a regular basis during runtime. If that's the case, then depending on the details it may be preferable to have an array of Meshes, and not try to pool all the Meshes together.
I think it's risky to say that it's misguided without the full context of the problem.
I didn't watch the presentation so I may be wrong, but it's possible you're misinterpreting how Mesh was intended to be used.
The "million" objects is meant to be inside one Mesh. Mesh is singular. The vertices are plural (millions).
If there's more than one Mesh, it would be dozens or hundreds of Meshes (per video game character).
But if you only ever create a single Mesh, then you wouldn't care to coalesce the allocations, because three allocations instead of one is inconsequential in the course of an entire program lifetime. I agree that there are many vertices and edges within each Mesh, but there better also be tons of Meshes within one program, or there is no problem to begin with.
Exactly. Stateful allocators were added to the standard library especially for this sort of problem. Before, allocators were assumed to have no internal state, which made it problematic to manage memory from an arena or pool.
Perhaps there's a reason allocators wouldn't work here, but if so, it deserves a little discussion.
I did discuss stateful allocators. They're nice, I'm a big fan. But as I said, they would impose pointless space overhead of one extra pointer per array. Also, you now have more complicated issues: each of the arrays is now an owner. What if you try to make a copy of it? Move construction? What if the container tries to deallocate, how does our allocator handle that? When I copy Mesh, would I need to change the allocator's state in the copies? How would that even work? I think that once you think about trying to accomplish the very specific and relatively simple thing being done here with allocators, you'll see that it would be quite a bit more complexity to no benefit.
I think I did give it a "little" discussion :-). Guess it depends on your definition of a little. I didn't want to talk more about it because I didn't want to get sidetracked, and this is ultimately the way I chose to do it.
Hope that sheds some light on the post and the choices I made with it.
There are prebuilt slab allocators that do this well. I used the arena allocator in facebook's libfolly recently as the base for a huge page allocator, and the code turned out quite nice. I think the allocator is a much better abstraction here. You could use something like the arena allocator to carve out the memory for all the vertex and mesh storage and amortize the cost across hundreds of objects as well, so I actually don't like the argument in the first few sentences of the article that allocators would take more space.
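For readers who haven't used one, the core of an arena is tiny; a generic bump-allocator sketch (this is not folly's actual interface):

#include <cstddef>
#include <new>

// One big upfront block; allocate() just bumps a cursor.
// Everything is released at once when the arena is destroyed.
class Arena {
public:
    explicit Arena(size_t capacity)
        : buffer_(new char[capacity]), capacity_(capacity), used_(0) {}
    ~Arena() { delete[] buffer_; }

    Arena(const Arena&) = delete;
    Arena& operator=(const Arena&) = delete;

    void* allocate(size_t size, size_t alignment)   // alignment: power of two
    {
        const size_t offset = (used_ + alignment - 1) & ~(alignment - 1);
        if (offset + size > capacity_)
            throw std::bad_alloc();
        used_ = offset + size;
        return buffer_ + offset;
    }

private:
    char*  buffer_;
    size_t capacity_;
    size_t used_;
};

// e.g. carving vertex storage for many meshes out of one arena:
// Vector3* positions = static_cast<Vector3*>(
//     arena.allocate(num_verts * sizeof(Vector3), alignof(Vector3)));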
I was recently bit using a similar technique to the original code for an intrusive type in C. Using manual non-typesafe raw offsets for things can definitely lead to nasty bugs.
Automating the calculation of offsets and assigning pointers definitely eliminates a lot of potential bugs, but I do wonder if this isn't a bigger deficiency in C++. Why force storing an extra pointer per array? I am still not sure why C++ doesn't allow FAMs [1].
There's probably a question of which would be more efficient, storing pointers in the object and calculating the offset from the pointer, or dynamically calculating the offset within the object. Still, it doesn't seem like a problem developers should need to solve.
[1] https://en.wikipedia.org/wiki/Flexible_array_member
It's not really an extra pointer, unless you assume there are no alignment issues. If you assume that there's zero padding between members, then yes, you can do 1 pointer for storage + 1 per view. In the follow-up post, I plan to either do one pointer per view, or one integer per view, haven't decided which.
The thing is, who should solve it? The different approaches you listed have different advantages, there isn't one right answer. If the language itself solves it for you, you are stuck with whatever solution the language picked. This would be fine in a higher level language, but not in C++.
You're right though that individual devs shouldn't be solving it, it should be in a library. If there's enough interest in these posts, I'm happy to put up my work (fully fleshed out and documented) on a github for people to use.
I suspect the compiler will be better able to optimize offsets than pointers just because of the semantics required by the language, but I think you're right that it isn't necessarily one size fits all. Another possible solution might even be storing static bounds at specific intervals. Without numbers, I can only guess which would give optimal performance though.
I think the library approach is right. It would be nice to see the language transparently support contiguous array members, but supporting it in a library will allow more people to use it regardless of compiler. It avoids the standards approval process and implementation as well, which take non-trivial amounts of time. The 0x/1y features definitely make it a lot more feasible.
It would probably give people a better chance to play with the source if you could post a link to it somewhere. I'm not sure what you plan to license it under.
I'll keep an eye out for the next post.
To allow FAMs for non-trivially constructible/destructible types would be slightly more complicated than C FAMs, but it doesn't seem like an unreasonable thing to expect from the language/library, or intractable from a standard or compiler perspective.
Regardless, I'm still not sure what the justification is for requiring it to be UB even for just primitive types.
If you ever start from something like this, you should be asking a lot of more-basic questions first, such as:
1. Why is the goal to put everything in one block with a single allocation? Could everything still work if they were separated?
2. What do Vector2 and Vector3 look like? What if they say "int a, b" and "int a, b, c" respectively? If a perfectly-aligned "new int[total]" would have fixed the problem, it should have been tried from the start and the code refactored accordingly to not necessarily use structure types to look up the data.
3. Conversely, are Vector2 and Vector3 complex classes with special constructors (or equally important, could they someday become that way)? If so, even a "fixed" memory-allocation solution will be equally fragile to maintain because there is a responsibility to ensure that constructors are called correctly.
4. What is the memory profile of the rest of the application, e.g. how many Mesh objects themselves are created and is the entire approach to managing Mesh objects wrong? Maybe the focus on optimizing one piece has missed an entire problem somewhere else that is more fundamental.
Clearly the original code has bugs but it is also only 14 lines, the fixes for the bugs are straightforward and the right solution (after other analysis) may well have been to remove the code entirely instead of doing the same thing in a different way. Beware of the tendency to "fix" things without looking more deeply at the actual problem.
The whole point is that they shouldn't, because there are multiple loops over the different attributes and for best performance they should be in individual contiguous memory regions. Remember that you only have a few milliseconds per frame in a game and those things can matter here.
The two attributes are pos and uv, which are almost always tied to each other (and therefore updated together). The GPU will access attributes by index, hence they should be interleaved; otherwise you are just forcing cache misses.
Indices are not vertex attributes and thus should not be interleaved with the other data.
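To make the two layouts under discussion concrete (a sketch; real vertex formats vary by engine):

#include <cstdint>

struct Vector2 { float x, y; };
struct Vector3 { float x, y, z; };

// Interleaved ("array of structs"): an index fetch pulls pos and uv from
// the same neighborhood of memory, which matches per-vertex GPU access.
struct Vertex {
    Vector3 pos;
    Vector2 uv;
};
// Vertex  vertices[num_verts];
// int32_t indices[num_indices];   // indices stay separate: not a per-vertex attribute

// De-interleaved ("struct of arrays"): better when a pass touches only one
// attribute, e.g. a position-only shadow or skinning pass.
// Vector3 positions[num_verts];
// Vector2 uvs[num_verts];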