Language design is a curious mixture of grand ideas and fiddly details
I hadn't heard this last one before, but it's SO right ...
I always wondered why JS and PHP and Perl got so many details "wrong" (e.g. with Perl, one definition of "wrong" is that Perl 6 / Raku didn't make the same design choice)
Turns out there's an avalanche of details, and they interact in many ways!
Python did better, but I strongly argue both Python 3 and Python 2 got strings wrong. (Array of code points isn't generally useful, and it's hard to implement efficiently. See fish shell discussion about wchar_t on the front page now; also see Guile Scheme)
OCaml seems to have gotten mutable strings wrong (for some time), and also I think the split between regular sum types and GADTs is awkward. And also most people argue that objects vs. records vs. modules is suboptimal. And a bunch of mistakes with syntactic consistency, apparently.
So basically I agree that plowing through all the details -- and really observing their consequences in real programs -- is more important and time-consuming than grand ideas.
But if you lack any grand ideas, then the language will probably turn out poorly too. And you probably won't have any reason to finish it.
Yeah, it's extremely difficult. Everything interacts with everything else. A snag I've run into recently with my language: I want to have mutable strings and lists, but that complicates their use as hash table keys.
I'm so used to the C mindset of just modifying everything in place. It wasn't until I actually tried this that I understood why so many languages just copy everything. I researched how other languages dealt with this problem, and many only allow immutable objects as hash keys. Most interesting was Ruby, which allows mutable objects as keys, warns programmers that modifying keys can invalidate the hash table, and provides a rehashing method to fix it.
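For contrast, here is a minimal Python sketch of the stricter policy many of those languages chose: mutable built-ins simply aren't accepted as hash keys in the first place (the sample keys and values are mine, purely illustrative):

```python
# Python's dicts refuse mutable built-ins as keys outright, so the
# "key mutated after insertion" problem can't arise for them.
d = {}
try:
    d[[1, 2]] = "list key"          # lists are mutable => unhashable
except TypeError as err:
    print("rejected:", err)

d[(1, 2)] = "tuple key"             # tuples are immutable and hashable
print(d[(1, 2)])
```

Ruby's `rehash` escape hatch has no direct analogue here; Python simply rules the situation out at the type level.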
I think the main alternative design is to treat strings like in Rust or Go.
The problem with the “array of code points” idea is that you end up with the most general implementation, which is a UTF-32 string, and then you add the most compact implementation, which is a one-byte Latin-1 (or ASCII) string, and maybe throw in UCS-2 for good measure. These all have the same asymptotic performance characteristics, but allow ASCII strings (which are extremely common) to be stored with less memory. The cost is that now you have two or three different string representations floating around. This approach is used by Python and Java, for example.
The Rust / Go approach is to assume that you don’t need O(1) access to the Nth code point in a string, which is probably reasonable, since that’s rarely necessary or even useful. You get a lot of complexity savings from only using one encoding, and the main tradeoff is that certain languages take 50% more space in memory.
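The space tradeoff is easy to see from Python, whose `str.encode` lets you compare the byte sizes directly (the sample strings below are mine, purely illustrative):

```python
ascii_text = "hello"
cjk_text = "こんにちは"                   # 5 code points of Japanese

# Code-point indexing, as in the Python/Java camp:
print(cjk_text[0])                       # こ

# Byte sizes under the encodings being contrasted:
print(len(cjk_text.encode("utf-8")))     # 15: 3 bytes per character
print(len(cjk_text.encode("utf-16-le")))  # 10: the ~50% difference
print(len(ascii_text.encode("utf-8")))   # 5: ASCII stays one byte each
```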
Python and Java both date back to an era where fixed-width string encodings were the norm.
The short answer is somewhere between Go and Rust strings; those are newer languages that use UTF-8 for the interior representation, and also favor it for the exterior encoding.
Roughly speaking, Java and JavaScript are in the UTF-16 camp, and Python 2 and 3 are in the code points camp. C and C++ have unique problems, but you could also put them in the code points camp.
So there are at least 3 different camps, and a whole bunch of weird variations, like string width being compile-time selectable in interpreters.
A main design issue is that string APIs shouldn't depend on a mutable global variable -- the default encoding, or default file system encoding. That's an idea that's a disaster in C, and also a disaster in Python.
It leads to buggy programs. Go and Rust differ in their philosophies, but neither of them has that design problem.
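Python 3 still illustrates the hazard: some APIs historically consulted a process-global locale encoding rather than taking one explicitly (the file-opening line is left as a commented-out sketch):

```python
import locale
import sys

# The interpreter's own string encoding is fixed...
print(sys.getdefaultencoding())            # utf-8
# ...but the locale's preferred encoding is a mutable global that
# varies with the environment the process happens to run in:
print(locale.getpreferredencoding(False))

# open(path) historically used that locale default; passing the
# encoding explicitly removes the hidden global dependency:
# open(path, encoding="utf-8")
```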
Raku introduced the concept of NFG - Normal Form Grapheme - as a way to represent any Unicode string in its logical ‘visual character’ grapheme form. Sequences of combining characters that don’t have a canonical single codepoint form are given a synthetic codepoint so that string methods including regexes can operate on grapheme characters without ever causing splitting side effects.
Of course there are methods for manipulating at the codepoint level as well.
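The problem NFG addresses is visible from any code-point-based language; this Python sketch (my example string, not Raku) shows one grapheme spanning two code points:

```python
import unicodedata

s = "e\u0301"        # 'e' + combining acute: one grapheme, two code points
print(len(s))        # 2 -- code-point APIs can split inside the grapheme

# This sequence happens to have a precomposed form, so NFC collapses it:
print(unicodedata.normalize("NFC", s))   # é (a single code point)
# Sequences with no precomposed form stay multi-code-point; Raku's NFG
# assigns those a synthetic codepoint so grapheme ops never split them.
```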
Unfortunately, strings cut across at least 3 different problems:
* charset encoding. Cases worth supporting include Ascii, Latin1, Xascii, WeirdLegacyStatefulEncoding, WhateverMyLocaleSaysExceptNotReally, UTF8, UTF16, UCS2, UTF32, and sloppy variants thereof. Note that not supporting sloppiness means it is impossible to access a lot of old data (for example, `git` committers and commit messages). Note that it is impossible to make a 1-to-1 mapping between sloppy UTF-8 and sloppy UTF-16, so if all strings have a single representation (unless it is some weird representation not yet mentioned), it is either impossible to support all strings encountered on non-Windows platforms, or impossible to support all strings encountered on Windows platforms. I am a strong proponent of APIs supporting multiple compile-time-known string representations, with transparent conversion where safe.
* ownership. Cases worth supporting: Value (but finite size), Refcounted (yes, this is important, the problem with std::string is that it was mutable), Tail or full Slice thereof, borrowed Zero-terminated or X(not) terminated, Literal (known statically allocated), and Alternating (the one that does SSO, switching between Value and Refcounted; IME it is important for Refcounted to efficiently support Literal). Note that there is no need for an immutable string to support Unique ownership. That's 8 different ownership policies, and if your Z/X implementation doesn't support recovering an optional owner, you also need to support at least one "maybe owned, maybe borrowed" (needed for things like efficient map insertion if the key might already exist; making the insert function a template does not suffice to handle all cases). It is important that, to the extent possible, all these (immutable) strings offer the same API, and can be converted implicitly where safe (exception: legacy code might make implicit conversion to V useful, despite being technically wrong).
(there should be some Mutable string-like thing but it need not provide the API, only push/pop off the end followed by conversion to R; consider particularly the implementation of comma-separated list stringification)
* domain meaning. The language itself should support at least Format strings for printf/scanf/strftime etc. (these "should" be separate types, but if relying on the C compiler to check them for you, don't actually have to be). Common library-supported additions include XML, JSON, and SQL strings, to make injection attacks impossible at the type level. Thinking of compilers (but not limited to them), there also needs to be dedicated types for "string representing a filepath" vs "string representing a file's contents" (the web's concept of "blob URL" is informative, but suffers from overloading the string type in the first place). Importantly, it must be easy to write literals of the appropriate type and convert explicitly as needed, so there should not be any file-related APIs that take strings.
(related, it's often useful to have the concept of "single-line string" and "word" (which, among other cases, makes it possible to ); the exact definition thereof depending on context. So it may be useful to be able to tag strings as "all characters (or, separately, the first character or the last character (though "last character" is far less useful)) are one of [abc...]"; reasonably granularity being: NUL, whitespace characters individually, other C0 controls, ASCII symbols individually, digit 0, digit 1, digits 2-7, digits 8-9, letters a-f, letters A-F, letters g-z, letters G-Z, DEL, C1 controls, other latin1, U+0100 through U+07FF, U+0800 through U+FFFF excluding surrogates, low surrogates, high surrogates, and U+10000 through U+10FFFF, and illegal values U+110000 and higher (maybe splitting 31-bit from 32-bit?) (several of these are impossible under certain encodings and strictness levels). Actually supporting all of this in the compiler proper is complicated and likely to be deferred, but thinking about it informs both language and library design. Particularly, consider "partial template casting")
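Python's `surrogateescape` error handler is one concrete instance of the "sloppy UTF-8" problem from the first bullet above (the byte string is an invented example):

```python
# A Latin-1 byte that is invalid UTF-8, e.g. from an old git log:
raw = b"caf\xe9"

# surrogateescape smuggles the bad byte through as a lone surrogate,
# so decoding and re-encoding round-trips losslessly:
s = raw.decode("utf-8", "surrogateescape")
print(s.encode("utf-8", "surrogateescape") == raw)   # True

# But the lone surrogate is rejected by any strict encoder, which is
# why sloppy UTF-8 and sloppy UTF-16 can't share one clean mapping:
try:
    s.encode("utf-8")
except UnicodeEncodeError as err:
    print("strict encode fails:", err.reason)
```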
I can appreciate why it was cool. "What if everything could automatically do the right thing when it interacts with other things, and you wouldn't need any of this ritual boilerplate stuff? Why do we keep needing to convert from string to number to string over and over and over it's crazy. The language should just do the right thing!"
I think it's very easy to be sympathetic to the design trend.
There are definitely trends, but the pattern I see with early JS and PHP is they simply didn't anticipate the consequences of certain decisions.
What happens is you make a decision "locally", for some use case or program. And then you don't realize how it affects other programs.
The well-known "Wat" talk about JS by Gary Bernhardt is basically a bunch of these unintended consequences.
So the problem is really to see the whole picture, and to get some experience using the language. But there's that catch 22 -- by the time you release it, it's almost set in stone.
I remember in the early 2000's Guido and the Python team would always refer to "Python 3000" -- the one chance they had to break things!
I haven't used C# enough to comment on the language itself, but I do know it was designed (at least initially) by Anders Hejlsberg who is very experienced in language design.
I swear Stroustrup said this at the beginning of his EE380 (CS Colloquium) talk at Stanford around 1986. At the time we (grad students) were all C programmers and we were sure that C was the best language evar, at least, better than Pascal, which was the language Stanford used for teaching back then. So we all wanted to see this person who had the temerity to claim that he had improved C and who had the audacity to call it “C++” of all things! Skilling Auditorium was packed. When Stroustrup took the stage, a tense hush fell over the crowd. He said the above line fairly quietly, and everyone burst out laughing, breaking the tension. The rest of the talk was a pretty straightforward explanation of C++ 1.0, and I came away fairly impressed.
Several years later at another talk I asked him to inscribe my copy of the C++ book with that quote. He claimed not to remember it, but he inscribed the book per my request anyway.
What I like about this quote is that it embodies Stroustrup’s personal attitude. He’s ultimately pragmatic. Nothing can be perfect — including C++ by his own admission. It’s aligned well with his other writings and with his commentary on the quotes page, particularly where he says that he tries not to be rude about other languages.
“The problem with many professors is that their previous occupation was student”
This one is one of the more impactful on our industry in my opinion. I have a “side gig” as an external examiner, which means that every half year I get to sit through examinations on topics that I know most of the students will never ever have to use. And that’s the best case for much of it; the worst case is all the stuff they’re going to need to unlearn to actually become good programmers, in a world where much of the academic programming that is taught today is the same I learned more than 20 years ago. Like a heavy reliance on technology-based architecture, like separating your controllers and your models into two directories… Fine when you have two or three, not so fine when you have 9 million. I’m still impressed with how hard it’s been for domain-driven architecture and a focus on actual business logic to make their way into academia, when that’s literally always going to be how the students are expected to work in my area of the world. The same goes for much of the OOP principles, like code sharing, which has been considered an anti-pattern around here for the better part of a decade. Not because the theory is wrong, but because it just never works out well over a period of several years and multiple developers extending and changing the shared code.
I’m honestly not sure how we can change it though. Because most of the people who govern the processes are exactly students who’ve become professors, and if I had gone that route myself, I’d probably also be teaching the flawed lessons I was taught those 20+ years ago. Hell, I would’ve still taught them a few years into my career, as it wasn’t until I “stuck around” and got to work on long-running projects that it became apparent just how wrong those theories were in practice. Again, not because the theories are wrong, but because they are never applied as intended, because the world is so much less perfect than what is required for the theories to actually work. You’re not going to be at your best on a Thursday afternoon after a sleepless week of baby watching and being slightly sick, but you’re still going to write code and someone is going to approve it.
>And that’s the best case for much of it, the worst case is all the stuff they’re going to need to unlearn to actually become good programmers in a world where much of the academic programming that is taught today is the same I learned more than 20 years ago.
Not commenting on the general picture of "student to professor without any industrial experience in the middle", just this point.
At some point, you have to acquire the basics. Learning to articulate thoughts as algorithms is something you have to acquire at some point to work in CS, and that just hasn’t changed over the last 20 years. That’s the whole point of Knuth’s (M)MIX, actually.
Just as learning the alphabet won’t be enough to write all the prose you’ll ever need; then again, alphabets don’t change every six months.
I don’t disagree that some of the CS education is fine. A lot of the algorithm/math curriculum is fine even though it’s very old. I’m talking more about the systems design, systems architecture and project management which is sorely outdated compared to what most of the students will meet in the real world.
It wasn’t when I started, but in the 20 years since then, things have just evolved so much. Nobody really does OOP around here anymore. Parts of it, sure, but for the most part functions live on their own, and “classes” are now mainly used as state stores for variables, and that’s in languages that don’t have a real alternative to a “state store”, because people vastly prefer types that can’t have functions, to protect themselves from bad habits. But fresh-from-CS hires come out expecting to build abstract classes that are inherited, and then they meet the culture shock, and sometimes some of them don’t even know you can write functions without putting them inside an object. They come out with the expectation that “agile good, waterfall bad”, but modern project and contract management has long since realised that “pure agile” just doesn’t work unless you’re in a specific team in a massive tech company. Because in smaller companies nobody is going to sign a contract that’s based on agile promises, and anyone who uses Scrum by the Book has basically gone bankrupt because they got outcompeted by more adaptable ways of working. It’s not that modern things aren’t inspired by what came before, and there is even a lot of research and good books available on things like team topologies and how to work as fast delivery teams, but it’s just not what’s being taught in traditional CS around here.
I don't remember the exact quote, but I saw a talk where he said something like "People want big/verbose/explicit syntax for the language features they don't understand, and small/terse/implicit syntax for the language features they do understand." It made me realize that many of my language design opinions at that time were a matter of personal preference.
One of my favorite insights about syntax design appeared in a retrospective on C++ by Bjarne Stroustrup:
For new features, people insist on LOUD explicit syntax.
For established features, people want terse notation.
I call this Stroustrup's Rule. Part of what I love about his observation is that it acknowledges that design takes place over time, and that the audience it addresses evolves. Software is for people and people grow.
I often notice that the actual Rust authors and designers respect C++ and are very much influenced by it. (Niko Matsakis is another C++ fan.)
It's only the randoms online that like to start C++ vs. Rust arguments.
Another thing to note is that the Mozilla Rust is MUCH closer to C++ than Graydon's Rust was. Graydon's Rust was not at all about zero-cost abstraction, now the shared motto of C++ and Rust.
"The official mascot for C++ is an obese, diseased rat named Keith, whose hind leg is missing because it was blown off. The above image is a contemporary version drawn by Richard Stallman."
"There are only two kinds of languages: the ones people complain about and the ones nobody uses".
He is right on this one. In pretty much every discussion about programming languages, people write about how good Rust is and complain about how bad C++ is, but the reality is that C++ is one of the most used languages in the world.
This quote could be a very harsh reply to Rust vs C++.
I came to the conclusion that the inverse is true: people tend to love languages they don't use.
I used to love Lisp and Racket. But after writing some real programs with other people, I realized the idea that every codebase has its own DSLs and languages is actually stupid, doesn't scale, and is hard to maintain. I came to hate Haskell for the very same reason. Every Haskell programmer thinks he's more clever than others, so he decides on 30/40 language extensions, and you have something that simply isn't Haskell.
People should not program programming languages. There are use cases for this style of programming, but they aren't what general-purpose programming should look like.
> But after writing some real programs with other people I realized the idea that every codebase has its own DSL and languages is actually stupid, doesn't scale and hard to maintain.
Code bases can use DSLs. DSLs should be used judiciously. For example, if you needed an LALR parser, you probably wouldn't code it all by hand; you'd probably use a DSL.
Just like we use libraries judiciously in many languages. (Well, we should, but casually pulling in a hundred libraries is more a Python/JS/Rust convention, than a Lisp family one.)
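A toy illustration of that point in Python: the grammar lives in a small data DSL, and one generic engine interprets it, rather than hand-coding a scanner per grammar (the token names and the expression below are made up for the sketch):

```python
import re

# The "DSL": token definitions as data, not code.
TOKENS = [("NUM", r"\d+"), ("PLUS", r"\+"), ("WS", r"\s+")]

def tokenize(src):
    """Generic engine driven entirely by the TOKENS table."""
    pos = 0
    while pos < len(src):
        for name, pattern in TOKENS:
            m = re.match(pattern, src[pos:])
            if m:
                if name != "WS":          # skip whitespace tokens
                    yield (name, m.group())
                pos += m.end()
                break
        else:
            raise SyntaxError(f"unexpected character at {pos}")

print(list(tokenize("1 + 23")))
# [('NUM', '1'), ('PLUS', '+'), ('NUM', '23')]
```

Swapping grammars means editing the table, not the engine, which is the payoff a real parser-generator DSL delivers at scale.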
> Came to hate Haskell for the very same reason. Every Haskell programmer think he's more clever than others so he decides on 30/40 language extensions and you have something that simply isn't Haskell.
Is this a problem when Haskell is used professionally by software engineering teams? Or are you speaking of code by academics/students, who don't have a lot of experience on professional software engineering teams? Or by hobbyists, who are (rightly) indulging, and writing code however they want (more power to them), not writing how they have to at their day job?
No, it could be a very stupid reply to Rust vs C++ since people do write in Rust. Bigger programs get written in it all the time and - what a surprise - people who use it have things they are annoyed about, which is why it gets improved.
To me this is one of the most stupid things he's ever uttered, on one hand, and the most useful on the other. Cause it can be used to remind people that there are always trade-offs, which is a good thing if a discussion gets a bit too heated ("I am right!" "No, I am right!"), but it can also be used, and most often is, as a very shallow and arrogant dismissal (funny enough, especially by C++ zealots, IME) of someone trying to fix some things. As if trying to do things better is somehow an affront to their greatness.
If you think nobody complains about Rust then you haven't visited HN much recently ;). Heck, Bjarne Stroustrup himself has recently taken to complaining about Rust in papers and talks (though most recently he's taken to referring to it without naming names).
There's a noticeable difference between what you get from somebody like Barry Revzin, who understands C++ and Rust well and has very specific critiques, and from people like Bjarne or Herb, who seem to be relying on superficial impressions at best.
I think you would have a very hard time defending the claim that 'nobody uses Rust,' given its current adoption trend in major technology companies like Microsoft and its integration in software projects like the Linux kernel.
You can already program. When you program a hobby/research project in the language you want to learn (better) you program /with/ the grain of the language. It's a nice experience.
Move over to implementing someone else's hard requirements where you have to make that happen, with time pressure - you find yourself going against the grain of the language by necessity and start describing the difficulties, sometimes colorfully.
People waxing lyrical about (this year haskell, rust for example) and who don't have a list of complaints are in the first category.
> An organization that treats its programmers as morons will soon have programmers that are willing and able to act like morons only.
In a broader sense, what it implies is that companies should not make a programmer's job onerous (in any dimension) to the point that the "joy" is gone from the doing of the activity itself. Thus reports/meetings/processes/testing/etc. should all be modulated and balanced based on the needs of the project, not because it is the latest fad. Managers should really, really heed this.
> Far too often, 'software engineering' is neither engineering nor about software
This is a follow-on from the above.
> Any problem in computer science can be solved with another layer of indirection.
I have also heard this attributed to Andy Koenig.
> My ideal of program design is to represent the concepts of the application domain directly in code. That way, if you understand the application domain, you understand the code and vice versa.
This is how I learnt the techniques of designing in C++ from the early days, i.e. from Barton and Nackman's "Scientific and Engineering C++" and James Coplien's "Multi-Paradigm Design for C++". This is fundamental to problem-solving itself, and hence in any job it is of utmost importance to understand the domain, i.e. why and what is being done, rather than the how.
> Legacy code' often differs from its suggested alternative by actually working and scaling.
Very, very true. This is why I dismiss people who come in and start saying "everything must be rewritten" without spending time studying and learning about the existing system.
Looks like almost every language had problems with for loops and closures, including C# and Go - https://news.ycombinator.com/item?id=37575204
https://ruby-doc.org/core/Hash.html
> Modifying a Hash key while it is in use damages the hash’s index.
> You can repair the hash index using method rehash
> A String key is always safe
> That’s because an unfrozen String passed as a key will be replaced by a duplicated and frozen String
I like Ruby a lot.
More details on these string-design camps here:
https://www.oilshell.org/blog/2023/06/ysh-design.html#text
and here:
https://www.oilshell.org/blog/2023/06/surrogate-pair.html#hi...
Weak typing/implicit conversion was cool in the '90s.
JS and PHP would have been orders of magnitude better if they had been typed like Python, but that's history.
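The contrast is easy to demonstrate (toy expression, mine):

```python
# Python refuses to guess when types are mixed, where early JS would
# coerce ("1" + 1 === "11") and PHP would coerce the other way (2):
try:
    result = "1" + 1
except TypeError as err:
    print("refused:", err)
```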
I think the quote holds up well.
Try telling a C++ programmer RAII is not a good idea. I dare you. Or smart pointers. Or that the STL is mostly garbage.
Trying to honestly have these conversations is typically an exercise in starting a religious flame-war.
What I like about this quote is that it embodies Stroustrup’s personal attitude. He’s ultimately pragmatic. Nothing can be perfect — including C++ by his own admission. It’s aligned well with his other writings and with his commentary on the quotes page, particularly where he says that he tries not to be rude about other languages.
This one is one of the more impactful on our industry in my opinion. I have a “side gig” as an external examiner, which means that every half year I get to sit through examinations on topics that I know most of the students will never ever have to use. And that’s the best case for much of it; the worst case is all the stuff they’re going to need to unlearn to actually become good programmers, in a world where much of the academic programming that is taught today is the same I learned more than 20 years ago. Like a heavy reliance on technology-based architecture, like separating your controllers and your models into two directories… Fine when you have two or three, not so fine when you have 9 million. I’m still impressed with how hard it’s been for domain-driven architecture and a focus on actual business logic to make its way into academia, when it’s literally always going to be how the students are expected to work in my area of the world. The same goes for much of the OOP principles, like code sharing, which has been considered an anti-pattern around here for the better part of a decade. Not because the theory is wrong, but because it just never works out well over a period of several years and multiple developers extending and changing the shared code.
I’m honestly not sure how we can change it though. Because most of the people who govern the processes are exactly students who’ve become professors, and if I had gone that route myself, I’d probably also be teaching the flawed lessons I was taught those 20+ years ago. Hell, I would’ve still taught them a few years into my career as it wasn’t until I “stuck around” or got to work on long running projects it became apparent just how wrong those theories were in practice. Again, not because the theories are wrong but because they are never applied as intended because the world is so much less perfect than what is required for the theories to actually work. You’re not going to be at your best on a Thursday afternoon after a sleepless week of baby watching and being slightly sick, but you’re still going to write code and someone is going to approve it.
Not commenting on the general picture of "student to professor without any industrial experience in the middle", just this point.
At some point, you have to acquire the basics. Learning to articulate thoughts into algorithms is something to be acquired at some point to work in CS, and this just didn’t change over the last 20 years. That’s the whole point of Knuth’s (M)MIX actually.
Just like learning to use the alphabet won’t be enough to write every piece of prose you will ever need, but alphabets don’t change every six months.
It wasn’t when I started, but in the 20 years since then, things have just evolved so much. Nobody really does OOP around here anymore. Parts of it, sure, but for the most part functions live on their own, and “classes” are now mainly used as state stores for variables, and that’s in languages that don’t have a real alternative to “state store” because people vastly prefer types that can’t have functions, to protect themselves from bad habits. But fresh-from-CS hires come out expecting to build abstract classes that are inherited, and then they meet the culture shock, and sometimes some of them don’t even know you can write functions without putting them inside an object. They come out with the expectation that “agile good, waterfall bad”, but modern project and contract management has long since realised that “pure agile” just doesn’t work unless you’re in a specified team in a massive tech company. Because in smaller companies nobody is going to sign a contract that’s based on agile promises, and anyone who uses Scrum by the book has basically gone bankrupt because they got outcompeted by more adaptable ways of working. It’s not that modern things aren’t inspired by what came before, and there is even a lot of research and good books available on things like team topologies and how to work as fast delivery teams, but it’s just not what’s being taught in traditional CS around here.
https://www.thefeedbackloop.xyz/stroustrups-rule-and-layerin...
One of my favorite insights about syntax design appeared in a retrospective on C++ by Bjarne Stroustrup:
For new features, people insist on LOUD explicit syntax.
For established features, people want terse notation.
I call this Stroustrup's Rule. Part of what I love about his observation is that it acknowledges that design takes place over time, and that the audience it addresses evolves. Software is for people and people grow.
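Python's own history offers a clean illustration of Stroustrup's Rule: before Python 2.4, wrapping a method required loud, explicit reassignment; once the feature was established, PEP 318 gave it the terse `@` notation everyone uses today. A minimal sketch:

```python
class Before:
    # The original, explicit spelling (pre-Python 2.4):
    def area(r):
        return 3.14159 * r * r
    area = staticmethod(area)   # loud: the wrapping is spelled out

class After:
    # The later, established notation (PEP 318):
    @staticmethod               # terse: same semantics, minimal syntax
    def area(r):
        return 3.14159 * r * r

# Both spellings produce identical behavior.
assert Before.area(2) == After.area(2)
```

The semantics never changed; only the audience did, exactly as the rule predicts.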
Side note: Dave Herman is Rust's most unrecognized contributor: https://brson.github.io/2021/05/02/rusts-most-unrecognized-c...
I often notice that the actual Rust authors and designers respect and are influenced by C++ very much. (Niko M is another C++ fan)
It's only the randoms online that like to start C++ vs. Rust arguments.
Another thing to note is that the Mozilla Rust is MUCH closer to C++ than Graydon's Rust was. Graydon's Rust was not at all about zero-cost abstraction, the shared motto of C++ and Rust.
I have been unable to determine the provenance of this quote. Source and image: https://ifunny.co/picture/history-the-official-mascot-for-c-...
https://en.uncyclopedia.co/wiki/C%2B%2B
Whole article is worth reading, lots of good laughs in there!
(in response to my complaint to him that hash tables again hadn't been included in the most recent standard at the time.)
The really important stuff was printing with "<<", or whatever ...
He is right on this one. Pretty much in every discussion about programming languages, people write how good Rust is and complain about how bad C++ is, but the reality is that C++ is one of the most used languages in the world.
This quote could be a very harsh reply to Rust vs C++.
I used to love Lisp and Racket. But after writing some real programs with other people, I realized the idea that every codebase has its own DSL and languages is actually stupid: it doesn't scale and it's hard to maintain. I came to hate Haskell for the very same reason. Every Haskell programmer thinks he's cleverer than everyone else, so he turns on 30-40 language extensions, and you have something that simply isn't Haskell.
People should not program programming languages. There are use cases for this style of programming, but that isn't how general-purpose programming should look.
Code bases can use DSLs. DSLs should be used judiciously. For example, if you need an LALR parser, you probably wouldn't code it all by hand; you'd probably use a DSL.
Just like we use libraries judiciously in many languages. (Well, we should, but casually pulling in a hundred libraries is more a Python/JS/Rust convention, than a Lisp family one.)
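Regular expressions are probably the everyday example of this kind of judicious DSL use: nobody hand-writes the matching state machine, but the DSL stays confined to one well-understood corner of the program rather than reshaping the whole codebase:

```python
import re

# The DSL: a tiny grammar for ISO-style dates, compiled once.
DATE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

m = DATE.fullmatch("2024-05-17")
assert m is not None
assert m.groups() == ("2024", "05", "17")

# Non-matching input is rejected rather than misparsed.
assert DATE.fullmatch("not a date") is None
```

The host language stays the host language; the DSL handles the one subproblem it was designed for.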
> Came to hate Haskell for the very same reason. Every Haskell programmer think he's more clever than others so he decides on 30/40 language extensions and you have something that simply isn't Haskell.
Is this a problem when Haskell is used professionally by software engineering teams? Or are you speaking of code by academics/students, who don't have a lot of experience on professional software engineering teams? Or by hobbyists, who are (rightly) indulging, and writing code however they want (more power to them), not writing how they have to at their day job?
To me this is one of the most stupid things he's ever uttered on one hand, and one of the most useful on the other. It can be used to remind people that there are always trade-offs, which is a good thing when a discussion gets a bit too heated ("I am right!" "No, I am right!"). But it can also be used, and most often is, as a very shallow and arrogant dismissal (funny enough, especially by C++ zealots, IME) of someone trying to fix some things. As if trying to do things better is somehow an affront to their greatness.
I wish people would stop spamming that quote on discussions here on this site as a shallow dismissal every time someone posts their critique.
Move over to implementing someone else's hard requirements where you have to make that happen, with time pressure - you find yourself going against the grain of the language by necessity and start describing the difficulties, sometimes colorfully.
People waxing lyrical about the language of the year (Haskell or Rust, for example) who don't have a list of complaints are in the first category.
> An organization that treats its programmers as morons will soon have programmers that are willing and able to act like morons only.
In a broader sense what it implies is that companies should not make a Programmer's job onerous (in any dimension) to the point that the "joy" is gone from the doing of the activity itself. Thus Reports/Meetings/Processes/Testing/etc. should all be modulated/balanced based on needs/project and not because it is the latest fad. Managers should really really heed this.
> Far too often, 'software engineering' is neither engineering nor about software
This is a follow-on from the above.
> Any problem in computer science can be solved with another layer of indirection.
I have also heard this attributed to Andy Koenig.
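A toy illustration of "another layer of indirection": instead of callers binding directly to one implementation, they go through a lookup table, so the implementation can be swapped or extended without touching any caller. (The backend names and functions here are hypothetical, purely for illustration.)

```python
def save_local(data):
    return f"saved locally: {data}"

def save_cloud(data):
    return f"saved to cloud: {data}"

# The layer of indirection: callers name a backend, not a function.
backends = {"local": save_local, "cloud": save_cloud}

def save(data, backend="local"):
    return backends[backend](data)
```

The flip side, of course, is the lesser-quoted second half of the saying: every such table is one more hop a reader has to follow.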
> My ideal of program design is to represent the concepts of the application domain directly in code. That way, if you understand the application domain, you understand the code and vice versa.
This is how I learnt the techniques of designing in C++ from the early days, i.e. from "Barton and Nackman's Scientific and Engineering C++" and "James Coplien's Multi-Paradigm Design for C++". This is fundamental to problem-solving itself, and hence in any job it is of utmost importance to understand the domain, i.e. why and what is being done rather than how.
> Legacy code' often differs from its suggested alternative by actually working and scaling.
Very, very true. This is why I dismiss people who come in and start saying "everything must be rewritten" without spending time studying and learning about the existing system.