Alternative hypothesis: (brace yourselves) people don't care enough. Any vendor will prioritize its requirements; if performance isn't among them, that CPU and memory will get used wherever it helps the developers. Conversely, by looking at a system you can infer its requirements.
For commercial airplanes it may be safety first, ticket price second (passenger capacity, fuel efficiency) and speed third. For most software, functionality, signing up for a subscription, platform availability etc are usually prioritized higher than response times and keyboard shortcuts.
Game devs worry a lot about latency and frame rates, and professional software devs care a lot about keyboard shortcuts. This proves (anecdotally) that performance isn't unachievable at all, but rather deprioritized. Nobody wants slow apps; it's just that developer velocity, metrics, ads, etc. are higher priorities, and that comes with a CPU and memory cost that the vendor doesn't care about.
I'm a game developer and game performance is better than ever. 144 Hz monitor? 4k? We got you covered. Even ray tracing and VR is on the way.
Most games render frames of a 3D world in less than 17ms, but most websites take 3-7 seconds to load because of all the ads and bloat, and things shift around on you for another 20 seconds after that, so when you go to tap a link you accidentally tap an ad that finally loaded under your finger. If you optimize those websites they can run super fast, but that's quite a pain to do with the dependency bloat in modern tech stacks...
(note: games lag as well when you drag in a million dependencies you don't need)
The thing is, most sites and web apps try to solve a user problem, and if you are the only company in town that solves that problem, then performance barely matters - what matters is solving the problem. The users will put up with some pain because the problem is even more painful.
With games, it's all about the experience of interacting with the software - so performance is (hopefully, depending on your team and budget) amazing.
That, and... performance tuning is hard work, and I think most people don't know much about it. It's a fractal rabbit hole. Cache misses, garbage collection, streaming, object pooling, dealing with stale pointers, etc. Even I have a ton to learn, and no matter how much I learn, I probably still will have a lot more to learn. It's easier for many teams to hand wave it I guess as long as they aren't losing too many customers because of it.
Older readers will remember when you'd buy a computer magazine printed on paper and at least 80% of it was ads. Which often didn't change from month to month.
Massive slabs of wood pulp had to be printed and shipped to all the stores that stocked them, at huge cost, just so you could manhandle one of them home.
And the content was mostly physical bloat.
That's the modern web. Except that you don't just get the ads, you get some of the machinery that serves them - at you, specifically, based on the browsing profile that decides which ads you see.
I always refuse all cookies. When I forget to do that for some reason it's obvious just how much slower the experience gets.
> Most games render frames of a 3D world in less than 17ms, but most websites take 3-7 seconds to load
Most (many?) games take minutes to load. Perf is nice once they're loaded (assuming you've got sufficient hardware), but loading is a drag. Modern load times are worse than all the fiddling it took to get NES games to start.
I wish more game developers optimized for space. I stopped playing video games almost altogether because I'm not going to download 70 GB every time I want to play something. The size has gotten absurd.
Most 'performance tuning' out there is eliminating quadratic functions and putting stuff into simple dictionary caches. This should be easily teachable to any competent developer.
The problem isn't 'hard work', it's nobody cares about this.
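To make that concrete, here's a hypothetical Python sketch of exactly that class of fix (the data shapes and names are invented for illustration): a quadratic nested scan replaced by a one-pass dictionary index.

```python
# Matching users to their orders by scanning the full order list per user
# is O(n*m); building a dictionary index first makes it O(n + m).

def match_quadratic(users, orders):
    # Naive version: for each user, scan every order.
    return {u: [o for o in orders if o["user"] == u] for u in users}

def match_indexed(users, orders):
    # Build a simple dictionary cache once, then do O(1) lookups.
    by_user = {}
    for o in orders:
        by_user.setdefault(o["user"], []).append(o)
    return {u: by_user.get(u, []) for u in users}
```

Both return the same mapping; the second just stops re-reading the order list once per user.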
True, although even without ads or a bloated stack, you start with network latency, code running in a VM, hardware with terrible embedded GPUs, accessibility requirements, SEO, and thus the obligation to use the DOM API - which isn't directly accessible from a compiled language.
Also, website teams are usually smaller and have a tenth of the time to ship.
But it doesn't excuse everything, for sure. Most websites should be fast given how little they do.
Figma is an excellent example of performance being a top priority to set themselves apart, since they weren't the only player in town at the time.
We had Adobe Illustrator, XD was still in beta phases, and Sketch for macOS was showing signs of bloat and major performance hangups in any considerably realistic document. Affinity Designer was also coming into the scene with stellar performance, but had a high learning curve and wasn't well suited for interactive prototyping.
Figma swooped in and solved 3 problems:
1. Availability in the browser on any platform, where native apps can't reach (and thus 10x easier sharing of documents).
2. Incredible performance on first release through their WebGL renderer, rivaling all of the native apps listed above.
3. Stayed flexible, and yet, barebones enough to get at least 80% of a designer's needs covered initially.
Performance (and stability) were primarily what won me over to it, and I'd argue probably the same for many who switched over.
The 3D world has long been a curious space to me. Even in the late 90s, with early GPUs, you could tell how much more responsive things were compared to 2D applications. It was hard to reconcile in my head.
> and things shift around on you for another 20 seconds after that, so when you go to tap a link you accidentally tap an ad that finally loaded under your finger.
Why don’t browsers mask out events during render of page regions or note their time and bounding boxes so click events reach the correct element?
Steps to reproduce this annoyance:
- visit Twitter mobile site: mobile.twitter.com in Safari on a slow iPhone (eg iPhone 6 stuck on iOS 12.X).
- scroll and view tweets for a while soaking up memory.
- visit a tweet
- press tweet share icon
- popup menu takes forever to render fully to include the Cancel button.
- clicking Bookmark tweet menu item will trigger item after that, Copy Link to Tweet, when it finally loads and moves that menu item under your finger.
most websites take 3-7 seconds to load because of all the ads and bloat, and things shift around on you for another 20 seconds after that, so when you go to tap a link you accidentally tap an ad that finally loaded under your finger
The problem with remarks like this is that if this was really your experience of using the web, you'd have installed an ad blocker years ago, and you wouldn't make that argument now. Consequently it's either easily dismissible as hyperbole, or you're a masochist who enjoys terrible websites.
There are bad websites. It isn't "most websites" though. Just like some games drop frames horribly, but not "most games".
Websites can def be fast, but to your point it’s a priority problem.
This comparison I made at https://legiblenews.com/speed, which got upvoted here a while back, shows just that - most news websites put ads first, before their users' speed experience. Only Legible News, USA Today, and the Financial Times have a reasonable speed score, which is kind of depressing.
> I'm a game developer and game performance is better than ever.
I'm a game developer (and player) and I strongly disagree. What game can even do P95 144 FPS on affordable hardware? On lowest graphics, with populated lobbies (and there can't be that ONE map where it all goes to shit). And are we talking about AAA games or not? Because outside of AAA, even a Minecraft knockoff can fail to hit 60 FPS.
> I'm a game developer and game performance is better than ever.
It could be, if users would have more control over the graphics settings of the games that they want to play and they were allowed to scale back further to accommodate older hardware.
For example, the Unity engine by default has a setting that allows downscaling the texture resolution a game uses by 2X, 4X or 8X, yet many games out there actually disable it. Same for options menus missing framerate limits, dynamic render resolution (though most engines support that functionality in one way or another), particle density/soft particle options, options to disable SSAO/HBAO or other post-processing like that, as well as toggles for tessellation.
The end result is that many games that could run passably on integrated graphics or hardware from a few generations ago (e.g. GTX 650 Ti) instead struggle greatly, because the people behind the game either didn't care or didn't want to allow it to ever look "bad" in their pursuit of mostly consistent graphical quality (and thus how the game will look in the videos/screenshots out there).
The only real exception to this are e-sports titles, something like CS:GO is optimized really well for performing across a variety of hardware, while also giving the user the controls over how the game will look (and run). Games like DOOM are also a good example, but they're generally the exception, because most don't care about such technical excellence (though it's useful when you try porting the game to something like Nintendo Switch).
Most other games don't give you that ability, just because they try to always do more stuff, which isn't that different from Wirth's law (software gets slower as hardware gets faster). Of course, this is also prevalent in indie titles, many of which don't even have proper LOD setups, because engines like Unity don't automatically generate LOD models and something like Godot 3 didn't even have any sort of LOD functionality out of the box.
Engines like Unreal might make this better with Nanite, except that most people will use it for shoving more details into the games (bloating install sizes a bit), instead of as a really good LOD solution. That said, Godot 4 is also headed in the right direction and even for Godot 3 there are plugins (even though it's just like Unity, where you still need to make the models yourself), for which I actually ported the LOD plugin from GDScript to C#: https://blog.kronis.dev/articles/porting-the-godot-lod-plugi...
> Most games render frames of a 3D world in less than 17ms, but most websites take 3-7 seconds to load because of all the ads and bloat, and things shift around on you for another 20 seconds after that
I know what you want to say (and I agree), but... website rendering depends on the network mostly, 16ms latency is already the top 0.5% of fiber users and you have to add that on top of every new connection... GPU rendering happens on a bus that is thousands of times faster, and it needs to cover a minuscule distance.
You can't really compare the two.
> For most software, functionality, signing up for a subscription, platform availability etc are usually prioritized higher than response times and keyboard shortcuts.
This is why I love working in fintech. The engineering is paramount. Customers will not accept slow or buggy software.
I get to solve hard problems, and really build systems from the ground up. My managers understand that it is better to push back a deadline than to ship something that isn't up to standard.
Yes, but all in service of what amounts to automated stealing (cough) supplying liquidity.
Better than adtech, anyway. Or nukes. Lots of things, really. (I would have said weapons, last year.)
The only really defensible tech activity these days is things to help get off carbon-emitting processes. Making factories to make electrolysers. Making wind turbines better. Adapting airliners to carry liquid hydrogen in underwing nacelles. Making robots to put up solar fences on farms and pastures. Banking energy for nighttime without lithium. Making ammonia on tropical solar farms for export to high latitudes.
It's even money whether we can get it done before civilization collapses. I guess we will need plenty of liquidity...
I don't know what you mean by "fintech" but from my experience, bank and other finance apps are usually not that great, neither are the websites. So maybe some parts of fintech are nice and clean, but the part that the end user faces, not so much.
The way that game developers get their performance is more or less orthogonal to the way many other applications are expected to function.
It is impressive that they can draw so much stuff so fast, but there are actually very few objects on the screen that the user can directly interact with.
A specific example: in a DAW, you might have tens or even hundreds of thousands of MIDI notes on the screen. These look like (typically) little rectangles, which are precisely the sort of thing that games can draw at unbelievable speed. But in a DAW (and most design/creation applications), every single one of them is potentially the site of user interaction with some backend model object.
All those complex surfaces you see in contemporary games? Very nice. But the user cannot point at an arbitrary part of a rock wall and say "move this over a bit and make it bigger".
Consequently, the entire way that you design and implement the GUI is different, and the lessons learned in one domain do not map very easily to the other.
A user of Blender can absolutely point at an arbitrary part of a rock wall and say "move this over and make it bigger". Blender sacrifices rendering quality to make sure that interaction is reliably responsive. Cube/Sauerbraten forgoes some of the rendering optimizations provided by some other 3-D game engines to make sure you can always edit any part of the environment at any time, but it was already delivering interactive frame rates 20 years ago. And of course Minetest has very little trouble with arbitrary sets of nodes appearing and disappearing from one frame to the next, but Minetest isn't that great at performance, and its expressiveness is a bit limited compared to Cube and Blender, so maybe it's a less compelling example.
As long as it doesn't cause a glitch in playback, it's acceptable for your DAW to delay 10 milliseconds to figure out which note you clicked on. That's about 100 million instructions on one core of the obsolete laptop I'm typing this on. As you obviously know, that's plenty of time to literally iterate over your hundreds of thousands of little rectangles one by one, in a single thread, testing each one to see if it includes the click position.
But (again, as you obviously know) you don't have to do that; for example, you can divide the screen into 32×32 tiles, maybe 8192 of them, and store an array of click targets for each tile, maybe up to 2048 of them, but on average maybe 64 of them, sorted by z-index. If a click target overlaps more than one tile, you just store it in more than one tile. When you have a click, you bit-shift the mouse coordinates and combine them to index the tile array, then iterate over the click targets in the array until you find a hit. This is thousands of times faster than the stupid approach and we haven't even gotten to quadtrees.
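As a rough illustration (a Python sketch using the commenter's example numbers - 32x32-pixel tiles - not any particular toolkit's API):

```python
TILE = 32  # pixels per tile side, per the example above

class TileIndex:
    def __init__(self, width, height):
        self.cols = (width + TILE - 1) // TILE
        self.rows = (height + TILE - 1) // TILE
        self.tiles = [[] for _ in range(self.cols * self.rows)]

    def add(self, target):
        # target: (x, y, w, h, z, payload); store it in every tile it overlaps.
        x, y, w, h, *_ = target
        for ty in range(y // TILE, (y + h - 1) // TILE + 1):
            for tx in range(x // TILE, (x + w - 1) // TILE + 1):
                self.tiles[ty * self.cols + tx].append(target)

    def hit(self, px, py):
        # Shift the click coordinates to index the tile, then scan
        # only that tile's short list of targets.
        cell = self.tiles[(py // TILE) * self.cols + (px // TILE)]
        hits = [t for t in cell
                if t[0] <= px < t[0] + t[2] and t[1] <= py < t[1] + t[3]]
        return max(hits, key=lambda t: t[4], default=None)  # topmost z wins
```

A click then touches one tile's handful of rectangles instead of all of them, and a quadtree only improves on this further.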
A different stupid approach is to assign each clickable object a unique z-coordinate and just index into the z-buffer to instantly find out what the person clicked. This requires at least a 24-bit-deep z-buffer if you have potentially hundreds of thousands of MIDI notes. But that's fine these days, and it's been fine for 25 years if you were rendering the display in software.
With a modern data driven or ECS architecture, you actually can do that. There are demos out there with hundreds of interactive things on screen at one time. It's kind of amazing.
It's not like games don't have complex UIs either, where each button or field has a lot of logic behind it. Many games have very simple UIs, but others get more complex than many complex web apps. Some are even multiplayer, and the server code... it's nuts how much work goes into this. The number of updates sent each second to keep players in a game in sync, compared to what is needed to keep a chat app in sync, is really impressive to me.
From a technical point of view, games are really cool!
>All those complex surfaces you see in contemporary games? Very nice. But the user cannot point at an arbitrary part of a rock wall and say "move this over a bit and make it bigger".
The Red Faction series, building games like Minecraft and 7 Days to Die, and games like Factorio are some pretty obvious examples where you're completely and utterly wrong, so I'm not really sure why I should trust anything you said in the rest of your comment.
You might be underestimating modern games a bit I feel, but as well game engines usually have editors just as complex as a DAW. Yet both games, game engine editors, DAWs and all the examples you mentioned all run vastly more performantly than many modern simple pieces of software. Which tells us that performance is possible regardless of domain if we build properly.
There are lots of games with destructible environments and thousands or tens of thousands of interactive objects moving around, the static world you're describing hasn't been the Only Way for quite a while. Ignoring the obvious cases like Minecraft, Red Faction and Fortnite, even games where it isn't relevant to the gameplay still implement it sometimes - for example, The Division (2016) had fully destructible walls in many locations and if you fired enough rounds into a concrete wall you could punch a big hole in it that could be seen through and fired through to attack enemies or heal allies. This sort of thing doesn't have to come at the expense of visuals either, modern rendering tech along the lines of Unreal's Lumen & Nanite can adapt to changes in the environment and handle massive dynamic crowds of people or groups of cars.
You can quite literally point at an arbitrary part of a rock wall and move it over a bit then make it bigger depending on the engine you're using. It's why Unreal Engine has got a foothold in TV production.
What? What are you talking about? Games handle thousands of colliders and raycasts just fine. I thought you would mention that games get to use dedicated GPU hardware and apps might not be hardware accelerated but colliders? Very odd take.
Games have come a long way since Space Invaders and Pac Man.
They routinely do have hundreds, or thousands, of interactive things. Especially things like RTS games. But also, even if you look at turn-based strategy games, which have a much more application-like interface. Every hexagon, terrain feature, and unit is interactive, along with a full application-like menu, statusbar, and UI system.
> Alternative hypothesis: (brace yourselves) people don't care enough. Any vendor will prioritize its requirements; if performance isn't among them, that CPU and memory will get used wherever it helps the developers. Conversely, by looking at a system you can infer its requirements.
I think most of all, it isn't sufficiently visible. Most development is done on high powered hardware that makes slow code very difficult to distinguish from fast code, even though you can often get 10x performance improvements without sacrificing readability or development effort.
Individually it's just a millisecond wasted here and there, but all these small inefficiencies add up across the execution path.
Here's a fun benchmark to illustrate how incomplete beliefs like "compilers are smart enough to magically make this not matter" can be:
It's a 50x difference between the common idiomatic approach and the somewhat optimized greybeard solution, with a wide spectrum of both readability and performance in-between.
If you put zero thought toward this, your modern code will make your modern computer run as though it was a computer from the late '90s.
The sad thing is that a computer from the late '90s running software from the late '90s is extremely fast and snappy basically all the time. Old computers are just so much more responsive.
Honestly, what I want is for more developers to test and optimize their software for low-power machines. Say, a cheap netbook for example (like a Chromebook). I've heard that if you do that, you will be faster than basically every other piece of software on the system. And that speed will persist (multiply, even) on any more-powerful computer.
I've heard of one person who does that for their Quake-engine game/implementation (don't remember which). They get thousands of frames per second on a modern machine. I am guilty of not doing that myself, though. Might pick up a cheap netbook from eBay for around $30.
Frankly I'm shocked that deduplicateTree is two orders of magnitude slower than deduplicateGreybeard, despite having asymptotically better-or-equal performance (TreeSet runtime is O(output log output), whereas sorting an array is O(input log input)). I'd say that perhaps boxing the integers is slowing the algorithm down, but deduplicateGreybeardCollections has only a <3x performance hit rather than >100x. Is the problem due to allocating large amounts of internal memory and branching heavily (binary trees as opposed to B-trees), or less predictable/vectorizable code execution than array traversal, or virtual comparison dispatch slower than `items.sort(Comparator.naturalOrder())`, or some TreeSet-specific slowdown?
> Most development is done on high powered hardware that makes slow code very difficult to distinguish from fast code, even though you can often get 10x performance improvements without sacrificing readability or development effort.
true, but compilers are also doing a lot more, and one big driver of high-performance machines (at least for native dev) is compilation taking waaaay too much time...
I looked at this source when you shared it and had planned to show it to a few engineers on my team. Now, sadly, the file is gone. Any chance you can put it back up?
> This proves (anecdotally) that performance isn't unachievable at all, but rather deprioritized. Nobody wants slow apps; it's just that developer velocity, metrics, ads, etc. are higher priorities, and that comes with a CPU and memory cost that the vendor doesn't care about.
I saw a project that had a very clear N+1 problem in its database queries and yet nobody seemed to care, because they liked doing nested service calls, instead of writing more complicated SQL queries for fetching the data (or even using views, to abstract the complexity away from the app) and the performance was "good enough".
The end result was that an application page that should have taken less than 1 second to load took around 7-8 seconds, because of hundreds if not thousands of DB calls to populate a few tables. Because it was a mostly internal application, that was deemed good enough. Only when those times hit around 20-30 seconds was I called in to help.
At that point rewriting dozens of related service calls was no longer viable (in the time frame a fix was expected in), so I could "fix" the problem with in-memory caching, because about 70% of the DB calls requested the very same data, just in different nested loop iterations. Of course, this fix was subjectively bad, given the cache invalidation problems that it brought (or at least would bring in the future if the data ever changed during the cache lifetime).
What's even more "fun" was the fact that the DB wasn't usable over a remote connection (say, over a VPN during COVID) because of all the calls - imagine waiting for a thousand DB calls to complete sequentially, with a full network round trip between each. And of course, launching a local database wasn't in the plans, so me undertaking that initiative would also have needed days of work (versus something more convenient like MySQL/MariaDB/PostgreSQL being used, which would have made it a job of a few hours, no more; as opposed to the "enterprise" database).
In my eyes, it's all about caring about software development: either you do, or you don't. Sometimes you're paid to care, other times everyone assumes that things are "good enough" without paying attention to performance, testing, readability, documentation, discoverability etc. I'm pretty sure that as long as you're allowed to ship mediocre software, exactly that will be done.
> In my eyes, it's all about caring about software development: either you do, or you don't.
Yes, that's the difference. I'm lucky enough to work mostly with colleagues that share a certain level of craftsmanship about the software we ship. We all care about the quality of the product, although we may not always agree on what an ideal solution should look like.
I think people care, but there are no alternatives. My 3-month-old Windows 11 machine with a Ryzen 5700 and an RTX 3080 regularly starts chugging on stupid things like moving a window, Chrome will randomly peg a CPU with some sort of "report" process, and Visual Studio locks up when the laptop wakes from sleep. I'd drop it in a second, but for what? Mac sucks for game development and has the same issues with too many weird background processes, and Linux distros never seem to work well on laptops, even ones designed for it (System76, for instance). There aren't real alternatives.
(I know this is not the ideal solution, but) I had success installing a trimmed-down 'gamer' version of Windows 11 on a similarly high-specced PC, and it's still blazingly fast nearly a year later.
(If you go this route, be careful what's been trimmed, mind, if you want to use it for general purposes - mine doesn't have WSL and seems to have damaged printing support, which is obviously suboptimal depending on your needs.)
One of the reasons users don't care is that they have no expectations of quality. It's painful to watch non-tech people use software written by incompetent developers, like Teams. You'd literally see someone type, then the Teams UI would hang. They would pause typing and patiently stare, hands on keyboard. The UI would unfreeze 3 seconds later, and they'd keep typing. To me this is insane, but people are used to it / have been trained to expect it and consider it normal, apparently. Just like the (somewhat more reasonable) expectation that things need to be restarted in an escalating sequence if something is not working.
If I could go back in time and make one change to the history of computing, I'd add code to every consumer-facing GUI OS that would kill-dash-nine any application blocking UI thread for more than say 200ms. And somehow make it illegal to override for any consumer software, or if I were God make it so any developer trying to override it to let their software be slow gets a brain aneurysm rendering them (even more) incapable of writing software and has to find work digging ditches.
Bad alternative hypothesis. The implemented approach with recursive search is more complicated than just enumerating the files in the Sound recordings directory. A singular focus on requirements would have resulted in the easy solution that's also more efficient.
This is sheer stupidity: a more complicated, less efficient solution to a simple problem.
They presumably wanted to recursively enumerate all the files under the Sound recordings directory?
(In any case, for the application programmer the difference between recursive enumeration of files vs. only what's directly in a specific directory is likely only one flag or so.)
This is why I think it's so important that tool makers take the responsibility of performance seriously.
My old company used a proprietary VPN software (Appgate SPD) that used an Electron front end - why? Well, the front end could be reused on Windows, Linux and macOS. Fair enough, but it was very bloated and lacked features. Multiply that across half a dozen Electron apps and suddenly my laptop battery dies in half the time.
We can't guarantee or mandate that every project will be written in Rust, multiple times for every platform using their native APIs - so it's on the tool makers to bridge that gap.
Electron shows that we need a robust, efficient, cross-platform, cross-language GUI API.
It would be nice to be able to use Go, Rust, Python, JavaScript or whatever and interact with a GUI interpreter that translates a familiar API to native widgets.
GTK can run on Windows, Linux and Mac. It's open source (LGPL), fast enough, and the ecosystem is fine. Qt is another one.
The truth is, Electron is easy to use, quick to hire for and cheap to build on, and that makes it the best choice for many companies, even when the user might (highly likely) hate it.
I agree. I'm claiming that most customers and/or decision makers do not. We should recognize that and focus on a solution instead of blaming engineers and local technical decision making. A first step is to avoid slow products, or influence those that buy products for us.
I guess, but there does seem to be a lot of things companies don't actually care about - performance, security, documentation.. there seems to be a real list of things that regularly show up with people complaining about it and the answer is nobody cares about that stuff.
That said I did work at a major media company and the performance sucked (in the help section) and I said I want to improve it and the PM said "nobody cares" but I think she meant the business didn't care. So there is one anecdote supporting your claim!
I think you are very right about it, but for certain things speed needs to be a requirement. I have tried Evernote, Notion, and OneNote.
They are all much more powerful than Apple Notes, but they are so slow that by the time they are ready for my input I have forgotten what I was trying to note down.
We do live in this paradigm of everything being in the cloud, where clients are stupid and constantly need to sync, often in a "stop the world" fashion. Last time I used Twitter web I saw three spinners on the same screen. So that's a big factor as well. It's incredibly poor engineering culture imo, but again, those spinners are there because that's what they wanted to build.
I'd say it's also instructive to consider different aspects of performance. For example, many games start up unnecessarily slowly because nobody bothered to optimise it. Users will load the game, then spend a fair while playing it, so the slow startup time doesn't hugely impact their experience. Hence, improving frame rate is a much better investment.
True. I don't think users care either. Typically I get lambasted when I bring up performance issues on vendor or user group forums.
Visual Studio and Adobe Creative Suite are examples of the two most egregious offenders. They're both awfully slow and Creative Suite is extremely buggy as well. Visual Studio is nearly 30 years old, and Photoshop is even older. That's a lot of features to have to carry forward over the years.
It would be neat if during setup, they tried to figure out what your goals were as a user and turned stuff off you didn't need. My pet peeve about Visual Studio is whenever I upgrade to a new version, my default setting overrides are not carried over.
Isn't this thread full of evidence that people do care? Most of us, talking as end-users of software, seem to care. It's annoying when notepad.exe is slow.
Most people don't know what notepad.exe means. We're an extremely narrow set of people.
Also, it's about caring enough to actually stop using it. Have you ever changed software to a competitor with better performance? I don't know if I have..
I guess it's more that what is developed is led by product people, and those don't know how to plan for performance issues and instead inadvertently push for more features in a shorter time.
As well: Companies pay more for dev time than they ever have in the past. There are tonnes of stories of companies buying devs the best hardware so that they can program more effectively, and of course devs with the best hardware don’t even see a lot of performance issues.
Then on top of that, companies are making choices like “we could optimize X and Y, but that would take too much dev time. Let’s just (compromise somewhere/bump up the requirements)”
This kind of idiocy reminds me of the GTA JSON thing.. It's kind of hard to attribute these things to ignorance, but there's been a notable shift in perception among developers.. It used to always be about "tricking" the result out of the computer by applying deep understanding of both system and problem to find the fewest, most performant steps required to yield the result.
30 years ago, nobody in their right mind would have thought "du'h, I'll just recursively scan whatever directory THIS environment variable points at and then apply whatever this library does to each file in it to get some list and then extract the files I need from that list"..
Even IF you had to do such a terrible and expensive operation, you'd at least have the good sense to save your result so that you can do it more efficiently the next time..
> 30 years ago, nobody in their right mind would have thought "du'h, I'll just recursively scan whatever directory THIS environment variable points at and then apply whatever this library does to each file in it to get some list and then extract the files I need from that list"..
I think plenty of people would have made the error of filtering after applying an expensive operation rather than before. This seems like a classic rookie mistake, not an indication of some paradigm shift in thinking. What was impressive was that no one at Rockstar thought of improving the loading performance for so long that a user decided to reverse-engineer it.
I think some of these issues should be looked at through a lens of incentives as well. At my previous job, management didn't care much about performance improvements or bug fixes; they cared more about delivering projects or big features. Fixing stuff like this wouldn't help you get promoted.
An alternative opinion here: 30 years ago, games/software were much simpler. I assume we did not have conglomerates like Rockstar and EA who pushed their employees beyond their limits. The game industry is notorious for exploiting programmers.
Imagine being a SWE working 12+ hours a day 6 days a week getting no rest. I might have accidentally coded something like this so that I could go home a couple of hours earlier.
Certainly, I see I'm coming across as bashing the devs; that's not my intention.. It is my personal experience that cost of implementation is almost always the main constraint, which sounds rational when we put it this way.. Don't spend more on implementation than is needed.. Defining what's needed is the difficult part.
> The game industry is notorious for exploiting programmers.
I'd be wary with these kinds of words.
The thing is that people seem really, really eager to work in the video games industry. (Just like people are really eager to be artists or musicians or pet vets or teachers etc.)
Approximately everyone who programs for the game industry could also land a cushy job writing boring CRUD software.
I do admit that it's tempting to show compassion for people who put up with crappy working conditions and low pay in order to work in their beloved field.
(Especially if that field sounds socially desirable like teaching.)
But unless you severely restrict entry to the beloved industry, conditions and pay will always be lousy compared to what similar skills could get you in an unglamorous field. That's just supply and demand.
> Imagine being a SWE working 12+ hours a day 6 days a week getting no rest. I might have accidentally coded something like this so that I could go home a couple of hours earlier.
In the case of the game industry, more likely so that you could work on the next ticket during the eternal death march.
Yeah, I don't buy it. I've been programming for more than 20 years, and I'm pretty sure people were making the same complaints 20 years ago that this article and you are making now.
For some software, performance is important; for others, it isn't. Some developers are better at making performant software, and some are worse. That's always been true and will probably always be true.
Still, software is slowing down. An anecdotal example: my phone. It ran fine at the beginning, but now, a few years later, everything is slow: Google Maps, YouTube, Signal, the PIN-code screen… I have barely installed any apps; the only things that changed are the number of messages I have and the updates.
Perhaps we could justify it with new features at least? Perhaps my phone is slower now because the things in there are so much more useful now? Well, no. I got maybe a couple cosmetic changes, but functionality is exactly the same. My phone got slower for no benefit.
It's not even a good vs bad dev issue most of the time, I can imagine lots of devs at Rockstar who hate working there and are under strict deadlines from managers so they just make it work while they leetcode and apply to faang for better salaries where they'll "finally" optimize things, etc.
Funny enough, if you read the guidelines for Metro apps (or Modern apps, or UWP apps, or whatever the hell they are called now), Microsoft specifically says that you should offload all expensive operations to an async thread and keep the UI thread free so the interface can be quick and responsive. When I was making my first apps in university, they would routinely reject them from their app store saying the UI is not responsive enough. You'd think they actually care about efficiency and responsiveness.
But then they hardcode a whole Documents directory scan right in the UI thread in their own damn voice recorder app. Unbelievable.
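The pattern the guidelines ask for is easy to sketch in any language. Here's a minimal, hedged illustration in plain standard C++ (the real apps would use C#/WinRT async machinery, and `scanDocuments` here is a hypothetical stand-in for the expensive Documents scan, not Microsoft's actual code):

```cpp
#include <future>
#include <string>
#include <vector>

// Hypothetical stand-in for the expensive directory scan.
std::vector<std::string> scanDocuments() {
    return {"memo.m4a", "interview.m4a"};
}

// The guideline: kick expensive work off to a worker thread so the
// UI thread stays free to pump messages and repaint.
std::vector<std::string> loadRecordingsOffUiThread() {
    auto pending = std::async(std::launch::async, scanDocuments);
    // ... the UI thread would keep handling input here ...
    return pending.get(); // collect the result only once it's needed
}
```

In a real UI you'd attach a completion callback rather than blocking on `get()`, but the point stands: the scan never runs on the thread that handles clicks.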
> It used to always be about "tricking" the result out of the computer by applying deep understanding of both system and problem to find the fewest, most performant steps required to yield the result.
We didn't used to have product teams as large as engineering teams. You can still engineer software the way you want; you're just beholden to some product person's idea about deadlines now. If you can turn your little unit of work around quickly, no one will bat an eye (of course, we all know that's not realistic most of the time). If you dare to take longer than their perception of how long it ought to take, you'd better be prepared to discuss the "value add" of your performance-minded solution ad nauseam.
I don't think either is true. I think it's a case of "development by committee" - many different people work on a codebase, but nobody is critical enough of the whole product.
I can imagine the audio recorder is a low priority minor side project that gets handed from one guy (or team) to the next every couple of years, with the emphasis on "don't spend too much time on it".
The other issue is that performance is taken for granted; if something is slow, it's not immediately obvious.
Having a list of all audio recordings always shown on the right pane is kinda an optimization (VoiceRecorder.exe). The usefulness is limited, because VoiceRecorder can't edit them, only play back and set markings.
yea, we went from being burned by optimizing before we knew the problem, to being burned by not optimizing even when we do know the problem.
30 years ago, much (UNIX) system software was built out of calls to system(), ie it would have shell commands do most of the steps. Partly this is because it is hard to do everything in C and partly it is just laziness. This is partly the reason for terrible security (the IFS environment variable could be changed to change the meaning of the system() calls and trivially make a suid program do bad things).
I think you’re giving programmers of the past too much credit.
I think you're not giving them enough. They managed to have a glaring security hole to be exploited, but no one bothered to. It took the bunch of degenerates we have around today with no respect for the sanctity of someone else's machine for this type of practice to even become a problem.
You'd be amazed how many seemingly intractable technical issues aren't intractable at all, because they were solved at higher layers of abstraction. Nowadays, with things like clear-cut post-first-sale ownership breaking down, we're actually getting bitten by having to solve technically problems that were previously made tractable by the social consensus between professionals.
As a consequence, here we are with a now crappy, pathological Notepad.
I disagree; software has always sucked. We remember lots of the good software and look back fondly on how things used to be, but the reality is that this is nostalgia speaking.
A perfect example is Windows 95 – wiggling the mouse caused a huge speedup in long-running applications [0].
> used to always be about "tricking" the result out of the computer by applying deep understanding of both system and problem to find the fewest, most performant steps required to yield the result.
That "tricking" came with lots of issues: massive security holes, significant app stability issues, and hugely increased developer time to implement things. I can get a hardware-accelerated SHA-1 hash of a buffer in one line of C++ today, and I know it will be correct.
People who do that work still exist. Computers are fast enough that it's worth not spending hours optimising every allocation when you're displaying half a dozen strings on screen, but people like Casey Muratori and Bruce Dawson still exist and still write code.
With too many developers, the field switched from trying to emulate the elites to following the trend of the day, because otherwise you get ostracized by the crowd.
I personally blame the "premature optimization is the root of all evil" quote, which is most of the time taken out of context but has left a lasting impression in many tech leaders' minds.
I was writing an NMEA parser today. (NMEA 0183 is a common serial GPS data protocol) My serial port reads returned an IOvec-like type which exposed limited operations. I spent probably an hour making helper functions and doing various verification and parsing operations on the IOvec type because it saved having to copy the data into a contiguous buffer. Midway through I stopped and realized that I would only need to copy when the ~50B message straddled the end of the ~64kB buffer. It takes a few nanoseconds to copy 50B. I spent an hour on an optimization that would never do anything but complicate the code. If I had just written the simple and obvious code, and spent that hour profiling the application to optimize what would actually make a difference, I'd be much better off.
That is what the quote is talking about. I don't think you can blame the quote if people are just ignoring the first word, and I'm not convinced that's actually a common thing.
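For what it's worth, the "just copy it" path that would have saved that hour is a couple of `memcpy` calls. A hedged sketch, assuming a hypothetical 64 kB ring buffer and an `extractMessage` helper of my own invention (not the actual parser): the rare sentence that straddles the wrap point is simply reassembled into a contiguous string before parsing.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstring>
#include <string>

// Copy a message of `len` bytes starting at `start` out of a ring
// buffer, handling the case where it wraps past the end. Two memcpy
// calls worst case; for a ~50-byte NMEA sentence this is nanoseconds.
std::string extractMessage(const std::array<char, 65536>& ring,
                           std::size_t start, std::size_t len) {
    std::string out(len, '\0');
    std::size_t first = std::min(len, ring.size() - start);
    std::memcpy(out.data(), ring.data() + start, first);       // up to the wrap
    std::memcpy(out.data() + first, ring.data(), len - first); // remainder
    return out;
}
```

After this, the parser only ever sees contiguous bytes, and the hour of IOvec helper functions disappears.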
You shouldn't even need to profile to realise that if you're using a serial port for I/O, unless your hardware is something embedded and in the low-MHz, range, you'll spend most of your time waiting for I/O unless you massively overcomplicate your solution.
Virtually every single performance thread on Stack Exchange has that one guy who regurgitates something about premature optimization and maintainability, e.g.
Q
"I'm working on high throughput data science thing, is branchless math better than conditional instructions here, because this thing needs to take less than a lifetime to finish"
A
"You should focus on code readability, it's way more important... yada yada... I work on Python and websites"
I blame slipping standards in CS programs and the entire concept of "information science"
A good mentor told me once not to let 'perfect' be the enemy of 'good'. At the time I thought it was clever but it didn't really hit home until year 10 or 15. I did a lot of things that made no difference to the user, but to me was finesse.
Tbh, most performance questions I see on SE are bad too. The necessary question here is: what's the profile of the whole thing - does this line of math even matter? The questions that are already researched to the level where the original question makes sense are extremely rare - and I do enjoy those.
Also the "developer's time is expensive" belief pushes developers to reuse layers and layers of packages and libraries whose behaviors are not well aligned with the goal of application, but hey, it's cheaper to just slap something on the existing packages and ship it. The "expensive parts" are pushed to the end users.
Agreed. I really dislike this excuse for sloppy slow code.
Yes, developer time is expensive, need to keep ROI in mind.
But if the expensive developer spends a week optimizing a function to make it 1 second faster, is that worth it? Too many people will respond "developer's time is expensive" without thinking beyond that. But of course, it depends.
If that function is being invoked millions of times per day by hundreds of thousands of people, it very quickly becomes totally worth it.
Adding on to that, “QA time is expensive” too. I’ve learned from experience that trying to tweak well established repos/libraries rarely gets management’s approval because the extra time it would take to QA and regression test is considered pure loss. Better to leave it be and force the new code to work with the old stuff. That way it is at least billable.
> developers to reuse layers and layers of packages and libraries whose behaviors are not well aligned with the goal of application, but hey, it's cheaper to just slap something on the existing packages
and in many cases those layers and layers actually slow development long-term because of that mis-alignment and needing to untangle/manage the mess of dependencies over time...
I don’t know, should I really reimplement that graph lib in a buggier, slower way, over choosing a battle-tested one that may also contain some additional features I don’t need?
According to Brooks's famous paper, code reuse is the only way to significantly improve programmer productivity. No new language will suddenly make you 10x faster, and even that would be meaningless against all the quality code bases out there.
Yup. There surely is such thing as over-optimized, early-optimized, over-clever, over-terse, over-engineered, over-abstracted, etc. But in terms of actual impact in the software ecosystem, the simple problem of not-good-at-programming still blows them all out of the water, combined.
It's important to be able to recognize which of the two echelons of problem-space one's codebase inhabits.
Well, modern programmers more often think "optimization is the root of all evil". I read an article that literally recommends not to optimize until the end.
Yep, I think baseline optimization should just be built into the core of the framework or language. Targeted optimizations can be applied later, but not these. Basic caches that don't hurt performance or memory should just be enabled by default, instead of the "let's not optimize at all until it lags like hell" approach React takes (spoiler: nobody actually optimizes it in the end). I hate websites written in React so much because of this.
Another one that's been abused and thoroughly misunderstood is "correlation does not mean causation".
You could come up with data showing an R^2 of like 0.99 and there's always a smartass that immediately parrots "but remember that correlation does not mean causation". Ugh.
Why blame a quote, an inanimate thing, and not the foolish programmer that scans the entire `Documents` directory just to ignore most of it and only keep the results of the `Sound Recordings` subdirectory? (Among other apparent issues with this program.)
I think that quote can often be the source of misguided beliefs like "it's OK to scan the whole Documents directory".
'Cuz why waste time doing something harder? Premature optimization and all that.
Honestly, in my experience, basically the opposite of the quote is true. If you want your software to be fast, you need to be thinking about speed in the design from the very beginning. Very hard, perhaps impossible, to make something not designed for speed from the beginning be as fast as something that was.
Of course there's a difference between having a holistic design that you know can go fast and immediately wasting time eking out x% of speed in some random sub-function that will only run y% of the time, when x and y are small.
But really performance does need to be built into the design.
(As an example, look at the heroic steps that are being taken to try to make the Rust compiler faster. It's slowly getting better, but it will never be fast. On the other hand, I would predict that if you started making a Rust-like-language from day 0 with a fast compiler as a goal, you could get something much faster.)
Counterintuitively, software performance as a whole is not a function of hardware, it's a function of human tolerance. Apps are as slow as we humans are willing to tolerate.
As hardware gets better, we're finding ways to use it less efficiently.
So if you want apps to become faster, you shouldn't be building better hardware. You should be changing human tolerance thresholds.
Which is almost impossible to pull off. Apple did this when they released the iPhone. Remember mobile apps before iOS? Mobile browsers before Safari?
I like your take that human tolerance is the limiting factor.
I'd agree that is a major constraint. The rest of it is just economics.
When I started programming, people I learned from would optimise code in assembler, use clever algorithms, and write highly unreadable code that was impossible to debug. One bit of code I remember interleaved various operations so that its accesses would line up with disk reads. I rewrote it to just hold the data in memory. It wasn't much faster than his disk-based code in practice.
But the only reason to go to that kind of effort was the severe constraints hardware placed on software. When those are largely removed, the imperative is to write code fast that other people can still understand.
I think part of it is the terrible software people are forced to use at work, which is bought by people who only look at feature lists and price. This stuff eats away at their expectations until they are OK with bloated stuff like discord for private use too.
I have an almost 15-year-old machine with 2 GB of RAM and an Intel Core 2 Duo processor. It has Windows 7 installed. I recently opened it again and was amazed to see that it actually works, and it is also very snappy by current standards.
My Core 2 Duo Merom laptop was sadly unbearably sluggish on Windows 7 with a mechanical drive even with indexing and background services disabled, even at browsing files and such (and merely laggy with a SSHD mechanical drive with flash cache), though I hear Penryn CPUs are faster (and perhaps desktops are faster than laptops). The computer performs much better on a non-JS-based Linux DE like MATE (as opposed to GNOME Shell and plasmashell), though taskbar window previews are black (probably it doesn't work on an anemic GMA 965). I swear Xfce on a 2-core Ivy Bridge laptop with iGPU and SATA SSD, feels just as fast as KDE on a 6-core Zen 3 desktop with non-Nvidia GPU, at file management, Konsole, Telegram/nheko chatting, and PDF reading, until you open a browser or Electron app or run a modern optimizing compiler.
Oddly, with Windows 7 on the Core 2 Duo, the taskbar would randomly stop updating (I think it stopped happening after moving the same hard drive into a newer laptop?), and I got no end of driver issues: the touchpad would stop responding on the lock screen or logged in after sleep-wake, and audio would randomly stop playing until I restarted audiodg.exe. As far as I can remember, none of these issues happen on my Ivy Bridge laptop, where I'm clinging for life onto Windows 7 (the last version with a soul) for as long as I can keep apps running... though I'm getting rather tired of Tailscale creating a new network adapter and asking for home/work/public on every reboot.
I had the same experience with a 900 MHz Pentium III machine and Windows XP SP3. It was lightning, blazing fast. It genuinely felt like the reactions to my mouse clicks were faster than I could finish the clicking motion.
I had experiences with both a very, very old Win95 box and an old Linux 2.4 Kali USB key.
The Linux one was strange because my arch/systemd/ssd/i3 setup is lean - you get parallel boot, no heavy DE, no bloat.. but everything felt lighter and faster on Kali, and I had a physical reaction saying "I don't miss anything from my current laptop; that old thing was closer to my needs".
Maybe there's a part of our brain that doesn't care much about typography or visual effects and prefers good old crude, solid, lag-free tools.
I recently had the pleasure of installing OS9 on a classic 2001 iMac – the "lampshade" one, though I don't think that nickname does it justice!
I was, and am, blown away by how responsive it is. Plus, the UI sound effects add to the experience in a great way. You can HEAR when you've done something "in the computer." Just a bunch of clicks and boops. It's fantastic. Makes you think about what we've just... gotten used to.
Speaking of Windows: Microsoft PMs' stupidity can never be matched by declining SW development efforts, even with better HW.
Here you cannot really blame the poor developer, when the PM demands scanning all of Documents, not just the relevant sound recordings. It's so obviously insane that only PMs get away with this. Of course the dev matched the PM here with his insane directory iterations, but the idea alone must be appreciated first.
PMs don't get to order devs around at Microsoft (or any sane organization). Devs can tell their PM counterparts to stuff it if they try to demand stuff. If you've worked at Microsoft, or met someone who has, you'd know this.
That stack trace says it all --- the bloat known as COM rears its ugly head again. I also suspect there's a quadratic or even worse algorithm in there, hidden away amongst all that useless abstraction. The fundamental way of scanning for files with the Win32 API is an iterator (FindFirstFile/FindNextFile). It looks like they put some sort of array/indexing abstraction on top of that ("GetRowsAt", "FetchResultAt"), which probably gets the Nth entry by stepping the iterator N times, and from there it's not hard to imagine someone doing

    for (int i = 0; i < obj->GetNumItems(); i++)
        doSomething(obj->GetItemAt(i));
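To make the suspicion concrete, here's a hedged toy model (a hypothetical `Collection` type, not the real COM interfaces) where fetching item i by restarting the iterator turns one pass over n items into n(n+1)/2 iterator steps:

```cpp
#include <cstddef>

// Hypothetical model of an iterator-only API with an array-style
// accessor bolted on top.
struct Collection {
    std::size_t n;         // number of items
    std::size_t steps = 0; // how many times the underlying iterator advanced

    // Fetches item i by restarting the iterator and stepping it i+1
    // times, as a "GetItemAt" shim over FindFirstFile/FindNextFile
    // plausibly would.
    std::size_t getItemAt(std::size_t i) {
        for (std::size_t s = 0; s <= i; s++) steps++; // re-walk from the start
        return i;
    }
};
```

Looping `getItemAt(i)` for i = 0..n-1 advances the iterator 1 + 2 + ... + n times: 500,500 steps for a 1,000-file folder, where a single FindNextFile pass would take 1,000.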
I think the issue is actually the sandboxing and other stuff. WinRT StorageProvider API is known to be extremely slow and NTFS / Windows IO subsystem is itself already quite slow compared to UNIX. The issue IIRC is that StorageProvider is designed for sandboxing and the way they implemented that involves doing RPCs to other processes. So there's probably some heavy context switching involved, but it's architectural, and so the voice recorder was never fixed.
"The challenge here is simply that listing the StorageItems in a StorageFolder is incredibly slow and resource intensive compared to using the standard .Net/Win32 API to list the file paths. A quick test showed that in .Net it takes about 0.003ms per file, whereas with UWP StorageItems it takes about 2ms per file, which is around 700 times slower. Also, looking at the Runtime Broker, UWP uses about 70kB memory per file, which is a huge cost when loading 10,000+ files, whereas .Net uses around 0.1kB per file (that’s a very rough estimate)."
Somewhere else, someone proposes a theory that due to API mismatches/design issues in Win32 the RuntimeBroker is trying to use a design that isn't a great fit in order to try and provide certain security guarantees, and this requires pre-fetching a lot of data up front in case the app requests it. But NTFS is slow, so, all this additional traffic makes opening files via the Storage API really really slow.
The problem here isn't really "modern software", it's that Microsoft wrote a lot of new APIs over time with the intention of replacing Win32 (UWP/WinRT) but they aren't dogfooding them all that effectively, and they're using C++ for everything, so there are problems that don't get fixed for years.
This was exactly the problem with CORBA in the 90s. It made remote calls as easy to use as local calls. Once they looked the same people would make RPCs in loops like that.
Imagine the cost of a non-obvious RPC call in a nested for loop. Just not funny.
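The arithmetic is what kills you: at even 1 ms per round trip, an RPC inside a 1000×1000 nested loop is roughly 17 minutes of pure waiting. A sketch of the per-item pattern versus batching, with hypothetical `fetchPrice`/`fetchPrices` calls and a counter standing in for network latency:

```cpp
#include <cstddef>
#include <vector>

static std::size_t rpcCalls = 0; // each one would cost a full network round trip

// Hypothetical remote call: one item per round trip.
int fetchPrice(int id) { rpcCalls++; return id * 2; }

// Hypothetical batched variant: one round trip for the whole list.
std::vector<int> fetchPrices(const std::vector<int>& ids) {
    rpcCalls++;
    std::vector<int> out;
    out.reserve(ids.size());
    for (int id : ids) out.push_back(id * 2);
    return out;
}
```

Looping `fetchPrice` over 1,000 items makes 1,000 round trips; `fetchPrices` makes one. CORBA's sin was making those two spellings look equally cheap at the call site.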
Odds are someone who writes an app is doing it in a sterile environment, where there are only going to be a handful of test files in any folders.
If they do try it on their own system, they'll likely discount any performance hits as due to their own hoarding of files, and not consider that the user likely has just as many, if not more, files. Thus, it's not likely the programmer will notice it.
It's only if some exogenous event causes a programmer to consider run-time performance that it will be measured, and then optimized.
Another reason could also be the fact that devs usually only ever run development builds of the software.
So you end up dismissing slowness due to this ("I am sure this is faster when compiled in Release!").
I've made that mistake before, until things got so slow that I decided to compile a Release build, which was just as slow, and found out that a regression had been introduced.
Automatic performance monitoring for 'tasks'/services/computations is relatively straightforward but not quite as easy for UI interactions so these often get ignored.
> there are only going to be a handful of test files
Bingo. This used to be a well-known benchmarking cheat, because performance on a full (or even once-full) filesystem can be very different than on a freshly made one. Almost as common as short-stroking, if you remember that. Anybody who wanted realistic results would run tests on a pre-aged filesystem. Cache warming is another example of a similar effect, and there are probably others across every area of computing. It's always really tempting to run tests in a maximally clean environment to get maximally consistent results, but that just doesn't reflect real-world usage.
The actual capabilities of a modern AMD or Intel x86 chip are staggering compared to the zombie-like procession of application experiences they are forced to support every day.
In the 90s, we had shooters running at playable frame rates using software rasterization on a single thread. Now from this vantage point, think about the "what if you had a 1000x faster CPU" thought experiment that was posted to HN recently. Except, make it more like 100,000x faster...
If we had taken SIMD a bit further on the CPU and kept our extremely scrappy software engineering practices, I think it's possible the GPU might not have ever really emerged as a major consumer product. Recent game engine tech, such as Nanite's software rasterizer, is starting to shift ideologies back the other way now too.
The parent poster seems to be from Intel. They argued very similarly about specialized CPUs back then: "You only need one chip."
Then came 3dfx, which was a blast.
You can even call the Amiga Blitter one of the first GPUs, or at least a specialized graphics chip. The same goes for coprocessors, like the math unit in a 486DX, for example.
Nanite is a software rasterizer running as a compute shader on the GPU though, it still requires the power of a GPU but takes advantage of the increased flexibility of modern GPUs.
That, and... performance tuning is hard work, and I think most people don't know much about it. It's a fractal rabbit hole. Cache misses, garbage collection, streaming, object pooling, dealing with stale pointers, etc. Even I have a ton to learn, and no matter how much I learn, I probably still will have a lot more to learn. It's easier for many teams to hand wave it I guess as long as they aren't losing too many customers because of it.
Massive slabs of wood pulp had to be printed and shipped to all the stores that stocked them, at huge cost, just so you could manhandle one of them home.
And the content was mostly physical bloat.
That's the modern web. Except that you don't just get the ads, you get some of the machinery that serves them - targeted at you specifically, based on your browsing profile.
I always refuse all cookies. When I forget to do that for some reason it's obvious just how much slower the experience gets.
Most (many?) games take minutes to load. Perf is nice once they're loaded (assuming you've got sufficient hardware), but loading is a drag. Modern load times are worse than all the fiddling it took to get NES games to start.
Most 'performance tuning' out there is eliminating accidentally quadratic code and putting stuff into simple dictionary caches. This should be easily teachable to any competent developer.
The problem isn't 'hard work', it's nobody cares about this.
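For illustration, a minimal sketch of the "simple dictionary cache" half of that, wrapping a hypothetical `expensiveLookup` so it runs once per distinct key:

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

static std::size_t computations = 0; // counts actual expensive runs

// Hypothetical expensive function we'd rather not repeat.
std::size_t expensiveLookup(const std::string& key) {
    computations++;
    return key.size(); // stand-in for real work (parsing, I/O, ...)
}

// Dictionary cache in front of it: compute once per distinct key.
std::size_t cachedLookup(const std::string& key) {
    static std::unordered_map<std::string, std::size_t> cache;
    auto it = cache.find(key);
    if (it != cache.end()) return it->second;
    return cache[key] = expensiveLookup(key);
}
```

Call `cachedLookup("user:42")` a million times and the expensive part runs once. It's a ten-line pattern; the hard part is caring enough to notice where it's needed.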
Also, website teams are usually smaller and have a tenth of the time to ship.
But it doesn't excuse everything, for sure. Most websites should be fast given how little they do.
We had Adobe Illustrator, XD was still in beta phases, and Sketch for macOS was showing signs of bloat and major performance hangups in any considerably realistic document. Affinity Designer was also coming into the scene with stellar performance, but had a high learning curve and wasn't well suited for interactive prototyping.
Figma swooped in and solved 3 problems:
1. Availability in the browser on any platform, where native apps can't reach (and thus 10x easier sharing of documents).
2. Incredible performance on first release through their WebGL renderer, rivaling all of the native apps listed above.
3. Stayed flexible, and yet barebones enough to get at least 80% of a designer's needs covered initially.
Performance (and stability) were primarily what won me over to it, and I'd argue probably the same for many who switched over.
Why don't browsers mask out events during render of page regions, or note their time and bounding boxes, so click events reach the correct element?
Steps to reproduce this annoyance:
- visit the Twitter mobile site, mobile.twitter.com, in Safari on a slow iPhone (e.g. an iPhone 6 stuck on iOS 12.x)
- scroll and view tweets for a while, soaking up memory
- visit a tweet
- press the tweet share icon
- the popup menu takes forever to render fully, including the Cancel button
- clicking the Bookmark Tweet menu item will trigger the item after it, Copy Link to Tweet, when it finally loads and moves that menu item under your finger
Well now, "most games" will take longer than 3-7 seconds to load. That's fair - they're doing a lot more than websites!
Most websites also render frames in less than 17ms.
The problem with remarks like this is that if this were really your experience of using the web, you'd have installed an ad blocker years ago and you wouldn't be making that argument now. Consequently it's either easily dismissible as hyperbole, or you're a masochist who enjoys terrible websites.
There are bad websites. It isn't "most websites" though. Just like some games drop frames horribly, but not "most games".
This comparison I made at https://legiblenews.com/speed, which got upvoted here a while back, shows just that—most news websites care about ads before their users' speed experience. Only Legible News, USA Today, and the Financial Times have a reasonable speed score, which is kind of depressing.
I'm a game developer (and player) and I strongly disagree. What game can even do P95 144 FPS on affordable hardware? On lowest graphics, with populated lobbies (and there can't be that ONE map where it all goes to shit, either). And are we talking about AAA games or not? Because non-AAA means even a Minecraft knockoff can fail to hit 60 FPS.
It could be, if users would have more control over the graphics settings of the games that they want to play and they were allowed to scale back further to accommodate older hardware.
For example, the Unity engine by default has a setting that allows downscaling the texture resolution the game will use by 2X, 4X and 8X, yet many games out there actually disable it. The same goes for options menus lacking framerate limits, dynamic render resolution (though most engines support that functionality in one way or another), particle density/soft particle options, options to disable SSAO/HBAO or other post-processing like that, as well as toggles for tessellation.
The end result is that many games that could run passably on integrated graphics or hardware from a few generations ago (e.g. GTX 650 Ti) instead struggle greatly, because the people behind the game either didn't care or didn't want to allow it to ever look "bad" in their pursuit of mostly consistent graphical quality (and thus how the game will look in the videos/screenshots out there).
The only real exception to this are e-sports titles, something like CS:GO is optimized really well for performing across a variety of hardware, while also giving the user the controls over how the game will look (and run). Games like DOOM are also a good example, but they're generally the exception, because most don't care about such technical excellence (though it's useful when you try porting the game to something like Nintendo Switch).
Most other games don't give you that ability, just because they try to always do more stuff, which isn't that different from Wirth's law (software gets slower as hardware gets faster). Of course, this is also prevalent in indie titles, many of which don't even have proper LOD setups, because engines like Unity don't automatically generate LOD models and something like Godot 3 didn't even have any sort of LOD functionality out of the box.
Engines like Unreal might make this better with Nanite, except that most people will use it for shoving more details into the games (bloating install sizes a bit), instead of as a really good LOD solution. That said, Godot 4 is also headed in the right direction and even for Godot 3 there are plugins (even though it's just like Unity, where you still need to make the models yourself), for which I actually ported the LOD plugin from GDScript to C#: https://blog.kronis.dev/articles/porting-the-godot-lod-plugi...
I know what you want to say (and I agree), but... website rendering depends on the network mostly, 16ms latency is already the top 0.5% of fiber users and you have to add that on top of every new connection... GPU rendering happens on a bus that is thousands of times faster, and it needs to cover a minuscule distance. You can't really compare the two.
No they don't. Your compiler will eliminate dead code.
This is why I love working in fintech. The engineering is paramount. Customers will not accept slow or buggy software.
I get to solve hard problems, and really build systems from the ground up. My managers understand that it is better to push back a deadline than to ship something that isn't up to standard.
Better than adtech, anyway. Or nukes. Lots of things, really. (I would have said weapons, last year.)
The only really defensible tech activity these days is things to help get off carbon-emitting processes. Making factories to make electrolysers. Making wind turbines better. Adapting airliners to carry liquid hydrogen in underwing nacelles. Making robots to put up solar fences on farms and pastures. Banking energy for nighttime without lithium. Making ammonia on tropical solar farms for export to high latitudes.
It's even money whether we can get it done before civilization collapses. I guess we will need plenty of liquidity...
You must be lucky, not everywhere in the sector that’d be remotely close to true unfortunately…
The prototypical nerd would happily work on building a nuke capable of destroying a continent if that entailed letting him work on “hard problems”.
Deleted Comment
It is impressive that they can draw so much stuff so fast, but there are actually very few objects on the screen that the user can directly interact with.
A specific example: in a DAW, you might have tens or even hundreds of thousands of MIDI notes on the screen. These look like (typically) little rectangles, which are precisely the sort of thing that games can draw at unbelievable speed. But in a DAW (and most design / creation applications), every single one of them is potentially the site of user interaction with some backend model object.
All those complex surfaces you see in contemporary games? Very nice. But the user cannot point at an arbitrary part of a rock wall and say "move this over a bit and make it bigger".
Consequently, the entire way that you design and implement the GUI is different, and the lessons learned in one domain do not map very easily to the other.
As long as it doesn't cause a glitch in playback, it's acceptable for your DAW to delay 10 milliseconds to figure out which note you clicked on. That's about 100 million instructions on one core of the obsolete laptop I'm typing this on. As you obviously know, that's plenty of time to literally iterate over your hundreds of thousands of little rectangles one by one, in a single thread, testing each one to see if it includes the click position.
But (again, as you obviously know) you don't have to do that; for example, you can divide the screen into 32×32 tiles, maybe 8192 of them, and store an array of click targets for each tile, maybe up to 2048 of them, but on average maybe 64 of them, sorted by z-index. If a click target overlaps more than one tile, you just store it in more than one tile. When you have a click, you bit-shift the mouse coordinates and combine them to index the tile array, then iterate over the click targets in the array until you find a hit. This is thousands of times faster than the stupid approach and we haven't even gotten to quadtrees.
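Here's a minimal Python sketch of the tile-bucket hit test described above (all names, the 32-pixel tile size, and the screen dimensions are illustrative):

```python
TILE = 32  # pixels per tile side; the >> 5 below assumes this value

class TileGrid:
    def __init__(self, width, height):
        self.cols = (width + TILE - 1) // TILE
        self.rows = (height + TILE - 1) // TILE
        self.tiles = [[] for _ in range(self.cols * self.rows)]

    def add(self, target, x, y, w, h, z):
        # Store the target in every tile its rectangle overlaps.
        for ty in range(y // TILE, (y + h - 1) // TILE + 1):
            for tx in range(x // TILE, (x + w - 1) // TILE + 1):
                self.tiles[ty * self.cols + tx].append((z, x, y, w, h, target))

    def hit(self, px, py):
        # Bit-shift the mouse coordinates to index the tile, then scan
        # only the handful of targets in that bucket, keeping the topmost.
        bucket = self.tiles[(py >> 5) * self.cols + (px >> 5)]
        best = None
        for z, x, y, w, h, target in bucket:
            if x <= px < x + w and y <= py < y + h and (best is None or z > best[0]):
                best = (z, target)
        return None if best is None else best[1]
```

Scanning the one bucket and picking the highest z is equivalent to keeping each bucket z-sorted; either way the work per click is proportional to the handful of targets under the cursor, not the hundreds of thousands on screen.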
A different stupid approach is to assign each clickable object a unique z-coordinate and just index into the z-buffer to instantly find out what the person clicked. This requires at least a 24-bit-deep z-buffer if you have potentially hundreds of thousands of MIDI notes. But that's fine these days, and it's been fine for 25 years if you were rendering the display in software.
It's not like games don't have complex UIs either, where each button or field has a lot of logic to them. Many games have very simple UIs, but others get more complex than many complex web apps. Some are even multiplayer, and the server code... it's nuts how much work goes into this. The number of updates you send each second to keep players in a game in sync compared to what is needed to keep a chat app in sync is really impressive to me.
From a technical point of view, games are really cool!
The Red Faction series, building games like Minecraft and 7 Days to Die, and games like Factorio are some pretty obvious examples where you're completely and utterly wrong, so I'm not really sure why I should trust anything you said in the rest of your comment.
You can quite literally point at an arbitrary part of a rock wall and move it over a bit then make it bigger depending on the engine you're using. It's why Unreal Engine has got a foothold in TV production.
They routinely do have hundreds, or thousands, of interactive things. Especially things like RTS games. But also, even if you look at turn-based strategy games, which have a much more application-like interface. Every hexagon, terrain feature, and unit is interactive, along with a full application-like menu, statusbar, and UI system.
In some games, you can.
I think most of all, it isn't sufficiently visible. Most development is done on high powered hardware that makes slow code very difficult to distinguish from fast code, even though you can often get 10x performance improvements without sacrificing readability or development effort.
Individually it's just a millisecond wasted here and there, but all these small inefficiencies add up across the execution path.
Here's a fun benchmark to illustrate how incomplete beliefs like "compilers are smart enough to magically make this not matter" can be:
https://memex.marginalia.nu//junk/DedupTest.gmi
It's a 50x difference between the common idiomatic approach and the somewhat optimized greybeard solution, with a wide spectrum of both readability and performance in-between.
If you put zero thought toward this, your modern code will make your modern computer run as though it was a computer from the late '90s.
Honestly, what I want is for more developers to test and optimize their software for low-power machines. Say, a cheap netbook for example (like a Chromebook). I've heard that if you do that, you will be faster than basically every other piece of software on the system. And that speed will persist (multiply, even) on any more-powerful computer.
I've heard of one person who does that for their Quake-engine game/implementation (don't remember which). They get thousands of frames per second on a modern machine. I am guilty of not doing that myself, though. Might pick up a cheap netbook from eBay for around $30.
I saw a project that had a very clear N+1 problem in its database queries and yet nobody seemed to care, because they liked doing nested service calls, instead of writing more complicated SQL queries for fetching the data (or even using views, to abstract the complexity away from the app) and the performance was "good enough".
The end result was that an application page that should have taken less than 1 second to load took around 7-8 seconds because of hundreds if not thousands of DB calls to populate a few tables. Because it was a mostly internal application, that was deemed good enough. Only when those times hit around 20-30 seconds was I called in to help.
At that point rewriting dozens of related service calls was no longer viable (in the time frame a fix was expected in), so I could "fix" the problem with in-memory caching, because about 70% of the DB calls requested the very same data, just in different nested loop iterations. Of course, this fix was subjectively bad, given the cache invalidation problems that it brought (or at least would bring in the future if the data ever changed during the cache lifetime).
What's even more "fun" was the fact that the DB wasn't usable through a remote connection (say, using a VPN during COVID) because of all the calls - imagine waiting for 1,000 DB calls to complete sequentially, with the full network round trip between each of them. And of course, launching a local database wasn't in the plans, so me undertaking that initiative also needed days of work (versus something more convenient like MySQL/MariaDB/PostgreSQL being used, which would have made it a job for a few hours, no more; as opposed to the "enterprise" database).
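For illustration, here's a sketch of the N+1 shape and the two ways out - the proper JOIN, or the memoization band-aid described above. The schema and names are made up; SQLite stands in for the "enterprise" database.

```python
import functools
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'alice'), (2, 'bob');
    INSERT INTO posts VALUES (1, 1, 'a'), (2, 1, 'b'), (3, 2, 'c');
""")

# N+1: one query for the rows, then one extra query PER ROW.
def page_n_plus_one():
    posts = conn.execute(
        "SELECT id, author_id, title FROM posts ORDER BY id").fetchall()
    rows = []
    for _, author_id, title in posts:
        name = conn.execute("SELECT name FROM authors WHERE id = ?",
                            (author_id,)).fetchone()[0]
        rows.append((title, name))
    return rows

# The proper fix: one JOIN, one round trip.
def page_joined():
    return conn.execute("""SELECT p.title, a.name FROM posts p
                           JOIN authors a ON a.id = p.author_id
                           ORDER BY p.id""").fetchall()

# The band-aid: memoize the repeated per-row lookup.
@functools.cache
def author_name(author_id):
    return conn.execute("SELECT name FROM authors WHERE id = ?",
                        (author_id,)).fetchone()[0]
```

Over a LAN the N+1 version merely wastes CPU; over a VPN, each of those per-row queries pays a full network round trip, which is exactly the failure mode described above.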
In my eyes, it's all about caring about software development: either you do, or you don't. Sometimes you're paid to care, other times everyone assumes that things are "good enough" without paying attention to performance, testing, readability, documentation, discoverability etc. I'm pretty sure that as long as you're allowed to ship mediocre software, exactly that will be done.
Yes, that's the difference. I'm lucky enough to work mostly with colleagues that share a certain level of craftsmanship about the software we ship. We all care about the quality of the product, although we may not always agree on what an ideal solution should look like.
(If you go this route, you've got to be careful what's been trimmed, mind, if you want to use it for general purposes - mine doesn't have WSL and seems to have damaged printing support, which is obviously suboptimal depending on your needs.)
If I could go back in time and make one change to the history of computing, I'd add code to every consumer-facing GUI OS that would kill-dash-nine any application blocking UI thread for more than say 200ms. And somehow make it illegal to override for any consumer software, or if I were God make it so any developer trying to override it to let their software be slow gets a brain aneurysm rendering them (even more) incapable of writing software and has to find work digging ditches.
This is sheer stupidity: a more complicated, less efficient solution to a simple problem.
They presumably wanted to recursively enumerate all the files under the Sound recordings directory?
(In any case, for the application programmer the difference between recursive enumeration of files and enumerating only what's directly in a specific directory is likely just one flag or so.)
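As a rough illustration of that "one flag" difference, in Python's standard library it's one character in the method name - with an enormous difference in cost on a big Documents tree (the directory layout below is made up):

```python
import tempfile
from pathlib import Path

# Build a tiny stand-in directory tree.
root = Path(tempfile.mkdtemp())
(root / "a.m4a").touch()
sub = root / "sub"
sub.mkdir()
(sub / "b.m4a").touch()

# Flat: only entries directly in the directory - what the app needs.
flat = sorted(p.name for p in root.glob("*.m4a"))

# Recursive: the entire subtree - a trivially different call,
# a potentially huge amount of extra I/O.
deep = sorted(p.name for p in root.rglob("*.m4a"))
```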
My old company used a proprietary VPN product (Appgate SDP) with an Electron front end - why? Well, the front end could be reused on Windows, Linux and macOS. Fair enough, but it was very bloated and lacked features. Compound that with half a dozen Electron apps and suddenly my laptop battery dies in half the time.
We can't guarantee or mandate that every project will be written in Rust, multiple times for every platform using their native APIs - so it's on the tool makers to bridge that gap.
Electron shows that we need a robust, efficient, cross-platform, cross-language GUI API.
Would be nice to be able to use Go, Rust, Python, JavaScript or whatever and interact with a GUI interpreter that translates a familiar API to native widgets.
The truth is, Electron is easy to use, easy to hire for and cheap to build on, and that makes it the best choice for many companies, even when users might (highly likely) hate it.
Telling users what they ought to want is not a very good idea.
That is to say -- not cheap.
That said I did work at a major media company and the performance sucked (in the help section) and I said I want to improve it and the PM said "nobody cares" but I think she meant the business didn't care. So there is one anecdote supporting your claim!
They are all much more powerful than Apple notes but they are so slow by the time they are ready for my input I have forgotten what I was trying to note down.
Deleted Comment
Visual Studio and Adobe Creative Suite are examples of the two most egregious offenders. They're both awfully slow and Creative Suite is extremely buggy as well. Visual Studio is nearly 30 years old, and Photoshop is even older. That's a lot of features to have to carry forward over the years.
It would be neat if during setup, they tried to figure out what your goals were as a user and turned stuff off you didn't need. My pet peeve about Visual Studio is whenever I upgrade to a new version, my default setting overrides are not carried over.
Also, it's about caring enough to actually stop using it. Have you ever changed software to a competitor with better performance? I don't know if I have..
Then on top of that, companies are making choices like “we could optimize X and Y, but that would take too much dev time. Let’s just (compromise somewhere/bump up the requirements)”
We’re stuck and getting stucker.
30 years ago, nobody in their right mind would have thought "du'h, I'll just recursively scan whatever directory THIS environment variable points at and then apply whatever this library does to each file in it to get some list and then extract the files I need from that list"..
Even IF you had to do such a terrible and expensive operation, you'd at least have the good sense to save your result so that you can do it more efficiently the next time..
I think plenty of people would have made the error of filtering after applying an expensive operation rather than before. This seems like a classic rookie mistake, not an indication of some paradigm shift in thinking. What was impressive was that no one at Rockstar thought of improving the loading performance for so long that a user decided to reverse-engineer it.
Imagine being a SWE working 12+ hours a day 6 days a week getting no rest. I might have accidentally coded something like this so that I could go home a couple of hours earlier.
Funny coincidence, yesterday I posted a link which supports exactly your statement (even the 30 years): https://news.ycombinator.com/item?id=33018669
I'd be wary with these kinds of words.
The thing is that people seem really, really eager to work in the video games industry. (Just like people are really eager to be artists or musicians or pet vets or teachers etc.)
Approximately everyone who programs for the game industry could also land a cushy job writing boring CRUD software.
I do admit that it's tempting to show compassion for people who put up with crappy working conditions and low pay in order to work in their beloved field.
(Especially if that field sounds socially desirable like teaching.)
But unless you severely restrict entry to the beloved industry, conditions and pay will always be lousy compared to what similar skills could get you in an unglamorous field. That's just supply and demand.
> Imagine being a SWE working 12+ hours a day 6 days a week getting no rest. I might have accidentally coded something like this so that I could go home a couple of hours earlier.
In the case of the game industry, more likely so that you could work on the next ticket during the eternal death march.
For some software, performance is important. For others, it isn't. Some developers are good, at least at making performant software, some are worse. That's always been true and will probably always be true.
Perhaps we could justify it with new features at least? Perhaps my phone is slower now because the things in there are so much more useful now? Well, no. I got maybe a couple cosmetic changes, but functionality is exactly the same. My phone got slower for no benefit.
We had lots of 'free' improvements in throughput. But latency still requires careful attention.
But then they hardcode a whole Documents directory scan right in the UI thread in their own damn voice recorder app. Unbelievable.
“Looks fast to me! Ship it.”
“Looks fast to me too! Approve it!”
We didn't used to have product teams as large as engineering teams. You can still engineer software the way you want, you're just beholden to some product person's idea about deadlines now. If you can turn your little unit of work around quickly, no one will bat an eye (of course, we all know that's not realistic most of the time). If you dare take longer than their perception of how long it ought to take, then you'd better be prepared to discuss the "value add" of your performance-minded solution ad nauseam.
I can imagine the audio recorder is a low priority minor side project that gets handed from one guy (or team) to the next every couple of years, with the emphasis on "don't spend too much time on it".
The other issue is that performance is taken for granted; if something is slow, it's not immediately obvious.
I think you’re giving programmers of the past too much credit.
You'll be amazed how many seemingly intractable technical issues aren't, because they were solved at higher layers of abstraction. Nowadays, with things like clear-cut post-first-sale ownership breaking down, we're actually getting bitten by having to solve technically problems that were previously made tractable by the social consensus between professionals.
As a consequence, here we are with a now crappy, pathological Notepad.
A perfect example is windows 95 - wiggling the mouse causes a huge speedup in long running applications [0].
> used to always be about "tricking" the result out of the computer by applying deep understanding of both system and problem to find the fewest, most performant steps required to yield the result.
That "tricking" came with lots of issues. Massive security holes, significant app stability issues, and hugely increased developer time to implement things. I can get a hardware accelerated Sha1 hash of a buffer in one line of c++ today, and I know it will be correct.
People who do that work still exist. Computers are fast enough that it's worth not spending hours optimising for every allocation when you're displaying half a dozen strings on screen, but people like Casey Muratori, Bruce Dawson, still exist and still write code.
[0] https://retrocomputing.stackexchange.com/questions/11533/why...
And this attitude is what gives us such gems as the one mentioned in the OP article.
That is what the quote is talking about. I don't think you can blame the quote if people are just ignoring the first word, and I'm not convinced that's actually a common thing.
You shouldn't even need to profile to realise that if you're using a serial port for I/O, unless your hardware is something embedded in the low-MHz range, you'll spend most of your time waiting for I/O unless you massively overcomplicate your solution.
In my experience it is mostly used as an excuse not to think about performance at all. "The quote says we can worry about it later!"
That's why the quote needs to die.
That's what Mike Acton says before he shakes your hand in the morning.
Q "I'm working on high throughput data science thing, is branchless math better than conditional instructions here, because this thing needs to take less than a lifetime to finish"
A "You should focus on code readability, it's way more important ..yadeyadaya ... i work on python and websites"
I blame slipping standards in CS programs and the entire concept of "information science"
Agreed. I really dislike this excuse for sloppy slow code.
Yes, developer time is expensive, need to keep ROI in mind.
But if the expensive developer spends a week optimizing a function to make it 1 second faster, is that worth it? Too many people will respond "developer time is expensive" without thinking beyond that. But of course, it depends.
If that function is being invoked millions of times per day by hundreds of thousands of people, it very quickly becomes totally worth it.
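A quick back-of-the-envelope check, with assumed numbers (one week of developer time at $100/hour, one second saved per invocation, a million invocations per day):

```python
# All figures below are illustrative assumptions, not data from the thread.
dev_cost = 40 * 100                  # one 40-hour week at $100/hour
seconds_saved = 1.0                  # saved per invocation
calls_per_day = 1_000_000

user_hours_saved_per_day = seconds_saved * calls_per_day / 3600
# Roughly 278 hours of aggregate user time saved every single day,
# against a one-time $4,000 of developer time.
```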
According to Brooks's famous paper, code reuse is the only way to significantly improve programmer productivity. No new language will suddenly make you 10x faster, and even that would be meaningless against all the quality code bases out there.
It's important to be able to recognize which of the two echelons of problem-space one's codebase inhabits.
What's ironic is that premature optimization isn't just about code: using K8s or expensive cloud solutions when a bare-metal dedi would suffice, etc.
Instead it's just parroted by people who want to churn out MVP quality code in prod.
Defend Your Hot Path!
No one will say thank you for fixing something, but everybody will blame you if you fail trying.
Premature “premature optimization is the root of all evil” is the root of all evil.
You could come up with data showing an R^2 of like 0.99 and there's always a smartass that immediately parrots "but remember that correlation does not mean causation". Ugh.
'Cuz why waste time doing something harder? Premature optimization and all that.
Honestly, in my experience, basically the opposite of the quote is true. If you want your software to be fast, you need to be thinking about speed in the design from the very beginning. Very hard, perhaps impossible, to make something not designed for speed from the beginning be as fast as something that was.
Of course there's a difference between having a holistic design that you know can go fast and immediately wasting time eeking out x% of speed in some random sub-fn that will only run y% of the time, when x and y are small.
But really performance does need to be built into the design.
(As an example, look at the heroic steps that are being taken to try to make the Rust compiler faster. It's slowly getting better, but it will never be fast. On the other hand, I would predict that if you started making a Rust-like-language from day 0 with a fast compiler as a goal, you could get something much faster.)
As hardware gets better, we're finding ways to use it less efficiently. So if you want apps to become faster, you shouldn't be building better hardware. You should be changing human tolerance thresholds.
Which is almost impossible to pull off. Apple did this when they released the iPhone. Remember mobile apps before iOS? Mobile browsers before Safari?
When I started programming, people I learned from would optimise code in assembler, use clever algorithms and write highly unreadable code that was impossible to debug. One bit of code I remember interleaved various operations so that its access pattern would line up with disk reads. I rewrote it to just hold the data in memory. It wasn't much faster than his disk-based code in practice.
But the only reason to go to that kind of effort was the severe constraints hardware placed on software. When those are largely removed, the imperative is to write code fast that other people can still understand.
In around 1990, I gave my dad my old 286 machine, with Win 3.1. He never upgraded it.
About 2000, when I was accustomed to much newer software & hardware, I used his old machine. I was amazed how much faster everything was.
Oddly, with Windows 7 on the Core 2 Duo, the taskbar would randomly stop updating (I think it stopped happening after moving the same hard drive into a newer laptop?), and I got no end of driver issues: the touchpad would stop responding on the lock screen or logged in after sleep-wake, and audio would randomly stop playing until I restarted audiodg.exe. As far as I can remember, none of these issues happen on my Ivy Bridge laptop, where I'm clinging for life onto Windows 7 (the last version with a soul) for as long as I can keep apps running... though I'm getting rather tired of Tailscale creating a new network adapter and asking for home/work/public on every reboot.
I run Win95 and Win2000Pro in a virtual machine with old software and it runs faster than the most recent versions.
I put Ubuntu on a Lenovo PC and it runs faster than Windows 10 did.
There's still a couple quirks, but Windows 10 had its fair number of quirks too...
The linux one was strange because my arch/systemd/ssd/i3 setup is lean, you get parallel boot, no heavy DE no bloat.. but everything felt lighter and faster on kali and I had a physical reaction saying "I don't miss anything from my current laptop, that old thing was closer to my needs".
Maybe a part of our brain that doesn't care about typography or visual effects much and prefer good old crude solid lag free tools.
I was, and am, blown away by how responsive it is. Plus, the UI sound effects add to the experience in a great way. You can HEAR when you've done something "in the computer." Just a bunch of clicks and boops. It's fantastic. Makes you think about what we've just... gotten used to.
Here you cannot really blame the poor developer, when the PM demands scanning all of Documents, not just the relevant sound recordings. It's so obviously insane that only PMs get away with it. Of course the dev matched our PM here with his insane directory iteration, but the idea alone must be appreciated first.
https://github.com/microsoft/WindowsAppSDK/issues/8
"The challenge here is simply that listing the StorageItems in a StorageFolder is incredibly slow and resource intensive compared to using the standard .Net/Win32 API to list the file paths. A quick test showed that in .Net it takes about 0.003ms per file, whereas with UWP StorageItems it takes about 2ms per file, which is around 700 times slower. Also, looking at the Runtime Broker, UWP uses about 70kB memory per file, which is a huge cost when loading 10,000+ files, whereas .Net uses around 0.1kB per file (that’s a very rough estimate)."
Somewhere else, someone proposes a theory that due to API mismatches/design issues in Win32 the RuntimeBroker is trying to use a design that isn't a great fit in order to try and provide certain security guarantees, and this requires pre-fetching a lot of data up front in case the app requests it. But NTFS is slow, so, all this additional traffic makes opening files via the Storage API really really slow.
The problem here isn't really "modern software", it's that Microsoft wrote a lot of new APIs over time with the intention of replacing Win32 (UWP/WinRT) but they aren't dogfooding them all that effectively, and they're using C++ for everything, so there are problems that don't get fixed for years.
Imagine the cost of a non-obvious RPC call in a nested for loop? Just not funny.
If they do try it on their own system, they'll likely discount any performance hits as due to their own hoarding of files, and not consider that the user likely has just as many, if not more, files. Thus, it's not likely the programmer will notice it.
It's only if some exogenous event causes a programmer to consider run-time performance that it will be measured, and then optimized.
So you end up dismissing slowness due to this ("I am sure this is faster when compiled in Release!").
I've made that mistake before until it was so slow that I decided to compile a Release build which was just as slow and found out that a regression was introduced.
Automatic performance monitoring for 'tasks'/services/computations is relatively straightforward but not quite as easy for UI interactions so these often get ignored.
Bingo. This used to be a well-known benchmarking cheat, because performance on a full (or even once-full) filesystem can be very different than on a freshly made one. Almost as common as short-stroking, if you remember that. Anybody who wanted realistic results would run tests on a pre-aged filesystem. Cache warming is another example of a similar effect, and there are probably others across every area of computing. It's always really tempting to run tests in a maximally clean environment to get maximally consistent results, but that just doesn't reflect real-world usage.
In the 90s, we had shooters running at playable frame rates using software rasterization on a single thread. Now from this vantage point, think about the "what if you had a 1000x faster CPU" thought experiment that was posted to HN recently. Except, make it more like 100,000x faster...
If we had taken SIMD a bit further on the CPU and kept our extremely scrappy software engineering practices, I think it's possible the GPU might not have ever really emerged as a major consumer product. Recent game engine tech, such as Nanite's software rasterizer, is starting to shift ideologies back the other way now too.
I’m really skeptical of this take; the gains from the GPU architecture are too large to pass up.
Then came 3dfx, which was a blast.
You can even call the Amiga Blitter as one of the first GPUs, at least a specialized graphics chip. Same goes for coprocessors, like a math unit in a 486DX for example.