Posted by u/eezurr 4 years ago
Ask HN: How were video games from the 90s so efficient?
Title says it all. I am interested in discovering how games like RollerCoaster Tycoon, SimCity 2000, Warcraft II, and Descent (blown away by this one) managed to be created and function on computers that had a 500 MB HDD, a 300 MHz CPU, and 4 MB of RAM.

I'd even broaden the question and ask how did windows 95 stay so small?

Is it possible to recreate this level of efficiency on modern systems? I'm curious because I'm interested in creating simulation video games. Dwarf Fortress and RimWorld both eventually suffer from the same problem: CPU death.

If I create a window with C++ and SFML, 60 MB of RAM is used (not impressive at all). If I put 3,000,000 tiles on the screen (using Vertex Arrays), 1 GB of RAM is used (admittedly, that is impressive) and I can pan all the tiles around smoothly.
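
For concreteness, here's roughly what I mean, plus the obvious cull-to-the-view trick (untested sketch, SFML 2-style API, made-up map size and colours):

    // Sketch only: rebuild vertices for the tiles that intersect the view,
    // instead of holding ~3 million quads in one array.
    #include <SFML/Graphics.hpp>
    #include <algorithm>

    constexpr int   MAP_W = 1732, MAP_H = 1732;   // ~3 million tiles
    constexpr float TILE  = 16.f;

    sf::VertexArray buildVisible(const sf::View& view)
    {
        sf::FloatRect r(view.getCenter() - view.getSize() / 2.f, view.getSize());
        int x0 = std::max(0, int(r.left / TILE));
        int x1 = std::min(MAP_W, int((r.left + r.width) / TILE) + 1);
        int y0 = std::max(0, int(r.top / TILE));
        int y1 = std::min(MAP_H, int((r.top + r.height) / TILE) + 1);

        sf::VertexArray va(sf::Quads);
        for (int y = y0; y < y1; ++y)
            for (int x = x0; x < x1; ++x) {
                sf::Color c((x * 37) & 255, (y * 57) & 255, 128); // placeholder tile colour
                va.append(sf::Vertex(sf::Vector2f(x * TILE,       y * TILE),       c));
                va.append(sf::Vertex(sf::Vector2f((x + 1) * TILE, y * TILE),       c));
                va.append(sf::Vertex(sf::Vector2f((x + 1) * TILE, (y + 1) * TILE), c));
                va.append(sf::Vertex(sf::Vector2f(x * TILE,       (y + 1) * TILE), c));
            }
        return va;   // a few thousand quads instead of millions
    }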

What other tricks are available?

Dave_Rosenthal · 4 years ago
I built games in the 90s. Graphics was obviously the hardest part.

We thought about things in terms of how many instructions per pixel per frame we could afford to spend. Before the 90s it was hard to even update all the pixels of a 320x200x8-bit (i.e. mode 13h) display at 30 fps. So you had to do stuff like only redraw the part of the screen that moved. That led to games like Donkey Kong where there was a static world and only a few elements updated.

In the 90s we got to the point where you had a Pentium processor at 66 MHz (woo!). At that point, 66 MHz / 320 (width) / 200 (height) / 30 (fps) gave you about 34 clocks per pixel. 34 clocks was way more than needed for a 2D bitblt (e.g. memcpy'ing each line of a sprite), so we could go beyond 2D Mario-like games to 3D ones.
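
A 2D blit at that point was basically one memcpy per sprite row (simplified sketch, ignoring clipping and transparent pixels):

    /* 320x200x8bpp framebuffer: a couple of clocks per pixel. */
    #include <string.h>

    void blit(unsigned char *frame, const unsigned char *sprite,
              int x, int y, int w, int h)
    {
        int row;
        for (row = 0; row < h; ++row)
            memcpy(frame + (y + row) * 320 + x,  /* destination scanline */
                   sprite + row * w,             /* source scanline      */
                   w);
    }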

With 34 clocks, you could write a texture mapper (in assembly) that was around 10-15 clocks per pixel (if memory serves) and have a few cycles left over for everything else. You also had to keep overdraw low (meaning, each part of the screen was only drawn once or maybe two times). With those techniques, you could make a game where the graphics were 3D and redrawn from scratch every frame.

The other big challenge was that floating point was slow back then (and certain processors did or didn't have floating-point coprocessors, etc.) so we used a lot of fixed point math and approximations. The hard part was dividing, which is required for perspective calculations in a 3D game, but was super slow and not amenable to fixed-point techniques. A single divide per pixel would blow your entire clock budget! "Perspective correct" texture mappers were not common in the 90s, and games like Descent that relied on them used lots of approximations to make it fast enough.
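
To make the fixed-point part concrete, it looked roughly like this (a simplified modern sketch, not our actual code):

    /* 16.16 fixed point: multiplies and adds stay in integer registers; the
       expensive operation is the divide (e.g. 1/z for perspective), which is
       why you did it rarely and interpolated the result across pixels. */
    #include <stdint.h>

    typedef int32_t fix16;                                  /* 16.16 */

    static inline fix16 fix_from_float(float f) { return (fix16)(f * 65536.0f); }
    static inline fix16 fix_mul(fix16 a, fix16 b) { return (fix16)(((int64_t)a * b) >> 16); }
    static inline fix16 fix_div(fix16 a, fix16 b) { return (fix16)(((int64_t)a << 16) / b); } /* slow */

    /* Inner loop of an affine texture mapper: one add per pixel, no divides.   */
    /* u += du;  v += dv;  *dest++ = texture[(v >> 16) * TEX_W + (u >> 16)];    */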

cmroanirgo · 4 years ago
Agree with everything you said. We used x87 though and paid extreme attention to fpu stalls to ensure nothing wasted clocks.

As developers, we were also forced to give the graphics guys a really hard time: "no, that texture is too big! 128x128" and "you need to do it again with fewer polygons". We used various levels of detail in textures and models to minimise calcs and rendering issues. E.g. a tank with only 12 vertices when it would only be a pixel or three on screen. I think it only used 2x2 texels as part of a 32x32 texture (or thereabouts)...

This was around the mid-90s.
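
The LOD pick itself was nothing fancy; conceptually something like this (illustrative sketch, thresholds and names made up):

    struct Model { int vertexCount; /* mesh data ... */ };

    /* Choose a variant from the object's rough on-screen size in pixels. */
    const Model* pickLod(const Model lods[3], float distance, float radius,
                         float screenHeight)
    {
        float projected = radius / distance * screenHeight;
        if (projected < 4.0f)  return &lods[2];   /* "a pixel or three": the 12-vertex tank */
        if (projected < 40.0f) return &lods[1];
        return &lods[0];                          /* full-detail model */
    }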

Dave_Rosenthal · 4 years ago
Ha. Yeah, there was not a lot of memory. The toolchain we built at the time automatically packed all individual game texture maps into a single 256x256 texture (IIRC). If the artists made too many textures, everything started to look bad because everything got downsampled too much.

And yeah, the design of the game content was absolutely affected by things like polygon count concerns: "Say, wouldn't it be cool to have a ship that's shaped like a torus with a bunch of fins sticking out? Actually, on second thought... how about one that looks like a big spike? :)"
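
The packer itself doesn't have to be clever, either; a simple "shelf" allocator along these lines gets you most of the way (sketch only, certainly not our old toolchain code):

    /* Pack small textures left-to-right into horizontal shelves of a 256x256 atlas. */
    struct Rect { int x, y, w, h; };

    struct Atlas {
        static constexpr int SIZE = 256;
        int shelfX = 0, shelfY = 0, shelfH = 0;

        bool place(int w, int h, Rect& out) {
            if (shelfX + w > SIZE) {              // current shelf full: start a new one
                shelfY += shelfH;
                shelfX  = 0;
                shelfH  = 0;
            }
            if (shelfY + h > SIZE) return false;  // atlas full: time to downsample
            out = { shelfX, shelfY, w, h };
            shelfX += w;
            if (h > shelfH) shelfH = h;
            return true;
        }
    };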

Agentlien · 4 years ago
Honestly, though the numbers are bigger and floating-point arithmetic is fast on a modern GPU, this still sounds a lot like how we work nowadays.

I recently spent two years on the performance team for a large indie title and a huge portion of it was asking artists to simplify meshes, improve LODs, increase the amount of culling we could do on the CPU, etc.

My own work was mainly optimising the rendering itself.

pkphilip · 4 years ago
The 1990s were a time when we tried all sorts of tricks to get the last bit of performance out of the hardware. One book which stands out is the one Michael Abrash wrote:

Michael Abrash's "Graphics Programming Black Book" https://github.com/jagregory/abrash-black-book

https://www.drdobbs.com/parallel/graphics-programming-black-...

As CPU power, core counts, RAM sizes, HDD sizes, and graphics card capabilities have increased, developers are no longer as careful about squeezing out performance.

patentatt · 4 years ago
I may be naïve/out of the loop here, but it’s fun to imagine what would be possible with the same gusto and creativity applied to today’s hardware. I imagine that a significant amount of modern hardware, even in games, is eaten up by several layers of abstraction that make it easier for developers to crank out games faster. What would a 90’s developer do with a 16-core CPU and an RTX3080?
smcameron · 4 years ago
I remember reading the comp.graphics.algorithms news group back when Descent was just out. People were going a little bit crazy trying to figure out how the hell that thing worked. I found this page that talks about some of the things done to do texture mapping: https://www.lysator.liu.se/%7Ezap/speedy.html
lostgame · 4 years ago
Descent blew my mind, as well. IIRC it predated Quake and was the first 'true' 6DoF FPS?

The source code has since been released on GitHub, if you’re ever interested in seeing it!

DeathArrow · 4 years ago
I dabbled with graphics using mode 13h and later with VGA. It was orders of magnitude simpler than using Vulkan or DX12.

CPUs were simpler, and DOS and Windows 95 were very simple compared to Windows 10.

That means that writing optimized C or even assembler routines was pretty easy.
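
For anyone who never saw it: the whole mode 13h "API" was essentially a byte array at A000:0000. A DOS-era sketch (Turbo/Borland style, 16-bit real mode; it won't build on a modern toolchain):

    #include <dos.h>
    #include <conio.h>

    int main(void)
    {
        union REGS r;
        unsigned char far *vga = (unsigned char far *)MK_FP(0xA000, 0);
        int x, y;

        r.x.ax = 0x0013; int86(0x10, &r, &r);     /* set 320x200x256 video mode */
        for (y = 0; y < 200; ++y)
            for (x = 0; x < 320; ++x)
                vga[y * 320 + x] = (unsigned char)(x ^ y);   /* plot a pixel */
        getch();
        r.x.ax = 0x0003; int86(0x10, &r, &r);     /* back to text mode */
        return 0;
    }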

If we go 10 years back in time, programming Z80 or MOS Technology 6510 or Motorola 68k was even simpler.

cryo · 4 years ago
Yes, the instruction sets were simpler, but developers at the time invented a lot of clever solutions to hard problems.

I think the most innovative timespan was roughly 1950–1997, and I hope getting the most out of hardware becomes common sense again.

cma · 4 years ago
Some interesting related stuff in this talk:

HandmadeCon 2016 - History of Software Texture Mapping in Games

https://www.youtube.com/watch?v=xn76r0JxqNM

I think they say at one point it went from 14 to 8 instructions, and then the Duke Nukem guy (Ken Silverman) got it down to around 4.

Quake would do something where it only issued a divide every 8 pixels or something, and then just interpolate in between, and the out-of-order execution on the Pentium Pro (I think?) would let it all work out.

joakleaf · 4 years ago
For perspective-correct texture mapping, Quake did the distance divide every 8 pixels on the FPU, and the affine texture mapping on the integer side of the CPU in between (you can actually see a little bit of “bending” if you stand right next to a wall surface at a low resolution like 320x200).

Since the FPU could work in parallel with the integer instructions on the Pentium, this was almost as fast as just doing affine texture mapping.

This worked even on the basic Pentium.

It was likely also the reason Quake was unplayable on the 486 and Cyrix 586.
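
In code, the idea looks roughly like this (a conceptual sketch, not Quake's actual source; the span setup and the 64x64 texture are made up, and Quake kept the in-between interpolation in integer/fixed-point registers so it could overlap with the FPU divide, whereas floats are used here just to keep the sketch short):

    #include <stdint.h>

    /* u_over_z, v_over_z, one_over_z are the values at the span start; du, dv,
       dz are their per-pixel gradients. One divide per 8 pixels, affine between. */
    void drawSpan(uint8_t *dest, const uint8_t *texture,
                  float u_over_z, float v_over_z, float one_over_z,
                  float du, float dv, float dz, int count)
    {
        const int STEP = 8;
        float z = 1.0f / one_over_z;                        /* FPU divide */
        float u = u_over_z * z, v = v_over_z * z;

        while (count > 0) {
            int run = count < STEP ? count : STEP;
            float z2 = 1.0f / (one_over_z + dz * run);      /* next divide, can overlap */
            float u2 = (u_over_z + du * run) * z2;
            float v2 = (v_over_z + dv * run) * z2;
            float ustep = (u2 - u) / run, vstep = (v2 - v) / run;

            for (int i = 0; i < run; ++i) {                 /* affine inner loop */
                *dest++ = texture[((int)v & 63) * 64 + ((int)u & 63)];
                u += ustep; v += vstep;
            }
            u_over_z += du * run; v_over_z += dv * run; one_over_z += dz * run;
            u = u2; v = v2;
            count -= run;
        }
    }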

kqr · 4 years ago
Great discussion! Early on, they mention what I think is the book Applied Concepts in Microcomputer Graphics. It sounds like it would be right up my alley, but I can only find it with very expensive shipping.

Does anyone know of a book like it? I'm very interested in getting started with software rendering from the ground up, mainly to scratch an intellectual itch of mine. I learn better from well-curated books than online material in general.

freewizard · 4 years ago
This brings back a lot of good memories. In addition to the cool tricks you mentioned, I recall the 256- or 16-color limit also led to some innovative ways to use color palettes dynamically. The limited built-in PC speaker also pushed the boundary for sound FX and music implementations.
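
Palette cycling was a classic example: you could animate water, fire or waterfalls without touching a single pixel, just by rotating palette entries each frame (conceptual sketch; on real VGA hardware you would then re-upload the changed entries through the DAC):

    #include <algorithm>

    struct Rgb { unsigned char r, g, b; };

    /* Rotate palette entries [first, last] by one each frame; every pixel
       whose index falls in that range appears to "move". */
    void cyclePalette(Rgb palette[256], int first, int last)
    {
        std::rotate(palette + first, palette + first + 1, palette + last + 1);
    }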
andrewjf · 4 years ago
Out of curiosity, how did you know how many clock cycles your rendering code took?
SavantIdiot · 4 years ago
You look at the assembly code, grab an Intel Programmer's Reference Manual (they were about 1000 pages), look up each instruction opcode, and that tells you the clock cycles. For memory operations it's much more difficult due to caching; however, for many hot regions of code, the data is already in the L1s, so manual counting is sufficient. (At the time there was a book called The Black Art of ... Assembly? I can't recall, and I should be flogged for forgetting it... but it was directed at game programmers and covered all sorts of assembly tricks for Intel CPUs.)

Also, a little later in the 90's: VTune. When VTune dropped it was a game changer. Intel started adding performance counters to the CPUs that could be queried in real-time so you could literally see what code was missing branches or stalling, etc. Source: I worked with Blizzard (pre-WoW!) developing VTune, feeding back requirements for new performance counter requests from them and developers.

Dave_Rosenthal · 4 years ago
You could just measure the isolated inner loop with accurate timers and figure out down to the clock how many cycles it was taking.

You also basically knew how many cycles each instruction took (add, multiply, bit shift, memory reference, etc.) so you just added up the clock counts. (Though things got a bit harder to predict starting with Pentium as it had separate pipelines called U and V that could sometimes overlap instructions.)
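
If you want to play with that today, the isolated-inner-loop measurement is still only a few lines (x86, GCC/Clang __rdtsc intrinsic; a rough sketch that ignores serialization, frequency scaling and so on):

    #include <stdint.h>
    #include <stdio.h>
    #include <x86intrin.h>

    int main(void)
    {
        volatile uint32_t sink = 0;
        const int N = 1000000;

        uint64_t start = __rdtsc();
        for (int i = 0; i < N; ++i)
            sink = sink + i * 3;               /* the "inner loop" under test */
        uint64_t cycles = __rdtsc() - start;

        printf("~%.2f cycles per iteration\n", (double)cycles / N);
        return 0;
    }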

laumars · 4 years ago
Good write-up. I'd also add that modern techniques are written to use modern hardware.

It’s easy to forget that each upgrade to graphics is an exponential jump. Going from 8 colours to 256 colours on screen. Jumps in resolution. Jumps in sound, in the number of sprites the hardware can track. Etc.

When we look at graphics now, at the tangible visible difference between 8K, 4K and HD, with Moore's Law no longer in effect, it is easy to forget just how significant the jumps in tech were in the 80s and 90s if you hadn't lived through it and/or developed code during it.

tinus_hn · 4 years ago
A game like Donkey Kong uses a static tile mapped background and some small dynamic objects that move around, and the hardware combines them.

These machines don’t even have enough memory to store one frame buffer, so you can't program them like a modern game, where everything is customizable and you can just do whatever you want as long as it's fast enough.

In a game like Donkey Kong you do what the hardware allows you to do (and of course the hardware is designed to allow you to do what you want to do).

smallstepforman · 4 years ago
I’ve created a 4K UHD video editor for Haiku OS (https://github.com/smallstepforman/Medo). It's a C++17 native app, with over 30 OpenGL GLSL effect plugins and addons, a multithreaded Actor model, and over 10 user languages, and the entire package file fits on a 1.44 MB floppy disk with space to spare. If I were really concerned about space, I could probably replace all .png resources with WebP and save another 200 kB.

How is it so small? No external dependencies (it uses stock Haiku packages), it uses the standard C++ system API, and it was written by a developer who learned their trade on constrained systems from the 80s. Look at the old Amiga stuff from that era.

squarefoot · 4 years ago
Nice, this may be worth a Show HN. I'm not a Haiku user, although I try it from time to time to see whether it has become interesting for my uses. I also agree on the value of being exposed to the old way of doing things. I started coding on the Amiga, and the way its OS worked (no memory management) forced me to grow sane habits when, for example, dealing with memory allocation: if I didn't free a buffer before my program exited, that buffer would remain allocated until the next reboot. I once had to debug a small assembly program of mine that lost a longword (4 bytes) on every run; it turned out I had missed it when doing pointer calculations with registers, and the journey to monitor the program's activity, find the problem, and correct the error was of tremendous help years later.
MrDOS · 4 years ago
> Nice, this may be worth a Show HN.

smallstepforman has submitted it before: https://news.ycombinator.com/item?id=25513557. Unfortunately, it didn't garner much attention.

kragen · 4 years ago
This looks amazing. Thank you for sharing.
egypturnash · 4 years ago
Lower resolutions on smaller monitors. 256 colors at a time, no truecolor yet.

Lower-res samples, if any.

Lower framerate expectations - try playing Descent 2 on an emulated computer with similar specs to the lowest specs suggested on the box. Even one in the middle of the spec range probably didn't get a constant 60fps.

More hand-tuned assembly (RCT was famously 99% assembly, according to Wikipedia; this was starting to be unusual, but people who'd been in the industry a while had probably done at least one game in 100% assembly, and would have been pretty comfortable with hand-optimizing stuff as needed).

Simpler worlds, with simpler AI. Victory conditions designed to be reached before the number of entities in the game overwhelmed the CPU completely.

Simpler models. Most 3d games you play now probably have more polygons in your character's weapon than Descent would have in the entire level and all the active entities; they certainly have more polys in the overall character.

I mean, really, three million tiles? That's insane by that day's standards: it's a bit more than 1700x1700 tiles. A quick search tells me the maximum map size in RollerCoaster Tycoon 3 was 256x256, and it's a fairly safe assumption that RCT1 was at best the same size, if not smaller. I can't find anything similar for SimCity 2000, but I would be surprised if it was much bigger.

fxtentacle · 4 years ago
Quite simply: No frameworks.

Some games nowadays are built using Electron, which means they include a full web browser which will then run the game logic in JavaScript. That alone can cause +1000% CPU usage.

Unity (e.g. RimWorld) wastes quite a lot of CPU cycles on things that you'll probably never use or need, but since it's a one-size-fits-all solution, they need to include everything.

For Unreal Engine, advanced studios will actually configure compile-time flags to remove features that they don't need, and since it's C++ and in general well designed, it can become quite efficient if you use it correctly.

And then there are script marketplaces. They will save you a lot of time getting your game ready for release quickly, but they are usually coded by motivated amateurs and are super inefficient. But if CPUs are generally fast enough and the budgets for game developers are generally low, many people will trade lower release performance for a faster time to market.

=> Modern games are slow because it's cheaper and more convenient that way.

But there still are tech demos where people push efficiency to the limit. pouet.net comes to mind. And of course the UE5 demo, which runs on an AMD GPU and an 8-core AMD CPU:

https://www.youtube.com/watch?v=iIDzZJpDlpA

lostgame · 4 years ago
That demo is eerily beautiful. Honestly the first time I’ve seen photorealistic detail in a game engine.
don-code · 4 years ago
These tricks are things that you could still pull off today. The difference is, outside of competitions or the demoscene, nobody _needs_ to pull them off - it greatly decreases your time-to-market to do things in a straightforward way, rather than a novel way. That wasn't true 20+ years ago - the straightforward way often brought you up against hardware limitations.
bailey1541 · 4 years ago
Having been a game programmer back in the day (C64, Amiga, Atari Jaguar, N64, and beyond to newer machines) I fully agree with your point.

My earliest days had the most fantastic tricks. So-called game engines were useless. Now it’s the complete opposite.

Dracophoenix · 4 years ago
What did you program on the N64? And which among those platforms was your favorite?
kqr · 4 years ago
> it greatly decreases your time-to-market to do things in a straightforward way

I mean, it's hard to disagree when you phrase it this way, but... really? In the old days (mid-'90s), studios like id released many games per year, some of them with completely new technology.

Modern studios and indie developers (!) who "do things in a straightforward way" can be happy to release even one game per year, and that's with a lot of reuse. Forget novel technology once a year!

So maybe these tricks don't really increase time to market that much, compared to other variables that are also in play?

Pamar · 4 years ago
I have zero information about (due also to little interest in) computer games, so this is just wild speculation: maybe the visuals, in terms of levels/textures/objects/meshes/characters/voice/audio, dominate the planning now?
KronisLV · 4 years ago
> ... it greatly decreases your time-to-market to do things in a straightforward way ...

I'd actually like to link the YouTube channel of a person who is writing their own game engine and game at the same time: https://www.youtube.com/c/RandallThomas/videos

You can see how using Godot, Unity or Unreal (or most other engines) would have been much faster in terms of time to market.

Similar differences show up when you try to build the same project once with an engine and once without it: the performance can be much better if you write your own optimized code (supposing that you can do so in the first place), but development takes much longer. For example: https://www.youtube.com/watch?v=tInaI3pU19Y

Now, whether that matters to you or not is a different matter entirely: some care about learning a lot more about the lower level stuff, others just want to develop games and care more about the art/story/etc., while others care about selling them ASAP.

thewileyone · 4 years ago
The simple answer is that you have to work within the constraints you're given. I used to work for an ecommerce company, a very early Amazon competitor, and because the Internet was so slow in the early years, we had a rule that our homepage had to be less than 100 kB, including image icons. Every 1 kB squeezed out was a success and celebrated. Even today Amazon's homepage is less than 1MB, go ahead and check.

Now with CSS frameworks, JS frameworks, ad frameworks and cross-linking, pages take forever to load with even less actual content.

adventured · 4 years ago
> Even today Amazon's homepage is less than 1MB, go ahead and check.

Pingdom reports 4.8mb (3.3mb of images), 661ms to load, and 298 requests.

GT Metrix reports 2.65mb (2mb of images), and 307 requests.

An incognito window with Firefox on my system says ~3mb and ~265 requests. Just the top six or seven media assets combined that loaded initially weigh in at about 1mb.

Certainly not the worst page ever, granted.

bellyfullofbac · 4 years ago
Nitpick: MB = Megabyte. Mb = Megabit. mb = millibit?
crate_barre · 4 years ago
Hit the nail on the head. Constraints fuel creativity and innovation. If I give you everything, then what do you have left to do?

It’s like the general who wins every battle because he has the most elite soldiers. Could that general win with a rag-tag, shoddy group of soldiers?

majani · 4 years ago
Yup. I am currently developing a game for African players, and I have to build my own engine and use old-school tricks to get the game under 1 MB on a website. CSS sprites and all.
pvillano · 4 years ago
I was just thinking about some console game that launched with a 60 GB day-one patch, and how disappointing it must have been for players without good internet to put in the disc and either not be able to play or find half the content missing.
roenxi · 4 years ago
The major difference is probably the kind of game being produced. There is a reason DF starts out running smoothly and eventually hits CPU death. I'm not certain what the reason is these days, but it used to be something like pathfinding combined with polynomial explosions in how a dwarf chooses their next job.
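
Back-of-the-envelope, the naive version is "every idle dwarf considers every open job and runs a pathfind to it", which costs roughly workers x jobs x A* per decision: fine at 7 dwarves, fatal at 200. A common mitigation (illustrative sketch only, not how DF actually works) is to flood-fill the map into reachability regions so "can I get there at all?" becomes an integer compare, and to run the expensive pathfind only for the job that's actually picked:

    struct Job    { int x, y, regionId; };
    struct Worker { int x, y, regionId; };

    /* O(jobs) scan with O(1) reachability tests; the expensive A* runs once,
       for the chosen job, instead of once per candidate. */
    int pickJob(const Worker& w, const Job* jobs, int jobCount)
    {
        for (int i = 0; i < jobCount; ++i)
            if (jobs[i].regionId == w.regionId)
                return i;
        return -1;
    }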

Old-school games would have been forced to redesign around that so it wasn't a problem they had to face. For example, none of the games you list try to simulate a deep 3D space with complex pathfinding. Warcraft II I know quite well, and the maps are small.

One of the reasons systems used to be highly efficient was evolutionary: it was impossible to make resource-heavy games, so people kept iterating until they found good concepts that didn't require polynomially scaling algorithms with moderate-to-large inputs.

georgewsinger · 4 years ago
I recommend watching this: https://youtu.be/izxXGuVL21o

Naughty Dog Co-founder Andy Gavin discusses various hacks that were used on the Playstation to get Crash Bandicoot to run smoothly. The fuller version is also worth watching.

shapefrog · 4 years ago
The art of working with constraints does not seem to be lost; just look at late cycle console games.

A PS4 was/is effectively unchanged hardware between Nov 2013 and today, yet the late-lifecycle games look great. Upon release of the hardware, devs had plenty of performance to play with; five years in, they have honed their craft and are able to use all the tricks at their disposal to squeeze as much graphical and performance life as possible out of a limited resource budget.

omegaham · 4 years ago
For those who like text, his series of blog entries is also extremely good. https://all-things-andy-gavin.com/video-games/making-crash/
Waterluvian · 4 years ago
This one is my favourite. I would call this kind of programming both art and beauty.
sushsjsuauahab · 4 years ago
If I remember correctly, the two main hacks were making draw lists such that only certain polygon faces were rendered based on Crash's XYZ position (the theory being that it isn't possible for other faces to be seen from that location), and also that he removed functions/files from Sony's standard C libs in the PS1 SDK?
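
Conceptually, that first trick is precomputed visibility: an offline tool works out which polygons could possibly be seen from each cell of the level, and at runtime rendering becomes a table lookup (illustrative sketch in the spirit of what's described, not Naughty Dog's actual code):

    #include <stdint.h>
    #include <vector>

    struct DrawList { std::vector<uint16_t> polygonIds; };  /* built offline per cell */

    int cellIndex(float x, float z, float cellSize, int cellsPerRow)
    {
        return (int)(z / cellSize) * cellsPerRow + (int)(x / cellSize);
    }

    /* At runtime, per frame:
       const DrawList& dl = drawLists[cellIndex(crash.x, crash.z, 2.0f, 128)];
       for (uint16_t id : dl.polygonIds) renderPolygon(id);                     */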