Fun fact: Bob Colwell (chief architect of the Pentium Pro through Pentium 4) recently revealed that the Pentium 4 had its own 64-bit extension to x86 that would have beaten AMD64 to market by several years, but management forced him to disable it because they were worried that it would cannibalize IA64 sales.
> Intel’s Pentium 4 had our own internal version of x86–64. But you could not use it: we were forced to “fuse it off”, meaning that even though the functionality was in there, it could not be exercised by a user. This was a marketing decision by Intel — they believed, probably rightly, that bringing out a new 64-bit feature in the x86 would be perceived as betting against their own native-64-bit Itanium, and might well severely damage Itanium’s chances. I was told, not once, but twice, that if I “didn’t stop yammering about the need to go 64-bits in x86 I’d be fired on the spot” and was directly ordered to take out that 64-bit stuff.
That's no guarantee it would succeed though - AMD64 also cleaned up a number of warts on the x86 architecture, like more registers.
While I suspect the Intel equivalent would have done similar things - a break that big makes them the obvious moves - there's no guarantee it wouldn't have been worse than AMD64. Then again, it could also have been "better" in retrospect.
And also remember that at the time the Pentium 4 was very much struggling to reach its advertised performance. One could argue that one of the major reasons the AMD64 ISA took off is that the devices that first supported it were (generally) superior even in 32-bit mode.
EDIT: And I'm surprised it got as far as silicon. AMD64 was "announced" and the spec released before the Pentium 4 was even released, over 3 years before the first AMD implementations could be purchased. I guess Intel thought they didn't "need" to be public about it? And the AMD64 extensions cost a rather non-trivial amount of silicon and engineering effort to implement - did the plan for Itanium change late enough in the P4 design that it couldn't be removed? Or perhaps this all implies it was a much less far-reaching (and so less costly) design?
As someone who followed IA64/Itanium pretty closely, it's still not clear to me the degree to which Intel (or at least groups within Intel) thought IA64 was a genuinely better approach and the degree to which Intel (or at least groups within Intel) simply wanted to get out from existing cross-licensing deals with AMD and others. There were certainly also existing constraints imposed by partnerships, notably with Microsoft.
> That's no guarantee it would succeed though - AMD64 also cleaned up a number of warts on the x86 architecture, like more registers.
As someone who works with AMD64 assembly very often - they didn't really clean it up all that much. Instruction encoding is still horrible, you still have a bunch of useless instructions even in 64-bit mode which waste valuable encoding space, and you still have a bunch of instructions which hardcode registers for no good reason (e.g. the variable-count shift instructions take their count only in cl, the low byte of rcx). The list goes on. They pretty much did the minimal amount of work to make it 64-bit, but didn't actually go very far when it comes to making it a clean 64-bit ISA.
I'd love to see what Intel came up with, but I'd be surprised if they did a worse job.
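To make the hardcoded-register complaint concrete, here is a minimal C sketch (GCC/Clang extended inline asm, x86-64 assumed, names are my own): the variable-count shift only takes its count in CL, so the "c" constraint is the only one that works - the compiler has to shuffle the count into rcx no matter where it happens to live.

```
#include <stdint.h>
#include <stdio.h>

/* Variable-count SHL insists on CL as the count register, hence the "c"
 * constraint; the value itself can sit in any general-purpose register. */
static uint64_t shl_var(uint64_t value, uint8_t count)
{
    __asm__("shlq %%cl, %0" : "+r"(value) : "c"(count) : "cc");
    return value;
}

int main(void)
{
    printf("%llu\n", (unsigned long long)shl_var(3, 4)); /* prints 48 */
    return 0;
}
```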
Pentium 4 was widely speculated to be capable of running 64-bit code at the time AMD64 shipped, but at half the speed.
Essentially, while decoding a 64-bit variant of the x86 ISA might have been fused off, there was a very visible part that was common anyway: the ALUs available on the NetBurst platform, which IIRC were 2x 32-bit ALUs for integer ops. So you either issue a micro-op to both to "chain" them together, or run every 64-bit calculation in multiple steps.
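For illustration, this is roughly what "multiple steps" means in practice: a 64-bit add expressed as two 32-bit adds with a carry between the halves. A sketch in C of the general technique, not a claim about NetBurst's actual micro-ops:

```
#include <stdint.h>
#include <stdio.h>

/* A 64-bit value held as two 32-bit halves. */
typedef struct { uint32_t lo, hi; } u64pair;

/* 64-bit addition built from 32-bit operations: add the low halves,
 * detect the carry-out, then fold it into the high-half addition. */
static u64pair add64(u64pair a, u64pair b)
{
    u64pair r;
    r.lo = a.lo + b.lo;
    uint32_t carry = (r.lo < a.lo);   /* unsigned wraparound => carry */
    r.hi = a.hi + b.hi + carry;
    return r;
}

int main(void)
{
    u64pair a = { 0xffffffffu, 0 };   /* 0x00000000ffffffff */
    u64pair b = { 1,           0 };
    u64pair r = add64(a, b);
    printf("0x%08x%08x\n", r.hi, r.lo);  /* 0x0000000100000000 */
    return 0;
}
```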
I don't think it's just mis-reading. It's also internal politics. How many at Nokia knew that the Maemo/MeeGo series was the future, rather than Symbian? I think quite a few. But Symbian execs fought to make sure Maemo didn't get a mobile radio. In most places, internal feuds and little kingdoms prevail over optimal decisions for the entire organization. I imagine lots of people at Intel were deeply invested in IA-64. Same thing repeats mostly everywhere. For example, from what I've heard from insiders, ChromeOS vs Android battles at Google were epic.
When I ran the Python Meetup here in Phoenix, an engineer from Intel's compiler group would show up all the time. I remember he would constantly be frustrated that Intel management would purposely downplay and cripple advances in the Atom processor line because they thought it would be "too good" and cannibalize their desktop lines. This was over 15 years ago -- I was hearing this in real time. He flat out said that Intel considered the mobile market a joke.
They don't misread the market so much as intentionally do that due to INTC being a market driven org. They want to suck up all the profits in each generation for each SKU. They stopped being an engineering org in the 80s. I hope they crash and burn.
"Recently revealed" is more like a confirmation of what I had read many years before; and furthermore, that Intel's 64-bit x86 would've been more backwards-compatible and better-fitting than AMD64, which looks extremely inelegant in contrast, with several stupid missteps like https://www.pagetable.com/?p=1216 (the comment near the bottom is very interesting.)
If you look at the 286's 16-bit protected mode and then the 386's 32-bit extensions, they fit neatly into the "gaps" in the former; there are some similar gaps in the latter, which look like they had a future extension in mind. Perhaps that consideration was already there in the 80s when the 386 was being designed, but as usual, management got in the way.
> Fun fact: Bob Colwell (chief architect of the Pentium Pro through Pentium 4) recently revealed that the Pentium 4 had its own 64-bit extension to x86 that would have beaten AMD64 to market by several years, but management forced him to disable it because they were worried that it would cannibalize IA64 sales.
File this one under "we made the right decision based on everything we knew at the time." It's really sad, because the absolute right choice would have been to extend x86 and let it duke it out with Itanium. Intel would win either way and the competition would have been even more on the back foot. So easy to see that decades later...
Yup. I went to the Microprocessor Forum where they introduced 'Sledgehammer' (the AMD64 architecture) and came back to NetApp, where I was working, and started working out how we'd build our next Filer using it (that was a journey, given the AMD distrust inside of NetApp!). I had a pretty frank discussion with the Intel SVP of product, who was pretty bought into the Intel view of "high end is IA, mid/PC is IA32, embedded is the 8051 stuff". They were having a hard time getting Itanium wins.
This seems like an object lesson in making sure that the right hand does not know what the left is doing. Yes, if you have two departments working on two mutually exclusive architectures, one of them will necessarily fail. In exchange, however, you can guarantee that it will be the worse one. This is undervalued as a principle since the wasted labor is more easily measured, and therefore decision making is biased towards it.
I agree with you, but perhaps this is very hard (impossible?) to pull off. Invariably, politics will result in various outcomes being favored in management and the moment that groups realize the game is rigged, the whole fair market devolves into the usual political in-fighting.
The story I heard (which I can't corroborate) was that it was Microsoft that nixed Intel's alternative 64-bit x86 ISA, telling it to implement AMD's version instead.
Microsoft did port some versions of Windows to Itanium, so they did not reject it at first.
With poor market demand and AMD's success with amd64, Microsoft did not support Itanium in Vista and later desktop versions, which signaled the end of Intel's Itanium.
Yeah, I remember hearing that at the time too. When MS chose to support AMD64, they made it clear it was the only 64bit x86 ISA they were going to support, even though it was an open secret Intel was sitting on one but not wanting to announce it.
I wanted to mention that the Pentium 4 (Prescott) that was marketed as the Centrino in laptops had 64bit capabilities, but it was described as 32bit extended mode. I remember buying a laptop in 2005(?) which I first ran with XP 32bit, and then downloading the wrong Ubuntu 64bit Dapper Drake image, and the 64bit kernel was running...and being super confused about it.
Also, for a long while, Intel rebranded the Pentium 4 as Intel Atom, which then usually got an iGPU on top with being a bit higher in clock rates. No idea if this is still the case (post Haswell changes) but I was astonished to buy a CPU 10 years later to have the same kind of oldskool cores in it, just with some modifications, and actually with worse L3 cache than the Centrino variants.
core2duo and core2quad were peak coreboot hacking for me, because at the time the intel ucode blob was still fairly simple and didn't contain all the quirks and errata fixes that more modern cpu generations have.
In 2005 you could already buy Intel processors with AMD64. It just wasn't called AMD64 or Intel64; it was called EM64T. During that era running 64-bit Windows was rare but running 64-bit Linux was pretty commonplace, at least amongst my circle of friends. Some Linux distributions even had an installer that told the user they were about to install 32-bit Linux on a computer capable of running 64-bit Linux (perhaps YaST?).
Pentium 4 was never marketed as Centrino - that came in with the Pentium M, which was very definitely not 64-bit capable (and didn't even officially have PAE support to begin with). Atom was its own microarchitecture aimed at low power use cases, which Pentium 4 was definitely not.
Centrino was Intel's brand for their wireless networking and laptops that had their wireless chipsets, the CPUs of which were all P6-derived (Pentium M, Core Duo).
Possibly you meant Celeron?
Also the Pentium 4 uarch (Netburst) is nothing like any of the Atoms (big for the time out-of-order core vs. a small in-order core).
Speaking of marketing, that era of Intel was very weird for consumers. In the 1990s, they had iconic ads and words like Pentium or MMX became powerful branding for Intel. In the 2000s I think it got very confused. Centrino? Ultrabook? Atom? Then for some time there was Core. But it became hard to know what to care about and what was bizarre corporate speak. That was a failure of marketing. But maybe it was also an indication of a cultural problem at Intel.
Very early Intel "EM64T" chips (aka amd64-compatible) had a too-short virtual address size of 36 bits instead of 40, which is why 64-bit Windows didn't run on them but some Linux versions did.
I remember at the time thinking it was really silly for Intel to release a 64-bit processor that broke compatibility, and was very glad AMD kept it. Years later I learned about kernel writing, and I now get why Intel tried to break with the old - the compatibility hacks piled up on x86 are truly awful. But ultimately, customers don't care about that, they just want their stuff to run.
It didn't help that Itanium was late, slow, and Intel/HP marketing used Itanium to kill off the various RISC CPUs, each of which had very loyal fans. This pissed off a lot of techies at the time.
I was a HUGE DEC Alpha fanboy at the time (I even helped port FreeBSD to DEC Alpha), so I hated Itanium with a passion. I'm sure people like me who were 64-bit MIPS and PA-RISC fanboys and fangirls also existed, and also lobbied against adoption of itanic where they could.
I remember when amd64 appeared, and it just made so much sense.
This. If Intel's compilers and architecture had been stellar and provided a 5x or 10x improvement, it would have caught on. However, no one in IT was fool enough to switch architectures over a 30-50% performance improvement that required switching hardware, compilers, and software, and then try to sell that to their bosses.
Itanic wasn't exactly HP-PA v.3, but it was a kissing cousin. Most of the HP shops I worked with believed the rhetoric it was going to be a straightforward if not completely painless upgrade from the PA-8x00 gear they were currently using.
Not so much.
The MIPS 10k line on the other hand...sigh...what might have been.
I remember when amd64 appeared, and it just made so much sense.
Intel might have been successful with the transition if they hadn't decided to go with such a radically different and real-world-untested architecture for Itanium.
It is worth noting that at the turn of the century x86 wasn't yet so utterly dominant. Alphas, PowerPC, MIPS, SPARC and whatnot were still very much a thing. So that is part of why running x86 software was not as high a priority, and maybe even compatibility with PA-RISC would have been a higher priority.
The writing was on the wall once Linux was a thing. I did a lot of solution design in that period. The only times there were good business cases in my world for not-x86 were scenarios where DBAs and some vertical software required Sun, and occasionally AIX or HPUX for license optimization or some weird mainframe finance scheme.
The cost structure was just bonkers. I replaced a big file server environment that was like $2M of Sun gear with like $600k of HP Proliant.
It wasn't just incompatibility, it was some of the design decisions that made it very hard to make performant code that runs well on Itanium.
Intel made a bet on parallel processing and compilers figuring out how to organize instructions instead of doing this in silicon. It proved to be very hard to do, so the supposedly next-gen processors turned out to be more expensive and slower than the last-gen or new AMD ones.
Yeah the biggest idea was essentially to do the scheduling of instructions upfront in the compiler instead of dynamically at runtime. By doing this, you can save a ton of die area for control and put it into functional units doing math etc.
The problem as far as I can tell as a layman is that the compiler simply doesn't have enough information to do this job at compile time. The timing of the CPU is not deterministic in the real world because caches can miss unpredictably, even depending on what other processes are running at the same time on the computer. Branches also can be different depending on the data being processed. Branch predictors and prefetchers can optimize this at runtime using the actual statistics of what's happening in that particular execution of the program. Better compilers can do profile directed optimization, but it's still going to be optimized for the particular situation the CPU was in during the profile run(s).
If you think of a program like an interpreter running a tight loop in an interpreted program, a good branch predictor and prefetcher are probably going to be able to predict fairly well, but a statically scheduled CPU is in trouble because at the compile time of the interpreter, the compiler has no idea what program the interpreter is going to be running.
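A toy bytecode interpreter makes the point concrete: which way the dispatch branch goes on each iteration depends entirely on the bytecode fed in at runtime, which a dynamic branch predictor can learn but an ahead-of-time scheduler cannot see. (Illustrative sketch only; the opcodes and program are made up.)

```
#include <stdint.h>
#include <stdio.h>

enum { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

/* A tiny stack machine. Which case the switch dispatches to each time
 * around the loop is decided by the program data, not by anything the
 * compiler can see when the interpreter itself is compiled. */
static long run(const uint8_t *code)
{
    long stack[64];
    int sp = 0;
    for (size_t pc = 0; ; ) {
        switch (code[pc++]) {                 /* data-dependent branch */
        case OP_PUSH: stack[sp++] = code[pc++];            break;
        case OP_ADD:  sp--; stack[sp - 1] += stack[sp];    break;
        case OP_MUL:  sp--; stack[sp - 1] *= stack[sp];    break;
        case OP_HALT: return stack[sp - 1];
        }
    }
}

int main(void)
{
    /* (2 + 3) * 4 */
    const uint8_t prog[] = { OP_PUSH, 2, OP_PUSH, 3, OP_ADD,
                             OP_PUSH, 4, OP_MUL, OP_HALT };
    printf("%ld\n", run(prog));               /* prints 20 */
    return 0;
}
```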
- Intel quietly introduced their implementation of amd64 under the name "EM64T". It was only later that they used the name "Intel64".
- Early Itanium processors included hardware features, microcode and software that implemented an IA‑32 Execution Layer (dynamic binary translation plus microcode assists) to run 32‑bit x86 code; while the EL often ran faster than direct software emulation, it typically lagged native x86 performance and could be worse than highly‑optimised emulators for some workloads or early processor steppings.
The article never mentions the release of the x86_64 'emulator' by AMD for preparing and testing your 64-bit development, or even the Opteron. Feels like it's more a story of how the author perceived it than an actual timeline.
Edit: Looked it up, it's called AMD SimNow! Originally released in 2000. I clearly remember www.x86-64.org existing for this.
I was one of those weird users who used the 64-bit version of Windows XP, with what I'm pretty sure was an Athlon 64 X2, both the first 64-bit chip and first dual-core one that I had.
Unfortunately, NT for Alpha only ran in a 32-bit address space.
"The 64-bit versions of Windows NT were originally intended to run on Itanium and DEC Alpha; the latter was used internally at Microsoft during early development of 64-bit Windows. This continued for some time after Microsoft publicly announced that it was cancelling plans to ship 64-bit Windows for Alpha. Because of this, Alpha versions of Windows NT are 32-bit only."
Alpha support was removed in one of the later NT5 betas right? Makes sense that it would've been late 90s then, before it was renamed Windows 2000 for release.
Me too! It was funny how little love it got given how well it worked.
The only issues I came across were artificial blocks. Some programs would check the OS version and give an error just because. Even the MSN Messenger (also by Microsoft) refused to install by default; I had to patch the msi somehow to install it anyway. And then it ran without issues, once installed.
The Athlon was my second computer (CPU) after the i486. I think the core was the K7 architecture and it had a 700MHz clock, iirc. I remember Athlon/AMD being much cheaper than Intel, and it was very exotic even thinking about it, as Intel was EVERYWHERE (it was THE computer - "intel inside") and getting AMD was quite literally a question of whether I'd even be able to install Windows and run normal programs (we really didn't know back then). I think I had another AMD after that in a desktop (1.4GHz, dual core....iirc), then Intel in a laptop and now AMD again in a laptop. Will probably stick with AMD for the future as well.
I remember the days of cpu clock speed being displayed on the outside of the computer case using an led display. There was also this turbo button but I'm not sure whether that really did anything.
Generally, Turbo toggled some form of slow mode. Back in the XT era, enough software relied on the original 4.77MHz CPU clock of the PC and XT that faster Turbo XT clones running at 8MHz would have a switch to slow things back down. It persisted for a while into the early ‘90s as a way to deal with software that expects a slower CPU, although later implementations may not slow things down all the way to an XT’s speed.
Nitpick: The author states that the removal of 16-bit support in 64-bit Windows was a design decision and not a technical one. That's not quite true.
When AMD64 is in one of the 64-bit modes, long mode (true 64-bit) or compatibility mode (64-bit with 32-bit compatibility), you can not execute 16-bit code. There are tricks to make it happen, but they all require switching the CPU mode, which is insecure and can cause problems in complex execution environments (such as an OS).
If Microsoft (or Linux, Apple, etc) wanted to support 16-bit code in their 64-bit OSes, they would have had to create an emulator+VM (such as OTVDM/WineVDM) or make costly hacks to the OS.
I've written code to call 16-bit code from 64-bit code that works on Linux (because that's the only OS where I know the syscall to modify the LDT).
It's actually no harder to call 16-bit code from 64-bit code than it is to call 32-bit code from 64-bit code... you just need to do a far return (the reverse direction is harder because of stack alignment issues). The main difference between 32-bit and 16-bit is that OS's support 32-bit code by having a GDT entry for 32-bit code, whereas you have to go and support an LDT to do 16-bit code, and from what I can tell, Windows decided to drop support for LDTs with the move to 64-bit.
The other difficulty (if I've got my details correct) is that returning from an interrupt into 16-bit code is extremely difficult to do correctly and atomically, in a way that isn't a problem for 32-bit or 64-bit code.
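For the curious, the Linux syscall alluded to above is modify_ldt(2). Here is a minimal sketch of just the descriptor-installation step on x86-64 Linux: it puts a 16-bit code-segment entry into the process's LDT and computes the selector. The base address and slot number are made-up values for illustration, and the actual far call into (and far return out of) the segment, plus mapping real 16-bit code at that base, are left out.

```
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <asm/ldt.h>   /* struct user_desc, MODIFY_LDT_CONTENTS_CODE (Linux) */

int main(void)
{
    struct user_desc desc;
    memset(&desc, 0, sizeof(desc));

    desc.entry_number   = 0;                        /* hypothetical: first LDT slot */
    desc.base_addr      = 0x100000;                 /* hypothetical: where 16-bit code would be mapped */
    desc.limit          = 0xffff;                   /* 64 KiB segment limit */
    desc.seg_32bit      = 0;                        /* 0 => this is a 16-bit segment */
    desc.contents       = MODIFY_LDT_CONTENTS_CODE; /* code segment, not data */
    desc.read_exec_only = 0;

    /* modify_ldt(1, ...) writes the descriptor into this process's LDT. */
    if (syscall(SYS_modify_ldt, 1, &desc, sizeof(desc)) != 0) {
        perror("modify_ldt");
        return 1;
    }

    /* Selector = index << 3 | TI=1 (LDT) | RPL=3. A far call/jump to
     * selector:offset would enter the 16-bit segment; the 16-bit code
     * would come back with a far return, as described above. */
    unsigned short sel = (unsigned short)((desc.entry_number << 3) | 0x7);
    printf("16-bit code-segment selector: 0x%04x\n", sel);
    return 0;
}
```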
Executing 16-bit code in Compatibility Mode (not Long Mode) is possible, that's not the problem. The problem is lack of V86 allowing legacy code to run. So Real Mode code is out wholesale (a sizable chunk of legacy software) and segmented memory is out in Protected Mode (nearly the totality of remaining 16-bit code).
So yes, you can write/run 16-bit code in 64-bit Compatibility Mode. You can't execute existing 16-bit software in 64-bit Compatibility Mode. The former is a neat trick, the latter is what people actually expect "16-bit compatibility" to mean.
It's not so much running 16 bit code, but running something that wants to run on bare metal, i.e. DOS programs that access hardware directly. Maintaining the DOS virtualization box well into the 21st century probably wasn't worth it.
> The 64-bit builds of Windows weren’t available immediately.
There was a year or so between the release of AMD-64 and the first shipping Microsoft OS that supported it.[1] It was rumored that Intel didn't want Microsoft to support AMD-64 until Intel had compatible hardware. Anyone know?
Meanwhile, Linux for AMD-64 was shipping, which meant Linux was getting more market share in data centers.[1]
Microsoft has just such an emulator. Via Windows source code leaks, the NTVDM (Virtual DOS Machine) from 32-bit Windows versions has been built for 64-bit Windows targets[0].
I don't understand why Microsoft chose to kill it. That's not in their character re: backwards compatibility.
NTVDM requires Virtual 8086 mode in the processor. This doesn't exist in the 64-bit modes, requiring a software emulator. That is why OTVDM/WineVDM exist.
You can see all of this explained in the README for the very project you linked:
```
How does it work?
=================
I never thought that it would be possible at all, as NTVDM on Win32 uses V86
mode of the CPU for fast code execution which isn't available in x64 long
mode.
However I stumbled upon the leaked Windows NT 4 sourcecode and the guys from
OpenNT not only released the source but also patched it and included all
required build tools so that it can be compiled without installing anything
but their installation package.
The code was a pure goldmine and I was curious how the NTVDM works.
It seems that Microsoft bought the SoftPC solution from Insignia, a company
that specialised in DOS-Emulators for UNIX-Systems. I found out that it also
existed on MIPS, PPC and ALPHA Builds of Windows NT 4 which obviously don't
have a V86 mode available like Intel x86 has. It turned out that Insignia
shipped SoftPC with a complete emulated C-CPU which also got used by Microsoft
for MIPS, PPC and ALPHA-Builds.
```
As to why they didn't continue with that solution: presumably they didn't want to rely on SoftPC anymore, or to take on that development themselves for a minuscule portion of users who would probably just use 32-bit Windows anyway.
> I don't understand why Microsoft chose to kill it.
My personal suspicion: it's about handles.
Several kinds of objects in the Windows API are identified by global handles (for instance, HWND for a window), and on 16-bit Windows, these handles are limited to 16 bits (though I vaguely recall reading somewhere that they're actually limited to 15 bits). Not having the possibility of a 16-bit Windows process would allow them to increase the global limit on the number of handles (keeping in mind that controls like buttons are actually nested windows, so it's not just one window handle for each top-level window).
It's probably important to note that the AMD64 platform isn't what got Intel into its current situation. After adopting AMD64, Intel once again dominated AMD, and the Bulldozer/Piledriver/Excavator series of AMD processors were not doing well in the competition with Intel.
With Zen, AMD once again turned the tables on Intel, but not enough to break Intel. Intel's downfall seems entirely self-inflicted and is due to a series of bad business decisions and sub-par product releases.
Yeah. The article tells a good story and I agree with it. I even bought an Athlon 64 CPU back in ~2004.
What I want to add to the story is that when Intel Core 2 came out (and it was an x86-64 chip), it absolutely crushed AMD's Athlon 64 processors. It won so hard that, more or less, the lowest spec Core 2 CPU was faster than the highest spec Athlon 64 CPU. (To confirm this, you can look up benchmark articles around the year 2006, such as those from Tom's Hardware Guide.) Needless to say, my next computer in 2008 was a Core 2 Quad, and it was indeed much faster than my Athlon 64.
The Core 2 and all its sequels were how Intel dominated over AMD for about a decade until AMD Zen came along.
https://www.quora.com/How-was-AMD-able-to-beat-Intel-in-deli...
They were also affordable dual cores, which wasn't the norm at all at the time.
I understand that r8-r15 require a REX prefix, which is hostile to code density.
I've never done it with -O2. Maybe that would surprise me.
Intel has a strong history of completely mis-reading the market.
Quote: Business success contains the seeds of its own destruction. Success breeds complacency. Complacency breeds failure. Only the paranoid survive.
- Andy Grove, former CEO of Intel
From wikipedia: https://en.wikipedia.org/wiki/Andrew_Grove#Only_the_Paranoid...
Takeaway: Be paranoid about MBAs running your business.
Segmentation very useful for virtualization? I don't follow that claim.
The concern wasn't really that it would cannibalize sales - it would cannibalize the IA64 managers' jobs and status. "You ship the org chart"
Damn!
Rest is well explained by sibling posts :)
[1] https://en.wikipedia.org/wiki/Physical_Address_Extension
And you were right.
7 and 2008R2 were pretty good too. All downhill from there..
"The 64-bit versions of Windows NT were originally intended to run on Itanium and DEC Alpha; the latter was used internally at Microsoft during early development of 64-bit Windows. This continued for some time after Microsoft publicly announced that it was cancelling plans to ship 64-bit Windows for Alpha. Because of this, Alpha versions of Windows NT are 32-bit only."
https://en.wikipedia.org/wiki/Windows_NT#64-bit_platforms
[0] https://github.com/leecher1337/ntvdmx64
Edit: Some nice discussion about the NTVDMx64 when it was released: https://www.vogons.org/viewtopic.php?t=48443