Base 3 Computing Beats Binary

> Why didn’t ternary computing catch on? The primary reason was convention. Even though Soviet scientists were building ternary devices, the rest of the world focused on developing hardware and software based on switching circuits — the foundation of binary computing. Binary was easier to implement.

That's not the way I've heard the story. Ternary isn't just on/off, voltage yes/no like binary - you need to know the charge of the voltage: is it positive or negative? Essentially then your circuits are -/0/+ instead of (0/+) like it is for binary. Such ternary circuits resisted miniaturization. At a certain point the - and + circuits cross and arc and create a fire. The binary circuits kept getting miniaturized.

The story I've heard that goes along with this is that's how the US ultimately won the space race: the US bet on binary computing and the Soviets bet on ternary computing. The Soviets lost.

hoosieree · 2 years ago

I always thought the information density advantage of ternary disappeared once you put it in the real world:

- harder to discriminate values as you pack more in the same dynamic range

- more susceptible to noise

- requires lower clock rates for (wired) transmission

- requires more complex protocols for (wireless) transmission

- representing 0 or 1 as switched states uses "no" power, but getting a switch to be "halfway on" uses a lot of power

pezezin · 2 years ago

But most transmission systems use multi-level signals. Gigabit Ethernet uses PAM-5, the latest versions of PCIe use PAM-4, and USB 4 uses PAM-3. Not to mention the crazy stuff like QAM-256 and higher.

ComplexSystems · 2 years ago

This doesn't make sense to me. You don't have to use negative voltages to encode ternary. You can just use three different positive voltages, if you like. 0V = 0, +3V = 1, +6V = 2.

jerf · 2 years ago

The main problem is that if you are minimizing the voltage to the minimum that can be safely distinguished for binary, you must, by necessity, be introducing twice that voltage to introduce another level. You can't just cut your already-minimized voltage in half to introduce another level; you already minimized the voltage for binary.

50 years ago this may not have been such a problem, but now that we care a lot, lot more about the power consumption of our computing, and a lot of that power consumption is based on voltage (and IIRC often super-linearly), so a tech that requires us to introduce additional voltage levels pervasively to our chips is basically disqualified from the very start. You're not sticking one of these in your server farm, phone, or laptop anytime soon.

jsheard · 2 years ago

Indeed that's how nearly all NAND flash works nowadays, early SLC media was binary with each cell set to a low or high voltage, but as density increased they started using more voltages inbetween to encode multiple bits per cell. The current densest NAND uses 16 different positive voltage states to encode 4 bits per cell.

AnotherGoodName · 2 years ago

Voltages are always measure relative to each other. In ops example -3V to +3V has 6V difference just as 0V to 6V does and the arcing is the same.

Op didn’t specify any particular voltage but you should get the example. You need more voltage between the highest and lower states to differentiate the signals compared to binary. It can work well but only in circuits where there’s already very low leakage (flash mentioned as another reply is a great example).

IWeldMelons · 2 years ago

Yes, but then you have to use a lot more complex electronics and production tolerances, as now you'd need to either distribute voltage reference for intermediate level all over the board, which essentially makes it exactly same system as with negative voltage, but with the third wire becoming ground; the same concept but worse implementation, or make circuits able able to discriminate between two different levels, this will be both difficult in terms of implementing the circuit, and will also lead to enormous energy waste, as part of your transistors will have to be half open (jinda similar similar to ECL logic, but worse).

kuroguro · 2 years ago

That's my understanding as well. Voltages are relative, you are free to choose a "ground" and work with negatives or not if you want.

RegnisGnaw · 2 years ago

Except in the analog world its not so clear, you can't just say +3V=1. What if its 3.7V? or 4.5V? Early tools weren't that accurate either so you needed more range to deal with it.

da_chicken · 2 years ago

> At a certain point the - and + circuits cross and arc and create a fire.

That's not unique to ternary circuits. That's just how voltage differential of any kind works.

The trick is figuring out how many states you can reliably support below the maximum voltage differential the material supports. As we reach the physical limits of miniaturization, "two states" is almost certainly not going to remain the optimal choice.

ykonstant · 2 years ago

I am extremely doubtful about your last claim; is there work being done in that direction that you can point to? Don't get me wrong, it would really be exciting if we could get actual efficiencies by increasing the number of states, but all the experts I have talked to so far are very pessimistic about the possibility. The problems introduced by ternary circuits seem to offset any claimed efficiency.

keybored · 2 years ago

I think this sentence from the Space Race Wikipedia is funny:

> The conclusion of Apollo 11 is regarded by many Americans as ending the Space Race with an American victory.

HPsquared · 2 years ago

"Winning the space race" is a rather woolly concept and depends on your definitions. Although NASA did land astronauts on the moon, the Soviets had firsts in most of the main areas relevant today (first satellite, first astronaut, first space station, first landing on another planet, etc etc.).

https://en.m.wikipedia.org/wiki/Timeline_of_the_Space_Race

akira2501 · 2 years ago

> At a certain point the - and + circuits cross and arc and create a fire.

To do binary logic we do CMOS. The reason CMOS gets hot is because the complementary transistors don't switch at the same time. So, at a certain point, the Vss and Vdd circuits connect and create a massive current drain.

> The binary circuits kept getting miniaturized.

Sure.. but they're not getting much faster.

davrosthedalek · 2 years ago

There are three loss mechanisms for CMOS. a) Leakage b) Crossing current c) Ohmic losses because of currents required to charge/discharge capacitances (of the gates etc..)

Pretty sure c) dominates for high frequency/low power applications like CPUs, as it's quadratic.

IWeldMelons · 2 years ago

I think it is yet another "bears walking on redsqare" level of claim (I mean about ternary systems). There was only one minor ternary computer produced by USSR ("Setun"); it has never been a big thing.

jasomill · 2 years ago

SETUN itself was an electronically binary machine that used bit pairs to encode ternary digits[1].

In support of your point, of the Soviet computers surveyed in the cited article, six were pure binary, two used binary-coded decimal numerics, and only SETUN was ternary[2].

[1] [Willis p. 149]https://dl.acm.org/doi/pdf/10.1145/367149.1047530#page=19

[2] [Willis p. 144]https://dl.acm.org/doi/pdf/10.1145/367149.1047530#page=14

[Willis] Willis H. Ware. 1960. Soviet Computer Technology–1959. Commun. ACM 3, 3 (March 1960), 131–166. DOI:https://doi.org/10.1145/367149.1047530

meindnoch · 2 years ago

Complete fabrication, with blatant disregard for physics and electronics.

Many modern CPUs use different voltage levels for certain components, and everything works fine.

phkahler · 2 years ago

>> Many modern CPUs use different voltage levels for certain components, and everything works fine.

But none of them use more than 2 states. If you've got a circuit at 0.9V or one at 2.5V they both have a single threshold (determined by device physics) that determines the binary 1 or 0 state and voltages tend toward 0 or that upper supply voltage. There is no analog or level-based behavior. A transistor is either on or off - anything in the middle has resistance and leads to extra power dissipation.

AlotOfReading · 2 years ago

Not agreeing with the parent post, but the different domains in modern electronics only work because they're (nominally) isolated except for level crossing circuits.

Dead Comment

> A binary logic system can only answer “yes” or “no.”

Maybe I'm missing something, but this sounds like a silly argument for ternary. A ternary system seems like it would be decidedly harder to build a computer on top of. Control flow, bit masking, and a mountain of other useful things are all predicated on boolean logic. At best it would be a waste of an extra bit (or trit), and would also introduce ambiguity and complexity at the lowest levels of the machine, where simplicity is paramount.

But again, maybe I'm missing something. I'd be super interested to read about those soviet-era ternary systems the author mentioned.

creer · 2 years ago

I don't see anything fundamentally obvious about this (chip design and arch background). If you look at chip photographs, you see massive amounts of space dedicated to wiring compared to the "logic cells" area, if you include the routing in between the logic cell rows - if you want to look at it this way for compute vs interconnect. Nicely regular, full custom datapaths exist, but so do piles of standard cells. And massive amount of space dedicated to storage-like functions (registers, cache, prediction, renaming, whatever.) If you could 1) have logic cells that "do more" and are larger, 2) less wiring because denser usage of the same lines, 3) denser "memory" areas - well that would be a LOT! So, not saying it's an obvious win. It's not. But it's worth considering now and then. At this level the speed of conversion between binary and multi-level becomes critical also - but it's not so slow that it obviously can't fit the need.

creer · 2 years ago

Speaking of compute density, do people do multi-bit standard cells these days? In their standard cell libraries?

One thing we were trying way back then was standard cells that integrated one flip flop or latch and some logic function into one cell. To trade a slightly larger cell (and many more different cells) for less wiring in the routing channel.

lo0dot0 · 2 years ago

I'm not saying it is competitive or practical but optical multi-level storage exists.

https://www.nature.com/articles/nphoton.2015.182

andriamanitra · 2 years ago

> Control flow, bit masking, and a mountain of other useful things are all predicated on boolean logic. At best it would be a waste of an extra bit, and would also introduce ambiguity and complexity at the lowest levels of the machine, where simplicity is paramount.

There is an even bigger mountain of useful things predicated on ternary logic waiting to be discovered. "Tritmasks" would be able to do so much more than bitmasks we are used to as there would be one more state to assign a meaning to. I'm not sure if the implementation complexity is something we can ever overcome, but if we did I'm sure there would eventually be a Hacker's Delight type of book filled with useful algorithms that take advantage of ternary logic.

davrosthedalek · 2 years ago

Yes, I am sure, that's why ternary logic is such a widely studied math field compared to boolean logic /s. No really, can you give an example where ternary logic is actually considerably more useful than the log(3)/log(2) factor of information density?

bee_rider · 2 years ago

Boolean logic is somewhat unintuitive already, I mean we have whole college courses about it.

> At best it would be a waste of an extra bit (or trit), and would also introduce ambiguity and complexity at the lowest levels of the machine, where simplicity is paramount.

This seems backwards to me. It isn’t a “waste” of a bit, because it doesn’t use bits, it is the addition of a third state. It isn’t ambiguous, it is just a new convention. If you look at it through the lens of binary computing it seems more confusing than if you start from scratch, I think.

It might be more complex, hardware-wise though.

mywittyname · 2 years ago

> I mean we have whole college courses about it.

Doesn't this have more to do with the fact that it's not part of the standard math curriculum taught at the high school level? I'm no math wiz and discrete math was basically a free A when I took it in college. The most difficult part for me was memorizing the Latin (modus ponens, modus tollens - both of which I still had to lookup because I forgot them beyond mp, mt).

Being a college course doesn't imply that it's hard, just that it's requisite knowledge that a student is not expected to have upon entering university.

titchard · 2 years ago

its been a while since I read some of this book and enjoying it, and I remember it refering to 3bit computing in the soviet era but it might be right up your street https://mitpress.mit.edu/9780262534666/how-not-to-network-a-...

__tmk__ · 2 years ago

I would think there wouldn't be much of a difference because the smallest unit you can really work with on modern computers is the byte. And whether you use 8 bits to encode a byte (with 256 possible values) or 5 trits (with 243 possible values), shouldn't really matter?

AnotherGoodName · 2 years ago

3 fewer lanes for the same computation. FWIW 8bits is the addressable unit. Computers work with 64bits today, they actually mask off computation to work with 8bits. A ternary computer equivalent would have 31trits (the difference is exponential - many more bits only adds a few trits). That means 31 conductors for the signal and 31 adders in the alu rather than 64. The whole cpu could be smaller with everything packed closer together enabling lower power and faster clock rates in general. Of course ternary computers have more states and the voltage differences between highest and lowest has to be higher to allow differentiation and then this causes more leakage which is terrible. But the actual bits vs trits itself really does matter.

shakow · 2 years ago

> because the smallest unit you can really work with on modern computers is the byte

Absolutely not, look e.g. at all the SIMD programming where bit manipulation is paramount.

mrob · 2 years ago

The concept of radix economy assumes that hardware complexity for each logic element is proportional to the number of logic levels. In practice this isn't true, and base 2 is best.

>Ternary circuits: why R=3 is not the Optimal Radix for Computation

https://arxiv.org/abs/1908.06841

Previous HN discussion:

https://news.ycombinator.com/item?id=38979356

crdrost · 2 years ago

I like this article but it does kind of seem like it gets to a point of “well we know how to do binary stuff in hardware real well, we don't know how to do ternary stuff that well and doing it with binary components doesn't work great.”

Also ternary gets a bit weird in some other ways. The practical ternary systems that the Soviets invented used balanced ternary, digits {–, o, +} so that 25 for example is +o–+,

   25 = 27 + 0*9 – 3 + 1.

If you think about what is most complicated about addition for humans, it is that you have these carries that combine adjacent numbers: and in the binary system you can prove that you relax to a 50/50 state, the carry bit is 50% likely to be set, and this relaxation happens in average by the 3rd bit or so, I think? Whereas ternary full adders only have the carry trit set ¼ of the time (so ⅛ +, ⅛ –) and it takes a few more trits for it to get there. (One of those nice scattered uses for Markov chains in the back of my head, the relaxation goes as the inverse of the second eigenvalue because the first eigenvalue is 1 and it creates the steady state. I got my first summer research job by knowing that factoid!) So you start to wonder if there's something like speculative execution possible then—half-adder-+ is definitely too simple for this but full adder + chains all these bits together and for larger numbers maybe it's not!

Similarly I think that binary proliferated in part because the multiplication story for binary is so simple, it's just a few bitshifts away. But for balanced ternary it's just inversions and tritshifts too, so it has always felt like maybe it has some real “teeth” there.

dgoldstein0 · 2 years ago

In terms of implementing adders, the standard solution for binary logic adders is carry lookahead adders. Perhaps an equivalent could be built in ternary logic?

https://en.m.wikipedia.org/wiki/Carry-lookahead_adder

addaon · 2 years ago

But it is true for one-hot encodings, yes? I may be showing my age here, but when I last took a computer architecture course domino logic (and self-clocking domino logic, even) was seen as something between the cutting edge and an obvious future for high speed data paths. No idea of this is still true, but it seems that something like domino logic would extend cleanly (and with cost linear in states) to ternary.

Deleted Comment

"In addition to its numerical efficiency, base 3 offers computational advantages. It suggests a way to reduce the number of queries needed to answer questions with more than two possible answers. A binary logic system can only answer “yes” or “no.”"

Yes... but nobody uses binary systems. We use 64 bit systems (and a number of other systems, but all larger than one bit), which has abundantly more than enough space to represent "greater than/less than/equal".

The main software issue with ternary computing is that with this, the entire advantage goes up in smoke. It is quite hard to articulate an actual advantage a multi-bit system would see since we do not use 1-bit or 1-trit systems in real life. (If you've got some particular small hardware thingy that could use it, by all means go for it; it's no problem to use it in just one little place and have a conventional binary interface.)

Taylodl's hardware issue with ternary circuits sounds like a reasonable one as well. If you've already minimized the voltage difference for your binary circuits to as small as it can reasonably be, the addition of ternary hardware necessarily entails doubling it the maximum voltage in the system.

Is this Quanta Magazine's "Brash And Stupid Claims About Computing Week" or something? https://news.ycombinator.com/item?id=41155021 The last paragraph is outright crockery, that "trit-based security system" is well known to be from a crackpot who appears to simply not comprehend that our binary systems do not in fact explode the moment they have to represent a "3", despite repeated attempts for people to explain it to the guy.

Programmers and hardware designers are not in fact just total idiots stuck in the mud about digital (vs analog) and binary (vs ternary) representations. They are the way they are for good and solid engineering reasons that aren't going anywhere, and there is basically no space here for someone to displace these things. It isn't just path dependence, these really are the best choices based on current systems.

varjag · 2 years ago

Ternary is one of several crackpotry schools that were established in USSR. You'd have them write books on the subjects, rant in tech magazines… there even was an SF short story about evil alien robots defeated by ternary.

Another big thing was TRIZ: a theory that you can codify invention by making a rulebook and arranging the rules in different permutations. There were plenty of smaller things too, especially in academia. It would typically start with one researcher sticking to some bizarre idea, then growing his own gang of grad students and adjuncts who all feed on that. Except the theory would be borderline batshit and all the publications are in-group, referring each other, and naturally a semester or two worth of this sectarian stuff is getting dropped into the undergrad curriculum.

marcosdumay · 2 years ago

During most of the time USSR existed, computer electronics were away from the optimum enough that ternary logic was competitive with binary.

It was just at the late 80s that this changed.

fooker · 2 years ago

> a theory that you can codify invention by making a rulebook and arranging the rules in different permutations.

You can. It's just slow that's all.

Superoptimizers 'invent' new compiler optimizations by exactly this technique.

homarp · 2 years ago

TRIZ previously on HN https://news.ycombinator.com/item?id=18045322 (and a few others)

and https://en.m.wikipedia.org/wiki/Systematic_inventive_thinkin...

then https://arxiv.org/abs/2403.13002 (AutoTRIZ: Artificial Ideation with TRIZ and Large Language Models)

shrubble · 2 years ago

TRIZ is not bizarre or strange. It is a series of concepts and ideas which are meant to help you get unstuck when working through a difficult engineering problem.

mordae · 2 years ago

Sounds like a good description of the current state of affairs in the academia, though.

vintermann · 2 years ago

Fringe theories in mathematics sometimes work out. Neural nets is arguably one of them: For the longest time, neural nets were simply worse than SVMs on most metrics you could think of.

NotAnOtter · 2 years ago

This line really struck me and it's a failure in technical writing. This is an article published on "quantamagazine", about niche computing techniques. You have a technical audience... you shouldn't still be explaining what binary is at the halfway point in your article.

Based

I loved the penultimate paragraph; gave me a hearty laugh and a fun rabbit hole to waste time on :)

taylodl · 2 years ago

The metric they use for "efficiency" seems rather arbitrary and looks like a theoretical mathematicians toy idea. Unfortunately real computers have to be constructed from real physical devices, which have their own measures of efficiency. These don't match.

"It's more efficient because um... the written expression is shorter"

Wait until the hear about base Googol

danielvaughn · 2 years ago

Rygian · 2 years ago

> Surprisingly, if you allow a base to be any real number, and not just an integer, then the most efficient computational base is the irrational number e.

Now I'm left with an even more interesting question. Why e? The wikipedia page has some further discussion, hinting that the relative efficiency of different bases is a function of the ratio of their natural logarithms.

jetrink · 2 years ago

The "area" that you want to minimize is the number of digits, d, times the base, b.

    A = d * b

d is roughly equal to the log of the number represented, N, base b.

    d ~= ln(N)/ln(b)

Substituting,

    A ~= b * ln(N) / ln(b)

Take the derivative of the area with respect to b and find where the derivative is zero to find the minimum. Using the quotient rule,

    dA/db = ln(N) * (ln(b)*1 - b/b) / ln(b)^2

    0 = ln(N) * (ln(b) - 1) / ln(b)^2

    0 = ln(b) - 1

    ln(b) = 1

    b = e

I hope I got that right. Doing math on the internet is always dangerous.

klyrs · 2 years ago

> Doing math on the internet is always dangerous.

Only if you see correctness as a thing to be avoided. In my experience, being Wrong On The Internet is the fastest way to get something proofread.

l- · 2 years ago

Although the cost function has a multiplication of a base times the floor of the log of the value with respect to that base plus one, area is a misleading analogy to describe the applicability as any geometric dimensional value has to taken with respect to a basis. For a visual, (directional) linear scaling is more in line so to say.

vitus · 2 years ago

A related point is comparing x^y vs y^x, for 1 < x < y.

It can be easily shown that the "radix economy" described in the article is identical to this formulation by simply taking the logarithm of both expressions (base 10 as described in the article, but it doesn't really matter, as it's just a scaling factor; this doesn't change the inequality since log x is monotonically increasing for x > 0): y log x vs x log y. Or, if you want to rearrange the terms slightly to group the variables, y / log y vs x / log x. (This doesn't change the direction of the inequality, as when restricted to x > 1, log x is always positive.) If you minimize x / log x for x > 1, then you find that this minimum value (i.e. best value per digit) is achieved at x=e.

(Choosing the base = e for calculation purposes: take a derivative and set to zero -- you get (ln x - 1) / (ln x)^2 = 0 => ln x - 1 = 0 => ln x = 1 => x = e.)

For some intuition:

For small x and y, you have that x^y > y^x (consider, for instance, x=1.1 and y=2 -- 1.1^2 = 1.21, vs 2^1.1 is about 2.14). But when x and y get large enough, you find the exact opposite (3^4 = 81 is larger than 4^3 = 64).

You might notice that this gets really close for x=2 and y=3 -- 2^3 = 8, which is just barely smaller than 3^2 = 9. And you get equality in some weird cases (x=2, y=4 -- 2^4 = 4^2 = 16 is the only one that looks nice; if you consider 3, its pairing is roughly 2.47805).

It turns out that what really matters is proximity to e (in a weird sense that's related to the Lambert W function). You can try comparing e^x to x^e, or if you want, just graph e^x - x^e and observe that's greater than 0 for x != e.

https://www.wolframalpha.com/input?i=min+e%5Ex-x%5Ee

paulsmith · 2 years ago

> To see why, consider an important metric that tallies up how much room a system will need to store data. You start with the base of the number system, which is called the radix, and multiply it by the number of digits needed to represent some large number in that radix. For example, the number 100,000 in base 10 requires six digits. Its “radix economy” is therefore 10 × 6 = 60. In base 2, the same number requires 17 digits, so its radix economy is 2 × 17 = 34. And in base 3, it requires 11 digits, so its radix economy is 3 × 11 = 33. For large numbers, base 3 has a lower radix economy than any other integer base.

I thought that was interesting so I made (well, Claude 3.5 Sonnet made) a little visualization, plotting the radix efficiency of different bases against a range of numbers:

https://paulsmith.github.io/radix-efficiency/radix_effciency...

Manabu-eo · 2 years ago

Base 4 is surprisingly competitive, but of course never better than base 2. Base 5 is the highest base that could stand at the pareto frontier, but just once and then never more.

Iam feeling extremely uncomfortable seeing people in this thread being absolutely unfamiliar wrt basic electronics and basic CS fundamentals.

Ternary system has very limited Energy efficiency benefit compared to binary - roughly 1.5 more efficient and a lot more difficult to trasmit over differential lines. Today the latter is a big concern.

BuyMyBitcoins · 2 years ago

I would like to become more familiar with such things, but my CS education was lacking in this regard. It was almost entirely geared towards programming, and none of these things come up in my career.

I suspect this is widespread.