triska · 3 years ago
One of the most interesting results I found in Leslie Lamport's papers is Buridan's Principle:

A discrete decision based upon an input having a continuous range of values cannot be made within a bounded length of time.

Quoting from https://lamport.azurewebsites.net/pubs/buridan.pdf:

"The significance of Buridan’s Principle lies in its warning that decisions may, in rare circumstances, take much longer than expected. Before the problem was recognized by computer designers, some computer systems probably failed regularly (perhaps once or twice a week) because arbiters took longer than expected to reach a decision. Real accidents may occur because people cannot decide in time which of two alternative actions to take, even though either would prevent the accident. Although Buridan’s Principle implies that the possibility of such an accident cannot be eliminated, awareness of the problem could lead to methods for reducing its probability."

In the accompanying notes at https://lamport.azurewebsites.net/pubs/pubs.html, Lamport states:

The four reviews ranged from "This well-written paper is of major philosophical importance" to "This may be an elaborate joke." One of the other reviews was more mildly positive, and the fourth said simply "My feeling is that it is rather superficial." The paper was rejected.

Animats · 3 years ago
This is a very real phenomenon. But Lamport didn't discover it. It was known a decade earlier. The classic paper is Kinniment and Wood, 1976.[1] The earliest patent in this area is from 1972.[2] It's quite difficult to do this right. Arbiters today recognize "A was first", "B was first", and "ambiguous, must retry". The system has to be prepared for rare retry delays. In theory there can be an infinite number of retries, but the odds of needing more than one retry are very low.
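To sketch the three-outcome idea in software (a toy model only — real arbiters are analog circuits, and the "ambiguous" region is metastability rather than a clean threshold; the resolution and jitter numbers here are made up):

```python
import random

RESOLUTION = 1e-12  # assumed: events closer together than this can't be ordered

def arbitrate(t_a, t_b):
    """Return which request came first, or 'ambiguous' if too close to call."""
    if abs(t_a - t_b) < RESOLUTION:
        return "ambiguous"
    return "A was first" if t_a < t_b else "B was first"

def arbitrate_with_retry(t_a, t_b):
    """Retry on ambiguity; jitter on re-sampling makes a repeated tie unlikely."""
    retries = 0
    while arbitrate(t_a, t_b) == "ambiguous":
        retries += 1
        t_a += random.gauss(0.0, 1e-10)   # re-sample after a delay; noise
        t_b += random.gauss(0.0, 1e-10)   # breaks the tie with high probability
    return arbitrate(t_a, t_b), retries
```

In principle the loop can run forever; in practice the probability of needing k retries falls off geometrically, which is why "must retry" is an acceptable answer.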

It comes up all the time in multiport memories, where several CPUs are accessing the same memory, and there has to be an arbiter to decide who gets access for near-simultaneous requests. Requests can be very close to simultaneous. If two CPUs with 3 GHz clocks are contending for memory access, and the clocks are not synchronized but close, about once per second they will be within 10^-19 seconds of whatever difference is most troublesome.
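Back-of-envelope for that rate (my arithmetic, assuming roughly one request per clock cycle landing at an effectively random offset within a clock period):

```python
period = 1 / 3e9        # ~3.3e-10 s, one 3 GHz clock period
window = 1e-19          # width of the "most troublesome" alignment window, seconds
requests_per_second = 3e9
# Expected number of requests per second falling inside the window:
print(requests_per_second * window / period)   # ~0.9, i.e. about once per second
```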

This was a serious problem with some early multiprocessor mainframes. Early in my career we saw this happening with multiprocessor UNIVAC 1108 machines. They had an unsound arbiter, and, once in a while, every few hours, they'd botch a memory access. Early on, the software stability was so bad the OS crashed more often than the hardware did, but as the OS got better, it became clear there was a race condition at the hardware level.

[1] http://async.org.uk/David.Kinniment/Research/papers/IEE1976....

[2] https://patents.google.com/patent/US3824409A/en

gradschool · 3 years ago
Kinniment expands on the idea at length in [1]. Chaney is another good source on the history of arbiters [2]. Specialists in asynchronous circuit design think about arbiters a lot. Alex Yakovlev and Alain Martin have published some papers about them.

[1] http://www.async.org.uk/David.Kinniment/DJKinniment-He-Who-H...

[2] https://arl.wustl.edu/~jst/cse/260/glitchChaney.pdf

edit: typo

lisper · 3 years ago
Why is an analog-to-digital converter not a counterexample? For that matter, why is a comparator (a one-bit ADC) not a counterexample? Or, for that matter, a wire tied to ground? Or a 3x5 card with the word "YES" printed on it?
kazinator · 3 years ago
I think it's because these devices do not always make the correct decision between two values.

Lamport's paper covers this in the interrupt example: a computer must decide whether an interrupt line is signaled or not in a finite amount of time (the next instruction). The interrupt is continuous: it passes through some halfway voltage level like 2.5 between 0 and 5, and so it can be called the wrong way.

Basically the whole result is a restatement of real numbers not being computable.

The discrete sample value you get from an ADC is not based on knowing the underlying real number absolutely and then choosing the nearest value. It's a time-limited process of discovering the value partially. Since the value is only partially known, it is not known whether the reported sample value is the closest one.
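A successive-approximation converter makes this concrete (a minimal sketch of mine, not any particular ADC's design): each comparison halves the remaining interval, so after n bounded-time steps the value is known only to within (hi - lo)/2^n, and inputs closer than that to a code boundary can legitimately be reported as either adjacent code.

```python
def sar_adc(sample, n_bits=8, lo=0.0, hi=5.0):
    """Quantize `sample` with n_bits comparisons (binary search on voltage)."""
    code = 0
    for _ in range(n_bits):
        mid = (lo + hi) / 2
        code <<= 1
        if sample >= mid:     # one bounded-time comparison per bit
            code |= 1
            lo = mid
        else:
            hi = mid
    return code

# Inputs straddling the 2.5 V code boundary by a hair: the choice between
# codes 127 and 128 rests on a single comparison -- exactly the kind of
# continuous-input discrete decision Lamport's argument is about.
print(sar_adc(2.5 - 1e-12), sar_adc(2.5), sar_adc(2.5 + 1e-12))  # 127 128 128
```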

EGreg · 3 years ago
I emailed back and forth with Leslie a lot about this when I first read his paper. There is an earlier analysis of the same phenomenon by different people (the arbiter problem), which I discovered later.

At first, I had trouble believing it, and tried to find counterexamples, until I realized that the composition of continuous functions is continuous. I still wonder sometimes whether an infinite composition of continuous functions (the limit of a sequence) has to be continuous if it represents some real-world evolving process. Can it produce a discontinuity in the limit?

Anyway, Buridan’s principle informs our consensus process in Intercoin Protocol[1] [2] — it is why the consensus groups are always of a bounded finite size, so we don’t approach the errors in Buridan’s principle (I think such problems are also called unstable problems).

1. Beyond blockchains https://community.intercoin.org/t/intercoin-technology-conse...

2. Experimental improvements https://community.intercoin.org/t/experimental-improvements-...

Anyone who reads the above … I would be very interested in your feedback on our forum.

raegis · 3 years ago
> I still wonder sometimes whether an infinite composition of continuous functions has to be continuous (limit of a sequence) ...

No, as you expected. Take f(x) = x^2 on [0,1]. Then the iterates f(f(f(...f(x)...))) converge pointwise to g(x) = { 0, if 0 <= x < 1; 1, if x = 1 }.
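You can watch it happen numerically (a quick sketch):

```python
def iterate_square(x, n):
    """Compute f(f(...f(x))) with n copies of f(x) = x^2."""
    for _ in range(n):
        x = x * x
    return x

for x0 in (0.5, 0.9, 0.999, 1.0):
    print(x0, iterate_square(x0, 50))
# Everything below 1 collapses to 0 (0.999 just takes longer); only x0 = 1
# survives -- a discontinuous limit of continuous functions.
```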

Deleted Comment

im3w1l · 3 years ago
I think Buridan's ass is like an inverted pendulum. There is no particular reason why it should fall either way but the slightest gust of wind or vibration will trigger a self-reinforcing process that breaks the stalemate.

Now, an inverted pendulum can be stabilized with a control system, and it might be possible to relate this to adversarial inputs for such a system.
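To make the analogy quantitative (a toy linearization, my numbers): near the unstable equilibrium the dynamics look like dx/dt = kx, so a perturbation x0 grows as x0·e^(kt), and the time to visibly fall to one side is (1/k)·ln(X/|x0|), which is unbounded as x0 → 0 — the same unbounded decision time Buridan's Principle describes.

```python
import math

def escape_time(x0, k=1.0, threshold=1.0):
    """Time for x' = k*x, x(0) = x0 != 0, to reach |x| = threshold."""
    return math.log(threshold / abs(x0)) / k

for x0 in (1e-1, 1e-3, 1e-6, 1e-12):
    print(f"initial tilt {x0:g}: falls after t = {escape_time(x0):.1f}")
# Halving the initial tilt only adds ln(2)/k to the wait, but there is no
# upper bound: a perfectly balanced pendulum never decides.
```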

pontusrehula · 3 years ago
> Real accidents may occur because people cannot decide in time which of two alternative actions to take, even though either would prevent the accident.

Reminds me of one of the Boeing 737 crashes, where the pilots were reading the manual but did not have enough time to reach the relevant pages.

grayclhn · 3 years ago
From skimming the paper, "continuity" seems to be a deceptively strong assumption here, and the claim "this is a physical system so it's continuous" is doing a lot of heavy lifting. Unless I'm misunderstanding the example in the first paragraph, "move towards point 1 at a constant pace" can't be represented as a continuous A_t(x) for t above some finite threshold.
thfuran · 3 years ago
Why not? How are you getting discontinuities out of constant velocity motion?
zozbot234 · 3 years ago
> A discrete decision based upon an input having a continuous range of values cannot be made within a bounded length of time.

I understand that this is true for deterministic decisions. But a limited amount of randomness/noise is generally an acceptable alternative to a requirement for unbounded precision.

Deleted Comment

l33t2328 · 3 years ago
What about a decision like “can this wave be faithfully represented by this set of points?”

Wouldn’t the Sampling Theorem give an answer in some bounded time? Is the idea that the time required to crank the algorithm can grow without bound?

Kranar · 3 years ago
I feel like Buridan's Principle is either poorly worded, or most likely I just completely fail to understand it. The principle says:

"A discrete decision based upon an input having a continuous range of values cannot be made within a bounded length of time."

But certainly I can make a discrete decision in a constant amount of time: just choose to always go left. I must therefore be missing something here.

The next possibility is that a discrete decision procedure that guarantees success within a bounded amount of time is not possible. This is more sensible, and I think the idea is that there will be some kind of infinite regress. For example, a donkey will starve to death if it doesn't eat within 100 seconds. The donkey can choose to walk left X meters, or walk right Y meters, to some food.

There are certain values of X and Y where the donkey dies no matter what, if X and Y are sufficiently far away that the donkey can never get to them in 100 seconds then the donkey dies.

There are also certain values of X and Y where the donkey can pick either of them and survive since the donkey can get to X or Y in well less than 100 seconds, so the donkey lives.

The above two scenarios are boundary conditions of sorts, where it's fairly trivial to come to a decision. The principle is about what happens when there is a value of X where it takes almost exactly 100 seconds to get to X (and assume way more than 100 seconds to get to Y). In order for the donkey to come to that realization, it needs to spend some time thinking, and by the time it has made a decision it won't have enough time to carry the decision out. So the donkey has to take into account not only the time to get to X, but also the time it takes to decide to go to X; but that too takes some finite amount of time... so the donkey has to take time to decide how long it takes to decide to go to X, and so on and so forth... and you end up with an unbounded amount of time.

I could be way off here but I feel there is some subtlety in the way this principle is described that I'm missing.
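Here's that regress in toy form (my sketch, with a made-up "refine the estimate" procedure): the number of refinement steps needed before the error bar clears the deadline grows without bound as the true travel time approaches 100 seconds.

```python
def decide(travel_time, deadline=100.0):
    """Refine a travel-time estimate until it no longer straddles the deadline."""
    precision, steps = 10.0, 0
    while abs(travel_time - deadline) < precision:
        precision /= 2          # each refinement costs the donkey time
        steps += 1
    return ("go" if travel_time < deadline else "stay"), steps

for t in (90.0, 99.0, 99.99, 99.9999):
    print(t, decide(t))
# (90.0, ('go', 0)) ... (99.9999, ('go', 17)): the closer the case is to
# the boundary, the longer the decision itself takes.
```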

Deleted Comment

Deleted Comment

gryn · 3 years ago
The signal needs to have a maximal frequency for the sampling theorem to be applicable. Not all functions on a continuous domain have that; in fact I think it's the opposite: most don't, unless proved otherwise.
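Concretely (my own example): two sinusoids whose frequencies differ by exactly the sampling rate produce identical samples, so no decision made from the samples can tell them apart.

```python
import math

fs = 100.0                 # sampling rate in Hz (made-up number)
f1, f2 = 3.0, 3.0 + fs     # f2 lies above the Nyquist limit and aliases onto f1

s1 = [math.sin(2 * math.pi * f1 * n / fs) for n in range(8)]
s2 = [math.sin(2 * math.pi * f2 * n / fs) for n in range(8)]
print(all(abs(a - b) < 1e-9 for a, b in zip(s1, s2)))  # True: identical samples
```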
bsedlm · 3 years ago
to be fair, the paper has a funny tone, like an elaborate joke would have... I feel like he may deliver a punch-line at any point before I finish reading the paper and then I'd have to choose whether to burst out laughing or not
froh · 3 years ago
uh --- what implications does this have for driving AI? Possibly none, if the acceptance criterion for AI is: "it kills fewer people than humans would. possibly others, but fewer overall."

?

Afton · 3 years ago
None, unless you think that people are somehow able to side-step this problem in a way that a computer can't mimic.
charcircuit · 3 years ago
Driving AI doesn't require continuous input variables. Approximations are good enough in the real world.
uvdn7 · 3 years ago
There would be no distributed systems without Lamport. His "Time, Clocks, and the Ordering of Events" paper is to distributed systems what Ted Codd's "A Relational Model of Data" paper is to databases (which is also math!). His elegant analogy with special relativity and space-time diagrams, his reasoning about distributed systems as state machines, TLA+; each one is breathtaking.

Paxos is so beautiful. It’s a piece of art. It’s way more elegant and beautiful than Raft (as it solves the smallest consensus problem instead of a log). I will die on this hill.
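For anyone who hasn't seen it, the whole single-decree ("synod") algorithm fits on a page. Here's my toy rendering with in-memory calls instead of messages and no failures simulated — an illustration, not Lamport's own specification:

```python
class Acceptor:
    def __init__(self):
        self.promised = -1     # highest ballot number promised
        self.accepted = None   # (ballot, value) last accepted, or None

    def prepare(self, n):
        """Phase 1b: promise to ignore ballots below n."""
        if n > self.promised:
            self.promised = n
            return ("promise", self.accepted)
        return ("nack", None)

    def accept(self, n, v):
        """Phase 2b: accept unless a higher ballot has been promised."""
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, v)
            return "accepted"
        return "nack"

def propose(acceptors, ballot, value):
    """One proposer attempt; returns the decided value, or None (retry)."""
    quorum = len(acceptors) // 2 + 1
    # Phase 1: collect promises from a majority.
    replies = [a.prepare(ballot) for a in acceptors]
    granted = [acc for tag, acc in replies if tag == "promise"]
    if len(granted) < quorum:
        return None
    # Safety rule: adopt the value of the highest-ballot prior acceptance.
    prior = [acc for acc in granted if acc is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2: ask a majority to accept.
    oks = sum(a.accept(ballot, value) == "accepted" for a in acceptors)
    return value if oks >= quorum else None

acceptors = [Acceptor() for _ in range(3)]
print(propose(acceptors, ballot=1, value="left"))   # left
print(propose(acceptors, ballot=2, value="right"))  # left: the choice sticks
```

That one safety rule — adopt the value from the highest-numbered prior acceptance — is what makes later proposers unable to overturn a chosen value.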

adolph · 3 years ago
The TLA+ Lectures are awesome:

https://www.youtube.com/watch?v=p54W-XOIEF8

[suddenly in a clown costume] "What kind of clown am I, claiming I can make you think better? Why should you pay attention to me?"

[in a suit] "This is not the time to be modest. I have done seminal research in the theory of concurrent and distributed systems for which I won the Turing award. You can stop the video now and look me up on the web." [long pause]

gralx · 3 years ago
Easter egg: at one point you can hear him typing on an IBM buckling spring keyboard. Something for the keyboard enthusiasts :)
butterisgood · 3 years ago
Viewstamped Replication (the other VR) is also very lovely. https://pmg.csail.mit.edu/papers/vr-revisited.pdf
_benedict · 3 years ago
I think you maybe overstate it, but I agree Paxos is nicer than Raft. The papers in which he outlines it are awful though.

Raft is more understandable, because they wrote a paper that was easy to understand. You are much better off reading Heidi Howard to understand Paxos, than Lamport.

gralx · 3 years ago
Lamport's early Paxos papers are interesting for a lot of reasons (I particularly like his simple textbook definition of equivalence classes in "Generalized Consensus and Paxos" (2005)), but I agree that the Paxos algorithm gets lost in all the fluff.

In 2016 some researchers at Stony Brook published the full (Multi-) Paxos algorithm using TLA+, including a couple of nice extras:

https://arxiv.org/abs/1606.01387

The spec is succinct enough to commit to memory and way easier to comprehend than Lamport's prose descriptions (Lamport has never specified the Paxos algorithm in TLA+, although you can find his TLA+ spec for Paxos consensus). The paper also includes a mechanically checked proof and an interesting overview of other related work.

LAC-Tech · 3 years ago
I have that paper printed out on my desk :) Won't pretend I understand it 100% but it taught me a lot about distributed systems.
fizwhiz · 3 years ago
TIL Edgar Codd went by Ted.
frogulis · 3 years ago
I don't think I'd even heard his name without the "F." initial in the middle either.
_448 · 3 years ago
At the end of explaining the "bakery algorithm", he says "I am proud that I stumbled on it." He doesn't say "I invented it", "I came up with it", "I wrote it", etc., etc.

In my career I have seen that people who are true geniuses are also very humble!

lifefeed · 3 years ago
There's a list of his papers with little notes by him on every one at https://lamport.azurewebsites.net/pubs/pubs.html . His casual notes are themselves an absolute education.

My favorite is on "Time, Clocks and the Ordering of Events in a Distributed System", where he applies the lessons of special relativity to understand computers, and he says:

> Jim Gray once told me that he had heard two different opinions of this paper: that it's trivial and that it's brilliant. I can't argue with the former, and I am disinclined to argue with the latter.

lanstin · 3 years ago
I think if one thoroughly understands this paper then all the rest of distributed systems theory makes sense.
talaketu · 3 years ago
Indeed - though I think this is a mathematical philosophy about theories. "Genius" is a nice word choice to emphasize your point - it refers to some kind of distinct spirit that happens to inspire the person, not the person themself. (A little like writing vs typing.)
andrepd · 3 years ago
Absolutely. Conversely, I find that the people who are exceedingly eager on self-aggrandisement are usually mediocre.
clpm4j · 3 years ago
100% agreed. Arrogance is a sure sign of brittleness.
nanna · 3 years ago
Leslie Lamport built Knuth's TeX into the user-friendly version to which he prepended the first two letters of his surname: LaTeX. He wrote an excellent, highly accessible guide, LaTeX: A Document Preparation System, which I wish I had read before setting off on a thousand random web pages.
contingencies · 3 years ago
This paper is a case study of program evolution. The author kept track of all changes made to TeX during a period of ten years, including the changes made when the original program was first debugged in 1978. The log book of these errors, numbering more than 850 items, appears as an appendix to this paper. The errors have been classified into fifteen categories for purposes of analysis.

I came to the conclusion that the designer of a new system must not only be the implementor and the first large-scale user; the designer should also write the first user manual. - Donald Knuth, 'The Errors of TeX' (1989) https://yurichev.com/mirrors/knuth1989.pdf (4.8MB)

... via https://github.com/globalcitizen/taoup

globular-toast · 3 years ago
Why did I never notice the "La" came from his name? Most people pronounce it like the rubber you get from a tree but my PhD supervisor pronounced it more like "Lah-Tech". I guess that is the correct pronunciation!
dboreham · 3 years ago
His later work prompted me to learn Order Theory, which has turned out to be useful for all sorts of things. Also quite closely related to Category Theory which I wouldn't have had much chance of grokking without first understanding Order Theory, I suspect.

I also used LaTeX heavily in the 80s so was surprised to see him pop up as a genius of distributed systems later (although that work was published much earlier it didn't get much exposure until the 90s). Like "oh that guy must be _really_ smart to excel in two quite different fields".

SkyMarshal · 3 years ago
Are there any good resources you recommend for learning Order Theory?
dboreham · 3 years ago
That's a good question. I don't think I found any useful books, although surely there are some out there. I believe I mostly read Wikipedia articles and posts on math stackexchange

e.g.

https://en.wikipedia.org/wiki/Join_and_meet

https://math.stackexchange.com/questions/1775926/when-is-a-j...
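The join-and-meet article clicks fastest with a concrete lattice; divisibility is my favorite toy example (meet = gcd, join = lcm):

```python
from math import gcd

def meet(a, b):
    """Greatest lower bound under divisibility: the gcd."""
    return gcd(a, b)

def join(a, b):
    """Least upper bound under divisibility: the lcm."""
    return a * b // gcd(a, b)

print(meet(12, 18), join(12, 18))  # 6 36
```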

mhh__ · 3 years ago
He wrote some interesting stuff on mathematics and physics in the 60s too but it's all lost to time apparently.
jamesblonde · 3 years ago
Great video. I met Leslie once; I sat beside him on the bus on the way to a conference around 8 years ago. He wasn't the chattiest, but if you bring up his work, he likes to talk. I think he was just over 70 years old, but still incredibly sharp. At the time Microsoft Research was shutting down its valley office, but they would still let him come in there - last one to put the lights out (metaphorically, for computer science research at the big IT companies). Nowadays, he couldn't do the research work he did there and at other places at any big IT company - it's R&D, with the emphasis on "D".
peppertree · 3 years ago
I believe VMware "adopted" Microsoft's research team, but that's the last I heard of them. These days the most interesting corporate research happens at Google, Nvidia, and OpenAI. I guess the forefront of research has moved on to ML, and many old-school researchers got left behind.
rhplus · 3 years ago
TLA+ and other formal language research is pursued by the RISE group in Microsoft Research:

https://www.microsoft.com/en-us/research/group/research-soft...

metadat · 3 years ago
There's lots of other research happening all over, but gets little attention probably due to non-existent or otherwise poor marketing beyond publishing papers.
InefficientRed · 3 years ago
There is tons of formal methods research happening in industry. Way more than in the days of Microsoft Research Silicon Valley.

Deleted Comment

IMTDb · 3 years ago
> Nowadays, he couldn't do the research work he did there and at other places at any big IT company - it's R&D, with the emphasis on "D".

I, on the other hand, am amazed by the progress being made today, and the research papers released on an almost weekly basis by companies like Google, Microsoft and co. Just look at the language and image models that are released, the progress on computer vision, etc.

I played with the OpenAI GPT-3 playground and I am blown away by the results. It's not perfect, but it's orders of magnitude better than what we had easy access to just a few months ago. I have started using it in my day-to-day life on some specific tasks, and I can't wait to see the next versions. I can't even start to fathom the amazing products people are going to build around it in the next decade.

It's absolutely true that these models/research serve the purpose of the company building them... but even Leslie says in the video that he did his entire career in the industry because that's where he saw many interesting challenges, and that his whole research was a means to an end.

R&D is still alive and kicking, and is going faster than ever. You might just be looking at the past with rose-tinted glasses, and needlessly undermining the present.

vfclists · 3 years ago
AI and ML are not true computer science because no one knows how these models work, not even the developers
avgcorrection · 3 years ago
> I think he was just over 70 years old, but still incredibly sharp.

Nice...

I don’t think someone at his level becomes dull at such a young senior age.

penguin_booze · 3 years ago
A friend used to joke that "we're doing R&D, too: run and debug".
RcouF1uZ4gsC · 3 years ago
I think one of the things that helped was his ability to come up with very catchy explanations and names. "Paxos" and "Byzantine Generals" have great memetic power versus some boring technical name.
User23 · 3 years ago
In my opinion Lamport is the greatest computing scientist since Dijkstra. His work on the win and sin predicate transformers is grossly neglected and underappreciated. Dijkstra was the first to formalize interrupts and nondeterminism, but Lamport took it to the next level.
rramadass · 3 years ago
Right, this is also my viewpoint! We need scientists like them to beat the importance of Mathematical Thinking into our lazy brains. No wishy-washy compromises, just full focus on the use of Correct Logic.

Now the next step is for somebody who understands their writings to teach us unwashed masses :-(