Some thoughts on journals, refereeing, and the P vs NP problem

I’m not up-to-date on how advanced proof assistants have become, but are we not nearing the point where serious attempts at proving P vs NP can be automatically validated (or not) by a widely vetted collection of Lean libraries?

The P vs NP problem is expressible as the question of whether a specific Π_2 sentence is true or not (more specifically, whether a particular base theory proves the sentence or its negation). Unlike problems involving higher order set theory or analytical mathematics, I would think any claimed proof of a relatively short arithmetical sentence shouldn’t be too difficult to write up formally for Lean (although I suppose Fermat’s Last Theorem isn’t quite there either, but my understanding is we’re getting closer to having a formal version of it).

The impact of this would be that journals could publicly post which specific sentences they consider to represent various famous open problems, and a prerequisite to review is that all purported proofs require a formal version that automatically Lean-validates. This would nix the whole crank issue.

Then again, I may be way off base with how close we are to achieving something like I’ve described. My impression is that this goal seems about 5 years away for “arithmetical” mathematics, but someone better in the know feel free to correct me.

Tainnor · a month ago

The effort from going to a paper proof, even one that everyone agrees is correct, to a formally verified one is still gigantic. Requiring all submissions in maths papers to come with a verified Lean proof would be a complete non-starter - most mathematicians don't even use Lean.

In many cases, the problem isn't that the statement of the theorem can't be properly expressed in Lean. It's very easy to write down the statement of FLT in Lean, for example. But for theorems whose proofs have been elusive for decades or centuries, the proofs very likely include objects in rather advanced mathematics that are not at all easy to encode in Lean. This is one of the current struggles of the FLT formalisation process, for example.

In any case, the problem in this case has nothing to do with formalisation, it's simply a case of a bad submission that for unclear reasons didn't get properly peer reviewed.

Xcelerate · a month ago

Thanks for sharing! I didn’t think about all of the intermediate steps that might be difficult to encode, but that makes sense.

andrewla · 25 days ago

I think it's reasonably clear at this point that an elementary proof that P != NP is unlikely to exist; in all likelihood such a proof would have to traverse many other fields of mathematics to get back to the main result, and would almost certainly hinge on areas of mathematics that have proven difficult to formalize so far.

pron · a month ago

The problem is this: a proof of P ≠ NP is likely to appeal to some theorem in some area of mathematics. Whether or not that theorem has already been formalised is actually unimportant, because it's already accepted as convincing. But the core of the proof would be some sort of a reduction from P ≠ NP to that theorem, and that reduction is likely to be novel and very arduous to formalise.

Tangential, but for some reason P vs. NP attracts an ungodly amount of cranks, probably because as far as open problems of that importance go it's easy to understand the question.

CJefferson · a month ago

I agree, but think it's worse.

It's easy to get a superficial understanding of the problem, and then very easy to subtly mis-understand it.

I've reviewed published papers by respectable people where they've made one of several easy mistakes:

* Be careful about how you encode your problem, it's easy to make it "too large", at which point your problem can be solved in P in the input size. For example, don't represent a sudoku as triples "x-pos,y-pos,value", where those 3 numbers are encoded in binary, because now if I give you a problem with only 2 filled in values, you can't take the solution as your 'certificate', as your input is about size 6 * log(n), but the solution will be n * n * log(n).

* It's also easy if you write your problem as a little language to accidentally make it impossible to check in P time.

* When proving a reduction (say turning SAT into a Sudoku, to prove Sudoku is NP-complete), it's (usually) easy to show how solutions map to solutions (so you show how the SAT instance's solution turns into a particular solution to the Sudoku). It's usually much harder, and easy to make mistakes, showing the Sudoku can't possible have any other solutions.

eru · a month ago

I've also seen people make the wrong direction of reduction.

Basically, they show that you can use SAT to solve Sudoku, and then claim that this makes Sudoku NP-complete. (All it shows is that Sudoku is in NP.)

People make similar mistakes often when they want to show that a certain problem isn't solvable in linear time, and they try to show that sorting can solve your problem. But it's the wrong direction.

hejsansvejsan · a month ago

There's nothing subtle about the mistake in the paper at hand. The reason everybody expects proving P != NP to be difficult is that it's very hard to say anything at all about arbitrary programs. The authors just assume without justification that any program that solves SAT must operate in a certain recursive way -- obvious rubbish. It's hard to overstate how embarrassing this is for the Springer journal where this nonsense is published.

andrewla · 25 days ago

I think both easy to understand, and also it seems very obvious that non-deterministic Turing machines are more powerful than deterministic ones. It feels almost like a non-deterministic Turing machine is more powerful almost in the sense that a halting oracle is more powerful.

The insane thing is that non-deterministic Turing machines are computable at all! It really feels like they belong to a different class of "magical" computers, or an axiom-of-choice / Banach-Tarski naval-gazing infeasible trick. You mean, you just "guess" the answers and your program will get the right answer?

But they are computable; the Church-Turing model of computation is fantastically powerful. Now the problem is just "feasibility" and "complexity". It seems to the initiate that there must be an answer hiding just around the next corner because it's SO OBVIOUS but in the end if you give someone n^100 time they can solve any problem that you care to pose but that still counts as P, so you're not going to stumble upon some grand insight.

zero_k · a month ago

I work on a (once top-of-the-line) SAT solver [1] and a (currently top-of-the-line) model counter [2]. Actually, I am very interested in the part of the rebuttal of "when each constraint has at most two variables, then the constraint satisfaction problem (and even the more difficult problem of counting the number of solutions) can be solved in time less than the lower bound that is claimed" -- in the model counting competition [3] there are actually problems that are binary-clause only, and I have to admit I am having trouble counting them any smarter than I already do normal (i.e. >=3 length clause) problems. Is there some very fun algorithm I'm missing that I could use for only-binary clause solution counting? I have thought about it, but I just... can't come up with anything smarter than compiling it into a d-DNNF form, which most SOTA model counters (and so I as well) do.

[1] https://github.com/msoos/cryptominisat/ [2] https://github.com/meelgroup/ganak/ [3] https://mccompetition.org/

simiones · a month ago

Isn't this the claim for which they link to a paper [0] directly, which describes an algorithm by Ryan Williams for achieving this bound?

[0] https://www.sciencedirect.com/science/article/pii/S030439750...

Ah, interesting. Thanks, I'll look into it!

pron · 25 days ago

The paper's authors' reply to the comment is really bad. E.g., it says this:

> With respect to the assumption, we have clearly elaborated on why a mathematical assumption is essential for proving computational hardness results in Appendix A of our paper as follows: In fact, the Turing machine itself is an assumption about machines (i.e., mechanized finite formal systems) that can be realized in the physical universe. In essence, the Turing machine represents a fundamental physical assumption, and Turing’s findings on uncomputability signify the physical limits of mechanically solving all instances of a problem.

But Turing's assumption was necessary precisely because it was not mathematical. He had to cross the gap between physics (where computation takes place) and mathematics (which he wanted to use to discuss computation) with an assumption about physics that allowed him to construct a mathematical definition of computation. The authors of the paper, on the other hand, make an assumption within mathematics that could simply be mathematically wrong. It's like "proving" Golbach's conjecture by assuming that if an even number were ever not the sum of two primes, then that number must also be divisible by 17 and proceeding from there. Their "proof" is essentially: we assume that if a tractable algorithm (a mathematical construct) for SAT existed, it would be done in a certain way [1], but that way isn't possible ergo a tractable algorithm for SAT cannot exist.

Physics is based on (mathematically) unprovable assumptions. In mathematics, though, "assumptions" are conditionals. The paper should have said, we make a mathematical conjecture that, if true, would lead to our conclusion. You can't assume a mathematical conjecture and then call the entailment (from that conjecture) of X a proof of X. It's calling a proof of `A ⇒ B` a proof of `B`. I agree that a paper that aims to present a mathematical proof that is based on a redefinition of what a mathematical proof means should have led to the paper's outright rejection.

[1]: The paper says: "We only need to assume that this task is finished by dividing the original problem into subproblems" (... of the same kind)

P.S.

What's funnier is that their assumption requires an even bigger leap of faith than their conclusion, i.e. more people are inclined to believe that P ≠ NP than that if someday somebody did find a tractable algorithm for SAT, that algorithm would surely work by division into smaller subproblems.

brohee · 25 days ago

Wait, the obviously wrong paper was written by GPT-4? https://arxiv.org/pdf/2309.05689 (from the comments on the original blog)

shusaku · a month ago

“Frontiers” journals are crap, so it’s no surprise. But I don’t understand why editors let this happen. It’s similar to academic fraud: if you lie about results nobody cares about, you’ll never get caught; if you lie about big things, suddenly the hammer comes down. And for what benefit? It’s not like they walked away with a payday from splitting the open access fee. There’s nothing wrong with desk rejecting an author by saying “if your result is true, publish it somewhere better”

flobosg · a month ago

Despite the similarity in naming, Frontiers of Computer Science (Front. Comput. Sci.) is not published by Frontiers Media SA, but Springer and Higher Education Press. Note, however, that Frontiers Media does publish Frontiers *in* Computer Science (Front. Comput. Sci. (Lausanne)).

bwfan123 · 25 days ago

The issue here is that the deputy editor-in-chief of the journal is also an author of the paper. As such, the conflict of interest should have raised the bar to the acceptance of the paper.

Here is a business idea. Start a "Frontier Journal" published electronically as a pdf. Contact professors, grad students, etc for paper submissions. Publish their papers for a fee, say $100. The paper will buff their resumes and pad their egos. Send out proceedings as pdf to participants. Everybody profits including the environment. I have seen this scam play out.

jojomodding · 25 days ago

Having such a paper mill paper on your CV will do the opposite of "boasting" your CV when people from your field look at it. Usually this happens at your PhD defense at the latest.

qsort · a month ago