Love this article and the push to build awareness of what modern SAT solvers can do.
It's worth mentioning that there are higher level abstractions that are far more accessible than SAT. If I were teaching a course on this, I would start with either Answer Set Programming (ASP) or Satisfiability Modulo Theories (SMT). The most widely used solvers for those are clingo [0] and Z3 [1]:
With ASP, you write in a much clearer Prolog-like syntax that does not require nearly as much encoding effort as your typical SAT problem. Z3 is similar -- you can code up problems in a simple Python API, or write them in the SMT-LIB language.
Both of these make it easy to add various types of optimization, constraints, etc. to your problem, and they're much better as modeling languages than straight SAT. Underneath, they have solvers that leverage all the modern CDCL tricks.
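To make the encoding-effort gap concrete, here is a toy sketch (my own illustration, not from either tool's documentation) of what a one-line "exactly one of these" constraint -- free in ASP or SMT -- costs when hand-encoded as CNF:

```python
from itertools import combinations

def exactly_one(variables):
    """Naive CNF encoding of 'exactly one of these variables is true'.

    A single high-level constraint expands to 1 + C(n, 2) clauses:
    one 'at least one' clause, plus a pairwise 'at most one' clause
    for every pair of variables.
    """
    at_least_one = [list(variables)]                   # x1 v x2 v ... v xn
    at_most_one = [[-a, -b] for a, b in combinations(variables, 2)]
    return at_least_one + at_most_one

clauses = exactly_one([1, 2, 3, 4])
print(len(clauses))  # 7 clauses (1 + 6 pairs) for just four variables
```

And this is one of the friendlier constraints; cardinality, optimization, and arithmetic get much worse in raw CNF, which is exactly what the higher-level modeling languages hide.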
We wrote up a paper [2] on how to formulate a modern dependency solver in ASP; it's helped tremendously for adding new types of features like options, variants, and complex compiler/arch dependencies to Spack [3]. You could not get good solutions to some of these problems without a capable and expressive solver.
There are also Constraint Programming solvers (some SAT based, some not) and (Mixed) Integer Programming solvers (not SAT based).
Each "school" excels at different types of problems. ASP for modelling a knowledge-base and running queries on it, CP for discrete optimization problems or for all-solution search, SMT for formal verification and proofs, MIP for optimization of (mostly) continuous variables.
Modern solvers in these "schools" can do things traditionally meant for other "schools". Z3 can do optimization, clingo can include CP-style constraints with clingcon, some MIP solvers can find all solutions, etc.
All in all, this type of "classical" AI is super interesting and I hope the hype on LLMs doesn't suck up all the funding that would go to research in this area.
The work on [2] is fascinating to me, both because of the problem domain and as a case study on the effective application of ASP. I will be reading this paper carefully to pore over the details.
I read Potassco's Answer Set Solving in Practice book [0] but it's pretty dense. I suspect it would be easier to digest if you read it while also following their course materials, which are all online [1].
These days I recommend people start with the Lifschitz book [2] and read through the Potassco book [0]. Lifschitz's book is a much gentler introduction to ASP and logic programming in general and its examples are in ASP code (not math). It's also more geared towards the programming side than the solving side, which is probably better for most people until they really want to understand what clingo/gringo/clasp are doing and what their limitations are.
There are other more applied courses, like Adam Smith's Applied ASP course at UCSC [3]. The problems in that course look like a lot of fun.
> ... modern SAT solvers are fast, neat and criminally underused by the industry.
Modern SAT solvers are fast, neat and criminally undertaught by the universities. Seriously, why isn't this taught at the undergraduate level in CS70 [1] discrete math type courses or CS170 [2] analysis of algorithms type courses?
This is… not true. CS170 specifically teaches about reducing NP problems to SAT (you can find this in the Algorithms textbook linked in the class syllabus). I recall solving one of the projects by using MiniSat after converting a problem to 3-SAT. FWIW, the textbook is excellent and the course was very useful.
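For anyone curious what that conversion step looks like, here is a sketch of the standard clause-splitting reduction to 3-SAT (my own illustration, not the actual course project; the auxiliary-variable numbering is arbitrary):

```python
def to_3sat(clause, next_aux):
    """Split one wide clause into 3-literal clauses using fresh
    auxiliary variables -- the standard equisatisfiable reduction.

    (l1 v l2 v l3 v l4 v l5) becomes
    (l1 v l2 v a1) ^ (~a1 v l3 v a2) ^ (~a2 v l4 v l5)
    """
    if len(clause) <= 3:
        return [clause], next_aux
    out = [[clause[0], clause[1], next_aux]]   # first chunk introduces a1
    rest = clause[2:]
    while len(rest) > 2:
        # chain: previous aux implies the next chunk
        out.append([-next_aux, rest[0], next_aux + 1])
        next_aux += 1
        rest = rest[1:]
    out.append([-next_aux] + rest)             # last chunk closes the chain
    return out, next_aux + 1

clauses, _ = to_3sat([1, 2, 3, 4, 5], next_aux=6)
print(clauses)  # [[1, 2, 6], [-6, 3, 7], [-7, 4, 5]]
```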
I definitely recall doing reductions from SAT in Algorithms courses. I think that is a common part of most curricula.
I don't recall being taught any practical uses of SAT. It was introduced only in the context of Cook's theorem, as the problem you needed to reduce to other problems in order to show NP-completeness.
I think most people now learn SAT in that theoretical context, not as a tool to solve problems.
To clarify, you're specifically talking about reductions to SAT, not from SAT, right?
Note the former is used as a solution technique for feeding into SAT solvers, where the latter's goal is basically the exact opposite (to show NP-hardness and hence algorithmic intractability). Formal methods courses do the former, but algorithms courses usually use SAT for the latter.
I used to teach formal methods at university, including a course with a lot of SAT examples. We tried to make it as practical as possible, with many examples and exercises in Java (where you just generate your formulas and call a solve method). Thing is, most students seemed to hate it passionately, just like with reductions to SAT in complexity theory.
I wrote a bespoke backtracking solver for a specific class of problems. Would love to use Z3 or something, but frankly, I wouldn't know how to systematically translate problem instances to solver constraints. It's essentially a kind of very complex job-shop scheduling problem with lots of dynamic inter-dependencies. Many of the problems are hard to even solve without deadlocking, while we also naturally strive to minimize overall processing time. Where would I find resources to help me get a grip on my specific problem [1]? Could I reasonably hope that Z3 or another general solver might be faster than a moderately well designed bespoke solver in a compiled language? (My solver combines a bunch of greedy heuristics with a backtracker in case it runs into a deadlock, which is often.)
[1] Rough problem outline: Input goods must be transformed through a complex series of assembly/machining/processing lines to output goods; each line transforms one or two inputs into an (intermediary or end-) product; an assembly line produces its product in a fixed number of time units and cannot be stopped or delayed; finished intermediary products arrive at the end of an assembly line, which must be cleared by then; there are a limited number of temporary storage spaces where intermediary products can be moved to/from in a fixed number of time units; some assembly lines must wait for two intermediary products to be completed to start a job combining them into another intermediary or end product; end products must then be moved to their destinations.
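Not the commenter's actual system, but the greedy-plus-backtracking shape described above can be sketched on a drastically simplified toy version of the problem (task names, the storage model, and the slot-count check are all illustrative assumptions):

```python
def schedule(tasks, deps, slots):
    """Toy sketch of greedy ordering with backtracking on deadlock:
    order tasks so each task's dependencies finish first and at most
    `slots` intermediate products sit in storage at any step."""
    order = []

    def in_storage(done):
        # an intermediate is "in storage" once produced but before
        # every task that consumes it has run
        return sum(1 for t in done
                   if any(t in deps.get(u, ()) and u not in done
                          for u in tasks))

    def backtrack(remaining, done):
        if not remaining:
            return True
        # greedy: try any task whose dependencies are already done
        for t in sorted(remaining):
            if all(d in done for d in deps.get(t, ())):
                done.add(t)
                order.append(t)
                if in_storage(done) <= slots and backtrack(remaining - {t}, done):
                    return True
                done.remove(t)   # dead end: undo and try another task
                order.pop()
        return False

    return order if backtrack(set(tasks), set()) else None

# diamond dependency: A feeds B and C, which both feed D
print(schedule(["A", "B", "C", "D"],
               {"B": ["A"], "C": ["A"], "D": ["B", "C"]}, slots=2))
# ['A', 'B', 'C', 'D']; with slots=1 the same instance is infeasible
```

A real instance would add time-indexed variables for the fixed processing durations and move times, which is exactly where a CP or SMT formulation (interval variables, cumulative resource constraints) tends to pay off over hand-rolled search.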
Both my alma maters had courses that used these extensively, and also planners like (Pl|Tr|L)ingeling. We also covered reducibility and SAT in multiple courses at both.
These should also be taught in an advanced PL course, e.g. liquid types, dependent types, etc.
I seem to recall that a poorly scaling SAT solver in conda-forge broke so badly in 2020 that it shifted the tectonic plates underneath the entire academic Python ecosystem away from conda and towards pip.
That was always a bit of a red herring, from my understanding. Yes, if you poorly model something into an ad hoc SAT solver, expect slowness.
Which is a bit of the general idea of these being underused. If you can get your problem into a SAT form or three, then feed it to a state-of-the-art solver, it can work amazingly well. But you will be spending a lot of time moving to and from the SAT formulation.
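The hand-off to the solver is the mechanical part: once you have clauses, serializing them into DIMACS CNF -- the format virtually every off-the-shelf SAT solver accepts -- takes a few lines (a minimal sketch):

```python
def to_dimacs(clauses, num_vars):
    """Serialize clauses (lists of signed ints, where -n means NOT xn)
    into DIMACS CNF, the lingua franca of SAT solver input."""
    lines = [f"p cnf {num_vars} {len(clauses)}"]       # header: vars, clauses
    lines += [" ".join(map(str, c)) + " 0" for c in clauses]  # 0-terminated
    return "\n".join(lines)

# (x1 v ~x2) ^ (x2 v x3)
print(to_dimacs([[1, -2], [2, 3]], num_vars=3))
# p cnf 3 2
# 1 -2 0
# 2 3 0
```

The expensive part is everything around this: inventing the variable encoding, and mapping the solver's model back into domain terms.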
From this and Dart, I think one of the lessons here is that SAT solvers are the wrong technique for solving dependencies. SAT solvers find “better” solutions by some metric, but humans using package managers want something which is both simpler and faster.
Also there had been a growing trend for most popular packages to offer precompiled wheels on PyPI instead of just sdist releases.
This meant that people who had moved to Conda because they couldn't get Pip to install important packages into their environment took another look and found that actually they could get things installed using Pip now.
At the same time Pip also got a resolver, giving you install-time confidence that you're not installing conflicting packages, and recently (Pip 23.1+) the resolver's backtracking got pretty good.
That said Conda mostly solved this (and once mamba is the default resolver engine things will be really fast), Pip is not ever going to be a package manager, and Poetry still isn't an environment manager, and most other Python package/installer alternatives to Conda won't do things like install your Jupyterlab's nodejs dependency.
After many years I now almost exclusively use Pip to install into an environment, but still nothing beats Conda for bootstrapping the non-Python-package requirements (such as Python itself), nor for getting things working when you are in a weird environment where you can't install OS dev libraries.
Is Conda actually moving towards making mamba the default? Last I heard, they were distinctly uninterested in that, since mamba is implemented in C++, and they would rather rely on their own slow Python code, which they can more easily modify.
[0] https://github.com/potassco/clingo
[1] https://github.com/Z3Prover/z3
[2] https://arxiv.org/abs/2210.08404, https://dl.acm.org/doi/abs/10.5555/3571885.3571931
[3] https://github.com/spack/spack
[0] https://potassco.org/book/
[1] https://teaching.potassco.org
[2] https://www.cs.utexas.edu/users/vl/teaching/378/ASP.pdf, https://www.amazon.com/Answer-Set-Programming-Vladimir-Lifsc...
[3] https://canvas.ucsc.edu/courses/1338
[1] https://www2.eecs.berkeley.edu/Courses/CS70/
[2] https://www2.eecs.berkeley.edu/Courses/CS170/
Schaefer's dichotomy theorem pins down exactly which restricted forms of SAT are tractable -- so when your problem allows it, restrict yourself to Horn clauses.
It takes a bit of thinking, but if you can refactor your problem to permit it, this is superior to heuristics or a general SAT solver.
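For reference, the reason Horn clauses are special is that they land on the tractable side of Schaefer's dichotomy: a Horn formula (at most one positive literal per clause) can be decided by plain forward chaining, with no search at all. A minimal sketch:

```python
def horn_sat(clauses):
    """Decide satisfiability of a Horn formula (<= 1 positive literal
    per clause) by forward chaining / unit propagation -- polynomial
    time, unlike general SAT."""
    true_vars = set()
    changed = True
    while changed:
        changed = False
        for clause in clauses:
            neg = [l for l in clause if l < 0]
            pos = [l for l in clause if l > 0]
            if all(-l in true_vars for l in neg):   # clause body satisfied
                if not pos:
                    return False                    # negative clause violated
                if pos[0] not in true_vars:
                    true_vars.add(pos[0])           # derive the head literal
                    changed = True
    return True

# (x1) ^ (x1 -> x2) ^ (x2 -> x3) is satisfiable; adding (~x3) breaks it
print(horn_sat([[1], [-1, 2], [-2, 3]]))        # True
print(horn_sat([[1], [-1, 2], [-2, 3], [-3]]))  # False
```

This quadratic-ish version is for clarity; with the right data structures the same idea runs in linear time.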
Many package dependency managers have used SAT solvers since SUSE spearheaded this in 2007/2008 with libsolv, which is blazing fast for large repositories.
https://research.swtch.com/version-sat
The CNF Converter is a gem.
https://github.com/scala/scala/blob/v2.13.5/src/compiler/sca...
I don't know anything about the Scala compiler. A few years ago I needed a CNF converter and I ripped out their logic module.
(Performant CNF converters are harder to find than SAT solvers.)
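For context on why such converters matter: the core of a practical one is the Tseitin transformation, which keeps clause count linear in formula size by naming subformulas with fresh variables, where naive distributive-law expansion can blow up exponentially. A minimal sketch (the variable numbering and the brute-force checker are my own illustration, not the Scala compiler's code):

```python
import itertools

def tseitin_and(a, b, out):
    """Tseitin clauses asserting out <-> (a AND b)."""
    return [[-a, -b, out], [a, -out], [b, -out]]

def tseitin_or(a, b, out):
    """Tseitin clauses asserting out <-> (a OR b)."""
    return [[a, b, -out], [-a, out], [-b, out]]

def brute_sat(clauses, n):
    """Check satisfiability by brute force (fine for tiny examples)."""
    for bits in itertools.product([False, True], repeat=n):
        assign = {i + 1: v for i, v in enumerate(bits)}
        if all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses):
            return True
    return False

# Encode f = (x1 AND x2) OR x3 with auxiliaries t4 = x1&x2, t5 = t4|x3,
# then assert the top-level variable t5.
cnf = tseitin_and(1, 2, 4) + tseitin_or(4, 3, 5) + [[5]]
print(brute_sat(cnf, 5))  # True: e.g. x3 alone satisfies f
```

Production converters add the usual refinements (polarity-aware "Plaisted-Greenbaum" clauses, structure sharing), which is where the performance gap the comment mentions comes from.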
He has a few open-source SAT solvers and tools that provide good, proven examples of modern SAT solver techniques.
https://www.anaconda.com/blog/understanding-and-improving-co...