We’re trying to bring new ideas to the optimization space. We’ve used a bunch of technologies over time (CP, MIP, LP, heuristics, etc.). We’ve built our own solver (for https://www.nextmv.io) and learned a lot along the way.
Solvers are amazing. MIP solvers, for example, are some of the fastest and most mature software packages that exist. They’re also wickedly challenging to use effectively.
Do you use optimization solvers? Which ones? Why? Do you like them? Why or why not?
Both CPLEX and Xpress were founded in the eighties. In the nineties, CPLEX was acquired by ILOG (a French CP company), which in turn was purchased by IBM around 2009. Around the same time, the original technical co-founder of CPLEX, along with the two latest head developers, left CPLEX to found Gurobi. Since then, there has been a slow trickle of developers leaving CPLEX for Gurobi... until 2020, when CPLEX suddenly lost its 7 remaining devs over 6 months (because of catastrophic mismanagement at IBM, from what I heard). Unsurprisingly, those devs ended up mostly at Gurobi, resulting in the CPLEX team from 20 years ago being essentially Gurobi now. Other CPLEX devs also ended up at XPRESS, which had been purchased around 2008 by FICO (the credit rating company).
Meanwhile, there is also a smaller Danish company, Mosek, that does its own thing (they have a MIP solver, but their focus seems to be on their amazing conic optimization code). And SAS (the analytics giant) has a small MIP team too.
Then over the last 2 years, three new solvers appeared out of China: COPT (by Cardinal Operations, a startup by Stanford graduates), MindOpt (Alibaba Research) and OptVerse (Huawei). They mostly have LP solvers for now, but for newcomers, the performance is extremely impressive. They didn't come entirely out of nowhere, though: COPT in particular has hired several devs from the incumbents.
On the academic side, ZIB (a PhD-granting research institute in Berlin) maintains a source-available family of solvers, and has been a steady provider of talent for commercial solvers. The dev behind SoPlex (LP solver) went to CPLEX after his PhD and now Gurobi. The main dev behind SCIP did the same, and is now VP of R&D at Gurobi. Many more XPRESS and Gurobi people did their PhDs at ZIB.
The Coin-OR open-source codes for LP (clp) and MIP (cbc) were written decades ago by a founding father of computational optimization, John Forrest, now retired from IBM research. Their source code is difficult to read, and Coin-OR aims to eventually replace them with a new code, HiGHS. The dev who wrote the simplex code of HiGHS as his PhD thesis went on to XPRESS and now COPT. The dev who writes the MIP code of HiGHS comes from ZIB.
As you can see, everyone is very inter-connected. Hence the throwaway :-).
1. At least they exist! They give a good rough idea of the relative performance of the various solvers. Also, I think it provides a sense of competition that is great for innovation. Some of the newcomers push press releases about having "won a prize" when they leapfrog the competition in one of the (sometimes weekly) benchmarks.
2. For the MIP benchmarks [2], a lot of thought has gone into doing them properly [3].
Don't be scared by the "End of a benchmarking era" banner. In 2018, Gurobi cherry-picked numbers that made them look good in some promotional material, while suggesting endorsement by Hans Mittelmann. There was a whole fuss about it and they had to publicly apologize [4]. This was a very dumb move since they were the fastest solver anyway, just not by as much as they claimed. CPLEX and Xpress took this as an excuse to take their toys and go home (they have since refused to participate in benchmarking). Gurobi was banned from the benchmarks for a while, but now they're back.
[1] http://plato.asu.edu/bench.html
[2] http://plato.asu.edu/ftp/milp.html
[3] https://link.springer.com/article/10.1007/s12532-020-00194-3
[4] https://web.archive.org/web/20181224083542/http://www.gurobi...
* Minion for IP for scheduling + edge crossing minimization.
* cvxpy (wrapping ECOS, OSQP, and SCS by default) for convex optimization for making nice geometry.
* Z3 and STP for SAT/SMT for program analysis.
All are FLOSS, which is my main criterion in many situations.
Beyond that, I like minion for its focus on only providing efficiently implementable primitives, cvxpy for the awesome lectures and documentation the folks behind it have produced, and z3 + stp for their use in tools I like, such as klee.
For example, I would perhaps not use the same solver for a best-effort problem where I can program strong heuristics, vs a problem that must be optimized and no natural heuristics exist. Sometimes problems have a natural type of formulation, and it is best to use a solver for that type of problem (linear programming, mixed integer programming, SAT, pseudo boolean, constraint programming, ...).
Choosing a solver also very much depends on the deployment situation; how should a model be used and what operational concerns are there.
For example, I first found minion when I was learning about scheduling problems, initially to help a friend who was a Chief Resident on a pediatrics ward and later, to automate the administrative hassle of scheduling incident responders. (I wrote about that a little bit many years ago here if you’re curious: https://mstone.info/posts/scheduling/)
Concurrently with that, while I was working with and supervising security researchers, I spent quite a bit of time implementing various program analyses to answer questions about what certain programs of interest might do, and to find inputs that would make them do interesting things. For this purpose, though, minion is much less immediately relevant than Z3 since the interfaces and research interests of the relevant communities are so different.
Finally, this year, I am focusing on problems that have a more conventionally geometric flavor as opposed to the more discrete search spaces mentioned above. Here, convex optimization is a starting point that cvxpy makes incredibly accessible, especially in combination with Stephen Boyd’s Stanford SEE EE364A lectures (or similar): https://see.stanford.edu/Course/EE364A
It was much easier to build a score calculator for min/max based on user-tweaked parameters, then jiggle the data, re-calculate the score, and keep doing it until there was significant improvement (again, standard SA). I would absolutely love to convert the entire production plan logic with material availability, lead-times, customer demand, quality control windows etc. to something like nextmv.io, but looking at your docs, I have no idea where to begin.
Cost is a big factor too because 3 years ago I bought 4 old 24-core Xeons off eBay and they've been chugging non-stop simulating billion+ flops per hour with electricity being the only cost. I don't mind paying $50-100/day for cloud if the results are great and code is easy to manage. We have the same chicken-egg problem everyone in supply chain currently has - we don't have enough materials to make everything, don't know when we'll get everything, and so don't know the best order to buy/make everything in. I would love to write a solver for this using our dataset but I kind of don't want to re-invent the wheel.
As it stands, every solver I find is one layer of abstraction away from what I want to code in. I can explain the problem at length if you want, but it's honestly nothing unique - just the standard MRP/ERP planning with a ton of BOM items, PO delays, labor/machine capacity constraints etc.
Your existing tutorials explain how I can use your API/SDK to perform OR operations. That's great and a necessity. However, it's not sufficient for me because my questions are: How do I represent my production calendar in the JSON blob for your algo? How do I express a constraint of 100hrs/week/machine but also 168hrs/week for a room full of specific machines? In other words, while each machine can run 100hrs/week, if there are 4 of them in the same room, only one can run at a time, so the combined machines in a given room cannot exceed 168hrs/week. Maybe a tutorial or a higher-level library could help people like me convert business rules into JSON format for your APIs. Because even if I might be capable of using your API as-is, I unfortunately don't have the time to implement these things myself. Hope this makes sense and gives you some insight into at least one of your target use-cases.
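For what it's worth, the room/machine rule above boils down to two families of linear capacity constraints. A minimal pure-Python feasibility check (the data layout here is entirely made up by me, not anyone's actual API) might look like:

```python
from collections import defaultdict

# Hypothetical data layout: weekly scheduled hours per machine,
# plus a mapping from each machine to the room it sits in.
MACHINE_CAP = 100   # max hours/week for any single machine
ROOM_CAP = 168      # machines in a room run one at a time, so a room
                    # can never exceed the 168 hours in a week

def violations(hours, room_of, machine_cap=MACHINE_CAP, room_cap=ROOM_CAP):
    """Return a list of violated capacity rules (empty means feasible)."""
    problems = []
    by_room = defaultdict(float)
    for machine, h in hours.items():
        if h > machine_cap:
            problems.append(f"{machine}: {h}h > {machine_cap}h/week")
        by_room[room_of[machine]] += h
    for room, total in by_room.items():
        if total > room_cap:
            problems.append(f"{room}: {total}h > {room_cap}h/week combined")
    return problems

# Four machines sharing one room: each within its own 100h cap,
# but together they exceed the room's 168h.
hours = {"m1": 90, "m2": 90, "m3": 40, "m4": 20}
room_of = {m: "room_A" for m in hours}
print(violations(hours, room_of))
```

In MIP terms these are just the constraints h_m <= 100 for each machine m and sum of h_m over a room <= 168 for each room; the checker above is the same logic written as plain code.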
We haven't put a lot more examples into our SDK docs lately since we've been working more on our routing engine and cloud offering. Now we're getting back to more foundational questions like "what should our solver be?" and "how should model formulation work?"
Hop started off as a decision diagram solver, but even internally we strap a lot of different techniques onto it. My hope is to support those patterns better, which is really why I posed this question.
I'd be interested to know: what made the process of converting business logic into solver-speak painful?
The fact that every solver doc I found shows me two near-identical variables and 4 identical constraints. I can solve that with an Excel add-in. My problem is mind-numbingly complex, like everyone else's doing supply-chain optimization. So I need examples for each type of constraint I have, but instead the tools expect me to figure everything out from very generic examples. Dates are a big deal in what I do. I have no idea how to convert "each day the customer order is delayed is bad, but I also can't make things 3 months before they want it" into solver-speak. Cleaning a machine takes time, so it is often better to make similar things in a row (campaign mode). No idea how to convert that to solver-speak either.
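The delivery-window rule above usually translates into solver-speak as a linear lateness penalty plus a hard earliness window. A pure-Python sketch of that scoring term (the weights and function names are hypothetical, just for illustration):

```python
EARLY_LIMIT_DAYS = 90         # "can't make it 3 months before they want it"
LATE_COST_PER_DAY = 100.0     # hypothetical weight: each day late hurts

def order_penalty(finish_day, due_day):
    """Score contribution of one customer order.

    Lateness is penalized linearly per day; finishing more than
    EARLY_LIMIT_DAYS early is a hard constraint violation, signalled
    with float('inf') so any search rejects that schedule outright.
    """
    if due_day - finish_day > EARLY_LIMIT_DAYS:
        return float("inf")           # too early: outside the hard window
    if finish_day > due_day:
        return (finish_day - due_day) * LATE_COST_PER_DAY
    return 0.0                        # on time, or acceptably early

print(order_penalty(105, 100))  # 5 days late -> 500.0
print(order_penalty(5, 100))    # 95 days early -> inf (window violated)
```

Campaign mode is the same trick in a different slot: add a changeover term that charges cleaning time whenever two consecutive jobs belong to different product families, which rewards making similar things in a row.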
The good thing is that once I understand how to convert my data into solver-speak, I can keep it updated on the fly since business logic doesn't change daily, just the data-set.
Feel free to contact me directly and/or we can Zoom etc. if you wish to discuss further. Wish you the best of luck either way!
https://github.com/jodavaho/highfleet-ship-opt
It really feels like a tools or language problem. Heck, we used to have to manually work out derivatives for continuous optimization problems, but nowadays programming languages with performant built-in autodiff often make this trivial. Removing the manual derivation hassle let loose a flood of cool ideas and applications, even though there was no technical hurdle preventing them in the first place.
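The autodiff point is easy to make concrete: forward-mode differentiation is just arithmetic on dual numbers, which fits in a few lines of plain Python (a toy illustration, not a library recommendation):

```python
class Dual:
    """Number a + b*eps with eps**2 == 0; the b part carries the derivative."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.der + other.der)
    __radd__ = __add__
    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule falls out of (a + b*eps)(c + d*eps) with eps^2 = 0
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)
    __rmul__ = __mul__

def derivative(f, x):
    """d f / d x at x, with no manual derivation in sight."""
    return f(Dual(x, 1.0)).der

f = lambda x: x * x * x + 3 * x      # f'(x) = 3x^2 + 3
print(derivative(f, 2.0))            # -> 15.0
```

Reverse-mode (what JAX and PyTorch implement) is what makes this scale to millions of variables, but the usability point is the same: the derivative machinery disappears into the language.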
Alternate problem specifications is a well-explored area (what is Prolog if not a way of describing problems for a constraint satisfier?), but I wonder how many other neat things are dammed up behind usability problems.
I think there are much better algorithms in metaheuristic search than just simulated annealing.
I'm doing almost exactly this right now on a client project (I consult in supply chain optimization).
I have a niggling feeling somewhere that JSON is the wrong language for this. Something like cue lang may be the right thing. (https://www.sobyte.net/post/2022-04/cue/)
The nice thing about scoring with SA is that I don't need to think like an operations research student. I just think in code with if/then/else, and the number of dimensions doesn't matter. The code is working fine for now, but a business requirement is making the problem MUCH more complex, with a whole new set of constraints factored in. SA will straight up not work with it because the scoring itself can now take tens of seconds. So I need to come up with a new 'proper' way, hence my renewed interest in solvers.
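For context, the "think in code" style described above is really just this loop, with the scoring function free to be arbitrary if/then/else logic (the score here is a made-up toy, not anyone's production model):

```python
import math
import random

def anneal(state, score, neighbor, steps=10_000, t0=1.0, cooling=0.999):
    """Plain simulated annealing: jiggle the state, re-score, keep it if
    better (or occasionally if worse, with probability shrinking as T cools)."""
    rng = random.Random(0)                 # seeded for reproducibility
    best, best_s = state, score(state)
    cur, cur_s, t = state, best_s, t0
    for _ in range(steps):
        cand = neighbor(cur, rng)
        cand_s = score(cand)
        # accept improvements always; accept regressions with Metropolis prob.
        if cand_s <= cur_s or rng.random() < math.exp((cur_s - cand_s) / t):
            cur, cur_s = cand, cand_s
            if cur_s < best_s:
                best, best_s = cur, cur_s
        t *= cooling
    return best, best_s

# Toy: minimize (x - 7)^2, scored "in code" with no algebra required.
sol, s = anneal(
    0.0,
    score=lambda x: (x - 7) ** 2,
    neighbor=lambda x, rng: x + rng.uniform(-1, 1),
)
print(round(sol, 2))  # lands close to 7
```

The catch the parent hits is visible in the loop: `score` is called once per step, so a tens-of-seconds scoring function turns 10,000 steps into days.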
Most open-source solvers don't handle parallelization well, and they lack the latest research on techniques like branch-and-cut and heuristics that can speed things up significantly.
In my experience, Gurobi is still the leader for LP and MIP solving. But it's really expensive, and the licensing terms seem anachronistic.
SimpleRose.com was a startup working on a new solver, too - I'm curious if anybody has tried it yet?
To OP's question, besides JuMP, the other use case I've had for optimization is for optimizing the drawing order of pen plots, for which I used or-tools. A write-up on it is here: https://nb.paulbutler.org/optimizing-plots-with-tsp-solver/
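For anyone curious what "optimizing the drawing order" means concretely: minimizing pen-up travel between path endpoints is a TSP, and even a greedy nearest-neighbor ordering recovers a big chunk of the win before you reach for or-tools. A self-contained sketch (my own toy, not the write-up's code):

```python
import math

def travel(order, pts):
    """Total pen-up distance when visiting pts in the given order."""
    return sum(math.dist(pts[a], pts[b]) for a, b in zip(order, order[1:]))

def nearest_neighbor(pts):
    """Greedy TSP heuristic: always hop to the closest unvisited point."""
    todo = set(range(1, len(pts)))
    order = [0]
    while todo:
        last = order[-1]
        nxt = min(todo, key=lambda i: math.dist(pts[last], pts[i]))
        order.append(nxt)
        todo.remove(nxt)
    return order

# Six path endpoints on two vertical lines: naive order zig-zags
# across the page, the greedy order sweeps down one side then the other.
pts = [(0, 0), (10, 0), (0, 1), (10, 1), (0, 2), (10, 2)]
print(travel(list(range(len(pts))), pts))   # naive input order
print(travel(nearest_neighbor(pts), pts))   # greedy order is much shorter
```

Tools like or-tools then improve on this kind of greedy tour with local search (2-opt and friends), which is where the remaining gains come from.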
> link to twitter
visible_confusion.png.jpg.exe
I remember liking JuMP, but Julia itself didn't feel ready yet. Some of the packages had weird behaviors. For example, Gadfly took several minutes to render some of my charts. IIRC when I looked at the source, it was solving a MIP to compute the best bounding box for display purposes.
I should probably check it out again.
Also: agreed regarding Gurobi licensing.
If I'm solving an optimization problem for personal curiosity, a blog post, etc., JuMP is the tool I reach for. :)
* CLP/CBC: open source makes deployment and devops easy, which is great. Linear models are nice in that you "know what you're getting." Performance is at times a pain point.
* Gurobi: super fast, but the licensing was just impossible. Partly that was due to high cost, but ultimately we could have borne the cost; the inability to do something like have autoscaling containers using Gurobi was ultimately the dealbreaker for us. As Zoba grew, we had to turn to alternatives.
* NLOpt: absolutely a blessing, but the variety of algorithms and tuning parameters is really opaque, even for a team with some OR experience/background.
* OR-tools: powerful but the documentation is remarkably terrible, and the vehicle routing problem solver doesn't natively support all the constraints we'd like, so we have to do some hacks.
Overall my feeling for all these tools is roughly gratitude: solvers are complex, and rolling our own would be absolutely impractical at our size. But also there's some pain in between "I've formulated my problem as a nice mathematical model" and "I can reliably get optimal results on time."
NextMV was practically the opposite (at the time; I'm sure it has improved now, especially since they used to be far more insistent on decision diagrams): rather bad/terrible performance, but excellent in terms of licensing and deploying the code, and they had great support too. The modern deployment made sense given they were a new/modern company. One silver lining to the terrible performance (and why I used to stick up for them at the time) was that you could get somewhat acceptable results fast: if you stopped after one second, CBC might be absolutely nowhere, but NextMV's solver would at least give you something. This meant you could do things that made use of extremely fast results, like trying a configuration and checking the (approximate) solution, then trying a bunch more, all very quickly.
In the end I mostly settled on OR-Tools.
---
I'm sorry to hear that you find our licensing frustrating. It is true that in the past we focused a lot on local deployments, holding on to the notion of a physical machine. However, we have added a few licensing options that you might find interesting, specifically the Web License Service (https://www.gurobi.com/web-license-service/), where you can get a short-lived JSON Web Token, renewed automatically, inside of a Docker container. You can find our Docker Hub images here (https://hub.docker.com/orgs/gurobi/repositories).
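If memory serves, the client side of WLS is just a small license file with three fields (the values below are placeholders; check the WLS docs for the exact setup):

```
# gurobi.lic pointing a container at the Web License Service
WLSACCESSID=<your-access-id>
WLSSECRET=<your-secret>
LICENSEID=000000
```

The container presents these to the license server, which hands back the short-lived token mentioned above; no machine-locked license is involved.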
There are also other ways like the Instant Cloud (https://www.gurobi.com/products/gurobi-instant-cloud/) which give you a very scalable SaaS approach.
Feel free to reach out to me (<lastname>@gurobi.com) to discuss more!
Gurobi, AFAIK, is considered the leading edge of the field, and other tools, especially open-source ones, do not perform as well.
I cannot speak specifically for your firm and use cases, but in my experience, if the use cases are structured and specific enough, rolling out an in-house optimizer is not such a bad idea.
---
Sorry to hear you found our licensing problematic! You might find it interesting that we have now developed a Web License Service (https://www.gurobi.com/web-license-service/), where you can retrieve a short-lived JSON Web Token inside of your container to run Gurobi; our Docker Hub images are here: https://hub.docker.com/orgs/gurobi/repositories.
From what you are saying ("autoscaling containers"), this may be a good fit for you. What do you think?
I needed shadow prices defined using the master problem dual solution. In my problem instances I would very often run into scenarios where the primal (and hence also dual) master LP problem had a unique objective value but the dual solutions at which that maximum was attained were non-unique. I learned that it only makes sense to talk about shadow prices in some allowable range for each dual decision variable, but LP solvers generally don't give you an API to extract much information about this from the current internal state of the simplex algorithm [0]. I read a bunch of LP solver documentation and the best one I found discussing this kind of stuff was the section in MOSEK's manual about sensitivity analysis [1]. This was for a hobby project, not business, so I didn't try out MOSEK, even though it looks very modestly priced compared to other commercial packages.
What I did find, however, was that some time during the last few years, scipy.optimize grew an LP solver interface, and even better, Matt Haberland contributed a python implementation of an interior-point LP solver straight into scipy [2][3]. I found that Haberland & co's open source interior point LP solver produced dual solutions that tended to be more stable and more useful for shadow prices in my problems than a bunch of the other open source LP solvers I tried (including the various other LP backends exposed by scipy.optimize.linprog).
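For concreteness, here is a minimal example of the degeneracy in question (mine, not from the references): duplicate a binding constraint and the dual optimum becomes a whole segment rather than a point.

```latex
\text{Primal: } \max\ x
  \quad \text{s.t.} \quad x \le 1,\ \ x \le 1,\ \ x \ge 0
\qquad
\text{Dual: } \min\ y_1 + y_2
  \quad \text{s.t.} \quad y_1 + y_2 \ge 1,\ \ y_1, y_2 \ge 0
```

Every (y_1, y_2) = (t, 1 - t) with t in [0, 1] is dual-optimal, so the "shadow price" of either constraint alone is ill-defined: a simplex solver reports whichever vertex its pivot path happens to end at, while interior-point methods tend to converge to a central point of the optimal face, which is one intuition for why their duals can behave more stably.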
[0] a paper I found very helpful in understanding what was going on was Jansen, de Jong, Roos, Terlaky (1997) "Sensitivity analysis in linear programming: just be careful!". In 1997 they got 5 commercial LP solvers to perform a sensitivity analysis on an illustrative toy problem, and although the solvers all agreed on the optimal objective value, none of them agreed with each other on the sensitivity analysis.
[1] https://docs.mosek.com/9.2/pythonapi/sensitivity-shared.html
[2] https://github.com/scipy/scipy/pull/7123
[3] https://docs.scipy.org/doc/scipy/reference/generated/scipy.o...
Lots of money to be made in it too if you guys are interested, but it's super niche and hard to get into. There is huge resistance from train controllers and other workers. I actually understand it because of the job loss involved, but it was super cool being in a NASA-like control center surrounded by panels and monitors and seeing the trains moving based on code I and others wrote!
It was a just-in-time local optimization with lots of heuristics and business rules embedded into it. Basically impossible to reuse between companies, or sometimes even between railroads; the controller would then solve all the more complex crossings that involved either some lose-lose choice or a pre-defined business decision.
The train controllers are amazing at their jobs; it's super stressful, and a single mistake can kill people or make the whole thing stop for weeks. With the software running, it was a lot less risky: one dude could control, with minor interventions, an area that needed 7 or more people without it.
A comment on a recent thread about a train IT failure went on a bit of an interesting tangent about ahead-of-time network scheduling in (IIUC) the Netherlands' TURNI system - https://news.ycombinator.com/item?id=30902585
(The whole thread was a bit of a wide-spectrum ramble, as one might expect for a downtime event.)
So you're saying you were actually doing JIT routing as opposed to AOT? The linked system apparently precomputed the trip<->driver/conductor schedules overnight. I wonder if they're still using that approach today. It does feel like a JIT approach is much more amenable to handling the unpredictable non-spherical real world (eg electricity issues, track breakage, crashes at crossings, train malfunctions that block tracks (right on junctions >:D), etc).
This sorta thing is definitely beyond my own mental level :) but for reference, how would someone interested get their foot in the door in this area?
All the rest was on us: failures, breakages, crashes (we never had one!!!). Blocks were super interesting, btw!
So I got on it by chance. I just did interviews as a sport when I was in my early 30s, and at some point I came across this position: pay was awesome, lots of travel involved, etc. So I joined and learned it all there; it was a small mom-and-genius software house that specialized in all types of JIT planning for railroads.
JIT was actually "THE" solution to it, because the controllers could just try it during the day. Let's say they knew 18:00 was the worst time and there was a huge train coming in right then with priority: he would sometimes just throw shit at the system to see how to save the most time. Of course he knew how to do it by heart, but was it the best way? We could help him answer that super fast.
P.S.: I am not Dutch but I lived there for a while. I remember thinking about how much of a bad day it would have been if it was me; I knew exactly what the NS people were feeling :D