thxg commented on Scaling up linear programming with PDLP   research.google/blog/scal... · Posted by u/bookofjoe
whatever1 · a year ago
FYI, the table does not include the commercial top dogs (IBM CPLEX, Gurobi, FICO Xpress), since they all withdrew due to politics
thxg · a year ago
Indeed, those are the "big four" solver businesses in the West, and probably also the most reliably good solvers. But by the time Gurobi withdrew from the benchmarks (a few weeks ago), COPT was handily beating them on the LP benchmarks and closing in on them on the MIP benchmarks. Solver devs like to accuse each other of gaming benchmarks, but I'm not convinced anyone is outright cheating right now. Plus, all the solver companies have poached quite a bit from each other since CPLEX lost all its devs, which probably levels the playing field. So overall, I think the Mittelmann benchmarks still provide a good rough estimate of where the SOTA is.
thxg commented on Scaling up linear programming with PDLP   research.google/blog/scal... · Posted by u/bookofjoe
dsfcxv · a year ago
Hmm... I am reading [1] right now. Looking at Table 7 and Table 11 in [1], they report comparison results with Gurobi presolve enabled and a 1e-8 error tolerance. Am I missing anything?

Their performance isn't quite as good as Gurobi's barrier method, but it's still within a reasonable factor, which is impressive.

thxg · a year ago
Regarding presolve: When they test their solver "with presolve", they use Gurobi's presolve as a preprocessing step, then run their solver on the output. To be clear, this is perfectly fair, but from the perspective of "can I switch over from the solver I'm currently using", this is a big caveat.

They indeed report being 5x slower than Gurobi at 1e-8 precision on the Mittelmann instances, which is great. Then again, Mittelmann himself reports them as 15x off COPT, even when allowed to run at a 1e-4 tolerance. This is perfectly explainable (COPT is great at benchmarks; there is the presolve issue above; the Mittelmann instance set is a moving target), but I would regard the latter number as the more useful one from a practitioner's perspective.

This is not to diminish PDLP's usefulness. If you have a huge instance, it may be your only option!

thxg commented on Scaling up linear programming with PDLP   research.google/blog/scal... · Posted by u/bookofjoe
dsfcxv · a year ago
Their numerical results with GPUs, compared to Gurobi, are quite impressive [1]. In my opinion (unless I'm missing something), the key benefits of their algorithm are that it can leverage GPUs and that it does not need to store a factorization in memory. However, if the goal is to solve a smaller problem that fits comfortably in memory on a CPU, there may be no need for this approach.

[1] https://arxiv.org/pdf/2311.12180

thxg · a year ago
I agree that their results are impressive. Just to be clear, however:

1. They compare their solver at a 1e-4 error tolerance to Gurobi at 1e-6. This may seem like a detail, but given how typical LPs are formulated, it is a big difference. They have to do things this way because their solver simply cannot reach better accuracy (meanwhile, you can ask Gurobi for 1e-9, and it will happily comply in most cases).

2. They disable presolve, which is 100% reasonable in a scientific paper (it makes things more reproducible and gives a better idea of what the solver actually does). If you use their results to decide which solver to adopt, though, they will be misleading, because presolve is a huge part of what makes SOTA solvers fast. (Both knobs appear in the sketch below.)
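
For reference, here is roughly what those two settings look like through Gurobi's C API (a minimal sketch, assuming a hypothetical model file "model.mps"; parameter names are from Gurobi's documentation, error handling omitted):

  #include "gurobi_c.h"

  int main(void) {
    GRBenv   *env   = NULL;
    GRBmodel *model = NULL;

    GRBloadenv(&env, NULL);                 /* no log file */
    GRBreadmodel(env, "model.mps", &model);

    /* Tighten the barrier tolerance well past their 1e-4
       (Gurobi's default BarConvTol is 1e-8). */
    GRBsetdblparam(GRBgetenv(model), "BarConvTol", 1e-9);

    /* Disable presolve, as in the paper's setup (0 = off). */
    GRBsetintparam(GRBgetenv(model), "Presolve", 0);

    GRBoptimize(model);

    GRBfreemodel(model);
    GRBfreeenv(env);
    return 0;
  }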

thxg commented on Scaling up linear programming with PDLP   research.google/blog/scal... · Posted by u/bookofjoe
DominikPeters · a year ago
In this post, I don’t see any comparisons of their solver to other LP solvers on benchmarks, so it’s difficult to know how useful this is.
thxg · a year ago
I think it's partially excusable. Most LP solvers target large-scale instances, but instances that still fit in RAM: think single-digit millions of variables and constraints, maybe a billion nonzeros at most. PDLP is not designed for this type of instance and gets trounced by the best solvers at this game [1]: more than 15x slower (in shifted geometric mean) while being 100x less accurate (1e-4 tolerances where other solvers work with 1e-6).

PDLP is targeted at instances whose factorizations won't fit in memory. I think the idea for now is to provide acceptable solutions to gigantic instances on which other solvers simply crash.
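
To see why no factorization is needed: PDLP is built on the primal-dual hybrid gradient (PDHG) method, whose inner loop touches the constraint matrix only through matrix-vector products. A bare-bones, dense, unpreconditioned sketch of one iteration for min c^T x s.t. Ax = b, x >= 0 (for intuition only; the real solver adds diagonal scaling, restarts, adaptive step sizes, and sparse/GPU matvecs):

  #include <math.h>

  /* One PDHG iteration for  min c^T x  s.t.  Ax = b, x >= 0.
     A is m-by-n, row-major; requires tau * sigma * ||A||^2 <= 1.
     The point: only matvecs with A and A^T, no factorization. */
  void pdhg_step(int m, int n, const double *A, const double *b,
                 const double *c, double *x, double *y,
                 double tau, double sigma) {
    double xnew[n];  /* C99 VLA, fine for a sketch */

    /* x_{k+1} = proj_{x >= 0}(x_k - tau * (c - A^T y_k)) */
    for (int j = 0; j < n; j++) {
      double aty = 0.0;
      for (int i = 0; i < m; i++)
        aty += A[i * n + j] * y[i];
      xnew[j] = fmax(0.0, x[j] - tau * (c[j] - aty));
    }

    /* y_{k+1} = y_k + sigma * (b - A (2 x_{k+1} - x_k)) */
    for (int i = 0; i < m; i++) {
      double ax = 0.0;
      for (int j = 0; j < n; j++)
        ax += A[i * n + j] * (2.0 * xnew[j] - x[j]);
      y[i] += sigma * (b[i] - ax);
    }

    for (int j = 0; j < n; j++)
      x[j] = xnew[j];
  }

Everything is a matvec or a vector update, which is why it parallelizes so well on GPUs and never needs more memory than the matrix itself.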

[1] https://plato.asu.edu/ftp/lpfeas.html

thxg commented on Truthiness in C   dxuuu.xyz/casts.html... · Posted by u/todsacerdoti
thxg · 2 years ago
The article is very recent (September 2023 [1]) and mentions that "rax is used to return integer values" in the SysV ABI (hence, implicitly, the 64-bit SysV ABI). This is also confirmed by this excerpt:

> Note that the top 56 bits in rax are not zeroed – they contain junk. This is fine b/c the compiler will only make callers check the lowest bit of a register for boolean operations. This is why changing the compiler’s “understanding” (ie the cast) is necessary.

... and yet the function ABI is clearly i386 (arguments are fetched from the stack), and indeed everything is compiled with -m32 [2] (i.e., the 32-bit SysV ABI). This is a strange contradiction in 2023: on the Intel/AMD side, x86_64 has been prevalent (and the default) for more than 15 years.

It does not invalidate the article's point, but it is slightly confusing.
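
To make the failure mode concrete, here is a hypothetical sketch (not the article's example) of a bool whose object representation is neither 0 nor 1. This is undefined behavior, so the exact output depends on the compiler and optimization level:

  #include <stdbool.h>
  #include <stdio.h>
  #include <string.h>

  int main(void) {
    bool b;
    unsigned char junk = 2;  /* not a valid object representation of a bool */
    memcpy(&b, &junk, 1);    /* smuggle the junk bits in */

    if (b)                   /* often tests the whole byte: branch taken */
      puts("b is truthy");
    printf("b == true? %d\n", b == true);  /* may print 0, since 2 != 1 */
    return 0;
  }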

[1] https://dxuuu.xyz/

[2] https://godbolt.org/z/ff8r44nKn

thxg commented on Efficient Jagged Arrays   zeux.io/2023/06/30/effici... · Posted by u/ingve
thxg · 2 years ago
This representation of jagged arrays is literally the "Compressed Sparse Row" (CSR) format for sparse matrices [1]:

- a single array of floats with all the nonzero matrix entries, ordered row-by-row:

  double values[nnz];
- a single array of ints with the corresponding column index:

  int columns[nnz];
- a single array of ints, of length (m+1), with the start of each row in the two other arrays (the (m+1)th element stores nnz, which will be useful later):

  int start[m + 1];
So to print a whole matrix, you do this:

  for (int i = 0; i < m; i++) {
    int index = start[i];                // where row i starts in columns/values
    int count = start[i + 1] - start[i]; // number of nonzeros in row i

    for (int t = 0; t < count; t++) {
      int j = columns[index + t];
      double v = values[index + t];
      printf("element (%d, %d) has value %g\n", i, j, v);
    }
  }
To a modern programmer, this may look very old-fashioned (Fortran-ish), but I have tried many alternatives (like grouping columns and values together in a struct, or using pointers), and in my numerical codes nothing even comes close in terms of performance. It must be said that CSR and CSC (its transpose) are very compact in memory, which is often the bottleneck. Also, since so many sparse numerical codes in benchmarks have used this representation for decades, it may well be that CPU designers make sure it works well.
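
For instance, the kernel that dominates those benchmarks, the sparse matrix-vector product y = A*x, is just as simple with the same arrays:

  for (int i = 0; i < m; i++) {
    double sum = 0.0;
    for (int t = start[i]; t < start[i + 1]; t++)
      sum += values[t] * x[columns[t]];
    y[i] = sum;
  }

Both values and columns are swept strictly left to right, so the prefetcher sees two perfectly sequential streams; only the access to x is irregular.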

[1] https://en.wikipedia.org/wiki/Sparse_matrix#Compressed_spars...

thxg commented on Ligatures in programming fonts: hell no (2019)   practicaltypography.com/l... · Posted by u/johnchristopher
thxg · 3 years ago
Ligatures are an interesting contrast to Julia programmers' habit of using actual (non-Latin) Unicode characters in their code. The use of Greek letters and symbols seems to be actively encouraged in the Julia documentation. I believe they advise using editors that let you type special characters via their LaTeX names (the source files are then saved to disk with the Unicode characters themselves).

I am not very familiar with the latest trends; could anyone chime in here? Do Julia coders use fonts with ligatures? That could indeed be one case where confusion is possible. Or maybe not?

thxg commented on MemComputing vs. Quantum Computing   memcpu.com/blog/memcomput... · Posted by u/velmu
thxg · 3 years ago
MemComputing mainly present themselves as optimizers: Their machine can solve real optimization problems, in practice, today. This is very good because, through that lens, we can skip the BS and go straight to the facts. The evaluation criteria for any optimization approach/algorithm are:

- the quality of the solutions found (in the best case: optimal solutions)

- the presence or absence of optimality guarantees (for example, some algorithms provide optimal solutions very often, but cannot guarantee it 100%)

- the time (or computational cost) needed to find such solutions

Furthermore, the state of the art (SOTA) is well known for most types of optimization problems. In this article, they present a list of real and practical applications, so let us go through them one by one:

1.1 Traffic flow optimization

Simple flow problems can be solved in polynomial time (and quickly in practice), so there is no need for anything fancy. Once you introduce additional constraints or discrete variables, the SOTA is mixed-integer programming for offline problems; for online problems, it is more complicated. In both cases, I am not aware of any application in which MemComputing reaches the SOTA.

1.2 Vehicle routing & scheduling

Here, depending on your computing-time constraints and your needs for solution quality and/or optimality guarantees, the SOTA can be constraint programming (Gecode, OptaPlanner), local-search heuristics (LocalSolver), or mixed-integer programming (CPLEX/Xpress/Gurobi) with column generation. Again, MemComputing is nowhere to be seen.

1.3 Supply chain optimization

This is a very broad field, but in general mixed-integer programming is king here.

2.1 Protein folding

Here there is quite an objective measure: the biennial CASP competition. This is where AlphaFold made its splash in 2020. MemComputing has never participated.

2.2 Genome sequencing

I am not knowledgeable enough to comment here.

2.3 Radiotherapy treatment

I am not very knowledgeable here either, but last I looked mixed-integer programming approaches were favored.

3.1 Portfolio risk optimization

Various types of branch-and-bound solvers. Mixed-integer linear/quadratic/convex programming. No MemComputing.

3.2 Detecting market instabilities

No idea.

3.3 Optimizing trading trajectories

No idea.

4.1 Training neural networks

Many people here know how this is done. Stochastic gradient descent on GPUs or TPUs. No MemComputing involved. How can they even claim to be active in this field?

4.2 Detecting statistical anomalies

Vague.

4.3 Classifying unstructured datasets

No idea.

The problem is that if you invent a new optimization algorithm, it is very easy to find one instance of one problem on which your algorithm works well. They did literally that [1]: they took a library of mixed-integer programming benchmarks containing 270 problem instances and published a whitepaper showing that they beat a SOTA solver on one of them. A single instance out of 270!

The really hard part is the opposite: given a class of problems, find an algorithm that beats the SOTA. MemComputing has never done that. Combined with their propensity for grand claims backed by misleading evidence, this has earned them a lot of ill will in the academic community over the years. My suspicion is that, while on the surface this post contrasts their approach with quantum computing, what they are really trying to do is ride the quantum computing hype wave.

[1] https://arxiv.org/abs/2003.10644

thxg commented on AI tool designs mRNA vaccines that are more potent and stable   nature.com/articles/d4158... · Posted by u/voisin
thxg · 3 years ago
The scientific article seems to be open access [1].

Before people draw links to recent large language model breakthroughs: Although they do use techniques from computational linguistics, there are no neural networks involved. This is more like old-school AI.

They essentially have a giant optimization problem, which they (approximately) model as a lattice parsing problem with a stochastic context-free grammar. They can solve that model to optimality in O(n^3) time, which is too slow for some applications. So they propose an O(n) heuristic (hence no optimality guarantees, but the model was approximate to begin with anyway, and the heuristic is a lot faster), which explains the name of their code: "LinearDesign".
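
For a flavor of what an O(n^3) dynamic program over a sequence looks like, here is a classic Nussinov-style base-pair maximization: a much-simplified cousin of lattice parsing, and emphatically not LinearDesign's actual algorithm:

  #include <string.h>

  #define N 128  /* max sequence length for this sketch */

  static int pairs(char a, char b) {
    return (a == 'A' && b == 'U') || (a == 'U' && b == 'A') ||
           (a == 'G' && b == 'C') || (a == 'C' && b == 'G');
  }

  /* dp[i][j] = max number of complementary pairs in s[i..j] */
  int nussinov(const char *s) {
    int n = (int)strlen(s);
    static int dp[N][N];
    if (n > N) return -1;
    memset(dp, 0, sizeof dp);

    for (int len = 2; len <= n; len++)        /* O(n) interval lengths */
      for (int i = 0; i + len - 1 < n; i++) { /* O(n) start positions  */
        int j = i + len - 1;
        int best = dp[i + 1][j];              /* s[i] stays unpaired   */
        for (int k = i + 1; k <= j; k++)      /* O(n) split points     */
          if (pairs(s[i], s[k])) {
            int v = 1 + dp[i + 1][k - 1] + (k < j ? dp[k + 1][j] : 0);
            if (v > best) best = v;
          }
        dp[i][j] = best;
      }
    return n ? dp[0][n - 1] : 0;
  }

Three nested O(n) loops give the O(n^3) total; heuristics like theirs get to O(n) by keeping only a constant-size beam of candidates per position instead of all intervals.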

[1] https://www.nature.com/articles/s41586-023-06127-z
