Would a 1000x compute cluster provide a meaningful performance boost (in the generated binaries)?
One thing is that you can think of static analysis as building up facts about the program. You can, for example, start by assuming nothing and then add facts, iteratively propagating them from one line of code to the next. But you can also go the other way: start by assuming the universe of all facts and iteratively remove facts from it.
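To make the "start from nothing and add facts" direction concrete, here is a minimal sketch of a forward reaching-definitions pass over a hand-built CFG (the node names, fact labels, and CFG are all invented for illustration):

```python
# Minimal worklist dataflow sketch: reaching definitions over a toy CFG.
# Each node: (definitions generated, definitions killed, successors).
cfg = {
    "entry": {"gen": {"x@entry"}, "kill": {"x@loop"},  "succ": ["loop"]},
    "loop":  {"gen": {"x@loop"},  "kill": {"x@entry"}, "succ": ["loop", "exit"]},
    "exit":  {"gen": set(),       "kill": set(),       "succ": []},
}

# Start by assuming nothing (empty fact sets) and grow to a fixpoint.
facts_in  = {n: set() for n in cfg}
facts_out = {n: set() for n in cfg}

worklist = list(cfg)
while worklist:
    node = worklist.pop()
    info = cfg[node]
    # Standard transfer function: out = gen ∪ (in − kill)
    new_out = info["gen"] | (facts_in[node] - info["kill"])
    if new_out != facts_out[node]:
        facts_out[node] = new_out
        # Propagate the new facts to successors and revisit them.
        for succ in info["succ"]:
            facts_in[succ] |= new_out
            worklist.append(succ)

for node in cfg:
    print(node, "reaches-in:", sorted(facts_in[node]))
```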
Some classes of program analysis are safe to stop early. For example, a static analysis that tries to find the targets of virtual calls (known as devirtualization) can simply stop after a timeout: not finding the target just means a missed optimization.
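As a toy sketch of why stopping early is safe here (the hierarchy, the names, and the budget mechanism are all made up): if the walk times out, we keep the correct but slower virtual call.

```python
# Toy class-hierarchy walk for devirtualization (all names invented).
# Timing out is harmless: we just keep the (correct) virtual call.
import time

def resolve_target(hierarchy, receiver, method, deadline):
    targets, stack = set(), [receiver]
    while stack:
        if time.monotonic() > deadline:
            return None                  # timed out: missed optimization only
        cls = stack.pop()
        if method in hierarchy[cls]["methods"]:
            targets.add(cls)
        stack.extend(hierarchy[cls]["subclasses"])
    # Devirtualize only if exactly one implementation is reachable.
    return targets.pop() if len(targets) == 1 else None

hierarchy = {
    "Shape":  {"methods": set(),     "subclasses": ["Circle"]},
    "Circle": {"methods": {"area"},  "subclasses": []},
}

target = resolve_target(hierarchy, "Shape", "area", time.monotonic() + 0.01)
print(f"direct call to {target}.area" if target else "keep the virtual call")
```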
There are other classes of program analysis whose results are not safe to use until the algorithm finishes. For example, if you have to prove that two variables do not alias each other, you cannot stop until you have computed all possible points-to sets and verified that the points-to sets of those two variables do not overlap.
So, given the above restriction, the first class (early termination) is perhaps more desirable, and throwing more compute at it would yield a better approximation. For the second one, it wouldn't.
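Here is what that non-aliasing check looks like once the points-to sets are complete (the sets below are invented): the check itself is trivial, but it is only sound on the *full* result, since any missing element could be exactly the shared target that makes the two variables alias.

```python
# Non-aliasing via points-to sets (hypothetical analysis results).
points_to = {
    "p": {"obj1", "obj2"},   # p may point to obj1 or obj2
    "q": {"obj3"},           # q may point only to obj3
}

def may_alias(a, b):
    # Only sound once points_to is fully computed: stopping early could
    # hide the one shared target that makes a and b alias.
    return bool(points_to[a] & points_to[b])

print(may_alias("p", "q"))   # False: p and q provably do not alias
```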
Another thing to keep in mind is that most of these dataflow frameworks are not easily parallelized. The only paper I've read (though I haven't kept up with this avenue of research) that implemented a control-flow analysis on the GPU is the following:
* Prabhu, Tarun, et al. "EigenCFA: Accelerating flow analysis with GPUs." ACM SIGPLAN Notices 46.1 (2011): 511-522.
I'm sure people are working on it. (I should mention that some program analyses are written in Datalog, and Datalog can be parallelized, but I believe that is CPU-based parallelization rather than GPU-based.)
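To illustrate the Datalog style (a made-up minimal example, not any real engine): an Andersen-style points-to rule evaluated naively to a fixpoint. Each iteration re-applies the rules over the current facts, and the individual rule applications are independent of one another, which is what makes this style a natural candidate for parallelization.

```python
# Naive Datalog-style evaluation (illustrative; not a real engine).
# Rules for a tiny Andersen-style points-to analysis:
#   pts(V, O) :- new(V, O).                # v = new O
#   pts(V, O) :- assign(V, W), pts(W, O).  # v = w

new_facts    = {("p", "obj1"), ("q", "obj2")}
assign_facts = {("r", "p"), ("s", "r")}

pts = set(new_facts)
changed = True
while changed:
    # Each pass re-derives facts from the current set; the individual
    # rule applications are independent, hence easy to run in parallel.
    derived = {(v, o) for (v, w) in assign_facts
                      for (w2, o) in pts if w == w2}
    changed = not derived <= pts
    pts |= derived

print(sorted(pts))
# [('p', 'obj1'), ('q', 'obj2'), ('r', 'obj1'), ('s', 'obj1')]
```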
The third thing is that when you ask whether we are limited by algorithms or by compute, it is important to note that it is impossible to find all possible facts about a program *precisely* without running it. There is a deep relationship between static program analysis and the halting problem: we want to guarantee that our static analysis terminates, and some facts are simply unobtainable without running the program.

However, there are not just static program analyses but also dynamic program analyses, which analyze a program as it runs. An example is value profiling. Imagine you have a conditional that is false 99% of the time. With a virtual machine, you can add instrumentation to measure the probability distribution of that conditional, generate optimized code that assumes the condition is false, and only when the condition turns out to be true fall back to a less optimized version of the code, paying an additional penalty. Some virtual machines already do this for types and values (type profiling and value profiling).
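A toy sketch of the value-profiling idea (all names invented; a real VM would instrument and specialize JIT-compiled code, not Python source):

```python
# Toy value profiling (illustrative only; real VMs instrument the
# compiled code, not the source). Profile a branch, then specialize
# for the common case behind a guard.
from collections import Counter

profile = Counter()

def fast_path(x):  return x * 2        # the hot, common case
def slow_path(x):  return abs(x) * 2   # the rare case

def instrumented(x):
    cond = x < 0
    profile[cond] += 1                 # record each observed outcome
    return slow_path(x) if cond else fast_path(x)

def specialized(x):
    # Generated after profiling showed `x < 0` is almost always False.
    if x < 0:                          # guard for the rare case
        return instrumented(x)         # "deoptimize": take the penalty
    return x * 2                       # condition assumed away

for x in [5, 7, 9, -1]:
    instrumented(x)
print(profile)                          # Counter({False: 3, True: 1})
print(specialized(8), specialized(-2))  # 16 4
```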
One last thing: when you say a meaningful performance boost, it depends on your code. If your code can be folded away completely at compile time, then yes, we could just generate the solution at compile time and that's it. But if it can't, or if parts of it cannot be folded away / the facts cannot be used to optimize the code, then no matter how much you search, you cannot optimize it statically.
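A toy illustration of the difference (an invented mini-AST, not any real compiler's IR): the first expression folds away entirely at compile time, while the second leaves a residual computation that no amount of searching can remove.

```python
# Toy constant folder over a tiny expression AST (invented for
# illustration). ("add", lhs, rhs) folds to a number only if both sides do.

def fold(node):
    if isinstance(node, (int, float)):
        return node                        # already a constant
    if isinstance(node, str):
        return node                        # unknown until runtime (e.g. "input")
    op, lhs, rhs = node
    l, r = fold(lhs), fold(rhs)
    if op == "add" and isinstance(l, (int, float)) and isinstance(r, (int, float)):
        return l + r                       # fully foldable: compute now
    return (op, l, r)                      # residual: must wait for runtime

print(fold(("add", ("add", 1, 2), 3)))   # 6 — the whole program folds away
print(fold(("add", "input", 3)))         # ('add', 'input', 3) — cannot fold
```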
Compilers are awesome :)
As an addendum, it might be desirable in the future to have a repository of analyzed code. Compilers right now re-analyze code on every single compile and don't share their results across the web. It is a fantasy of mine to have a repository that maps code to equivalent representations, so that every time someone does a local compile, it explores a new area of the search space and adds the result to the repository. Essentially, each time you compile your code, it explores new potential optimizations and all of them get stored online.
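A very rough sketch of that fantasy (everything here, from the hashing scheme to the cost model, is hypothetical; a real system would also need machine-checkable proofs that the stored forms are equivalent):

```python
# Hypothetical shared optimization repository: code is content-addressed
# by hash, and every compile may contribute a better equivalent form.
import hashlib

repository = {}   # source hash -> (best known equivalent, estimated cost)

def key(source: str) -> str:
    return hashlib.sha256(source.encode()).hexdigest()

def lookup(source: str):
    return repository.get(key(source))

def contribute(source: str, optimized: str, cost: float):
    best = repository.get(key(source))
    if best is None or cost < best[1]:
        repository[key(source)] = (optimized, cost)  # new best form wins

contribute("x * 2", "x << 1", cost=1.0)   # one compile's discovery
print(lookup("x * 2"))                     # ('x << 1', 1.0)
```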
For example, clang's compile times have regressed something like 5-6x faster than the code it generates has improved.
As a developer, I can arrive at a better structure for my code much faster if I can try more things out in a day.
Now we want to add LLMs, of all things, into the mix? I'm not looking forward to code taking one or two orders of magnitude longer to compile.
* Nandi, Chandrakana, et al. "Rewrite rule inference using equality saturation." Proceedings of the ACM on Programming Languages 5.OOPSLA (2021): 1-28.
* Pal, Anjali, et al. "Equality Saturation Theory Exploration à la Carte." Proceedings of the ACM on Programming Languages 7.OOPSLA2 (2023): 1034-1062.
I will need to read more about both of these techniques, along with Synthesizing Abstract Transformers. Thanks for sharing! Really exciting stuff!
What papers would you recommend for learning to implement dataflow analysis? For example, foundational or tutorial papers.
* Kam, John B., and Jeffrey D. Ullman. "Monotone data flow analysis frameworks." Acta Informatica 7.3 (1977): 305-317.
* Reps, Thomas, Susan Horwitz, and Mooly Sagiv. "Precise interprocedural dataflow analysis via graph reachability." Proceedings of the 22nd ACM SIGPLAN-SIGACT symposium on Principles of programming languages. 1995.
* Sagiv, Mooly, Thomas Reps, and Susan Horwitz. "Precise interprocedural dataflow analysis with applications to constant propagation." TAPSOFT'95: Theory and Practice of Software Development. Springer Berlin Heidelberg, 1995.
* Reps, Thomas, et al. "Weighted pushdown systems and their application to interprocedural dataflow analysis." Science of Computer Programming 58.1-2 (2005): 206-263.
* Späth, Johannes, Karim Ali, and Eric Bodden. "Context-, flow-, and field-sensitive data-flow analysis using synchronized pushdown systems." Proceedings of the ACM on Programming Languages 3.POPL (2019): 1-29.
Other areas that may be interesting to look at:
* Points-to Analysis
* Abstract Interpretation
* On-demand dataflow analyses
Check out his other research. Some of it is highly accessible via YouTube videos. I recommend watching/reading:
* Stabilizer
* Mesh
* Scalene
Native integration with Datalog.
Many times, I find myself working on a program and realize that what I need is a database. But having a database, even sqlite3 or Berkeley DB, would be overkill. If I could just express my data and the relationships between them, I would be able to query what I need in an efficient way.
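A tiny sketch of what I mean (the facts and the rule are invented): relations live in the program itself, and a recursive Datalog-style rule answers queries without any database engine:

```python
# Illustrative sketch: in-program relations plus one Datalog-ish rule,
# no database engine involved. The facts are made up.
parent = {("alice", "bob"), ("bob", "carol")}   # parent(X, Y): X is parent of Y

def ancestors(person):
    # ancestor(X, Y) :- parent(X, Y).
    # ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
    found, frontier = set(), {person}
    while frontier:
        step = {x for (x, y) in parent if y in frontier}
        frontier = step - found
        found |= step
    return found

print(ancestors("carol"))   # {'bob', 'alice'}
```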
I perhaps have not had a long professional life working with compilers (5+ years), but to me the definition of "compiles to binary" is too restrictive. The main things I care about in my work are:
1. To be able to perform some sort of static analysis on the program
2. To be able to transform the program representation
To other commenters: in Python, we have two program representations: the human-readable string representation and the bytecode representation. Syntax errors are a kind of static analysis. To me, the maps between the Python string representation and the bytecode representation, and the classes of errors we can catch without running the program, are far more interesting than pigeonholing Python into the "compiled" or "interpreted" hole.
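Both representations are visible from the standard library; for example, `compile` maps the string form to a code object, `dis` displays its bytecode, and a syntax error is caught without executing anything:

```python
# Both representations via the standard library: `compile` maps the
# string form to a code object, `dis` shows its bytecode, and a
# SyntaxError is caught without ever executing the program.
import dis

source = "x = 1 + 2"
code = compile(source, "<example>", "exec")   # string -> bytecode
dis.dis(code)                                 # inspect the bytecode form

try:
    compile("x = = 1", "<example>", "exec")   # static check, no execution
except SyntaxError as err:
    print("caught without running:", err.msg)
```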
More philosophically, the motto is "write programs as morphisms directly". Rather than writing a term in some type theory to which you then (maybe) give a categorical semantics, why not just work directly in a category?
Long term, the goal is to have a compiler that is a stack of categories, with functors as compiler passes. The idea is that, in contrast to typical compilers, where you are "stuck" at a given abstraction level, this would let you view your code at various levels of abstraction. So, for example, you could write a program, then write an x86-specific optimization for one function, which you could then prove correct with respect to the more abstract program specification.
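To sketch the shape of "functors as passes" (entirely illustrative, not catgrad's actual design): each level is a category whose morphisms are programs, and a lowering pass maps morphisms while preserving composition:

```python
# Entirely illustrative (not catgrad's actual API): two IR levels as
# categories of composable programs, and a lowering pass as a functor
# that must preserve composition (and identities, i.e. empty programs).

def compose(f, g):
    return f + g                  # run f, then g (same shape at both levels)

LOWERING = {"double": ["shl 1"], "incr": ["add 1"]}   # hypothetical table

def lower(program):               # the functor's action on morphisms
    return [instr for op in program for instr in LOWERING[op]]

f, g = ["double"], ["incr"]
# Functor law: lowering a composite equals composing the lowerings.
assert lower(compose(f, g)) == compose(lower(f), lower(g))
print(lower(compose(f, g)))       # ['shl 1', 'add 1']
```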
Hello, as a compiler engineer I am interested in this area. Can you expand a little bit more? How would I be able to plug in my own language, for example?
> So for example, you could write a program, then write an x86-specific optimization for one function which you can then prove correct with respect to the more abstract program specification.
So, what you are saying is that catgrad allows me to write a program and then also plug in a compiler pass? I.e., the application author can also be the compiler developer?