ben_pfaff (u/ben_pfaff)

ben_pfaff commented on Rethinking Syntax: Binding by Adjacency github.com/manifold-syste... · Posted by u/owlstuffing

ben_pfaff · 7 days ago

Bjarne Stroustrup wrote a paper about adding overloading for the "whitespace operator" in C++, but in his case it was a joke: https://stroustrup.com/whitespace98.pdf

ben_pfaff commented on Datalog in Rust github.com/frankmcsherry/... · Posted by u/brson

jitl · 9 months ago

I struggle to understand the Clojure/Datomic dialect, but I agree generally. I recommend Percival for playing around with Datalog in a friendly notebook environment online: https://percival.ink/

Although there’s no “ANSI SQL” equivalent standard across Datalog implementations, once you get the hang of the core idea, it’s not too hard to understand another Datalog.

I started a Percival fork that compiles the Datalog to SQLite, if you want to check out how the two can express the same thing: https://percival.jake.tl/ (unfinished when it comes to aggregates and more advanced joins, but the basic forms work okay). Logica is a much more serious / complete Datalog->SQL compiler written by a Google researcher that compiles to BigQuery, DuckDB, and a few other SQL dialects (https://logica.dev/).

One area where Datalog is an order of magnitude easier is recursive queries / rules; this is possible in SQL but feels a bit like drinking playdough through a straw. Frank’s Materialize.com has a “WITH MUTUALLY RECURSIVE” SQL form (https://materialize.com/blog/recursion-in-materialize/) that’s much nicer than the ancient ANSI SQL recursive approach. We’re evaluating it for page load queries & data sync at Notion.

Feldera has a similar form for recursive views as well (https://www.feldera.com/blog/recursive-sql-queries-in-felder...). I like that Feldera lets you make each “rule” or subview its own statement rather than needing to pack everything into a single huge statement. The main downside I found when testing Feldera is that their SQL dialect has a bunch of limitations inherited from Apache Calcite, whereas the Materialize SQL dialect tries very hard to be PostgreSQL compatible.
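
To make the recursion comparison above concrete, here is a minimal sketch of graph reachability written two ways: once with the standard `WITH RECURSIVE` form (run through Python's built-in `sqlite3` module), and once as a naive fixpoint loop over two Datalog-style rules. The `edge` and `reach` names and the sample data are made up for illustration; this is not Percival, Materialize, or Feldera syntax.

```python
import sqlite3

# Hypothetical sample data: directed edges of a small graph.
EDGES = [("a", "b"), ("b", "c"), ("c", "d")]


def reachable_sql(edges):
    """Transitive closure using a standard recursive CTE (WITH RECURSIVE)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE edge (src TEXT, dst TEXT)")
    conn.executemany("INSERT INTO edge VALUES (?, ?)", edges)
    rows = conn.execute(
        """
        WITH RECURSIVE reach(src, dst) AS (
            SELECT src, dst FROM edge
            UNION
            SELECT reach.src, edge.dst
            FROM reach JOIN edge ON reach.dst = edge.src
        )
        SELECT src, dst FROM reach
        """
    ).fetchall()
    conn.close()
    return set(rows)


def reachable_datalog_style(edges):
    """Naive fixpoint evaluation of two Datalog-style rules:

        reach(x, y) :- edge(x, y).
        reach(x, z) :- reach(x, y), edge(y, z).
    """
    reach = set(edges)
    while True:
        # Apply the recursive rule once to everything derived so far.
        derived = {(x, z) for (x, y) in reach for (y2, z) in edges if y == y2}
        if derived <= reach:
            return reach  # nothing new: fixpoint reached
        reach |= derived
```

Both functions should return the same set of pairs. The Datalog version states each rule separately, while the SQL version has to fold the base case and the recursive case into one statement.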

ben_pfaff · 9 months ago

> Main downside I found when testing Feldera is that their SQL dialect has a bunch of limitations inherited from Apache Calcite

At Feldera, we're adding features to our SQL over time, by contributing them upstream to Calcite, making it better for everyone. Mihai Budiu, who is the author of the Feldera SQL compiler, is a Calcite committer.

ben_pfaff commented on The Pain That Is GitHub Actions feldera.com/blog/the-pain... · Posted by u/qianli_cs

solatic · a year ago

> Trivial mistakes (formatting, unused deps, lint issues) should be fixed automatically, not cause failures.

Do people really consider this best practice? I disagree. I absolutely don't want CI touching my code. I don't want to have to remember to rebase on top of whatever CI may or may not have done to my code. Not all linters are auto-fixable so anyway some of the time I would need to fix it from my laptop. If it's a trivial check it should run as a pre-commit hook anyway. What's next, CI should run an LLM to auto-fix failing test cases?

Do people actually prefer CI auto-fixing anything?

ben_pfaff · a year ago

I'm new to CI auto-fixes. My early experience with it is mixed. I find it annoying that it touches my code at all, but it does sometimes allow a PR to get further through the CI system to produce more useful feedback later on. And then a lot of the time I end up force-pushing a branch that is revised in other ways, in which case I fold in whatever the CI auto-fix did, either by squashing it in or by applying it in some other way.

(Most of the time, the auto-fix is just running "cargo fmt".)

ben_pfaff commented on Feldera Incremental Compute Engine github.com/feldera/felder... · Posted by u/gzel

bbminner · a year ago

I wonder what guarantees can be made wrt resource consumption. I suppose it'd be reasonable to assume that in most (all?) cases an update is cheaper than a recompute in terms of CPU cycles, but what about RAM? Intuitively, it seems like there must be cases that would force you to store an unbounded amount of data indefinitely in RAM.

ben_pfaff · a year ago

(Feldera co-founder here.) There are some cases where Feldera needs to index data indefinitely, yes. For those cases, Feldera can put those indexes on storage rather than keeping them entirely in RAM.

In a lot of cases where one might initially think that data needs to stay around indefinitely, people actually want the results from the last hour or day or month, etc. For those cases, Feldera supports a concept called "lateness" that allows it to drop older data: https://docs.feldera.com/sql/streaming/#lateness-expressions.
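
As a rough illustration of the idea (a toy model only, not Feldera's implementation or its SQL syntax), the sketch below keeps a "waterline" equal to the largest timestamp seen minus an allowed lateness. Events that arrive behind the waterline are rejected, and stored events that fall behind it are discarded. The `LatenessBuffer` class and its fields are hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LatenessBuffer:
    """Toy buffer that retains only events within `lateness` of the newest one."""

    lateness: int                   # allowed out-of-orderness, e.g. in seconds
    max_seen: Optional[int] = None  # largest timestamp observed so far
    events: list = field(default_factory=list)

    def insert(self, ts: int, payload: str) -> bool:
        """Store an event; return False if it arrived too late to be accepted."""
        if self.max_seen is not None and ts < self.max_seen - self.lateness:
            return False
        self.events.append((ts, payload))
        self.max_seen = ts if self.max_seen is None else max(self.max_seen, ts)
        # Anything older than the waterline is dropped, which bounds
        # how much state is kept around.
        waterline = self.max_seen - self.lateness
        self.events = [(t, p) for (t, p) in self.events if t >= waterline]
        return True
```

With `lateness=10`, inserting timestamps 100, 95, and 120 leaves only the event at 120 in the buffer, and a later insert at 105 is rejected because it is behind the waterline of 110.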

ben_pfaff commented on The sad story of Heisenberg's doctoral oral exam (1998) aps.org/publications/apsn... · Posted by u/occamschainsaw

ben_pfaff · 3 years ago

These days, it is absolutely unheard of for someone to get a doctorate at age 21. I don't know whether it was unusual then.

ben_pfaff commented on North Paw sensebridge.net/projects/... · Posted by u/noja

ben_pfaff · 3 years ago

I bet you could build something like this as a pair of earrings.

ben_pfaff commented on Beej’s Guide to C Programming [pdf] beej.us/guide/bgc/pdf/bgc... · Posted by u/tumblewit

billfruit · 5 years ago

There was a "C Unleashed" book, a massive tome of 1000+ pages written by many famous programmers, many of whom were quite active in comp.lang.c, like Richard Heathfield and CB Falconer. It had quite insightful material in it.

Anyone remember the heyday of comp.lang.c? I wonder what goes on in there now.

ben_pfaff · 5 years ago

I wasn't sure anyone but the authors remembered C Unleashed! I wrote the chapter on binary search trees and balanced trees.

Comp.lang.c was important to me for many years. I've met 5 or so of the regulars at least once. The most famous comp.lang.c regular is probably Tim Hockin of the Kubernetes project.

ben_pfaff commented on Differential Datalog github.com/vmware/differe... · Posted by u/maximilianroos

0x70dd · 5 years ago

Rego, the language used in Open Policy Agent, is based on and extends Datalog. It has gained a lot of traction in the past couple of years for evaluating authorization policies.

ben_pfaff · 5 years ago

There's an indirect relationship between Rego and DDlog, at least people-wise. OPA comes from Styra, which was founded by Tim Hinrichs and Teemu Koponen. Teemu designed nlog, which is also a Datalog variant, for use in NVP, which was the network virtualization product at Nicira and was later renamed NSX after VMware acquired Nicira. Tim also worked with Teemu (and me) on NSX at VMware. And Teemu was one of the architects of OVN (the biggest open source application for DDlog), with Tim also having some influence. And Teemu also knows Leonid and Mihai (the principal authors of DDlog).

Some of the episodes of my OVS Orbit podcast may be relevant:

* Episode 5: nlog, with Teemu Koponen from Styra and Yusheng Wang from VMware (https://ovsorbit.org/#e5)

* Episode 44: Cocoon-2, with Leonid Ryzhyk from VMware Research (https://ovsorbit.org/#e44)

* Episode 58: Toward Leaner, Faster ovn-northd, with Leonid Ryzhyk from VMware Research Group (https://ovsorbit.org/#e58)

ben_pfaff commented on Differential Datalog github.com/vmware/differe... · Posted by u/maximilianroos

lykahb · 5 years ago

I wonder how DDlog compares to Souffle. I think that the biggest difference is that Souffle is batch-oriented. It is less clear how the expressiveness, type systems, or recommended patterns differ. The syntax of DDlog seems to be closer to Souffle than to Datomic with EDN.

ben_pfaff · 5 years ago

I don't know anything about Souffle so I can't directly answer the question. But there is some related material.

The DDlog repo includes a Souffle-to-DDlog converter: https://github.com/vmware/differential-datalog/blob/v0.38.0/...

The Souffle converter is somewhat incomplete: https://github.com/vmware/differential-datalog/issues/174

ben_pfaff commented on Differential Datalog github.com/vmware/differe... · Posted by u/maximilianroos

ampdepolymerase · 5 years ago

Building it from source looks tricky but the concepts are quite interesting.

ben_pfaff · 5 years ago

If you're talking about building the DDlog-to-Rust compiler, then it's not that hard. Really it's just a matter of installing Haskell then typing "stack build". But each release also comes with binaries for GNU/Linux, OS X, and Windows, e.g. for 0.38.0: https://github.com/vmware/differential-datalog/releases/tag/...