Logica, a novel open-source logic programming language

Nice to see Datalog being validated by a big name, though I don't see what's modern about Logica in particular, or why one should use it over plain Datalog (as a syntactic subset of Prolog) when the available backends are restricted to SQL rewriters or locked in to BigQuery. Will have to look into the model for aggregate queries, which I guess is a selling point for Logica (as is modularization/composition with optimization), and also a weak point, since aggregates are typically neither part of Datalog (the decidable logic fragment) nor portable.

Edit: also I find the title a bit grandiose since this isn't about Logic Programming in general, but only database querying

cpdean · 5 years ago

> ...or why one should use it over plain Datalog...

I have been looking for examples of how to use a practical implementation of Datalog for years and the closest I've come is actually miniKanren instead. Could you point me to codebases that productively use Datalog internally?

hcarvalhoalves · 5 years ago

Datomic: https://docs.datomic.com/cloud/query/query-data-reference.ht...

Datascript: https://github.com/tonsky/datascript

Crux: https://opencrux.com/main/index.html

kovvy · 5 years ago

Souffle: https://github.com/souffle-lang/souffle

Used by projects such as Doop: https://bitbucket.org/yanniss/doop and Ddisasm: https://github.com/grammatech/ddisasm

RBerenguel · 5 years ago

What example with miniKanren have you found? This is an area I have a passing interest in, but I never find the time to delve deep enough to find anything shiny enough.

devit · 5 years ago

It seems that it's open source (Apache 2.0) and can generate SQL for PostgreSQL and SQLite in addition to BigQuery.

tannhaeuser · 5 years ago

Yeah, but that reduces Logica outside Google to SQL rewriting, when the weak point of SQL isn't so much the syntax as the scalability and expressiveness limitations for big data and document data (fixation on strong ACID/CAP guarantees, schemas). SQL syntax has strong points, too; one being that it's not just a query but also an update/batch language with ACID semantics; another being that it's standardized with a range of mature options available.

Consider also the practical side: using Datalog as merely "prettier SQL" still doesn't allow you to dynamically define data properties or go schema-less as in RDF or other logic/deductive graph databases. Whenever you want a new column, you must execute DDL (ALTER TABLE ADD COLUMN), which also leads to forced commits, overly broad permissions, chaotic backup procedures and/or code artefacts containing the dreaded SELECT * syntax. Also, parsing Datalog queries, reformulating them into SQL, then re-parsing the SQL in the DB engine isn't the most efficient thing.

Basically, the workflows and use cases for SQL RDBMSs and Datalog/graph databases are not the same, and if you're using one on top of the other, you're getting the intersection of possibilities but the union of problems, as is well known from O/R mappers ;)

anentropic · 5 years ago

Oh that's good, there was no mention of anything other than BigQuery in the front-matter but looks like it's in there:

https://github.com/EvgSkv/logica/blob/main/compiler/dialects...

samuell · 5 years ago

My hope would be mainly that this can get datalog into mainstream use, and soon get more (and more mature) libraries created by the community. That is in itself very exciting to me though.

Would be pretty awesome if we could have logica (or something similar) for dataframes (including pandas), and so could build pipelines of transformations-via-queries on those.

(If there is anything like this already implemented, I'm all ears!).

I have a hard time understanding what problem this solves versus SQL?

This really should have a demonstration of some of the actual use cases where this shines vs SQL.

Also: SQL-92 has a VALUES clause, which makes the example provided a little bit silly; you could just use

    values (2),(3),(5)

Another example they gave is a 5-line (excluding imported code) mocking code with a comment "compare that to what you would have to do to achieve the same using bare SQL". Okay..

    select * from (values (1, 'hello'), (2, 'logic'), (3, 'programming')) as mocktable(user_id, comment);

hibbelig · 5 years ago

I guess the appeal of Datalog is the recursion that can be expressed in rules:

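    has_descendant(?ancestor, ?descendant) :-
        has_child(?ancestor, ?descendant).
    has_descendant(?ancestor, ?descendant) :-
        has_child(?ancestor, ?child),
        has_descendant(?child, ?descendant).

Here, the assumption is that you have explicit `has_child` facts (expressing edges in a graph, essentially). The first rule is the base case (every child is a descendant), and together the two rules give you paths of arbitrary length.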

In SQL, given a table has_child(parent, child), it is not clear to me how you can get all descendants of a given person, or all ancestors.

Other people talk about recursive extensions to SQL, maybe that provides a way.

quelltext · 5 years ago

Depends on the specific SQL flavor, e.g. https://www.postgresql.org/docs/9.1/queries-with.html

Logica instead seems to create tables and drop tables?

https://github.com/EvgSkv/logica/blob/main/examples/Logica_e...

It looks like Logica code isn't actually translated to a (large) SQL query, but Logica code is dynamically interpreted by some interpreter that calls into a SQL database as the virtual machine.

Not sure what Logica really provides here. Datalog usually comes with elaborate techniques to make sure only what's really needed is calculated instead of just generating all values every fact generation "iteration".
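For anyone wondering what such a technique looks like: the usual one is semi-naive evaluation, where each round only extends the facts that were new in the previous round instead of re-deriving the whole relation. Below is a minimal Python sketch for the descendant rule from the parent comment, assuming `has_child` is a set of (parent, child) pairs. The function name and sample data are made up for illustration, and this says nothing about how Logica itself evaluates recursion.

```python
def descendants_semi_naive(has_child):
    """Compute has_descendant as the transitive closure of has_child.

    Semi-naive evaluation: each round extends only the facts derived in
    the previous round (the "delta"), not the whole relation.
    """
    # Index parents by child so each join step is a dictionary lookup.
    parents = {}
    for parent, child in has_child:
        parents.setdefault(child, set()).add(parent)

    known = set(has_child)  # base rule: every child is a descendant
    delta = set(has_child)  # facts that were new in the last round
    while delta:
        new_facts = set()
        for child, descendant in delta:
            # recursive rule: has_child(ancestor, child) and
            # has_descendant(child, descendant) => has_descendant(ancestor, descendant)
            for ancestor in parents.get(child, ()):
                fact = (ancestor, descendant)
                if fact not in known:
                    new_facts.add(fact)
        known |= new_facts
        delta = new_facts
    return known


# Hypothetical sample data: alice -> bob -> carol -> dave
facts = {("alice", "bob"), ("bob", "carol"), ("carol", "dave")}
closure = descendants_semi_naive(facts)
```

The loop stops as soon as a round produces nothing new, so it also terminates on cyclic data.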

HelloNurse · 5 years ago

The normal SQL way is with a recursive CTE and help from the query optimizer. It differs from the Datalog way more in feeling (constructing relations with joins vs. defining predicates with logic) than in substance.

SQLite official examples: https://sqlite.org/lang_with.html

MySQL official examples: https://docs.oracle.com/cd/E17952_01/mysql-8.0-en/with.html
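To make that concrete for the has_child(parent, child) table from upthread, here is a small sketch using Python's built-in `sqlite3` module. The sample rows and the `has_descendant` CTE name are just illustrative assumptions; other databases may need small syntax changes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE has_child (parent TEXT, child TEXT)")
conn.executemany(
    "INSERT INTO has_child VALUES (?, ?)",
    [("alice", "bob"), ("bob", "carol"), ("carol", "dave"), ("erin", "frank")],
)

DESCENDANTS_SQL = """
WITH RECURSIVE has_descendant(ancestor, descendant) AS (
    -- base case: direct children
    SELECT parent, child FROM has_child
    UNION
    -- recursive case: extend a known path by one has_child edge
    SELECT c.parent, d.descendant
    FROM has_child AS c
    JOIN has_descendant AS d ON d.ancestor = c.child
)
SELECT descendant FROM has_descendant WHERE ancestor = ? ORDER BY descendant
"""

descendants_of_alice = [row[0] for row in conn.execute(DESCENDANTS_SQL, ("alice",))]
print(descendants_of_alice)
```

Using `UNION` rather than `UNION ALL` drops duplicate rows, which is also what keeps the recursion from looping forever if the data contains a cycle.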

cm277 · 5 years ago

The problem Datalog tries to solve is complexity: SQL "pulls" data (what's a query after all) to a calling application. Datalog builds up data relationships through declarations. That means that: a) entities can be inferred from these relationships as opposed to large complex queries, b) some of these relationships can be built up by code/robots as opposed to humans declaring them.

The end result is (you hope) a very complex database where the smaller blocks/relationships can be audited and verified quickly, and where parallelization more or less comes for free.

The reality is that Datalog systems end up being massive hairballs of declarations that are hard to unravel for mere humans (well, regular developers) and that query-based solutions are 10x faster to develop for 80% of the application use cases.

The closest parallel is functional-vs-procedural programming (don't flame me); it's a niche solution for niche problems.

Source: former Datalog developer for ERP systems.

ainar-g · 5 years ago

I actually mostly agree with you, except for the fact that in reality SQL is not a language, but a family of languages, some of which don't support the syntax[1], including Google's own BigQuery. Whether or not this is a reason to create a completely new unrelated language is still up for debate.

Tangentially related, but does anybody know of a program or a library that takes standard SQL queries as input and outputs one or multiple equivalent queries using the SQL dialects of a set of DBMSs? That is, compiles a standard SQL query into a PostgreSQL one, an SQLite one, etc.

[1]: https://modern-sql.com/feature/values#compatibility

layer8 · 5 years ago

> does anybody know of a program or a library that takes standard SQL queries as input and outputs one or multiple equivalent queries using the SQL dialects of a set of DBMSs?

There are a number of tools that translate queries between SQL dialects. Google for “sql dialect translator”.

da39a3ee · 5 years ago

That's explained carefully in the first 5 paragraphs, especially paras 1, 3, 4, and 5.

mikkom · 5 years ago

Yes I read it. Lots of abstract talk about modularity and how SQL is so bad because people tend to use all caps without demonstrating how and what this whole new programming language can do better.

Note that I'm not claiming it can't, but I would be interested in the authors pointing out what the actual benefits are.

All I see from their examples is that clauses in Logica are much longer than their SQL counterparts, and that they are importing modules which (I assume) redefine already defined schemas, which brings all kinds of dependency problems that I'm not going to go into here.

Scryer Prolog aims to become to ISO Prolog what GHC is to Haskell: an open source industrial strength production environment that is also a testbed for bleeding edge research in logic and constraint programming, which is itself written in a high-level language.

js8 · 5 years ago

I have yet to check it out, but it's very cool when people try to rethink SQL, and also revive Datalog. I (morally) support this effort.

Myself, I would like to see data query/manipulation language as a total functional language, possibly based on the idea of categorical data transformations: https://www.categoricaldata.net/

Also - bit of a rant - if you're creating a new programming language, consider making syntax and semantics separate in the specification. Lots of people get hung up on arguing about language syntax but it's really semantic differences that are important for compatibility. Lots of new languages come up only to fix syntactic problems with existing languages but create small semantic differences in the process, making automated translation from and to existing languages difficult. I wish we could move to a world where syntax and semantics in programming languages are discussed separately from each other.

chriswarbo · 5 years ago

> Also - bit of a rant - if you're creating a new programming language, consider making syntax and semantics separate in the specification.

I agree. Implementations should accept a stable, machine-friendly format (doesn't matter which; JSON, s-expressions, or even XML would do). If they also accept a human-friendly format, there should be a standard/built-in translation from human format -> machine format (optionally the other way too).

This way, we can always convert random real-world code (scraped from GitHub, or whatever) into a language-agnostic format (yes Python has an `ast` module; that doesn't help a Python linter written in something else, like Go); tools can manipulate this format without having to care about the surface syntax (e.g. linting/doc-gen/static-analysis/versioning/diffing/refactoring/macros/etc.); the output of such tools can always be fed back into the main implementation to compile/run/type-check/syntax-check/etc.
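As a rough illustration of the human format -> machine format direction, here is a sketch that dumps a Python syntax tree as JSON using the standard `ast` and `json` modules. The `node_to_data` helper and the `_type` key are ad hoc choices for this example, not any standard interchange format.

```python
import ast
import json


def node_to_data(node):
    """Recursively convert a Python AST node into plain dicts and lists."""
    if isinstance(node, ast.AST):
        data = {"_type": type(node).__name__}
        for field, value in ast.iter_fields(node):
            data[field] = node_to_data(value)
        return data
    if isinstance(node, list):
        return [node_to_data(item) for item in node]
    # Leaves: identifiers, numbers, strings, None. Anything JSON cannot
    # represent directly (bytes, complex, Ellipsis) falls back to repr().
    if node is None or isinstance(node, (str, int, float, bool)):
        return node
    return repr(node)


source = "total = price * 2"
tree_as_json = json.dumps(node_to_data(ast.parse(source)), indent=2)
print(tree_as_json)
```

A linter written in Go (or anything else) could then consume this JSON without knowing Python's surface syntax, although going back from JSON to source would need a separate step.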

thechao · 5 years ago

I agree with the GP. It frustrates me when the surface syntax and the semantics are not split apart. The constraints of old compiler technology (pre-3rd-millennium) continue to dictate the overall architecture of compilers. LLVM is a tiny baby step in the right direction.

Note, I'm not advocating for a single solution, here. I'm advocating that language authors should always think in terms of a front, middle, and back-end: front is surface syntax (their preferred one?); middle is the semantics, with a prescribed API; and, the backend is the implementation side — nicely abstracted by a 2nd API.

That way it gives those of us stuck in not-your-language a fighting chance to integrate Your Cool Thing™.

chillpenguin · 5 years ago

Good point about separating syntax and semantics. Maybe new languages should use something like lisp syntax to nail down the semantics, and then people could create their own syntaxes from there.

IIRC, Ohm (successor to OMeta) separates syntax from semantics!

jimmyed · 5 years ago

> Myself, I would like to see data query/manipulation language as a total functional language

Erm, like PromQL?

harperlee · 5 years ago

Not a great introduction to the language, IMHO. There is not a clear use of logic to automatically reason about anything, just query composition. It seems the language is much more powerful than what this introduction makes it out to be!

> English words (...) often capitalized to keep the old-fashioned COBOL spirit of the 70s alive!

I like logic programming a lot, but a convention that is not technologically enforced is a poor reason to argue for a language change. When arguing about SQL's limited abstraction capabilities, that space would have been better spent talking about CTE limitations, for example.

Also:

> To make things worse, SQL code is rarely tested, because “testing SQL queries” sounds rather esoteric to most engineers, at best

So nonexisting best practices require a language change, apparently. It was also not showcased how this can be done in Logica, beyond the table mocking that could be done with a "with xxx as (values a, b, c) select (query to be tested)" approach in SQL.
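For what it's worth, the plain-SQL mocking approach can be sketched like this with Python's built-in `sqlite3` module. The query under test and the mock rows are invented for the example; the database is empty, so the query can only be reading from the CTE.

```python
import sqlite3

# The query under test; it expects a table comments(user_id, comment).
QUERY_UNDER_TEST = """
SELECT user_id, COUNT(*) AS n_comments
FROM comments
GROUP BY user_id
ORDER BY user_id
"""

# Mock rows supplied through a CTE named like the table the query reads.
MOCK_COMMENTS = """
WITH comments(user_id, comment) AS (
    VALUES (1, 'hello'), (2, 'logic'), (1, 'programming')
)
"""

conn = sqlite3.connect(":memory:")  # empty database: no real comments table
rows = conn.execute(MOCK_COMMENTS + QUERY_UNDER_TEST).fetchall()
print(rows)
```

This only works when the query text can be prefixed with a `WITH` clause, so queries that already start with their own CTEs would need the mock merged into that clause.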

> So nonexisting best practices require a language change, apparently. It was also not showcased how this can be done in Logica.

Look for the section containing the text "As a final example, let us mock the comments table, in a unittest of a query." That demonstrates mocking and is explicitly pointing towards testing. The article is only a high-level intro document.

Sorry, I was editing my post in parallel. I see what you are pointing to. I'm sure the language has been thought out for much more time than my reading of the post - I'm just complaining about this "presentation" post. It's not clear from the example what the language provides over just redefining the table with mock values for testing. I'm sure Logica has more than what's stated in here, it's just not a good example (IMHO).

JulianMorrison · 5 years ago

SQL queries aren't just esoteric, they have highly opaque performance implications. Two ways of doing an SQL query that might look mathematically equivalent to a human could result in orders of magnitude speed difference due to one of them using the proper index and the other having to do a sequential scan, or various other performance issues like creating temporary data.
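A quick way to see this for yourself is to ask the database for its plan. The sketch below uses SQLite through Python's `sqlite3` module; the `users` table, the index name and the `plan` helper are assumptions made up for the example, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT, name TEXT)")
conn.execute("CREATE INDEX idx_users_email ON users (email)")


def plan(sql):
    """Return the human-readable steps SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


# Both queries find the same rows when emails are stored in lower case.
indexed = plan("SELECT name FROM users WHERE email = 'a@example.com'")
wrapped = plan("SELECT name FROM users WHERE lower(email) = 'a@example.com'")

print(indexed)  # a SEARCH step that uses idx_users_email
print(wrapped)  # a SCAN step, since the index on email cannot serve lower(email)
```

Wrapping the column in a function is one of the classic ways to lose an index without changing what the query means to a human reader.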

So this is going to run into the same issues as any SQL code generator (compare Hibernate for example): you need to know what query it will output. And you need DBA skills to know what that query means in terms of performance. Neither of those steps can be skipped.

Nor is unit testing necessarily helpful when using small n. Issues of poor scaling don't show in tests unless the data is large.

Compared to Prolog, which is sensitive to declaration/search order and can assert enough new facts to not terminate, the risk of inefficient but always correct queries is a marked improvement.

theon144 · 5 years ago

This is just pure speculation on my part, but:

What about optimizations? It seems like it should be possible to construct a SQL query that doesn't hit the pain points (e.g. avoids queries that do not use indexes). Although from my experiences with other ORM frameworks, that probably isn't an easy problem.

Even then though, since it looks like it somewhat aims to replace SQL even in the database-construction step, that might help in this regard, by constructing a more optimal representation of the data (which doesn't seem to be tabular)?

Unit testing I am similarly skeptical about though. The article does mention it being "rather esoteric [sounding] at best", I would actually agree with that expression, haha. I don't think I've ever written or even seen, in my 8 years as a developer, a 100-line SQL query that was not at least partly generated (and hence required testing as a unit, and not just the code around it). I suppose Google operates at a different scale, but still.

xvilka · 5 years ago

There's Datomic[1], though it's proprietary. Regarding Prolog, I hope they will take a look at the newly emerging "GHC of Prolog" in Rust - Scryer[2].

[1] https://www.datomic.com/

[2] https://github.com/mthom/scryer-prolog

What do you mean by "GHC of Prolog"? I don't know a lot about the Haskell ecosystem so I don't know what that implies.

Edit: Never mind, the Scryer Prolog github page states it as follows:

billfruit · 5 years ago

Isn't SWI the GHC of Prolog already?

tofflos · 5 years ago

I recently gave Rego a shot but had a difficult time grokking it. It's also inspired by Datalog. How would you say Logica compares to Rego?

> It supports modules and imports, it can be used from an interactive Python notebook and it even makes testing your queries natural and easy.

I don't see any examples of how to do tests in the announcement. Consider adding some.

BenoitP · 5 years ago

Nice to see some new initiatives in this domain (or old ideas resurfacing), but there is a long way to go towards mainstream adoption IMHO:

* My business clients and I speak SQL together. I don't see them learning a new language. I have neither the authority nor the will to force them to.

* I can spin up a container for testing business rules logic (and often share the results back to the client: here is what the impact of updating rule A is, rows of type W will be affected in this way).

Even though SQL has ceremony/verbosity, I'd rather see the standard evolve. My clients and I could pick it up more easily.

----

That's great for BigQuery though. You can't spin up a BigQuery docker container anyway, and testing with another schema/project is risky while you have interns around.