kaby76 (u/kaby76) - Readit News

kaby76 commented on Antlr-Ng Parser Generator antlr-ng.org/... · Posted by u/djoldman

kaby76 · 6 months ago

Development on Antlr4 has terminated. The "official ANTLR" successor, called Antlr5, was intended to enable ANTLR to run in a browser, replacing over a half-dozen runtime targets with a unified runtime target, and to add LSP services. But development on Antlr5 stopped after a few months, a year and a half ago, and I don't see when it'll be restarted, if ever.

Antlr-ng is Mike Lischke's port of Antlr4, which he likely undertook because ANTLR is used at Oracle for one MySQL product. It's not "official ANTLR," but Terence Parr granted him the use of the "ANTLR" name and allowed a fork to port the existing Antlr4 code to TypeScript.

Mike's Antlr-ng port of the Antlr4 code began with a Java-to-TypeScript translator he wrote. Along the way, he made some improvements to the TypeScript target.

But, Antlr-ng uses ALL(star). Therefore, it shares the same performance issues as Antlr4. I'm not sure where Mike wants to take Antlr-ng to address that issue.

ANTLR is presented as a generator for small, fast parsers. ALL(star) probably can't do that. Many grammars people write are pathological for ANTLR. People hand-write parsers, reverse-engineer the EBNF from the implementation as an afterthought, drop the critical semantic predicates from the EBNF, and then refactor it into something else—example: the Java Language Spec.

kaby76 commented on Parser generators vs. handwritten parsers: surveying major languages in 2021 notes.eatonphil.com/parse... · Posted by u/eatonphil

kaby76 · 5 years ago

I have a fundamental question: What happened to the idea of specifying the syntax of a language using a "formal" grammar, however terribly flawed and imperfect that has been--for decades--and still is? It seems that if you need a grammar, e.g., to write a programming language tool, it's considered "old school". If it's published, it's of secondary consideration because it's usually out-of-sync with the implementation (e.g., C#, soon to be version 10, the latest "spec"--draft at that--version 6). Some don't even bother publishing a grammar, e.g., Julia. Why would the authors of Julia not publish a "formal" grammar of any kind, instead just offer a link to the 2500+ lines of Scheme-like code? Are we to now write tools to machine-scrape the grammar from the hand-written code or derive the grammar from a ML model of the compiler?

kaby76 commented on Why aren't there more programming-language startups? akitasoftware.com/blog-po... · Posted by u/mpweiher

kaby76 · 5 years ago

There are many reasons why there are not many PL startups. From my perspective--as a PL tool developer for 40 years--is that we still do not have reliable formal grammars that describe the syntax and static semantics for a given programming language. That is a crux because it is difficult to write a tool for a language when we cannot even define what that language is. When you ask where to find a grammar for the language, the default answer is to just use an "official" implementation. That often requires a large amount of time to find and understand an API--and then locks you into that implementation. Some of the popular languages publish a grammar derived by a human reading the implementation after the fact, prone to errors. For C#, the standards committee is still trying to describe version 6 of the language (https://github.com/dotnet/csharplang/tree/main/spec), and we are now on version 10. An Antlr4 grammar for C# is mechanically reverse engineered from the IR, but it does not work out of the box--it contains several mutual left recursions and does not correctly describe the operator precedence and associativity. Julia does not even have a formal grammar published for the language. You cannot easily refactor a formal grammar from something that does not even exist, and have to write it from scratch. Often, what is published is out of sync from the implementation, not versioned, tied to a particular parser generator and target language with semantic predicates. The quality is suspect because you are unsure whether it even follows a spec. No one bothers to enumerate the refactorings involved in modifying the grammar to fit a parser generator, or for optimization, e.g., speed or memory. Yes, I can agree: “[t]here's definitely a need for SOMETHING, as developers have so much pain.”