The static typing of LR is nice, but trading off user experience for developer experience seems like a bad deal.
IMO there's a kind of funny progression in which parsing approach turns out to be the most appropriate as the scope of your project grows, and it circles back on itself:
- For pretty simple languages a hand-written recursive descent parser is obviously easiest
- Once your grammar is complicated enough that you start hitting precedence and ambiguity issues, or get sick of rewriting a big chunk of your parser as the grammar changes, you look into generating your parser from a BNF-like specification and end up with some variant of LL or LR
- At some point your language's grammar has mostly stabilized and you're struggling with providing good error messages or parsing performance, or you've had to add one too many hacks to get around limitations of your parser generator, and recursive descent starts looking good again
For my money, Pratt parsing/precedence climbing extends recursive descent in a way that makes a lot of the common complaints about the complexity of dealing with operator precedence and associativity seem overstated. The trick is just that as you're building the AST, some symbols cause you to reparent nodes that you thought you'd already placed, according to each operator's binding power - there's a sketch of the core loop below. See: https://www.oilshell.org/blog/2017/03/31.html
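To make that concrete, here's roughly what that loop can look like (a minimal illustrative sketch in JS with made-up token and node shapes, not WebBS's actual code):

    // Minimal precedence-climbing sketch; token/node shapes are invented.
    const BINDING_POWER = {'+': 10, '-': 10, '*': 20, '/': 20};

    function parseExpression(tokens, minPower = 0) {
      let left = parseAtom(tokens); // a number or a parenthesized expression
      // While the next operator binds at least as tightly as minPower,
      // reparent `left` under a new operator node.
      while (tokens.length && BINDING_POWER[tokens[0].value] >= minPower) {
        const op = tokens.shift().value;
        // The +1 makes operators of equal precedence left-associative.
        const right = parseExpression(tokens, BINDING_POWER[op] + 1);
        left = {type: 'binaryOp', op, left, right};
      }
      return left;
    }

    function parseAtom(tokens) {
      const token = tokens.shift();
      if (token.type === 'number') return {type: 'literal', value: token.value};
      if (token.value === '(') {
        const inner = parseExpression(tokens, 0);
        tokens.shift(); // consume ')'
        return inner;
      }
      throw new SyntaxError(`Unexpected token: ${token.value}`);
    }

Run on the tokens for 1 - 2 - 3 this produces ((1 - 2) - 3): the second minus doesn't bind tightly enough to steal the 2 from the first, so the loop reparents the finished (1 - 2) node under it instead.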
I wrote a compiler for a vaguely C-like language by hand in JavaScript a while back that's intended to show how simple a hand-written parser (and code generator) can end up: https://github.com/j-s-n/WebBS
It's not that hard to statically track type information along the way - the above example requires a second pass over the AST to nail things into place and make sure all our operators are operating on the right type of thing, but a lot of that information is captured during the first parsing pass or even during lexing. That second pass can be as simple as the sketch below.
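Something like this recursive walk is the basic shape (illustrative only - the node shapes and type names here are invented, and a real checker does more):

    // Second pass: walk the AST, check operand types, annotate nodes.
    function checkTypes(node, scope = new Map()) {
      switch (node.type) {
        case 'literal':
          return node.valueType; // e.g. 'i32' or 'f64', recorded during lexing
        case 'variable':
          if (!scope.has(node.name)) throw new TypeError(`Undeclared: ${node.name}`);
          return scope.get(node.name);
        case 'binaryOp': {
          const leftType = checkTypes(node.left, scope);
          const rightType = checkTypes(node.right, scope);
          if (leftType !== rightType) {
            throw new TypeError(`${node.op}: ${leftType} vs ${rightType}`);
          }
          return node.resolvedType = leftType; // annotate for code generation
        }
        default:
          throw new Error(`Unknown node type: ${node.type}`);
      }
    }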
I've posted a version of this comment before, but my main problem with MVC isn't that it's a bad architecture, just that in practice there's a huge amount of disagreement about what MVC (or MV* more generally) actually _is_, which can lead to confusion or weird/bad hybrid implementations.
I tend to use the "model" and "view" concepts a lot when discussing architecture, but in my experience it's almost always a mistake to try to reference any specific MV* pattern for explanatory purposes - it doesn't have the effect of making the discussion clearer.
The issue is that there isn't actually a consensus about what constitutes the definitional features of these patterns, especially when it comes to how the concepts involved actually translate into code. For any sufficiently notable discussion of an MV* pattern, you're going to find an argument in the comments about whether the author actually understands the pattern or is talking about something else, and typically the commenters will be talking past one another.
Note that I'm NOT claiming that there's anything wrong with MV* as an architecture, or your favorite explanation of MV* - it may be perfectly well defined and concrete and useful once you understand it. The issue is a feature of the community: lots of other people have a different (and possibly worse) understanding of what MV* means, so when you start talking about it and your understandings don't align, confusion arises. Getting those understandings back in alignment is more trouble than the acronyms are worth.
I've seen enough conversations about concrete development issues suddenly turn into disagreements about the meaning of words to realize that nothing useful is conveyed by mentioning MV* and assuming anyone knows what you're talking about - it's better to spell out exactly what you mean in reference to the code you're actually talking about, even if you have to use more words.
I like this MOVE scheme because it seems to me to divide up the conceptual space at the joints in a way that's relatively hard to misunderstand, and it seems a little easier to see how to directly relate those divisions back to code - something like the toy sketch below.
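For what it's worth, here's how I'd read the split in code terms (a toy sketch - the names and shapes here are mine, not anything canonical):

    // Toy sketch of the MOVE split as I read it: models hold state,
    // operations change it, views render it, events wire them together.
    const model = {count: 0}; // Model: what the app knows, no behavior

    function increment(model) { // Operation: what the app does, no rendering
      model.count += 1;
      notify('modelChanged', model);
    }

    function render(model) { // View: mediates between the app and the user
      console.log(`Count: ${model.count}`);
    }

    // Events: the glue that joins the other three without coupling them
    const listeners = {};
    function on(event, fn) { (listeners[event] ||= []).push(fn); }
    function notify(event, data) { (listeners[event] || []).forEach(fn => fn(data)); }

    on('modelChanged', render);
    increment(model); // logs "Count: 1"

The point isn't the specifics, just that each role maps onto an obvious chunk of code.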
Part of the motivation was that while I've written many simple parsers, I mostly used parser generators with BNF grammars and I didn't feel like I had a good sense of how the code I ended up generating actually worked for something with complex syntax and semantics. I felt like I was writing specs for a parser, rather than writing a parser. And I didn't have a huge amount of experience with code generation.
My toy language has vaguely C-like syntax with block scope and infix operators with various precedences, so it was a bit more complicated than JSON, but I ended up using something like Pratt parsing/precedence climbing (see https://www.oilshell.org/blog/2017/03/31.html) and wrote the whole thing in a way that's - hopefully - pretty easy to read for folks interested in wrapping their head around parsing complex syntax (e.g. with scope and name resolution; the scope-tracking part is sketched below). The lexer, parser and language definition ended up being about 1000 lines of JS (counting some pretty extensive comments).
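The scope tracking is less work than it sounds - a stack of maps gets you most of the way (an illustrative sketch, not the actual WebBS code):

    // Block scope as a stack of maps; the last entry is the innermost scope.
    const scopes = [new Map()];

    function enterBlock() { scopes.push(new Map()); }
    function exitBlock() { scopes.pop(); }

    function declare(name, info) {
      const current = scopes[scopes.length - 1];
      if (current.has(name)) throw new SyntaxError(`Redeclared: ${name}`);
      current.set(name, info);
    }

    function resolve(name) {
      // Search from the innermost scope outward.
      for (let i = scopes.length - 1; i >= 0; i--) {
        if (scopes[i].has(name)) return scopes[i].get(name);
      }
      throw new SyntaxError(`Undeclared: ${name}`);
    }

Call enterBlock/exitBlock when the parser sees { and }, declare on declarations, and resolve on uses; shadowing falls out for free.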
Code generation is pretty straightforward once you have an AST - for a stack machine like WASM, expressions mostly come down to a post-order walk, as sketched below.
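Something like this (opcode names are simplified, and tokensFor stands in for a lexer):

    // Emit stack-machine instructions with a post-order walk over the AST.
    const OPCODES = {'+': 'i32.add', '-': 'i32.sub', '*': 'i32.mul', '/': 'i32.div_s'};

    function emit(node, output = []) {
      switch (node.type) {
        case 'literal':
          output.push(`i32.const ${node.value}`);
          break;
        case 'binaryOp':
          emit(node.left, output);  // operands land on the stack first...
          emit(node.right, output);
          output.push(OPCODES[node.op]); // ...then the operator consumes them
          break;
        default:
          throw new Error(`Can't generate code for: ${node.type}`);
      }
      return output;
    }

    // emit(parseExpression(tokensFor('1 + 2 * 3'))) yields:
    // ['i32.const 1', 'i32.const 2', 'i32.const 3', 'i32.mul', 'i32.add']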
Any JS programmers who are interested in really getting into the nitty-gritty of writing their own parser/compiler should check it out. The source is here: https://github.com/j-s-n/WebBS.
If you want to play around with the language and inspect the generated ASTs and WASM bytecode, there's an interactive IDE with example code here: https://j-s-n.github.io/WebBS/index.html#splash
The internal factors are less about intentionally hiding things and more about not committing any resources to being open. A lot of folks within Cycorp would like for the project to be more open, but it wasn't prioritized within the company when I was there. The impression that I got was that veterans there sort of feel like the broader AI community turned their back on symbolic reasoning in the 80s (fair) and they're generally not very impressed by the current trends within the AI community, particularly w.r.t. advances in ML (perhaps unfairly so), so they're going to just keep doing their thing until they can't be ignored anymore.

"Their thing" is basically paying the bills in the short term while slowly building up the knowledge base with as many people as they can effectively manage and building platforms to make knowledge entry and ontological engineering smoother in the future. Doug Lenat is weirdly unimpressed by open-source models, and doesn't really see the point of committing resources to getting anyone involved who isn't a potential investor. They periodically do some publicity (there was a big piece in Wired some time ago) but people trying to investigate further don't get very far, and efforts within the company to open things up or revive OpenCyc tend to fall by the wayside when there's project work to do.
2. I don't know that much about this subject, but it's a point of common discussion within the company. Historically, a lot of the semantic web stuff grew out of efforts made by either former employees of Cycorp or people within a pretty close-knit intellectual community with common interests. OWL/RDF is definitely too simple to practically encode the kind of higher order logic that Cyc makes use of. IIRC the inference lead Keith Goolsbey was working on a kind of minimal extension to OWL/RDF that would make it suitable for more powerful knowledge representation, but I don't know if that ever got published.
Mathematics (in the mainstream sense) is the study of space and quantity.