phoe-krk commented on Generative AI's failure to induce robust models of the world   garymarcus.substack.com/p... · Posted by u/pmcjones
sgt101 · 2 months ago
Thank goodness we have version control systems then.
phoe-krk · 2 months ago
"Version control systems", in the case of AI, mean that their knowledge will stay frozen in time, and so their usefulness will diminish. You need fresh data to train AI systems on, and since contemporary data is contaminated with generative AI output, training on it will inevitably lead to inbreeding and eventual model collapse.
phoe-krk commented on Harper – an open-source alternative to Grammarly   writewithharper.com... · Posted by u/ReadCarlBarks
efitz · 2 months ago
I’m glad we have people at HN who could have eliminated decades of effort by tens of thousands of people, had they only been consulted first on the problem.
phoe-krk · 2 months ago
Which effort? Learning a language is something that can't be eliminated. Everyone needs to do it on their own. Writing grammar checking software, though, can be done a few times and then copied.
phoe-krk commented on Harper – an open-source alternative to Grammarly   writewithharper.com... · Posted by u/ReadCarlBarks
fakedang · 2 months ago
Not exactly. It takes time for those words to become mainstream for a generation. While you'd have to manually add those words in dictionaries, LLMs can learn these words on the fly, based on frequency of usage.
phoe-krk · 2 months ago
At this point we're already using different definitions of grammar and vocabulary - are they discrete (as in a rule system, vide Harper) or continuous (as in a probability, vide LLMs)? LLMs, like humans, can learn them on the fly, and, like humans, they'll have problems and disagreements judging whether something should be highlighted as an error or not.

Or, in other words: if you "just" want a utility that can learn speech on the fly, you don't need a rigid grammar checker, just a good enough approximator. If you want to check if a document contains errors, you need to define what an error is, and then if you want to define it in a strict manner, at that point you need a rule engine of some sort instead of something probabilistic.

phoe-krk commented on Harper – an open-source alternative to Grammarly   writewithharper.com... · Posted by u/ReadCarlBarks
qwery · 2 months ago
Fair enough, thanks for replying. I don't see the task of specifying a grammar as straightforward as you do, perhaps. I guess I just didn't understand the chain of comments.

I find that clear-cut, rigid rules tend to be the least helpful ones in writing. Obviously this class of rule is also easy/easier to represent in software, so it also tends to be the source of false positives and frustration that lead me to disable such features altogether.

phoe-krk · 2 months ago
When you do writing as a form of art, rules are meant to be bent or broken; it's useful to have the ability to explicitly write new ones and make new forms of the language legal, rather than wrestle with hallucinating LLMs.

When writing for utility and communication, though, English grammar is simple and standard enough. Browsing Harper sources, https://github.com/Automattic/harper/blob/0c04291bfec25d0e93... seems to have a lot of the basics already nailed down. Natural language grammar can often be represented as "what is allowed to, should, or should not, appear where, when, and in which context" - IIUC, Harper seems to tackle the problem the same way.
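
To make the "what should not appear where" idea concrete, here is a minimal Python sketch of one such context rule: a word should not immediately follow itself. This is only an illustration; the function name `find_repeated_words` and the tokenizing regex are assumptions of mine and are not taken from Harper's sources.

```python
import re


def find_repeated_words(text):
    """Return (index, word) pairs where a word immediately repeats.

    `index` is the position of the second occurrence in the word list.
    """
    # Crude tokenizer (an assumption): runs of letters and apostrophes.
    words = re.findall(r"[A-Za-z']+", text)
    return [
        (i, word)
        for i, (prev, word) in enumerate(zip(words, words[1:]), start=1)
        if prev.lower() == word.lower()
    ]
```

A rule this rigid will also flag legitimate constructions such as "had had", which is the kind of false positive mentioned upthread; a real checker would need an exception list or more context.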

phoe-krk commented on Harper – an open-source alternative to Grammarly   writewithharper.com... · Posted by u/ReadCarlBarks
fakedang · 2 months ago
Aight you win fam, I was trippin fr. You're absolutely bussin, no cap. Harvard should be taking notes.

(^^ alien language that was developed in less than a decade)

phoe-krk · 2 months ago
Yes, precisely. This "less than a decade" is orders of magnitude above the hours or days that it would take to manually add those words and idioms to proper dictionaries and/or write new grammar rules to accommodate aspects like skipping "g" in continuous verbs to get "bussin" or "bussin'" instead of "bussing". Thank you for illustrating my point.

Also, it takes at most a few developers to write those rules into a grammar checking system, compared to the millions or more who need to learn a given piece of "evolved" language as it becomes impossible to avoid learning it. Not only is it fast enough to do this manually, it's also much less work-intensive and more scalable.
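
As a rough sketch of how small such a hand-written rule can be, the Python below accepts g-dropped continuous verbs by mapping them back to a known "-ing" form. The word list `KNOWN_ING_FORMS` and the function `normalize_g_dropping` are hypothetical names made up for this example; a real checker would consult a full dictionary.

```python
import re

# Hypothetical mini-dictionary of known "-ing" forms (an assumption;
# a real checker would look these up in a full dictionary).
KNOWN_ING_FORMS = {"bussing", "tripping", "running", "talking"}

# Matches words ending in "in" or "in'" (e.g. "bussin", "trippin'").
G_DROPPED = re.compile(r"^(?P<stem>[a-z]+in)'?$")


def normalize_g_dropping(word):
    """Return the standard "-ing" form if `word` is a g-dropped variant
    of a known word; otherwise return None."""
    match = G_DROPPED.match(word.lower())
    if match is None:
        return None
    candidate = match.group("stem") + "g"
    return candidate if candidate in KNOWN_ING_FORMS else None
```

Words like "cabin" are left alone because "cabing" is not in the dictionary, so the rule only fires when the restored form is a word the checker already knows.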

phoe-krk commented on Harper – an open-source alternative to Grammarly   writewithharper.com... · Posted by u/ReadCarlBarks
qwery · 2 months ago
Please share your reasoning that led you to this conclusion -- that natural language "evolves slowly". You also seem to be making an assumption that natural languages (English, I'm assuming) can be well defined by a simple set of rigid patterns/rules?
phoe-krk · 2 months ago
> Please share your reasoning that led you to this conclusion -- that natural language "evolves slowly".

Languages are used to successfully communicate. To achieve this, all parties involved in the communication must know the language well enough to send and receive messages. This obviously includes messages that transmit changes in the language, for instance, when you try to explain to your parents the meaning of current short-lived meme and fad nouns/adjectives like "skibidi ohio gyatt rizz".

It takes time for a language feature to become widespread and de-facto standardized among a population. This is because people need to asynchronously learn it, start using it themselves, and gain critical mass so that even people who do not like using that feature need to start respecting its presence. This inertia is the main source of the slowness that I mention, and also a requirement for any kind of grammar-checking software. From the point of view of such software, a language feature that (almost) nobody understands is not a language feature, but an error.

> You also seem to be making an assumption that natural languages (English, I'm assuming) can be well defined by a simple set of rigid patterns/rules?

Yes, that set of patterns is called a language grammar. Even dialects and slangs have grammars of their own, even if they're different, less popular, have less formal materials describing them, and/or aren't taught in schools.

phoe-krk commented on Harper – an open-source alternative to Grammarly   writewithharper.com... · Posted by u/ReadCarlBarks
triknomeister · 2 months ago
You would lose out on evolution of language.
phoe-krk · 2 months ago
Natural languages evolve so slowly that writing and editing rules for them is easily achievable even this way. Think years versus minutes.
phoe-krk commented on A different take on S-expressions   gist.github.com/tearflake... · Posted by u/tearflake
tearflake · 2 months ago
Actually it's quite simple. We parse from left to right. When we hit EOL, we return to the beginning of line and increase Y by one.

Blocks are parsed in the following way: when we get the beginning count of block opening characters, we move Y by one, loop right while whitespace, until we encounter ending count of block characters.

In a transposed block, we just switch X and Y (this is easily done with pointers) and use the same code.
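
A minimal Python sketch of just the scanning part described above: walk left to right, move Y down by one at end of line, and reuse the same loop with X and Y swapped for the transposed case. It does not attempt the block-delimiter counting, and the function name `scan` and the padding of short lines are my own assumptions, not taken from the gist.

```python
def scan(lines, transposed=False):
    """Yield (x, y, char) for every character position in `lines`.

    Normal mode walks each line left to right, then moves down one line.
    Transposed mode swaps the roles of X and Y: it walks each column top
    to bottom, then moves right one column, using the same loop.
    """
    height = len(lines)
    width = max((len(line) for line in lines), default=0)
    # Pad short lines with spaces so every (x, y) position exists
    # (an assumption made to keep the sketch simple).
    grid = [line.ljust(width) for line in lines]
    outer, inner = (width, height) if transposed else (height, width)
    for a in range(outer):
        for b in range(inner):
            x, y = (a, b) if transposed else (b, a)
            yield x, y, grid[y][x]
```

The sketch swaps indices rather than using pointers, but the point is the same: one traversal routine serves both orientations.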

phoe-krk · 2 months ago

    (fst-atom """   trd-atom frt-atom
      """     00001
      asdf    00002 """    fth-atom)
      qwer    00003 hahaha
      zxcv      """ hehehe
      """           hohoho
                    """
I'm not sure I'd like the above to be parseable.

phoe-krk commented on A different take on S-expressions   gist.github.com/tearflake... · Posted by u/tearflake
velcrovan · 2 months ago
It gets worse/better. Since Racket allows you to hook your own reader in front of (or in place of) the default reader, you can have things like 2D syntax:

    #lang 2d racket
    (require 2d/match)
     
    (define (subtype? a b)
      #2dmatch
      ╔══════════╦══════════╦═══════╦══════════╗
      ║   a  b   ║ 'Integer ║ 'Real ║ 'Complex ║
      ╠══════════╬══════════╩═══════╩══════════╣
      ║ 'Integer ║             #t              ║
      ╠══════════╬══════════╗                  ║
      ║ 'Real    ║          ║                  ║
      ╠══════════╣          ╚═══════╗          ║
      ║ 'Complex ║        #f        ║          ║
      ╚══════════╩══════════════════╩══════════╝)
https://docs.racket-lang.org/2d/index.html

phoe-krk · 2 months ago
Truth be told, you can intercept the reader in Common Lisp, too, and here it actually makes some sense since the 2D value is immediately visually grokkable as an ASCII-art table. The proposed 2D sexpr notation does not have this.

u/phoe-krk

Karma: 8741 · Cake day: February 10, 2017
About
Lisp hacker. https://phoe.github.io

Author of "The Common Lisp Condition System", published by Apress. https://www.apress.com/us/book/9781484261330

meet.hn/city/50.0619474,19.9368564/Krakow

Socials:
- github.com/phoe
- functional.cafe/@phoe

---
