Readit News
eru commented on Building AI products in the probabilistic era   giansegato.com/essays/pro... · Posted by u/sdan
thrown-0825 · 2 days ago
this entire endeavor is a fool's errand, and anyone who has used coding agents for anything more complex than a web tut knows it.

it doesn't matter how much jargon and mathematical notation you layer on top of your black box next token generator, it will still be unreliable and inconsistent because fundamentally the output is an approximation of an answer and has no basis in reality

This is not a limitation you can build around, it's a basic limitation of the underlying models.

Bonus points if you are relying on an LLM for orchestration or agentic state, it's not going to work, just move on to a problem you can actually solve.

eru · 2 days ago
You could more or less use the same reasoning to argue for why humans can't write software.

And you'd be half-right: humans are extremely unreliable, and it takes a lot of safeguards and automated testing and PR reviews etc to get reliable software out of humans.

(Just to be clear, I agree that current models aren't exactly reliable. But I'm fairly sure with enough resources thrown at the problem, we could get reasonably reliable systems out of them.)

eru commented on Everything is correlated (2014–23)   gwern.net/everything... · Posted by u/gmays
petters · 2 days ago
If two things e.g. both change over time, they will be correlated. I think it can be good to keep this article in mind
eru · 2 days ago
> If two things e.g. both change over time, they will be correlated.

No?

You can have two independent random walks. Eg flip a coin, gain a dollar or lose a dollar. Do that two times in parallel. Your two account balances will change over time, but they won't be correlated.
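
A minimal sketch of that coin-flip setup, assuming fair coins and a fixed number of flips. The function names, the trial counts, and the seed are illustrative choices, not anything from the thread. It runs many independent pairs of walks and measures the correlation of the two final balances across those trials, which should come out close to zero.

```python
import random


def final_balance(rng, n_flips):
    """Balance after n_flips fair coin flips: +1 dollar on heads, -1 on tails."""
    return sum(1 if rng.random() < 0.5 else -1 for _ in range(n_flips))


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def ensemble_correlation(n_trials=2000, n_flips=200, seed=0):
    """Correlation between the final balances of two independent walks,
    measured across n_trials repetitions of the experiment."""
    rng = random.Random(seed)
    account_a = [final_balance(rng, n_flips) for _ in range(n_trials)]
    account_b = [final_balance(rng, n_flips) for _ in range(n_trials)]
    return pearson(account_a, account_b)


if __name__ == "__main__":
    print(f"correlation across trials: {ensemble_correlation():+.3f}")
```

Note that this measures correlation across many repetitions. The sample correlation computed along a single pair of paths can still look large by chance, even though the two walks are independent.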

eru commented on Everything is correlated (2014–23)   gwern.net/everything... · Posted by u/gmays
cluckindan · 2 days ago
I wonder if this tendency to correlate truly holds for everything? Intuitively it more or less demonstrates that nature tends to favor zero-sum games. Maybe analyzing correlations within the domain of theoretical physics would highlight true non-correlations in some particular approaches? (pun only slightly intended)
eru · 2 days ago
> Intuitively it more or less demonstrates that nature tends to favor zero-sum games.

Please explain.

eru commented on Everything is correlated (2014–23)   gwern.net/everything... · Posted by u/gmays
sayamqazi · 2 days ago
Wouldn't you need the T_zero configuration of the universe for this to work?

Given different T_zero configs of matter and energies, T_current would be different. And there are many pathways that could lead to the same physical configuration (position + energies etc.) with different (Universe minus cake) configurations.

Also we are assuming there are no non-deterministic processes happening at all.

eru · 2 days ago
> Wouldn't you need the T_zero configuration of the universe for this to work?

Why? We learn about the past by looking at the present all the time. We also learn about the future by looking at the present.

> Also we are assuming there are no non-deterministic processes happening at all.

Depends on the kind of non-determinism. If there's randomness, you 'just' deal with probability distributions instead. Since you have measurement error anyway, you need to do that anyway.

There are other forms of non-determinism, of course.

eru commented on AI tooling must be disclosed for contributions   github.com/ghostty-org/gh... · Posted by u/freetonik
raggi · 2 days ago
In the US you cannot generate copyrightable IP without substantial human contribution to the process.

https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...

eru · 2 days ago
Eh, they figured out how to copyright photographs, where the human only provides a few bits (setting up the scene, deciding when to pull the trigger etc); so stretching a few bits of human input to cover the whole output of an AI should also be doable with sufficiently well-paid lawyers.
eru commented on It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019)   hsivonen.fi/string-length... · Posted by u/program
spyrja · 2 days ago
True. But then again, backward-compatibility isn't really that hard to do with ASCII because the MSB is always zero. The problem I think is that the original motivation which ultimately led to the complications we now see with UTF-8 was based on a desire to save a few bits here and there rather than create a straightforward standard that was easy to parse. I am actually staring at 60+ lines of fairly pristine code I wrote a few years back that ostensibly passed all tests, only to find out that in fact it does not cover all corner cases. (Could have sworn I read the spec correctly, but apparently not!)
eru commented on It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019)   hsivonen.fi/string-length... · Posted by u/program
spyrja · 2 days ago
I really hate to rant on about this. But the gymnastics required to parse UTF-8 correctly are truly insane. Besides that we now see issues such as invisible glyph injection attacks etc cropping up all over the place due to this crappy so-called "standard". Maybe we should just go back to the simplicity of ASCII until we can come up with something better?
eru · 2 days ago
You could use a standard that always uses eg 4 bytes per character, that is much easier to parse than UTF-8.

UTF-8 is so complicated, because it wants to be backwards compatible with ASCII.
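
A rough sketch of the trade-off, using UTF-32 as one example of a fixed 4-bytes-per-code-point encoding (the comment above does not name a specific standard, so that choice is an assumption, as are the helper name and the sample string). With a fixed width, the i-th code point sits at byte offset 4*i. UTF-8 gives up that property, but keeps plain ASCII text byte-for-byte identical.

```python
def utf32_code_point_at(data: bytes, index: int) -> str:
    """Return the index-th code point from UTF-32-LE bytes.

    Fixed width means the position is simple arithmetic, no scanning needed.
    """
    start = 4 * index
    return data[start:start + 4].decode("utf-32-le")


# "a", "e with acute accent", and the facepalm emoji base character.
sample = "a\u00e9\U0001F926"

utf32 = sample.encode("utf-32-le")  # always 4 bytes per code point
utf8 = sample.encode("utf-8")       # 1, 2 and 4 bytes for these three

if __name__ == "__main__":
    print(len(utf32), len(utf8))
    print(utf32_code_point_at(utf32, 2) == "\U0001F926")
    # Pure ASCII text is unchanged by UTF-8, which is the compatibility goal.
    print("abc".encode("utf-8") == "abc".encode("ascii"))
```

The fixed width here is per code point, not per user-perceived character: an emoji built from several code points still spans several 4-byte units.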

eru commented on It’s not wrong that "\u{1F926}\u{1F3FC}\u200D\u2642\uFE0F".length == 7 (2019)   hsivonen.fi/string-length... · Posted by u/program
baq · 2 days ago
ASCII is very convenient when it fits in the solution space (it’d better be, it was designed for a reason), but in the global international connected computing world it doesn’t fit at all. The problem is all the tutorials, especially low level ones, assume ASCII so 1) you can print something to the console and 2) to avoid mentioning that strings are hard so folks don’t get discouraged.

Notably Rust did the correct thing by defining multiple slightly incompatible string types for different purposes in the standard library and regularly gets flak for it.

eru · 2 days ago
Python 3 deals with this reasonably sensibly, too, I think. They use UTF-8 by default, but allow you to specify other encodings.
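
As a small illustration, here is the string from the linked article's title measured in Python 3 under a few encodings. The `lengths` helper is a made-up name for this sketch. Python's `len` counts code points, while the 7 in the title is the number of UTF-16 code units.

```python
# The facepalm emoji sequence from the article title, written with escapes.
FACEPALM = "\U0001F926\U0001F3FC\u200D\u2642\uFE0F"


def lengths(s: str) -> dict:
    """Measure one string in several units by encoding it explicitly."""
    return {
        "code_points": len(s),                           # what Python's len() counts
        "utf8_bytes": len(s.encode("utf-8")),            # variable width, 1 to 4 bytes
        "utf16_units": len(s.encode("utf-16-le")) // 2,  # what JavaScript's .length counts
        "utf32_bytes": len(s.encode("utf-32-le")),       # fixed 4 bytes per code point
    }


if __name__ == "__main__":
    for unit, count in lengths(FACEPALM).items():
        print(f"{unit}: {count}")
```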
eru commented on AI tooling must be disclosed for contributions   github.com/ghostty-org/gh... · Posted by u/freetonik
ants_everywhere · 2 days ago
Junior developers are entering a workforce where they will never not be using AI
eru · 2 days ago
I don't think using AI at all is forbidden, he just doesn't want AI to do the whole PR?
eru commented on AI tooling must be disclosed for contributions   github.com/ghostty-org/gh... · Posted by u/freetonik
ryukoposting · 2 days ago
Publishing pirated copies of books on libgen isn't the same as downloading pirated copies of books from libgen. Neither is legal.
eru · 2 days ago
In many places that's a fairly recent development: publishing pirated IP used to be much more of a legal problem than consuming it.

Also, publishing pirated IP without any monetary gain to yourself used to be treated more leniently.

Of course, all the rules were changed (both in law and in interpretation in practice) as file sharing became a huge deal about two decades ago.

Details depend on jurisdiction.

u/eru

Karma: 32822
Cake day: August 16, 2007
About
You can reach me via email (generalbaguette@gmail.com).

Living in Singapore at the moment.
