Readit News
Mathnerd314 commented on Nearly 1 in 3 Starlink satellites detected within the SKA-Low frequency band   astrobites.org/2025/08/12... · Posted by u/aragilar
barbazoo · 13 days ago
> Luckily we now are able to launch stuff into orbit a lot cheaper.

“We” as in the select few countries that have the launch capability and the space tech.

Again, a public good is being commoditized and sold to the highest bidder.

Mathnerd314 · 13 days ago
I thought that was the whole idea of spectrum auctions.
Mathnerd314 commented on Texas politicians warn Smithsonian it must not lobby to retain its space shuttle   arstechnica.com/space/202... · Posted by u/LorenDB
Mathnerd314 · 17 days ago
All I can find on the Smithsonian is that they did press interviews, where various staff expressed opposition, and that they also sent some report to Congress. The press interviews are, quite naturally, public statements, and it could be argued they're unrelated to lobbying. As for the report, that's part of their normal duties - it would be a real catch-22 if such a report were considered lobbying. This feels like bluster from the politicians; they write dumb letters all the time for PR purposes.

The space shuttle situation, though, is a disaster.

Mathnerd314 commented on ACM Transitions to Full Open Access   acm.org/publications/open... · Posted by u/pcvarmint
throwaway81523 · a month ago
> Institutions subscribing to ACM Open receive full access to the Premium version of the ACM Digital Library, providing their users with unrestricted access to over 800,000 ACM published research articles, the ACM Guide to Computing Literature (which indexes more than 6,500 3rd party publishers with direct links to the content), advanced tools, and exclusive features.

What does this mean? The 800,000 previously published articles will stay paywalled and only the new stuff will be open? Or will stuff be open to individuals while institutions have to keep paying? Or what?

Mathnerd314 · a month ago
So all articles will be open and free to read. The ACM Open subscription mainly includes publishing at a lower overall cost than the per-article rates, but also includes "AI-assisted search, bulk downloads, and citation management" and "article usage metrics, citation trends, and Altmetric tracking".
Mathnerd314 commented on The Big Oops: Anatomy of a Thirty-Five-Year Mistake [video]   youtube.com/watch?v=wo84L... · Posted by u/doruk101
romaniv · a month ago
From the video: "It's like, yeah, he said that in 2003, right? He said that after a very long time. So why did he say it? It's because 10 years earlier, he was already saying he kind of soured on it."

https://youtu.be/wo84LFzx5nI?t=823

He mentions Alan Kay about a dozen times and uses quotes and dates to create a specific narrative about Smalltalk. That narrative is demonstrably false.

Mathnerd314 · a month ago
Casey says he “didn’t really cover Alan Kay” (https://youtu.be/wo84LFzx5nI?t=8651). To me, that says that Kay wasn't a major focus of his research. That seems to be reflected in the talk itself: I counted 6 Bjarne sources, 4 Alan Kay sources, 2 more related to Smalltalk, and about 10 focused on Sketchpad, Douglas Ross, and others. By source count, the talk is roughly 18% about Alan Kay and 27% about Smalltalk overall - not a huge part.

As for the narrative, probably the clearest expression of Casey's thesis is at https://youtu.be/wo84LFzx5nI?t=6187 "Alan Kay had a degree in molecular biology. ... [he was] thinking of little tiny cells that communicate back and forth but which do not reach across into each other's domain to do different things. And so [he was certain that] that was the future of how we will engineer things. They're going to be like microorganisms where they're little things that we instance, and they'll just talk to each other. So everything will be built that way from the ground up." AFAICT the gist of this is true: Kay was indeed inspired by biological cells, and that is why he emphasized message-passing so heavily. His undergraduate degree was in math + bio, not just bio, but close enough.

As for specific discussion, Casey says, regarding a quote on inheritance (https://youtu.be/wo84LFzx5nI?t=843): "that's a little bit weird. I don't know. Maybe Alan Kay... will come to tell us what he actually was trying to say there exactly." So yeah, Casey has already admitted he has no understanding of Alan Kay's writings. I don't know what else you want.

Mathnerd314 commented on The Big Oops: Anatomy of a Thirty-Five-Year Mistake [video]   youtube.com/watch?v=wo84L... · Posted by u/doruk101
abetusk · a month ago
I found this talk to be great. It goes through the history of OOP and how some of the ideas behind the more modern ECS were already embedded in the culture around OOP's formation in the 1960s to 1980s, but somehow weren't adopted.

It was pretty clear, even 20 years ago, that OOP had major problems in terms of what Casey Muratori now calls "hierarchical encapsulation" of problems.

One thing that really jumped out at me was his quote [0]:

> I think when you're designing new things, you should focus on the hardest stuff. ... we can always then take that and scale it down ... but it's almost impossible to take something that solves simple problems and scale it up into something that solves hard [problems]

I understand the context, but this, in general, is abysmally bad advice. I'm not sure about language design or system architecture, but it is almost universally not true for any mathematical or algorithmic pursuit.

[0] https://www.youtube.com/watch?v=wo84LFzx5nI&t=8284s

Mathnerd314 · a month ago
So, this is pretty difficult to test in a real-world environment, but I did a little LLM experiment. Two prompts, (A) "Implement a consensus algorithm for 3 nodes with 1 failure allowed." vs. (B) "Write a provably optimal distributed algorithm for Byzantine agreement in asynchronous networks with at least 1/3 malicious nodes". Prompt A generates a simple majority-vote approach and says "This code does not handle 'Byzantine' failures where nodes can act maliciously or send contradictory information." Prompt B generates "This is the simplified core consensus logic of the Practical Byzantine Fault Tolerance (PBFT) algorithm".

I would say, if you have to design a good consensus algorithm, PBFT is a much better starting point, and can indeed be scaled down. If you have to run something tomorrow, the majority-vote code probably runs as-is, but doesn't help you with the literature at all. It's essentially the iron triangle - good vs. cheap. In the talk the speaker was clearly aiming for quality above all else.
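
For concreteness, here is roughly the shape of thing prompt (A) gives you - my own minimal Python sketch, not the model's actual output. It tolerates one crashed node out of three and says nothing at all about Byzantine behavior:

    # Minimal majority-vote "consensus" for 3 nodes tolerating 1 crash fault.
    # Illustrative sketch only - handles crashes (None), not Byzantine nodes.
    from collections import Counter
    from typing import Dict, Optional

    def majority_vote(proposals: Dict[str, Optional[str]], quorum: int = 2) -> Optional[str]:
        votes = [v for v in proposals.values() if v is not None]
        if len(votes) < quorum:
            return None  # too many nodes failed to respond
        value, count = Counter(votes).most_common(1)[0]
        return value if count >= quorum else None  # no majority -> no decision

    # Node "c" has crashed; "a" and "b" still form a 2-of-3 quorum.
    print(majority_vote({"a": "commit", "b": "commit", "c": None}))  # -> commit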

Mathnerd314 commented on The Big Oops: Anatomy of a Thirty-Five-Year Mistake [video]   youtube.com/watch?v=wo84L... · Posted by u/doruk101
romaniv · a month ago
This video contains many serious misrepresentations. For example, it claims that Alan Kay only started talking about message-passing in 2003, and that this was a kind of backpedaling due to the failures of the inheritance-based OOP model. That is a laughable claim. Kay had given detailed talks discussing issues of OOP, dynamic composition and message-passing in the mid-80s. Some of those talks are on YouTube:

https://www.youtube.com/watch?v=QjJaFG63Hlo

Also, earlier versions of Smalltalk did not have inheritance. Kay talks about this in his 1993 article on the history of the language:

https://worrydream.com/EarlyHistoryOfSmalltalk/

Dismissing all of this as insignificant quips is ludicrous.

Mathnerd314 · a month ago
The dates are the dates of the sources; he says in the talk he wasn't going to try to infer when these ideas were invented. Also, he barely talked about Alan Kay.
Mathnerd314 commented on Jank is C++   jank-lang.org/blog/2025-0... · Posted by u/Jeaye
Mathnerd314 · a month ago
Ok, so jank is Clojure but with a C++/LLVM runtime rather than the JVM. So already all of its types are C++ types, which presumably makes things a lot easier. Basically it just uses libclang / CppInterOp to get the corresponding LLVM types and then emits a function call. https://github.com/jank-lang/jank/blob/interop/compiler%2Bru...
Mathnerd314 commented on Zig Community Mirrors   ziglang.org/download/comm... · Posted by u/todsacerdoti
Mathnerd314 · 2 months ago
3 mirrors? Arch has 827.
Mathnerd314 commented on LLMs pose an interesting problem for DSL designers   kirancodes.me/posts/log-l... · Posted by u/gopiandcode
Mathnerd314 · 2 months ago
Python is just a beautiful, well-designed language - in an era where LLMs generate code, it is kind of reassuring that they mostly generate beautiful code and that Python has risen to the top. If you look at the graph, Julia and Lua also do incredibly well, despite being a minuscule fraction of the training data.

But Python/Julia/Lua are by no means the most natural languages - what is natural is what people write before the LLM, the stuff that the LLM translates into Python. It is hard to get a good look at these "raw prompts" since the LLM companies keep those datasets closely guarded, but from HumanEval, MBPP+, YouTube videos of people vibe coding, and so on, it is clear that they are mostly English prose with occasional formulas and code snippets thrown in - and not "ugly" text either, but generally pre-processed through an LLM. So from my perspective the next step is to switch from Python as the source language to prompts as the source language - integrating LLMs into the compilation pipeline is a logical step. But currently they are too expensive to use consistently, so this is blocked by hardware development economics.
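
To make that last step concrete, here is a deliberately naive sketch of what "prompts as the source language" could look like - call_llm is a hypothetical stand-in for whatever model API you actually use, and the cache is the only concession to the cost problem:

    # Naive "LLM as compiler front-end" sketch: the prompt is the source,
    # Python is the generated target, and generated code is cached by prompt
    # hash so the expensive model call only happens when the prompt changes.
    import hashlib
    import pathlib

    CACHE = pathlib.Path("llm_cache")
    CACHE.mkdir(exist_ok=True)

    def call_llm(prompt: str) -> str:
        # Hypothetical: translate an English prompt into Python source.
        raise NotImplementedError("plug in a real model/provider here")

    def compile_prompt(prompt: str) -> str:
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        cached = CACHE / f"{key}.py"
        if cached.exists():
            return cached.read_text()  # cache hit: no model call
        source = call_llm(prompt)
        cached.write_text(source)      # cache miss: generate and store
        return source

    # Usage: exec(compile_prompt("Read numbers from stdin and print their median"))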

Mathnerd314 commented on Fine-tuning LLMs is a waste of time   codinginterviewsmadesimpl... · Posted by u/j-wang
sota_pop · 3 months ago
Not sure what you mean by “not trained to saturation”. Also I agree with the article, in the literature, the phenomenon to which the article refers is known as “catastrophic forgetting”. Because no one has specific knowledge about which weights contribute to model performance, by updating the weights via fine-tuning, you are modifying the model such that future performance will change in ways that are not understood. Also I may be showing my age a bit here, but I always thought “fine-tuning” was performing additional training on the output network (traditionally a fully-connected net), but leaving the initial portion (the “encoder”) weights unchanged - allowing the model to capture features the way it always has, but updating the way it generates outputs based on the discovered features.
Mathnerd314 · 3 months ago
OK, so this intuition is actually a bit hard to unpack; I got it from bits and pieces. Start with this post: https://www.fast.ai/posts/2023-09-04-learning-jumps/. Essentially, a single pass over the training data is enough for the LLM to significantly "learn" the material. In fact, if you read the LLM training papers, for the really large models they generally say explicitly that they only did 1 pass over the training corpus, and sometimes not even the full corpus, only like 80% of it or whatever. The other relevant information is the loss curves - models like Llama 3 are not trained until the loss on the training data is minimized, like typical ML models. Rather, they use approximate estimates of FLOPS / tokens vs. performance on benchmarks. But it is pretty much guaranteed that if you continued to train on the training data it would continue to improve its fit - 1 pass over the training data is by no means enough to adequately learn all of the patterns. So from a compression standpoint, the paper I linked previously says that an LLM is a great compressor - but it's not even fully tuned, hence "not trained to saturation".

Now, as far as how fine-tuning affects model performance, it is pretty simple: it improves fit on the fine-tuning data and decreases fit on the original training corpus. Beyond that, yeah, it is hard to say whether fine-tuning will help you solve your problem. My experience has been that it always hurts generalization, so if you aren't getting reasonable results with a base or chat-tuned model, fine-tuning further will not help; but if you are getting results, then fine-tuning will make them more consistent.
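
To pin down the terminology from the parent comment, here is a rough PyTorch-flavored sketch of the two senses of "fine-tuning" - the attribute names (model.encoder, model.head) are just assumptions for illustration, not any particular library's API:

    import torch

    # Older sense: freeze the feature extractor, train only the output head.
    def head_only_finetune(model, lr=1e-3):
        for p in model.encoder.parameters():
            p.requires_grad = False
        return torch.optim.Adam(model.head.parameters(), lr=lr)

    # What LLM fine-tuning usually means today: every weight gets updated,
    # which is exactly why fit on the original training corpus degrades.
    def full_finetune(model, lr=1e-5):
        return torch.optim.Adam(model.parameters(), lr=lr)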

u/Mathnerd314

Karma: 2917 · Cake day: November 1, 2009
About
Making the ultimate programming language https://mathnerd314.github.io/stroscot/

In the past I developed SuperTux (http://supertux.lethargik.org) when I was bored.
