olq_plo commented on What Is the Fourier Transform?   quantamagazine.org/what-i... · Posted by u/rbanffy
lutusp · 7 days ago
Such a shame. In an otherwise well-written article, the author mentions Cooley and Tukey's discovery of the FFT without mentioning that Gauss discovered it first, among others, each approaching the same idea from a different direction.

The Wikipedia FFT article (https://en.wikipedia.org/wiki/Fast_Fourier_transform) credits Gauss with originating the FFT idea later expanded on by others, and correctly describes Cooley and Tukey's work as a "rediscovery."

olq_plo · 7 days ago
Yes, bad article to omit that. It is such a cool fun fact. Gauss was unreal.
olq_plo commented on The Bitter Lesson Is Misunderstood   obviouslywrong.substack.c... · Posted by u/JnBrymn
kushalc · 8 days ago
Hey folks, OOP/original author and 20-year HN lurker here — a friend just told me about this and thought I'd chime in.

Reading through the comments, I think there's one key point that might be getting lost: this isn't really about whether scaling is "dead" (it's not), but rather how we continue to scale for language models at the current LM frontier — 4-8h METR tasks.

Someone commented below about verifiable rewards and IMO that's exactly it: if you can find a way to produce verifiable rewards about a target world, you can essentially produce unlimited amounts of data and (likely) scale past the current bottleneck. Then the question becomes, working backwards from the set of interesting 4-8h METR tasks, what worlds can we make verifiable rewards for and how do we scalably make them? [1]
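To make "verifiable rewards" concrete, here is a toy sketch of my own (not from any real training pipeline): a task generator paired with a mechanical verifier, so correct reward signal can be produced without human labels.

```python
# Toy illustration of a "verifiable reward": any reward a program can
# check mechanically. Arithmetic stands in for a richer target world.

def make_task(a: int, b: int):
    """Generate an arithmetic task together with its verifier."""
    prompt = f"What is {a} + {b}?"

    def verify(answer: str) -> float:
        # Reward 1.0 iff the answer parses to the true sum.
        try:
            return 1.0 if int(answer.strip()) == a + b else 0.0
        except ValueError:
            return 0.0

    return prompt, verify

prompt, verify = make_task(17, 25)
assert verify("42") == 1.0
assert verify("41") == 0.0
```

Because the verifier is executable, you can mint as many (task, reward) pairs as you like; the hard part, as noted above, is building verifiers for worlds more interesting than arithmetic.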

Which is to say, it's not about more data in general, it's about the specific kind of data (or architecture) we need to break a specific bottleneck. For instance, real-world data is indeed verifiable and will be amazing for robotics, etc. but that frontier is further behind: there are some cool labs building foundational robotics models, but they're maybe ~5 years behind LMs today.

[1] There's another path with better design, e.g. CLIP that improves both architecture and data, but let's leave that aside for now.

olq_plo · 8 days ago
Since you seem to know your stuff: why do LLMs need so much data anyway? Humans don't. Why can't we make models aware of their own uncertainty, e.g. by feeding the variance of the next-token distribution back into the model as a signal to guide their own learning? With that kind of signal, LLMs might develop 'curiosity' and 'rigorousness' and seek out the data that best refines them. Let the AI make and test its own hypotheses, using formal mathematical systems, during training.
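A minimal sketch of the kind of signal I mean (my own illustration, not an existing API): the entropy of the next-token distribution as a scalar uncertainty measure a model could condition on.

```python
import math

# Hypothetical illustration: quantify a model's uncertainty over the next
# token as the entropy of its output distribution. A confident (peaked)
# distribution has low entropy; a flat one has high entropy. Such a
# scalar could, in principle, be fed back as a "curiosity" signal.

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0.0)

confident = [0.97, 0.01, 0.01, 0.01]  # model is nearly sure
uncertain = [0.25, 0.25, 0.25, 0.25]  # model has no idea

assert entropy(confident) < entropy(uncertain)
# Uniform over 4 tokens gives the maximum entropy, log(4).
assert abs(entropy(uncertain) - math.log(4)) < 1e-12
```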
olq_plo commented on Tracking Copilot vs. Codex vs. Cursor vs. Devin PR Performance   aavetis.github.io/ai-pr-w... · Posted by u/HiPHInch
rustc · 3 months ago
> “It concludes that the outputs of generative AI can be protected by copyright only where a human author has determined sufficient expressive elements”

How would that work if it's a patch to a project with a copyleft license like the GPL, which requires all derivative work to be licensed the same?

olq_plo · 3 months ago
IANAL, but it means the commit itself is public domain. Even when it is integrated into a code base with a more restrictive license, you can still use that isolated snippet in whatever way you want.

A more interesting question is whether one could remove the GPL restrictions from public code by telling an AI to rewrite the code from scratch, providing only the behavior of the code.

This could be accomplished by making AI generate a comprehensive test suite first, and then let it write the code of the app seeing only the test suite.

olq_plo commented on TransMLA: Multi-head latent attention is all you need   arxiv.org/abs/2502.07864... · Posted by u/ocean_moist
olq_plo · 4 months ago
Very cool idea. Can't wait for converted models on HF.
olq_plo commented on Autobib v0.4.0 released: auto-download missing entries to your BibTeX file   github.com/hdembinski/aut... · Posted by u/olq_plo
olq_plo · 4 years ago
This does not apply to everyone, but if you've been writing scientific papers with LaTeX, you may have come across this issue.

You go to an online database (Inspire or ADS) to fetch some references for your paper. Then you have to copy/paste the entry twice: the key into your LaTeX document and the BibTeX entry into your .bib file. Doing redundant things is annoying, right? autobib removes the need to do the latter. You still have to look up the key online and cite it in your LaTeX document, but autobib downloads the entry automatically to your .bib file.
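The workflow looks roughly like this (the key below is a made-up Inspire-style key, just for illustration):

```latex
% In the LaTeX document you cite the key you looked up online:
As shown in Ref.~\cite{Author:2020abc}, ...

% autobib then downloads the matching entry into your .bib file,
% so you no longer paste it there by hand. The fetched entry is
% an ordinary BibTeX record:
% @article{Author:2020abc,
%     author = "Author, A.",
%     title  = "{An example title}",
%     year   = "2020"
% }
```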

olq_plo commented on Show HN: Iminuit v2.0 Released   iminuit.readthedocs.io... · Posted by u/olq_plo
olq_plo · 5 years ago
Apart from the visible changes to the user interface, I completely swapped out the foundation. iminuit consists - at its core - of Python bindings to the Minuit2 C++ library.

We used to generate those bindings with Cython, but Cython is very bad at generating bindings for C++. It does not support all modern features and imposes restrictions on what you can wrap. It is also an external code generator that you have to install.

Cython was a real problem, so we switched to the excellent pybind11 library. It is a header-only C++ library. Generating Python bindings with it is a breeze, and it supports all possible C++ constructs. We lost a lot of weight and awkward complexity by switching out the foundation.

olq_plo commented on Show HN: Iminuit v2.0 Released   iminuit.readthedocs.io... · Posted by u/olq_plo
olq_plo · 5 years ago
Hi, I present the v2.0 overhaul of iminuit, the Jupyter-friendly Python interface for CERN's Minuit2 C++ library.

iminuit is a minimizer that finds a minimum of a mathematical (Python) function. It is used by people who fit complicated statistical models to data. There are many other minimizers out there (like those in scipy), but iminuit can also compute uncertainty estimates for the fitted values, which almost no other package can do. If that is not of interest to you, you can stop reading here :).
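To make "uncertainty estimates" concrete, here is a tiny self-contained sketch of the underlying idea (my own illustration, not iminuit's actual code): for a least-squares cost function, MINUIT-style errors come from the curvature (second derivative) of the cost at its minimum, which is what the HESSE step computes.

```python
import math

# Conceptual sketch: estimate a mean and its uncertainty from a
# least-squares cost, using the curvature of the cost at the minimum.
# The data and per-point error below are made up for illustration.

data = [1.1, 0.9, 1.3, 0.7, 1.0]
sigma = 0.2  # assumed measurement error per point

def cost(mu):
    return sum((x - mu) ** 2 / sigma ** 2 for x in data)

# The minimum of this quadratic cost is the sample mean.
mu_hat = sum(data) / len(data)

# Numerical second derivative of the cost at the minimum.
h = 1e-4
curv = (cost(mu_hat + h) - 2 * cost(mu_hat) + cost(mu_hat - h)) / h**2

# For a chi-square cost, variance of the parameter = 2 / curvature.
mu_err = math.sqrt(2.0 / curv)

# Matches the textbook result sigma / sqrt(n).
assert abs(mu_err - sigma / math.sqrt(len(data))) < 1e-6
```

In iminuit the same information falls out of the fit automatically, for arbitrary (non-quadratic) cost functions and many parameters at once.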

iminuit has been around for a long time and is popular in the astroparticle physics community, but its roots are in the high energy physics community, where almost every publication uses Minuit in some way.

Most of CERN's software is bundled in ROOT, a large framework that can do all kinds of things. Some people, however, prefer small packages that do specific things, and for those iminuit was written.

You can easily pip install iminuit and try one of the tutorials; everything is heavily documented. Now is the best time to start with iminuit, because the interface has been thoroughly cleaned up in v2.0.

u/olq_plo

Karma: 9 · Cake day: April 14, 2019