But ChatGPT does help me work through some really difficult mathematical equations in the newest research papers by adding intermediate steps. I can easily confirm when it gets them right and when it doesn't, since I have some idea of the material myself. It's super useful.
If you can't make LLMs work for you at all and complain about them on the internet, you are an old man yelling at clouds. The blog post devolves from an insightful viewpoint into a long, sad ramble.
It's 100% fine if you don't want to use them yourself, but complaining to others gets old quickly.
I discovered goodr recently and they are great: $25 high-quality sunglasses that I can actually trust to have real UV ratings. Seeing people wear Ray-Bans or Oakleys is really funny.
EDIT: OK, apparently anywhere other than the poorest of countries, too, really.
Personally, I don't know if this is always a win, mostly because I enjoy the creative and problem-solving aspects of coding, and reducing that to something that is more about prompting, correcting, and mentoring an AI agent doesn't bring me the same satisfaction and joy.
After doing programming for a decade or two, the act of programming itself is not enough to count as "creative problem solving"; it's the domain and the set of problems you get to apply it to that need to be interesting.
More than 90% of programming tasks at a company are reimplementing things and algorithms that have been done a thousand times before by others, and that you've done something similar to a dozen times yourself. Nothing interesting there. That is exactly what can and should now be automated (to some extent).
In fact, solving problems creatively to keep yourself interested when the problem itself is boring is how you get code that sucks to maintain for the next guy. You should usually be writing the clearest, most boring implementation possible, which is not what "I love coding" people usually do (I'm definitely guilty).
To be honest, this is why I went back to get a PhD; the "just coding" stuff got boring after a few years of doing it for a living. Now it feels like I'm doing hobby projects again, because I work on exactly what I think could be interesting to others.
I.e., continually gambling and praying that the model spits out something that works, instead of thinking.
But more seriously: in the ideal case, refining a prompt because the LLM misunderstood an ambiguity in your task description is actually doing the meaningful part of the work in software development. It is exactly about defining the edge cases and converting into language what it is that you need for the task. Iterating on that is not gambling.
But of course, if you are not doing that, and are instead just trying to coax a "smarter" LLM with "prompt engineering" tricks (hopefully a soon-to-be-deprecated field of study), then you are building yourself a skill that can become useless tomorrow.
That's the real conclusion of the Apple paper. It's correct. LLMs are terrible at arithmetic, or even counting. We knew that. So, now what?
It would be interesting to ask the "AI system" to write a program to solve such puzzle problems. Most of the puzzles given have an algorithmic solution.
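For instance, Tower of Hanoi (one of the puzzles used in the paper) has a textbook recursive solution that fits in a dozen lines. A minimal Python sketch, just to illustrate what "write a program instead of simulating the moves" would look like:

    # Recursive Tower of Hanoi solver: returns the optimal move sequence.
    def hanoi(n, source, target, spare, moves=None):
        if moves is None:
            moves = []
        if n == 0:
            return moves
        hanoi(n - 1, source, spare, target, moves)   # clear the way
        moves.append((source, target))               # move the largest disk
        hanoi(n - 1, spare, target, source, moves)   # restack the rest on top
        return moves

    for i, (src, dst) in enumerate(hanoi(3, "A", "C", "B"), 1):
        print(f"{i}: {src} -> {dst}")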
This may be a strategy problem. LLMs may need to internalize Polya's How to Solve It.[2] Read the linked Wikipedia article: most of those steps are ones an LLM can do, but a strategy controller is needed to apply them in a useful order and to back off when stuck.
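To make the idea concrete, here is a toy sketch of what such a strategy controller might look like. llm() and verify() are placeholder names for a model call and an answer checker, not any real API:

    # Hypothetical outer loop that walks an LLM through Polya-style phases
    # and backs off (restarts with a hint) when verification fails.
    POLYA_STEPS = [
        "understand the problem",
        "devise a plan",
        "carry out the plan",
        "look back and check the result",
    ]

    def solve(problem, llm, verify, max_restarts=3):
        prompt = problem
        for attempt in range(max_restarts):
            context = prompt
            for step in POLYA_STEPS:
                context = llm(f"{step}:\n{context}")   # apply one phase at a time
            if verify(problem, context):               # only accept checked answers
                return context
            # back off: retry from scratch with a note that the last plan failed
            prompt = problem + "\n(the previous plan failed; try a different one)"
        return None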
The "Illusion of thinking" article is far less useful than the the Apple paper.
(Did anybody proofread the Apple paper? [1] There's a misplaced partial sentence in the middle of page 2. Or a botched TeX macro.)
[1] https://ml-site.cdn-apple.com/papers/the-illusion-of-thinkin...
But testing by having it write code for known puzzles is problematic, as the code may already be in the training set. Hence you need new puzzles, which is kind of what ARC was meant to do, right? Too bad OpenAI lost credibility on that set by having access to it while only "verbally promising" (lol) not to train on it, etc.
You might have ADHD.
And it is very important to know whether you have it or not, because all that advice for neurotypical people will not work for you if you do. In fact, it will harm you. It will make you feel like a failure.
You need to figure out how your brain works; only then will you finally manage to make lasting changes.
In roughly 7 years I went through working at all the top software companies in my country, but what really fixed my problems was moving on to being a researcher at a university. I'm now paid less than half of what I was before, but it's still enough, and I couldn't be happier.
Getting to work on what I think is actually important and interesting every day is what helped. I also seem happier than the younger researchers who didn’t work at companies first, who don’t know how good they have it.
OpenCL has been pretty handy for inference on older cards, but I'd argue its relevance is waning. llama.cpp has Vulkan compute now, which requires a smaller feature set for hardware to support. Many consumer devices skip OpenCL/CUDA altogether and delegate inference to an NPU library.
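For what it's worth, from application code the backend difference is mostly invisible: with the llama-cpp-python bindings, for example, the calling code looks the same whether llama.cpp was built for Vulkan, CUDA, or plain CPU, since the backend is chosen at build time. A rough sketch (the model path is a placeholder):

    # Backend-agnostic local inference via llama-cpp-python; whether this runs
    # on Vulkan, CUDA, or the CPU depends on how llama.cpp was compiled.
    from llama_cpp import Llama

    llm = Llama(model_path="./model.gguf", n_gpu_layers=-1)  # offload all layers if a GPU backend is available
    out = llm("Q: Name three GPU compute APIs. A:", max_tokens=64)
    print(out["choices"][0]["text"])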
My JAX models and the baseline PyTorch models were quite easy to set up there, and in practice there was no noticeable performance difference compared to 8x A100s (which I used for prototyping on our university cluster).
Of course it's just a random anecdote, but I don't think Nvidia is actually that far ahead.
Particularly for indie projects, you can essentially dump the entire codebase into it, and with a pro reasoning model it's all handled pretty well.
It makes me wonder whether everyone else is kidding themselves, or if I'm just holding it wrong.
It is also excellent for writing one-off code experiments and plots, saving the time it would take to write them from scratch.
I'm sorry, but you are just using it wrong.