But ChatGPT does help me work through some really difficult mathematical equations in the newest research papers by adding intermediate steps. I can easily confirm when it gets them right and when it doesn't, since I have some idea of the material myself. It's super useful.
If you can't make LLMs work for you at all and complain about them on the internet, you are an old man yelling at clouds. The blog post devolves from an insightful viewpoint into a long, sad ramble.
It's 100% fine if you don't want to use them yourself, but complaining to others gets old quickly.
I discovered goodr recently and they are great: $25 high-quality sunglasses that I can actually trust to have real UV ratings. Seeing people wear Ray-Bans or Oakleys is really funny.
EDIT: OK, apparently anywhere other than the poorest of countries, too, really.
Personally, I don't know if this is always a win, mostly because I enjoy the creative and problem-solving aspects of coding, and reducing that to something that is more about prompting, correcting, and mentoring an AI agent doesn't bring me the same satisfaction and joy.
After doing programming for a decade or two, the act of programming itself is not enough to count as "creative problem solving"; it's the domain and the set of problems you get to apply it to that need to be interesting.
More than 90% of programming tasks at a company are reimplementing things and algorithms that have been done a thousand times before by others, and that you've done something similar to a dozen times yourself. Nothing interesting there. That is exactly what can and should now be automated (to some extent).
In fact, solving problems creatively to keep yourself interested when the problem itself is boring is how you get code that sucks to maintain for the next guy. You should usually be writing the clearest, most boring implementation possible, which is not what "I love coding" people usually do (I'm definitely guilty).
To be honest, this is why I went back to get a PhD; the "just coding" stuff got boring after a few years of doing it for a living. Now it feels like I'm doing hobby projects again, because I work on exactly what I think could be interesting to others.
I.e., continually gambling and praying that the model spits out something that works, instead of thinking.
But more seriously: in the ideal case, refining a prompt because the LLM misunderstood an ambiguity in your task description is actually doing the meaningful part of the work in software development. It is exactly about defining the edge cases and converting into language what it is that you need for the task. Iterating on that is not gambling.
But of course, if you are not doing that, and are instead just trying to coax a "smarter" LLM with "prompt engineering" tricks (hopefully a soon-to-be-deprecated field of study), then you are building yourself a skill that can become useless tomorrow.
That's the real conclusion of the Apple paper. It's correct. LLMs are terrible at arithmetic, or even counting. We knew that. So, now what?
It would be interesting to ask the "AI system" to write a program to solve such puzzle problems. Most of the puzzles given have an algorithmic solution.
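For instance, Tower of Hanoi (one of the puzzles used in the paper) has a textbook recursive solution that fits in a dozen lines. A minimal Python sketch, just to illustrate what "write a program instead of simulating the moves" would look like:

    # Recursive Tower of Hanoi solver: returns the optimal move sequence.
    def hanoi(n, source, target, spare, moves=None):
        if moves is None:
            moves = []
        if n == 0:
            return moves
        hanoi(n - 1, source, spare, target, moves)   # clear the way
        moves.append((source, target))               # move the largest disk
        hanoi(n - 1, spare, target, source, moves)   # restack the rest on top
        return moves

    for i, (src, dst) in enumerate(hanoi(3, "A", "C", "B"), 1):
        print(f"{i}: {src} -> {dst}")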
This may be a strategy problem. LLMs may need to internalize Polya's How to Solve It.[2] Read the linked Wikipedia article: most of those steps are ones an LLM can do, but a strategy controller is needed to apply them in a useful order and to back off when stuck.
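To make the idea concrete, here is a toy sketch of what such a strategy controller might look like. llm() and verify() are placeholder names for a model call and an answer checker, not any real API:

    # Hypothetical outer loop that walks an LLM through Polya-style phases
    # and backs off (restarts with a hint) when verification fails.
    POLYA_STEPS = [
        "understand the problem",
        "devise a plan",
        "carry out the plan",
        "look back and check the result",
    ]

    def solve(problem, llm, verify, max_restarts=3):
        prompt = problem
        for attempt in range(max_restarts):
            context = prompt
            for step in POLYA_STEPS:
                context = llm(f"{step}:\n{context}")   # apply one phase at a time
            if verify(problem, context):               # only accept checked answers
                return context
            # back off: retry from scratch with a note that the last plan failed
            prompt = problem + "\n(the previous plan failed; try a different one)"
        return None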
The "Illusion of thinking" article is far less useful than the the Apple paper.
(Did anybody proofread the Apple paper? [1] There's a misplaced partial sentence in the middle of page 2. Or a botched TeX macro.)
[1] https://ml-site.cdn-apple.com/papers/the-illusion-of-thinkin...
But testing by having it write code for known puzzles is problematic, as the code may already be in the training set. Hence you need new puzzles, which is kind of what ARC was meant to do, right? Too bad OpenAI lost credibility on that set by having access to it while only "verbally promising" (lol) not to train on it, etc.
You might have ADHD.
And it is very important to know whether you have it or not, because all that advice for neurotypical people will not work for you if you do. In fact, it will harm you. It will make you feel like a failure.
You need to figure out how your brain works; only then will you finally manage to make lasting changes.
In roughly 7 years I went through working at all the top software companies in my country, but what really fixed my problems was moving on to being a researcher at a university. I'm now paid less than half of what I was before, but it's still enough, and I couldn't be happier.
Getting to work on what I think is actually important and interesting every day is what helped. I also seem happier than the younger researchers who didn’t work at companies first, who don’t know how good they have it.
OpenCL has been pretty handy for inference on older cards, but I'd argue its relevance is waning. llama.cpp has Vulkan compute now, which requires a smaller feature set for hardware to support. Many consumer devices skip OpenCL/CUDA altogether and delegate inference to an NPU library.
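For what it's worth, from application code the backend difference is mostly invisible: with the llama-cpp-python bindings, for example, the calling code looks the same whether llama.cpp was built for Vulkan, CUDA, or plain CPU, since the backend is chosen at build time. A rough sketch (the model path is a placeholder):

    # Backend-agnostic local inference via llama-cpp-python; whether this runs
    # on Vulkan, CUDA, or the CPU depends on how llama.cpp was compiled.
    from llama_cpp import Llama

    llm = Llama(model_path="./model.gguf", n_gpu_layers=-1)  # offload all layers if a GPU backend is available
    out = llm("Q: Name three GPU compute APIs. A:", max_tokens=64)
    print(out["choices"][0]["text"])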
My JAX models and the baseline PyTorch models were quite easy to set up there, and in practice there was no noticeable performance difference compared to 8x A100s (which I used for prototyping on our university cluster).
Of course it's just a random anecdote, but I don't think Nvidia is actually that far ahead.
Particularly for indie projects, you can essentially dump the entire codebase into it, and with a pro reasoning model it's all handled pretty well.
It makes me wonder whether everyone else is kidding themselves, or if I'm just holding it wrong.
It is also excellent for writing one-off code experiments and plots, saving the time it would take to write them from scratch.
I'm sorry, but you are just using it wrong.