I asked it to refactor a medium-sized Python project to remove duplicated code by introducing a dependency injection mechanism. The refactor is not entirely straightforward: it involves multiple files, and different files need to be usable with different dependencies.
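(For illustration, this is roughly the shape of such a refactor; a hypothetical sketch, not the project's actual code:)

```python
from typing import Protocol

class Storage(Protocol):
    def save(self, data: str) -> None: ...

class FileStorage:
    def __init__(self, path: str) -> None:
        self.path = path
    def save(self, data: str) -> None:
        with open(self.path, "a") as f:
            f.write(data + "\n")

class Pipeline:
    # The dependency is injected instead of constructed inside,
    # so the duplicated per-backend variants collapse into one class.
    def __init__(self, storage: Storage) -> None:
        self.storage = storage
    def run(self, data: str) -> None:
        self.storage.save(data.upper())

Pipeline(FileStorage("out.txt")).run("hello")  # swap in another Storage per use
```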
Anyway, I explained the problem in a few lines and asked for a plan of what to do.
At first I was extremely impressed: it automatically ran commands to read the files and gave me a plan. It seemed to understand the issue perfectly and even proposed some other changes that looked like great ideas.
So I asked it to proceed with the changes, and it started creating folders and new files, editing files, and even running some tests.
I was dumbfounded; it seemed incredible. I did not expect it to work on the first try, as I already had some experience with AI making mistakes, but it seemed like magic.
Then, once it was done, the tests (which covered 100% of the code) no longer passed.
No problem: I isolated a few failing tests, asked Claude Code to fix them, and it did.
A few more times I found failing tests and asked it to fix them, slowly trying to clean up the mess, until one test turned out to have a peculiar problem: it passed (under pytest) but froze at the end of the test run.
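(One classic cause of a test that passes but hangs at exit, purely as an assumption about what may have happened here: a non-daemon background thread left running.)

```python
import threading
import time

def worker() -> None:
    while True:        # never exits
        time.sleep(1)

def test_passes_then_hangs() -> None:
    # Threads are non-daemon by default, so the interpreter waits for
    # this one even after pytest has already reported the test as passed.
    threading.Thread(target=worker).start()
    assert True
```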
I asked Claude Code to fix it, and it tried adding code to solve the issue, but nothing worked anymore. Each time it added some bullshit code, and each time it failed, piling on more and more code to try to fix and understand the issue.
Finally, after $7.50 spent and 2000+ lines of code changed, it was still not working, and I didn't know why, since I hadn't made the changes myself.
As you know, it's easier to write code than to read it, so in the end I decided to scrap everything and make all the changes myself little by little, checking that the tests kept passing as I went. I did follow some of the changes it had recommended, though.
Next time I'll start with something easier.
Orion is a WebKit browser from the folks at Kagi that supports both Firefox and Chromium extensions (including on iPhones and iPads) and has zero telemetry. I have the Firefox version of uBlock Origin installed.
Firefox is not the only option for people who want an alternative to Chrome that supports uBlock Origin.
But there are a ton of models I can't run locally at all due to VRAM limitations. I'd gladly run those models slower. I know there are ways to run them on CPU, orders of magnitude slower, but ideally there's some middle ground.
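One such middle ground, assuming a GGUF model and the llama-cpp-python bindings, is partial offloading: keep as many layers in VRAM as fit and run the rest on CPU. A sketch (the path and layer count are hypothetical):

```python
from llama_cpp import Llama

llm = Llama(
    model_path="models/example.gguf",  # hypothetical path
    n_gpu_layers=20,  # layers kept in VRAM; the remaining layers run on CPU
    n_ctx=4096,
)
out = llm("Q: What is 2 + 2? A:", max_tokens=8)
print(out["choices"][0]["text"])
```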
This is obviously great advice, but most groups don't organically sprout around interests. Sports, especially, are something I have a very hard time imagining enjoying. And with the slow enshittification of Meetup, where do you find these groups? Your local library?
Huh? In Python you have to choose either threads or async/await? Why not combine the two? I am so confused. C# allows both to be combined quite naturally, and JavaScript likewise allows workers alongside async/await.
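For what it's worth, Python does let you mix them: `asyncio.to_thread` (stdlib since 3.9) runs a blocking call in a worker thread while the event loop keeps running. A minimal sketch:

```python
import asyncio
import time

def blocking_io(n: int) -> int:
    time.sleep(1)  # stands in for a blocking library call
    return n * 2

async def main() -> None:
    # Both calls run in threads concurrently; the event loop stays free.
    results = await asyncio.gather(
        asyncio.to_thread(blocking_io, 1),
        asyncio.to_thread(blocking_io, 2),
    )
    print(results)  # [2, 4]

asyncio.run(main())
```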
Can someone explain in plain English how RL is even doable here, let alone desirable?
That's typically a setup where RL is desirable (even necessary): we have sparse rewards (only at the end) and give no details to the model on how to reach the solution. It's similar to training models to play chess against a specific opponent.
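As a toy illustration of learning from a terminal-only reward, here is a minimal REINFORCE sketch (the target sequence and hyperparameters are made up for the example):

```python
import numpy as np

rng = np.random.default_rng(0)
TARGET = np.array([1, 0, 1, 1])  # hidden "solution" the policy must discover
theta = np.zeros(len(TARGET))    # one Bernoulli logit per action
lr, baseline = 0.5, 0.0

for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-theta))             # action probabilities
    actions = (rng.random(len(TARGET)) < p).astype(int)
    reward = float(np.all(actions == TARGET))    # sparse: 1 only on full success
    baseline = 0.99 * baseline + 0.01 * reward   # running baseline cuts variance
    theta += lr * (reward - baseline) * (actions - p)  # grad of log Bernoulli

print(np.round(1.0 / (1.0 + np.exp(-theta)), 2))  # probabilities approach TARGET
```

Even though the signal arrives only when the whole sequence is right, the policy gradient still pushes the action probabilities toward the target.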
I would expect 10 random people to do better than a committee of 10, because 10 individuals get 10 chances to be right while a committee gets only one. Even if the committee were given 10 guesses (made simultaneously, not iteratively), it might still not do better, because people tend to go along with a wrong consensus rather than push for the answer they would have chosen independently.
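To make the arithmetic concrete: if each person independently has, say, a 30% chance of being right (an illustrative number), the chance that at least one of 10 is right is

```python
p = 0.3
print(1 - (1 - p) ** 10)  # ≈ 0.97, versus 0.3 for a single consensus answer
```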
We tried to do that with GraphQL, HTTP/2, ... and arguably failed. Until we can properly evolve web standards, we won't be able to fix the main issue; novel frameworks won't do it either.