Readit News
tymonPartyLate commented on We tasked Opus 4.6 using agent teams to build a C Compiler   anthropic.com/engineering... · Posted by u/modeless
tymonPartyLate · a month ago
I try to see this like F1 racing. Building a browser or a C compiler with agent swarms is disconnected from the reality of normal software projects. In normal projects the requirements are not fully understood upfront, and you learn, adapt, and change as you make progress. But the innovations from professional racing result in better cars for everyone. We'll probably get better dev tools and better coding agents thanks to those experiments.
tymonPartyLate commented on Apple, What Have You Done?   onlinegoddess.net/2026/01... · Posted by u/todsacerdoti
tymonPartyLate · a month ago
The worst crime for me is Liquid Glass on the Apple Watch. All the menus now lag on the Watch Ultra gen 2. Where it was previously smooth to interact with, the random lags now make it annoying and inconvenient. (I need to focus my attention on the UI state instead of following an automatic procedure from muscle memory.)

The battery sometimes randomly drains in less than a day. There are absolutely no benefits to the new visual effects.

The watch was my favorite Apple device because it helps me reduce screen time on the phone. Now it is a source of anger.

tymonPartyLate commented on GLM-4.7: Advancing the Coding Capability   z.ai/blog/glm-4.7... · Posted by u/pretext
w10-1 · 3 months ago
Appears to be cheap and effective, though under suspicion.

But the personal and policy issues are about as daunting as the technology is promising.

Some of the terms, possibly similar to those of many such services:

    - The use of Z.ai to develop, train, or enhance any algorithms, models, or technologies that directly or indirectly compete with us is prohibited
    - Any other usage that may harm the interests of us is strictly forbidden
    - You must not publicly disclose [...] defects through the internet or other channels.
    - [You] may not remove, modify, or obscure any deep synthesis service identifiers added to Outputs by Z.ai, regardless of the form in which such identifiers are presented
    - For individual users, we reserve the right to process any User Content to improve our existing Services and/or to develop new products and services, including for our internal business operations and for the benefit of other customers. 
    - You hereby explicitly authorize and consent to our: [...] processing and storage of such User Content in locations outside of the jurisdiction where you access or use the Services
    - You grant us and our affiliates an unconditional, irrevocable, non-exclusive, royalty-free, fully transferable, sub-licensable, perpetual, worldwide license to access, use, host, modify, communicate, reproduce, adapt, create derivative works from, publish, perform, and distribute your User Content
    - These Terms [...] shall be governed by the laws of Singapore
To state the obvious competition issues: If/since Anthropic, OpenAI, Google, X.AI, et al are spending billions on data centers, research, and services, they'll need to make some revenue. Z.ai could dump services out of a strategic interest in destroying competition. This dumping is good for the consumer short-term, but if it destroys competition, bad in the long term. Still, customers need to compete with each other, and thus would be at a disadvantage if they don't take advantage of the dumping.

Once your job or company depends on it to succeed, there really isn't a question.

tymonPartyLate · 3 months ago
The biggest threats to innovation are the giants with the deepest pockets. Only 5% of ChatGPT traffic is paid; 95% is given away for free. Gemini CLI has a generous free tier for developers, and it is easy to get Gemini credits for free as a startup. They can afford to dump for a long time until the smaller players starve. How do you compete with that as a small lab? How do you get users when bigger models are free? At least the Chinese labs are scrappy and determined. They are the David here, IMO.
tymonPartyLate commented on TP-Link conducts Wi-Fi 8 trials, promises better reliability and lower latency   techspot.com/news/109837-... · Posted by u/thunderbong
tymonPartyLate · 5 months ago
If you want reliable Wi-Fi at home, get yourself Ubiquiti access points and throw away TP-Link. The issue is not the protocol. After many years of unplugging and plugging my TP-Link router back in, I know that they are cursed.
tymonPartyLate commented on Devpush – Open-source and self-hostable alternative to Vercel, Render, Netlify   github.com/hunvreus/devpu... · Posted by u/el_hacker
hunvreus · 5 months ago
The main difference is that you're just deploying apps, not containers (although I do plan on adding support for custom Docker images).

Ultimately, I built this because I wanted the Vercel experience for Python apps.

PS: it's built with FastAPI and HTMX (and Basecoat [1] which I created for this project).

[1]: https://basecoatui.com

tymonPartyLate · 5 months ago
Interesting tech choices. I am also always on the hunt for React alternatives, but the lack of type safety and static analysis usually leads to brittle templates. Stuff that could be expressed in verifiable code gets compressed into annotations and custom markup, and you need to manually re-test all templates whenever you change the models. How do you deal with that?
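To illustrate the brittleness concern: in a Jinja-style template (what FastAPI projects commonly use), a model refactor that a type checker would catch in ordinary Python code only surfaces when the affected template path is actually rendered. This is a minimal hypothetical sketch, not code from the project; the `User` model and `display_name` field are invented for the example.

```python
from jinja2 import Environment, StrictUndefined
from jinja2.exceptions import UndefinedError

# StrictUndefined makes missing names fail loudly instead of rendering ""
env = Environment(undefined=StrictUndefined)
template = env.from_string("Hello, {{ user.display_name }}!")

class User:
    """Hypothetical model after a refactor renamed display_name to name."""
    def __init__(self, name: str):
        self.name = name

# Static analysis would flag user.display_name in Python code,
# but the template only breaks when this exact path is rendered:
try:
    template.render(user=User("Ada"))
except UndefinedError as e:
    print("template broke at render time:", e)
```

Without `StrictUndefined` the mismatch is even quieter: the template renders "Hello, !" and nothing fails at all, which is why template-heavy stacks usually lean on rendering tests for every template.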

Btw, very cool project. Deployments for simple projects are a huge time sink.

tymonPartyLate commented on How the AI Bubble Will Pop   derekthompson.org/p/this-... · Posted by u/hdvr
kumarm · 5 months ago
[even in programming it's inconclusive as to how much better/worse it makes programmers.]

Try building something new in Claude Code (or Codex, etc.) using a programming language you have not used before. Your opinion might change drastically.

Current AI tools may not beat the best programmers, but they definitely improve the average programmer's efficiency.

tymonPartyLate · 5 months ago
I did just that and ended up horribly regretting it. The project had to be coded in Rust, which I kind of understand but had never worked with. Drunk on AI hype, I gave it step-by-step tasks and watched it produce the code. The first warning sign was that the code never compiled on the first attempt, but I ignored this, mesmerized by the magic of the experience. Long story short, it gave me quick initial results despite my language handicap, but the project quickly turned into an overly complex, hard-to-navigate, brittle mess. I ended up reading the Rust in Action book and spending two weeks cleaning and simplifying the code. I had to learn how to configure the entire toolchain, understand various cargo deps and the ecosystem, set up CI/CD from scratch, ... There is no way around that.

It was Claude Code Opus 4.1 instead of Codex but IMO the differences are negligible.

tymonPartyLate commented on AI Mode in Search gets new agentic features and expands globally   blog.google/products/sear... · Posted by u/meetpateltech
tymonPartyLate · 7 months ago
We used to have private bridges and private roads, and that was an expensive travel situation for everyone. Now internet search is kind of like a bridge that leads clients to businesses, and Google decides on the tolls. Government-controlled internet search would definitely be horrible. But I wonder whether there is a path towards more competitiveness in this landscape: maybe the ISPs could somehow provide free search as part of the internet service fee? Can we have more specialized, niche search engines? Can governments be asked to break up the Google search monopoly?
tymonPartyLate commented on Gemini-2.5-pro-preview-06-05   deepmind.google/models/ge... · Posted by u/jcuenod
diggan · 9 months ago
> Code that is simple, easy to read, not polluted with comments, no unnecessary crap, just pretty, clean and functional

I get that with most of the better models I've tried, although I'd probably personally favor OpenAI's models overall. I think a good system prompt is probably the best way there, rather than relying on some "innate" "clean code" behavior of specific models. This is a snippet of what I use today for coding guidelines: https://gist.github.com/victorb/1fe62fe7b80a64fc5b446f82d313...

> That being said it occasionally does something absolutely stupid. Like completely dumb

That's a bit tougher, but you have to carefully read through exactly what you said and try to figure out what might have led it down the wrong path, or what you could have said in the first place for it to avoid that. Work it into your system prompt, then slowly build the prompt up so every one-shot gets closer and closer to being perfect on the first try.

tymonPartyLate · 9 months ago
Thanks for sharing, I'll copy your rules :)
tymonPartyLate commented on Gemini-2.5-pro-preview-06-05   deepmind.google/models/ge... · Posted by u/jcuenod
johnfn · 9 months ago
Impressive seeing Google notch up another ~25 Elo on LMArena, on top of the previous #1, which was also Gemini!

That being said, I'm starting to doubt the leaderboards as an accurate representation of model ability. While I do think Gemini is a good model, having used both Gemini and Claude Opus 4 extensively in the last couple of weeks, I think Opus is in another league entirely. I've been dealing with a number of gnarly TypeScript issues, and after a bit Gemini would spin in circles or actually (I've never seen this before!) give up and say it can't do it. Opus solved the same problems with no sweat.

I know that's a fairly isolated anecdote and not necessarily fully indicative of overall performance, but my experience with Gemini is that it really wants to kludge on code to make things work, whereas Opus tends to find cleaner approaches to the problem. Additionally, Opus just seemed to have a greater imagination? Or perhaps it has been tailored to work better in agentic scenarios? I saw it do things like dump the DOM and inspect it for issues after a particular interaction by writing a one-off Playwright script, which I found particularly remarkable. My experience with Gemini is that it tries to solve bugs by reading the code really, really hard, which is naturally more limited.

Again, I think Gemini is a great model, I'm very impressed with what Google has put out, and until 4.0 came out I would have said it was the best.

tymonPartyLate · 9 months ago
I just realized that Opus 4 is the first model that produced "beautiful" code for me. Code that is simple, easy to read, not polluted with comments, no unnecessary crap, just pretty, clean and functional. I had my first "wow" moment with it in a while. That being said, it occasionally does something absolutely stupid. Like completely dumb. And when I ask it "why did you do this stupid thing", it replies "oh yeah, you're right, this is super wrong, here is an actual working, smart solution" (and proceeds to produce brilliant code).

I do not understand how those machines work.

u/tymonPartyLate

Karma: 85 · Cake day: February 26, 2023
About
CTO gralio.ai https://x.com/tymonPartyLate