The battery sometimes randomly drains in less than a day. There are absolutely no benefits to the new visual effects.
The watch was my favorite Apple device because it helped me reduce screen time on the phone. Now it is a source of anger.
But the personal and policy issues are about as daunting as the technology is promising.
Some of the terms, possibly similar to those of many such services:
- The use of Z.ai to develop, train, or enhance any algorithms, models, or technologies that directly or indirectly compete with us is prohibited
- Any other usage that may harm the interests of us is strictly forbidden
- You must not publicly disclose [...] defects through the internet or other channels.
- [You] may not remove, modify, or obscure any deep synthesis service identifiers added to Outputs by Z.ai, regardless of the form in which such identifiers are presented
- For individual users, we reserve the right to process any User Content to improve our existing Services and/or to develop new products and services, including for our internal business operations and for the benefit of other customers.
- You hereby explicitly authorize and consent to our: [...] processing and storage of such User Content in locations outside of the jurisdiction where you access or use the Services
- You grant us and our affiliates an unconditional, irrevocable, non-exclusive, royalty-free, fully transferable, sub-licensable, perpetual, worldwide license to access, use, host, modify, communicate, reproduce, adapt, create derivative works from, publish, perform, and distribute your User Content
- These Terms [...] shall be governed by the laws of Singapore
To state the obvious competition issues: If/since Anthropic, OpenAI, Google, X.AI, et al. are spending billions on data centers, research, and services, they'll need to make some revenue. Z.ai could dump services out of a strategic interest in destroying competition. This dumping is good for the consumer in the short term, but if it destroys competition, bad in the long term. Still, customers need to compete with each other, and thus would be at a disadvantage if they didn't take advantage of the dumping. Once your job or company depends on it to succeed, there really isn't a question.
Ultimately, I built this because I wanted the Vercel experience for Python apps.
PS: it's built with FastAPI and HTMX (and Basecoat [1] which I created for this project).
Btw, very cool project. Deployments for simple projects are a huge time sink.
Try building something new in Claude Code (or Codex, etc.) using a programming language you have not used before. Your opinion might change drastically.
Current AI tools may not beat the best programmers, but they definitely improve the average programmer's efficiency.
It was Claude Code Opus 4.1 instead of Codex but IMO the differences are negligible.
I get that with most of the better models I've tried, although I'd probably personally favor OpenAI's models overall. I think a good system prompt is probably the best way there, rather than relying on some "innate" "clean code" behavior of specific models. This is a snippet of what I use today for coding guidelines: https://gist.github.com/victorb/1fe62fe7b80a64fc5b446f82d313...
> That being said it occasionally does something absolutely stupid. Like completely dumb
That's a bit tougher, but you have to carefully read through exactly what you said, and try to figure out what might have led it down the wrong path, or what you could have said in the first place for it to avoid that. Try to work it into your system prompt, then slowly build up your system prompt so every one-shot gets closer and closer to being perfect on the first try.
That being said, I'm starting to doubt the leaderboards as an accurate representation of model ability. While I do think Gemini is a good model, having used both Gemini and Claude Opus 4 extensively in the last couple of weeks I think Opus is in another league entirely. I've been dealing with a number of gnarly TypeScript issues, and after a bit Gemini would spin in circles or actually (I've never seen this before!) give up and say it can't do it. Opus solved the same problems with no sweat. I know that that's a fairly isolated anecdote and not necessarily fully indicative of overall performance, but my experience with Gemini is that it would really want to kludge on code in order to make things work, where I found Opus would tend to find cleaner approaches to the problem. Additionally, Opus just seemed to have a greater imagination? Or perhaps it has been tailored to work better in agentic scenarios? I saw it do things like dump the DOM and inspect it for issues after a particular interaction by writing a one-off playwright script, which I found particularly remarkable. My experience with Gemini is that it tries to solve bugs by reading the code really really hard, which is naturally more limited.
Again, I think Gemini is a great model, I'm very impressed with what Google has put out, and until 4.0 came out I would have said it was the best.
I do not understand how those machines work.