The results seemed impressive until I noticed some of the "Thinking" statements in the UI.
One made it apparent the model / agent / whatever had read the title from the screenshot and was off searching for existing ABC transcripts of the piece Ode to Joy.
So the whole thing was far less impressive after that, it wasn't reading the score anymore, just reading the title and using the internet to answer my query.
It's weird, it's like many agents are now in a phase of constantly getting more information and never just thinking with what they've got.
Our children are effectively enslaved through basic trinkets and manipulation to serve as eye-balls for ad impressions to fuel equity value in silicon valley. It's fucked and it was intentionally designed to be this fucked.
The abstract of what you state makes sense, but the layers of manipulation on top of it are what the problem is.
So they build useful things and then make them pretty bad and less useful. If they were useful your interest or need would complete and you would move on.
Fundamentally I think it is important to say this. Addiction confounds some things in the space of designed systems
Other people apparently don't have this feeling at all. Maybe I shouldn't have been surprised by this, but I've definitely been caught off guard by it.
i.e. imagine a change that is literally a small diff, that is easy to describe as a mere user and not a developer, and that requires quite a lot of deep understanding merely to submit as a PR (build the project! run the tests! write the template for the PR!).
Really a lot of this stuff ends up being a kind of failure mode of various projects that we all fall into at some point where "config" is in the code and what could be a simple change and test required a lot of friction.
Obviously not all submissions are going to be like this but I think I've tried a few little ones like that where I would normally just leave whatever annoyance I have alone but think "hey maybe it's 10 min faff with AI and a PR".
The structure of the project incentives kind of creates this. Increasing cost to contribution is a valid strategy of course, but from a holistic project point of view it is not always a good one especially assuming you are not dealing with adversarial contributors but only slightly incompetent ones.
This fixed subscription plan with some hardly specified quotas looks like they want to extract extra money from these users who pay $200 and don't use that value, at the same time preventing other users from going over $200. Like I understand that it might work at scale, but just feels a bit not fair to everyone?
If you look at tool calls like MCP and what not you can see it gets ridiculous. Even though it's small for example calling pal MCP from the prompt is still burning tokens afaik. This is "nobody's" fault in this case really but you can see how the incentives are and we all need to think how to make this entire space more usable.
Shell scripts can be audited. The average user may not do it due to laziness and/or ignorance, but it is perfectly doable.
On the other hand, how do you make sure your LLM, a non-deterministic black box, will not misinterpret the instructions in some freak accident?
Instead of asking the agent to execute it for you, you ask the agent to write an install.sh based on the install.md?
Then you can both audit whatever you want before running or not.
They could literally find people who are working on he same things and recommend them for networking etc.
But that would take you off platform. Off attention.
Who wants to build this. Others must be thinking the same thing?
Acting like "oh, he was trolling", or "it was just a small amount of hating Black people and women" is exactly how you get Steven Miller in the fucking White House.
We need to make it shameful to be bigoted again, and that means calling out the bigotry even in death.
We are seeing parallel mechanics from the Trump/GOP camp: look at the library purges in conservative states and the push to co-opt moderation on platforms like TikTok. Access to the historical record isn't just a detail; it is the fundamental substrate of free speech.