jonfw (u/jonfw) - Readit News

jonfw commented on Flunking my Anthropic interview again taylor.town/flunking-anth... · Posted by u/surprisetalk

rurp · 4 days ago

Putting so much self worth into a single job application strikes me as unhealthy. Hiring decisions are have absurdly high variance. Everyone I know has been rejected from a job that seemed like a perfect, usually many times over. I'd say that's far more common than actually getting a given job.

jonfw · 4 days ago

particularly at these high prestige companies where open roles are likely to get thousands of applicants

jonfw commented on What makes Claude Code so damn good minusx.ai/blog/decoding-c... · Posted by u/samuelstros

alex1138 · 10 days ago

What do people think of Google's Gemini (Pro?) compared to Claude for code?

I really like a lot of what Google produces, but they can't seem to keep a product that they don't shut down and they can be pretty ham-fisted, both with corporate control (Chrome and corrupt practices) and censorship

jonfw · 10 days ago

Gemini is better at helping to debug difficult problems that require following multiple function calls.

I think Claude is much more predictable and follows instructions better- the todo list it manages seems very helpful in this respect.

jonfw commented on How to stop feeling lost in tech: the wafflehouse method yacinemahdid.com/p/how-to... · Posted by u/research_pie

cortesoft · 13 days ago

I find my life is best when I balance my creation and consumption. Too much of either leaves me not feeling the best.

jonfw · 12 days ago

I find my life is best when I make time to neither create nor consume, but just to do.

Play board games, go on a walk, play on the floor with my kid, play a sport.

"Doing" doesn't have to be creative or productive!

jonfw commented on MCP doesn't need tools, it needs code lucumr.pocoo.org/2025/8/1... · Posted by u/the_mitsuhiko

mgsloan2 · 15 days ago

I agree the current way tools are used seems inefficient. However there are some very good reasons they tend to operate on code instead of syntax trees:

* Way way way more code in the training set.

* Code is almost always a more concise representation.

There has been work in the past training graph neural networks or transformers that get AST edge information. It seems like some sort of breakthrough (and tons of $) would be needed for those approaches to have any chance of surpassing leading LLMs.

Experimentally having agents use ast-grep seems to work pretty well. So, still representing a everything as code, but using a syntax aware search replace tool.

jonfw · 15 days ago

> * Way way way more code in the training set.

Why not convert the training code to AST?

jonfw commented on MCP tools with dependent types vlaaad.github.io/mcp-tool... · Posted by u/vlaaad

jonfw · 15 days ago

I have a blog post here that has an example of dynamically changing the tool list- https://jonwoodlief.com/rest3-mcp.html.

In this situation, I would have a tool called "request ability to edit GLTF". I This would trigger an addition to the tool list specifically for your desired GLTF. The model would send the "tool list changed' notification and now the LLM would have access.

If you want to do it without the tool list changed notification ability, I'd have two tools, get schema for GLTF, and edit GLTF with schema. If you note that the get schema is a dependency for edit, the LLM could probably plumb that together on it's own fairly well

You could probably also support this workflow using sampling.

jonfw commented on LLMs and coding agents are a security nightmare garymarcus.substack.com/p... · Posted by u/flail

padolsey · 16 days ago

Most of these attacks succeed because app developers either don’t trust role boundaries or don’t understand them. They assume the model can’t reliably separate trusted instructions (system/developer rules) from untrusted ones (user or retrieved data), so they flippantly pump arbitrary context into the system or developer role.

But alignment work has steadily improved role adherence; a tonne of RLHF work has gone into making sure roles are respected, like kernel vs. user space.

If role separation were treated seriously -- and seen as a vital and winnable benchmark (thus motivate AI labs to make it even tighter) many prompt injection vectors would collapse...

I don't know why these articles don't communicate this as a kind of central pillar.

Fwiw I wrote a while back about the “ROLP” — Role of Least Privilege — as a way to think about this, but the idea doesn't invigorate the senses I guess. So, even with better role adherence in newer models, entrenched developer patterns keep the door open. If they cared tho, the attack vectors would collapse.

jonfw · 16 days ago

> If role separation were treated seriously -- and seen as a vital and winnable benchmark, many prompt injection vectors would collapse...

I think it will get harder and harder to do prompt injection over time as techniques to seperate user from system input mature and as models are trained on this strategy.

That being said, prompt injection attacks will also mature, and I don't think that the architecture of an LLM will allow us to eliminate the category of attack. All that we can do is mitigate

jonfw commented on Claude says “You're absolutely right!” about everything github.com/anthropics/cla... · Posted by u/pr337h4m

zozbot234 · 21 days ago

> the biggest take away I have is, if you tell it "don't do xyz" it will always have in the back of its mind "do xyz" and any chance it gets it will take to "do xyz"

You're absolutely right! This can actually extend even to things like safety guardrails. If you tell or even train an AI to not be Mecha-Hitler, you're indirectly raising the probability that it might sometimes go Mecha-Hitler. It's one of many reasons why genuine "alignment" is considered a very hard problem.

jonfw · 21 days ago

This reminds me of a phenomena in motorcyling called "target fixation".

If you are looking at something, you are more likely to steer towards it. So it's a bad idea to focus on things you don't want to hit. The best approach is to pick a target line and keep the target line in focus at all times.

I had never realized that AIs tend to have this same problem, but I can see it now that it's been mentioned! I have in the past had to open new context windows to break out of these cycles.

jonfw commented on Hand-picked selection of articles on AI fundamentals/concepts aman.ai/primers/ai/... · Posted by u/vinhnx

nativeit · 23 days ago

https://www.wheresyoured.at/ai-is-a-money-trap/

jonfw · 23 days ago

The whole article seems to hinge on the idea that cursor is unsustainable and a large driver of revenue for AI companies. It seems to think that if cursor dies, so does the revenue.

I don't see that- cursor will die by losing market share, not by the death of the market. Agentic coding as a market will continue to grow and if Claude remains competitive then Anthropic will do just fine.

jonfw commented on An LLM does not need to understand MCP hackteam.io/blog/your-llm... · Posted by u/gethackteam

crowcroft · a month ago

I think we're probably over using MCPs.

If you're a large org with an API that an ecosystem of other partners use then you should host a remote MCP and then people should connect LLMs to it.

The current model of someone bundling tools into an MCP and then you download and run that MCP locally feels a bit like the wrong path. Tool definitions for LLMs are already pretty standardized if things are just running locally why am I not just importing a package of tools, I'm not sure what the MCP server is adding.

jonfw · a month ago

MCP is just packaging. It's the ideal abstraction for building AI applications.

I think it provides the similar benefits of decoupling the front and back end of a standard app.

I can pick my favorite AI "front end"- whether that's in my IDE as a dev, a desktop app as a business user, or on a server if I'm running an agentic workflow.

MCP allows you to package tools, prompts, etc. in a way that works across any of those front ends.

Even if you don't plan on leveraging the MCP across multiple tools in that way- I do think it has some benefits in de-coupling the lifecycle of the tool development from the model/ UI.

jonfw commented on LLM Inflation tratt.net/laurie/blog/202... · Posted by u/ingve

zacksiri · a month ago

The problem described in this post has nothing to do with LLMs. It has everything to do with work culture and bureaucracy. Rules and laws that don't make sense remain because changing it requires time, energy and effort that most people in companies have either tried and failed or don't care enough to make a change.

This is one example of the "horseless carriage" AI solutions. I've begun questioning further that actually we're going into a generation where a lot of the things we are doing now are not even necessary.

I'll give you one more example. The whole "Office" stack of ["Word", "Excel", "Powerpoint"] can also go away. But we still use it because change is hard.

Answer me this question. In the near future if we could have LLMs that can traverse to massive amount of data why do we need to make excel sheets anymore? Will we as a society continue to make excel spreadsheets because we want the insights the sheet provides or do we make excel sheets to make excel sheets.

The current generation of LLM products I find are horseless carriages. Why would you need agents to make spreadsheets when you should just be able to ask the agent to give you answers you are looking for from the spreadsheet.

jonfw · a month ago

> Answer me this question. In the near future if we could have LLMs that can traverse to massive amount of data why do we need to make excel sheets anymore?

A couple of related questions- if airplanes can fly themselves with auto-pilot, why do we need steering yolks? If I have a dishwasher- why do I still keep sponges and dish soap next to my sink?

The technology is nowhere near being reliable enough that we can eschew traditional means of interacting with data. That doesn't prevent the technology from being massively useful.