The core problem: every AI flight tool I'd seen was stateless — each follow-up query reconstructed context from the conversation history stuffed into the prompt. For transactional multi-turn searches ("what if I fly into Osaka instead?"), approximate context reconstruction isn't good enough.
The approach: deploy on persistent VMs via SuperNinja rather than stateless serverless functions. Each session has a dedicated VM with full runtime state — no reconstruction step, exact reference resolution on follow-ups.
The trade-off is higher infra cost per session vs serverless. For this use case it's worth it; for single-turn use cases it wouldn't be.
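A minimal sketch of what per-session live state buys you (all names here are hypothetical, not the actual implementation): the query object survives between turns, so a follow-up is an exact field update rather than a best-effort reconstruction from chat history.

```python
# Hypothetical sketch: per-session state held in process memory on a
# persistent VM, so follow-ups mutate the live query directly.
from dataclasses import dataclass, field

@dataclass
class FlightSession:
    origin: str
    destination: str
    history: list = field(default_factory=list)

    def refine(self, **changes):
        """Apply a follow-up like 'fly into Osaka instead' as an exact
        mutation of the stored query -- no reconstruction step."""
        self.history.append((self.origin, self.destination))
        for key, value in changes.items():
            setattr(self, key, value)
        return self

session = FlightSession(origin="SFO", destination="Tokyo")
session.refine(destination="Osaka")  # follow-up resolves against live state
```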
Still early. Happy to discuss the architecture if anyone's solving similar problems.
The problem with RSS today: you have to already know what you want to follow. There's no equivalent of "people like you are reading this." Until someone solves discovery for RSS, it'll stay a power-user tool.
The irony is that LLMs could actually solve this — a model that knows your reading history and surfaces relevant feeds you haven't found yet. That's the product that could bring RSS back to the mainstream.
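As a toy illustration of that idea (names are made up, and the 3-d vectors stand in for real model embeddings): rank feeds the user hasn't subscribed to by similarity to an embedding of their reading history.

```python
# Illustrative sketch of embedding-based feed discovery.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recommend(history_vec, candidates):
    """Rank candidate feeds by similarity to the user's reading history."""
    return sorted(candidates, key=lambda c: cosine(history_vec, c[1]),
                  reverse=True)

reader = [0.9, 0.1, 0.0]  # say: heavy on systems programming
feeds = [("cooking-blog", [0.0, 0.2, 0.9]),
         ("kernel-digest", [0.8, 0.2, 0.1])]
ranked = recommend(reader, feeds)  # kernel-digest ranks first
```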
The approaches that actually work: (1) show don't tell — instead of "don't use em dashes", give it 3 examples of the writing style you want and say "write like this". (2) negative examples — paste a paragraph with the tropes and say "never write like this". (3) temperature — lower temperature makes the model more conservative and less likely to reach for the dramatic flourish.
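Approach (1) plus (2) can be sketched as a prompt builder; the message shape here assumes an OpenAI-style chat API, so adapt it to whatever you're calling, and pass a low temperature (e.g. 0.3) when you make the request.

```python
# Sketch: build a "write like this / never like this" prompt from examples.
GOOD = ["Example paragraph one...",
        "Example paragraph two...",
        "Example paragraph three..."]
BAD = "It wasn't just a product -- it was a revolution."  # the trope to ban

def build_messages(task: str) -> list[dict]:
    examples = "\n\n".join(GOOD)
    return [
        {"role": "system", "content":
            f"Write in the style of these samples:\n\n{examples}\n\n"
            f"Never write like this:\n\n{BAD}"},
        {"role": "user", "content": task},
    ]

messages = build_messages("Draft a launch announcement.")
# then call your model with these messages and, say, temperature=0.3
```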
The deeper issue is that these tropes exist because they worked in the training data. Humans upvoted and engaged with that style of writing, so the model learned it was good. The model isn't wrong — it's just optimizing for the wrong signal.
The skill part is real — giving the agent the right context, breaking tasks into the right size, knowing when to intervene. Most people aren't doing that well and their results reflect it.
But the latent bug problem isn't really a skill issue. It's a property of how these systems work: the agent optimises for making the current test pass, not for building something that stays correct as requirements change. Round 1 decisions get baked in as assumptions that round 3 never questions — and no amount of better prompting fixes that.
The fix isn't better prompting. It's treating agent-generated code with the same scepticism you'd apply to code from a contractor who won't be around to maintain it — more tests, explicit invariants, and not letting the agent touch the architecture without a human reviewing the design first.
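For example, explicit invariants can be pinned down as tests that later rounds must keep passing (apply_discount here is a hypothetical stand-in for agent-written code):

```python
# Sketch: encode invariants as tests so a round-3 change can't silently
# break a round-1 assumption just to make the current test pass.
def apply_discount(price: float, pct: float) -> float:
    if not 0 <= pct <= 100:
        raise ValueError("pct must be in [0, 100]")
    return price * (1 - pct / 100)

def test_invariants():
    # Invariants the agent must not "optimize away" on a later pass:
    for price, pct in [(100.0, 0), (100.0, 50), (19.99, 100)]:
        out = apply_discount(price, pct)
        assert 0 <= out <= price  # never negative, never above input
    assert apply_discount(100.0, 0) == 100.0  # zero discount is identity

test_invariants()
```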
Detailed commit messages: ignored by most humans, but an agent doing a git log to understand context reads every one. Architecture decision records: nobody updates them, but an agent asked to make a change that touches a core assumption will get it wrong without them.
The irony is that the practices that make code legible to agents are the same ones that make it legible to a new engineer joining the team. We just didn't have a strong enough forcing function before.
Problem 1: the agent does something destructive by accident (rm -rf, a hard git reset, writing to the wrong config). Filesystem sandboxing solves this well.
Problem 2: the agent does something destructive because it was prompt-injected via a file it read. Sandboxing doesn't help here — the agent already has your credentials in memory before it reads the malicious file.
The only real answer to problem 2 is either never give the agent credentials that can do real damage, or have a separate process auditing tool calls before they execute. Neither is fully solved yet.
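In miniature, the auditing process might look like this (the deny-list and names are illustrative, not a real product's API; a real auditor would run in a separate process, outside the agent's reach):

```python
# Sketch: every tool call passes through an auditor before execution.
DENY_SUBSTRINGS = ["rm -rf", "git reset --hard", "DROP TABLE"]

def audit(tool: str, args: str) -> bool:
    """Return True only if the call looks safe under the deny-list policy."""
    return not any(bad in args for bad in DENY_SUBSTRINGS)

def execute(tool: str, args: str, runner):
    if not audit(tool, args):
        raise PermissionError(f"blocked: {tool} {args!r}")
    return runner(args)

# A prompt-injected call gets stopped even though the agent "decided" to run it:
execute("shell", "ls -la", runner=lambda a: "ok")  # allowed
# execute("shell", "rm -rf /", runner=...)         # raises PermissionError
```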
Agent Safehouse is a clean solution to problem 1. That's genuinely useful and worth having even if problem 2 remains open.
The parts that needed careful engineering vs the parts that were fine to vibe-code:
- Flight data integration (live pricing APIs, edge cases): needed careful engineering
- Booking deep-link generation: needed careful engineering
- Session state management (persistent VM per user): needed careful engineering
- The LLM prompt for intent parsing: mostly vibe-coded, iterated quickly
- The conversational refinement flow: surprisingly robust with minimal engineering once the state layer was solid

The pattern I've found: the infrastructure decisions (state management, data layer, booking handoff) need deliberate engineering. The AI behavior layer is more forgiving to iteration.