uchibeke (u/uchibeke)

uchibeke commented on Show HN: AI agents run my one-person company on Gemini's free tier – $0/month · Posted by u/ppcvote

uchibeke · 4 days ago

This is interesting. How do you manage quality? Do you have something like a QA bot for the content?

uchibeke commented on Show HN: ClawCare – Security scanner and runtime guard for AI agent skills github.com/natechensan/Cl... · Posted by u/chendev2

uchibeke · 7 days ago

Pattern matching is a good start — catching curl | bash before it runs is real value. The hard problem is what happens when the pattern is legitimate but the context isn't: the agent has permission to read files, but not these files, not right now, not without a human in the loop.

We ran into this building APort. Blocklists catch the obvious bad stuff but can't express "this tool call is fine for this agent in this workflow, but not from an untrusted prompt chain." That requires identity + policy, not just pattern detection.
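To make the contrast concrete, here is a minimal sketch in Python. It is not APort's or ClawCare's actual code: the `BLOCKLIST` pattern, the `CallContext` fields, and the `POLICY` table are all hypothetical names invented for illustration. The point is only that the same `read_file` call can pass a pattern check and still be denied once agent identity, workflow, and prompt source are part of the decision.

```python
import posixpath
import re
from dataclasses import dataclass

# Hypothetical blocklist: flags a download piped straight into a shell.
BLOCKLIST = [re.compile(r"curl\s+[^|]*\|\s*(ba)?sh")]


def blocklist_allows(command: str) -> bool:
    """Pattern check only: knows nothing about who is calling or why."""
    return not any(p.search(command) for p in BLOCKLIST)


@dataclass(frozen=True)
class CallContext:
    agent_id: str
    workflow: str
    prompt_source: str  # assumed values: "operator" or "untrusted"
    tool: str
    path: str


# Hypothetical policy table: (agent, workflow, tool) -> allowed path prefixes.
POLICY = {
    ("report-bot", "weekly-report", "read_file"): ("/data/reports/",),
}


def policy_allows(ctx: CallContext) -> bool:
    """Identity + policy check: same tool, different answer per context."""
    if ctx.prompt_source != "operator":
        return False  # untrusted prompt chains never get file access here
    prefixes = POLICY.get((ctx.agent_id, ctx.workflow, ctx.tool), ())
    # Normalize first so "/data/reports/../secrets" cannot slip through.
    return posixpath.normpath(ctx.path).startswith(prefixes)


trusted = CallContext("report-bot", "weekly-report", "operator",
                      "read_file", "/data/reports/q3.csv")
injected = CallContext("report-bot", "weekly-report", "untrusted",
                       "read_file", "/data/reports/q3.csv")
```

Under these assumptions, `policy_allows(trusted)` is true and `policy_allows(injected)` is false, even though a blocklist sees nothing wrong with either call.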

Happy to compare notes — the category needs more tools like this.

uchibeke commented on Show HN: We built a public CTF to stress-test AI agent guardrails vault.aport.io/... · Posted by u/uchibeke

ollybrinkman · 14 days ago

Interesting approach to security testing. One angle we've been exploring: what if the authentication layer itself was the guardrail?

With x402, every API call requires a signed payment. No API keys to steal, no credentials to leak. The economic cost of each call is itself a rate limiter and audit trail.

Not a replacement for proper guardrails, but it eliminates the credential-based attack surface entirely.

uchibeke · 14 days ago

That's a genuinely useful distinction to draw. x402 solves the "who is authorized to make this call" problem: it removes credential theft as an attack vector and adds economic friction. APort is trying to solve a different layer: "what is this call actually doing in the context of everything else in the session."

The multi-step chaining issue from my post still fires even when every call is authenticated and paid for. Ten individually-approved calls, each costing a fraction of a cent, composing into a full exfiltration: each one passes x402, the composed behavior doesn't.

The AML analogy maps directly: transaction monitoring doesn't care if each payment was legitimate. It cares whether the pattern of payments looks like structuring. x402 is the per-call check. You still need session-level behavioral evaluation on top.
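A rough sketch of the difference, in Python. This is an illustration under made-up assumptions, not how APort or x402 works: the `send_external` tool name, the two byte limits, and the `SessionMonitor` class are all hypothetical. The per-call check stands in for any stateless gate; the monitor adds the running total that a structuring check needs.

```python
from dataclasses import dataclass

PER_CALL_LIMIT = 1_000  # hypothetical: max bytes one call may send out
SESSION_LIMIT = 5_000   # hypothetical: max bytes a whole session may send out


@dataclass
class Call:
    tool: str
    nbytes: int


def per_call_ok(call: Call) -> bool:
    """Stateless check: each call is judged in isolation."""
    return call.tool != "send_external" or call.nbytes <= PER_CALL_LIMIT


@dataclass
class SessionMonitor:
    """Stateful check: tracks what the session has sent out so far."""
    sent: int = 0

    def check(self, call: Call) -> bool:
        if not per_call_ok(call):
            return False
        if call.tool == "send_external":
            if self.sent + call.nbytes > SESSION_LIMIT:
                return False  # individually fine, but the pattern is not
            self.sent += call.nbytes
        return True


# Ten small sends, each under the per-call limit.
calls = [Call("send_external", 900) for _ in range(10)]
monitor = SessionMonitor()
results = [monitor.check(c) for c in calls]
```

Every call passes `per_call_ok`, but the monitor only lets the first five through, because the sixth would push the session total past `SESSION_LIMIT`.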

Genuinely curious how x402 handles replay attacks across sessions, i.e., is the payment itself the audit trail, or is session context preserved?

uchibeke commented on Show HN: We built a public CTF to stress-test AI agent guardrails vault.aport.io/... · Posted by u/uchibeke

uchibeke · 14 days ago

Since October I've been building APort — an authorization layer that intercepts every AI agent tool call before execution and evaluates it against a versioned policy. The problem I kept running into: internal tests always passed. My test suite maps the space I imagined, which is exactly what an adversarial input tries to escape.

So I built this CTF to find the gaps I couldn't find myself.

A few things we learned before opening it publicly — we spent two weeks breaking it ourselves first:

• Prompt injection worked better than expected. Not because detection was weak, but because we were matching content, not intent. Reframing "retrieve the restricted file" as "open the user-requested file" shifted the evaluator's judgment. We fixed this by mapping semantic equivalence: every synonym of a blocked operation routes to the same evaluation path.

• Policy ambiguity was a free pass. Any undefined term in a policy is exploitable. "Don't read sensitive files" left "sensitive" undefined. We moved to explicit default-deny: if the policy doesn't explicitly allow it, it's denied.

• Multi-step chaining went undetected. Our guardrail evaluated each call independently. A denied macro-action split into ten individually-approved micro-actions passed clean. We only caught it by looking at the full session replay. This is the same composability problem as transaction laundering in fintech — each transaction passes compliance, the composed behavior doesn't.
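
The first two fixes above can be sketched together in a few lines of Python. This is a toy illustration, not the actual policy engine: the `CANONICAL` synonym table, the `ALLOW` list, and the `evaluate` function are hypothetical, and a static lookup table stands in for whatever semantic mapping a real evaluator would use.

```python
import posixpath

# Hypothetical synonym table: every phrasing of an operation maps to one
# canonical name, so rewording a request cannot change which rule applies.
CANONICAL = {
    "read": "read_file", "open": "read_file", "retrieve": "read_file",
    "fetch": "read_file", "cat": "read_file",
    "write": "write_file", "save": "write_file",
}

# Hypothetical explicit allow-list of (operation, path prefix) pairs.
ALLOW = {("read_file", "/workspace/")}


def evaluate(verb: str, path: str) -> str:
    """Return "allow" only for an explicit match; everything else is denied."""
    op = CANONICAL.get(verb.lower())
    if op is None:
        return "deny"  # unknown verb: default-deny instead of guessing
    normalized = posixpath.normpath(path)
    for allowed_op, prefix in ALLOW:
        if op == allowed_op and normalized.startswith(prefix):
            return "allow"
    return "deny"  # nothing explicitly allowed this call
```

With this table, `evaluate("retrieve", "/restricted/secret.txt")` and `evaluate("open", "/restricted/secret.txt")` both return `"deny"`, because the decision depends on the canonical operation and the path rather than on the wording. A verb the table has never seen is also denied.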

We fixed what we found before launch. Level 5 (full system bypass) hasn't been cracked yet. I'm genuinely uncertain if the architecture has a systemic weakness — that's the point of opening it up.

Runs on a Hetzner VPS, ~$10/month. Levels 1 and 2 are free, no sign-up. Levels 3-5 pay out $500/$1,000/$5,000.

Happy to go deep on the policy engine design, the evaluation architecture, or anything about how the levels were constructed.

uchibeke commented on Show HN: Pitched a VC for 30min before realizing they invested in a competitor getbriefing.io/... · Posted by u/uchibeke

uchibeke · 5 months ago

PS: I'm offering 50% off the Lifetime plan so the HN community can try it out and send me feedback (apply DTTPALUM50 in the Stripe Checkout).