waynenilsen (u/waynenilsen)

waynenilsen commented on LLM Inflation tratt.net/laurie/blog/202... · Posted by u/ingve

roxolotl · 22 days ago

My PM said they’d written a bunch of tickets for a project yesterday morning that we hadn’t fully scoped yet. I was pleasantly surprised because I can’t complain if they are going to get ahead of things and start scaffolding tickets.

Of course when I went to read them they were 100% slop. The funniest requirement were progress bars for actions that don’t have progress. The tickets were, even if you assume the requirements weren’t slop, at least 15 points a piece.

But ok maybe with all of these new tools we can respond by implementing these insane requirements. The real problem is what this article is discussing. Each ticket was also 500-700 words. Requirements that boil down to a single if statement were described in prose. While this is hilarious the problem is it makes them harder to understand.

I tried to explain this and they just said “ok fine rewrite them then”. Which I did in maybe 15min because there wasn’t actually much to write.

At this point I’m at a loss for how to even work with people that are so convinced these things will save time because they look at the volume of the output.

waynenilsen · 22 days ago

The software requirements phase is becoming increasingly critical to the development lifecycle and that trend will continue. I have started writing very short tickets and having claude code inflate them, then I polish those. I often include negative prompts at this point so claude may have included "add a progress bar for xyz" and i simply add "do not" in front of those things that do not make sense. The results have been excellent.

waynenilsen commented on Claude Opus 4.1 anthropic.com/news/claude... · Posted by u/meetpateltech

haaz · 23 days ago

it is barely an improvement according to their own benchmarks. not saying thats a bad thing, but not enough for anybody to notice any difference

waynenilsen · 23 days ago

i think its probably mostly vibes but that still counts, this is not in the charts

> Windsurf reports Opus 4.1 delivers a one standard deviation improvement over Opus 4 on their junior developer benchmark, showing roughly the same performance leap as the jump from Sonnet 3.7 to Sonnet 4.

waynenilsen commented on URL-Driven State in HTMX lorenstew.art/blog/bookma... · Posted by u/lorenstewart

Twey · a month ago

The example URL here, though, is still not (helpfully) bookmarkable because the contents of page 2 will change as new items are added. To get truly bookmarkable list URLs, the best approach I've seen is ‘page starting from item X’, where X is an effectively-unique ID for the item (e.g. a primary key, or a timestamp to avoid exposing IDs).

waynenilsen · a month ago

also significantly better for performance

waynenilsen commented on Study mode openai.com/index/chatgpt-... · Posted by u/meetpateltech

waynenilsen · a month ago

i need tree conversations more now than ever

waynenilsen commented on Ex-Waymo engineers launch Bedrock Robotics to automate construction techcrunch.com/2025/07/16... · Posted by u/boulos

kevinmpeterson · a month ago

Probably - if cost goes down to operate a machine, then the base machines can be smaller. Will eventually be interesting to go for much bigger changes... ditch the cabs, lower profiles, place your cameras in smart spots, move the pivot points so they are best for the job instead of best for a person. Vehicle distribution might change, too - heavy machines are tools and the tool you choose is a combo of best quality, right cost, speed, etc. That landscape will shift.

The variety of machines and their specificity is super fascinating and very specific. Definitely will change.

waynenilsen · a month ago

I am very much looking forward to seeing how things evolve. Do you think the bitter lesson applies here? I am assuming you are gathering reams of data from the controls and any cameras of manually operated machines. I have been thinking about the bitter lesson in many contexts recently and I think for something like this where there really isn't anything proprietary about the method of operation, doing some kind of data consignment agreement or just up front paying for data could be very valuable in and of itself. Even unlabeled data can later be autolabeled.

waynenilsen commented on Ex-Waymo engineers launch Bedrock Robotics to automate construction techcrunch.com/2025/07/16... · Posted by u/boulos

kevinmpeterson · a month ago

I'm the CTO and one of the founders of Bedrock. I was very pleasantly surprised to see the excitement from this crowd! Happy to answer any questions about us (and will look through the comment threads here). We're looking for really really awesome MLEs and software engineers, so if you're interested take a look at our careers page https://bedrockrobotics.com/careers

waynenilsen · a month ago

do you see a change in the machinery landscape due to the removal of humans?

waynenilsen commented on How and where will agents ship software? instantdb.com/essays/agen... · Posted by u/stopachka

jen729w · a month ago

Please don’t use a vibe-coded app for anything important.

I use Claude. I like Claude. But I’ve backed away from having Claude actually write my code other than in the most limited circumstances.

I caught it copying one of my TS Interfaces, for example. And modifying, then using, the copy. So my type-checks pass, yay! But wait what?

It wrote a test for a tricky bit of code. The test wouldn’t pass. So it re-wrote it in a way that couldn’t possibly fail, mocking all elements inside the test itself.

I’m not anti-AI. But I wouldn’t trust anything vibe-coded above the importance of, say, Wordle.

waynenilsen · a month ago

Review and good rules are still critical. Current agent state is still hyperspeed junior engineer. Providing examples helps a lot when scaffolding something similar to something else.

waynenilsen commented on Show HN: TrendFi – I built AI trading signals that self-optimize trend.fi... · Posted by u/wolfman1

malshe · 2 months ago

I checked this fund's performance going to the first link you shared. In the last about 3 years it underperformed S&P 500 by 2.5%

waynenilsen · 2 months ago

I think you're looking at the wrong strategy. The first link is quantbase which has their own strategies which are not good. Quiver creates strategies you can subscribe to which have excellent performance

waynenilsen commented on Show HN: TrendFi – I built AI trading signals that self-optimize trend.fi... · Posted by u/wolfman1

downrightmike · 2 months ago

Serious question: What good are reported trades delayed by months?

waynenilsen · 2 months ago

The S&P 500 historically returns about 10% annually with a Sharpe ratio around 0.5-0.6. This fund's metrics show it outperforming on both absolute and risk-adjusted bases. The 36.35% CAGR is excellent.

Now, imagine if you could trade on the information when they do.

waynenilsen commented on Show HN: TrendFi – I built AI trading signals that self-optimize trend.fi... · Posted by u/wolfman1

waynenilsen · 2 months ago

perhaps team up with https://www.getquantbase.com/ that is how i invest in https://www.quiverquant.com/strategies/s/Congress%20Buys/ its a decent platform designed for rebalancing / taxes etc