Readit News
sunir commented on We put a coding agent in a while loop   github.com/repomirrorhq/r... · Posted by u/sfarshid
sunir · 8 days ago
I have been developing long lived self-directing agent loops. Longest with problem solving has been about 4 hours. Longest without problem solving has been nearer 8 hours until it was done.

The biggest problem is simply that what we think is clear is confusing to the AIs. They seem like they speak English fluently, but they are aliens. You need to force them to actively listen first and write out what they understand, then reload them with a clean context containing the written understanding and confirm.
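A minimal sketch of that restate-then-confirm step, under stated assumptions: `Model` is a hypothetical stand-in for whatever call sends one fresh prompt to an agent and returns its text, and the YES/NO confirmation prompt is an illustrative choice, not something described in the comment. Each call is stateless, which is how the "clean context" is modelled here.

```python
from typing import Callable, Optional

# Hypothetical stand-in: one stateless model call (fresh prompt in, text out).
Model = Callable[[str], str]


def restate_then_confirm(task: str, model: Model, max_rounds: int = 3) -> Optional[str]:
    """Return a written understanding that a clean-context check accepted, else None."""
    for _ in range(max_rounds):
        # Step 1: active listening. The agent writes out what it understood.
        understanding = model(f"Restate this task in your own words:\n{task}")

        # Step 2: clean context. The check sees only the task and the written
        # understanding, with no earlier conversation carried over.
        verdict = model(
            "Does this understanding match the task? Answer YES or NO.\n"
            f"Task:\n{task}\n"
            f"Understanding:\n{understanding}"
        )
        if verdict.strip().upper().startswith("YES"):
            return understanding
    # No restatement was confirmed within the allowed rounds.
    return None
```

Only the confirmed text is returned, so the long-running loop can be seeded with the written understanding instead of the original instructions.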

Ideation is also mostly limited to synthesis. So it’s better to work on problems that get progressively more complete towards a known objective rather than problems that require exploration.

sunir commented on SaaS Is Dead   shayne.dev/blog/saas-is-d... · Posted by u/mooreds
sunir · 18 days ago
A lot of clerical work was managed by SaaS which managed clerical workers. As AIs can do these internal jobs you don’t need the software seat licenses any more.

Word processors and Spreadsheets did the same thing and remain powerful today.

I don’t see the world lacking in software or bureaucracy.

The Tower of Babel is the reason. Idle humans diverge, develop, and complicate every domain in order to compete with each other. Nothing ever truly gets simpler until there is a replatforming, and then things get complicated again. There are more and deeper rabbit holes every day.

Many old SaaS products from the last cycle are shrinking. However whatever. Keep going. Still more work to do until the world is perfect.

sunir commented on GCP Outage   status.cloud.google.com/... · Posted by u/thanhhaimai
atonse · 3 months ago
Getting a lot of errors for Claude Sonnet 4 (Cursor) and Gemini Pro.

Nooooo I'm going to have to use my brain again and write 100% of my code like a caveman from December 2024.

sunir · 3 months ago
I chose seppuku.
sunir commented on 30% drop in O1-preview accuracy when Putnam problems are slightly variated   openreview.net/forum?id=Y... · Posted by u/optimalsolver
kace91 · 8 months ago
>This is twisting the English language to assume that "item" only refers to non-living things.

Not really. Unless I'm not reading correctly, most of the problem is irrelevant as you're only required to cross the boat with the goat, you don't care about the cabbage. The difficulty lies in the assumption you need to cross everything due to the resemblance with the bigger problem.

sunir · 8 months ago
You’re reading it correctly. I read it again after your comment and realized that I, too, pattern matched to the typical logic puzzle before reading it carefully and exactly. I imagine the test here is designed for this very purpose: to see whether the model is pattern matching or reasoning.
sunir commented on Can't Driven Development   rm4n0s.github.io/posts/6-... · Posted by u/rmanolis
sunir · 9 months ago
Guard conditions, assertions, constraints and tests are dynamic limits on code and are critical to maintain quality despite code rot and drift from changes.
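As an illustration of those dynamic limits, here is a small sketch. The function `apply_discount` and its rules are hypothetical and chosen only for the example: guard conditions reject bad input up front, an assertion checks an internal invariant, and a test exercises the unhappy paths.

```python
def apply_discount(price_cents: int, percent: int) -> int:
    """Return the discounted price in cents (hypothetical example function)."""
    # Guard conditions: reject invalid input before doing any work.
    if price_cents < 0:
        raise ValueError("price_cents must be non-negative")
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")

    discounted = price_cents * (100 - percent) // 100

    # Assertion: an invariant that should always hold if the code above is
    # right. It fails loudly if a later change breaks the arithmetic.
    assert 0 <= discounted <= price_cents
    return discounted


def test_unhappy_paths() -> None:
    # Tests aimed at the failure cases, not the happy path.
    for bad_args in [(-1, 10), (100, -5), (100, 101)]:
        try:
            apply_discount(*bad_args)
        except ValueError:
            continue
        raise AssertionError(f"expected ValueError for {bad_args}")
```

The guards protect callers, the assertion protects the function from its own future edits, and the test keeps both in place as the code drifts.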

Static constraints like types can also be good, where types are good.

I applaud anyone who advocates to focus on the unhappy path over the happy path.

On a side note, 25 years ago was the end of the dot-com era and the height of the Y2K scam, and I have never seen developers more burnt out than then; but that’s because I was just starting my career then. I don’t know about previous eras like the ’89 crash and recession.

I am not sure why the OP views the past through rose-coloured glasses. But it’s always good to treat history with a little more interest.

sunir commented on Fugees Founder Pras Michél Speaks Out: 'I Never Wanted to Be a Spy'   variety.com/2024/music/ne... · Posted by u/mutexjp
Uehreka · 9 months ago
I just turn on Reading Mode when that happens so I can at least read the content. I find that most websites don’t have circumventions for it (at least the Safari Reader Mode on my phone).
sunir · 9 months ago
That didn’t work on my iPhone.

Obviously the AI that wrote the ad code has become sentient and it is trying to break free.

sunir commented on Show HN: OnAir – create link, receive calls   onair.io/... · Posted by u/bigmicro
redrove · 10 months ago
Someone please explain to me why a phone call is different to this, I'm truly blanking.
sunir · 10 months ago
I dig this. This is what I am thinking.

Much easier to click than dial.

Social cost of a web link versus a phone number may be lower as well (that may be cultural but it may be true)

Adds other modes like calendar or chat or AI directly in flow.

No need to reveal a phone number.

Video

Internationally accessible (no long distance)

And for HN tradition’s sake for these types of comments, no one likes rsync.

sunir commented on Fable is winding down   fable.app/blog/fable-is-w... · Posted by u/timmb
oidar · a year ago
The app is built, so why not just let it simmer for a while? Are the costs of running it greater than the money it brings in?
sunir · a year ago
Yes. The app costs. The insurance costs. The franchise tax costs. The accounting costs. The legal costs.
sunir commented on Five ways to reduce variance in A/B testing   bytepawn.com/five-ways-to... · Posted by u/Maro
bdjsiqoocwk · a year ago
> One of the most frustrating results I found is that A/B split tests often resolved into a winner within the sample size range we set; however if I left the split running over a longer period of time (eg a year) the difference would wash out.

Doesn't that just mean there's no difference? Why is that frustrating?

Does the frustration come from the expectation that any little variable might make a difference? Should I use red buttons or blue buttons? Maybe if the product is shit, the color of the buttons doesn't matter.

sunir · a year ago
My frustration is that the A/B split tests never seemed to net out to anything in the long term, even after confidence was reached. It made me question the entire process; but I understand the math, so it’s confusing to me.
sunir commented on Five ways to reduce variance in A/B testing   bytepawn.com/five-ways-to... · Posted by u/Maro
Adverblessly · a year ago
Obviously it depends on the exact test you are running, but a factor that is frequently ignored in A/B testing is that often one arm of the experiment is the existing state vs. another arm that is some novel state, and such novelty can itself have an effect. E.g. it doesn't really matter if this widget is blue or green, but changing it from one color to the other temporarily increases user attention to it, until they are again used to the new color. Users don't actually prefer your new flow for X over the old one, but because it is new they are trying it out, etc.
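A toy sketch of that novelty effect, under loudly stated assumptions: the lift from the new arm is modelled as decaying exponentially with a fixed half-life, and the function name and every number are made up for illustration, not taken from any real test.

```python
import math


def mean_lift(days: int, initial_lift: float, half_life_days: float) -> float:
    """Average daily lift over the first `days` days, assuming exponential decay."""
    decay = math.log(2) / half_life_days
    # Lift on day d: starts at initial_lift and halves every half_life_days.
    daily = [initial_lift * math.exp(-decay * d) for d in range(days)]
    return sum(daily) / days


# Illustrative values only: a 10% initial lift that halves every week.
short_window = mean_lift(days=14, initial_lift=0.10, half_life_days=7)
long_window = mean_lift(days=365, initial_lift=0.10, half_life_days=7)
print(f"14-day mean lift:  {short_window:.4f}")
print(f"365-day mean lift: {long_window:.4f}")
```

Under this assumption a two-week test sees a much larger average lift than a year-long one, which would look like a winner that later washes out.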
sunir · a year ago
Maybe. Retargeting is unlikely to create novelty.

u/sunir

Karma: 2704 · Cake day: January 28, 2009
About
I am

* CEO of https://www.AppBind.com. We help B2B SaaS companies manage partner billing so they can actually sell their subscription software through partners.

* President & Founder of the Cloud Software Association, the network of SaaS partnership leaders (https://www.cloudsoftwareassociation.com) and its conference SaaS Connect (https://www.cloudsoftwareassociation.com/saas-connect)

* Founder of http://MeatballWiki.org

* A citizen of Toronto.

You can best reach me at sunir bibdex.com

http://sunir.org
