yeahyeahok commented on AI agent benchmarks are broken   ddkang.substack.com/p/ai-... · Posted by u/neehao
RansomStark · 2 months ago
I really like the CMU Agents Company approach of simulating a real-world environment [0]. Is it perfect? No. Does it show you what to expect in production? Not really, but it's much closer than anything else I've seen.

[0] https://the-agent-company.com/

yeahyeahok · 2 months ago
Damn. Super bullish on CMU. Somehow, they seem routinely left out of the top CS schools discussion, at least in mainstream discourse: MIT, Stanford, Cal, .... I've seen a disproportionate amount of stellar research come from there. Also, interestingly, I have met really incompetent people from all three of the other top schools but have yet to meet an incompetent CMU SCS alum -- wtf are they feeding them in Pittsburgh??
