Readit News
dbmikus commented on AI models need a virtual machine   blog.sigplan.org/2025/08/... · Posted by u/azhenley
nostrademons · 4 months ago
What you're speaking of is basically the capability security model [1], where you must explicitly pass into your software agent the capabilities that they are allowed to access, and there is physically no mechanism for them to do anything not on that list.

Unfortunately, no mainstream OS actually implements the capability model, despite some prominent research attempts [2], some half-hearted attempts at commercializing the concept that have largely failed in the marketplace [3], and some attempts to bolt capability-based security on top of other OSes that have also largely failed in the marketplace [4]. So the closest thing to capability-based security that is actually widely available in the computing world is a virtual machine, where you place only the tools that provide the specific capabilities you want to offer in the VM. This is quite imperfect - many of these tools are a lot more general than true capabilities should be - but again, modern software is not built on the principle of least privilege because software that is tends to fail in the marketplace.

[1] https://en.wikipedia.org/wiki/Capability-based_security

[2] https://en.wikipedia.org/wiki/EROS_(microkernel)

[3] https://fuchsia.dev/

[4] https://sandstorm.io/
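To make the "explicitly pass in the capabilities" idea concrete, here is a minimal sketch in Python. Every name in it (`ReadFileCap`, `make_read_cap`, `agent`) is hypothetical and invented for illustration. Python itself cannot enforce the "physically no mechanism" part, since an agent running in the same interpreter could still import anything it likes. The sketch only shows the calling convention: the agent acts through handles it was handed and nothing else.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ReadFileCap:
    """Hypothetical capability: read access to exactly one named file."""
    name: str
    _read: Callable[[], str]

    def read(self) -> str:
        return self._read()


def make_read_cap(store: Dict[str, str], name: str) -> ReadFileCap:
    # The granting side closes over the store; the agent receives only this
    # narrow handle, never the store itself.
    return ReadFileCap(name=name, _read=lambda: store[name])


def agent(caps: Dict[str, ReadFileCap], wanted: str) -> str:
    # The agent can only act through handles it was explicitly given.
    cap = caps.get(wanted)
    if cap is None:
        return f"denied: no capability for {wanted}"
    return cap.read()


# In-memory stand-in for a filesystem (an assumption made for the example).
files = {"notes.txt": "hello", "secrets.txt": "api-key"}
granted = {"notes.txt": make_read_cap(files, "notes.txt")}

result_allowed = agent(granted, "notes.txt")
result_denied = agent(granted, "secrets.txt")
```

A real capability system would make the handle unforgeable at the OS or runtime level. This is why the VM ends up as the practical substitute: the isolation boundary does the enforcing that the language cannot.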

dbmikus · 4 months ago
I'm going to be pedantic and note that iOS and Android both have the capability security model for their apps.

And totally agree that instead of reinventing the wheel here, we should just lift from how operating systems work, for two reasons:

1. there's a bunch of work and proven systems there already

2. it uses tools that exist in training data, instead of net new tools

dbmikus commented on Waymo granted permit to begin testing in New York City   cnbc.com/2025/08/22/waymo... · Posted by u/achristmascarl
nipponese · 4 months ago
Don't you think it would be easier and cheaper to gatekeep than to build up an enforcement and judgement workforce?
dbmikus · 4 months ago
The people that are good but dangerous drivers will drive well and safely during tests, so you won't catch them.
dbmikus commented on Sunny days are warm: why LinkedIn rewards mediocrity   elliotcsmith.com/linkedin... · Posted by u/smitec
Henchman21 · 4 months ago
Marketing is lying. Convincing someone to buy something they don’t actually need? That’s a drain on society. It’s become so pervasive we go to great lengths to justify it. But at its core it’s fundamentally dishonest.
dbmikus · 4 months ago
You can market products that people need. A big part of this is explaining and educating someone about what your product does; another part is just getting the word out there. Every website homepage is more or less a marketing page.

If no one is marketing a product, then nobody knows about it.

dbmikus commented on AI groups spend to replace low-cost 'data labellers' with high-paid experts   ft.com/content/e17647f0-4... · Posted by u/eisa01
the_brin92 · 5 months ago
I've been doing this for one of the major companies in the space for a few years now. It has been interesting to watch how much more complex the projects have gotten over the last few years, and how many issues the models still have. I have a humanities background which has actually served me well here as what constitutes a "better" AI model response is often so subjective.

I can answer any questions people have about the experience (within code of conduct guidelines so I don't get in trouble...)

dbmikus · 5 months ago
What kinds of data are you working on? Coding? Something else?

I've been curious how much these AI models look for more niche coding language expertise, and what other knowledge frontiers they're focusing on (like law, medicine, finance, etc.)

dbmikus commented on Org tutorials   orgmode.org/worg/org-tuto... · Posted by u/dargscisyhp
silcoon · 5 months ago
I wish there were something like Obsidian with the same support for org-mode that Emacs has. A few pros:

- Organizing notes in org-mode is much quicker
- The best support for lists (and I use lists most of the time)
- Tags and properties
- Perfect integration with agenda
- Great TODOs support
- Code blocks with highlights, execution and results

dbmikus · 5 months ago
Check out Org-Roam: https://www.orgroam.com/

It has some Obsidian-like features inside Org Mode.

So, if you're looking for an easier-to-use UI, it's not it, but if you're looking for Obsidian-like linking and backlinking, it has that.

dbmikus commented on AccountingBench: Evaluating LLMs on real long-horizon business tasks   accounting.penrose.com/... · Posted by u/rickcarlino
dbmikus · 5 months ago
This is cool. A bunch of interesting things here:

  1. Agent can create its own tools and save them to memory
  2. You create a SQL (and web app?) workbench per agent run
  3. Grok fell off a cliff in the last month. Was this consistent over multiple runs?
  4. Agents have a difficult time backtracking. Would unwinding system state and agent context make backtracking better? (Harder to implement this, though)
  5. Since each new month only uses final state from previous month, agent has no way to understand why error occurred in previous month

Cool experiment! Was it difficult building the observable SQL workbench? And how many humans-in-the-loop did you have?
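A rough sketch of what "unwinding system state and agent context" could look like, assuming both fit in memory and can be deep-copied. `AgentRun`, `checkpoint`, and `rollback` are hypothetical names, not anything from the AccountingBench setup, whose internals are not described here. A real run backed by a SQL database would more likely lean on transactions or database snapshots.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class AgentRun:
    """Hypothetical container for system state plus agent context."""
    state: Dict[str, Any] = field(default_factory=dict)
    context: List[str] = field(default_factory=list)
    _checkpoints: List[Tuple[Dict[str, Any], List[str]]] = field(default_factory=list)

    def checkpoint(self) -> int:
        # Copy both pieces so later mutations cannot leak into the snapshot.
        self._checkpoints.append((copy.deepcopy(self.state), list(self.context)))
        return len(self._checkpoints) - 1

    def rollback(self, checkpoint_id: int) -> None:
        state, context = self._checkpoints[checkpoint_id]
        self.state = copy.deepcopy(state)
        self.context = list(context)
        # Drop checkpoints taken after the one we returned to.
        del self._checkpoints[checkpoint_id + 1:]


run = AgentRun(state={"cash": 100}, context=["opened books"])
cp = run.checkpoint()

# The agent takes a wrong turn...
run.state["cash"] = -40
run.context.append("posted a bad journal entry")

# ...and unwinds both the system state and its own context together.
run.rollback(cp)
```

Restoring the context alongside the state is the part that makes this harder than a plain database rollback: the agent also has to "forget" the steps it took after the checkpoint.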

dbmikus commented on Launch HN: K-Scale Labs (YC W24) – Open-Source Humanoid Robots    · Posted by u/codekansas
codekansas · 6 months ago
Thanks!

We're all pretty cross-stack - there are some hardware people and some software people, but the product is quite integrated. Personally, my time has been mostly focused on the RL stack recently, and after there are more robots in the wild, I suspect my time will transition to working on building this data feedback loop.

I try to answer questions pretty actively on our Discord so happy to chat there about whatever you like

dbmikus · 6 months ago
I'll hop in there!
dbmikus commented on Launch HN: K-Scale Labs (YC W24) – Open-Source Humanoid Robots    · Posted by u/codekansas
dbmikus · 6 months ago
This is awesome! How much of your team's time goes into working on the physical hardware, versus RL simulation environments, versus managing all the training data from the real robot and the simulations?

I'm super interested in learning more about the training process of world and robotics models and the data challenges there.

dbmikus commented on Bots are overwhelming websites with their hunger for AI data   theregister.com/2025/06/1... · Posted by u/Bender
rglover · 6 months ago
> Some of the bots identify themselves, but some don't. Either way, the respondents say that robots.txt directives – voluntary behavior guidelines that web publishers post for web crawlers – are not currently effective at controlling bot swarms.

Is anybody tracking the IP ranges of bots or anything similar that's reliable?

It seems like they're taking the "what are you gonna do about it" approach to this.

Edit: Yes [1]

[1] https://github.com/FabrizioCafolla/openai-crawlers-ip-ranges

dbmikus · 6 months ago
Many bots use residential IP proxy networks, so they come from the same IPs that humans use
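For illustration, here is a minimal sketch of the IP-range check being discussed, using Python's standard `ipaddress` module. The ranges below are reserved documentation blocks used as placeholders, not real crawler ranges, and `is_known_bot` is a hypothetical helper name.

```python
import ipaddress
from typing import Iterable, List, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]

# Placeholder CIDR blocks (reserved for documentation), standing in for a
# published list of crawler ranges.
PUBLISHED_BOT_RANGES = ["192.0.2.0/24", "198.51.100.0/24"]


def parse_ranges(cidrs: Iterable[str]) -> List[Network]:
    return [ipaddress.ip_network(cidr) for cidr in cidrs]


def is_known_bot(ip: str, networks: Iterable[Network]) -> bool:
    # True only if the address falls inside one of the listed ranges.
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in networks)


networks = parse_ranges(PUBLISHED_BOT_RANGES)

declared_crawler = is_known_bot("192.0.2.17", networks)
# Stand-in for a request routed through a residential proxy: the address is
# simply not in any published range, so the check cannot flag it.
proxied_crawler = is_known_bot("203.0.113.5", networks)
```

The second lookup is the problem in a nutshell: a range list only catches bots that send traffic from addresses their operators have published.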
dbmikus commented on Meta invests $14.3B in Scale AI to kick-start superintelligence lab   nytimes.com/2025/06/12/te... · Posted by u/RyanShook
ml-anon · 6 months ago
Every frontier lab has moved (or is moving) away from Scale to other vendors/platforms/bespoke solutions.
dbmikus · 6 months ago
What vendors or platforms are frontier labs moving to?

u/dbmikus

Karma: 615
Cake day: July 30, 2015
About
Working on Fixpoint - connecting AI agents to web data (https://fixpoint.co/)

YC S22 alum

[ my public key: https://keybase.io/dbmikus; my proof: https://keybase.io/dbmikus/sigs/hKAT57HrL4WNrtLuYDHaD10eRu99ZJiCylweoAKoGx4 ]

YC Badge: 0x00c0b5e3efeab93cdadd4b5136d53985e6dd303f
