Readit News
sReinwald commented on Nanobot: Ultra-Lightweight Alternative to OpenClaw   github.com/HKUDS/nanobot... · Posted by u/ms7892
rafram · 5 days ago
But this could be done for 1/100 the cost by only delegating the news-filtering part to an LLM API. No reason not to have an LLM write you the code, too! But putting it in front of task scheduling and API fetching — turning those from simple, consistent tasks to expensive, nondeterministic ones — just makes no sense.
sReinwald · 5 days ago
Like I said, the first examples are fairly trivial, and you absolutely don't need an LLM for those. A good agent architecture lets the LLM orchestrate but the actual API calls are deterministic (through tool use / MCPs).

My point was specifically about the news filtering part, which was something I had tried in the past but never managed to solve to my satisfaction.

The agent's job in the end for a morning briefing would be:

  - grab weather, calendar, Todoist data using APIs or MCP  
  - grab news from select sources via RSS or similar, then filter relevant news based on my interests and things it has learned about me  
  - synthesize the information above
The steps that explicitly require an LLM are the last two. The value is in the personalization through memory and my feedback but also the ability for the LLM to synthesize the information - not just regurgitate it. Here's what I mean: I have a task to mow the lawn on my Todoist scheduled for today, but the weather forecast says it's going to be a bit windy and rain all day. At the end of the briefing, the assistant can proactively offer to move the Todoist task to tomorrow when it will be nicer outside because it knows the forecast. Or it might offer to move it to the day after tomorrow, because it also knows I have to attend my nephew's birthday party tomorrow.
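The split described above can be sketched roughly like this. Everything here is a hypothetical illustration (the function names, the data shapes, and the `fake_llm` stand-in are made up, not any particular framework's API): the fetchers are plain deterministic code, and only the final call hands work to the model.

```python
# Sketch of the briefing flow: deterministic fetchers, LLM only for
# the filter/synthesize step. All names here are illustrative stubs.
from dataclasses import dataclass

@dataclass
class BriefingContext:
    weather: dict
    tasks: list
    news: list

def fetch_weather() -> dict:
    # In a real setup: a deterministic API or MCP tool call.
    return {"today": "windy, rain", "tomorrow": "sunny"}

def fetch_tasks() -> list:
    return [{"task": "mow the lawn", "due": "today"}]

def fetch_news() -> list:
    return [{"title": "New kernel release"}, {"title": "Android 17 beta"}]

def build_briefing(llm) -> str:
    # Everything up to here is plain code; only the llm() call below
    # (filtering + synthesis) is nondeterministic.
    ctx = BriefingContext(fetch_weather(), fetch_tasks(), fetch_news())
    return llm(
        "Filter the news to my interests, summarize, and flag any task "
        f"that conflicts with the forecast:\n{ctx}"
    )

# Stand-in for a real LLM client so the sketch runs end-to-end.
fake_llm = lambda prompt: f"Briefing based on {prompt.count('title')} headlines"
print(build_briefing(fake_llm))
```

The point of the shape: swapping `fake_llm` for a real client changes nothing about the deterministic half, which is what keeps the expensive, nondeterministic surface small.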

sReinwald commented on Nanobot: Ultra-Lightweight Alternative to OpenClaw   github.com/HKUDS/nanobot... · Posted by u/ms7892
loveparade · 5 days ago
But can't you do the same using appropriate MCP servers with any of the LLM providers? Even just a generic browser MCP is probably enough to do most of these things. And ChatGPT has Tasks that are also proactive/scheduled. Not sure if Claude has something similar.

If all you want to do is schedule a task, there are much easier solutions (a few lines of Python) than installing something this heavy in a VM that comes with a whole bunch of security nightmares.

sReinwald · 5 days ago
> But can't you do the same just using appropriate MCP servers with any of the LLM providers?

Yeah, absolutely. And that was going to be my approach for a personal AI assistant side project. No need to reinvent the wheel writing a Todoist integration when MCPs exist.

The difference is where it runs. ChatGPT Tasks and MCP through the Claude/OpenAI web interfaces run on their infrastructure, which means no access to your local network: your Home Assistant instance, your NAS, your printer. A self-hosted agent on a Mac mini or your old laptop can talk to all of that.

But I think the big value-add here might be "disposable automation". You could set up a Home Assistant automation to check the weather and notify you when rain is coming because you're drying clothes on the line outside - that's 5 minutes of config for something you might need once. Telling your AI assistant "hey, I've got laundry on the line. Let me know if rain's coming and remind me to grab the clothes before it gets dark" takes 10 seconds, and you never think about it again. The agent has access to weather forecasts, maybe even your smart home weather station in Home Assistant, and it can spin up a sub-agent that polls those once every x minutes and pings your phone when it needs to.
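A throwaway watcher like that boils down to very little code. This is purely a sketch - the forecast source and the notifier are hypothetical stubs standing in for a real weather API or Home Assistant, not anything these agent frameworks actually expose:

```python
# Minimal "disposable automation" sketch: poll a forecast, fire a
# one-off reminder, then stop. get_forecast and notify are stubs.
import time

def rain_expected(forecast: dict, threshold: float = 0.5) -> bool:
    return forecast.get("rain_probability", 0.0) >= threshold

def watch_laundry(get_forecast, notify, interval_s: int = 600, max_polls: int = 6):
    """Poll until rain is likely or we run out of daylight, then ping once."""
    for _ in range(max_polls):
        if rain_expected(get_forecast()):
            notify("Rain incoming - grab the laundry off the line!")
            return True
        time.sleep(interval_s)
    notify("Getting dark soon - time to bring the laundry in.")
    return False
```

The appeal is that nobody writes or maintains this file: the agent generates the equivalent on demand and throws it away afterwards.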

sReinwald commented on Nanobot: Ultra-Lightweight Alternative to OpenClaw   github.com/HKUDS/nanobot... · Posted by u/ms7892
gergo_b · 5 days ago
I have no idea. The single thing I can think of is that it can have a memory, but you can do that with even less code. Just get a VPS, create a folder, run CC in it, and tell it to save things into MD files. You can access it via your phone using Termux.
sReinwald · 5 days ago
You could, but Claude Code's memory system works well for specialized tasks like coding - not so much for a general-purpose assistant. It stores everything in flat markdown files, which means you're pulling in the full file regardless of relevance. That costs tokens and dilutes the context the model actually needs.

An embedding-based memory system (Letta, Mem0, or a self-built PostgreSQL + pgvector setup) lets you retrieve selectively, grabbing only what's relevant to the current query. That's a much better fit for anything beyond a narrow use case. Your assistant doesn't need to know your location and address when you're asking it to look up whether sharks are indeed older than trees, but it probably should know where you live when you ask it about the weather, or about good Thai restaurants near you.
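A toy sketch of the retrieval idea, with a bag-of-words counter standing in for a real embedding model and an in-memory list standing in for pgvector (purely illustrative - a production setup would embed with a model and query a vector store, but the score-then-take-top-k shape is the same):

```python
# Selective memory retrieval in miniature: score stored memories
# against the query, inject only the best matches into context.
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Bag-of-words stand-in for a real embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

memories = [
    "user lives in Berlin, Prenzlauer Berg",
    "user prefers Thai food, dislikes cilantro",
    "user asked whether sharks are older than trees",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = embed(query)
    return sorted(memories, key=lambda m: cosine(q, embed(m)), reverse=True)[:k]

print(retrieve("good thai restaurants near me"))
```

The restaurant query pulls in the food preference and leaves the shark trivia out of context, which is exactly the token-dilution problem flat markdown files can't solve.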

sReinwald commented on Nanobot: Ultra-Lightweight Alternative to OpenClaw   github.com/HKUDS/nanobot... · Posted by u/ms7892
loveparade · 5 days ago
What are people using these things for? The use cases I've seen look a bit contrived, and I could ask Claude or ChatGPT to do it directly.
sReinwald · 5 days ago
Disclaimer: Haven't used any of these (was going to try OpenClaw but found too many issues). I think the biggest value-add is agency. Chat interfaces like Claude/ChatGPT are reactive, but agents can be proactive. They don't need to wait for you to initiate a conversation.

What I've always wanted: a morning briefing that pulls in my calendar (CalDAV), open Todoist items, weather, and relevant news. The first three are trivial API work. The news part is where it gets interesting and more difficult - RSS feeds and news APIs are firehoses. But an LLM that knows your interests could actually filter effectively. E.g., I want tech news but don't care about Android (iPhone user) or macOS (Linux user). That kind of nuanced filtering is hard to express as traditional rules but trivial for an LLM.
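In outline, the filtering step is just a prompt around a maintained interest profile - the profile contents, function names, and the `llm` callable here are all made up for illustration:

```python
# Sketch of LLM-based news filtering: the interest profile is plain
# text the agent maintains from feedback; llm is any chat-completion
# callable. Everything here is illustrative, not a real API.
INTERESTS = """\
Likes: Linux, self-hosting, LLM tooling, iPhone/iOS
Ignores: Android, macOS, crypto price news
"""

def filter_headlines(headlines: list[str], llm) -> list[str]:
    prompt = (
        "Given this interest profile:\n" + INTERESTS +
        "\nReturn only the relevant headlines, one per line:\n" +
        "\n".join(headlines)
    )
    return [h for h in llm(prompt).splitlines() if h.strip()]
```

The nuance ("tech news, but not Android, because iPhone user") lives in the profile text rather than in brittle keyword rules, which is the whole trick.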

sReinwald commented on Claude Code: connect to a local model when your quota runs out   boxc.net/blog/2026/claude... · Posted by u/fugu2
the_harpia_io · 5 days ago
Interesting approach for cost management, but one angle nobody seems to be discussing: the security implications.

When you fall back to a local model for coding, you lose whatever safety guardrails the hosted model has. Claude's hosted version has alignment training that catches some dangerous patterns (like generating code that exfiltrates env vars or writes overly permissive IAM policies). A local Llama or Mistral running raw won't have those same checks.

For side projects this probably doesn't matter. But if your Claude Code workflow involves writing auth flows, handling secrets, or touching production infra, the model you fall back to matters a lot. The generated code might be syntactically fine but miss security patterns that the larger model would catch.

Not saying don't do it - just worth being aware that "equivalent code generation" doesn't mean "equivalent security posture."

sReinwald · 5 days ago
Not saying the frontier models aren't smarter than the ones I can run on my two 4090s (they absolutely are), but I feel like you're exaggerating the security implications a bit.

We've seen some absolutely glaring security issues with vibe-coded apps / websites that did use Claude (most recently Moltbook).

No matter whether you're vibe coding with frontier models or local ones, you simply cannot rely on the model knowing what it is doing. Frankly, if you rely on the model's alignment training for writing secure authentication flows, you are doing it wrong. Claude Opus or Qwen3 Coder Next isn't responsible if you ship insecure code - you are.

sReinwald commented on Agent Skills   agentskills.io/home... · Posted by u/mooreds
smithkl42 · 7 days ago
That does raise the question of what the value is of a "skill" vs a "command". Claude Code supports both, and it's not entirely clear to me when we should use one vs the other - especially if skills work best as, well, commands.
sReinwald · 7 days ago
IMO the value and differentiating factor is basically the ability to organize them cleanly with accompanying scripts and references, which are only loaded on demand. A skill by itself (without scripts or references) is essentially just a slash command with metadata.

Another value add is that theoretically agents should trigger skills automatically based on context and their current task. In practice, at least in my experience, that is not happening reliably.

sReinwald commented on UK government launches fuel forecourt price API   gov.uk/guidance/access-th... · Posted by u/Technolithic
Glawen · 7 days ago
Yes, but the real feature that makes it viable is that petrol stations in France can change prices only once a day. I forget how it works in the UK, but in Germany prices change wildly depending on the hour of the day. For example, stations show a low price in the morning so that workers who are running late notice it and plan to fill up on the way back, only to find the price 10-20 cents higher at 17:00.
sReinwald · 7 days ago
I don't see how that makes it uniquely viable in France. Germany has something very much like this too. And we've had it for nearly 13 years.

> Since 31 August 2013 companies which operate public petrol stations or have the power to set their prices are required to report price changes for the most commonly used types of fuel, i.e. Super E5, Super E10 and Diesel “in real time” to the Market Transparency Unit for Fuels. This then passes on the incoming price data to consumer information service providers, which in turn pass it on to the consumer.

As a consumer, there is no direct API by the MTS-K that you can use, but there are some services like Tankerkoenig which pass this data on to you. I have used their API in Home Assistant before I switched to an EV.

https://www.bundeskartellamt.de/EN/Tasks/markettransparencyu...
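For illustration, picking the cheapest station out of a Tankerkoenig-style payload might look like this. The field names (`stations`, `e10`, `diesel`) are from memory and should be checked against their API docs; in practice you'd fetch the JSON over HTTP with your API key rather than use an inline sample.

```python
# Toy consumer of a Tankerkoenig-style price payload. Field names are
# assumptions from memory - verify against the actual API docs.
def cheapest_station(payload: dict, fuel: str = "e10") -> dict:
    stations = [s for s in payload.get("stations", []) if s.get(fuel)]
    return min(stations, key=lambda s: s[fuel])

sample = {
    "ok": True,
    "stations": [
        {"name": "Station A", "e10": 1.799, "diesel": 1.699},
        {"name": "Station B", "e10": 1.759, "diesel": 1.729},
    ],
}
print(cheapest_station(sample)["name"])  # -> Station B
```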

sReinwald commented on Hacking Moltbook   wiz.io/blog/exposed-moltb... · Posted by u/galnagli
fny · 7 days ago
"Buy a mac mini, copy a couple of lines to install" is marketing fluff. It's incredibly easy to trip moltbot into a config error, and its context management is also a total mess. The agent will outright forget the last 3 messages after compaction occurs even though the logs are available on disk. Finally, it never remembers instructions properly.

Overall, it's a good idea but incredibly rough due to what I assume is heavy vibe coding.

sReinwald · 7 days ago
It's been a few days, but when I tried it, it just completely bricked itself because it tried to install a plugin (Matrix) even though that was already installed. That wasn't some esoteric config or anything - it bricked itself right in the onboarding process.

When I investigated the issue, I found a bunch of hardcoded developer paths and a handful of other issues and decided I'm good, actually.

    sre@cypress:~$ grep -r "/Users/steipete" ~/.nvm/versions/node/v24.13.0/lib/node_modules/openclaw/ | wc -l
    144
And bonus points:

    sre@cypress:~$ grep -Fr "workspace:*" ~/.nvm/versions/node/v24.13.0/lib/node_modules/openclaw/ | wc -l
    41
Nice build/release process.

I really don't understand how anyone just hands this vibe coded mess API keys and access to personal files and accounts.

sReinwald commented on Ask HN: Do you have any evidence that agentic coding works?    · Posted by u/terabytest
edude03 · 21 days ago
I have the same experience despite using Claude every day. A funny anecdote:

Someone I know wrote the code and the unit tests for a new feature with an agent. The code was subtly wrong - fine, it happens - but worse, the 30 or so tests they added put 10 minutes on the test run time, and they all essentially amounted to `expect(true).to.be(true)` because the LLM had worked around the code not working in the tests.

sReinwald · 21 days ago
From my experience: TDD helps here - write (or have the AI write) tests first, review them as the spec, then let it implement.

But when I use Claude Code, I also supervise it somewhat closely. I don't let it go wild, and if it starts to make changes to existing tests, it better have a damn good reason or it gets the hose again.

The failure mode here is letting the AI manage both the implementation and the testing. May as well ask high schoolers to grade their own exams. Everyone got an A+, how surprising!
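That failure mode in miniature - a vacuous test that passes no matter what the code does, next to a reviewed, spec-first test that actually pins down behavior (`slugify` is a made-up example function):

```python
# Contrast between a test an unsupervised agent converges on and one
# that encodes an actual spec. slugify is illustrative only.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())

def test_vacuous():
    # Passes even if slugify is deleted or broken - tests nothing.
    assert True

def test_slugify_spec():
    # Reviewed, written before the implementation: this IS the spec.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Extra   Spaces ") == "extra-spaces"

test_vacuous()
test_slugify_spec()
print("both tests ran")
```

Reviewing the second kind of test *before* letting the agent implement is what keeps it from grading its own exam.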
