One question: how do you handle the handoff between filesystem state and external API state? For example, if an agent, mid-workflow, modifies a local file and then a subsequent external API call fails, the rollback semantics get complicated fast.
For B2B automation use cases this is where most agent deployments break down. The agent does 80% of a task (enrich lead, draft email, update CRM) but when step 3 fails, nothing has a record of what happened in steps 1-2. The workflow becomes orphaned.
Does Terminal Use have any primitives for workflow checkpointing or idempotent retries?
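To make the ask concrete, here is a minimal sketch of the kind of primitive I mean: a step journal persisted to disk, so a retried run replays completed steps instead of re-executing them. Everything here (`CheckpointJournal`, `run_step`, the JSONL format) is hypothetical, not an actual Terminal Use API:

```python
import json
import os

class CheckpointJournal:
    """Append-only record of completed workflow steps, persisted to disk
    so a retried run can skip work that already succeeded."""

    def __init__(self, path):
        self.path = path
        self.done = {}
        # Replay the journal from a previous (possibly failed) run.
        if os.path.exists(path):
            with open(path) as f:
                for line in f:
                    rec = json.loads(line)
                    self.done[rec["step"]] = rec["result"]

    def run_step(self, name, fn):
        # Idempotent retry: if this step already succeeded, return its
        # recorded result instead of running the side effect again.
        if name in self.done:
            return self.done[name]
        result = fn()
        # Record the step only after it succeeds; a step that raises
        # leaves no journal entry and will be retried next run.
        with open(self.path, "a") as f:
            f.write(json.dumps({"step": name, "result": result}) + "\n")
        self.done[name] = result
        return result
```

With something like this, the "enrich, draft, update CRM" flow keeps a record of steps 1-2 even when step 3 blows up, and a rerun picks up where it left off rather than leaving the workflow orphaned.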
One pattern I have seen work well for the business version of this: a "company intelligence" database where everything known about a prospect company gets accumulated in one place over time. Homepage content, job postings, news mentions, funding history, tech stack signals, all deduplicated and queryable.
The challenge on the B2B side is the same as personal data: the data comes in from 8 different sources in 8 different formats, often with conflicts (two sources disagree on headcount, three sources have different founding dates). Your approach of controlling the schema from the start rather than trying to normalize later is the right call. Schema drift is what kills most long-term data projects.
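One way to control the schema while still tolerating conflicting feeds is to store per-source observations with provenance and resolve conflicts at read time. A rough sketch of that idea, with an assumed resolution rule (source priority first, recency as tiebreaker) rather than anyone's actual pipeline:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Observation:
    field: str        # e.g. "headcount" or "founded"
    value: object
    source: str       # which feed reported it
    observed_at: str  # ISO date; sorts correctly as a string

def resolve(observations, priority):
    """Pick one value per field: prefer higher-priority sources,
    break ties by the most recent observation."""
    best = {}
    for obs in observations:
        rank = (priority.get(obs.source, 0), obs.observed_at)
        if obs.field not in best or rank > best[obs.field][0]:
            best[obs.field] = (rank, obs.value)
    return {field: value for field, (_, value) in best.items()}
```

The point is that the canonical schema stays fixed (one resolved value per field) while the raw observations keep every source's claim, so when two sources disagree on headcount you can change the resolution rule later without re-ingesting anything.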
What storage engine are you using? And how do you handle temporal data - do you snapshot state over time or just keep the latest version of each entity?
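On the temporal question, the pattern I would reach for is versioned rows with validity intervals (slowly-changing-dimension style) rather than overwriting the latest value. A toy sketch, purely illustrative of the shape, with timestamps as ISO date strings:

```python
class EntityHistory:
    """Keep every version of an entity with a validity interval instead of
    overwriting, so 'as of' queries can reconstruct past state."""

    def __init__(self):
        # Each row: (entity_id, attrs, valid_from, valid_to); valid_to None
        # marks the currently live version.
        self.rows = []

    def upsert(self, entity_id, attrs, at):
        for i, (eid, a, frm, to) in enumerate(self.rows):
            if eid == entity_id and to is None:
                if a == attrs:
                    return  # no change, no new version
                self.rows[i] = (eid, a, frm, at)  # close the live version
        self.rows.append((entity_id, attrs, at, None))

    def as_of(self, entity_id, when):
        # Return the attrs whose validity interval contains `when`.
        for eid, attrs, frm, to in self.rows:
            if eid == entity_id and frm <= when and (to is None or when < to):
                return attrs
        return None
```

The trade-off versus keeping only the latest version is storage and query complexity, but for prospect data the history is often the signal (headcount trajectory, funding cadence), so losing it to overwrites is expensive.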