we're leaning away from being a framework in favor of being a library, specifically because we're seeing teams who want to implement their own business logic for most core agentic capabilities, but who hit problems with concurrency, fairness, and resource contention (think many agents reading 1000s of documents in parallel).
unlike most frameworks, we built the orchestrator, hatchet, first (we've been working on it for over a year) and are basing these patterns on what we've seen our most successful companies already doing.
in short: pickaxe brings the orchestration and best practices, but you're free to implement to your own requirements.
Depending on execution order, the tool is either called or a cached value is returned. That way local state can be replayed, which is why the "no side effects" rule is in place.
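For intuition, here's a minimal sketch of that replay pattern in plain TypeScript. It is not pickaxe's actual internals and the names are illustrative, but it shows why the surrounding code must be side-effect free: on replay, recorded results are returned instead of executing the tool again.

```typescript
// Conceptual sketch of replay-with-memoization (not pickaxe's internals).
// Each tool call is recorded by step index; on replay, the cached result is
// returned instead of re-executing the tool.
type StepResult = { value: unknown };

class ReplayLog {
  private history: StepResult[] = [];
  private cursor = 0;

  async step<T>(fn: () => Promise<T>): Promise<T> {
    // Replaying: return the recorded value instead of calling the tool again.
    if (this.cursor < this.history.length) {
      return this.history[this.cursor++].value as T;
    }
    // First execution: call the tool and record its result.
    const value = await fn();
    this.history.push({ value });
    this.cursor++;
    return value;
  }
}

// Usage: the same agent function can be re-run after a crash; completed tool
// calls are served from the log, so execution resumes where it left off.
async function agentRun(log: ReplayLog) {
  const docs = await log.step(() => fetchDocuments("query")); // cached on replay
  const summary = await log.step(() => summarize(docs));      // cached on replay
  return summary;
}

// Placeholder tools for the sketch.
async function fetchDocuments(q: string): Promise<string[]> { return [`doc for ${q}`]; }
async function summarize(docs: string[]): Promise<string> { return docs.join("\n"); }
```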
I like it. Just one question: what's the recommended way to have a chat assistant agent with multiple tools? The message history would need to be passed to the very top-level agent.run call, wouldn't it?
we'll be continuously improving the docs on this project, but since pickaxe is built on hatchet it supports concurrency control [1]. so for a chat use case, you can pass the chat history to the top-level agent, but propagate cancellation to other message runs in the session to handle the case where the user sends a few messages in a row (a rough sketch of the idea is below). we'll work up an example of this in the patterns section!
[1] https://docs.hatchet.run/home/concurrency#cancel-in-progress
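Here is a rough sketch of that session-scoped cancellation in plain TypeScript rather than the pickaxe/hatchet API (the `handleUserMessage` helper and its shape are hypothetical): the full history is passed into each top-level run, and a newer message aborts the in-flight run for the same session, mirroring the cancel-in-progress strategy in [1].

```typescript
// Conceptual sketch (plain TypeScript, not the pickaxe/hatchet API): one run per
// chat session, where a newer message cancels the in-progress run for that session.
const inflight = new Map<string, AbortController>();

async function handleUserMessage(
  sessionId: string,
  history: { role: string; content: string }[],
  runAgent: (
    history: { role: string; content: string }[],
    signal: AbortSignal,
  ) => Promise<string>,
): Promise<string | undefined> {
  // Cancel the previous run for this session, mirroring CANCEL_IN_PROGRESS.
  inflight.get(sessionId)?.abort();

  const controller = new AbortController();
  inflight.set(sessionId, controller);
  try {
    // The full message history is passed to the top-level agent run.
    return await runAgent(history, controller.signal);
  } finally {
    if (inflight.get(sessionId) === controller) inflight.delete(sessionId);
  }
}
```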
One use case I imagine is key here is background/async agents, OpenAI Codex/Jules style, so it's great if I can durably run them with Pickaxe (btw I believe I've read somewhere in the temporal docs or a webinar that Codex was built on that ;). But how do I get a real-time and resumable message stream back to the client? The user might reload the page or return after 15 minutes, etc. I wasn't able to think of an elegant way to model this in a distributed system.
we'll have agent->client streaming on the very short-term roadmap (order of weeks), but we haven't broadly rolled it out since it's not 100% ready for prime time.
we do already have wait-for-event support for client->agent eventing [1] in this release!
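To make "wait for event" concrete, here is a conceptual sketch in plain TypeScript, not the pickaxe API (the `waitForEvent`/`emitEvent` helpers are hypothetical): the agent parks on a keyed promise until the client emits the matching event.

```typescript
// Conceptual sketch of client -> agent eventing: the agent run blocks on a
// promise keyed by event name until the client-facing API emits that event.
const waiters = new Map<string, (payload: any) => void>();

function waitForEvent<T>(key: string): Promise<T> {
  return new Promise<T>((resolve) => waiters.set(key, resolve));
}

// Called from the client-facing API (e.g. when the user clicks "approve").
function emitEvent(key: string, payload: unknown) {
  waiters.get(key)?.(payload);
  waiters.delete(key);
}

// Inside an agent run: pause until the user responds, then continue.
async function agentStep(runId: string) {
  const approval = await waitForEvent<{ approved: boolean }>(`approval:${runId}`);
  return approval.approved ? "continue" : "stop";
}
```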
The reason I ask is that I've had a lot of success using different models for different tasks, constructing the system prompt specifically for each task, and also choosing between the "default" long assistant/tool_call/user/(repeat) message history vs. constantly pruning it (bad for caching but sometimes good for performance). And it would be nice to know a library like this could allow experimenting with these strategies.
under the hood we're using the vercel ai sdk to make tool calls, so this is easily extended [1]. this is the only "opinionated" api for calling llm apis that's "bundled" within the sdk, and we were torn on how to expose it for this exact reason, but since it's so common we decided to include it.
some things we've been considering are overloading `defaultLanguageModel` with a map for different use cases, or allowing users to "eject" the tool picker and customize it as needed (a rough sketch of the underlying ai sdk call is below the links). i've opened a discussion [2] to track this.
[1] https://github.com/hatchet-dev/pickaxe/blob/main/sdk/src/cli...
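For reference, a rough sketch of what the underlying vercel ai sdk call looks like with a hand-rolled per-use-case model map. The map itself is the idea being discussed above, not a shipped pickaxe API, and this uses the ai sdk v4-style `generateText`/`tool` helpers directly:

```typescript
import { generateText, tool } from "ai";
import { openai } from "@ai-sdk/openai";
import { anthropic } from "@ai-sdk/anthropic";
import { z } from "zod";

// Hypothetical per-use-case model map: route cheap classification to one model,
// heavier reasoning to another. This mirrors the `defaultLanguageModel` overload
// idea above; it is not a pickaxe API.
const models = {
  classify: openai("gpt-4o-mini"),
  reason: anthropic("claude-3-5-sonnet-latest"),
};

// A plain ai sdk tool definition.
const searchDocs = tool({
  description: "Search internal documents",
  parameters: z.object({ query: z.string() }),
  execute: async ({ query }) => `results for ${query}`,
});

export async function answer(question: string) {
  const { text } = await generateText({
    model: models.reason,          // swap per task
    tools: { searchDocs },
    maxSteps: 3,                   // let the model call tools, then answer
    prompt: question,
  });
  return text;
}
```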
I could see this being incredible if it had a set of performance-related queries or ran EXPLAIN ANALYZE and offered some interpreted results.
Can this be run fully locally with a local LLM?
Would love to see some sort of architecture overview in the docs
The top-level docs have a section on "Deploying workers" but I think there are more components than that?
It's cool that there's a Helm chart, but the docs don't really say what resources it would deploy.
https://docs.hatchet.run/self-hosting/docker-compose
...shows four different Hatchet services plus, unexpectedly, both a Postgres server and RabbitMQ. I can't see anywhere that describes what each of those does.
Also, in much of the docs it's not very clear where the boundary lies between Hatchet Cloud and the self-hostable OSS part of Hatchet.
The simplest way to run hatchet is with `hatchet-lite` [0], which bundles all internal services. For most deployments we recommend running these components separately, hence the multiple services in the Helm chart [1]. RabbitMQ is now an optional dependency, used for internal-service messages in higher-throughput deployments [2].
Your workers are always run as a separate process.
[0] https://docs.hatchet.run/self-hosting/hatchet-lite
[1] https://docs.hatchet.run/self-hosting/improving-performance#...
[2] https://hatchet.run/launch-week-01/pg-only-mode
edit: missed your last question -- currently the self-hosted version includes everything in cloud except managed workers
> It is such a better system than dedicated long running worker listeners, because then you can just scale your HTTP workers as needed.
This depends on the use case - with long-running listeners, you get the benefit of reusing caches, database connections, and disk, and from a pricing perspective, if your task spends a lot of time waiting on i/o operations (or waiting for an event), you don't get billed separately for CPU time. A long-running worker can handle thousands of concurrently running functions on cheap hardware.
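A quick illustration of that last point in plain Node/TypeScript (no hatchet APIs involved): thousands of i/o-bound functions can sit in flight at once on a single process, because each one spends nearly all of its time awaiting rather than using CPU.

```typescript
// Minimal illustration: one long-running process holding 5,000 i/o-bound tasks
// in flight concurrently.
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

async function ioBoundTask(id: number): Promise<number> {
  await sleep(5_000); // stands in for an LLM call, DB query, or webhook wait
  return id;
}

async function main() {
  const started = Date.now();
  const results = await Promise.all(
    Array.from({ length: 5_000 }, (_, i) => ioBoundTask(i)),
  );
  console.log(`${results.length} tasks finished in ${Date.now() - started}ms`);
}

main();
```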
> I also find that DAGs tend to get ugly really fast because it generally involves logic. I'd prefer that logic to not be tied into the queue implementation because it becomes harder to unit test. Much easier reason about if you have the HTTP endpoint create a new task, if it needs to.
We usually recommend that DAGs which require too much logic (particularly fanout to a dynamic number of workflows) be implemented as a durable task instead.
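Roughly what that looks like inside a durable task, as a hedged sketch: `ctx.runChild` here is a stand-in for whatever child-spawning primitive the orchestrator provides, not a specific hatchet/pickaxe signature.

```typescript
// Illustrative only: `DurableCtx` and `runChild` are hypothetical names.
interface DurableCtx {
  runChild<T>(taskName: string, input: unknown): Promise<T>;
}

// Instead of encoding branching/fanout into a static DAG, the durable task
// decides at runtime how many children to spawn and simply awaits them.
export async function processBatch(
  ctx: DurableCtx,
  input: { documentIds: string[] },
) {
  // Dynamic fanout: one child run per document, known only at runtime.
  const children = input.documentIds.map((id) =>
    ctx.runChild<{ summary: string }>("summarize-document", { documentId: id }),
  );
  const results = await Promise.all(children);
  return { summaries: results.map((r) => r.summary) };
}
```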
also a fan of a single state for billing, metering, and entitlements. any plans for a Go SDK for these?