lutzleonhardt (u/lutzleonhardt)

lutzleonhardt commented on Brokk: AI for Large Codebases brokk.ai... · Posted by u/handfuloflight

corysama · 10 months ago

How large is "Large"? Are we testing on Unreal Engine? :D

I tested it with Ghidra recently and got very good results.

lutzleonhardt commented on Brokk: AI for Large Codebases brokk.ai... · Posted by u/handfuloflight

I'd be interested to try this out. I'm especially keen on AI tools that implement a native RAG workflow. I've given Cursor documentation links, populated my codebase with relevant READMEs and diagram files that I'm hoping might provide useful context, and yet when I ask it to assist on some refactoring task it often spends 10-20 minutes simply grepping for various symbol names and reading through file matches before attempting to generate a response. This doesn't seem like an efficient way for an LLM to navigate a medium-sized codebase. And for an IDE with first-class LLM tooling, it is a bit surprising that it doesn't seem to provide powerful vector-based querying capabilities out of the box — if implemented well, a Google-like search interface to one's codebase could be useful to humans as well as to LLMs.

What does this flow look like in Brokk? Do models still need to resort to using obsolete terminal-based CLI tools in order to find stuff?

lutzleonhardt · 10 months ago

We implemented a multi-step process to find the required context:

1. Quick Context: Shows the most relevant files, ranked by a PageRank algorithm (static analysis) and semantic embeddings (JLama inference engine). The inputs are the instructions and the AI workspace fragments (i.e. files).

2. Deep Scan: A more capable LLM receives the summaries of the AI workspace files (plus the instructions) and returns a recommendation of files and tests. It also recommends the type of inclusion for each (editable, read-only, or summary/skeleton).

3. Agentic Search: The AI has access to a set of tools for finding the required files. The tools are not limited to grep/rg. Instead, it can:
- find symbols (classes, methods, ...) in the project
- ask for summaries/skeletons of files
- fetch class or method implementations
- find usages of symbols (where is x used?)
- list call sites (in/out)
- ...
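
To make step 1 a bit more concrete, here is a minimal sketch of how a PageRank score and an embedding similarity could be blended into one ranking. This is not Brokk's actual code: the function names, the `weight` parameter, the toy dependency graph, and the hand-written 2-D "embeddings" are all assumptions for illustration. A real setup would get the graph from static analysis and the vectors from an embedding model.

```python
import math

def pagerank(graph, damping=0.85, iterations=50):
    """Power-iteration PageRank over {file: [files it references]}."""
    nodes = set(graph) | {t for targets in graph.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = graph.get(node, [])
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A file with no outgoing references spreads its rank evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank

def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def quick_context(graph, embeddings, query_embedding, weight=0.5, top_k=3):
    """Blend structural importance and semantic similarity (hypothetical scoring)."""
    ranks = pagerank(graph)
    max_rank = max(ranks.values())
    scores = {}
    for path, vector in embeddings.items():
        structural = ranks.get(path, 0.0) / max_rank  # normalise to 0..1
        semantic = cosine(vector, query_embedding)
        scores[path] = weight * structural + (1.0 - weight) * semantic
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Toy inputs (made up): who references whom, plus fake 2-D embeddings.
graph = {
    "App.java": ["Parser.java", "Util.java"],
    "Parser.java": ["Util.java"],
    "Util.java": [],
}
embeddings = {
    "App.java": [1.0, 0.0],
    "Parser.java": [0.0, 1.0],
    "Util.java": [0.6, 0.8],
}
ranked = quick_context(graph, embeddings, query_embedding=[0.0, 1.0], top_k=2)
print(ranked)
```

In this toy example the heavily referenced `Util.java` ends up ahead of `Parser.java`, even though `Parser.java` matches the query vector more closely. That is the kind of trade-off the `weight` knob would control.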

You can read more about this in the Brokk.ai blog: https://brokk.ai/blog/brokk-under-the-hood

lutzleonhardt commented on Brokk: AI for Large Codebases brokk.ai... · Posted by u/handfuloflight

lutzleonhardt · 10 months ago

The amazing thing here is that the Brokk AI can access your code like an IDE: it can ask for usages or gather the summary of a file before deciding to fetch the implementation of a method! It mimics how a developer navigates the codebase. And this is more reliable and token-efficient than the usual grep/rg approach.
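
For a feel of what "summary first, implementation later" means, here is a rough sketch using Python's standard `ast` module. It is only an illustration, not how Brokk works internally: the `SymbolIndex` class, its method names, and the two sample files are hypothetical.

```python
import ast

class SymbolIndex:
    """Toy stand-in for IDE-style code intelligence over Python sources."""

    def __init__(self, sources):
        # sources: {file path: source text}
        self.sources = sources
        self.trees = {path: ast.parse(text) for path, text in sources.items()}

    def find_symbols(self, pattern):
        """Return (path, name, kind) for classes/functions whose name contains pattern."""
        hits = []
        for path, tree in self.trees.items():
            for node in ast.walk(tree):
                if isinstance(node, (ast.ClassDef, ast.FunctionDef)) and pattern in node.name:
                    kind = "class" if isinstance(node, ast.ClassDef) else "function"
                    hits.append((path, node.name, kind))
        return sorted(hits)

    def skeleton(self, path):
        """Return top-level signatures only, with bodies elided."""
        lines = []
        for node in self.trees[path].body:
            if isinstance(node, ast.ClassDef):
                lines.append(f"class {node.name}:")
                for item in node.body:
                    if isinstance(item, ast.FunctionDef):
                        args = ", ".join(a.arg for a in item.args.args)
                        lines.append(f"    def {item.name}({args}): ...")
            elif isinstance(node, ast.FunctionDef):
                args = ", ".join(a.arg for a in node.args.args)
                lines.append(f"def {node.name}({args}): ...")
        return "\n".join(lines)

    def implementation(self, path, name):
        """Return the full source of one class or function, or None."""
        for node in ast.walk(self.trees[path]):
            if isinstance(node, (ast.ClassDef, ast.FunctionDef)) and node.name == name:
                return ast.get_source_segment(self.sources[path], node)
        return None

    def usages(self, name):
        """Return (path, line) for each call whose callee is named `name`."""
        hits = []
        for path, tree in self.trees.items():
            for node in ast.walk(tree):
                if isinstance(node, ast.Call):
                    func = node.func
                    called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
                    if called == name:
                        hits.append((path, node.lineno))
        return sorted(hits)

# Two made-up files standing in for a project.
SOURCES = {
    "billing.py": "class Invoice:\n    def total(self, tax):\n        return self.net * (1 + tax)\n",
    "report.py": "def render(invoice):\n    return str(invoice.total(0.2))\n",
}
index = SymbolIndex(SOURCES)
print(index.skeleton("billing.py"))              # cheap overview first
print(index.usages("total"))                     # where is it called?
print(index.implementation("billing.py", "total"))  # full body only when needed
```

The skeleton is shorter than the file it summarises, so an agent that reads skeletons first and requests bodies only for the methods it cares about should spend fewer tokens than one that reads every grep match in full.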

lutzleonhardt commented on Brokk: AI for Large Codebases brokk.ai... · Posted by u/handfuloflight

soco · 10 months ago

Is there also something to read for those of us who will never watch videos?

lutzleonhardt · 10 months ago

Hi, yes there are some blog posts:

https://brokk.ai/blog/brokk-under-the-hood