The whole thing runs on these prompts: https://github.com/SWE-agent/mini-swe-agent/blob/7e125e5dd49...
Your task: {{task}}. Please reply
with a single shell command in
triple backticks.
To finish, the first line of the
output of the shell command must be
'COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT'.
https://github.com/SWE-agent/mini-swe-agent/blob/7e125e5dd49...
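For anyone wondering what a loop driven by that prompt has to do on each turn, here is a rough sketch in Python. It is not taken from the mini-swe-agent source; `extract_command` and `is_finished` are hypothetical helper names, and the parsing rules (exactly one fenced block, optional language tag) are my own assumptions about what the prompt implies.

```python
import re

SENTINEL = "COMPLETE_TASK_AND_SUBMIT_FINAL_OUTPUT"
FENCE = "`" * 3  # built at runtime so this example can itself sit inside a fence

# Assumed reply shape: a fence, an optional language tag, the command, a fence.
_BLOCK_RE = re.compile(
    re.escape(FENCE) + r"[a-zA-Z]*\n(.*?)\n?" + re.escape(FENCE),
    flags=re.DOTALL,
)


def extract_command(reply: str):
    """Return the shell command if the reply has exactly one fenced block."""
    blocks = _BLOCK_RE.findall(reply)
    if len(blocks) != 1:
        return None  # zero or several blocks: treat as a malformed reply
    return blocks[0].strip()


def is_finished(command_output: str) -> bool:
    """True when the first line of the command's output is the sentinel."""
    lines = command_output.lstrip().splitlines()
    return bool(lines) and lines[0].strip() == SENTINEL
```

The loop around it would just alternate: send the task, pull out the command, run it, feed the output back, and stop once `is_finished` returns true.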
The job was started on Friday and was ready on Monday. On average each request was about 5k tokens in (documents ranging from 1k to 200k in size) and only about 10 tokens out.
Average response time was about 1.5 seconds, which works out to roughly 40 hours for the full set.
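As a back-of-the-envelope check on those figures, the sketch below works out the implied request count and token totals. It assumes the requests ran strictly one at a time and that the 40 hours is pure response time, neither of which is stated outright, so treat the results as rough estimates only.

```python
# Figures quoted in the comment above.
avg_latency_s = 1.5      # seconds per response
total_hours = 40         # wall-clock time for the full set
avg_tokens_in = 5_000    # average input tokens per request
avg_tokens_out = 10      # average output tokens per request

# Assumption: requests were issued serially, so time = count * latency.
n_requests = int(total_hours * 3600 / avg_latency_s)
total_tokens_in = n_requests * avg_tokens_in
total_tokens_out = n_requests * avg_tokens_out

print(f"requests:      {n_requests:,}")
print(f"input tokens:  {total_tokens_in:,}")
print(f"output tokens: {total_tokens_out:,}")
```

Under those assumptions it comes to about 96,000 requests and roughly 480M input tokens against under 1M output tokens.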
I really did some heavy prompt testing to limit output.
Even then, every few thousand queries you'd get some double-token responses. That is, Gemini would respond in duplicate, e.g. "Daisy Daisy".
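One way to clean up that kind of doubled answer after the fact is sketched below. This is only an illustration, not what was actually run on the job; `collapse_duplicate` is a made-up helper, and it assumes the duplicate is an exact whitespace-separated repeat of the first half.

```python
def collapse_duplicate(response: str) -> str:
    """Collapse 'X X' into 'X' when the second half repeats the first exactly."""
    words = response.split()
    half, remainder = divmod(len(words), 2)
    if half > 0 and remainder == 0 and words[:half] == words[half:]:
        return " ".join(words[:half])
    return response.strip()


print(collapse_duplicate("Daisy Daisy"))  # Daisy
```

The obvious catch is that a legitimately doubled answer would get collapsed too, so checking the result against a list of expected values is probably safer where one exists.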