I think this is where the future of coding is. It is still useful to be a coder, the more experienced the better. But you will not write or edit a lot of lines anymore. You will organize the codebase in a way AI can handle, make architectural decisions, and organize the workflow around AI doing the actual coding.
The way I currently do this is that I wrote a small python file that I can start with
llmcode.py /path/to/repo
Which then offers a simple web interface at localhost:8080 where I can select the files to serialize and describe a task.
It then creates a prompt like this:
Look at the code files below and do the following:
{task_description}
Output all files that you need to change in full again,
including your changes, in the same format as I provide
the files below: each file starts with
filename: and ends with :filename.
Under no circumstances output any other text, no additional
info, no code formatting chars. Only the code in the
given format.
Here are the files:
somefile.py:
...code of somefile.py...
:somefile.py
someotherfile.py:
...code of someotherfile.py...
:someotherfile.py
assets/css/somestyles.css:
...code of somestyles.css...
:assets/css/somestyles.css
etc
Then llmcode.py sends it to an LLM, parses the output and writes the files back to disk.
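llmcode.py itself isn't shared, but given the format above, the serialize and parse steps are small. A minimal sketch of what they could look like; the function names and details here are mine, not the author's:

```python
import re
from pathlib import Path

def serialize(repo: Path, names: list[str]) -> str:
    """Wrap each selected file in the filename: ... :filename format."""
    blocks = []
    for n in names:
        code = (repo / n).read_text().rstrip("\n")
        blocks.append(f"{n}:\n{code}\n:{n}")
    return "\n".join(blocks)

def parse_llm_output(text: str) -> dict[str, str]:
    """Parse filename: ... :filename blocks from the LLM's reply."""
    pattern = re.compile(
        r"^(?P<name>\S+):\n(?P<body>.*?)\n:(?P=name)[ \t]*$",
        re.DOTALL | re.MULTILINE,
    )
    return {m["name"]: m["body"] for m in pattern.finditer(text)}

def write_back(repo: Path, files: dict[str, str]) -> None:
    """Write each changed file back to disk, ready for git diff."""
    for rel, body in files.items():
        target = repo / rel
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(body + "\n")
```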
I then look at the changes via "git diff".
It's quite fascinating. I often only make minor changes before accepting the "pull request" the LLM made. Sometimes I have to make no changes at all.
> You will organize the codebase in a way AI can handle, make architectural decisions, and organize the workflow around AI doing the actual coding.
This might sound silly, but I feel like it has the potential to result in more readable code.
There have been times where I split up a 300-line function just so it’s easier to feed into an LLM. Same for extracting things into smaller files and classes that individually do more limited things, so they’re easier to change.
There have been times where I pay more attention to the grouping of code blocks, or even leave a few comments along the way explaining the intent, so that LLM autocomplete works better.
I also pay more attention to naming (which does sometimes end up more Java-like but is clear, even if verbose) and try to make the code simple enough to generate tests with less manual input.
Somehow, when you understand the code yourself and (for the most part) so can your colleagues, a lot of people won’t care that much. But when the AI tools stumble and actually start slowing you down instead of speeding you up, and readable code (subjectively) results in a more positive experience, then suddenly it’s a no-brainer.
I disagree that you won’t edit lines, but otherwise I think you’re right.
At work this week I was investigating how we could auto-scale our CI. I know enough Jenkins, AWS, Perforce, PowerShell, Packer, Terraform, and C++ to be able to do this, but having the time to implement and flesh everything out is a struggle. I asked Claude to create an AMI with our workspace preloaded on it, and a user-data script that set up a Perforce workspace without syncing it, all on Windows, with the tools I mentioned. I had to make some small edits to the code to get it to match what I wanted, but for the most part, what had been 2-3 days of screwing around with a pretty clear concept in my head became a prototype running in 30 minutes. Turns out it’s quicker to sync from Perforce than it is to boot the custom AMI, but I learned that with an hour in total rather than by building things out further, and we got to look at alternatives. That’s the future to me.
Even just "organizing" the code requires great amounts of knowledge and intuition from prior experiences.
I am personally torn about the future of LLMs in this regard. Right now, even with Copilot, the benefit they give fundamentally depends on the coder that directs them, as you have noted.
What if that's no longer true in a couple of years? How would that even be different from, e.g., no-code tools or website builders today? In other words, will handwritten code stay valuable?
I personally enjoy coding so I can always keep doing it for entertainment, even if I am vastly surpassed by the machine eventually.
> Even just "organizing" the code requires great amounts of knowledge and intuition from prior experiences.
> I personally enjoy coding so I can always keep doing it for entertainment, even if I am vastly surpassed by the machine eventually.
I agree with both these takes, and I think they’re far more important than wondering whether handwritten code is valuable.
I do some DIY around the house. I can make a moderately straight cut (within tolerances for joinery use cases). A jig or circular saw makes that skill moot, but knowing I need a straight, clean cut is a transferable skill. There are also two separate skills: being able to break down and understand the problem, and being able to implement the details. In trade skills we don’t expect any one person to design, analyze, estimate, build, install and decorate anything larger than a small piece of furniture, and I think the same can be said of programming.
It’s similar to using libraries/frameworks: there will always be people who will write shitty apps with shitty unmaintainable code; we’ve been complaining about that since I’ve been programming. Those people are going to move on from not understanding their Wix websites to not understanding their AI-generated code. But it’s another tool in the belt of a professional programmer.
No one cares what badge of pride you choose to wear, only the output. If an LLM produces working solutions, that's what your audience will appreciate, not a self-imposed title you chose for the project.
Many people identify themselves with being "a coder". Surely there are jobs for "coders", and there will be in the future too. But not everyone writing programs today would qualify to do the work defined as what "a coder" does.
I like to be a "builder of systems", a "solver of problems". "Organizer" or "manager" would also fit in that description. And then what tool you use to get stuff done is not relevant.
Can we turn down the dogmatism? This is merely your opinion and clearly others disagree with you. The majority of your comment history seems to be overwhelmingly negative.
yek is 230x faster than [Repomix][1]. Here is a benchmark comparing the two, serializing the Next.js project:
    time yek
    Executed in    5.19 secs   fish   external
       usr time    2.85 secs   54.00 micros    2.85 secs
       sys time    6.31 secs  629.00 micros    6.31 secs

    time repomix
    Executed in   22.24 mins   fish   external
       usr time   21.99 mins    0.18 millis   21.99 mins
       sys time    0.23 mins    1.72 millis    0.23 mins

[1]: https://github.com/jxnl/repomix
I guess I shouldn’t be surprised that many of us have approached this in different ways. It’s neat to see multiple replies of the sort I’m going to make too, which is to share the approach I’ve been taking: concatenating, or “summarizing”, the code, with particular attention to dependency resolution. Mine is [chimeracat](https://github.com/scottvr/chimeracat).
It took the shape that it has because it started as a tool to concatenate a library I had been working on into a single ipynb file, so that I didn’t need to install the library on the remote Colab; thus the dependency graph was born (as was the ASCII graph plotter ‘phart’ that it uses). Then, as I realized this could be useful for sharing code with an LLM, I started adding the summarization capabilities and, in some sort of meta-recursive irony, worked with Claude to do so. :-)
I’ve put a collection of ancillary tools I use to aid in the pairing-with-LLM process up at https://github.com/scottvr/LLMental
While I hope you mean it is hilarious in the same spirit in which I write most of my stuff (“ludicrous” is a common word even in my documentation), I did want to ask: if you meant it in more of a disparaging way, could you flesh out any criticism?
Of course, if you meant “hilarious” similarly to how I mean “ludicrous”, thanks! And thank you for taking the time to look at it. :-)
It outputs a file tree of your repo, a list of the dependencies, and a select list of files you want to include in your prompt for the LLM, all in a single XML file.
The first time you run it, it generates a .project-context.toml config file in your repo with all your files commented out, and you can just uncomment the ones you want written in full in the context file (see the sketch below). I've found this helps when iterating on a specific part of the codebase, while keeping the full file tree gives the LLM the broader context; I always ask the LLM to request more files if needed, as it can see the full list.
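For illustration only, a generated .project-context.toml might look something like this; the paths are invented and the real schema may differ:

```toml
# .project-context.toml -- uncomment a file to include its full contents
[files]
# "src/app.py" = true
"src/api/routes.py" = true
# "tests/test_routes.py" = true
```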
The files are not sorted by priority in the output, though; I'm curious what the impact of that would be, and how much room to leave for manual config (you might want to order files differently depending on the objective of the prompt).
Has anyone built a linter that optimizes code for an LLM?
The idea would be to make the code more token-efficient (and lower its accidental perplexity), e.g. by renaming variables, fixing typos and shortening comments.
It should probably run after a normal formatter like black.
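To make the token-efficiency idea concrete, here is a toy before/after comparison; the "lint" is done by hand here, tiktoken is assumed to be installed, and a real tool would rewrite the AST instead:

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

# The same function before and after a hypothetical LLM-oriented lint pass.
verbose = """
def calculate_total_price_including_tax(base_price_amount, tax_rate_percentage):
    # This function calculates the total price including the tax.
    return base_price_amount * (1 + tax_rate_percentage / 100)
"""

terse = """
def total_with_tax(price, tax_pct):
    return price * (1 + tax_pct / 100)
"""

# Shorter names and a dropped redundant comment cut the token cost of
# every prompt that includes this file.
print(len(enc.encode(verbose)), "tokens ->", len(enc.encode(terse)), "tokens")
```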
I have a very simple bash function for this (filecontens); it ignores files based on .gitignore, skips binary files, etc.
Piped to clipboard and done.
All these other ways seem unnecessarily complicated...
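That bash function isn’t shown, but as a rough sketch of the described approach, here is a Python equivalent (not the original): it leans on git ls-files to honor .gitignore, and treats any file containing a NUL byte as binary.

```python
import subprocess
from pathlib import Path

def filecontents(repo: str = ".") -> str:
    """Concatenate all non-ignored text files, each prefixed with its path."""
    # git already knows which files .gitignore excludes.
    names = subprocess.run(
        ["git", "ls-files", "--cached", "--others", "--exclude-standard"],
        cwd=repo, check=True, capture_output=True, text=True,
    ).stdout.splitlines()
    parts = []
    for name in names:
        data = (Path(repo) / name).read_bytes()
        if b"\x00" in data:  # crude binary-file check
            continue
        parts.append(f"{name}:\n{data.decode('utf-8', 'replace')}\n")
    return "\n".join(parts)

if __name__ == "__main__":
    print(filecontents())  # pipe to the clipboard: python filecontents.py | pbcopy
```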
> edit a lot of lines
"a lot" being the keywords here.
> "I wrote a small python file that I can start with"
Which one is it, chief?
They mean the same thing, chief.
Here is a list of CLI tools that do this: https://prompt.16x.engineer/cli-tools
I also built a GUI tool that does this:
https://prompt.16x.engineer/
I have to remember this.
Can you share your function, please?
(I’ve been using RepoPrompt for this sort of thing lately.)