gpt-repository-loader as-is works pretty well in helping me achieve better responses. Eventually, I thought it would be cute to load itself into GPT-4 and have GPT-4 improve it. I was honestly surprised by PR#17. GPT-4 was able to write a valid example repo and an expected output, and throw in a small curveball by adjusting .gptignore. I did tell GPT the output file format in two places: 1.) in the preamble when I prompted it to make a PR for issue #16 and 2.) as a string in gpt_repository_loader.py, both of which are indirect ways to infer how to build a functional test. However, I don't think I explained to GPT in English anywhere how .gptignore works at all!
I wonder how far GPT-4 can take this repo. Here is the process I'm following for developing:
- Open an issue describing the improvement to make
- Construct a prompt - start with using gpt_repository_loader.py on this repo to generate the repository context, then append the text of the opened issue after the --END-- line.
- Try not to edit any code GPT-4 generates. If there is something wrong, continue to prompt GPT to fix whatever it is.
- Create a feature branch for the issue and create a pull request based on GPT's response.
- Have a maintainer review, approve, and merge.
I am going to try to automate the steps above as much as possible. Really curious how tight the feedback loop will eventually get before something breaks!
The initial prompt would be, "person wants to do x, here is the file list of this repo: ...., give me a list of files that you'd want to edit, create or delete" -> take the list, try to fit their contents into 32k tokens and re-prompt with "user is trying to achieve x, here are the most relevant files with their contents: ..., give me a git commit in the style of git patch/diff output". From playing around with it today, I think this approach would work rather well and could be a huge step up from AI line autocompletion.
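A rough sketch of that two-prompt flow, with the actual model call left out (the prompt wording follows the description above; the 4-characters-per-token budget estimate and the function names are my own assumptions):

```python
# Sketch of the two-step prompting workflow described above.
# Step 1 asks the model which files matter; step 2 re-prompts with
# their contents and asks for a git-style patch.

def build_file_list_prompt(goal, file_paths):
    """Step 1: ask the model which files it wants to edit, create or delete."""
    listing = "\n".join(file_paths)
    return (
        f"A person wants to do: {goal}\n"
        f"Here is the file list of this repo:\n{listing}\n"
        "Give me a list of files that you'd want to edit, create or delete."
    )

def build_patch_prompt(goal, relevant_files, token_budget=32000):
    """Step 2: fit file contents into the token budget, ask for a patch."""
    parts, used = [], 0
    for path, contents in relevant_files:
        cost = len(contents) // 4  # rough chars-per-token estimate, not a tokenizer
        if used + cost > token_budget:
            break  # stop before overflowing the context window
        parts.append(f"--- {path} ---\n{contents}")
        used += cost
    body = "\n".join(parts)
    return (
        f"The user is trying to achieve: {goal}\n"
        f"Here are the most relevant files with their contents:\n{body}\n"
        "Give me a git commit in the style of git patch/diff output."
    )
```

Each prompt would then be sent to the model, with the step-1 response deciding which files to load for step 2.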
https://github.com/jerryjliu/llama_index
and/or
https://github.com/hwchase17/langchain
Edit: This, btw, is also why I think this popped up on the Hacker News frontpage a short while ago: https://github.com/pgvector/pgvector
I'm happy to wait even 30-60 seconds for this which I can easily evaluate, criticize (and the model will correct it) and then proceed to just patch and move on. I think the results from this will be much better with the 32k model, but remains to be seen.
It's best illustrated by the old joke:
A good chunk of all bugs in software are down to the requirements being insufficiently well specified. Further, many bugs are the discovery of new requirements when an informal specification encounters reality. "Read from standard input into this byte array" doesn't specify what to do when the input exceeds the byte array.
When you overflow the buffer, you get a "well, obviously you're supposed to not do that"... but that wasn't stated at all.
When the function keeps going after a newline or a null byte or whatever, there's another "well obviously you're supposed to stop at those points". That was also not specified.
and so on.
By the point you're specifying all these cases and what to do in each, it's so specific and stilted that you might as well be using a programming language.
You... might wanna consider a self-hosted alternative for that use case, or at least do like, a `| wc` to get an idea of what you're potentially sending before calling the API.
[1] - https://help.openai.com/en/articles/7127956-how-much-does-gp...
I wonder if the future will just be software cobbled together with shitty AI code that no one understands, with long loops and deep call stacks and abstractions on top of abstractions, while tech priests take a prompt-and-pray approach to eventually building something that does kind of what they want.
Or to hell with priests! Build some temple where users themselves can come leave prompts for the general AI to hear and maybe put out a fix for some app running on their tablet devices.
There's plenty of software out there that fits this description if you just remove "AI" from the statement. There's nothing new about bad codebases. Now it just costs pennies and is written in seconds instead of thousands paid to an outsourcing middleman firm that takes weeks or months to turn it around.
The future that I see, coding and AI are divided into two camps. The one is what we would call "script kiddies" today - people who don't understand how to write software, but know enough to ask the right questions and bodge what they get together into something that mostly works. The other camp would be programmers who are similar to programmers today, but use AI to write boilerplate for them, as well as replace Stack Overflow.
That’s my fear as well. Software may just get shittier overall (aka “good enough”), and in higher volumes, due to it taking less time to crank out using AI.
We're going to witness the greatest copy-pasta in history and find out how that goes.
For example, maybe we write some code now with the goal of helping a customer service person do their job. But we all know that plenty of people are trying to replace customer service people with LLMs, not use LLMs to write tools to help customer service people.
I see that the LLM still needs to know what's going on with the customer account, and maybe for a long time that takes the form of conventional APIs. But surely something is going to change here?
For an individual function I can totally believe GPT4 could strip creative expression from it today. For example you could ask it to give a detailed description of a function in English, and then feed that English description back in (in a new session) and ask it to generate a code based upon the description.
Copyright is a law agreed by a humans in a social contract created to protect humans and further their interests in a 'fair' manner. There is no inalienable right to copyright, no universal law that requires it, it's not an emergent property of intelligence that mechanically applies to artificial entities.
So while the current copyright laws could be interpreted in the way you suggest for the time being, they are clearly written without any notion of AI, and can and should be revised to incorporate the new state of the world; you can bet creators will push hard in that direction. It's pretty clear that the mechanical transformation of a human body of work for the sole purpose of stripping it of copyright is a violation of the spirit of copyright law *.
*( as long as that machine can't also generate a similar work from scratch, in which case the point becomes moot. But we are far, far, from that point)
[1] https://en.wikipedia.org/wiki/Clean_room_design
Also, do we know what languages GPT-4 "understands" at a sufficient level? What knowledge does it have of post-2021 language features, like in C23?
(https://gist.github.com/darius/b463c7089358fe138a6c29286fe2d... paste in painful-to-read format if anyone's really curious. In three parts: intro to language; I ask it to code symbolic differentiation; then a metacircular interpreter.)
Our codebase is 1 million lines of code.
Can we feed the documentation to it? What are the limits?
Is it possible to train it on our data without doing prompt engineering? How?
Otherwise are we supposed to use embeddings? Can someone explain how these all work and the tradeoffs?
Clearly this will break eventually, but I am playing around with some ideas to extend how much context I can give it. One is to do something like base64 encode file contents. I've seen some early success that GPT-4 knows how to decode it, so that'll allow me to stuff more characters into it. I'm also hoping that with the use of .gptignore, I can just selectively give the files I think are relevant for whatever prompt I'm writing.
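The base64 idea described above is just a straightforward encode on the way in and decode on the way out (whether the model reliably decodes it, and whether it actually saves tokens given that base64 output is about a third longer than the input bytes, is the experiment):

```python
import base64

def encode_file_for_prompt(path):
    """Base64-encode a file's raw bytes so they can be pasted into a prompt."""
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode("ascii")

def decode_from_prompt(encoded):
    """Recover the original bytes from a base64 string."""
    return base64.b64decode(encoded)
```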
I wonder if you could teach it to understand a binary encoding using the raw bytestream, feed it compressed text, and just tell it to decompress it first.
Probably doesn't work this way lol
Unfortunately GPT is not yet aware of what LangChain is or how it works, and the docs are too long to feed the whole thing to GPT.
But you can still ask it to figure something out for you.
For example: “write pseudo-code that can read documents in chunks of 800 tokens at a time, then for each chunk create a prompt for GPT to summarize the chunk, then save the responses per document and finally aggregate+summarize all the responses per document”
Basically a kind of recursive map/reduce process to get, process and aggregate GPT responses about the data.
LangChain provides tooling to do the above and even allow the model to use tools, like search or other actions.
You can use our repo (which we are currently updating to include QuickStart tutorials, coming in the next few days) to do embedding retrieval and query
www.GitHub.com/Jerpint/buster
All programmers will become translators from product vision to architecture implementation via guided code review. Eventually this gap will also be closed. Product will say: Make a website that aggregates powerlifting meet dates and keeps them up to date. Deploy it. Use my card on file. Don't spend more than $100/month. The AI will execute the plan.
Programmers will come in when product can't figure out what's wrong with the system.
"The Age of Em" by Robin Hanson thinks through a lot of this in great depth.
this repo currently has more HN upvotes than LOC.
very high leverage code!
He wrote the issue on https://github.com/mpoon/gpt-repository-loader/issues/16 and summarized https://github.com/mpoon/gpt-repository-loader/discussions/1...
"Open an issue describing the improvement to make. Construct a prompt - start with using gpt_repository_loader.py on this repo to generate the repository context, then append the text of the opened issue after the --END-- line."
Feels like it needs to add a little GitHub client to be able to automatically append the text of issues at the end of the output. I'm sure ChatGPT can write a GitHub client in Python no problem.
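A minimal sketch of such a client, using GitHub's public REST API (the issues endpoint is real; the exact prompt format around the --END-- line is an assumption based on the process described upthread):

```python
import json
import urllib.request

def fetch_issue(owner, repo, number):
    """Fetch an issue's title and body via the GitHub REST API."""
    url = f"https://api.github.com/repos/{owner}/{repo}/issues/{number}"
    req = urllib.request.Request(
        url, headers={"Accept": "application/vnd.github+json"}
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["title"], data["body"]

def append_issue_to_context(repo_context, title, body):
    """Append the issue text after the loader's --END-- line."""
    return f"{repo_context}\nIssue: {title}\n\n{body}\n"
```

Here `repo_context` would be the output of gpt_repository_loader.py, which already ends with the --END-- line.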