It's kind of hard to write a test for a value that is null-checked when that value may never actually be returned.
For example, say you have a C function that reads in a file and returns you a string. You can check the string to verify that malloc actually succeeded, but how do you check that the file actually opened?
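A minimal sketch of the situation (read_file is a made-up example with simplified error handling): both failure modes collapse into the same NULL return, so a test that only null-checks the result can't tell them apart.

    #include <stdio.h>
    #include <stdlib.h>

    /* Reads a whole file into a heap-allocated, NUL-terminated string.
     * Returns NULL on any failure, so the caller cannot tell whether
     * fopen() or malloc() was what failed. */
    char *read_file(const char *path) {
        FILE *f = fopen(path, "rb");
        if (!f)
            return NULL;                /* file didn't open */

        fseek(f, 0, SEEK_END);
        long size = ftell(f);
        rewind(f);

        char *buf = malloc((size_t)size + 1);
        if (!buf) {                     /* allocation failed */
            fclose(f);
            return NULL;
        }

        fread(buf, 1, (size_t)size, f);
        buf[size] = '\0';
        fclose(f);
        return buf;
    }

    int main(void) {
        char *s = read_file("does_not_exist.txt");
        if (!s)        /* missing file or out of memory? no way to know */
            return 1;
        puts(s);
        free(s);
        return 0;
    }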
At least in my experience, possibly due to context limitations or just architecture, SOTA LLMs aren't particularly good at iterating: they tend to loop back around to similar results with the same bad logic and errors.
I've tried these LLM "code from test" things (and vice versa) dozens of times over the last couple of years... they're not even close to being practical.
Why? It will evolve into a slightly higher level language where the compiler is an ML model. Was it a tragedy when developers mostly didn’t have to write assembly any more?
I think it's different... I like high-level languages, but this is not a programming language; it's a technique for writing tests in an existing language and leaving the implementation to the AI.
I like programming for problem solving; I don't really like writing tests, but that's personal taste. A lot of people like to just use PowerPoint and Jira and tell others what they need to implement, but those people are not software developers.
> Was it a tragedy when developers mostly didn’t have to write assembly any more?
It wasn't, but for starters compilers have always been generally deterministic.
I'm not saying that this is completely useless (I personally think code completion tools such as GitHub Copilot are fantastic), but it's still too early to compare it to a compiler.
I appreciate that your workflow is so linear.
I often write tests, then the implementation; then I realize that the tests need to be corrected, then I change the implementation, then I change the tests, then I add other tests, and so on.
I don't really like maintaining tests; it's often a lot of code that needs to be understood and changed carefully.
Really, it's just validator code instead of feature code. I think this is the only realistic way forward for production-level code written by AI: don't ask it to write code, ask it to pass your validation tests.
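As a rough sketch of what that could look like in C (slugify is a hypothetical function; the assertions are the spec you hand to the AI, and the implementation here just stands in for whatever the model produces):

    #include <assert.h>
    #include <ctype.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Stand-in implementation: in the proposed workflow, this is the
     * part the AI would write and rewrite until main() passes. */
    char *slugify(const char *title) {
        size_t n = strlen(title);
        char *out = malloc(n + 1);   /* output never exceeds input length */
        if (!out)
            return NULL;
        size_t j = 0;
        int pending_dash = 0;
        for (size_t i = 0; i < n; i++) {
            if (isalnum((unsigned char)title[i])) {
                if (pending_dash && j > 0)
                    out[j++] = '-';
                pending_dash = 0;
                out[j++] = (char)tolower((unsigned char)title[i]);
            } else {
                pending_dash = 1;    /* collapse runs of separators */
            }
        }
        out[j] = '\0';
        return out;
    }

    int main(void) {
        /* The validator code: these assertions are the deliverable. */
        assert(strcmp(slugify("Hello World"), "hello-world") == 0);
        assert(strcmp(slugify("  Trim Me  "), "trim-me") == 0);
        assert(strcmp(slugify(""), "") == 0);
        puts("all checks passed");
        return 0;
    }

The point is that the assertions, not the implementation, become the artifact you write and maintain.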
Essentially, everyone becomes a red team member trying to think of clever ways to outwit the AI's code, which I for one think is going to be a lot of fun in the future - though we're still quite a way from there yet!
I wrote a similar tool the other day: https://github.com/joseferben/makeitpass
It can make all kinds of commands pass by checking stdout/stderr, and it's language-agnostic (you need npx to run makeitpass).
> A lot of people like to just use PowerPoint and Jira and tell others what they need to implement, but those people are not software developers.
And when things don't work, they call people like me to try to understand the performance problems of something poorly defined and worse written.