afro88 · 2 years ago
The article shows everything that works for this approach. But it's a bit disingenuous. At the end:

> Once this is working, Xu Hao can repeat the process for the rest of the tasks in the master plan.

No, he can't. After that much back and forth and getting it to fix little things where it gives responses with the full code listing again, he would have easily hit the token limit (at least with any chat LLM capable of this quality code and conversation - ChatGPT). The LLM will start hallucinating the task list, the names of functions it wrote earlier etc. and the responses would get less and less useful with more and more "this doesn't work, can you fix X".

So anyone following this approach will hit a footgun after task 1.

For anyone that really wants to follow this approach, the next step is to start a new chat and copy/paste the initial requirement prompt, put the task list in there, any relevant code, adjust the instruction (i.e. "help me with task 2") and go from there.

It is of limited utility though. By step 3 (or even 2) you end up with so much code that you're at the token limit anyway and it can't write code that fits together.
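That re-seeding step can be mechanized. A minimal sketch of assembling the opening message for a fresh chat — the message shape follows the OpenAI chat format, but the function name and prompt layout are invented for illustration:

```python
def seed_new_chat(requirements, task_list, relevant_code, current_task):
    """Assemble the single opening message for a fresh chat: the original
    requirements, the checked-off task list, the code produced so far,
    and the instruction to continue with one specific task."""
    checklist = "\n".join(
        ("[x] " if i < current_task else "[ ] ") + task
        for i, task in enumerate(task_list)
    )
    prompt = (
        requirements
        + "\n\nTask list:\n" + checklist
        + "\n\nRelevant code so far:\n" + relevant_code
        + "\n\nHelp me with task " + str(current_task + 1) + "."
    )
    # One user message carrying all the state the model needs.
    return [{"role": "user", "content": prompt}]
```

The point is that the fresh chat carries only the distilled state (plan, completed work, next instruction), not the whole back-and-forth that burned through the context window.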

Where I've found ChatGPT 4 useful is getting me going on something, providing boilerplate, and unblocking me.

If you don't know how to approach a problem like the "awareness layer" (like I didn't before reading the post), you can get a great breakdown and starting point from ChatGPT. Similarly, if you're not sure how to approach that view model, or write tests etc. And if you want a first draft of code or tests.

All that said, I'm looking forward to much larger and affordable token limits in future.

ryanjshaw · 2 years ago
Your experience matches mine closely. I've had ChatGPT-4 do great and then it just gets confused after a while. I can literally tell it "task X is done" and it'll apologise and show me a list of tasks where X is still not done - this is clearly not just a context window issue, as I have repeated variations of my statement over and over in the same session and the issue persists.

I have ended up using it the same way you have - it's honestly the best anti-procrastination tool I've ever used because I can tell it my intentions, what I've thought of so far... and it'll spit out a list of bite-sized chunks that get me going. I find myself looking forward to telling the AI I've completed a task.

Similarly, if I'm facing a tricky design decision, I find that just writing it out for ChatGPT is extremely helpful for clarifying my thought process. I actually used to do this conversational decision making process in a text editor long before ChatGPT, but when I know there's an AI on the other end my thinking becomes clearer and more goal-oriented. And unlike talking to myself or a human friend, it's happy to just say "well if these are your concerns, let's start HERE and then see what happens".

travisjungroth · 2 years ago
Good rule of thumb with ChatGPT: you can’t exit loops. Once you’ve gone A > B > A, your best move is to start a new chat. Even then the loop may reproduce itself, so you should try a similar but different task. Remember that it’s a prediction engine, weighing heavily on the existing prompt. So you say B again, or B1, and it’s like: I know what to do! A! Because last time it was A -> B, so let’s do it again.

In your case this would be “[]Task1”, “Task1 is done”, “[]Task1”, [here is where you start a new chat or fix it yourself if possible].

peterashford · 2 years ago
Ooh! That's a really good point - ChatGPT is effectively rubber-ducky as a service =)
hn_throwaway_99 · 2 years ago
Hmm, I also use ChatGPT as an anti-procrastination tool and task manager, and it's never made a mistake with keeping track of my task list (except that when it sums the estimated times of subgroups of tasks, sometimes those sums are wrong).

Note that it outputs my updated task list every time I add or remove a task (I only asked it to do that one time), so even if old messages go outside of the context window, it's not a big deal because the full updated state of the list is output basically every other message.
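That re-emitting trick works because once early messages slide out of the window, only recently restated state is visible to the model. A toy sketch of the effect, with characters standing in for tokens (the function and messages are invented):

```python
def visible_messages(history, budget):
    """Crude stand-in for a context window: keep only the most recent
    messages whose combined length fits the budget. Real models count
    tokens, not characters -- this only shows the shape of the problem."""
    kept, used = [], 0
    for msg in reversed(history):
        if used + len(msg) > budget:
            break
        kept.append(msg)
        used += len(msg)
    return list(reversed(kept))

history = [
    "original task list: 1. parse  2. render  3. save",
    "...many turns of discussion...",
    "updated task list: 1. parse (done)  2. render  3. save",
]
# With a tight window, only the re-emitted list survives; the original
# statement of the tasks has scrolled out, but nothing is lost.
```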

Tostino · 2 years ago
You iterate on your plan after it is generated, step by step. You go back and edit the prompt chain you started for step 1, and modify it to start working on step 2 (including any ideas or fixes you identified while implementing step 1). Repeat until complete.

You can still absolutely hit the context limit, but you are far less likely to do so if you go back and start a new prompt chain for each different thought process you are going through with it.

afro88 · 2 years ago
Great idea. But does it get hard to navigate back to something in older chat histories though?

I find a new separate chat with the revised initial prompt to be easier.

quijoteuniv · 2 years ago
It's great to see that there's now a term for this type of prompting, “generated knowledge”. I've been experimenting with this technique since the beginning, and I've noticed a significant improvement in version 4. The process involves outlining the project, creating tasks, and feeding them back to ChatGPT as you progress. This approach has helped me complete projects that would have otherwise taken me much longer to finish.

It's also useful for creating practical tutorials. While there are plenty of tutorials available online, sometimes you need guidance on a specific set of technologies. By using generated knowledge prompts, you can get a good outline and tasks to help you understand how these technologies interact.

One thing to keep in mind is to avoid derailing the conversation with questions that are not relevant to the core tasks. If you get stuck on something and need to debug, it's best to use a separate conversation to avoid derailing the project's progress and the hallucinations and forgetfulness that come with it.

AzzieElbab · 2 years ago
Something must be wrong with me. I could never get anything useful from Martin Fowler's writings, and coincidentally I cannot get any functional code out of ChatGPT. Even the boilerplate it produces for me needs to be corrected. I still use chatGPT to produce examples of abstract things but was not able to get any working code that matches concrete problems or even compiles.
afro88 · 2 years ago
Absolutely, and same here. I've done multiple tools that would have taken 2-3 days each in 2-3 hours each.

> One thing to keep in mind is to avoid derailing the conversation with questions that are not relevant to the core tasks. If you get stuck on something and need to debug, it's best to use a separate conversation to avoid derailing the project's progress and the hallucinations and forgetfulness that come with it.

Definitely. Great advice.

Another tip: don't bother asking it to fix small things. Just mention you fixed it in the next reply and move on.

matchagaucho · 2 years ago
> he would have easily hit the token limit

That was my first question. Do all these tasks fit within a 4K or 8K buffer?

Wouldn't be surprised, though, if it works in a 32K GPT4 token limit. Amazing things are possible.

heyyyouu · 2 years ago
This may be a dumb question, but do you know if this is something that LLM frameworks like LangChain (or others) can help with? Aren't they designed to help with more complex prompts/logic/outputs? Or will they run into the same token limits?
blastro · 2 years ago
I completely agree with this take.
themodelplumber · 2 years ago
If somebody thinks an LLM is coming for everybody's coding job, I'd say this article is a great counterpoint just for existing.

You could tell someone from decades ago that we now use a very high level language for complex tasks in complex code ecosystems, never even mention AI, explain that the parser is really generalist-biased, and this article would make perfect sense as an example of exemplary code by a modern coder working for a living.

That's code in there, the stuff Xu Hao is writing.

And also, that's not even getting into the debugging part... Which will be about other code, that looks different.

wpietri · 2 years ago
Yeah, I think there's a "stone soup" effect going on with AI.

It's the same sort of thing you see happening with the customers of psychics. People often have poor awareness of how much they're putting in to a conversation. Or it's a bit like the way Tom Sawyer tricks other kids into painting the fence for him. For me a lot of the magic here is in knowing what questions to ask and when the answers aren't right. If you have those skills, is pounding out the code that hard?

The interesting part for me is not generating new bits of code, but the long-term maintenance of a whole thing. A while back there was a fashion for coding "wizards", things that would ask some questions and then generate code for you. People were very excited, as they saw it as lowering the barrier to entry. But the fashion died out because it just pushed all the problems a bit further down the road. Now you had novice developers trying to understand and improve code they weren't competent to write.

I suspect that in practice, anything a person can get a LLM to wholly write is also something that could be turned into a library or framework or service or no-code tool that they can just use. That, basically, if the novelty is low enough that an LLM can produce it, the novelty is low enough that there are better options than writing the code from scratch over and over.

baq · 2 years ago
I mostly agree except one critical detail: LLMs are the low code/no code service. You literally tell them what you want and if they’re fine tuned on the problem domain, you’re all set. Microsoft demo’d the Office 365 integration and if it works half as well in practice they’ll own the space as much as they did in 1997.
fijiaarone · 2 years ago
One strange coincidence with the emergence of ChatGPT is that at almost the exact same time, Google became practically unusable as a search engine. Like at least an order of magnitude worse.

People used to use Google the same way they use ChatGPT. They would ask a question in plain English, and get sent back a list of relevant links to blog posts, articles, stack overflow, or whatever that had answers to their questions, including example code.

Sometimes that code or information was outdated or completely wrong and sometimes it was too basic to be useful, or just the code-generated docs.

Google has been getting gradually worse over the years due to spam, algorithm gaming, and ads, but circa late November 2022 it became practically worthless.

harlanlewis · 2 years ago
Great points (and after checking your user name, I’ve been nodding my head to posts of yours for about a decade now).

This is a bit tangential - your reference to stone soup is a wonderful example of the information density possible with natural language. And all the meaning and story behind the phrase is accessible to LLMs.

I’ll have to start experimenting with idiom driven development, especially when prompt golfing.

62951413 · 2 years ago
I believe the Model Driven Architecture fad (https://en.wikipedia.org/wiki/Model-driven_architecture) is a better analogy than wizards. Back then the holy grail of complete round trip UML->code->UML didn't get practical enough to justify the effort.
notacoward · 2 years ago
The problem is that it's not quite code. It's almost code, but without the precision, which puts it into a sort of Uncanny Valley of code-ness. It's detailed instructions for someone to write code, but the someone in this case is an alien or insane or on drugs so they might interpret it the way you meant it or they might go off on some weird tangent. You never know, and that means you'll need to check it with almost as much care as you'd take writing it.

Also, having it write its own tests doesn't mean those tests will themselves be correct let alone complete. This is a problem we already have with humans, because any blind spot they had while writing the code will still be present for writing the tests. Who hasn't found a bug in tests, leading to acceptance of broken code and/or rejection of correct alternatives? There's no reason to believe this problem won't also exist with an AI, and they have more blind spots to begin with.

elendee · 2 years ago
I think often of the adage "it's harder to read code than write it". GPT gives you a lot to read. definitely a better consultant than coder imo. I've also had GPT write entirely false things, then I say "isn't that false?" and it says, "yes sorry about that" . very uncanny
nextworddev · 2 years ago
The opposite might be true, and here’s why - 1) by using English as spec, the barrier of entry has gone lower, 2) LLMs can also write prompts and self introspect to debug.
ZephyrBlu · 2 years ago
I think English as a spec actually makes the barrier of entry higher, not lower. Code itself is far easier to understand than an English description of the code.

To understand an English description of code you already have to have a deeper understanding of what the code is doing. For code itself you can reference the syntax to understand what's going on.

The prompt in this case is using very technical language that a beginner will have no idea about. But if you gave them the code they could at least struggle along and figure it out by looking things up.

mooreds · 2 years ago
> by using English as spec, the barrier of entry has gone lower,

I'm not sure that is true. The level of back and forth and refinements needed indicate to me that the "English" used is not the normal language I use when talking to people.

It's almost like a refined version of cucumber with syntax that is slightly more forgiving.

Maybe I'm being a codger, but LLMs seem (at least for now) far better for summarizing and giving high level overviews of concepts rather than nailing precise code requirements.

agentultra · 2 years ago
But you can't determine if a statement is true by simply reading more words.

It's also not efficient for doing higher level work. There was a time before we had algebra when people were still expressing the same ideas but the notation wasn't there. Mathematics was expressed in "plain language." It's extremely difficult for us to read. For mathematicians of the time there was no other way to explain algorithms or expressions.

For simple programs I have no doubt that these tools enable more people to generate code.

However it's not going to be helpful for people working on hypervisors, networking stacks, operating systems, distributed databases, cryptography, and the like yet. For that you need a more precise language and an LLM that can reason about semantics and generate understandable proofs: not boilerplate proofs either -- they have to be elegant so that a human reading them can understand the problem as well. We're still a ways from being able to do that.

notacoward · 2 years ago
> LLMs can also write prompts and self introspect to debug.

Why should we assume that won't lead to a rabbit hole of misunderstanding or outright hallucination? If it doesn't know what "correct" really is, even infinite levels of supervision and reinforcement might still be toward an incorrect goal.

gumballindie · 2 years ago
I mean, sure, if the world were to run on basic code. Perhaps WordPress developers may feel slightly threatened, but even that is well above all examples of "AI" code I've seen.
have_faith · 2 years ago
English as a spec is incredibly "fuzzy", there are many valid interpretations of intent. I don't think that can be avoided?
bartimus · 2 years ago
But there's still going to have to be a human who has the ability to form a mental model of the thing that's needing to be implemented. Functionally and technically. The results of the LLM will vary depending on the level of know-how the human instructor has.
twelve40 · 2 years ago
Exactly, I actually liked the systematic approach in the article, but it seemed pretty labor-intensive and ... not that much different from other types of programming
sanderjd · 2 years ago
To me, that's the whole point of this. I think it is directly analogous to the jump between assembly and higher level compiled languages. You could have said about that, "it still seems pretty labor intensive and not that much different than writing assembly", and that's true, but it was still a big improvement. Similarly, AI-assisted tools haven't solved the "creating software requires work" problem. But I think they're in the process of further shifting the cost curve, making more software possible to make.
Veedrac · 2 years ago
‘Artists' jobs are safe because AI is bad at hands.’
themodelplumber · 2 years ago
Artists' jobs are safe in part because they can also use AI, and most already use relevant ecosystems that now incorporate AI.

Consumers who can operate AI for clip art purposes are simply still part of the same non-artist-paying demographic they always were.

Same with code

thomaslord · 2 years ago
Yeah, even if ChatGPT could perfectly understand the prompts you'd still run into major issues with token limits. I tried to get it to rebuild a single page for me (to move from one UI framework to another) and I couldn't fit the existing code in the token limit. I might be able to get it to do a chunk of the initial work for a greenfield project if I perfected the prompts, but it's structurally incapable of maintaining existing code.
karmasimida · 2 years ago
I was thinking about that draft of the master plan ... you can't really just write it up this clearly and easily.

Overall, I don't think a 95% autopilot GPT model would provide more efficiency than an 80% one.

SanderNL · 2 years ago
Except you now have a way “upwards” from an abstraction POV. Regular code is severely limited and highly surgical, by design. This is not.

All these abstraction layers were invented to serve old style manual coders. Why bother explaining in great detail about “Konva” layers and react anymore? Give it a few years and let it finetune on IT tech and I see this being reduced to “I want whiteboard app with X general characteristics” at which point I’d no longer speak about “programming”.

themodelplumber · 2 years ago
That "upwards" excludes a lot of relevant systems design logic that won't go away though, insofar as it is abstraction ad infinitum in the direction of fewer-relevant-details.

What'll happen is, details will continue to be relevant as tastes adjust to the new normal.

Like for my work, today, React is enterprise-ready, which is not good for me. It means it will likely dip my projects in unnecessary maintenance costs as compared to another widget of its type that does what I want in a lightweight manner. When I troubleshoot something of React's complexity, even my prompts will likely need to be longer.

But also, that's just one component of one component. And you have to experience this stuff in the first place, to know that you should pay attention to these details and not those other ones, for a given job, for a given client, in a given industry, with given specs.

So, if I was able to wave my hands I'd simply have all the problems I had back when I was a beginner. Ergo, it comes back to the clip art problem: Being able to buy clip art never made anyone a designer. But it made a lot of designers' jobs way easier.

We are simply regressing toward the mean with regard to programming. It was never about computers in the first place, never so concerned with syntax.

Anyway, back to browsing my theater program...

gdubs · 2 years ago
There's an unfortunately common take on AI that goes basically like this:

"I tried it and it didn't do what I wanted, not impressed."

My suggestion is to tune out the noise and really try experimenting with these tools – and know that they're rapidly improving. Even if ultimately you have criticisms or decide one way or another, at least really investigate them for your own use-cases rather than jumping on a bandwagon that's either "AI is bad" or the breathless hype-machine at the other end.

rootusrootus · 2 years ago
I agree it's a good idea to take a moderate approach. The hype that LLMs are going to replace SWEs is clearly just that, hype, if you've done any real work trying to get GPT4 to give you the code you want. But it's also clearly a very useful tool. I think it'll absolutely destroy Stack Overflow.
z3c0 · 2 years ago
I am very critical of the LLM hype, but the threat to stackoverflow is evident. Like stackoverflow, I never write code verbatim that comes from even GPT4. I frequently find issues in the output, as the code I write is generally very context-specific. However, I find the back-and-forth with interesting tidbits of info dropped here-and-there amounts to something like rubber duck debugging on steroids.
lcnPylGDnU4H9OF · 2 years ago
> destroy Stack Overflow

It'll be interesting to see how future training data is sourced.

chefandy · 2 years ago
Nobody who professionally designs and writes software AND has used LLM code generation tools sees this as a drop-in replacement for developers, generally, anytime soon. That stance is for overeager, credulous enthusiasts and doomsdayers jumping to conclusions.

Similarly, nobody who professionally designs and creates complex art products sees this as a drop-in replacement for commercial artists anytime soon. That stance is for people dazzled by their new image-generation superpower who don't know how little they know about professional creative work.

I doubt the markets for utility-grade code work (e.g. customizing existing WordPress themes) or low-effort, high-volume creative assets (template-based logos, lightly customized game sprites) will survive. The people doing that work are still real people with lives and families and medical bills and mortgages, and we really ought to get serious about worker protections in this country. Seriously.

tarruda · 2 years ago
> The hype that LLMs are going to replace SWEs is clearly just that, hype

LLMs cannot replace anyone, but it is clear that engineers who master LLM usage might multiply their productivity by a lot.

The question is: If one LLM assisted engineer can work 10x faster, will companies reduce their engineer staff by 90%?

drowsspa · 2 years ago
Yet, the whole movement of getting blue collar workers to code seems to have lost its steam.
lionkor · 2 years ago
Sadly I don't think this can happen. There is a load of trash answers on SO, and you bet ChatGPT is trained on that.

So you get not only the good of SO, you also get the worst of SO, and there's no way to tell which is which.

Just a downgrade for me, plus for most things I do you are better off reading the source code or the documentation (however lackluster) than fumbling with ChatGPT and getting an answer that may or may not be right.

I might as well ask someone else who doesn't know any programming to search for the answer for me - they won't be able to tell a trustworthy answer from another one.

There are so many SO answers (esp. on C++) which look good, but one of the comments points out some edge case in which it breaks.

Remember, not everyone does copy-paste programming. Some of us have to sit there and think of a solution and work it out over hours, because it hasn't been done publicly before.

Deleted Comment

peterashford · 2 years ago
Of course, there's the issue that a lot of the info for useful LLMs probably comes from places like Stack Overflow
spaceman_2020 · 2 years ago
People also forget that the model is trained on older data. At first, it will default to referencing out of date frameworks and solutions, but if you tell it that its code isn't working, it will usually correct itself.
mise_en_place · 2 years ago
I was very impressed when it showed me the different techniques for deep reinforcement learning. However, where it struggles is when building an agent. Because you will need a high amount of tokens to template a prompt, in the case of langchain or AutoGPT.

Dead Comment

alexashka · 2 years ago
You may be underestimating how much meaning people derive from jumping on bandwagons and having a simple to understand group identity.

Your suggestion would make many people unhappy. They can't win the competence game and hence 'really investigating' is a losing proposition for them. What they can do is jump on bandwagons very quickly, hoping to score a first mover advantage.

How much of an advantage would one get from taking a couple of years to really investigate Bitcoin and the algorithms involved, vs buying some as early as possible and telling everyone else how great it is? :)

mk89 · 2 years ago
For me chatGPT or phind (which is based on chatGPT4, if I understood right) are great documentation tools and also general productivity tools, nothing to say about it.

The main issue is that sometimes they really f** it up badly. They make you rethink your knowledge quite deeply (do I remember wrong? Did I maybe understand this wrong? Is ChatGPT wrong?), and for me that can be worse than having to do it myself, because it creates a sort of insecurity: you always have to challenge your own thinking, and that's not how we work in our daily jobs, is it? At least this doesn't happen so frequently to me - from time to time we have arguments in the team, but this kind of wrong information feels more like a hidden trap than someone arguing with valid arguments.

themodelplumber · 2 years ago
One thing that really bothers me is that I want it to use best practices and it doesn't really know which ones I'm talking about, and then I realize they are _my_ set of best practices, made from others' nameless best practices.

So I have to decide if it's just a matter of manually converting the 5-10 little things like using `env bash` in the header, etc. Or do I ask it to remember that and proceed to the next layer of the project, and feel like Katamari Coder, which is quite a feeling of what-is-this-fresh-encumbrance at times.

There is a nascent sense that the interface is not even close to where it needs to be to efficiently support that kind of recall for working memory on the coder's end.

I can definitely see a new LLM relativistic-symbolic instruction code & IDE-equivalent (with yet-unseen presentational and let's even say modal editing factors) being extremely useful, which is a bit funny but also that's what those things are good for... Right now I can scroll up through my prompts to supplement my working memory, but that's another place where the whole activity starts to seem very tedious.

(Is the LLM coming for the coders, or are coders coming for the LLM?)

mason55 · 2 years ago
I think that Copilot is much better/more promising for this kind of thing because it's looking at the code you've already written without you having to constantly prompt it.

I had a lot of the same hangups as you when I had played around with ChatGPT. How do I get it to handle the monotonous stuff without me having to spend all my time teaching it?

I finally tried Copilot the other day and it was stunning. I had a half-written golang client that was a wrapper around an undocumented and poorly structured API for a tool we use. I had written the get and create methods. Then I added a comment with an example URL for delete and Copilot auto-completed the entire method in the same style as the two methods I had already written. In some cases, like formatting & error handling, it was exactly the same as what I'd written, but other cases, like variable naming, string templating, etc., it replicated the spirit of my style but adapted for this new "delete" method.

I think ChatGPT is just the wrong interface for this kind of thing (at least right now).

dpkirchner · 2 years ago
> Or do I ask it to remember that and proceed to the next layer of the project

I think this could be solved with a good browser extension. Something that provides an easy to access (e.g., keyboard-only) way to paste customized prompt preludes that enforce your style (or styles if, say, you're using multiple languages).

It looks like Maccy could do the job, albeit not as an extension. I haven't tried it yet.

rootusrootus · 2 years ago
One thing ChatGPT (specifically, the GPT4 version) keeps doing to me is confidently lying, and when I call it out, apologizing and spitting out another response. Sometimes the right answer, sometimes another wrong one (after a couple tries it then says something like "well, I guess I don't have the right answer after all, but here is a general description of the problem")

Part of me laughs out loud (literally, out loud for once) when it does that. But the other part of me is irritated at the overconfidence. It is a potentially handy tool but keep the real documentation handy because you'll need it.

moonchrome · 2 years ago
Honestly to me it happens more than it doesn't - but maybe that's because I've tried it in cases where I've already used traditional approaches to come up with the answer and going to GPT and phind to benchmark their viability.

I've mentioned it on other thread, but phind's "google-fu" is weak, it does a shallow pass and bing index (I'm assuming) is worse than google. It's also slow as hell with GPT4 which makes digging deeper slower than just manually going in.

cwp · 2 years ago
To me, this is a great illustration of why chat is a terrible interface for a coding tool. I've gone down this path as well, learning that you need to have a detailed prompt that establishes a lot of context, and iteratively improve it to generate better code. And yup, generating a task list and working from that is definitely a key strategy for getting GPT to do anything bigger than a few paragraphs.

But compare that to Copilot: Copilot doesn't help much when you're starting from scratch, and there's nothing for it to work with. But once you have a bit of structure, it starts to make recommendations. Rather than generating large chunks of code, the recommendations are small, chunks of a few lines or maybe even one line at a time. And it's sooooo good at picking up on patterns. As soon as you start something with built-in symmetries, it'll quickly generate all the permutations. It's sort of prompting by pointing.

This is so. much. better. than writing prompt for the chat interface. I'm really excited to see where these kinds of tools lead.

SamPatt · 2 years ago
I've noticed that after using copilot on a code base for a while, you can effectively prompt the AI just by creating a descriptive comment.

// This function ends the call by sending a disconnection message to all connected peers

Bam, copilot will recommend at least the first line, with subsequent lines usually being pretty good, and more and more frequently, it will recommend the whole function.
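A Python rendering of the same pattern, with everything below invented for illustration: the descriptive comment plays the role of the prompt, and the function body is the kind of completion Copilot tends to offer once it has seen one or two similar methods in the same file.

```python
class Peer:
    """Hypothetical peer object, just enough to make the example run."""
    def __init__(self, name):
        self.name = name
        self.inbox = []

    def send(self, message):
        self.inbox.append(message)

# This function ends the call by sending a disconnection message
# to all connected peers.
def end_call(peers, reason="call-ended"):
    for peer in peers:
        peer.send({"type": "disconnect", "reason": reason})
    peers.clear()
```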

I still use GPT-4 a lot, especially for troubleshooting errors, but I'm always pleasantly surprised at how good copilot can be.

armchairhacker · 2 years ago
Copilot is a game-changer and very underrated IMO. GPT4 is smart but not really used in production yet. Copilot is reportedly generating 50% of new code and I can't imagine going without it.
mjr00 · 2 years ago
Absolutely. People will quickly realize that for coding, the natural language part of LLMs is a distraction. Copilot is much better for someone actually writing code, but unfortunately doesn't get as emphasized due to the narrative surrounding LLMs right now.
yodsanklai · 2 years ago
> Copilot is much better for someone actually writing code

I haven't used Copilot yet, but I'm occasionally using ChatGPT with prompts such as "write a bash/python script that takes these parameters and performs these tasks". Then I iterate if needed, and usually I can get what I want faster than without using ChatGPT. It's not a game changer, but it's a performance boost.

How natural language is a distraction here? and how copilot would do much better for the same task?

moffkalast · 2 years ago
Has the Copilot backend been updated to use anything more advanced yet? I tried it out when it was new and free for a while and it really struggled with anything that wasn't incredibly common. GPT 4 in its chat form works a whole lot better for niche stuff than that one did.
avereveard · 2 years ago
Idk, for sure autocomplete is a great interface for someone coding in the IDE, but LLMs can understand requirements as a whole, spit out full classes, and validate that the output from the server matches the specs. They work great from outside an IDE.
barbariangrunge · 2 years ago
Either way, you’re sending your company’s biggest asset to another company, aren’t you? I’ll try these tools when they start being able to run locally
throwaway202303 · 2 years ago
No, or no company would be able to use it. As you type, fragments of code are sent and discarded after use. You need to trust Microsoft to actually do the discarding, but contractually they do, and you can sue them if they accidentally or deliberately keep your code around or otherwise mismanage it.
SanderNL · 2 years ago
I sort of disagree that code is the biggest asset. Take the Yandex leak. What can you do with it? Outcompete them?
koonsolo · 2 years ago
I surely hope they use my copyrighted code and make millions out of it. Ideal case for me to sue them for lots of money.
maroonblazer · 2 years ago
As a hobbyist developer with no formal training, I wish Copilot had a 'teaching' or "Senior Dev" mode, where I can play the role of the Junior Dev. I'd like it to pick up on what I'm trying to write, and then prompt me with questions or hints, but not straight up give me the code.

Or, if that's too annoyingly Clippy-like, let me prompt it when I'm stuck, and only then have it suggest hints or ask suggestive questions that guide me to a solution.

I agree, very exciting to see where all this goes.

ukuina · 2 years ago
The GitHub Copilot Labs extension has "codebrushes" that can transform and explain existing code instead of generating new code, but none of them only give "hints". Maybe one of the codebrushes can take a custom prompt.
cwp · 2 years ago
One thing you might try with Copilot is to ask it to explain the code. It can often give insight, even on code that you yourself wrote a few minutes ago.
supernikio2 · 2 years ago
Exactly this. I've tried to integrate ChatGPT into my daily workflow, but you have to give it an excruciating level of detail to get something that remotely resembles real code I'd use, and even then you have to hold its hand to guide it in the right direction, and you still have to make some manual final touches at the end.

This is why I'm looking forward to Copilot X so much. It will hold much more context than the current implementation, and integrate the Chat interface that's so natural to us.

throwaway202303 · 2 years ago
People have different preferences and habits. Having tried both models I much prefer having a conversation in one window and constructing my code from that in another. Although copilot is about to add some interesting features that may win me back.
mov_eax_ecx · 2 years ago
How to overengineer with an LLM: don't state the requirements clearly, shove in your pet patterns first. It's more important to follow the slice/redux/awareness-hook pattern than to have a working solution; never trust your developers to make decisions; worry more about how it's built than about building a solution.

My way to work with an LLM is to start from a good, clear requirement, have the LLM write a possible file organization, and then query the contents of each file (just the code, no comments) to assemble a working prototype fast. Then you can iterate on the requirements and evolve from there.

lyjackal · 2 years ago
Generally, I agree that approach works well. It’s going to perform better if it’s not trying to fulfill your team’s existing patterns. On the other hand, allowing lots of inconsistencies in style in your large code base seems like a quick way to create a hot mess. Chat prompts seem like a really difficult way to communicate code style and conventions, though. A sibling comment to yours mentions that Copilot autocomplete seems like a much better pattern for working in an existing code base, and I tend to agree that’s much more promising. Read the existing code, and recommend small pieces as you type
moonchrome · 2 years ago
How often do you get working code that way? Unless it's something trivial that fits in its scope, I'd say that's going to produce garbage. I've seen it steer into garbage on longer prompt chains about a single class (of medium complexity) - I doubt it would work at project level. Mind sharing the projects?
mov_eax_ecx · 2 years ago
I work only with closed-source codebases and use this approach for prototypes, but, using the same example as the blog, I prompt: "the current system is an online whiteboard system. Tech stack: react, use some test framework, use konva for the canvas, propose a file organization, print the file layout tree. (without explanations)." The trick is that for every chat the context is the requirement + the filesystem + the specific file, so you don't have the entire codebase in the context, only the current file. Also, use GPT-4; GPT-3 is not good enough.

My main point is that the blog post's final output is a mock-test/awareness-hook/redux setup, where an architect feels good seeing his patterns; with my approach you have a prototype online whiteboard system.
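The context-assembly step described here can be sketched roughly like this (the prompt wording, function name, and parameters are my own illustration, not the commenter's):

```javascript
// Build the per-file prompt: requirement + file layout + one target file.
// Each chat gets only this, so the context window never has to hold
// the whole codebase -- just enough shared structure to keep files consistent.
function buildFilePrompt(requirement, fileTree, filePath) {
  return [
    `Requirement: ${requirement}`,
    `Project layout:\n${fileTree}`,
    `Write the contents of ${filePath}. Just the code, no comments or explanations.`,
  ].join('\n\n');
}
```

You'd call this once per file in the proposed layout, starting a fresh chat each time, which sidesteps the token-limit drift described in the top comment.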

moonchrome · 2 years ago
I feel like this is a bunch of ceremony and back and forth; also, considering GPT-4's speed, I would fly past this approach just using Copilot and coding.

I look forward to offloading these kinds of tasks to LLMs but I'm not seeing the value right now. Using them feels slow and unsatisfying, need to triple check everything, specify everything relevant for context.

Also maybe it's just me but verbalizing requirements unambiguously can often be harder than writing code for it. And it's not fun. If GPT4 was GPT3.5 fast it would probably be a completely different story.

isaacfrond · 2 years ago
The article stresses to never put anything that may be confidential into the prompt. Yet ChatGPT offers to opt out of having your data used for training.

For most purposes that seems to be sufficient, doesn't it? Or are there reasons not to trust OpenAI on this one?

vharuck · 2 years ago
I will never have full trust in an assertion unless (a) it's included in a contract that binds all parties, (b) the same contract includes a penalty for breaking the assertion that's severe enough to discourage it, and (c) I know the financial and other costs of litigation won't be severe for me.

In short, unless my large employer will likely win in punishing OpenAI should they break a promise, that promise is just aspirational marketing speak.

For data retention and usage, I'd also need a similar contractual agreement to tie the hands of any company that would acquire them in the future.

twelve40 · 2 years ago
Copilot for individuals stores code snippets by default according to their TOS. Sure, you can probably find a way to opt out of that somewhere as well, but you'd have to read the TOS for every plugin and service you use, find the opt-out links and make sure you don't opt-in again via some other route (such as not Copilot but ChatGPT proper or some other Github, VSCode or some other plugin or service button or knob).
themodelplumber · 2 years ago
> Or are there reasons not to trust OpenAI on this one?

Yes, more related to general tech history and not a dig on OpenAI though.

dustypotato · 2 years ago
There was a bug where the chat history of some users was visible to others
blowski · 2 years ago
From a GDPR or commercial confidentiality perspective, it doesn't matter what OpenAI say they'll do with your data, you can't share it with them.

Let's say your doctor enters sensitive info about you, and despite having been told not to train on it, OpenAI uses it anyway due to a bug. A year from now, ChatGPT is telling anyone and everyone about your sensitive info.

Would you exclusively blame ChatGPT?

clarge1120 · 2 years ago
> are there reasons not to trust OpenAi on this one?

Yes, the fact that they are closed, not open, for one. And that they switched from open to closed the moment it benefited them to do so.