In my experience there are really only three true prompt engineering techniques:
- In-context learning (providing examples, a.k.a. one-shot or few-shot vs. zero-shot)
- Chain of Thought (telling it to think step by step)
- Structured output (telling it to produce output in a specified format like JSON)
Maybe you could add what this article calls Role Prompting to that. And RAG is its own thing where you're basically just having the model summarize the context you provide. But really, everything else just boils down to telling it what you want to do in clear, plain language.
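To make those three concrete, here's a minimal TypeScript sketch of all of them in a single request. The task, field names, and message shape (OpenAI-style chat messages) are illustrative assumptions on my part, not tied to any particular provider:

```typescript
// A rough sketch, not tied to any SDK: the message shape follows the common
// OpenAI-style chat format, but any chat-completion API takes something equivalent.
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

const messages: ChatMessage[] = [
  // Structured output: pin down the exact shape you want back.
  {
    role: "system",
    content:
      "You extract TODO items from commit messages. Respond only with JSON of the form " +
      '{"reasoning": string, "todos": string[]}.',
  },
  // In-context learning: a single worked example (one-shot).
  { role: "user", content: "fix login bug, still need to add rate limiting" },
  {
    role: "assistant",
    content:
      '{"reasoning": "rate limiting is mentioned as unfinished", "todos": ["add rate limiting"]}',
  },
  // Chain of thought: ask it to reason before it commits to an answer.
  {
    role: "user",
    content:
      "Think step by step about which parts are future work, then answer: " +
      "refactor parser; tests for edge cases are TODO; docs later",
  },
];

console.log(JSON.stringify(messages, null, 2));
```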
Dunno. I was working on a side project in TypeScript, and couldn’t think of the term “linear regression”. I told the agent, “implement that thing where you have a trend line through a dot cloud”, or something similarly obtuse, and it gave me a linear regression in one shot.
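For reference, the thing it reached for is only a few lines anyway; a least-squares fit in TypeScript looks roughly like this (an illustrative helper, not the code the agent actually produced):

```typescript
// Ordinary least squares: fit y = slope * x + intercept to a "dot cloud".
function linearRegression(points: Array<{ x: number; y: number }>) {
  const n = points.length;
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;
  let covXY = 0; // covariance numerator
  let varX = 0; // variance of x
  for (const p of points) {
    covXY += (p.x - meanX) * (p.y - meanY);
    varX += (p.x - meanX) ** 2;
  }
  const slope = covXY / varX;
  return { slope, intercept: meanY - slope * meanX };
}

// Trend line through a tiny dot cloud: slope ~2, intercept ~0.
console.log(linearRegression([
  { x: 1, y: 2 },
  { x: 2, y: 4.1 },
  { x: 3, y: 5.9 },
]));
```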
I’ve also found it’s very good at wrangling simple SQL, then analyzing the results in Bun.
I’m not doing heavy data processing, but so far, it’s remarkably good.
I see that as applying to niche platforms/languages without large public training datasets - if Rust were introduced today, the productivity differential would be so stacked against it that I'm not sure it would hypothetically survive.
Even role prompting is totally useless imo. Maybe it was a thing with GPT-3, but most of the LLMs already know they're "expert programmers". I think a lot of people are just deluding themselves with "prompt engineering".
Be clear with your requirements. Add examples, if necessary. Check the outputs (or reasoning trace if using a reasoning model). If they aren't what you want, adjust and iterate. If you still haven't got what you want after a few attempts, abandon AI and use the reasoning model in your head.
It's become more subtle, but it's still there. You can bias the model towards more "expert" responses with the right terminology. For example, a doctor asking a question will get a vastly different response than a layperson will. A query with emojis will get more emojis back. Etc.
I get the best results with Claude by treating the prompt like a pseudo-SQL language, with words like "consider" or "think deeply" acting like keywords in a programming language. I also make use of their XML tags[1] to structure my requests.
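As a rough illustration of that structure (the tag names are just ones I find convenient, in the spirit of the docs in [1], not an official schema):

```typescript
// Hypothetical XML-structured request; tag names and the reviewed function are invented.
const reviewPrompt = `
<instructions>
Review the function in <code>. Think deeply about edge cases before answering.
</instructions>

<code>
export function parseDuration(input: string): number {
  // ...
}
</code>

<output_format>
A bulleted list of potential bugs, most severe first.
</output_format>
`;

console.log(reviewPrompt);
```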
I wouldn't be surprised if, a few years from now, some sort of actual formalized programming language for "gencoding" AI emerges.
[1] https://docs.anthropic.com/en/docs/build-with-claude/prompt-...
One thing I've had a lot of success with recently is a slight variation on role-prompting: telling the LLM that someone else wrote something, and I need their help assessing the quality of it.
When the LLM thinks _you_ wrote something, it's nice about it, and deferential. When it thinks someone else wrote it, you're trying to decide how much to pay that person, and you need to know what edits to ask for, it becomes much more cut-throat and direct.
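For example, something along these lines (the wording is obviously just illustrative):

```typescript
// Illustrative framing only: the same document, presented as a third party's work
// so the model reviews it instead of flattering the author.
const assessmentPrompt = `
A contractor sent me the design doc below. I'm deciding how much of their invoice
to approve and which revisions to request. Be blunt: list the weakest parts first.

<doc>
(paste the thing you actually wrote here)
</doc>
`;

console.log(assessmentPrompt);
```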
The main thing, I think, is people trying to do everything in "one prompt", one giant request throwing all the context at it. What you said is correct, but also: instead of making one massive request, break it down into parts and use multiple prompts with smaller contexts that, say, all have structured output you feed into each other.
Make prompts focused, with explicit output formats and examples, and don't overload the context. Then basically the three you said.
Chain-of-thought prompting loses much of its effectiveness on newer reasoning models like the OpenAI o-series and Claude Sonnet.
As an exercise for the reader, I encourage you all to try the examples vs. control prompts in prompt engineering papers for chain of thought prompting, and you’ll see that the latest models have either been trained to or instructed to reason by default now - the outputs are close enough to equivalent.
CoT prompting was probably much more effective a few years ago on older, less powerful models.
You may find some benefit in telling it exactly how you want it to reason about a problem, but note that you may actually be limiting its capabilities that way.
I’ve found that most of the time, I will let it use its default reasoning capabilities and guide those rather than supplying my own.
You use a two-phase prompt for this. Have it reason through the answer and respond with a clearly-labeled 'final answer' section that contains the English description of the answer. Then run its response through again in JSON mode with a prompt to package up what the previous model said into structured form.
The second phase can be with a cheap model if you need it to be.
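A rough sketch of that two-phase flow, where `callModel` is a stand-in for whatever chat client you actually use and the model names are placeholders:

```typescript
// Stand-in for your real chat-completion client (OpenAI, Anthropic, local model, ...).
async function callModel(model: string, prompt: string): Promise<string> {
  throw new Error("wire this up to your model API");
}

async function answerAsJson(question: string): Promise<unknown> {
  // Phase 1: let a capable model reason freely and end with a labeled answer.
  const reasoning = await callModel(
    "big-reasoning-model", // placeholder model name
    `${question}\n\nWork through this step by step, then finish with a section ` +
      `labeled "FINAL ANSWER:" containing your conclusion in plain English.`
  );
  const finalAnswer = reasoning.split("FINAL ANSWER:").pop() ?? reasoning;

  // Phase 2: a cheap model only packages the prose answer into structured form.
  const packaged = await callModel(
    "small-cheap-model", // placeholder model name
    `Return only JSON of the form {"answer": string}. ` +
      `Package up the following answer:\n\n${finalAnswer}`
  );
  return JSON.parse(packaged);
}
```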
Sometimes I get the feeling that making super long and intricate prompts reduces the cognitive performance of the model. It might give you a feel of control and proper engineering, but I'm not sure it's a net win.
My usage has converged to making very simple and minimalistic prompts and doing minor adjustments after a few iterations.
That's exactly how I started using them as well. 1. Give it just enough context, the assumptions that hold, and the goal. 2. Review the answer and iterate on the initial prompt. It's also the economical way to use them. I've been burned one too many times by using agents (they just spin and spin, burn 30 dollars on one prompt, and either mess up the code base or converge on the previously written code).
I also feel the need to caution others: letting the AI write lots of code in your project makes it harder to advance it, evolve it, and move on with confidence (code you didn't think through and write yourself doesn't stick as well in your memory).
I'd have to hunt for it, but there is evidence that using the vocabulary of an expert rather than a layman will produce better results. Which makes sense: spaces where people talk "normally" are more likely to be incorrect, whereas spaces where people speak in the professional vernacular are more likely to be correct, and the training will associate the vocabulary with those spaces.
At their heart, these are still just document-completion machines. Very clever ones, but still inherently trying to find a continuation that matches the part that came before.
This seems right to me. I often ask questions in two phases to take advantage of this: (1) ask how a professional in the field would phrase this question, then (2) paste that phrasing into a new chat.
For another kind of task, a colleague had written a very verbose prompt. Since I had to integrate it, I added some CRUD ops for prompts. For a test, I made a very short one, something like "analyze this as a <profession>". The output was pretty much comparable, except that the output for the longer prompt contained quite a few references to literal parts of that prompt. It wasn't incoherent, but it was as if the model (Gemini 2.5, btw) has a basic response for the task it extracts from the prompt and merges the superfluous bits in. It would seem that, at least for this particular task, the model cannot (easily) be made to "think" differently.
Yeah, I had this experience today: I had been running code review with a big, detailed prompt in CLAUDE.md, but then I ran it in a branch that didn't have that file yet and got better results.
> It might give you a feel of control and proper engineering
Maybe a super salty take, but I personally have never thought of anything involving an LLM as "proper engineering". "Flailing around", yes. "Trial and error", definitely. "Confidently wrong hallucinations", for sure. But "proper engineering" and "LLM" are two mutually exclusive concepts in my mind.
Same here: it starts with a relatively precise need, keeping a roadmap in mind rather than forcing one upfront. When it involves a technology I'm unfamiliar with, I also ask questions to understand what certain things mean before "copying and pasting".
I've found that with more advanced prompts, the generated code sometimes fails to compile, and tracing the issues backward can be more time consuming than starting clean.
I use specs in markdown for the more advanced prompts. I ask the LLM to refine the markdown first and add implementation steps, so I can review what it will do. When it starts implementing, I can always ask it to "just implement step 1, and update the document when done". You can also ask it to verify that the spec has been implemented correctly.
It already did. Programming languages are already very strict about syntax; professional jargon is the same way, and for the same reason: it eliminates ambiguity.
There is no such thing as "prompt engineering". Since when did the ability to write proper and meaningful sentences become engineering?
This is even worse than "software engineering". The unfortunate thing is that there will probably be job postings for such things and people will call themselves prompt engineers for their extraordinary abilities for writing sentences.
> Since when did the ability to write proper and meaningful sentences become engineering?
Since what's proper and meaningful depends on a lot of variables. Testing these, keeping track of them, logging and versioning take it from "vibe prompting" to "prompt engineering" IMO.
There are plenty of papers detailing this work. Some things work better than others ("do this and this" works better than "don't do this" - the pink elephant thing). Structuring is important. Style is important. Order of information is important. Re-stating the problem is important.
Then there are quirks within model families. If you're running an API-served model, you need internal checks to make sure the new version still behaves well on your prompts. These checks and tests are "prompt engineering".
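Concretely, those checks can be as boring as a table-driven test. A hedged sketch, with `callModel` standing in for your real client and the cases invented purely for illustration:

```typescript
// Stand-in for your real model client.
async function callModel(model: string, prompt: string): Promise<string> {
  throw new Error("wire this up to your model API");
}

// Each case pins an invariant the prompt must keep satisfying when the
// provider rolls out a new model version behind the same API.
const cases = [
  { input: "Refund order #123", mustMatch: /"intent":\s*"refund"/ },
  { input: "Where is my parcel?", mustMatch: /"intent":\s*"tracking"/ },
];

async function checkPrompt(promptTemplate: string, model: string): Promise<boolean> {
  let ok = true;
  for (const c of cases) {
    const output = await callModel(model, promptTemplate.replace("{{input}}", c.input));
    if (!c.mustMatch.test(output)) {
      console.error(`FAIL on "${c.input}": got ${output}`);
      ok = false;
    }
  }
  return ok;
}
```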
I feel a lot of people have a knee-jerk reaction to the hype and miss critical aspects because they want to dunk on it.
Yeah, if this catches on, we may well see the title "engineer" go the way of "manager" and "VP" over the last few decades... So, yeah, we may start seeing coffee engineers now :D
I would caution against thinking it's impossible even if it's not something you've personally experienced. Prompt engineering is necessary (but not sufficient) to creating high leverage outcomes from LLMs when solving complex problems.
Without it, the chances of getting to a solution are slim. With it, the chances of getting to 90% of a solution and needing to fine tune the last mile are a lot higher but still not guaranteed. Maybe the phrase "prompt engineering" is bad and it really should be called "prompt crafting" because there is more an element of craft, taste, and judgment than there is durable, repeatable principles which are universally applicable.
You're not talking to managers here; you can use plain English.
> Maybe the phrase "prompt engineering" is bad and it really should be called "prompt crafting" because there is more an element of craft, taste, and judgment than there is durable, repeatable principles which are universally applicable.
Yes, the biggest problem with the phrase is that "engineering" implies a well-defined process with predictable results (think of designing a bridge), and prompting doesn't check either of those boxes.
Let's say you have two teams of contractors. One from your native country (I'm assuming US here), working remotely and one from India, located in India.
Would you communicate with both in the exact same manner, without adjusting your messaging in any way?
Of course you would, that's exactly what "prompt engineering" is.
The language models are all a bit different and a bit fiddly at the moment, so getting quality output from each requires a specific input.
You can try it yourself, ask each of the big free-tier models to write a simple script in a specific language for you, every single one will have a different output. They all have a specific "style" they fall into.
I agree with yowlingcat's point but I see where you are coming from and also agree with you.
The way I see it, it's a bit like putting up a job posting for 'somebody who knows SSH'. While that is a useful skill, it's really not something you can specialize in since it's just a subset within linux/unix/network administration, if that makes sense.
There are so many prompting guides at the moment. Personally I think they are quite unnecessary. If you take the time to use these tools, build familiarity with them and the way they work, the prompt you should use becomes quite obvious.
It reminds me of how we had the same hype and FOMO when Google became popular. Books were being written on the subject, and you had to buy them or you would become a caveman in the near future. What happened is that anyone could learn the whole thing in a day, and that was it; no need to debate whether you would miss anything if you didn't know all those tools.
You're only proving the opposite: there's definitely a difference between an experienced Google user and someone who just puts in random words and expects to find what they need.
I think there are people for whom reading a prompt guide (or watching an experienced user) will be very valuable.
Many people just won't put any conscious thought into trying to get better on their own, though some of them will read or watch one thing on the topic. I will readily admit to picking up several useful tips from watching other people use these tools and from discussing them with peers. That's improvement that I don't think I achieve by solely using the tools on my own.
Many years ago there were guides on how to write user stories: “As a [role], I want to be able to do [task] so I can achieve [objective]”, because it was useful to teach high-level thinkers how to communicate requirements with less ambiguity.
It may seem simple, but in my experience even brilliant developers can miss or misinterpret unstructured requirements, through no fault of their own.
It's at least useful for seeing how other people are being productive with these tools. I also sometimes find a clever idea that improves on what I'm already doing.
And documenting the current state of this space as well. It's easy to have tried doing something a year ago and think they're still bad.
I also usually prefer researching an area before reinventing the wheel by trial and error myself. I appreciate it when people share what they've discovered with their own time, as I don't always have all the time in the world to explore it the way I would if I were still a teen.
A long time back, for my MS in CS, I took a science of programming course. Its approach to verification has helped me craft prompts when I do data engineering work. Basically:
Given input (…) and preconditions (…), write me Spark code that gives me postconditions (…). If you can formally specify the input, preconditions, and postconditions, you usually get good working code.
1. The Science of Programming, David Gries
2. Verification of concurrent and sequential systems
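For example, a specification-style prompt in that spirit might look like the following (the dataset, columns, and conditions are all invented for illustration):

```typescript
// Hypothetical pre/postcondition-style prompt; none of this refers to a real dataset.
const sparkPrompt = `
Input: a Spark DataFrame "events" with columns (user_id: string, ts: timestamp, amount: double).
Preconditions: ts is never null; amount may be negative (refunds).
Write Spark code whose output satisfies these postconditions:
  - exactly one row per (user_id, calendar day of ts)
  - a column "net_amount" equal to the sum of amount for that user and day
  - rows sorted by user_id, then day, ascending
`;

console.log(sparkPrompt);
```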
In my own experience, if the problem is not solvable by an LLM, no amount of prompt "engineering" will really help. The only way to solve it is to partially solve it yourself (breaking it down into sub-tasks / examples) and let it run its miles.
I'd love to be wrong though. Please share if anyone has a different experience.
I think part of the skill in using LLMs is getting a sense for how to effectively break problems down, and also getting a sense of when and when not to do it. The article also mentions this.
I think we'll also see ways of restructuring, organizing, and commenting code to improve interaction with LLMs. And I also expect LLMs to get better at doing this, and maybe at suggesting ways for programmers to break down problems it is struggling with.
I think the intent of prompt engineering is to get better solutions quicker, in formats you want. But yeah, ideally the model just "knows" and you don't have to engineer your question
I went into software instead, but IIRC sales and QA engineers were common jobs I heard about for people in my actual accredited (optical) engineering program. A quick search suggests it is common for sales engineers to have engineering degrees? Is this specifically about software (where "software engineers" frequently don't have engineering degrees either)?
I have a degree in software engineering and I'm still critical of its inclusion as an engineering discipline, just given the level of rigour that's applied to typical software development.
When it comes to "prompt engineering", the argument is even less compelling. It's like saying typing in a search query is engineering.
For real. Editing prompts bears no resemblance to engineering at all; there is no accuracy or precision. Say you have a benchmark to test against and you're trying to make an improvement. Will your change to the prompt make the benchmark go up? Down? Why? Can you predict? No, it is not a science at all. It's just throwing shit and examples at the wall with hopes and prayers.
Absolutely. It's not an appropriate way to describe developers in general either. That fight has been lost, I think, and that's all the more reason to push against this nonsense now.
Start out with TypeScript and have it answer data science questions - it won't know its way around.
Start out with Python and ask the same question - great answers.
LLMs can't (yet) really transfer knowledge between domains, you have to prime them in the right way.
Every day tech broism gets closer to a UFO sect.
My experience as well. I hesitate to admit this for fear of being labeled a luddite.
At the same time, I've seen the system prompts for a few agents (https://github.com/x1xhlol/system-prompts-and-models-of-ai-t...), and they are huge.
How does that work?
On the other hand, prompt tweaking can be learned in a few days just by experimenting.
That could be said about ordering coffee at a local coffee shop. Is there a "barista order engineering" guide we are all supposed to read?
> Re-stating the problem is important.
Maybe you can show us some examples?
I don't think you have to worry about that.
For the uneducated, law engineers are members of the Congress / Parliament / Bundestag / [add for your own country]
I get by just fine with pasting raw code or errors and asking plain questions; the models are smart enough to figure it out themselves.
> Calling someone a prompt engineer is like calling the guy who works at Subway an artist because his shirt says ‘Sandwich Artist.’
All jokes aside, I wouldn't get too hung up on the title; the term "engineer" has long since been diluted to the point of meaninglessness.
https://jobs.mysubwaycareer.eu/careers/sandwich-artist.htm
https://en.wikipedia.org/wiki/Audio_engineer
There are prompts to be used with APIs and inside automated workflows, and more besides.
Many prompt engineers do measure and quantitatively compare.