I find the word "engineering" used in this context extremely annoying. There is no "engineering" here. Engineering is about applying knowledge, laws of physics, and rules learned over many years to predictably design and build things. This is throwing stuff at the wall to see if it sticks.
Words often have multiple meanings. The “engineering” in “prompt engineering“ is like in “social engineering”. It’s a secondary, related but distinct meaning.
For example, Google defines the second meaning of "engineering" as:
2. the action of working _artfully_ to bring something about. "if not for his shrewd engineering, the election would have been lost"
(https://www.google.com/search?q=define%3AEngineering)
Merriam-Webster has:
3 : calculated manipulation or direction (as of behavior), giving the example of “social engineering”
(https://www.merriam-webster.com/dictionary/engineering)
Random House has:
3. skillful or artful contrivance; maneuvering
(https://www.collinsdictionary.com/dictionary/english/enginee...)
Webster's has:
The act of maneuvering or managing.
(https://www.yourdictionary.com/engineering)
Look up “engineering” in almost any dictionary, and it will list something along those lines as one of the meanings of the word. It is a well-established, nontechnical meaning of “engineering”.
Your posted definitions contradict your conclusion - I would argue there is nothing calculated (as the parent poster said, there is no calculation, it's just trying and watching what works), artful, or skillful (because it's so random, what skill is there to develop?) about "prompt engineering".
And in fact, the first engines were developed without a robust understanding of the physics behind them. So the original version of 'engineering' is closer to the current practices surrounding AI than the modern reinterpretation the root comment demands.
I still like the Canadian approach that to have a title with the word Engineer in it you have to be licensed by the engineering regulator for the province you work in. The US way, where every software dev, mechanic, HVAC installer, or plumber is an engineer, is ridiculous.
Disagree. I think it's valid to describe your work as engineering if it is in fact engineering, regardless of credential. If the distinction is important, call it "<credential name> Engineer". But to simply seize the word and say you can't use it until you have this credential is authoritarian, unnecessary, rent seeking corruption.
> I still like the Canadian approach that to have a title with the word Engineer in it you have to be licensed by the engineering regulator for the province you work in.
That's just not true.
(Despite what Engineers Canada and related parasites tell you.)
You could make this same argument about a lot of work that falls onto "engineering" teams.
There's an implicit assumption that anything an engineer does is engineering (and a deeper assumption that software as a whole is worthy of being called software engineering in the first place)
Perhaps. My point is that the word "engineering" describes a specific approach, based on rigor and repeatability.
If the results of your work depend on a random generator seed, it's not engineering. If you don't have established practices, it's not engineering (hence "software engineering" was always a dubious term).
Throwing new prompts at a machine with built-in randomness to see if one sticks is DEFINITELY not engineering.
I've seen some good arguments recently that software engineering is weird in that computers ARE completely predictable - which isn't the case for other engineering fields, where there are far more unpredictable forces at play and the goal is to engineer in tolerances to account for that.
So maybe "prompt engineering" is closer to real engineering than "software engineering" is!
With distributed systems I'd say network unreliability introduces a good amount of unpredictability. Whether that's comparable to what traditional engineering disciplines see, I couldn't say. Some types of embedded programming, especially those deployed out in the field, might also need to account for non-favorable conditions. But the predictability argument is interesting nonetheless.
Engineers work with non-deterministic systems all the time. Getting them to work predictably within a known tolerance window and/or with a quantified and acceptable failure rate is absolutely engineering.
I saw a talk by somebody from a big national lab recently, and she was announced as the "facilities manager". I wondered for about 5 seconds why the janitor was giving a talk at a technical conference, but it turns out facility meant the equivalent of a whole lab/instrument. She was the top boss.
Assume, for the sake of argument, that this is literally sorcery -- i.e. communing with spirits through prayer.
_Even in that case_, if you can design prayers that get relatively predictable results from gods and incorporate that into automated systems, that is still engineering. Trying to tame chaotic and unpredictable systems is a big part of what engineering is. Even designing systems where _humans_ do all the work -- just as messy a task as dealing with LLMs, if not more -- is a kind of engineering.
> rules learned over many years
How do you think they learned those rules? People were doing engineering for centuries before science even existed as a discipline. They built steam engines first and _then_ discovered the laws of thermodynamics.
There is engineering when this is done seriously, though.
Build a test set and design metrics for it. Do rigorous measurement on any change of the system, including the model, inference parameters, context, prompt text, etc. Use real statistical tests and adjust for multiple comparisons as appropriate. Have monitoring that your assumptions during initial prompt design continue to be valid in the future, and alert on unexpected changes.
I'm surprised to see none of that advice in the article.
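To make that concrete, here is a minimal sketch (Python, untested) of what a serious evaluation loop could look like. Everything named here is a placeholder I made up: call_model stands in for whatever inference API you use, the test set and exact-match metric are toys, and the paired bootstrap is just one reasonable way to check whether a prompt change actually helped.

    import random

    # Hypothetical stand-in for your inference API; returns the model's answer as text.
    def call_model(prompt: str, question: str) -> str:
        return "stub answer"  # replace with a real API call

    # A small labeled test set of (question, expected answer) pairs.
    TEST_SET = [
        ("What is 2 + 2?", "4"),
        ("What is the capital of France?", "Paris"),
        # ...ideally a few hundred items sampled from real traffic
    ]

    def score(prompt: str) -> list:
        # Exact-match metric: 1.0 if the answer matches the label, else 0.0.
        return [1.0 if call_model(prompt, q).strip() == a else 0.0 for q, a in TEST_SET]

    def paired_bootstrap(a: list, b: list, iters: int = 10_000, seed: int = 0) -> float:
        # Resample the same items for both prompts so per-item difficulty cancels out.
        # Returns the fraction of resamples where prompt B beats prompt A
        # (around 0.5 means no detectable difference; 1 minus this value is a rough
        # one-sided p-value for "B is better").
        rng = random.Random(seed)
        n, wins = len(a), 0
        for _ in range(iters):
            idx = [rng.randrange(n) for _ in range(n)]
            if sum(b[i] for i in idx) > sum(a[i] for i in idx):
                wins += 1
        return wins / iters

    baseline = score("Answer the question concisely.")
    candidate = score("List the relevant facts first, then answer concisely.")
    print("baseline accuracy:", sum(baseline) / len(baseline))
    print("candidate accuracy:", sum(candidate) / len(candidate))
    print("P(candidate beats baseline):", paired_bootstrap(baseline, candidate))
    # If you compare many prompt variants, adjust for multiple comparisons,
    # e.g. Bonferroni: demand a p-value below 0.05 / number_of_variants.

Run the same harness on a schedule against fresh samples of real traffic and you get the monitoring/alerting part too.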
I found Hillel Wayne's series of articles from a few years ago about the relationship between software and other engineering disciplines fairly insightful [1]. It's not _exactly_ the same topic but there's a lot of overlap in defining what "real engineering" is.
[1] https://hillelwayne.com/post/are-we-really-engineers/
It’s not engineering if you throw anything together without much understanding of the why of things.
But if you understand the model architecture, training process, inference process, computational linguistics, applied linguistics in the areas of semantics, syntax, and more— and apply that knowledge to prompt creation… this application of knowledge from systemic fields of inquiry is the definition of engineering.
Black box spaghetti-hits-wall prompt creation? Sure, not so much.
Part of the problem is that the “physics” of prompting changes with the models. At the prompt level, is it even possible to engineer when the laws of the universe aren’t even stable?
Engineering of the model architecture, sure. You can mathematically model it.
Prompts? Perhaps never possible.
Indeed. Engineering is the act of employing our best predictive theorems to manifest machines that work in reality. Here we see people doing the opposite, describing theorems (and perhaps superstitions) that are hoped to be predictive, on the basis of observing reality. However, insofar as these theorems remain poor in their predictive power, their application can scarcely be called engineering.
https://news.ycombinator.com/item?id=44978319
(They're not an LLM fan; also: I directionally agree about "prompt" engineering, but the argument proves too much if it disqualifies "context" engineering, which is absolutely a normal CS development problem).
> Engineering is about applying knowledge, laws of physics, and rules learned over many years to predictably design and build things. This is throwing stuff at the wall to see if it sticks.
There’s one other type of “engineering” that this reminds me of…
1) Software engineers don't often have deep physical knowledge of computer systems, and their work is far more involved with philosophy and to a certain extent mathematics than it is with empirical science.
2) I can tell you're not current with advances in AI. To be brief, just like with computer science more broadly, we have developed an entire terminology, reference framework and documentation for working with prompts. This is an entire field that you cannot learn in any school, and increasingly they won't hire anyone without experience.
Unless you are going by a legal definition, where there's an explicit enumeration of the tasks it covers, "engineering" means building stuff. Mostly stuff that is not "art", but sometimes even that.
Building a prompt is "prompt engineering". You could also call it "prompt crafting", or "prompt casting", but any of those would do.
Also, engineering has a strong connotation of messing with stuff you don't understand until it works reliably. Your idea of it is very new, and doesn't even apply to all areas that are officially named that way.
First they came for science: Physics, Chemistry, Biology -vs- social science, political science, nutrition science, education science, management science...
Now they come for engineering: software engineering, prompt engineering...
Here's my best prompt engineering advice for hard problems. Always funnel out and then funnel in. Let me explain.
State your concrete problem and context. Then we funnel out by asking the AI to do a thorough analysis and investigate all the possible options and approaches for solving the issue. Ask it to go search the web for all possible relevant information. And now we start funneling in by asking it to list the pros and cons of each approach. Finally we ask it to choose which one or two solutions are the most relevant to our problem at hand.
For easy problems you can just skip all of this and just ask directly because it'll know and it'll answer.
The issue with harder problems is that if you just ask it directly to come up with a solution then it'll just make something up and it will make up reasons for why it'll work. You need to ground it in reality first.
So you do: concrete context and problem, thorough analysis of options, list pros and cons, and pick a winner.
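If it helps, here is roughly what that funnel looks like written down as a reusable template. This is just my own sketch of the steps above (Python, with made-up placeholder text), not a canonical recipe:

    # "Funnel out, then funnel in" as a single reusable prompt template.
    FUNNEL_TEMPLATE = """\
    Problem: {problem}
    Context: {context}

    1. Funnel out: do a thorough analysis of this problem. List every plausible
       option or approach, and search the web for any relevant information.
    2. Funnel in: for each approach, list its concrete pros and cons.
    3. Finally, pick the one or two approaches that best fit the problem above
       and explain why they beat the alternatives.
    """

    def build_funnel_prompt(problem: str, context: str) -> str:
        return FUNNEL_TEMPLATE.format(problem=problem, context=context)

    # Example (made-up scenario):
    print(build_funnel_prompt(
        problem="Intermittent 502s from our reverse proxy under load",
        context="nginx in front of three Node services; started after last week's config change",
    ))

The point isn't the exact wording, just that the breadth step always comes before the narrowing step.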
“Honey, which restaurant should we eat at tonight? First, create a list of restaurants and highlight the pros and cons of each. Conduct a web search. Narrow this down to 2 restaurants and wait for a response.”
The big unlock for me reading this is to think about the order of the output. As in, ask it to produce evidence and indicators before answering a question. Obviously I knew LLMs are probabilistic autocomplete. For some reason, I didn't think to use this for priming.
Note that this is not relevant for reasoning models, since they will think about the problem in whatever order they want to before outputting the answer. Since they can “refer” back to their thinking when outputting the final answer, the output order is less relevant to the correctness. The relative robustness is likely why OpenAI is trying to force reasoning onto everyone.
This is misleading if not wrong. A thinking model doesn’t fundamentally work any different from a non-thinking model. It is still next token prediction, with the same position independence, and still suffers from the same context poisoning issues. It’s just that the “thinking” step injects this instruction to take a moment and consider the situation before acting, as a core system behavior.
But specialized instructions to weigh alternatives still works better as it ends up thinking about thinking, thinking, then making a choice.
Furthermore, the opposite behavior is very, very bad. Ask it to give you an answer and justify it, and it will output a randomish reply and then enter bullshit mode, rationalizing it.
Ask it to objectively list pros and cons from a neutral/unbiased perspective and then proclaim an answer, and you’ll get something that is actually thought through.
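One way to encode that ordering explicitly is to dictate the output structure so the evidence and the pros/cons get generated before the verdict. A sketch, with made-up question and section wording:

    # Two ways to ask the same question. The first invites "answer, then rationalize";
    # the second forces the justification tokens to be generated before the answer,
    # which is exactly the ordering point made above.
    ANSWER_FIRST = (
        "Should we migrate this service from REST to gRPC? "
        "Give your answer, then justify it."
    )

    EVIDENCE_FIRST = """\
    Should we migrate this service from REST to gRPC?

    Respond in exactly this order:
    1. The relevant facts and constraints you are relying on.
    2. Pros and cons of migrating vs. staying, from a neutral perspective.
    3. Only then, your recommendation in one sentence.
    """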
I typically ask it to start with some short, verbatim quotes of sources it found online (if relevant), as this grounds the context into “real” information, rather than hallucinations. It works fairly well in situations where this is relevant (I recently went through a whole session of setting up Cloudflare Zero Trust for our org, this was very much necessary).
I try so hard to get ChatGPT to link and quote real documentation. It makes up links and fake quotes, and it even gaslights me when I point out the information isn’t real.
Reminds me of a time when I found I could speed up an algorithm in a benchmark set by 30% if I seeded the random number generator with the number 7. Not 8. Not 6. 7.
"Engineering" here seems rhetorically designed to convince people they're not just writing sentences. With respect "prompt writing" probably sounds bad to the same type of person who thinks there are "soft" skills.
One could similarly argue software engineering is also just writing sentences with funny characters sprinkled in. Personally, my most productive "software engineering" work is literally writing technical documents (full of sentences!) and talking to people. My mechanical engineering friends report similar as they become more senior.
I don't think so. It says the words were chosen to engineer people's emotions and make them feel the right way.
Tech people do not feel good about "writing prompt essays", so it is called engineering to buy their emotional acceptance.
Just like we call wrong output "hallucination" rather than "bullshit" or "lie" or "bug" or "wrong output". Hallucination is used to make us feel better and more accepting.
Yeah, precisely what I'm saying. I don't think "they write prompt 'engineering' instead of 'writing' to maintain the fragile egos of people who use chatbots" [don't agree? See Mr. "but muh soft skills!" crying down thread] is worth saying outside of HN if I'm honest.
This is written for the Claude 3 models (Haiku, Sonnet, Opus). While some lessons will be relevant today, others will not be useful or necessary on smarter, RL’d models like Sonnet 4.5.
> Note: This tutorial uses our smallest, fastest, and cheapest model, Claude 3 Haiku. Anthropic has two other models, Claude 3 Sonnet and Claude 3 Opus, which are more intelligent than Haiku, with Opus being the most intelligent.
Yes, Chapters 3 and 6 are likely less relevant now. Any others? Specifically assuming the audience is someone writing a prompt that’ll be re-used repeatedly or needs to be optimized for accuracy.
the "engineering means working with engines" gibberish at the bottom is simply dishonest at best
"engineering" means "it's not guessing game"
I do that every morning, before applying my bus-taking engineering to my job.
Because I do prompt engineering for a living.
So many words lost their meaning today... I am glad I'm not the only one annoyed by this.
Words evolve over time because existing words get adapted in ways to help people understand new concepts.
Even minor changes to models can render previous prompts useless or invalidate assumptions for new prompts.
Changing the production or operating process in the face of changing inputs or desired outputs is the bread and butter of countless engineers.
Prompt engineering is honestly closer to the work of an o.g. engineer. Monitor the dials. Tweak the inputs. Keep the train on time.
Maybe Blahgineering?
"The words "blah blah blah" increase AI accuracy" - https://news.ycombinator.com/item?id=45524413
Now it's all about Context Engineering which is very much engineering.
:P