Ask HN: Are you using a GPT to prompt-engineer another GPT?

chalsprhebaodu · 2 years ago

I’ve commented it before, and surely it’s something I’m doing wrong, but I cannot believe system prompts or GPTs or any amount of instructing actually works for people to get ChatGPT to respond in a certain fashion with any consistency.

I have spent hours and hours and hours and hours trying to get ChatGPT to be a little less apologetic, long-winded, to stop reiterating, and to not interpret questions about its responses as challenges (i.e when I say “what does this line do?” ChatGPT responds “you’re right, there’s another way to do it…”).

Nothing and I mean NOTHING will get ChatGPT with GPT-4 to behave consistently. And it gets worse every day. It’s like a twisted version of a genie misinterpreting a wish. I don’t know if I’ve poisoned my ChatGPT or if I’m being A/B tested to death but every time I use ChatGPT I very seriously consider unsubscribing. The only reasons I don’t are 1) I had an insanely impressive experience with GPT-3, and 2) Google is similarly rapidly decreasing in usefulness.

marviel · 2 years ago

1. Use the API 2. Use Function Calling, with detailed parameters, well named output variables describing the format of the output you want.

You'll get much, much, much better results.

mpalmer · 2 years ago

My evergreen system prompt prefix:

"You are a maximally terse assistant with minimal affect."

Works well for the most part.

ojosilva · 2 years ago

Another issue I have, especially when demanding terseness, is that it tends to bail out of writing long code snippets with ellipsis comments like "// And more of the same here" which sometimes defeats the purpose. Except when the code is illustrative to a concept, I want it to be thorough and code the damn thing to the last semicolon.

My solution, which works sometimes, is to instruct it to "not write comments in the code." The drawback is that ChatGPT normally does a good job adding comments, but not something I can't live without.

This "code-trimming" effect does not show up for me in API requests.

flir · 2 years ago

I second "terse". Damn useful word.

"No moral lectures. No need to mention your knowledge cutoff. No need to disclose you're an AI. Be detailed and complete, but terse."

Gonna try rewriting in the second person, based on your prompt.

I often feel like I'm trying to undo the damage done by OpenAI though. The API doesn't seem to need this crap.

hackerlight · 2 years ago

OpenAI really should fix this. I've started using Bard and brevity comes out of the box. When I used ChatGPT I always had this background feeling of irritation at the ridiculously verbose responses.

Manouchehri · 2 years ago

Using JSON mode with the GPT 3.5/4 API works well for us. So much so that we have to intentionally fake errors to test that our retries/fallbacks actually work in our code.

BOOSTERHIDROGEN · 2 years ago

Have you compared this to chatgpt plus?

crooked-v · 2 years ago

I would assume a lot of that has to do with whatever obsequieous nonsense they've got in the RLHF 'safety' training, and you're not getting rid of that without pushing it into a totally different context via DAN-like 'jailbreaks'.

Solvency · 2 years ago

It wasn't always like this. GPT in early 2023, hell late 2022, was incredible. I could have it fully stimulating a Unix terminal on acid for hours, it'd never break character. It's so insanely nerfed now.

muzani · 2 years ago

It's insanely good every time they have a public release, then deteriorates significantly. There's plenty of evidence around this too - just compare the exact same prompt then and now. Not sure if this is a matter of cost or just playing whack a mole with unintended behaviorial bugs.

xyproto · 2 years ago

I have a similar experience.

Only asking for things I expect it to be able to find online helps a lot, though.

The moment I try to be innovative or mix two ideas into something new or novel, it falls to pieces in the most frustrating way.

ben30 · 2 years ago

I have this as my custom prefix for when talking to gpt:

Cut unnecessary words, choose those that remain from the bedrock vocabulary everyone knows and keep syntax simple. Opt for being brusque but effective instead of sympathetic. Value brevity. Bullet points, headings and formatting for emphasis are good.

ojosilva · 2 years ago

Unfortunately, in my experience, as the chat session advances it seems to forget these instructions and become its old apologetic self again.

losteric · 2 years ago

Do you thumbs down the bad responses?

chalsprhebaodu · 2 years ago

Religiously.

muzani · 2 years ago

The hack is using GPT-3, and I don't mean 3.5. It still performs to a production level, at least for creative work. It's been sped up and is significantly cheaper.

kirkarg · 2 years ago

I'm not sure about web based service but with the API this is easily achievable by tinkering with the system message.

adamgordonbell · 2 years ago

Share some chats. It will be instructive for others and maybe somebody has a solution.

kromem · 2 years ago

A fun bug is that ChatGPT will always use an emoji when apologizing. So if you ask it not to use emojis in a chat and it does (which it often will do in promising not to), and point it out, it results in a loop of apologies and self critique that devolves into modeling an existential crisis.

JohnBooty · 2 years ago

That's interesting. I've seen a lot of apologies from ChatGPT-4 and I don't think I've ever seen an emoji.

I've never asked it not to, either.

Solvency · 2 years ago

This isn't even remotely true. I've never once seen an emoji from it in over a year of daily use.

lulznews · 2 years ago

Yea why did they build in the wishy washy wokeness … sigh. Very difficult to get succinct answers from it.

Deleted Comment

Dead Comment

knrz · 2 years ago

You should check out https://x.com/lateinteraction's DSPy — which is like an optimizer for prompts — https://github.com/stanfordnlp/dspy

nl · 2 years ago

Yes, I've had great success with this in a few cases.

There's the obvious "create a stable diffusion prompt with all the line noise of 'unreal engine 4K high quality award winning photorealistic'" stuff which is pretty obvious.

Less obvious is using it to refine system prompts for the "create your own GPTs" thing. I used this approach for my "Chat with Marcus Aurelius, Emperor of Rome and Stoic philosopher"[1] and "New Testament Bible chat"[2]

I'm particularly happy with how well the Marcus Aurelius one works, eg: https://chat.openai.com/share/27323fe8-56e2-4620-8e4a-3ebf69...

For both of these I started with a rough prompt and then asked GPT4 to refine it.

I found the key was to make sure to read the generated prompt very carefully to make sure it is actually asking for what you want.

More recently I've been using the same technique for some more complicated use-cases: creating a prompt for GPT-4 to rank answers and creating prompts for Mistral-7B. The same basic approach works well for both of these.

[1] https://chat.openai.com/g/g-qAICXF1nN-marcus-aurelius-empero...

[2] https://chat.openai.com/g/g-CBLrOOGjA-official-new-testament...

zamadatix · 2 years ago

ChatGPT also uses the first approach for image generation. It even released before direct access to Dall-E 3 did.

nbardy · 2 years ago

Yes. I deploy prompts professionally for work and I almost always iterate with chatGPT.

It requires a bit of back forth but you can get great results. It lets you iterate at a higher level instead of word for word.

I also find that the prompts work better. Prompt engineering is often about finding magic words and sentences that are dense keywords from the training data and another LLM is going to be good at finding those phrases because it knows those phrases the best.

Here’s an example dialogue I was using recently to iterate on a set of prompts for generating synthetic training data for LLM training. (Inspired by phi-2)

https://chat.openai.com/share/51dd634b-7743-4b5e-9c3f-3d57c6...

strangattractor · 2 years ago

Sounds a lot like what I do when choosing words to Google for things.

Dead Comment

CrypticShift · 2 years ago

On a related note, with the (tens of) thousands of "custom GPTs" coming up in the next few years, it would be interesting if the chat would automatically recommend using any one of them in response to a particular query. In a way, it is as if it is directing you to a (human-made) better engineered (pre) prompt.

Gustek · 2 years ago

GPT store kind of has it already, tell it what you want and it will give you suggestions for GPTs

ReDeiPirati · 2 years ago

We recently open sourced an agent framework [1] for automating data processing and labeling where the agent's prompt is refined trough iterations with the environment and then asking to an LLM to revise the prompt according to its performance (i.e. automatic prompt tuning). We tested it on the Math reasoning dataset GSM8k and where able to improve the baseline accuracy (GPT4) by 45% -> 74% using 25 labeled examples (I'll put the notebook and blog post linked below [2][3]). Results are definitively very interesting, if not surprising with some skills, and we see more and more of our open source users and customers showing interested in the framework for automating labeling / having it as a copilot.

[1] https://github.com/HumanSignal/Adala

[2] https://github.com/HumanSignal/Adala/blob/master/examples/gs...

[3] https://labelstud.io/blog/mastering-math-reasoning-with-adal...

alfozan · 2 years ago

Checkout Magic Prompts https://magicprompts.lyzr.ai/

FinalDestiny · 2 years ago

Yes, I just used GPT-4 to create a prompt for GPT-3.5-Turbo based on some loose rules that I laid out. It helped me fill in the gaps and write it in a concise format.

The prompt gave much much better results than than the one I wrote.