Diagrams AI can, and cannot, generate

A mistake I see people repeating over and over, is never restarting their conversations with a edited initial message.

Instead of doing what the author is doing here, and sending messages back and forward, leading to a longer and longer conversation, where each messages leads to worse and worse quality replies, until the LLM seems like a dumb rock, rewrite your initial message with everything that went wrong/was misunderstood, and aim to have whatever you want solved in the first message, and you'll get a lot higher quality answers. If the LLM misunderstood, don't reply "No, what I mean was..." but instead rewrite the first message so it's clearer.

This is at least true for all ChatGPT, Claude and DeepSeek models, YMMV with other models.

swatcoder · 9 months ago

Yup.

Inasmuch as these are collaborative document generators at their core, "minimally ambiguous prompt and conforming reply" is a strongly represented document structure and so we benefit by setting them up to complete one.

Likewise, "tragi-comic dialog between increasingly frustrated instructor and bumbling pupil" is also a widely represented document structure that we benefit by trying to avoid.

Chatbot training works to minimize the chance of an LLM engaging in the latter, because dialog is a intuitive interface that users enjoy, but we can avoid the problem more successfully by just providing a new and less ambiguous prompt in a new session, as you suggest.

dartos · 9 months ago

> dialog is a intuitive interface that users enjoy

Do people enjoy chat interfaces in their workflows?

I always thought that cursor/copilot/copy.ai/v0.dev were so popular because they break away from the chat UI.

Dialog is cool when exploring but, imo, really painful when trying to accomplish a task. LLMs are far too slow to make a real fluid conversation.

yuvalr1 · 9 months ago

This means the leading UI for LLMs - the chat - is the wrong UI, at least for some of the tasks. We should instead have a single query text field, like in search engines, that you continue to edit and refine, just like in complex search queries.

freehorse · 9 months ago

I like zed's approach, where the whole discussion is a plain text file you can edit as any text, which gives you the ability to change anything in the "discussion" regardless if it was generated by you or the llm. It makes stuff like that much simpler, ie you can correct simple stuff in the llm's response without unecessary back and forths, you can just cut parts out of the discussion to reduce context size or guide the discussion where you actually want removing distractions etc. I don't understand why the dominant approach is an actual, realistic chat interface where you can only add a new response, or in best case create "threads".

refsab · 9 months ago

I've found the most useful LLM UIs for me are tree-like with lots of branches where you go back and forth between your prompts. You branch off anywhere, edit top or leafs as you go.

If one branch doesn't work out you go back to the last node that gave good results or the top and create another branch with a different prompt from.

Or if you want to ask something in a different direction but don't want all the baggage from recent nodes.

Example: https://exoloom.io/trees

sgillen · 9 months ago

I still think there is value in chats and retaining context. But there is also value in starting clean when necessary. Giving users control and teaching people how to use it is the way IMO.

diggan · 9 months ago

> This means the leading UI for LLMs - the chat - is the wrong UI

For coding, I'd agree. But seemingly people use LLMs for more than that, but I don't have any experience myself. But I agree with the idea that we haven't found the right UX for programming with LLMs yet. I'm getting even worse results with Aider, Cursor and all of those, than just my approach outlined above, so that doesn't seem like the right way either.

Deleted Comment

barnas2 · 9 months ago

I've also started adding "Ask any questions you think are relevant before starting" to the end of my prompts. It usually results in at least one question that addresses something I didn't think to add to my prompt.

kordlessagain · 9 months ago

I’ve been saying “stop writing code until we agree what needs to be done”.

godelski · 9 months ago

It seems like the author in fact did do this. They asked Claud the same message. I really doubt they repeated the entire conversation to get to that point, but I may be wrong.

From personal experience, I agree with you, but I wouldn't make the critique here as it is far from a magic bullet. Honestly, with the first stuff it seems faster to learn mermaid and implement it yourself. Mermaid can be learned in a rather short time, the basic syntax is fairly trivial and essentially obvious. As an added benefit, you then get to have this knowledge and use it later on. This will certainly feel slower than the iterative back and forth with a LLM -- either by follow-up conversations or refining your one shot -- but I'm not convinced it will be a huge difference in time as measured by the clock on the wall[0]

[0] idk, going back and forth with an LLM and refining my initial messages feels slow to me. It reminds me of print statement debugging in a compiled language. Lots of empty time.

diggan · 9 months ago

> It seems like the author in fact did do this.

It doesn't seem like that to me. At one point in the article: "There are also a few issues [...] Let’s fix with the prompt" and then a prompt that is referring the previous message. Almost all prompts after that seem to depend on the context before them.

My point is that instead of doing that, revise the original initially message so the very first response from the LLM doesn't contain any errors, because (in my experience) that's way easier and faster than trying to correct errors by adding more messages, since they all (even O1 Pro) seem to lose track of what's important in the conversation really fast.

bpodgursky · 9 months ago

100%

To be honest, this would help a lot of person-implemented iteration too, if it was biologically feasible to erase a conversation from a brain.

dingnuts · 9 months ago

alright, time for you to go watch Eternal Sunshine of the Spotless Mind so that you can disabuse yourself of that notion

danenania · 9 months ago

I built Plandex[1], an open source AI coding agent, partly to enable this workflow.

It has `log` and `rewind` commands that allow you to easily back up to any previous point in the conversation and start again from there with an updated prompt. Plandex also has branches, which can be helpful for not losing history when using this approach.

You’re right that it’s often a way to get superior results. Having mistakes or bad output in the conversation history tends to beget more mistakes and bad output, even if you are specifically directing the LLM to fix those things. Trial and error with a new prompt and clean context avoids this problem.

1 - https://plandex.ai

P.S. I wrote a bit about the pros and cons of this approach vs. continuing to prompt iteratively in Plandex’s docs here: https://docs.plandex.ai/core-concepts/prompts#which-is-bette...

01100011 · 9 months ago

I tried this approach when attempting to get Deepseek-r1 and GrokV3 to create a simple CUDA application. It was necessary because the iterative approach kept leading to hangs and divergent behaviors. I still wasn't able to get a working application, however.

kordlessagain · 9 months ago

I love Claude, but whomever works on their UI needs to be slapped a bit. Code output covering the stop button on my laptop, page lockups on iPhone/Chrome with certain artifacts (even after reload), crazy slow typing on the computer and refusal to “continue” chat with a cheaper model. Simply providing a summary of the chat on running out of tokens would let me start another conversation, or at least a warning I was getting close.

throwaway519 · 9 months ago

Markov Chain system doesn't like Markov Chain input.

th0ma5 · 9 months ago

In my experience this only marginally improves things. It constantly offers new ways to be wrong.

chatmasta · 9 months ago

That’s too much work. I’d rather ask the LLM to rewrite my first message for me. And the UI should then give me an option to “start new chat from suggested prompt.”

diggan · 9 months ago

> I’d rather ask the LLM to rewrite my first message for me

I guess you can do that too, as long as you start a new conversation afterwards. Personally I found it much easier to keep prompts in .md files on disk, and paste them into the various interfaces when needed, and then I iterate on my local files if I notice the first answer misunderstood/got something wrong. Also lets you compose prompts which is useful if you deal with many different languages/technologies and so on.

Random thoughts:

Sketching backed by automated cleanup can be good for entering small diagrams. There used to be an iOS app based on graphviz: http://instaviz.com

Constraint-based interactive layout may be underinvested, as a consequence of too many disappointments and false starts in the 1980s.

LLMs seem ill-suited to solving the optimization of combinatorial and geometric constraints and objectives required for good diagram layout. Overall, one has to admire the directness and simplicity of mermaid. Also, it would be great to someday see a practical tool with the quality and generality of the ultra-compact grid layout prototype from the Monash group, https://ialab.it.monash.edu/~dwyer/papers/gridlayout2015.pdf (2015!!)

ttd · 9 months ago

Oh wow, thank you for linking that paper. I've been working an interactive tool for a while and have been musing on new constraint and layout types to add. Anecdotally it seems a lot of mainstream graph layout algorithms work well for small to mediumish complexity inputs, but then quickly start generating visual spaghetti. So this looks incredibly apropos for me.

relaxing · 9 months ago

App is unavailable in the US :(

teleforce · 9 months ago

Thanks for link to the Monash's paper.

>LLMs seem ill-suited to solving the optimization of combinatorial and geometric constraints and objectives required for good diagram layout.

I think this is where LLM distance NLP cousin can be of help namely CUE since fundamentally it's based on feature structure from the deterministic approach of NLP unlike LLM that's stochastic NLP [1],[2],[3].

Based on the Monash's paper, Constraint Programming (CP) is one of the popular approaches that's being used for the automatic grid layout.

Since CUE is a constraint configuration language belong to CP, and its NLP background should make it easier and seamless to integrate with LLM. If someone somehow can crack this then it will be a new generation LLM that can perform good and accurate diagramming via prompts and it will be a boon for the architect, designer and engineer. Talking about engineer, if this approach can also be used for IC layout design (analog and digital) not only for diagrams, it will easily disrupt the multi-billion dollars industry for the very expensive software for IC design and man powers.

I hope I'm not getting ahead of myself, but ultimately this combo can probably solve the "holy grails" problem mentioned towards the end of the paper's conclusions regarding layout model that somehow incorporates routing in a way that is efficiently solvable to optimality. After all some people in computer science consider CP as "holy grails" of programming [4].

Please someone somehow make a start up, or any existing YC startup like JITX (Hi Patrick) can look into this potential fruitful endeavor of hybrid LLM combo for automated IC design [5].

Perhaps your random thoughts are not so random but deterministic non-random in nature, pardon the pun.

[1] Cue – A language for defining, generating, and validating data:

https://news.ycombinator.com/item?id=20847943

[2] Feature structure:

https://en.m.wikipedia.org/wiki/Feature_structure

[3] The Logic of CUE:

https://cuelang.org/docs/concept/the-logic-of-cue/

[4] Solving Combinatorial Optimization Problems with Constraint Programming and OscaR [video]:

https://m.youtube.com/watch?v=opXBR00z_QM

[5] JITX: Automatic circuit board design:

https://www.ycombinator.com/companies/jitx