wiremine · a year ago
I'm going to take a contrarian view and say it's actually a good UI, but it's all about how you approach it.

I just finished a small project where I used o3-mini and o3-mini-high to generate most of the code. I averaged around 200 lines of code an hour, including the business logic and unit tests. The total was around 2200 lines. So, not a big project, but not a throwaway script. The code was perfectly fine for what we needed. This is the third time I've done this, and each time I get faster and better at it.

1. I find a "pair programming" mentality is key. I focus on the high-level code, and let the model focus on the lower level code. I code review all the code, and provide feedback. Blindly accepting the code is a terrible approach.

2. Generating unit tests is critical. After I like the gist of some code, I ask for some smoke tests. Again, peer review the code and adjust as needed.

3. Be liberal with starting a new chat: the models can get easily confused with longer context windows. If you start to see things go sideways, start over.

4. Give it code examples. Don't prompt with English only.

FWIW, o3-mini was the best model I've seen so far; Sonnet 3.5 New is a close second.

ryandrake · a year ago
I guess the things I don't like about Chat are the same things I don't like about pair (or team) programming. I've always thought of programming as a solitary activity. You visualize the data structures, algorithms, data paths, calling flow and stack, and so on, in your mind, with very high throughput "discussions" happening entirely in your brain. Your brain is high bandwidth, low latency. Effortlessly and instantly move things around and visualize them. Figure everything out. Finally, when it's correct, you send it to the slow output device (your fingers).

The minute you have to discuss those things with someone else, your bandwidth decreases by orders of magnitude and now you have to put words to these things and describe them, and physically type them in or vocalize them. Then your counterpart has to input them through his eyes and ears, process that, and re-output his thoughts to you. Slow, slow, slow, and prone to error and specificity problems as you translate technical concepts to English and back.

Chat as an interface is similarly slow and imprecise. It has all the shortcomings of discussing your idea with a human and really no upside besides the dictionary-like recall.

yarekt · a year ago
That's such a mechanical way of describing pair programming. I'm guessing you don't do it often (understandable if it's not working for you).

For me pair programming accelerates development to much more than 2x. Over time the two of you figure out how to use each other's strengths, and as both of you immerse yourself in the same context you begin to understand what's needed without speaking every bit of syntax between each other.

In best cases as a driver you end up producing high quality on the first pass, because you know that your partner will immediately catch anything that doesn't look right. You also go fast because you can sometimes skim over complexities letting your partner think ahead and share that context load.

I'll leave readers to find all the caveats here

Edit: I should probably mention why I think a chat interface for AI doesn't work like pair programming: as much as it may fake it, the AI isn't learning anything while you're chatting with it. It's pointless to argue your case or discuss architectural approaches. An approach that yields better results with chat AI is to just edit/expand your original prompt. It also feels less like a waste of time.

With Pair programming, you may chat upfront, but you won't reach that shared understanding until you start trying to implement something. For now Chat AI has no shared understanding, just "what I asked you to do" thing, and that's not good enough.

frocodillo · a year ago
I would argue that is a feature of pair programming, not a bug. By forcing you to use the slower I/O parts of your brain (and that of your partner) the process becomes more deliberate, allowing you to catch edge cases, bad design patterns, and would-be bugs before even putting pen to paper so to speak. Not to mention that it immediately reduces the bus factor by having two people with a good understanding of the code.

I’m not saying pair programming is a silver bullet, and I tend to agree that working on your own can be vastly more efficient. I do however think that it’s a very useful tool for critical functionality and hard problems and shouldn’t be dismissed.

hmcdona1 · a year ago
This is going to sound out of left field, but I would venture to guess you have very high spatial reasoning skills. I operate much the same way and only recently connected the dots that that skill might be what my brain leans on so heavily while programming and debugging.

Pair programming is endlessly frustrating beyond just rubber duckying because I’m having to exit my mental model, communicate it to someone else, and then translate and relate their inputs back into my mental model which is not exactly rooted in language in my head.

bobbiechen · a year ago
I agree, chat is only useful in scenarios that are 1) poorly defined, and 2) require a back-and-forth feedback loop. And even then, there might be better UX options.

I wrote about this here: https://digitalseams.com/blog/the-ideal-ai-interface-is-prob...

throwup238 · a year ago
At the same time, putting your ideas to words forces you to make them concrete instead of nebulous brain waves. I find that the chat interface gets rid of the downsides of pair programming (that the other person is a human being with their own agency*) while maintaining the “intelligent” pair programmer aspect.

Especially with the new r1 thinking output, I find it useful to iterate on the initial prompt as a way to make my ideas more concrete as much as iterating through the chat interface which is more hit and miss due to context length limits.

* I don’t mean that in a negative way, but in a “I can’t expect another person to respond to me instantly at 10 words per second” way.

cjonas · a year ago
I find its exactly the opposite. With AI chat, I can define signatures, write technical requirements and validate my approach in minutes. I'm not talking with the AI like I would a human... I'm writing a blend of stubs and concise requirements, providing documentation, reviewing, validating and repeating. When it goes in the wrong direction, I add additional details and regenerate from scratch. I focus on small, composable chunks of functionality and then tie it all together at the end.
alickz · a year ago
in my experience, if you can't explain something to someone else then you don't fully understand it

our brains like to jump over inconsistencies or small gaps in our logic when working by themselves, but try to explain that same concept to someone else and those inconsistencies and gaps become glaringly obvious (doubly so if the other person starts asking questions you never considered)

it's why pair programming and rubber duck debugging work at all, at least in my opinion

knighthack · a year ago
I mostly agree with you, but I have to point out something to the contrary of this part you said: "...The minute you have to discuss those things with someone else, your bandwidth decreases by orders of magnitude and now you have to put words to these things and describe them, and physically type them in or vocalize them."

Subvocalization/explicit vocalization of what you're doing actually improves your understanding of the code. Doing so may 'decrease bandwidth', but it improves comprehension, because it's basically inline rubber duck debugging.

It's actually easy to write code which you don't understand and cannot explain, whether at the syntax, logic or application level. I think the analogue is to writing well; anyone can write streams of consciousness amounting to word-salad garbage. But a good writer can cut things down and explain why every single thing was chosen, right down to the punctuation. This feature of writing should be even more apparent with code.

I've coded tons of things where I can get the code working in a mediocre fashion, and yet find great difficulty in trying to verbally explain what I'm doing.

In contrast there's been code where I've been able to explain each step of what I'm doing before I even write anything; in those situations what generally comes out tends to be superior maintainable code, and readable too.

nick238 · a year ago
Someone else (future you being a distinct person) will also need to grok what's going on when they maintain the code later. Living purely in a high-dimensional trans-enlightenment state and coding that way means you may as well be building a half-assed organic neural network to do your task, rather than something better "designed".

Neural networks and evolved structures and pathways (e.g. humans make do with ~20k genes and about that many more in regulatory sequences) are absolutely more efficient, but good luck debugging them.

godelski · a year ago

  > I focus on the high-level code, and let the model focus on the lower level code.
Tbh the reason I don't use LLM assistants is because they suck at the "low level". They are okay at mid level and better at high level. I find its actual coding very mediocre and fraught with errors.

I've yet to see any model understand nuance or detail.

This is especially apparent in image models. Sure, they can do hands now, but they still don't get 3D space or temporal movement. They're great for scrolling through Twitter, but the longer you look the more surreal they get. This even includes the new ByteDance model also on the front page. Coding models similarly ignore the context of the codebase, and the results feel like patchwork. They feel like what you'd be annoyed at a junior dev for writing: not only do you have to go through 10 PRs to make the code pass the test cases, but the lack of context builds a lot of tech debt. They'll write unit tests that technically work but don't capture the actual issues, and that could usually be highly condensed while having greater coverage. It all feels very gluey, like copy-pasting from Stack Overflow while hyper-focused on the immediate outcome instead of understanding the goal. The models are too "solution" oriented: they don't understand the underlying heuristics, and they're more frustrating than the human equivalent who says something "works" as evidenced by the output. That's like trying to say a math proof is correct by looking at just the last line.

Ironically, I think in part this is why chat interface sucks too. A lot of our job is to do a lot of inference in figuring out what our managers are even asking us to make. And you can't even know the answer until you're part way in.

yarekt · a year ago
> A lot of our job is to do a lot of inference in figuring out what our managers are even asking us to make

This is why I think LLMs can't really replace developers. 80% of my job is already trying to figure out what's actually needed, despite being given lots of text detail, maybe even spec, or prototype code.

Building the wrong thing fast is about as useful as not building anything at all. (And before someone says "at least you now know what not to do"? For any problem there are infinite number of wrong solutions, but only a handful of ones that yield success, why waste time trying all the wrong ones?)

lucasmullens · a year ago
> But with coding models they ignore context of the codebase and the results feel more like patchwork.

Have you tried Cursor? It has a great feature that grabs context from the codebase, I use it all the time.

wiremine · a year ago
> Tbh the reason I don't use LLM assistants is because they suck at the "low level". They are okay at mid level and better at high level. I find its actual coding very mediocre and fraught with errors.

That's interesting. I found assistants like Copilot fairly good at low level code, assuming you direct it well.

rpastuszak · a year ago
I've changed my mind on that as well. I think that, generally, chat UIs are lazy and not very user friendly. However, when coding I keep switching between two modes:

1. I need a smart autocomplete that can work backwards and mimic my coding patterns

2. I need a pair programming buddy (of sorts, this metaphor doesn't completely work, but I don't have a better one)

Pair development, even a butchered version of the so-called "strong style" (give the driver the highest level of abstraction they can use/understand), works quite well for me. But the main reason this works is that it forces me to structure my thinking a little bit and allows me to iterate on the definition of the problem. Toss away the sketch with bigger parts of the problem, start again.

It also helps me avoid yak shaving and getting lost in the details or distracted, because the feedback loop between the idea and seeing something working on the screen is so short (even if the code is crap).

I'd also add a 5th: use prompts to generate (boring) prompts. For instance, I needed a simple #tag formatter for one of my markdown sites. I am aware that there's a not-so-small list of edge cases I'd need to cover. In this case I'd write a prompt with a list of basic requirements and ask the LLM to: a) extend it with good practice and common edge cases, b) format it as a spec with concrete input/output examples. This works a bit like the point you made about generating unit tests (I do that too, in tandem with this approach).
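
For a sense of what a spec like that tends to produce, here's a minimal sketch of such a #tag formatter. The function name and the specific edge cases (code spans, markdown headings, case folding, duplicates) are my assumptions for illustration, not the commenter's actual spec:

```javascript
// Hypothetical #tag extractor for markdown. Edge cases a spec-style
// prompt typically surfaces: tags inside code spans, "#" used as a
// heading marker, trailing punctuation, and duplicate tags.
function extractTags(text) {
  const tags = new Set();
  const withoutCode = text.replace(/`[^`]*`/g, ""); // ignore `#notatag`
  for (const line of withoutCode.split("\n")) {
    if (/^#{1,6}\s/.test(line)) continue; // markdown heading, not a tag
    for (const m of line.matchAll(/(?:^|\s)#([a-z0-9][\w-]*)/gi)) {
      tags.add(m[1].toLowerCase());
    }
  }
  return [...tags];
}

console.log(extractTags("# Notes\nlearning #WebDev via `#fake` and #webdev"));
// -> ["webdev"]
```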

In a sense, 1) is autocomplete and 2) is a scaffolding tool.

yarekt · a year ago
Oh yea, point 1 for sure. I call Copilot regex on steroids.

Example:

- copy-paste a table from a PDF datasheet into a comment (it'll be badly formatted with newlines and whatnot, doesn't matter)
- show it how to do the first line
- autocomplete the rest of the table
- check every row to make sure it didn't invent fields/types
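
Concretely, that workflow might look like this; the datasheet fields are made up for illustration:

```javascript
// Paste the badly formatted table from the PDF into a comment:
//
//   Name   Addr Type Description
//   STATUS 0x00 u8   Device status flags
//   CONFIG 0x01 u16  Configuration register
//   TEMP   0x02 i16  Temperature, 0.1 C steps
//
// Hand-write the first row, let the autocomplete fill in the rest,
// then check every row against the comment:
const registers = [
  { name: "STATUS", addr: 0x00, type: "u8",  desc: "Device status flags" },
  { name: "CONFIG", addr: 0x01, type: "u16", desc: "Configuration register" },
  { name: "TEMP",   addr: 0x02, type: "i16", desc: "Temperature, 0.1 C steps" },
];
```

The final checking pass matters: the model is pattern-matching the rows, not reading the PDF, so it can silently invent a field or a type.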

For this type of workflow the tools are a real time saver. I've yet to see any results for the other workflows. They usually just frustrate me: either they start to suggest nonsense code without full understanding, or it's far too easy to bias the results and get them stuck in a pattern of thinking.

ryandrake · a year ago
> I've changed my mind on that as well. I think that, generally, chat UIs are lazy and not very user friendly. However, when coding I keep switching between two modes:

> 1. I need a smart autocomplete that can work backwards and mimic my coding patterns

> 2. I need a pair programming buddy (of sorts, this metaphor doesn't completely work, but I don't have a better one)

Thanks! This is the first time I've seen it put this clearly. When I first tried out CoPilot, I was unsure of how I was "supposed" to interact with it. Is it (as you put it) a smarter autocomplete, or a programming buddy? Is it both? What was the right input method to use?

After a while, I realized that for my personal style I would pretty much entirely use method 1, and never method 2. But, others might really need that "programming buddy" and use that interface instead.

echelon · a year ago
I work on GenAI in the media domain, and I think this will hold true with other fields as well:

- Text prompts and chat interfaces are great for coarse grained exploration. You can get a rough start that you can refine. "Knight standing in a desert, rusted suit of armor" gets you started, but you'll want to take it much further.

- Precision inputs (mouse or structure guided) are best for fine tuning the result and honing in on the solution itself. You can individually plant the cacti and pose the character. You can't get there with text.

dataviz1000 · a year ago
I agree with you.

Yesterday, I asked o3-mini to "optimize" a block of code. It produced very clean, functional TypeScript. However, because the code is reducing stock option chains, I then asked o3-mini to "optimize for speed." In the JavaScript world, this is usually done with for loops, and it even considered aspects like array memory allocation.

This shows that using the right qualifiers is important for getting the results you want. Today, I use both "optimize for developer experience" and "optimize for speed" when they are appropriate.
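
The difference between the two qualifiers might look roughly like this; the option-chain data and function names are invented for illustration, not the commenter's actual code:

```javascript
const chain = [
  { strike: 100, openInterest: 1200 },
  { strike: 105, openInterest: 800 },
  { strike: 110, openInterest: 1500 },
];

// "Optimize for developer experience": clean and declarative.
const totalDX = chain.reduce((sum, opt) => sum + opt.openInterest, 0);

// "Optimize for speed": an imperative loop that avoids callback and
// iterator overhead on hot paths over large chains.
function totalOpenInterest(options) {
  let sum = 0;
  for (let i = 0; i < options.length; i++) {
    sum += options[i].openInterest;
  }
  return sum;
}

console.log(totalDX, totalOpenInterest(chain)); // same result either way
```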

Although declarative code is just an abstraction, moving from imperative jQuery to declarative React was a major change in my coding experience. My work went from telling the system how to do something to simply telling it what to do. Of course, in React—especially at first—I had to explain how to do things, but only once to create a component. After that, I could just tell the system what to do. Now, I can simply declare the desired outcome, the what. It helps to understand how things work, but that level of detail is becoming less necessary.

javier2 · a year ago
Nah, a Chat is terrible for development. In my tears of working, I have only had the chance to start a new codebase 3-4 times. 90% of the time is spent modifying large existing systems, constantly changing them. The chat interface is terrible for this. It would be much better if it were more integrated with the codebase and editor.
pc86 · a year ago
Cursor does all of this, and agent chats let you describe a new feature or an existing bug and it will search the entire codebase and add relevant code to its context automatically. You can optionally attach files for the context - code files that you want to add to the context up front, documentation for third-party calls, whatever you want.

As a side note, "No, you're wrong" is not a great way to have a conversation.

zahlman · a year ago
>In my tears of working

Sometimes typos are eerily appropriate ;)

(I almost typed "errily"...)

rafaelmn · a year ago
This only works for small self-contained problems with narrow scope/context.

Chat sucks for pulling in context, and the only worse thing I've tried is the IDE integrations that supposedly pull the relevant context for you (and I've tried quite a few recently).

I don't know if naive fine-tuning on a codebase would work, but I suspect there are going to be tools that let you train the AI on your code, in the sense that it has some references in-model and knows what you want your project code/structure to look like (which is often quite different from what it looks like in most areas).

jacob019 · a year ago
Totally agree. Chat is a fantastic interface because it stays out of my way. For me it's much more than a coding assistant. I get live examples of how to use tools, and help with boilerplate, which is a time saver and an improvement over legacy workflows, but the real benefit is all the spitballing I can do with it to refine ideas and logic, and the help getting up to speed on tooling way outside of my domain.

I spent about 3.5 hours chatting with o1 about RL architecture to solve some business problems. Now I have a crystal clear plan and the confidence to move forward in an optimal way. I feel a little weird now, like I was just talking to myself for a few hours, but it totally helped me work through the planning.

For actual code, I find myself being a bit less interactive with LLMs as time goes on; sometimes it's easier to just write the logic the way I want rather than trying to explain how I want it. But the ability to retrieve code samples for anything with ease is like a superpower. Not to mention all the cool stuff LLMs can do at runtime via API. Yeah, chat is great, and I'll stick with writing code in Vim and pasting as needed.
gamedever · a year ago
What did you create? In my field, so far, I've found the chat bots not doing so well. My guess is that the more your project resembles something other people make often, the more likely the bot will help.

Even then, though, I asked o1-cursor to start a React app. It failed, mostly because it's out of date. Its instructions were for React from two versions ago.

This seems like an issue. If the statistically most likely answer is old, that's not helpful.

wiremine · a year ago
The most recent one was a typescript project focused on zod.

I might be reading into your comment, but I agree "top-down" development sucks: "Give me a react that does X". I've had much more success going bottom-up.

And I've often seen models getting confused on versions. You need to be explicit, and even then they forget.

knes · a year ago
IMHO, I would agree with you.

I think chat is a nice intermediary evolution between the CLI (that we use every day) and whatever comes next.

I work at Augment (https://augmentcode.com), which, surprise surprise, is an AI coding assistant. We think about the new modality required to interact with code and AI on a daily basis.

Besides increased productivity (and happiness, since you don't have to do mundane tasks like tests, documentation, etc.), I personally believe that what AI can open up is actually more of a way for non-coders (think PMs) to interact with a codebase. AI is really good at converting specs, user stories, and so on into tasks, which today still need to be implemented by software engineers (with the help of AI for the more tedious work). Think of what Figma did between designers and developers, but applied to coding.

What’s the actual "new UI/UX paradigm"? I don’t know yet. But like with Figma, I believe there’s a happy ending waiting for everyone.

bboygravity · a year ago
Interesting to see the narrative on here slowly change from "LLMs will forever be useless for programming" to "I'm using it every day" over the course of the past year or so.

I'm now bracing for the "oh sht, we're all out of a job next year" narrative.

RHSeeger · a year ago
I think a lot of people have always thought of it as a tool that can help.

I don't want an LLM to generate "the answer" for me in a lot of places, but I do think it's amazing for helping me gather information (and cite where that information came from) and pointers in directions to look. A search engine that generates a concrete answer via LLM is (mostly) useless to me. One that gives me an answer and then links to the facts it used to generate that answer is _very_ useful.

It's the same way with programming. It's great at helping you find what you need. But it needs to be in a way that you can verify it's right, or take its answer and adjust it to what you actually need (based on the context it provides).

wiremine · a year ago
> "oh sht, we're all out of a job next year"

Maybe. My sense is we'd need to see 3 to 4 orders of magnitude improvement on the current models before we can replace people outright.

I do think we'll see a huge productivity boost per developer over the next few years. Some companies will use that to increase their throughput, and some will use it to reduce overhead.

Syzygies · a year ago
An environment such as Cursor supports many approaches for working with AI. "Chat" would be the instructions printed on the bottom, perhaps how their developers use it, but far from the only mode it actually supports.

It is helpful to frame this in the historical arc described by Yuval Harari in his recent book "Nexus" on the evolution of information systems. We're at the dawn of history for how to work with AI, and actively visualizing the future has an immediate ROI.

"Chat" is cave man oral tradition. It is like attempting a complex Ruby project through the periscope of an `irb` session. One needs to use an IDE to manage a complex code base. We all know this, but we haven't connected the dots that we need to approach prompt management the same way.

Flip ahead in Harari's book, and he describes rabbis writing texts on how to interpret [texts on how to interpret]* holy scriptures. Like Christopher Nolan's movie "Inception" (his second most relevant work after "Memento"), I've found myself several dreams deep collaborating with AI to develop prompts for [collaborating with AI to develop prompts for]* writing code together. Test the whole setup on multiple fresh AI sessions, as if one is running a business school laboratory on managerial genius, till AI can write correct code in one shot.

Duh? Good managers already understand this, working with teams of people. Technical climbers work cliffs this way. And AI was a blithering idiot until we understood how to simulate recursion in multilayer neural nets.

AI is a Rorschach inkblot test. Talk to it like a kindergartner, and you see the intelligence of a kindergartner. Use your most talented programmer to collaborate with you in preparing precise and complete specifications for your team, and you see a talented team of mature professionals.

We all experience degradation of long AI sessions. This is not inevitable; "life extension" needs to be tackled as a research problem. Just as old people get senile, AI fumbles its own context management over time. Civilization has advanced by developing technologies for passing knowledge forward. We need to engineer similar technologies for providing persistent memory to make each successive AI session smarter than the last. Authoring this knowledge helps each session to survive longer. If we fail to see this, we're condemning ourselves to stay cave men.

Compare the history of computing. There was a lot of philosophy and abstract mathematics about the potential for mechanical computation, but our worldview exploded when we could actually plug the machines in. We're at the same inflection point for theories of mind, semantic compression, structured memory. Indeed, philosophy was an untestable intellectual exercise before; now we can plug it in.

How do I know this? I'm just an old mathematician, in my first month trying to learn AI for one final burst of productivity before my father's dementia arrives. I don't have time to wait for anyone's version of these visions, so I computed them.

In mathematics, the line in the sand between theory and computation keeps moving. Indeed, I helped move it by computerizing my field when I was young. Mathematicians still contribute theory, and the computations help.

A similar line in the sand is moving, between visionary creativity and computation. LLMs are association engines of staggering scope, and what some call "hallucinations" can be harnessed to generalize from all human endeavors to project future best practices. Like how to best work with AI.

I've tested everything I say here, and it works.

Deleted Comment

larodi · a year ago
I would actually join you, as my longstanding view on coding is that it is best done in pairs. Sadly, humans (and programmers in particular) are not so ready to work arm in arm, and it is even more depressing that it now turns out AI is pairing with us.

Perhaps there's gonna be a post-AI programming movement where people actually stare at the same monitor and discuss while one of them is coding.

As a sidenote: we've done experiments with FOBsters, and when paired this way, they multiply their output. There's something about the psychology of groups and how one can only provide maximum output when teaming.

Even for solo activities, and non-IT activities, such as skiing/snowboard, it is better to have a partner to ride with you and discuss the terrain.

shmoogy · a year ago
Have you tried Cursor? I really like selecting context, then cmd+L to make a chat with it: explain the requirement, hit apply, validate the diff.

Works amazingly well for a lot of what I've been working on the past month or two.

gnatolf · a year ago
I haven't tried cursor yet, but how is this different from the copilot plugin in vscode? Sounds pretty similar.
renegat0x0 · a year ago
One thing I would keep in mind: there are some parts of the project that you really cannot fill with chat output.

I had a crucial area with threads. Code generated by chat seemed to be OK, but it had one flaw. My initial code, written manually, was bug-free; the chat-generated output was not, and it was difficult to catch via inspection.

bongodongobob · a year ago
To add to that, I always add some kind of debug function wrapper so I can hand off the state of variables and program flow to the LLM when I need to debug something. Sometimes it's really hard to explain exactly what went wrong so being able to give it a chunk of the program state is more descriptive.
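
A minimal sketch of that kind of wrapper (all names here are illustrative, not from the commenter's code):

```javascript
// Hypothetical debug helper: record labeled snapshots of program state,
// then dump them as JSON to paste into the LLM chat.
const debugLog = [];

function snapshot(label, state) {
  // deep-copy so later mutations don't rewrite history
  debugLog.push({ label, state: JSON.parse(JSON.stringify(state)) });
}

function dumpForLLM() {
  return JSON.stringify(debugLog, null, 2);
}

// Usage: sprinkle snapshots around the suspect code path...
let cart = { items: [], total: 0 };
snapshot("before add", cart);
cart.items.push({ sku: "A1", price: 9.99 });
cart.total += 9.99;
snapshot("after add", cart);

// ...then paste the dumpForLLM() output into the chat with your question.
```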
throwup238 · a year ago
I do the same for my Qt desktop app. I've got an "Inspector" singleton that allows me to select a component tree via click, similar to browser devtools. It takes a screenshot, dumps the QML source, and serializes the state of the components into the clipboard.

I paste that into Claude and it is surprisingly good at fixing bugs and making visual modifications.

ic4l · a year ago
For me, the o models consistently make more mistakes than Claude 3.5 Sonnet.
pc86 · a year ago
Same for me. I wonder if Claude is better at some languages than others, and o models are better at those weaker languages. There are some devs I know who insist Claude is garbage for coding and o3-* or o4-* are tier 1.
bandushrew · a year ago
Producing 200 lines of usable code an hour is genuinely impressive.

My experiments have been nowhere near that successful.

I would love, love, love to see a transcript of how that process worked over an hour, if that was something you were willing to share.

protocolture · a year ago
100%.

I do all this + rubber ducky the hell out of it.

Sometimes I just discuss concepts of the project with the thing and it helps me think.

I dont think chat is going to be right for everyone but it absolutely works for me.

nonrandomstring · a year ago
> it's actually a good UI

Came to vote good too. I mean, why do we all love a nice REPL? That's chat, right? Chat with an interpreter.

AutistiCoder · a year ago
ChatGPT itself is great for coding.

GitHub Copilot is...not. It doesn't seem to understand how to help me as well as ChatGPT does.

sdesol · a year ago
> 1. I find a "pair programming" mentality is key. I focus on the high-level code, and let the model focus on the lower level code. I code review all the code, and provide feedback. Blindly accepting the code is a terrible approach.

This is what I've found to be key. If I start a new feature, I will work with the LLM to do the following:

- Create problem and solution statement

- Create requirements and user stories

- Create architecture

- Create skeleton code. This is critical since it lets me understand what it wants to do.

- Generate a summary of the skeleton code

Once I have done the above, I will have the LLM generate a reusable prompt that I can use to start LLM conversations with. Below is an example of how I turn everything into a reusable prompt.

https://beta.gitsense.com/?chat=b96ce9e0-da19-45e8-bfec-a3ec...

As I make changes like add new files, I will need to generate a new prompt but it is worth the effort. And you can see it in action here.

https://beta.gitsense.com/?chat=b8c4b221-55e5-4ed6-860e-12f0...

The first message is the reusable prompt message. With the first message in place, I can describe the problem or requirements and ask the LLM what files it will need to better understand how to implement things.

What I am currently doing highlights how I think LLMs are a game changer. VCs are going for moonshots instead of home runs. The ability to gather requirements and talk through a solution before even coding is how I think LLMs will revolutionize things. It is great that they can produce usable code, but what I've found invaluable is that they help you organize your thoughts.

In the last link, I am having a conversation with both DeepSeek v3 and Sonnet 3.5, and the LLMs legitimately saved me hours of work without my writing a single line of code. In the past, I would have just implemented the feature and been done with it, and then I would have had to fix something if I didn't think of an edge case. With LLMs, it literally takes minutes to develop a plan that is extremely well documented and can be shared with others.

This ability to generate design documents is how I think LLMs will ultimately be used. The bonus is producing code, but the reality is that documentation (which can be tedious and frustrating) is a requirement for software development. In my opinion, this is where LLMs will forever change things.

Deleted Comment

zahlman · a year ago
LoC per hour seems to me like a terrible metric.
esafak · a year ago
Why? Since you are vetting the code it generates, the rate at which you end up with code you accept seems like a good measure of productivity.
ikety · a year ago
do you use pair programming tools like aider?
ls_stats · a year ago
> it's actually a good UI

> I just finished a small project

> around 2200 lines

why the top comments on HN are always people who have not read the article

pc86 · a year ago
It's not clear to me in the lines you're quoting that the GP didn't read the article.

Deleted Comment

taeric · a year ago
I'm growing to the idea that chat is a bad UI pattern, period. It is a great record of correspondence, I think. But it is a terrible UI for doing anything.

By and large, I assert this is because the best way to do something is to do that thing. There can be correspondence around the thing, but the artifacts that you are building are separate things.

You could probably take this further and say that narrative is a terrible way to build things. It can be a great way to communicate them, but being a separate entity, it is not necessarily good at making any artifacts.

zamfi · a year ago
With apologies to Bill Buxton: "Every interface is best at something and worst at something else."

Chat is a great UI pattern for ephemeral conversation. It's why we get on the phone or on DM to talk with people while collaborating on documents, and don't just sit there making isolated edits to some Google Doc.

It's great because it can go all over the place and the humans get to decide which part of that conversation is meaningful and which isn't, and then put that in the document.

It's also obviously not enough: you still need documents!

But this isn't an "either-or" case. It's a "both" case.

packetlost · a year ago
I even think it's bad for generalized communication (ie. Slack/Teams/Discord/etc.) that isn't completely throwaway. Email is better in every single way for anything that might ever be relevant to review again or be filtered due to too much going on.
goosejuice · a year ago
I've had the opposite experience.

I have never had any issue finding information in slack with history going back nearly a decade. The only issue I have with Slack is a people problem where most communication is siloed in private channels and DMs.

Email threads are incredibly hard to follow though. The UX is rough and it shows.

taeric · a year ago
Anything that needs to be filtered for viewing again pretty much needs version control. Email largely fails at that, just as hard as other correspondence systems do. That said, we have common workflows that use email to build reviewed artifacts.

People love complaining about the email workflow of git, but it is demonstrably better than any chat program for what it is doing.

SoftTalker · a year ago
Yes, agree. Chatting with a computer has all the worst attributes of talking to a person, without any of the intuitive understanding, nonverbal cues, even tone of voice, that all add meaning when two human beings talk to each other.
TeMPOraL · a year ago
That comment made sense 3 years ago. LLMs already solved "intuitive understanding", and the realtime multimodal variants (e.g. the thing behind "Advanced Voice" in the ChatGPT app) handle tone of voice in both directions. As for nonverbal cues, I don't know yet - I got live video enabled in ChatGPT only a few days ago and didn't have time to test it, but I would be surprised if it couldn't read the basics of body language at this point.

Talking to a computer still sucks as a user interface - not because a computer can't communicate on multiple channels the way people do, as it can do that now too. It sucks for the same reason talking to people sucks as a user interface - because the kinds of tasks we use computers for (the ones that aren't just talking with/to/at other people via electronic means) are better handled by doing than by talking about them. We need an interface to operate a tool, not an interface to an agent that operates a tool for us.

As an example, consider driving (as in, realtime control - not just "getting from point A to B"): a chat interface to driving would suck just as badly as being a backseat driver sucks for both people in the car. In contrast, a steering wheel, instead of being a bandwidth-limiting indirection, is an anti-indirection - not only does it let you control the machine with your body, the control is direct enough that over time your brain learns to abstract it away, and the car becomes an extension of your body. We need more tangible interfaces like that with computers.

The steering wheel case, of course, would fail with "AI-level smarts" - but that still doesn't mean we should embrace talking to computers. A good analogy is dance - it's an interaction between two independently smart agents exploring an activity together, and as they do it enough, it becomes fluid.

So dance, IMO, is the steering wheel analogy for AI-powered interfaces, and that is the space we need to explore more.

aylmao · a year ago
I would also call it having all the worst attributes of a CLI, without the succinctness, OS integration, and program composability of one.
hakfoo · a year ago
The idea of chat interfaces always seemed to be to disguise available functionality.

It's a CLI without the integrity. When you bought a 386, it came with a big book that said "MS-DOS 4.01" and enumerated the 75 commands you can type at the C:\> prompt and actually make something useful happen.

When you argue with ChatGPT, its whole business is to not tell you what those 75 commands are. Maybe your prompt fits its core competency and you'll get exactly what you wanted. Maybe it's hammering what you said into a shape it can parse and producing marginal garbage. Maybe it's going to hallucinate from nothing. But it's going to hide that behind a bunch of cute language and hopefully you'll just keep pulling the gacha and blaming yourself if it's not right.

taeric · a year ago
Yeah, this is something I didn't make clear on my post. Chat between people is the same bad UI. People read in the aggression that they bring to their reading. And get mad at people who are legit trying to understand something.

You have some of the same problems with email, of course. Losing threading, in particular, made things worse. It was a "chatification of email" that caused people to lean in to email being bad. Amusing that we are now seeing chat applications rise to replace email.

Suppafly · a year ago
I like the idea of having a chat program, the issue is that it's horrible to have a bunch of chat programs all integrated into every application you use that are separate and incompatible with each other.

I really don't like the idea of chatting with an AI though. There are better ways to interface with AIs and the focus on chat is making people forget that.

tux1968 · a year ago
We need an LSP like protocol for AI, so that we can amortize the configuration over every place we want such an integration. AISP?
freedomben · a year ago
Midjourney is an interesting case study in this I think, building their product UI as a discord bot. It was interesting to be sure, but I always felt like I was fighting the "interface" to get things done. It certainly wasn't all bad, and I think if I used it more it might even be great, but as someone who doesn't use Discord other than that and only rarely generated images, I had to read the docs every time I wanted to generate an image, which is a ridiculous amount of friction.
joe_guy · a year ago
Midjourney recently added a fairly large UI directly inside Discord, with the option of using it instead of the text input.

As is often the case in these sorts of things, your mileage may vary for the more complex settings.

ijk · a year ago
I'm curious if you find their new website interface more tractable--there's some inherent friction to the prompting in either case, but I'd like to know if the Discord chat interface can be overcome by using a different interface or if the issue is more intrinsic.
dapperdrake · a year ago
Email threads seem better for documenting and searching correspondence.

The last counter argument I read got buried on Discord or Slack somewhere.

jayd16 · a year ago
Isn't this entirely an implementation detail of Slack and Discord search? What about email makes it more searchable fundamentally? The metadata of both platforms is essentially the same, no?
al_borland · a year ago
I find things get buried just as easily in email. People on my team are constantly resending each other emails, because they can’t find the thread.

This is why, if something is important, I take it out of email and put it into a document people can reference. The latest and correct information from all the decisions in the thread can also be collected in one place, so everyone reading doesn’t have to figure it out. Not to mention side conversations can influence the results, without being explicitly stated in the email thread.

taeric · a year ago
Discord and slack baffle me. I liked them specifically because they were more ephemeral than other options. Which, seems at odds with how people want them to be? Why?
65 · a year ago
Oh, how nice it must be to complain about Slack. Try using Teams and you will never want to complain about Slack again.
chinathrow · a year ago
Voice messages within a chat UI is even worse. I can't search it, I can't listen to it in the same situations I can read a message.

I wish I could block them within all these chat apps.

"Sorry, you can't bother to send voice messages to this person."

taeric · a year ago
Oh dear lord yes. I am always baffled when I hear that some folks send voice memos to people.
beambot · a year ago
As a written form of "stream of consciousness", it seems to have a lot of value to me. It's noisy, inefficient & meandering -- all the things those polished artifacts are not -- but it's also where you can explore new avenues without worrying about succinctness or completeness. It's like the first draft of a manuscript.
taeric · a year ago
Certainly, it can have its use. But I question whether it is stronger than previous generative techniques for creating many things. There have been strong tools where you could, for example, draw a box and say this should be a house. Add N rooms. This room should be a bathroom. Add windows to these rooms. Add required subfloor and plumbing.

Even with game development. Level editors have a good history for being how people actually make games. Some quite good ones, I should add.

For website development, many template based systems worked quite well. People seem hellbent on never acknowledging that form builders of the late 90s did, in fact, work.

Is it a bit nicer that you can do everything through a dialog? I'm sure it is great for people that think that way.

tpmoney · a year ago
I disagree. Chat is a fantastic UI for getting an AI to generate something vague. Specifically I’m thinking of AI image generation. A chat UI is a great interface for iterating on an image and dialing it in over a series of iterations. The key here is that the AI model needs to keep context both of the image generation history and that chat history.

I think this applies to any “fuzzy generation” scenario. It certainly shouldn’t be the only tool, and (at least as it stands today) isn’t good enough to finalize and fine tune the final result, but a series of “a foo with a bar” “slightly less orange” “make the bar a bit more like a fizzbuzz” interactions with a good chat UI can really get a good 80% solution.

But like all new toys, AI and AI chat will be hammered into a few thousand places where it makes no sense until the hype dies down and we come up with rules and guidelines for where it does and doesn’t work

badsectoracula · a year ago
> Specifically I’m thinking of AI image generation

I heavily disagree here, chat - or really text - is a horrible UI for image generation, unless you have almost zero idea of what you want to achieve and you don't really care about the final results.

Typing "make the bar a bit more like a fizzbuzz" in some textbox is awful UX compared to, say, clicking on the "bar" and selecting "fizzbuzz" or drag-and-dropping "fizzbuzz" on the "bar" or really anything that takes advantage of the fact we're interacting with a graphical environment to do work on graphics.

In fact it is a horrible UI for anything, except perhaps chatbots and tasks that have to do with text like grammar correction, altering writing styles, etc.

It is helpful for impressing people (especially people with money) though.

t_mann · a year ago
Ok, but what is a good pattern to leverage AI tools for coding (assuming that they have some value there, which I think most people would agree with now)? I could see two distinct approaches:

- "App builders" that use some combination of drag&drop UI builders, and design docs for architecture, workflows,... and let the UI guess what needs to be built "under the hood" (a little bit in the spirit of where UML class diagrams were meant to take us). This would still require actual programming knowledge to evaluate and fix what the bot has built

- Formal requirement specification that is sufficiently rigorous to be tested against automatically. This might go some way towards removing the requirement to know how to code, but the technical challenge would simply shift to knowing the specification language

taeric · a year ago
I'd challenge if this is specific to coding? If you want to get a result that is largely like a repertoire of examples used in a training set, chat is probably workable? This is true for music. Visual art. Buildings. Anything, really?

But, if you want to start doing "domain specific" edits to the artifacts that are made, you are almost certainly going to want something like the app builders idea. Down thread, I mention how this is a lot like procedural generative techniques for game levels and such. Such that I think I am in agreement with your first bullet?

Similarly, if you want to make music with an instrument, it will be hard to ignore playing with said instrument more directly. I suspect some people can create things using chat as an interface. I just also suspect directly touching the artifacts at play is going to be more powerful.

I think I agree with the point on formal requirements. Not sure how that really applies to chat as an interface? I think it is hoping for a "laws of robotics" style that can have a test to confirm them? Reality could surprise me, but I always viewed that as largely a fiction item.

staplers · a year ago

  Ok, but what is a good pattern to leverage AI tools for coding?
Actual product stakeholders are not likely to spill their magic sauce and give free consultancy.

kiitos · a year ago
I've yet to see any AI/LLM produce code that withstands even basic scrutiny.
lucasyvas · a year ago
Disclaimer: Haven't used the tools a lot yet, just a bit. So if I say something that already exists, forgive me.

TLDR: Targeted edits and prompts / Heads Up Display

It should probably be more like an overlay (and hooked into context menus with suggestions, inline context bubbles when you want more context for a code block) and make use of an IDE problems view. The problems view would have to be enhanced to allow it to add problems that spanned multiple files, however.

Probably like the Rust compiler output style, but on steroids.

There would likely be some chatting required, but it should all be at a particular site in the code and then go into some history bank where you can view every topic you've discussed.

For authoring, I think an interactive drawing might be better, allowing you to click on specific areas and then use shorter phrasing to make an adjustment instead of having an argument in some chat to the left of your screen about specificity of your request.

Multi-point / click with minimal prompt. It should understand based on what I clicked what the context is without me having to explain it.

gagik_co · a year ago
I think “correspondence UX” can be bad UX but there’s nothing inherently wrong with chat UI.

I created the tetr app[1] which is basically “chat UI for everything”. I did that because I used to message myself notes and wanted to expand it to many more things. There’s not much back and forth, usually 1 input and instant output (no AI), still acting like a chat.

I think there’s a lot of intuitiveness with chat UI and it can be a flexible medium for sharing different information in a similar format, minimizing context switching. That’s my philosophy with tetr anyhow.

[1] https://tetr.app/

marcosdumay · a year ago
> It can be a great way to communicate them

It's usually not. Narrative is a famously flawed way to communicate or record the real world.

It's great for generating engagement, though.

sangnoir · a year ago
> Narrative is a famously flawed way to communicate or record the real world.

...and yet with its flaws, it's the most flexible in conveying meaning. A Ted Chiang interview was on the HN frontpage a few days ago; in it, he mentions that humans created multiple precise, unambiguous communication modes, like the equations used in mathematical papers and proofs. But those same papers are not 100% equations; the mathematicians have to fall back to flawed natural language to describe and provide context, because those formal languages capture a smaller range of human thought than natural language does.

This is not to say chat has the best ergonomics for development - it doesn't - but one has to remember that these tools are based on Large Language Models whose one trick is manipulating language. Better ergonomics would likely come from models trained or fine-tuned on AST tokens and diffs. They'd still need to handle natural language (understanding requirements, hints, and variable names, and authoring comments, commits, and/or PRs).

taeric · a year ago
I think fictional narratives that aim to capture inner monologue are famously flawed. I think narrative tours of things can be good. I'm not clear if "narrated tours" are a specific genre, sadly. :(
OJFord · a year ago
I don't know, I'm in Slack all day with colleagues, I quite like having the additional ChatGPT colleague (even better I can be quite rude/terse in my messages with 'them').

Incidentally I think that's also a good model for how much to trust the output - you might have a colleague who knows enough about X to think they can answer your question, but they're not necessarily right, you don't blindly trust it. You take it as a pointer, or try the suggestion (but not surprised if it turns out it doesn't work), etc.

taeric · a year ago
Oh, do not take my comment as a "chat bots shouldn't exist." That is not at all my intent. I just think it is a bad interface for building things that are self contained in the same chat log.
varispeed · a year ago
Talk to the AI chat as you would talk to a junior developer at your company, and tell it to do something you need.

I think it is brilliant. On the other hand, I have caught myself many times writing prompts to colleagues. Although it did make the requirements of what I need so much clearer for them.

Sylamore · a year ago
NC DMV replaced their regular forms with a chat bot and it's horrible. Takes forever to complete tasks that used to take less than a minute because of the fake interaction and fake typing. Just give me a damn form to pay my taxes or request a custom plate.
brobdingnagians · a year ago
Similar thing I've run into lately: chat is horrible for tracking issues and tasks. When people try to use it that way, it becomes absolute chaos after a while.
dartos · a year ago
Preach!

I’ve been saying this since 2018

themanmaran · a year ago
I'm surprised that the article (and comments) haven't mentioned Cursor.

Agreed that copy pasting context in and out of ChatGPT isn't the fastest workflow. But Cursor has been a major speed up in the way I write code. And it's primarily through a chat interface, but with a few QOL hacks that make it way faster:

1. Output gets applied to your file in a git-diff style. So you can approve/deny changes.

2. It (kinda) has context of your codebase so you don't have to specify as much. Though it works best when you explicitly tag files ("Use the utils from @src/utils/currency.ts")

3. Directly inserting terminal logs or type errors into the chat interface is incredibly convenient. Just hover over the error and click the "add to chat"

dartos · a year ago
I think the wildly different experiences we all seem to have with AI code tools speaks to the inconsistency of the tools and our own lack of understanding of what goes into programming.

I’ve only been slowed down with AI tools. I tried for a few months to really use them and they made the easy tasks hard and the hard tasks opaque.

But obviously some people find them helpful.

Makes me wonder if programming approaches differ wildly from developer to developer.

For me, if I have an automated tool writing code, it’s bc I don’t want to think about that code at all.

But since LLMs don’t really act deterministically, I feel the need to double check their output.

That’s very painful for me. At that point I’d rather just write the code once, correctly.

kenjackson · a year ago
I use LLMs several times a day, and I think for me the issue is that verification is typically much faster than learning/writing. For example, I've never spent much time getting good at scripting. Sure, probably a gap I should resolve, but I feel like LLMs do a great job at it. And what I need to script is typically easy to verify, I don't need to spend time learning how to do things like, "move the files of this extension to this folder, but rewrite them so that the name begins with a three digit number based on the date when it was created, with the oldest starting with 001" -- or stuff like that. Sometimes it'll have a little bug, but one that I can debug quickly.

Scripting assistance by itself is worth the price of admission.
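For that file-renumbering example, the script an LLM hands back might look roughly like this (a sketch only; the function name is mine, and it uses modification time as a stand-in for "date when it was created", since true creation time is platform-dependent):

```python
import shutil
from pathlib import Path

def move_and_number(src_dir, dest_dir, ext):
    """Move files with the given extension into dest_dir,
    prefixing each name with 001, 002, ... oldest first."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Sort by modification time so the oldest file gets 001.
    files = sorted(Path(src_dir).glob(f"*{ext}"),
                   key=lambda p: p.stat().st_mtime)
    for i, path in enumerate(files, start=1):
        shutil.move(str(path), dest / f"{i:03d}_{path.name}")
```

Easy to eyeball, easy to verify on a test folder before pointing it at anything important.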

The other thing I've found it good at is giving me an English description of code I didn't write... I'm sure it sometimes hallucinates, but never in a way that has been so wrong that it's been apparent to me.

aprilthird2021 · a year ago
I think it's about what you're working on. It's great for greenfield projects, etc. Terrible for complex projects that plug into a lot of other complex projects (like most of the software those of us not at startups work on day to day)
sangnoir · a year ago
> But since LLMs don’t really act deterministically, I feel the need to double check their output.

I feel the same

> That’s very painful for me. At that point I’d rather just write the code once, correctly.

I use AI tools augmentatively, and it's not painful for me, perhaps slightly inconvenient. But for boiler-plate-heavy code like unit tests or easily verifiable refactors[1], adjusting AI-authored code on a per-commit basis is still faster than me writing all the code.

1. Like switching between unit-test frameworks

lolinder · a year ago
I like Cursor, but I find the chat to be less useful than the super advanced auto complete.

The chat interface is... fine. Certainly better integrated into the editor than GitHub Copilot's, but I've never really seen the need to use it as chat - I ask for a change and then it makes the change. Then I fix what it did wrong and ask for another change. The chat history aspect is meaningless and usually counterproductive, because it's faster for me to fix its mistakes than to keep everything in the chat window while prodding it the last 20% of the way.

tarsinge · a year ago
I was very skeptical of AI-assisted coding until I tried Cursor and experienced the super autocomplete. It is ridiculously productive. For me it's to the point that it makes Vim obsolete, because pressing tab correctly finishes the line or code block 90% of the time. Every developer with an opinion on AI assistance should just try downloading Cursor and start editing a file.
themanmaran · a year ago
Agreed, the autocomplete definitely gets more mileage than the chat. But I frequently use it for terminal commands as well. Especially AWS CLI work.

"how do I check the cors bucket policies on [S3 bucket name]"

fragmede · a year ago
> while prodding it the last 20% of the way.

hint: you don't get paid to get the LLM to output perfect code, you get paid by PRs submitted and landed. Generate the first 80% or whatever with the LLM, and then finish the last 20% that you can write faster than the LLM yourself, by hand.

koito17 · a year ago
I'm not familiar with Cursor, but I've been using Zed with Claude 3.5 Sonnet. For side projects, I have found it extremely useful to provide the entire codebase as context and send concise prompts focusing on a single requirement. Claude handles "junior developer" tasks well when each unit of work is clearly separated.

Zed makes it trivial to attach documentation and terminal output as context. To reduce risk of hallucination, I now prefer working in static, strongly-typed languages and use libraries with detailed documentation, so that I can send documentation of the library alongside the codebase and prompt. This sounds like a lot of work, but all I do is type "/f" or "/t" in Zed. When I know a task only modifies a single file, then I use the "inline assist" feature and review the diffs generated by the LLM.

Additionally, I have found it extremely useful to actually comment a codebase. LLMs are good at unstructured human language, it's what they were originally designed for. You can use them to maintain comments across a codebase, which in turn helps LLMs since they get to see code and design together.

Last weekend, I was able to re-build a mobile app I made a year ago from scratch with a cleaner code base, better UI, and implement new features on top (making the rewrite worth my time). The app in question took me about a week to write by hand last year; the rewrite took exactly 2 days.

---

As a side note: a huge advantage of Zed with locally-hosted models is that one can correct the code emitted by the model and force the model to re-generate its prior response with those corrections. This is probably the "killer feature" of models like qwen2.5-coder:32b. Rather than sending extra prompts and bloating the context, one can just delete all output from where the first mistake was made, correct the mistake, then resume generation.
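The delete-and-resume trick amounts to truncating the model's output at the first mistake, splicing in the correction, and resubmitting that as the completion prefix. The endpoint call itself varies by server (llama.cpp, Ollama, etc.), so here is only the prefix-splicing step, as a hedged sketch with a made-up function name:

```python
def resume_from_correction(model_output: str, mistake: str, correction: str) -> str:
    """Build the prefix to resubmit to a local completion endpoint:
    everything before the first occurrence of the mistake, plus the
    corrected text. Everything after the mistake is deliberately
    dropped - the model regenerates it from the corrected prefix."""
    idx = model_output.find(mistake)
    if idx == -1:
        raise ValueError("mistake not found in model output")
    return model_output[:idx] + correction
```

The key property is that no extra "please fix this" prompt enters the context; the model simply continues as if it had written the correction itself.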

stitched2gethr · a year ago
I think this misses the point. It seems like the author is saying we should move from imperative instructions to a declarative document that describes what the software should do.

Imperative:

- write an HTTP server that serves jokes

- add a healthcheck endpoint

- add TLS and change the serving port to 443

Declarative:

- an HTTP server that serves jokes

- contains a healthcheck endpoint

- supports TLS on port 443

The differences here seem minimal because you can see all of it at once, but in the current chat paradigm you'd have to search through everything you've said to the bot to get the full context, including the side roads that never materialized.

In the document approach you're constantly refining the document. It's better than reviewing the code because (in theory) you're looking at "support TLS on port 443" instead of a lot of code, which means it can be used by a wider audience. And ideally I can give the same high level spec to multiple LLMs and see which makes the best application.
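To make the contrast concrete, here is a minimal sketch of what an LLM might produce from the declarative spec above, using only the Python stdlib. The endpoint paths and joke text are my own assumptions, and "supports TLS on port 443" is omitted since it requires a certificate:

```python
import json
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

JOKES = [
    "Why do programmers prefer dark mode? Because light attracts bugs.",
]

class JokeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/joke":
            self._reply(200, {"joke": random.choice(JOKES)})
        elif self.path == "/healthcheck":
            self._reply(200, {"status": "ok"})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# "supports TLS on port 443" would wrap the server socket via
# ssl.SSLContext.wrap_socket with a certificate; omitted here.
# To run: HTTPServer(("", 8080), JokeHandler).serve_forever()
```

Reviewing three spec bullets is clearly cheaper than reviewing this code, which is the point of the document approach.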

ygouzerh · a year ago
Good explanation! As an open reflection: will a declarative document be as detailed as the imperative version? Often, between the specs the product team provides (which we can consider the "declarative" document) and the implementation, many sub-specs are created by the tech team as they uncover important implementation details. It's like a rabbit hole.

For example, for a signup page, we could have:

- Declarative: sign up the user using their email address

- Imperative: to do the same, we will need to pick an SMTP library, which means discovering that we need an SMTP server, so now we need to choose which one. And when purchasing an SMTP server plan, we discover that there are rate limits, so now we need to add some bot protection to our signup page (IP rate limit only? ReCaptcha? Cloudflare bot protection?), etc.

Which means that at the end, the imperative code way is kind of like the ultimate implementation specs.

skydhash · a year ago
The issue is that there's no execution platform for declarative specs, so something will be translated to imperative, and that is where the issue lies. There's always an imperative core, which either needs to be deterministic or its output needs to be verified. LLMs are not the former, and the latter option can take more time than just writing the code.
patrickaljord · a year ago
Instead of Cursor I would recommend two open source alternatives that you can combine: https://www.continue.dev/ and https://github.com/cline/cline
freeone3000 · a year ago
It's not nearly as slick. Cursor's indexing and integration are significant value-adds.
coder543 · a year ago
I used Continue before Cursor. Cursor’s “agent” composer mode is so much better than what Continue offered. The agent can automatically grep the codebase for relevant files and then read them. It can create entirely new files from scratch. I can still manually provide some files as context, but it’s not usually necessary. With Continue, everything was very manual.

Cursor also does a great job of showing inline diffs of what composer is doing, so you can quickly review every change.

I don’t think there’s any reason Continue can’t match these features, but it hadn’t, last I checked.

Cursor also focuses on sane defaults, which is nice. The tab completion model is very good, and the composer model defaults to Claude 3.5 Sonnet, which is arguably the best non-reasoning code model. (One would hope that Cursor gets agent-composer working with reasoning models soon.) Continue felt much more technical… which is nice for power users, but not always the best starting place.

notShabu · a year ago
chat is the best way to orchestrate and delegate. whether or not this is considered "ME writing MY code" is imo a philosophical debate

e.g. executives treat the org as a blackbox LLM and chat w it to get real results

mkozlows · a year ago
Windsurf is even more so this way -- it'll look through your codebase trying to find the right files to inspect, and it runs the build/test stuff and examines the output to see what went wrong.

I found interacting with it via chat to be super-useful and a great way to get stuff done. Yeah, sometimes you just have to drop into the code, and tag a particular line and say "this isn't going to work, rewrite it to do x" (or rewrite it yourself), but the ability to do that doesn't vitiate the value of the chat.

mholm · a year ago
Yeah, the OP has a great idea, but models as-is can't handle that kind of workflow reliably. The article is both a year behind, and a year ahead at the same time. The user must iterate with the chatbot, and you can't do that by just doing a top down 'here's a list of all features, get going, ping me when finished' prompt. AI is a junior engineer, so you have to treat it like a junior engineer, and that means looking through your chat logs, and perhaps backing up to a restore point and going a different direction.
mttrms · a year ago
I've started using Zed on a side project and I really appreciate that you can easily manipulate the chat / context and continue making requests

https://zed.dev/docs/assistant/assistant-panel#editing-a-con...

It's still a "chat" but it's just text at the end of the day. So you can edit as you see fit to refine your context and get better responses.

croes · a year ago
Natural language isn’t made to be precise; that’s why we use a subset of it in programming languages.

So you either need lots of extra text to remove the ambiguity of natural language if you use AI, or you need a special precise subset to communicate with the AI, and that’s just programming with extra steps.

Klaster_1 · a year ago
A lot of extra text usually means prior requirements, meeting transcripts, screen share recordings, chat history, Jira tickets and so on - the same information developers use to produce a result that satisfies the stakeholders and does the job. This seems like a straightforward direction solvable with more compute and more efficient memory. I think this will be the way it pans out.

Real projects don't require an infinitely detailed specification either, you usually stop where it no longer meaningfully moves you towards the goal.

The whole premise of AI developer automation, IMO, is that if a human can develop a thing, then AI should be able to as well, given the same input.

throwaway290 · a year ago
idk if you think all those jira tickets and meetings are precise enough (IMO sometimes the opposite)

By the way, remind me why you need design meetings in that ideal world?:)

> Real projects don't require an infinitely detailed specification either, you usually stop where it no longer meaningfully moves you towards the goal.

The point was that specifications are not detailed enough in practice. A precise enough specification IS code. And the point is literally that natural language is just not made to be precise enough. So you are back where you started.

So you waste time explaining in detail and rehashing requirements in this imprecise language until you see what code you want to see. Which was faster to just... idk.. type.

cube2222 · a year ago
We are kind of actually there already.

With a 200k token window like Claude has you can already dump a lot of design docs / transcripts / etc. at it.

layer8 · a year ago
This premise in your last paragraph can only work with AGI, and we’re probably not close to that yet.
oxfordmale · a year ago
Yes, let's devise a more precise way to give AI instructions. Let's call it pAIthon. This will allow powers that be, like Zuckerberg, to save face and claim that AI has replaced mid-level developers, and enable developers to rebrand themselves as pAIthon programmers.

Joking aside, this is likely where we will end up, just with a slightly higher-level programming interface, making developers more productive.

dylan604 · a year ago
man, pAIthon was just sitting right there for the taking
pjc50 · a year ago
There was a wave of this previously in programming: https://en.wikipedia.org/wiki/The_Last_One_(software)

All the same buzzwords, including "AI"! In 1981!

empath75 · a year ago
AIs actually are very good at this. They wouldn't be able to write code at all otherwise. If you're careful in your prompting, they'll make fewer assumptions and ask clarifying questions before going ahead and writing code.
9rx · a year ago
> If you're careful in your prompting

In other words, if you replace natural language with a programming language then the computer will do a good job of interpreting your intent. But that's always been true, so...

LordDragonfang · a year ago
> they'll make fewer assumptions and ask clarifying questions before going ahead and writing code.

Which model are you talking about here? Because with ChatGPT, I struggle with getting it to ask any clarifying questions before just dumping code filled with placeholders I don't want, even when I explicitly prompt it to ask for clarification.

oxfordmale · a year ago
AI is very good at this. Unfortunately, humans tend to be super bad at providing detailed verbal instructions.
croes · a year ago
AI is a little bit like Occam's razor: when you say hoofbeats, you get horses. Bad if you need zebras.
foobiekr · a year ago
I don’t think I’ve ever seen an llm in any context ask for clarification. Is that a real thing?
spacemanspiff01 · a year ago
Or a proposal/feedback process. Say you are hired by a non-technical person to build something: you generate requirements and a proposed solution, you propose that solution, and they give feedback.

Having a feedback loop is the only viable way to do this. Sure, the client could give you a book on what they want, but often people do not know their edge cases, or what issues may arise, etc.

dylan604 · a year ago
> and that’s just programming with extra steps.

If you know how to program, then I agree, and that's part of why I don't see the point. If you don't know how to program, then the prompt isn't much different than providing the specs/requirements to a programmer.

kokanee · a year ago
> or you need a special precise subset to communicate with AI

haha, I just imagined sending TypeScript to ChatGPT and having it spit my TypeScript back to me. "See guys, if you just use Turing-complete logically unambiguous input, you get perfect output!"

charlieyu1 · a year ago
I guess we could have an LLM translate natural language to some precise subset, get it processed, then translate the output back to natural language
thomastjeffery · a year ago
Natural language can be precise, but only in context.

The struggle is to provide a context that disambiguates the way you want it to.

LLMs solve this problem by avoiding it entirely: they stay ambiguous, and just give you the most familiar context, letting you change direction with more prompts. It's a cool approach, but it's often not worth the extra steps, and sometimes your context window can't fit enough steps anyway.

My big idea (the Story Empathizer) is to restructure this interaction such that the only work left to the user is to decide which context suits their purpose best. Given enough context instances (I call them backstories), this approach to natural language processing could recursively eliminate much of its own ambiguity, leaving very little work for us to do in the end.

Right now my biggest struggle is figuring out what the foundational backstories will be, and writing them.

skydhash · a year ago
That’s what programming languages are: you define a context, then shorten the notation down to symbols. “The symbol a will refer to a value of type string with content ‘abcd’ and cannot refer to anything else for its lifetime” gets you:

  const a = "abcd"
That is called semantics. Programming is mostly fitting the vagueness inherent to natural languages into the precise context of a programming language.
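To sketch the same idea in TypeScript (a hypothetical illustration, not anything from the thread): the whole English sentence collapses into one declaration that the compiler then enforces for you.

```typescript
// "The symbol a will refer to a value of type string with content 'abcd'
// and cannot refer to anything else for its lifetime" becomes:
const a: string = "abcd";

// The compiler now enforces that context, so the ambiguity is gone:
// a = "efgh";           // error: cannot assign to 'a' because it is a constant
// const n: number = a;  // error: type 'string' is not assignable to type 'number'

console.log(a.length); // prints 4 -- "abcd" has four characters
```

The one line of code carries the entire paragraph of natural-language specification, which is exactly the compression the parent comment is describing.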

65 · a year ago
We're going to create SQL all over again, aren't we?
lelanthran · a year ago
A more modern COBOL maybe.

Deleted Comment

matthewsinclair · a year ago
Yep. 100% agree. The whole “chat as UX” metaphor is a cul-de-sac that I’m sure we’ll back out of sooner or later.

I think about this like SQL in the late 80s. At the time, SQL was the “next big thing” that was going to mean we didn’t need programmers, and that management could “write code”. It didn’t quite work out that way, of course, as we all know.

I see chat-based interfaces to LLMs going exactly the same way. The LLM will move down the stack (rather than up) and much more appropriate task-based UX/UI will be put on top of the LLM, coordinated thru a UX/UI layer that is much more sympathetic to the way users actually want to interact with a machine.

In the same way that no end-users ever touch SQL these days (mostly), we won’t expose the chat-based UX of an LLM to users either.

There will be a place for an ad-hoc natural language interface to a machine, but I suspect it’ll be the exception rather than the rule.

I really don’t think there are too many end users who want to be forced to seduce a mercurial LLM using natural language to do their day-to-day tech tasks.

jug · a year ago
I think a counterpoint to this is that SQL has a specific and well-defined meaning, and it takes effort to get what you actually want right. Communication with an AI, however, can request a specific context or requirements but also be intentionally open-ended where we want to give the AI leeway. The great thing here is that humans _and_ AI now quite clearly understand when a sentence is non-specific and when it carries great importance. So I think it’s hard to come up with a more terse or approachable competitor to the sheer flexibility of language. In a way, I think it’s a similar problem to the one that has had engineers across the world inputting text commands into a terminal screen for about 80 years now.
sangnoir · a year ago
> The whole “chat as UX” metaphor is a cul-de-sac that I’m sure we’ll back out of sooner or later.

Only when someone discovers another paradigm that matches or exceeds the effectiveness of LLMs without being a language model.

amedviediev · a year ago
I actually came to the same conclusion. I am currently working on a side project that's an AI powered writing app for writers, and while I still provide chat because that seems to be the expectation, my goal is to abstract all the AI assistance a writer might need into curated UI options.
daxfohl · a year ago
Or DSLs like cucumber for acceptance tests. Cute for simple things, but for anything realistic, it's more convoluted than convenient.
spolsky · a year ago
I don't think Daniel's point is that Chat is generically a clunky UI and therefore Cursor cannot possibly exist. I think he's saying that to fully specify what a given computer program should do, you have to provide all kinds of details, and human language is too compressed and too sloppy to always include those details. For example, you might say "make a logon screen" but there are an infinite number of ways this could be done and until you answer a lot of questions you may not get what you want.

If you asked me two or three years ago I would have strongly agreed with this theory. I used to point out that every line of code was a decision made by a programmer and that programming languages were just better ways to convey all those decisions than human language because they eliminated ambiguity and were much terser.

I changed my mind when I saw how LLMs work. They tend to fill in the ambiguity with good defaults that are somewhere between "how everybody does it" and "how a reasonably bright junior programmer would do it".

So you say "give me a log on screen" and you get something pretty normal with Username and Password and a decent UI and some decent color choices and it works fine.

If you wanted to provide more details, you could tell it to use the background color #f9f9f9. But part of what surprised me, and caused me to change my mind on this matter, was that you could also leave that out and you wouldn't get an error; you wouldn't get white text on a white background; you would get a decent color that might be #f9f9f9 or might be #a1a1a1, and you saved a lot of time by not thinking about that level of detail while still getting a good result.

zamfi · a year ago
Yeah, and in fact this is about the best-case scenario in many ways: "good defaults" that get you approximately where you want to be, with a way to update when those defaults aren't what you want.

Right now we have a ton of AI/ML/LLM folks working on this first clear challenge: better models that generate better defaults, which is great—but also will never solve the problem 100%, which is the second, less-clear challenge: there will always be times you don't want the defaults, especially as your requests become more and more high-level. It's the MS Word challenge reconstituted in the age of LLMs: everyone wants 20% of what's in Word, but it's not the same 20%. The good defaults are good except for that 20% you want to be non-default.

So there need to be ways to say "I want <this non-default thing>". Sometimes chat is enough for that, like when you can ask for a different background color. But sometimes it's really not! This is especially true when the things you want are not always obvious from limited observations of the program's behavior—where even just finding out that the "good default" isn't what you want can be hard.

Too few people are working on this latter challenge, IMO. (Full disclosure: I am one of them.)

skydhash · a year ago
Which no one really argues about. But writing code was never the main issue in software projects. If you open any book about software engineering, there’s barely any mention of coding. The issue is the process of figuring out what code to write, and where to put it, in a practical and efficient way.

In your example, the issue is not writing the logon screen (you can find several examples on GitHub, and a lot of CSS frameworks have form snippets). The issue is making sure that it works and integrates well with the rest of the project, as well as being easy to maintain.

jakelazaroff · a year ago
I agree with the premise but not with the conclusion. When you're building visual things, you communicate visually: rough sketches, whiteboard diagrams, mockups, notes scrawled in the margins.

Something like tldraw's "make real" [1] is a much better bet, imo (not that it's mutually exclusive). Draw a rough mockup of what you want, let AI fill in the details, then draw and write on it to communicate your changes.

We think multi-modally; why should we limit the creative process to just text?

[1] https://tldraw.substack.com/p/make-real-the-story-so-far

Edmond · a year ago
This is about relying on requirements-type documents to drive AI-based software development. I believe this will ultimately be integrated into all the AI dev tools, if it isn't already. It is really just additional context.

Here is an example of our approach:

https://blog.codesolvent.com/2024/11/building-youtube-video-...

We are also using the requirements to build a checklist, the AI generates the checklist from the requirements document, which then serves as context that can be used for further instructions.

Here's a demo:

https://youtu.be/NjYbhZjj7o8?si=XPhivIZz3fgKFK8B

wongarsu · a year ago
Now we just need another tool that allows stakeholders to write requirement docs using a chat interface