wiremine · a year ago
I'm going to take a contrarian view and say it's actually a good UI, but it's all about how you approach it.

I just finished a small project where I used o3-mini and o3-mini-high to generate most of the code. I averaged around 200 lines of code an hour, including the business logic and unit tests. The total was around 2200 lines. So, not a big project, but not a throwaway script. The code was perfectly fine for what we needed. This is the third time I've done this, and each time I get faster and better at it.

1. I find a "pair programming" mentality is key. I focus on the high-level code, and let the model focus on the lower level code. I code review all the code, and provide feedback. Blindly accepting the code is a terrible approach.

2. Generating unit tests is critical. After I like the gist of some code, I ask for some smoke tests. Again, peer review the code and adjust as needed.

3. Be liberal with starting a new chat: the models can get easily confused with longer context windows. If you start to see things go sideways, start over.

4. Give it code examples. Don't prompt with English only.

FWIW, o3-mini was the best model I've seen so far; Sonnet 3.5 New is a close second.

ryandrake · a year ago
I guess the things I don't like about Chat are the same things I don't like about pair (or team) programming. I've always thought of programming as a solitary activity. You visualize the data structures, algorithms, data paths, calling flow and stack, and so on, in your mind, with very high throughput "discussions" happening entirely in your brain. Your brain is high bandwidth, low latency. Effortlessly and instantly move things around and visualize them. Figure everything out. Finally, when it's correct, you send it to the slow output device (your fingers).

The minute you have to discuss those things with someone else, your bandwidth decreases by orders of magnitude and now you have to put words to these things and describe them, and physically type them in or vocalize them. Then your counterpart has to input them through his eyes and ears, process that, and re-output his thoughts to you. Slow, slow, slow, and prone to error and specificity problems as you translate technical concepts to English and back.

Chat as an interface is similarly slow and imprecise. It has all the shortcomings of discussing your idea with a human and really no upside besides the dictionary-like recall.

yarekt · a year ago
That's such a mechanical way of describing pair programming. I'm guessing you don't do it often (understandable if it's not working for you).

For me pair programming accelerates development to much more than 2x. Over time the two of you figure out how to use each other's strengths, and as both of you immerse yourself in the same context you begin to understand what's needed without speaking every bit of syntax between each other.

In best cases as a driver you end up producing high quality on the first pass, because you know that your partner will immediately catch anything that doesn't look right. You also go fast because you can sometimes skim over complexities letting your partner think ahead and share that context load.

I'll leave readers to find all the caveats here

Edit: I should probably mention why I think a chat interface for AI doesn't work like pair programming: as much as it may fake it, the AI isn't learning anything while you're chatting with it. It's pointless to argue your case or discuss architectural approaches. An approach that yields better results with chat AI is to just edit/expand your original prompt. It also feels less like a waste of time.

With Pair programming, you may chat upfront, but you won't reach that shared understanding until you start trying to implement something. For now Chat AI has no shared understanding, just "what I asked you to do" thing, and that's not good enough.

frocodillo · a year ago
I would argue that is a feature of pair programming, not a bug. By forcing you to use the slower I/O parts of your brain (and that of your partner) the process becomes more deliberate, allowing you to catch edge cases, bad design patterns, and would-be bugs before even putting pen to paper so to speak. Not to mention that it immediately reduces the bus factor by having two people with a good understanding of the code.

I’m not saying pair programming is a silver bullet, and I tend to agree that working on your own can be vastly more efficient. I do however think that it’s a very useful tool for critical functionality and hard problems and shouldn’t be dismissed.

hmcdona1 · a year ago
This is going to sound out of left field, but I would venture to guess you have very high spatial reasoning skills. I operate much the same way and only recently connected the dots that that skill might be what my brain leans on so heavily while programming and debugging.

Pair programming is endlessly frustrating beyond just rubber duckying because I’m having to exit my mental model, communicate it to someone else, and then translate and relate their inputs back into my mental model which is not exactly rooted in language in my head.

bobbiechen · a year ago
I agree, chat is only useful in scenarios that are 1) poorly defined, and 2) require a back-and-forth feedback loop. And even then, there might be better UX options.

I wrote about this here: https://digitalseams.com/blog/the-ideal-ai-interface-is-prob...

throwup238 · a year ago
At the same time, putting your ideas to words forces you to make them concrete instead of nebulous brain waves. I find that the chat interface gets rid of the downsides of pair programming (that the other person is a human being with their own agency*) while maintaining the “intelligent” pair programmer aspect.

Especially with the new r1 thinking output, I find it useful to iterate on the initial prompt as a way to make my ideas more concrete as much as iterating through the chat interface which is more hit and miss due to context length limits.

* I don’t mean that in a negative way, but in a “I can’t expect another person to respond to me instantly at 10 words per second” way.

cjonas · a year ago
I find its exactly the opposite. With AI chat, I can define signatures, write technical requirements and validate my approach in minutes. I'm not talking with the AI like I would a human... I'm writing a blend of stubs and concise requirements, providing documentation, reviewing, validating and repeating. When it goes in the wrong direction, I add additional details and regenerate from scratch. I focus on small, composable chunks of functionality and then tie it all together at the end.
alickz · a year ago
in my experience, if you can't explain something to someone else then you don't fully understand it

our brains like to jump over inconsistencies or small gaps in our logic when working by themselves, but try to explain that same concept to someone else and those inconsistencies and gaps become glaringly obvious (doubly so if the other person starts asking questions you never considered)

it's why pair programming and rubber duck debugging work at all, at least in my opinion

knighthack · a year ago
I mostly agree with you, but I have to point out something to the contrary of this part you said: "...The minute you have to discuss those things with someone else, your bandwidth decreases by orders of magnitude and now you have to put words to these things and describe them, and physically type them in or vocalize them."

Subvocalization/explicit vocalization of what you're doing actually improves your understanding of the code. Doing so may 'decrease bandwidth', but it improves comprehension, because it's basically inline rubber duck debugging.

It's actually easy to write code which you don't understand and cannot explain, whether at the syntax, logic or application level. I think the analogue is to writing well; anyone can write streams of consciousness amounting to word-salad garbage. But a good writer can cut things down and explain why every single thing was chosen, right down to the punctuation. This feature of writing should be even more apparent with code.

I've coded tons of things where I can get the code working in a mediocre fashion, and yet find great difficulty in trying to verbally explain what I'm doing.

In contrast there's been code where I've been able to explain each step of what I'm doing before I even write anything; in those situations what generally comes out tends to be superior maintainable code, and readable too.

nick238 · a year ago
Someone else (future you being a distinct person) will also need to grok what's going on when they maintain the code later. Living purely in a high-dimensional trans-enlightenment state and coding that way means you may as well be building a half-assed organic neural network to do your task, rather than something better "designed".

Neural networks and evolved structures and pathways (e.g. humans make do with ~20k genes and about that many more in regulatory sequences) are absolutely more efficient, but good luck debugging them.

godelski · a year ago

  > I focus on the high-level code, and let the model focus on the lower level code.
Tbh the reason I don't use LLM assistants is because they suck at the "low level". They are okay at mid level and better at high level. I find its actual coding very mediocre and fraught with errors.

I've yet to see any model understand nuance or detail.

This is especially apparent in image models. Sure, they can do hands now, but they still don't get 3D space or temporal movement. They're great for scrolling through Twitter, but the longer you look the more surreal they get. This even includes the new ByteDance model also on the front page. Coding models similarly ignore the context of the codebase, and the results feel like patchwork. They feel like what you'd be annoyed at a junior dev for writing: not only do you have to go through 10 PRs to make the code pass the test cases, but the lack of context builds a lot of tech debt. They'll write unit tests that technically work but don't capture the actual issues, and that could usually be highly condensed while having greater coverage. It all feels very gluey, like copy-pasting from Stack Overflow while hyper-focused on the immediate outcome instead of understanding the goal. The models are too "solution" oriented: they don't understand the underlying heuristics, and they're more frustrating than the human equivalent who says something "works" as evidenced by the output. That's like trying to say a math proof is correct by looking at just the last line.

Ironically, I think in part this is why chat interface sucks too. A lot of our job is to do a lot of inference in figuring out what our managers are even asking us to make. And you can't even know the answer until you're part way in.

yarekt · a year ago
> A lot of our job is to do a lot of inference in figuring out what our managers are even asking us to make

This is why I think LLMs can't really replace developers. 80% of my job is already trying to figure out what's actually needed, despite being given lots of text detail, maybe even spec, or prototype code.

Building the wrong thing fast is about as useful as not building anything at all. (And before someone says "at least you now know what not to do"? For any problem there are infinite number of wrong solutions, but only a handful of ones that yield success, why waste time trying all the wrong ones?)

lucasmullens · a year ago
> But with coding models they ignore context of the codebase and the results feel more like patchwork.

Have you tried Cursor? It has a great feature that grabs context from the codebase, I use it all the time.

wiremine · a year ago
> Tbh the reason I don't use LLM assistants is because they suck at the "low level". They are okay at mid level and better at high level. I find its actual coding very mediocre and fraught with errors.

That's interesting. I found assistants like Copilot fairly good at low level code, assuming you direct it well.

rpastuszak · a year ago
I've changed my mind on that as well. I think that, generally, chat UIs are lazy and not very user friendly. However, when coding I keep switching between two modes:

1. I need a smart autocomplete that can work backwards and mimic my coding patterns

2. I need a pair programming buddy (of sorts, this metaphor doesn't completely work, but I don't have a better one)

Pair development, even a butchered version of the so-called "strong style" (give the driver the highest level of abstraction they can use/understand), works quite well for me. But the main reason this works is that it forces me to structure my thinking a little bit and allows me to iterate on the definition of the problem. Toss away the sketch with bigger parts of the problem, start again.

It also helps me avoid yak shaving and getting lost in the details or distracted, because the feedback loop between the idea and seeing something working on the screen is so short (even if the code is crap).

I'd also add a 5th: use prompts to generate (boring) prompts. For instance, I needed a simple #tag formatter for one of my markdown sites. I am aware that there's a not-so-small list of edge cases I'd need to cover. In this case I'd write a prompt with a list of basic requirements and ask the LLM to: a) extend it with good practice and common edge cases, b) format it as a spec with concrete input/output examples. This works a bit like the point you made about generating unit tests (I do that too, in tandem with this approach).
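
For a sense of what a spec like that tends to produce, here's a minimal sketch of such a #tag formatter. The function name and the specific edge cases (code spans, markdown headings, case folding, duplicates) are my assumptions for illustration, not the commenter's actual spec:

```javascript
// Hypothetical #tag extractor for markdown. Edge cases a spec-style
// prompt typically surfaces: tags inside code spans, "#" used as a
// heading marker, trailing punctuation, and duplicate tags.
function extractTags(text) {
  const tags = new Set();
  const withoutCode = text.replace(/`[^`]*`/g, ""); // ignore `#notatag`
  for (const line of withoutCode.split("\n")) {
    if (/^#{1,6}\s/.test(line)) continue; // markdown heading, not a tag
    for (const m of line.matchAll(/(?:^|\s)#([a-z0-9][\w-]*)/gi)) {
      tags.add(m[1].toLowerCase());
    }
  }
  return [...tags];
}

console.log(extractTags("# Notes\nlearning #WebDev via `#fake` and #webdev"));
// -> ["webdev"]
```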

In a sense, 1) is autocomplete and 2) is a scaffolding tool.

yarekt · a year ago
Oh yea, point 1 for sure. I call Copilot regex on steroids.

Example:

- copy-paste a table from a PDF datasheet into a comment (it'll be badly formatted with newlines and whatnot, doesn't matter)
- show it how to do the first line
- autocomplete the rest of the table
- check every row to make sure it didn't invent fields/types
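
Concretely, that workflow might look like this; the datasheet fields are made up for illustration:

```javascript
// Paste the badly formatted table from the PDF into a comment:
//
//   Name   Addr Type Description
//   STATUS 0x00 u8   Device status flags
//   CONFIG 0x01 u16  Configuration register
//   TEMP   0x02 i16  Temperature, 0.1 C steps
//
// Hand-write the first row, let the autocomplete fill in the rest,
// then check every row against the comment:
const registers = [
  { name: "STATUS", addr: 0x00, type: "u8",  desc: "Device status flags" },
  { name: "CONFIG", addr: 0x01, type: "u16", desc: "Configuration register" },
  { name: "TEMP",   addr: 0x02, type: "i16", desc: "Temperature, 0.1 C steps" },
];
```

The final checking pass matters: the model is pattern-matching the rows, not reading the PDF, so it can silently invent a field or a type.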

For this type of workflow the tools are a real time saver. I've yet to see any results for the other workflows. They usually just frustrate me: either they start to suggest nonsense code without full understanding, or it's far too easy to bias the results and get them stuck in a pattern of thinking.

ryandrake · a year ago
> I've changed my mind on that as well. I think that, generally, chat UIs are lazy and not very user friendly. However, when coding I keep switching between two modes:

> 1. I need a smart autocomplete that can work backwards and mimic my coding patterns

> 2. I need a pair programming buddy (of sorts, this metaphor doesn't completely work, but I don't have a better one)

Thanks! This is the first time I've seen it put this clearly. When I first tried out CoPilot, I was unsure of how I was "supposed" to interact with it. Is it (as you put it) a smarter autocomplete, or a programming buddy? Is it both? What was the right input method to use?

After a while, I realized that for my personal style I would pretty much entirely use method 1, and never method 2. But, others might really need that "programming buddy" and use that interface instead.

echelon · a year ago
I work on GenAI in the media domain, and I think this will hold true with other fields as well:

- Text prompts and chat interfaces are great for coarse grained exploration. You can get a rough start that you can refine. "Knight standing in a desert, rusted suit of armor" gets you started, but you'll want to take it much further.

- Precision inputs (mouse or structure guided) are best for fine tuning the result and honing in on the solution itself. You can individually plant the cacti and pose the character. You can't get there with text.

dataviz1000 · a year ago
I agree with you.

Yesterday, I asked o3-mini to "optimize" a block of code. It produced very clean, functional TypeScript. However, because the code is reducing stock option chains, I then asked o3-mini to "optimize for speed." In the JavaScript world, this is usually done with for loops, and it even considered aspects like array memory allocation.

This shows that using the right qualifiers is important for getting the results you want. Today, I use both "optimize for developer experience" and "optimize for speed" when they are appropriate.
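
The difference between the two qualifiers might look roughly like this; the option-chain data and function names are invented for illustration, not the commenter's actual code:

```javascript
const chain = [
  { strike: 100, openInterest: 1200 },
  { strike: 105, openInterest: 800 },
  { strike: 110, openInterest: 1500 },
];

// "Optimize for developer experience": clean and declarative.
const totalDX = chain.reduce((sum, opt) => sum + opt.openInterest, 0);

// "Optimize for speed": an imperative loop that avoids callback and
// iterator overhead on hot paths over large chains.
function totalOpenInterest(options) {
  let sum = 0;
  for (let i = 0; i < options.length; i++) {
    sum += options[i].openInterest;
  }
  return sum;
}

console.log(totalDX, totalOpenInterest(chain)); // same result either way
```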

Although declarative code is just an abstraction, moving from imperative jQuery to declarative React was a major change in my coding experience. My work went from telling the system how to do something to simply telling it what to do. Of course, in React—especially at first—I had to explain how to do things, but only once to create a component. After that, I could just tell the system what to do. Now, I can simply declare the desired outcome, the what. It helps to understand how things work, but that level of detail is becoming less necessary.

javier2 · a year ago
Nah, a Chat is terrible for development. In my tears of working, I have only had the chance to start a new codebase 3-4 times. 90% of the time is spent modifying large existing systems, constantly changing them. The chat interface is terrible for this. It would be much better if it were more integrated with the codebase and editor.
pc86 · a year ago
Cursor does all of this, and agent chats let you describe a new feature or an existing bug and it will search the entire codebase and add relevant code to its context automatically. You can optionally attach files for the context - code files that you want to add to the context up front, documentation for third-party calls, whatever you want.

As a side note, "No, you're wrong" is not a great way to have a conversation.

zahlman · a year ago
>In my tears of working

Sometimes typos are eerily appropriate ;)

(I almost typed "errily"...)

rafaelmn · a year ago
This only works for small self-contained problems with narrow scope/context.

Chat sucks for pulling in context, and the only worse thing I've tried is the IDE integrations that supposedly pull the relevant context for you (and I've tried quite a few recently).

I don't know if naive fine-tuning on a codebase would work, but I suspect there are going to be tools that let you train the AI on your code, in the sense that it has some references in-model and knows what you want your project code/structure to look like (which is often quite different from what it looks like in most areas).

jacob019 · a year ago
Totally agree. Chat is a fantastic interface because it stays out of my way. For me it's much more than a coding assistant. I get live examples of how to use tools, and help with boilerplate, which is a time saver and an improvement over legacy workflows, but the real benefit is all the spitballing I can do with it to refine ideas and logic, and the help getting up to speed on tooling way outside of my domain.

I spent about 3.5 hours chatting with o1 about RL architecture to solve some business problems. Now I have a crystal clear plan and the confidence to move forward in an optimal way. I feel a little weird now, like I was just talking to myself for a few hours, but it totally helped me work through the planning.

For actual code, I find myself being a bit less interactive with LLMs as time goes on; sometimes it's easier to just write the logic the way I want rather than trying to explain how I want it. But the ability to retrieve code samples for anything with ease is like a superpower. Not to mention all the cool stuff LLMs can do at runtime via API. Yeah, chat is great, and I'll stick with writing code in Vim and pasting as needed.
gamedever · a year ago
What did you create? In my field, so far, I've found the chat bots not doing so well. My guess is that the more your project resembles something other people make often, the more likely the bot will help.

Even then, though, I asked o1-cursor to start a React app. It failed, mostly because it's out of date. Its instructions were for React from two versions ago.

This seems like an issue. If the statistically most likely answer is old, that's not helpful.

wiremine · a year ago
The most recent one was a typescript project focused on zod.

I might be reading into your comment, but I agree "top-down" development sucks: "Give me a react that does X". I've had much more success going bottom-up.

And I've often seen models getting confused on versions. You need to be explicit, and even then they forget.

knes · a year ago
IMHO, I would agree with you.

I think chat is a nice intermediary evolution between the CLI (that we use every day) and whatever comes next.

I work at Augment (https://augmentcode.com), which, surprise surprise, is an AI coding assistant. We think about the new modality required to interact with code and AI on a daily basis.

Besides increased productivity (and happiness, since you don't have to do mundane tasks like tests, documentation, etc.), I personally believe that what AI can open up is actually more of a way for non-coders (think PMs) to interact with a codebase. AI is really good at converting specs, user stories, and so on into tasks, which today still need to be implemented by software engineers (with the help of AI for the more tedious work). Think of what Figma did between designers and developers, but applied to coding.

What’s the actual "new UI/UX paradigm"? I don’t know yet. But like with Figma, I believe there’s a happy ending waiting for everyone.

bboygravity · a year ago
Interesting to see the narrative on here slowly change from "LLMs will forever be useless for programming" to "I'm using it every day" over the course of the past year or so.

I'm now bracing for the "oh sht, we're all out of a job next year" narrative.

RHSeeger · a year ago
I think a lot of people have always thought of it as a tool that can help.

I don't want an LLM to generate "the answer" for me in a lot of places, but I do think it's amazing for helping me gather information (and cite where that information came from) and pointers in directions to look. A search engine that generates a concrete answer via LLM is (mostly) useless to me. One that gives me an answer and then links to the facts it used to generate that answer is _very_ useful.

It's the same way with programming. It's great at helping you find what you need. But it needs to be in a way that you can verify it's right, or take its answer and adjust it to what you actually need (based on the context it provides).

wiremine · a year ago
> "oh sht, we're all out of a job next year"

Maybe. My sense is we'd need to see 3 to 4 orders of magnitude improvement on the current models before we can replace people outright.

I do think we'll see a huge productivity boost per developer over the next few years. Some companies will use that to increase their throughput, and some will use it to reduce overhead.

Syzygies · a year ago
An environment such as Cursor supports many approaches for working with AI. "Chat" would be the instructions printed on the bottom, perhaps how their developers use it, but far from the only mode it actually supports.

It is helpful to frame this in the historical arc described by Yuval Harari in his recent book "Nexus" on the evolution of information systems. We're at the dawn of history for how to work with AI, and actively visualizing the future has an immediate ROI.

"Chat" is cave man oral tradition. It is like attempting a complex Ruby project through the periscope of an `irb` session. One needs to use an IDE to manage a complex code base. We all know this, but we haven't connected the dots that we need to approach prompt management the same way.

Flip ahead in Harari's book, and he describes rabbis writing texts on how to interpret [texts on how to interpret]* holy scriptures. Like Christopher Nolan's movie "Inception" (his second most relevant work after "Memento"), I've found myself several dreams deep collaborating with AI to develop prompts for [collaborating with AI to develop prompts for]* writing code together. Test the whole setup on multiple fresh AI sessions, as if one is running a business school laboratory on managerial genius, till AI can write correct code in one shot.

Duh? Good managers already understand this, working with teams of people. Technical climbers work cliffs this way. And AI was a blithering idiot until we understood how to simulate recursion in multilayer neural nets.

AI is a Rorschach inkblot test. Talk to it like a kindergartner, and you see the intelligence of a kindergartner. Use your most talented programmer to collaborate with you in preparing precise and complete specifications for your team, and you see a talented team of mature professionals.

We all experience degradation of long AI sessions. This is not inevitable; "life extension" needs to be tackled as a research problem. Just as old people get senile, AI fumbles its own context management over time. Civilization has advanced by developing technologies for passing knowledge forward. We need to engineer similar technologies for providing persistent memory to make each successive AI session smarter than the last. Authoring this knowledge helps each session to survive longer. If we fail to see this, we're condemning ourselves to stay cave men.

Compare the history of computing. There was a lot of philosophy and abstract mathematics about the potential for mechanical computation, but our worldview exploded when we could actually plug the machines in. We're at the same inflection point for theories of mind, semantic compression, structured memory. Indeed, philosophy was an untestable intellectual exercise before; now we can plug it in.

How do I know this? I'm just an old mathematician, in my first month trying to learn AI for one final burst of productivity before my father's dementia arrives. I don't have time to wait for anyone's version of these visions, so I computed them.

In mathematics, the line in the sand between theory and computation keeps moving. Indeed, I helped move it by computerizing my field when I was young. Mathematicians still contribute theory, and the computations help.

A similar line in the sand is moving, between visionary creativity and computation. LLMs are association engines of staggering scope, and what some call "hallucinations" can be harnessed to generalize from all human endeavors to project future best practices. Like how to best work with AI.

I've tested everything I say here, and it works.

Deleted Comment

larodi · a year ago
I would actually join you, as my longstanding view on coding is that it is best done in pairs. Sadly, humans (and programmers in particular) are not so ready to work arm in arm, and it is even more depressing that it now turns out AI is pairing with us.

Perhaps there's gonna be a post-AI programming movement where people actually stare at the same monitor and discuss while one of them is coding.

As a sidenote: we've done experiments with FOBsters, and when paired this way, they multiply their output. There's something about the psychology of groups and how one can only provide maximum output when teaming.

Even for solo activities, and non-IT activities, such as skiing/snowboard, it is better to have a partner to ride with you and discuss the terrain.

shmoogy · a year ago
Have you tried Cursor? I really like selecting context, then cmd+L to make a chat with it: explain the requirement, hit apply, validate the diff.

Works amazingly well for a lot of what I've been working on the past month or two.

gnatolf · a year ago
I haven't tried cursor yet, but how is this different from the copilot plugin in vscode? Sounds pretty similar.
renegat0x0 · a year ago
One thing I would keep in mind: there are some parts of the project that you really cannot fill with chat output.

I had a crucial area with threads. Code generated by chat seemed to be OK, but it had one flaw. My initial code, written manually, was bug-free; the chat-generated output was not, and it was difficult to catch via inspection.

bongodongobob · a year ago
To add to that, I always add some kind of debug function wrapper so I can hand off the state of variables and program flow to the LLM when I need to debug something. Sometimes it's really hard to explain exactly what went wrong so being able to give it a chunk of the program state is more descriptive.
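
A minimal sketch of that kind of wrapper (all names here are illustrative, not from the commenter's code):

```javascript
// Hypothetical debug helper: record labeled snapshots of program state,
// then dump them as JSON to paste into the LLM chat.
const debugLog = [];

function snapshot(label, state) {
  // deep-copy so later mutations don't rewrite history
  debugLog.push({ label, state: JSON.parse(JSON.stringify(state)) });
}

function dumpForLLM() {
  return JSON.stringify(debugLog, null, 2);
}

// Usage: sprinkle snapshots around the suspect code path...
let cart = { items: [], total: 0 };
snapshot("before add", cart);
cart.items.push({ sku: "A1", price: 9.99 });
cart.total += 9.99;
snapshot("after add", cart);

// ...then paste the dumpForLLM() output into the chat with your question.
```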
throwup238 · a year ago
I do the same for my Qt desktop app. I've got an "Inspector" singleton that allows me to select a component tree via click, similar to browser devtools. It takes a screenshot, dumps the QML source, and serializes the state of the components into the clipboard.

I paste that into Claude and it is surprisingly good at fixing bugs and making visual modifications.

ic4l · a year ago
For me, the o models consistently make more mistakes than Claude 3.5 Sonnet.
pc86 · a year ago
Same for me. I wonder if Claude is better at some languages than others, and o models are better at those weaker languages. There are some devs I know who insist Claude is garbage for coding and o3-* or o4-* are tier 1.
bandushrew · a year ago
Producing 200 lines of usable code an hour is genuinely impressive.

My experiments have been nowhere near that successful.

I would love, love, love to see a transcript of how that process worked over an hour, if that was something you were willing to share.

protocolture · a year ago
100%.

I do all this + rubber ducky the hell out of it.

Sometimes I just discuss concepts of the project with the thing and it helps me think.

I dont think chat is going to be right for everyone but it absolutely works for me.

nonrandomstring · a year ago
> it's actually a good UI

Came to vote good too. I mean, why do we all love a nice REPL? That's chat, right? Chat with an interpreter.

AutistiCoder · a year ago
ChatGPT itself is great for coding.

GitHub Copilot is...not. It doesn't seem to understand how to help me as well as ChatGPT does.

sdesol · a year ago
> 1. I find a "pair programming" mentality is key. I focus on the high-level code, and let the model focus on the lower level code. I code review all the code, and provide feedback. Blindly accepting the code is a terrible approach.

This is what I've found to be key. If I start a new feature, I will work with the LLM to do the following:

- Create problem and solution statement

- Create requirements and user stories

- Create architecture

- Create skeleton code. This is critical since it lets me understand what it wants to do.

- Generate a summary of the skeleton code

Once I have done the above, I will have the LLM generate a reusable prompt that I can use to start LLM conversations with. Below is an example of how I turn everything into a reusable prompt.

https://beta.gitsense.com/?chat=b96ce9e0-da19-45e8-bfec-a3ec...

As I make changes like add new files, I will need to generate a new prompt but it is worth the effort. And you can see it in action here.

https://beta.gitsense.com/?chat=b8c4b221-55e5-4ed6-860e-12f0...

The first message is the reusable prompt message. With the first message in place, I can describe the problem or requirements and ask the LLM what files it will need to better understand how to implement things.

What I am currently doing highlights how I think LLMs are a game changer. VCs are going for moonshots instead of home runs. The ability to gather requirements and talk through a solution before even coding is how I think LLMs will revolutionize things. It is great that they can produce usable code, but what I've found invaluable is that they help you organize your thoughts.

In the last link, I am having a conversation with both DeepSeek v3 and Sonnet 3.5, and the LLMs legitimately saved me hours of work without my writing a single line of code. In the past, I would have just implemented the feature and been done with it, and then I would have had to fix something if I didn't think of an edge case. With LLMs, it literally takes minutes to develop a plan that is extremely well documented and can be shared with others.

This ability to generate design documents is how I think LLMs will ultimately be used. The bonus is producing code, but the reality is that documentation (which can be tedious and frustrating) is a requirement for software development. In my opinion, this is where LLMs will forever change things.

Deleted Comment

zahlman · a year ago
LoC per hour seems to me like a terrible metric.
esafak · a year ago
Why? Since you are vetting the code it generates, the rate at which you end up with code you accept seems like a good measure of productivity.
ikety · a year ago
do you use pair programming tools like aider?
ls_stats · a year ago
> it's actually a good UI

> I just finished a small project

> around 2200 lines

why the top comments on HN are always people who have not read the article

pc86 · a year ago
It's not clear to me in the lines you're quoting that the GP didn't read the article.

Deleted Comment

taeric · a year ago
I'm growing to the idea that chat is a bad UI pattern, period. It is a great record of correspondence, I think. But it is a terrible UI for doing anything.

By and large, I assert this is because the best way to do something is to do that thing. There can be correspondence around the thing, but the artifacts that you are building are separate things.

You could probably take this further and say that narrative is a terrible way to build things. It can be a great way to communicate them, but being a separate entity, it is not necessarily good at making any artifacts.

zamfi · a year ago
With apologies to Bill Buxton: "Every interface is best at something and worst at something else."

Chat is a great UI pattern for ephemeral conversation. It's why we get on the phone or on DM to talk with people while collaborating on documents, and don't just sit there making isolated edits to some Google Doc.

It's great because it can go all over the place and the humans get to decide which part of that conversation is meaningful and which isn't, and then put that in the document.

It's also obviously not enough: you still need documents!

But this isn't an "either-or" case. It's a "both" case.

packetlost · a year ago
I even think it's bad for generalized communication (ie. Slack/Teams/Discord/etc.) that isn't completely throwaway. Email is better in every single way for anything that might ever be relevant to review again or be filtered due to too much going on.
goosejuice · a year ago
I've had the opposite experience.

I have never had any issue finding information in slack with history going back nearly a decade. The only issue I have with Slack is a people problem where most communication is siloed in private channels and DMs.

Email threads are incredibly hard to follow though. The UX is rough and it shows.

taeric · a year ago
Anything that needs to be filtered for viewing again pretty much needs version control. Email largely fails at that, just as hard as other correspondence systems do. That said, we have common workflows that use email to build reviewed artifacts.

People love complaining about the email workflow of git, but it is demonstrably better than any chat program for what it is doing.

SoftTalker · a year ago
Yes, agree. Chatting with a computer has all the worst attributes of talking to a person, without any of the intuitive understanding, nonverbal cues, even tone of voice, that all add meaning when two human beings talk to each other.
TeMPOraL · a year ago
That comment made sense 3 years ago. LLMs already solved "intuitive understanding", and the realtime multimodal variants (e.g. the thing behind "Advanced Voice" in the ChatGPT app) handle tone of voice in both directions. As for nonverbal cues, I don't know yet - I got live video enabled in ChatGPT only a few days ago and didn't have time to test it, but I would be surprised if it couldn't read the basics of body language at this point.

Talking to a computer still sucks as a user interface - not because a computer can't communicate on multiple channels the way people do, as it can do that now too. It sucks for the same reason talking to people sucks as a user interface - because the kinds of tasks we use computers for (the ones that aren't just talking with/to/at other people via electronic means) are better handled by doing than by talking about them. We need an interface to operate a tool, not an interface to an agent that operates a tool for us.

As an example, consider driving (as in, realtime control - not just "getting from point A to B"): a chat interface to driving would suck just as badly as being a backseat driver sucks for both people in the car. In contrast, a steering wheel, instead of being a bandwidth-limiting indirection, is an anti-indirection - not only does it let you control the machine with your body, the control is direct enough that over time your brain learns to abstract it away, and the car becomes an extension of your body. We need more tangible interfaces like that with computers.

The steering wheel case, of course, would fail with "AI-level smarts" - but that still doesn't mean we should embrace talking to computers. A good analogy is dance - it's an interaction between two independently smart agents exploring an activity together, and as they do it enough, it becomes fluid.

So dance, IMO, is the steering wheel analogy for AI-powered interfaces, and that is the space we need to explore more.

aylmao · a year ago
I would also call it having all the worst attributes of a CLI, without the succinctness, OS integration, and program composability of one.
hakfoo · a year ago
The idea of chat interfaces always seemed to be to disguise available functionality.

It's a CLI without the integrity. When you bought a 386, it came with a big book that said "MS-DOS 4.01" and enumerated the 75 commands you can type at the C:\> prompt and actually make something useful happen.

When you argue with ChatGPT, its whole business is to not tell you what those 75 commands are. Maybe your prompt fits its core competency and you'll get exactly what you wanted. Maybe it's hammering what you said into a shape it can parse and producing marginal garbage. Maybe it's going to hallucinate from nothing. But it's going to hide that behind a bunch of cute language and hopefully you'll just keep pulling the gacha and blaming yourself if it's not right.

taeric · a year ago
Yeah, this is something I didn't make clear on my post. Chat between people is the same bad UI. People read in the aggression that they bring to their reading. And get mad at people who are legit trying to understand something.

You have some of the same problems with email, of course. Losing threading, in particular, made things worse. It was a "chatification of email" that caused people to lean in to email being bad. Amusing that we are now seeing chat applications rise to replace email.

Suppafly · a year ago
I like the idea of having a chat program, the issue is that it's horrible to have a bunch of chat programs all integrated into every application you use that are separate and incompatible with each other.

I really don't like the idea of chatting with an AI though. There are better ways to interface with AIs and the focus on chat is making people forget that.

tux1968 · a year ago
We need an LSP like protocol for AI, so that we can amortize the configuration over every place we want such an integration. AISP?
freedomben · a year ago
Midjourney is an interesting case study in this I think, building their product UI as a discord bot. It was interesting to be sure, but I always felt like I was fighting the "interface" to get things done. It certainly wasn't all bad, and I think if I used it more it might even be great, but as someone who doesn't use Discord other than that and only rarely generated images, I had to read the docs every time I wanted to generate an image, which is a ridiculous amount of friction.
joe_guy · a year ago
Midjourney recently added a fairly large UI directly inside Discord, with the option of using it instead of the text input.

As is often the case in these sorts of things, your mileage may vary for the more complex settings.

ijk · a year ago
I'm curious if you find their new website interface more tractable--there's some inherent friction to the prompting in either case, but I'd like to know if the Discord chat interface can be overcome by using a different interface or if the issue is more intrinsic.
dapperdrake · a year ago
Email threads seem better for documenting and searching correspondence.

The last counter argument I read got buried on Discord or Slack somewhere.

jayd16 · a year ago
Isn't this entirely an implementation detail of Slack and Discord search? What about email makes it more searchable fundamentally? The metadata of both platforms is essentially the same, no?
al_borland · a year ago
I find things get buried just as easily in email. People on my team are constantly resending each other emails, because they can’t find the thread.

This is why, if something is important, I take it out of email and put it into a document people can reference. The latest and correct information from all the decisions in the thread can also be collected in one place, so everyone reading doesn’t have to figure it out. Not to mention side conversations can influence the results, without being explicitly stated in the email thread.

taeric · a year ago
Discord and slack baffle me. I liked them specifically because they were more ephemeral than other options. Which, seems at odds with how people want them to be? Why?
65 · a year ago
Oh, how nice it must be to complain about Slack. Try using Teams and you will never want to complain about Slack again.
chinathrow · a year ago
Voice messages within a chat UI is even worse. I can't search it, I can't listen to it in the same situations I can read a message.

I wish I could block them within all these chat apps.

"Sorry, you can't bother to send voice messages to this person."

taeric · a year ago
Oh dear lord yes. I am always baffled when I hear that some folks send voice memos to people.
beambot · a year ago
As a written form of "stream of consciousness", it seems to have a lot of value to me. It's noisy, inefficient & meandering -- all the things those polished artifacts are not -- but it's also where you can explore new avenues without worrying about succinctness or completeness. It's like the first draft of a manuscript.
taeric · a year ago
Certainly, it can have its use. But I question whether it is stronger than previous generative techniques for creating many things. There have been strong tools where you could, for example, draw a box and say this should be a house. Add N rooms. This room should be a bathroom. Add windows to these rooms. Add required subfloor and plumbing.

Even with game development. Level editors have a good history for being how people actually make games. Some quite good ones, I should add.

For website development, many template based systems worked quite well. People seem hellbent on never acknowledging that form builders of the late 90s did, in fact, work.

Is it a bit nicer that you can do everything through a dialog? I'm sure it is great for people that think that way.

tpmoney · a year ago
I disagree. Chat is a fantastic UI for getting an AI to generate something vague. Specifically I’m thinking of AI image generation. A chat UI is a great interface for iterating on an image and dialing it in over a series of iterations. The key here is that the AI model needs to keep context both of the image generation history and that chat history.

I think this applies to any “fuzzy generation” scenario. It certainly shouldn’t be the only tool, and (at least as it stands today) isn’t good enough to finalize and fine tune the final result, but a series of “a foo with a bar” “slightly less orange” “make the bar a bit more like a fizzbuzz” interactions with a good chat UI can really get a good 80% solution.

But like all new toys, AI and AI chat will be hammered into a few thousand places where it makes no sense until the hype dies down and we come up with rules and guidelines for where it does and doesn’t work

badsectoracula · a year ago
> Specifically I’m thinking of AI image generation

I heavily disagree here, chat - or really text - is a horrible UI for image generation, unless you have almost zero idea of what you want to achieve and you don't really care about the final results.

Typing "make the bar a bit more like a fizzbuzz" in some textbox is awful UX compared to, say, clicking on the "bar" and selecting "fizzbuzz" or drag-and-dropping "fizzbuzz" on the "bar" or really anything that takes advantage of the fact we're interacting with a graphical environment to do work on graphics.

In fact it is a horrible UI for anything, except perhaps chatbots and tasks that have to do with text like grammar correction, altering writing styles, etc.

It is helpful for impressing people (especially people with money) though.

t_mann · a year ago
Ok, but what is a good pattern to leverage AI tools for coding (assuming that they have some value there, which I think most people would agree with now)? I could see two distinct approaches:

- "App builders" that use some combination of drag&drop UI builders, and design docs for architecture, workflows,... and let the UI guess what needs to be built "under the hood" (a little bit in the spirit of where UML class diagrams were meant to take us). This would still require actual programming knowledge to evaluate and fix what the bot has built

- Formal requirement specification that is sufficiently rigorous to be tested against automatically. This might go some way towards removing the requirement to know how to code, but the technical challenge would simply shift to knowing the specification language

taeric · a year ago
I'd challenge if this is specific to coding? If you want to get a result that is largely like a repertoire of examples used in a training set, chat is probably workable? This is true for music. Visual art. Buildings. Anything, really?

But, if you want to start doing "domain specific" edits to the artifacts that are made, you are almost certainly going to want something like the app builders idea. Down thread, I mention how this is a lot like procedural generative techniques for game levels and such. Such that I think I am in agreement with your first bullet?

Similarly, if you want to make music with an instrument, it will be hard to ignore playing with said instrument more directly. I suspect some people can create things using chat as an interface. I just also suspect directly touching the artifacts at play is going to be more powerful.

I think I agree with the point on formal requirements. Not sure how that really applies to chat as an interface? I think it is hoping for a "laws of robotics" style that can have a test to confirm them? Reality could surprise me, but I always viewed that as largely a fiction item.

staplers · a year ago

  Ok, but what is a good pattern to leverage AI tools for coding?
Actual product stakeholders are not likely to spill their magic sauce and give free consultancy.

kiitos · a year ago
I've yet to see any AI/LLM produce code that withstands even basic scrutiny.
lucasyvas · a year ago
Disclaimer: Haven't used the tools a lot yet, just a bit. So if I say something that already exists, forgive me.

TLDR: Targeted edits and prompts / Heads Up Display

It should probably be more like an overlay (and hooked into context menus with suggestions, inline context bubbles when you want more context for a code block) and make use of an IDE problems view. The problems view would have to be enhanced to allow it to add problems that spanned multiple files, however.

Probably like the Rust compiler output style, but on steroids.

There would likely be some chatting required, but it should all be at a particular site in the code and then go into some history bank where you can view every topic you've discussed.

For authoring, I think an interactive drawing might be better, allowing you to click on specific areas and then use shorter phrasing to make an adjustment instead of having an argument in some chat to the left of your screen about specificity of your request.

Multi-point / click with minimal prompt. It should understand based on what I clicked what the context is without me having to explain it.

gagik_co · a year ago
I think “correspondence UX” can be bad UX but there’s nothing inherently wrong with chat UI.

I created the tetr app[1] which is basically “chat UI for everything”. I did that because I used to message myself notes and wanted to expand it to many more things. There’s not much back and forth, usually 1 input and instant output (no AI), still acting like a chat.

I think there’s a lot of intuitiveness with chat UI and it can be a flexible medium for sharing different information in a similar format, minimizing context switching. That’s my philosophy with tetr anyhow.

[1] https://tetr.app/

marcosdumay · a year ago
> It can be a great way to communicate them

It's usually not. Narrative is a famously flawed way to communicate or record the real world.

It's great for generating engagement, though.

sangnoir · a year ago
> Narrative is a famously flawed way to communicate or record the real world.

...and yet with its flaws, it's the most flexible in conveying meaning. A Ted Chiang interview was on the HN frontpage a few days ago; in it, he mentions that humans created multiple precise, unambiguous communication modes, like the equations used in mathematical papers and proofs. But those same papers are not 100% equations; the mathematicians have to fall back to flawed natural language to describe and provide context, because those formal languages capture a smaller range of human thought than natural language does.

This is not to say chat has the best ergonomics for development - it doesn't - but one has to remember that these tools are based on Large Language Models whose one trick is manipulating language. Better ergonomics would likely come from models trained or fine-tuned on AST tokens and diffs. They'd still need to handle natural language (understanding requirements, hints, and variable names, and authoring comments, commits, and/or PRs).

taeric · a year ago
I think fictional narratives that aim to capture inner monologue are famously flawed. I think narrative tours of things can be good. I'm not clear if "narrated tours" are a specific genre, sadly. :(
OJFord · a year ago
I don't know, I'm in Slack all day with colleagues, I quite like having the additional ChatGPT colleague (even better I can be quite rude/terse in my messages with 'them').

Incidentally I think that's also a good model for how much to trust the output - you might have a colleague who knows enough about X to think they can answer your question, but they're not necessarily right, you don't blindly trust it. You take it as a pointer, or try the suggestion (but not surprised if it turns out it doesn't work), etc.

taeric · a year ago
Oh, do not take my comment as a "chat bots shouldn't exist." That is not at all my intent. I just think it is a bad interface for building things that are self contained in the same chat log.
varispeed · a year ago
Talk to the AI chat as you would talk to a junior developer at your company, and tell it to do something you need.

I think it is brilliant. On the other hand, I have caught myself many times writing prompts to colleagues. Although it did make the requirements of what I need so much clearer for them.

Sylamore · a year ago
NC DMV replaced their regular forms with a chat bot and it's horrible. Takes forever to complete tasks that used to take less than a minute because of the fake interaction and fake typing. Just give me a damn form to pay my taxes or request a custom plate.
brobdingnagians · a year ago
Similar thing I've run into lately: chat is horrible for tracking issues and tasks. When people try to use it that way, it becomes absolute chaos after a while.
dartos · a year ago
Preach!

I’ve been saying this since 2018

themanmaran · a year ago
I'm surprised that the article (and comments) haven't mentioned Cursor.

Agreed that copy pasting context in and out of ChatGPT isn't the fastest workflow. But Cursor has been a major speed up in the way I write code. And it's primarily through a chat interface, but with a few QOL hacks that make it way faster:

1. Output gets applied to your file in a git-diff style. So you can approve/deny changes.

2. It (kinda) has context of your codebase so you don't have to specify as much. Though it works best when you explicitly tag files ("Use the utils from @src/utils/currency.ts")

3. Directly inserting terminal logs or type errors into the chat interface is incredibly convenient. Just hover over the error and click the "add to chat"

dartos · a year ago
I think the wildly different experiences we all seem to have with AI code tools speaks to the inconsistency of the tools and our own lack of understanding of what goes into programming.

I’ve only been slowed down with AI tools. I tried for a few months to really use them and they made the easy tasks hard and the hard tasks opaque.

But obviously some people find them helpful.

Makes me wonder if programming approaches differ wildly from developer to developer.

For me, if I have an automated tool writing code, it’s bc I don’t want to think about that code at all.

But since LLMs don’t really act deterministically, I feel the need to double check their output.

That’s very painful for me. At that point I’d rather just write the code once, correctly.

kenjackson · a year ago
I use LLMs several times a day, and I think for me the issue is that verification is typically much faster than learning/writing. For example, I've never spent much time getting good at scripting. Sure, probably a gap I should resolve, but I feel like LLMs do a great job at it. And what I need to script is typically easy to verify, I don't need to spend time learning how to do things like, "move the files of this extension to this folder, but rewrite them so that the name begins with a three digit number based on the date when it was created, with the oldest starting with 001" -- or stuff like that. Sometimes it'll have a little bug, but one that I can debug quickly.

Scripting assistance by itself is worth the price of admission.
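For that file-renumbering example, the script an LLM hands back might look roughly like this (a sketch only; the function name is mine, and it uses modification time as a stand-in for "date when it was created", since true creation time is platform-dependent):

```python
import shutil
from pathlib import Path

def move_and_number(src_dir, dest_dir, ext):
    """Move files with the given extension into dest_dir,
    prefixing each name with 001, 002, ... oldest first."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Sort by modification time so the oldest file gets 001.
    files = sorted(Path(src_dir).glob(f"*{ext}"),
                   key=lambda p: p.stat().st_mtime)
    for i, path in enumerate(files, start=1):
        shutil.move(str(path), dest / f"{i:03d}_{path.name}")
```

Easy to eyeball, easy to verify on a test folder before pointing it at anything important.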

The other thing I've found it good at is giving me an English description of code I didn't write... I'm sure it sometimes hallucinates, but never in a way that has been so wrong that it's been apparent to me.

aprilthird2021 · a year ago
I think it's about what you're working on. It's great for greenfield projects, etc. Terrible for complex projects that plug into a lot of other complex projects (like most of the software those of us not at startups work on day to day)
sangnoir · a year ago
> But since LLMs don’t really act deterministically, I feel the need to double check their output.

I feel the same

> That’s very painful for me. At that point I’d rather just write the code once, correctly.

I use AI tools augmentatively, and it's not painful for me, perhaps slightly inconvenient. But for boiler-plate-heavy code like unit tests or easily verifiable refactors[1], adjusting AI-authored code on a per-commit basis is still faster than me writing all the code.

1. Like switching between unit-test frameworks

lolinder · a year ago
I like Cursor, but I find the chat to be less useful than the super advanced auto complete.

The chat interface is... fine. Certainly better integrated into the editor than GitHub Copilot's, but I've never really seen the need to use it as chat - I ask for a change and then it makes the change. Then I fix what it did wrong and ask for another change. The chat history aspect is meaningless and usually counterproductive, because it's faster for me to fix its mistakes than to keep everything in the chat window while prodding it the last 20% of the way.

tarsinge · a year ago
I was very skeptical of AI-assisted coding until I tried Cursor and experienced the super autocomplete. It is ridiculously productive. For me it's to the point that it makes Vim obsolete, because pressing tab correctly finishes the line or code block 90% of the time. Every developer with an opinion on AI assistance should just try downloading Cursor and start editing a file.
themanmaran · a year ago
Agreed, the autocomplete definitely gets more mileage than the chat. But I frequently use it for terminal commands as well. Especially AWS CLI work.

"how do I check the cors bucket policies on [S3 bucket name]"

fragmede · a year ago
> while prodding it the last 20% of the way.

hint: you don't get paid to get the LLM to output perfect code, you get paid by PRs submitted and landed. Generate the first 80% or whatever with the LLM, and then finish the last 20% that you can write faster than the LLM yourself, by hand.

koito17 · a year ago
I'm not familiar with Cursor, but I've been using Zed with Claude 3.5 Sonnet. For side projects, I have found it extremely useful to provide the entire codebase as context and send concise prompts focusing on a single requirement. Claude handles "junior developer" tasks well when each unit of work is clearly separated.

Zed makes it trivial to attach documentation and terminal output as context. To reduce risk of hallucination, I now prefer working in static, strongly-typed languages and use libraries with detailed documentation, so that I can send documentation of the library alongside the codebase and prompt. This sounds like a lot of work, but all I do is type "/f" or "/t" in Zed. When I know a task only modifies a single file, then I use the "inline assist" feature and review the diffs generated by the LLM.

Additionally, I have found it extremely useful to actually comment a codebase. LLMs are good at unstructured human language, it's what they were originally designed for. You can use them to maintain comments across a codebase, which in turn helps LLMs since they get to see code and design together.

Last weekend, I was able to re-build a mobile app I made a year ago from scratch with a cleaner code base, better UI, and implement new features on top (making the rewrite worth my time). The app in question took me about a week to write by hand last year; the rewrite took exactly 2 days.

---

As a side note: a huge advantage of Zed with locally-hosted models is that one can correct the code emitted by the model and force the model to re-generate its prior response with those corrections. This is probably the "killer feature" of models like qwen2.5-coder:32b. Rather than sending extra prompts and bloating the context, one can just delete all output from where the first mistake was made, correct the mistake, then resume generation.
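The delete-and-resume trick amounts to truncating the model's output at the first mistake, splicing in the correction, and resubmitting that as the completion prefix. The endpoint call itself varies by server (llama.cpp, Ollama, etc.), so here is only the prefix-splicing step, as a hedged sketch with a made-up function name:

```python
def resume_from_correction(model_output: str, mistake: str, correction: str) -> str:
    """Build the prefix to resubmit to a local completion endpoint:
    everything before the first occurrence of the mistake, plus the
    corrected text. Everything after the mistake is deliberately
    dropped - the model regenerates it from the corrected prefix."""
    idx = model_output.find(mistake)
    if idx == -1:
        raise ValueError("mistake not found in model output")
    return model_output[:idx] + correction
```

The key property is that no extra "please fix this" prompt enters the context; the model simply continues as if it had written the correction itself.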

stitched2gethr · a year ago
I think this misses the point. It seems like the author is saying we should move from imperative instructions to a declarative document that describes what the software should do.

Imperative:

- write an HTTP server that serves jokes

- add a healthcheck endpoint

- add TLS and change the serving port to 443

Declarative:

- an HTTP server that serves jokes

- contains a healthcheck endpoint

- supports TLS on port 443

The differences here seem minimal because you can see all of it at once, but in the current chat paradigm you'd have to search through everything you've said to the bot to get the full context, including the side roads that never materialized.

In the document approach you're constantly refining the document. It's better than reviewing the code because (in theory) you're looking at "support TLS on port 443" instead of a lot of code, which means it can be used by a wider audience. And ideally I can give the same high level spec to multiple LLMs and see which makes the best application.
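To make the contrast concrete, here is a minimal sketch of what an LLM might produce from the declarative spec above, using only the Python stdlib. The endpoint paths and joke text are my own assumptions, and "supports TLS on port 443" is omitted since it requires a certificate:

```python
import json
import random
from http.server import BaseHTTPRequestHandler, HTTPServer

JOKES = [
    "Why do programmers prefer dark mode? Because light attracts bugs.",
]

class JokeHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/joke":
            self._reply(200, {"joke": random.choice(JOKES)})
        elif self.path == "/healthcheck":
            self._reply(200, {"status": "ok"})
        else:
            self._reply(404, {"error": "not found"})

    def _reply(self, status, payload):
        body = json.dumps(payload).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

# "supports TLS on port 443" would wrap the server socket via
# ssl.SSLContext.wrap_socket with a certificate; omitted here.
# To run: HTTPServer(("", 8080), JokeHandler).serve_forever()
```

Reviewing three spec bullets is clearly cheaper than reviewing this code, which is the point of the document approach.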

ygouzerh · a year ago
Good explanation! As an open reflection: will a declarative document be as detailed as the imperative version? Often, between the specs the product team provides (which we can consider the "declarative" document) and the implementation, many sub-specs are created by the tech team as they uncover important implementation details. It's like a rabbit hole.

For example, for a signup page, we could have:

- Declarative: sign up the user using their email address

- Imperative: to do the same, we will need to pick an SMTP library, which means discovering that we need an SMTP server, so now we need to choose which one. And when purchasing an SMTP server plan, we discover that there are rate limits, so now we need to add some bot protection to our signup page (IP rate limit only? ReCaptcha? Cloudflare bot protection?), etc.

Which means that at the end, the imperative code way is kind of like the ultimate implementation specs.

skydhash · a year ago
The issue is that there's no execution platform for declarative specs, so something will be translated to imperative, and that is where the issue lies. There's always an imperative core, which either needs to be deterministic or its output needs to be verified. LLMs are not the former, and the latter option can take more time than just writing the code.
patrickaljord · a year ago
Instead of Cursor I would recommend two open source alternatives that you can combine: https://www.continue.dev/ and https://github.com/cline/cline
freeone3000 · a year ago
It's not nearly as slick. Cursor's indexing and integration are significant value-adds.
coder543 · a year ago
I used Continue before Cursor. Cursor’s “agent” composer mode is so much better than what Continue offered. The agent can automatically grep the codebase for relevant files and then read them. It can create entirely new files from scratch. I can still manually provide some files as context, but it’s not usually necessary. With Continue, everything was very manual.

Cursor also does a great job of showing inline diffs of what composer is doing, so you can quickly review every change.

I don’t think there’s any reason Continue can’t match these features, but it hadn’t, last I checked.

Cursor also focuses on sane defaults, which is nice. The tab completion model is very good, and the composer model defaults to Claude 3.5 Sonnet, which is arguably the best non-reasoning code model. (One would hope that Cursor gets agent-composer working with reasoning models soon.) Continue felt much more technical… which is nice for power users, but not always the best starting place.

notShabu · a year ago
chat is the best way to orchestrate and delegate. whether or not this is considered "ME writing MY code" is imo a philosophical debate

e.g. executives treat the org as a blackbox LLM and chat w it to get real results

mkozlows · a year ago
Windsurf is even more so this way -- it'll look through your codebase trying to find the right files to inspect, and it runs the build/test stuff and examines the output to see what went wrong.

I found interacting with it via chat to be super-useful and a great way to get stuff done. Yeah, sometimes you just have to drop into the code, and tag a particular line and say "this isn't going to work, rewrite it to do x" (or rewrite it yourself), but the ability to do that doesn't vitiate the value of the chat.

mholm · a year ago
Yeah, the OP has a great idea, but models as-is can't handle that kind of workflow reliably. The article is both a year behind, and a year ahead at the same time. The user must iterate with the chatbot, and you can't do that by just doing a top down 'here's a list of all features, get going, ping me when finished' prompt. AI is a junior engineer, so you have to treat it like a junior engineer, and that means looking through your chat logs, and perhaps backing up to a restore point and going a different direction.
mttrms · a year ago
I've started using Zed on a side project and I really appreciate that you can easily manipulate the chat / context and continue making requests

https://zed.dev/docs/assistant/assistant-panel#editing-a-con...

It's still a "chat" but it's just text at the end of the day. So you can edit as you see fit to refine your context and get better responses.

croes · a year ago
Natural language isn’t made to be precise; that’s why we use a subset of it in programming languages.

So you either need lots of extra text to remove the ambiguity of natural language if you use AI, or you need a special precise subset to communicate with the AI, and that’s just programming with extra steps.

Klaster_1 · a year ago
A lot of extra text usually means prior requirements, meeting transcripts, screen share recordings, chat history, Jira tickets and so on - the same information developers use to produce a result that satisfies the stakeholders and does the job. This seems like a straightforward direction solvable with more compute and more efficient memory. I think this will be the way it pans out.

Real projects don't require an infinitely detailed specification either, you usually stop where it no longer meaningfully moves you towards the goal.

The whole premise of AI developer automation, IMO, is that if a human can develop a thing, then AI should be able to as well, given the same input.

throwaway290 · a year ago
idk if you think all those jira tickets and meetings are precise enough (IMO sometimes the opposite)

By the way, remind me why you need design meetings in that ideal world?:)

> Real projects don't require an infinitely detailed specification either, you usually stop where it no longer meaningfully moves you towards the goal.

The point was that specifications are not detailed enough in practice. A precise enough specification IS code. And the point is literally that natural language is just not made to be precise enough. So you are back where you started.

So you waste time explaining in detail and rehashing requirements in this imprecise language until you see what code you want to see. Which was faster to just... idk.. type.

cube2222 · a year ago
We are kind of actually there already.

With a 200k token window like Claude has you can already dump a lot of design docs / transcripts / etc. at it.

layer8 · a year ago
This premise in your last paragraph can only work with AGI, and we’re probably not close to that yet.
oxfordmale · a year ago
Yes, let's devise a more precise way to give AI instructions. Let's call it pAIthon. This will allow powers that be, like Zuckerberg, to save face and claim that AI has replaced mid-level developers, and enable developers to rebrand themselves as pAIthon programmers.

Joking aside, this is likely where we will end up, just with a slightly higher-level programming interface, making developers more productive.

dylan604 · a year ago
man, pAIthon was just sitting right there for the taking
pjc50 · a year ago
There was a wave of this previously in programming: https://en.wikipedia.org/wiki/The_Last_One_(software)

All the same buzzwords, including "AI"! In 1981!

empath75 · a year ago
AIs actually are very good at this. They wouldn't be able to write code at all otherwise. If you're careful in your prompting, they'll make fewer assumptions and ask clarifying questions before going ahead and writing code.
9rx · a year ago
> If you're careful in your prompting

In other words, if you replace natural language with a programming language then the computer will do a good job of interpreting your intent. But that's always been true, so...

LordDragonfang · a year ago
> they'll make fewer assumptions and ask clarifying questions before going ahead and writing code.

Which model are you talking about here? Because with ChatGPT, I struggle with getting it to ask any clarifying questions before just dumping code filled with placeholders I don't want, even when I explicitly prompt it to ask for clarification.

oxfordmale · a year ago
AI is very good at this. Unfortunately, humans tend to be super bad at providing detailed verbal instructions.
croes · a year ago
AI is a little bit like Occam's razor: when you say hoofbeats, you get horses. Bad if you need zebras.
foobiekr · a year ago
I don’t think I’ve ever seen an llm in any context ask for clarification. Is that a real thing?
spacemanspiff01 · a year ago
Or a proposal/feedback process. Say you are hired by a non-technical person to build something: you generate requirements and a proposed solution, you propose that solution, and they give feedback.

Having a feedback loop is the only viable way to do this. Sure, the client could give you a book on what they want, but often people do not know their edge cases, or what issues may arise, etc.

dylan604 · a year ago
> and that’s just programming with extra steps.

If you know how to program, then I agree, and that's part of why I don't see the point. If you don't know how to program, then the prompt isn't much different than providing the specs/requirements to a programmer.

kokanee · a year ago
> or you need a special precise subset to communicate with AI

haha, I just imagined sending TypeScript to ChatGPT and having it spit my TypeScript back to me. "See guys, if you just use Turing-complete logically unambiguous input, you get perfect output!"

charlieyu1 · a year ago
I guess we could have an LLM translate natural language to some precise subset, get it processed, then translate the output back to natural language
thomastjeffery · a year ago
Natural language can be precise, but only in context.

The struggle is to provide a context that disambiguates the way you want it to.

LLMs solve this problem by avoiding it entirely: they stay ambiguous, and just give you the most familiar context, letting you change direction with more prompts. It's a cool approach, but it's often not worth the extra steps, and sometimes your context window can't fit enough steps anyway.

My big idea (the Story Empathizer) is to restructure this interaction such that the only work left to the user is to decide which context suits their purpose best. Given enough context instances (I call them backstories), this approach to natural language processing could recursively eliminate much of its own ambiguity, leaving very little work for us to do in the end.

Right now my biggest struggle is figuring out what the foundational backstories will be, and writing them.

skydhash · a year ago
That’s what programming languages are: you define a context, then shorten the notation down to symbols. “The symbol a will refer to a value of type string with content ‘abcd’ and cannot refer to anything else for its lifetime” gets you:

  const a = "abcd"
That is called semantics. Programming is mostly fitting the vagueness inherent to natural languages into the precise context of a programming language.
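To sketch the same idea in TypeScript (a hypothetical illustration, not anything from the thread): the whole English sentence collapses into one declaration that the compiler then enforces for you.

```typescript
// "The symbol a will refer to a value of type string with content 'abcd'
// and cannot refer to anything else for its lifetime" becomes:
const a: string = "abcd";

// The compiler now enforces that context, so the ambiguity is gone:
// a = "efgh";           // error: cannot assign to 'a' because it is a constant
// const n: number = a;  // error: type 'string' is not assignable to type 'number'

console.log(a.length); // prints 4 -- "abcd" has four characters
```

The one line of code carries the entire paragraph of natural-language specification, which is exactly the compression the parent comment is describing.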

65 · a year ago
We're going to create SQL all over again, aren't we?
lelanthran · a year ago
A more modern COBOL maybe.

Deleted Comment

matthewsinclair · a year ago
Yep. 100% agree. The whole “chat as UX” metaphor is a cul-de-sac that I’m sure we’ll back out of sooner or later.

I think about this like SQL in the late 80s. At the time, SQL was the “next big thing” that was going to mean we didn’t need programmers, and that management could “write code”. It didn’t quite work out that way, of course, as we all know.

I see chat-based interfaces to LLMs going exactly the same way. The LLM will move down the stack (rather than up) and much more appropriate task-based UX/UI will be put on top of the LLM, coordinated thru a UX/UI layer that is much more sympathetic to the way users actually want to interact with a machine.

In the same way that no end-users ever touch SQL these days (mostly), we won’t expose the chat-based UX of an LLM to users either.

There will be a place for an ad-hoc natural language interface to a machine, but I suspect it’ll be the exception rather than the rule.

I really don’t think there are too many end users who want to be forced to seduce a mercurial LLM using natural language to do their day-to-day tech tasks.

jug · a year ago
I think a counterpoint to this is that SQL has a specific and well-defined meaning, and it takes effort to get what you actually want right. Communication with an AI, however, can request a specific context or requirements but also be intentionally open-ended where we want to give the AI leeway. The great thing here is that humans _and_ AI now quite clearly understand when a sentence is non-specific and when it carries great importance. So I think it’s hard to come up with a more terse or approachable competitor to the sheer flexibility of language. In a way, I think it’s a similar problem to the one that has had engineers across the world inputting text commands into a terminal screen for about 80 years now.
sangnoir · a year ago
> The whole “chat as UX” metaphor is a cul-de-sac that I’m sure we’ll back out of sooner or later.

Only when someone discovers another paradigm that matches or exceeds the effectiveness of LLMs without being a language model.

amedviediev · a year ago
I actually came to the same conclusion. I am currently working on a side project that's an AI powered writing app for writers, and while I still provide chat because that seems to be the expectation, my goal is to abstract all the AI assistance a writer might need into curated UI options.
daxfohl · a year ago
Or DSLs like cucumber for acceptance tests. Cute for simple things, but for anything realistic, it's more convoluted than convenient.
spolsky · a year ago
I don't think Daniel's point is that Chat is generically a clunky UI and therefore Cursor cannot possibly exist. I think he's saying that to fully specify what a given computer program should do, you have to provide all kinds of details, and human language is too compressed and too sloppy to always include those details. For example, you might say "make a logon screen" but there are an infinite number of ways this could be done and until you answer a lot of questions you may not get what you want.

If you asked me two or three years ago I would have strongly agreed with this theory. I used to point out that every line of code was a decision made by a programmer and that programming languages were just better ways to convey all those decisions than human language because they eliminated ambiguity and were much terser.

I changed my mind when I saw how LLMs work. They tend to fill in the ambiguity with good defaults that are somewhere between "how everybody does it" and "how a reasonably bright junior programmer would do it".

So you say "give me a log on screen" and you get something pretty normal with Username and Password and a decent UI and some decent color choices and it works fine.

If you wanted to provide more details, you could tell it to use the background color #f9f9f9. But part of what surprised me, and caused me to change my mind on this matter, was that you could also leave that out and you wouldn't get an error; you wouldn't get white text on a white background; you would get a decent color that might be #f9f9f9 or might be #a1a1a1, and you saved a lot of time by not thinking about that level of detail while still getting a good result.

zamfi · a year ago
Yeah, and in fact this is about the best-case scenario in many ways: "good defaults" that get you approximately where you want to be, with a way to update when those defaults aren't what you want.

Right now we have a ton of AI/ML/LLM folks working on this first clear challenge: better models that generate better defaults, which is great—but also will never solve the problem 100%, which is the second, less-clear challenge: there will always be times you don't want the defaults, especially as your requests become more and more high-level. It's the MS Word challenge reconstituted in the age of LLMs: everyone wants 20% of what's in Word, but it's not the same 20%. The good defaults are good except for that 20% you want to be non-default.

So there need to be ways to say "I want <this non-default thing>". Sometimes chat is enough for that, like when you can ask for a different background color. But sometimes it's really not! This is especially true when the things you want are not always obvious from limited observations of the program's behavior—where even just finding out that the "good default" isn't what you want can be hard.

Too few people are working on this latter challenge, IMO. (Full disclosure: I am one of them.)

skydhash · a year ago
Which no one really argues about. But writing code was never the main issue in software projects. If you open any book about software engineering, there’s barely any mention of coding. The issue is the process of figuring out what code to write, and where to put it, in a practical and efficient way.

In your example, the issue is not writing the logon screen (you can find several examples on GitHub, and a lot of CSS frameworks have form snippets). The issue is making sure that it works and integrates well with the rest of the project, as well as being easy to maintain.

jakelazaroff · a year ago
I agree with the premise but not with the conclusion. When you're building visual things, you communicate visually: rough sketches, whiteboard diagrams, mockups, notes scrawled in the margins.

Something like tldraw's "make real" [1] is a much better bet, imo (not that it's mutually exclusive). Draw a rough mockup of what you want, let AI fill in the details, then draw and write on it to communicate your changes.

We think multi-modally; why should we limit the creative process to just text?

[1] https://tldraw.substack.com/p/make-real-the-story-so-far

Edmond · a year ago
This is about relying on requirements-type documents to drive AI-based software development. I believe this will ultimately be integrated into all the AI dev tools, if it isn't already. It is really just additional context.

Here is an example of our approach:

https://blog.codesolvent.com/2024/11/building-youtube-video-...

We are also using the requirements to build a checklist, the AI generates the checklist from the requirements document, which then serves as context that can be used for further instructions.

Here's a demo:

https://youtu.be/NjYbhZjj7o8?si=XPhivIZz3fgKFK8B

wongarsu · a year ago
Now we just need another tool that allows stakeholders to write requirement docs using a chat interface