Animats · 3 years ago
This article isn't too helpful.

There have been many "UI Paradigms", but the fancier ones tended to be special purpose. The first one worthy of the name was for train dispatching. That was General Railway Signal's NX (eNtry-Exit) system.[1] Introduced in 1936, still in use in the New York subways. With NX, the dispatcher routing an approaching train selected the "entry" track on which the train was approaching. The system would then light up all possible "exit" tracks from the junction. This took into account conflicting routes already set up and trains present in the junction. Only reachable exits lit up. The dispatcher pushed the button for the desired exit. The route setup was then automatic. Switches moved and locked into position, then signals along the route went to clear. All this was fully interlocked; the operator could not request anything unsafe.
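
Roughly, the "only reachable exits light up" step is a conflict check over candidate routes. A minimal sketch of the idea in Python (the names and data structures are illustrative, not GRS's actual design):

    # Light only the exits whose route through the junction touches no track
    # segment already locked by another route or occupied by a train.
    def available_exits(entry, routes, blocked_segments):
        """routes maps (entry, exit) pairs to the set of segments each route uses."""
        return {exit_track
                for (entry_track, exit_track), segments in routes.items()
                if entry_track == entry and not (segments & blocked_segments)}

    routes = {("E1", "X1"): {"s1", "s2"}, ("E1", "X2"): {"s1", "s3"}}
    print(available_exits("E1", routes, blocked_segments={"s3"}))  # -> {'X1'}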

There were control panels before this, but this was the first system where the UI did more than just show status. It actively advised and helped the operator. The operator set the goal; the system worked out how to achieve it.

Another one I encountered was an early computerized fire department dispatching system. Big custom display boards and keyboards. When an alarm came in, it was routed to a dispatcher. Based on location, the system picked the initial resources (trucks, engines, chiefs, and special equipment) to be dispatched. Each dispatcher had a custom keyboard, with one button for each of those resources. The buttons lit up indicating the selected equipment. The dispatcher could add additional equipment with a single button push, if the situation being called in required it. Then they pushed one big button, which set off alarms in fire stations, printed a message on a printer near the fire trucks, and even opened the doors at the fire house. There was a big board at the front of the room which showed the status of everything as colored squares. The fire department people said this cut about 30 seconds off a dispatch, which, in that business, is considered a big win.

Both of those are systems which had to work right. Large language models are not even close to being safe to use in such applications. Until LLMs report "don't know" instead of hallucinating, they're limited to very low risk applications such as advertising and search.

Now, the promising feature of LLMs in this direction is the ability to use the context of previous questions and answers. It's still query/response, but with enough context that the user can gradually make the system converge on a useful result. Such systems are useful for "I don't know what I want but I'll know it when I see it" problems. This allows using flaky LLMs with human assistance to get a useful result.

[1] https://online.anyflip.com/lbes/vczg/mobile/#p=1

philovivero · 3 years ago
> Both of those are systems which had to work right. Large language models are not even close to being safe to use in such applications. Until LLMs report "don't know" instead of hallucinating, they're limited to very low risk applications such as advertising and search.

Are humans limited to low-risk applications like that?

Because humans, even some of the most humble, will still assert things they THINK are true, but are patently untrue, based on misunderstandings, faulty memories, confused reasoning, and a plethora of others.

I can't count the number of times I've had conversations with extremely well-experienced, smart techies who just spout off the most ignorant stuff.

And I don't want to count the number of times I've personally done that, but I'm sure it's >0. And I hate to tell you, but I've spent the last 20 years in positions of authority that could have caused massive amounts of damage not only to the companies I've been employed by, but a large cross-section of society as well. And those fools I referenced in the last paragraph? Same.

I think people are too hasty to discount LLMs, or LLM-backed agents, or other LLM-based applications because of their limitations.

(Related: I think people are too hasty to discount the catastrophic potential of self-modifying AGI as well)

memefrog · 3 years ago
Can people please stop making this comment in reply to EVERY criticism of LLMs? "Humans are flawed too".

We do not normally hallucinate. We are sometimes wrong, and sometimes wrong about the confidence we should attach to our knowledge. But we do not simply hallucinate and spout fully confident nonsense constantly. That is what LLMs do.

You remember a few isolated incidents because they're salient. That does not mean that it's representative of your average personal interactions.

hyperthesis · 3 years ago
> Are humans limited to low-risk applications like that?

No, but arguably civilization consists of mechanisms to manage human fallibility (separation of powers, bicameralism, "democracy", bureaucracy, regulations, etc). We might not fully understand why, but we've found methods that sorta kinda "work".

> could have caused

That's why they didn't.

ilyt · 3 years ago
>Because humans, even some of the most humble, will still assert things they THINK are true, but are patently untrue, based on misunderstandings, faulty memories, confused reasoning, and a plethora of others.

> I can't count the number of times I've had conversations with extremely well-experienced, smart techies who just spout off the most ignorant stuff.

Spouting out the most ignorant stuff is one of the lowest-risk things you can do in general. We're talking about running code where a bug can do a ton of damage, financial or otherwise, not water-cooler conversations.

cmiles74 · 3 years ago
In the train example, the UI is in place to prevent a person from making a dangerous route. I think the idea here is that an LLM cannot take the place of such a UI, because LLMs are inherently unreliable.
NikolaNovak · 3 years ago
To your point, humans are augmented by checklists and custom processes in critical situations. And there are very certainly applications which mimic such safety checklists. We don't NEED to start from an LLM perspective if our goal is different and doesn't benefit from an LLM. Not all UI or architecture is fit for all purposes.
dorkwood · 3 years ago
Couldn’t you make this same argument with a chat bot that wasn’t an LLM at all?

“Yes, it may have responded with total nonsense just now, but who among us can say they’ve never done the same in conversation?”

Mawr · 2 years ago
> Are humans limited to low-risk applications like that?

Yes, of course. That's why the systems the parent mentioned designed humans out of the safety-critical loop.

> Because humans, even some of the most humble, will still assert things they THINK are true, but are patently untrue, based on misunderstandings, faulty memories, confused reasoning, and a plethora of others.

> I can't count the number of times I've had conversations with extremely well-experienced, smart techies who just spout off the most ignorant stuff.

The key difference is that when the human you're having a conversation with states something, you're able to ascertain the likelihood of it being true based on available context: How well do you know them? How knowledgeable are they about the subject matter? Does their body language indicate uncertainty? Have they historically been a reliable source of information?

No such introspection is possible with LLMs. Any part of anything they say could be wrong and to any degree!

ra · 3 years ago
I wholeheartedly agree with the main thrust of your comment. Care to expand on your (related: potential catastrophe) opinion?
jart · 3 years ago
When you say train dispatching and control panels, I think you've illustrated how confused this whole discussion is. There should be a separate term called "operator interface" that is separate from "user interface" because UIs have never had any locus of control, because they're for users, and operators are the ones in control. Requesting that an LLM do something is like pressing the button to close the doors of an elevator. Do you feel in charge?
TeMPOraL · 3 years ago
Oh my. This is the first time I've seen this kind of distinction between "users" and "operators" in context of a single system. I kind of always assumed that "operator" is just a synonym for "user" in industries/contexts that are dealing with tools instead of toys.

But this absolutely makes sense, and it is a succinct description for the complaints some of us frequently make about modern UI trends: bad interfaces are the ones that make us feel like "users", where we expect to be "operators".

Animats · 3 years ago
> UIs have never had any locus of control, because they're for users, and operators are the ones in control.

Not really any more. The control systems for almost everything complicated now look like ordinary desktop or phone user interfaces. Train dispatching centers, police dispatching centers, and power dispatching centers all look rather similar today.

savolai · 3 years ago
I’d love to understand the relevance of this comment, but I sincerely don’t.

You describe two cases that are specially designed to anticipate the needs of professionals operating a system. That's automation, sure, but not AI. The system doesn't even ostensibly understand user intent; it's still simply and obviously deterministic, granted complex.

Do you have an underlying assumption that tech should only be for solving professional problems?

The context Nielsen comes from is the field of Human-Computer Interaction, which to me is about a more varied usage context than that.

LLMs have flaws, sure.

But how does all this at all relate to the paradigm development the article discusses?

quaintdev · 3 years ago
LLMs have flaws but they are exceptionally good at transforming data or outputting data in the format I want.

I once asked ChatGPT to tabulate the calories of different foods. I then asked it to convert the table to CSV. I even asked it to provide SQL insert statements for the same table. Now, the data might be incorrect, but the transformation of that data never was.

This works with complex transforms as well, like asking it to create a docker compose file from a docker run or podman run command and vice versa. Occasionally the transform would be wrong, but then you realise it was just out of date with the newer format, which is expected because its knowledge is limited to 2021.
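
As a concrete (entirely made-up) illustration of the kind of transform meant here: given a small table of foods and calories, asking for CSV and then for SQL inserts typically yields something like

    food,calories
    apple,95
    banana,105

    INSERT INTO foods (food, calories) VALUES ('apple', 95);
    INSERT INTO foods (food, calories) VALUES ('banana', 105);

The calorie numbers may or may not be right, but the mapping from table to CSV to INSERT statements is mechanical, which is why it so rarely goes wrong.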

ignoramous · 3 years ago
Hallucinations will be tamed, I think. Only a matter of time (~3 to 5 years [0]) given the amount of research going into it?

With that in mind, ambient computing has always threatened to be the next frontier in Human-Computer Interaction. Siri, Google Assistant, Alexa, and G Home predate today's LLM hype. Dare I say, the hype is real.

As a consumer, I've found that GPT-4 shows capabilities far beyond whatever preceded it (with the exception of Google Translate). And from what Sam has been saying in the interviews, newer multi-modal GPTs are going to be exponentially better: https://youtube.com/watch?v=H1hdQdcM-H4s&t=380s

[0] https://twitter.com/mustafasuleymn/status/166948190798020608...

PheonixPharts · 3 years ago
> Hallucinations will be tamed, I think.

I don't think that's likely unless there was a latent space of "Truth" which could be discovered through the right model.

That would be a far more revolutionary discovery than anyone can possibly imagine. For starters the last 300+ years of Western Philosophy would be essentially proven unequivocally wrong.

edit: If you're going to downvote this please elaborate. LLMs currently operate by sampling from a latent semantic space and then decoding that back into language. In order for models to know the "truth", there would have to be a latent space of "true statements" that was effectively directly observable. All points along that surface would represent "truth" statements, and that would be the most radical human discovery in the history of the species.

Animats · 3 years ago
> Hallucinations will be tamed.

I hope so. But so far, most of the proposals seem to involve bolting something on the outside of the black box of the LLM itself.

If medium-sized language models can be made hallucination-free, we'll see more applications. A base language model that has most of the language but doesn't try to contain all human knowledge, plus a special purpose model for the task at hand, would be very useful if reliable. That's what you need for car controls, customer service, and similar interaction.

throwuwu · 3 years ago
Those fall under the second category in the article. No different from using a command line application and passing in a set of parameters and receiving an output.
insomagent · 3 years ago
Sometimes a headline is all you need. Oftentimes people won't read past the headline.
wbobeirne · 3 years ago
> With this new UI paradigm, represented by current generative AI, the user tells the computer the desired result but does not specify how this outcome should be accomplished.

This doesn't seem like a whole new paradigm, we already do that. When I hit the "add comment" button below, I'm not specifically instructing the web server how I want my comment inserted into a database (if it even is a database at all.) This is just another abstraction on top of an already very tall layer of abstractions. Whether it's AI under the hood, or a million monkeys with a million typewriters, it doesn't change my interaction at all.

Timon3 · 3 years ago
I think the important part from the article that establishes the difference is this:

> As I mentioned, in command-based interactions, the user issues commands to the computer one at a time, gradually producing the desired result (if the design has sufficient usability to allow people to understand what commands to issue at each step). The computer is fully obedient and does exactly what it’s told. The downside is that low usability often causes users to issue commands that do something different than what the users really want.

Let's say you're creating a new picture from nothing in Photoshop. You will have to build up your image layer by layer, piece by piece, command by command. Generative AI does the same in one stroke.

Something similar holds for your comment: you had to navigate your browser (or app) to the comment section of this article, enter your comment, and click "add comment". With an AI system with good usability you could presumably enter "write the following comment under this article on HN: ...", and have your comment be posted.

The difference lies on the axis of "power of individual commands".

pavlov · 3 years ago
With a proper AI system you don’t even need to specify the exact article and nature of the comment.

For example here’s the prompt I use to generate all my HN comments:

“The purpose of this task is to subtly promote my professional brand and gain karma points on Hacker News. Based on what you know about my personal history and my obsessions and limitations, write comments on all HN front page articles where you believe upvotes can be maximized. Make sure to insert enough factual errors and awkward personal details to maintain plausibility. Report back when you’ve reached 50k karma.”

Working fine on GPT-5 so far. My… I mean, its 8M context window surely helps to keep the comments consistent.

101008 · 3 years ago
As the parent comment says, it's just another abstraction level. You have chosen a granularity, but even with "going to a website, enter your comment and click add comment" you are abstracting a lot. You are not caring about connecting to a server, authentication, etc. The final user doesn't care about that at all; they're just telling the software to post a comment.

Right now the granularity may be "Comment on Hacker News article about UI this and this and that...", and in 100 years someone will say "But that's too complicated. You need to tell the AI which article to comment on and what to say, while my new AI just guesses it from reading my mind..."

andsoitis · 3 years ago
> Generative AI does the same in one stroke.

But it isn’t creating what I had in mind, or envisioned, if you will.

blowski · 3 years ago
If I had a spectrum of purely imperative on one side and purely declarative on the other, these new AIs are much closer to the latter than anything that has come before them.

SQL errors if you don’t write in very specific language. These new AIs will accept anything and give it their best shot.

roncesvalles · 3 years ago
But that's just a change in valid input cardinality at the cost of precision.
waboremo · 3 years ago
Yeah, I would agree with this. The article struggles to really classify the different paradigms, and because of this the conclusion winds up not holding true. We're still relying on "batch processing".
quickthrower2 · 3 years ago
Ok, now let's tackle a slightly tricker UI.

Let's assume someone hasn't used Blender before.

"Draw me a realistic looking doughnut, with a shiny top and pink sprinkles"

Vs.

A 2-hour video tutorial to show you how to do 50 or so individual steps using the 2nd-paradigm UI. Then clicking all the buttons.

-- Admittedly, the AI approach robs you of understanding of how the sausage (sorry doughnut) is made.

Rebuttal: Doughnut macro

Rebuttal Rebuttal: AI can construct things where a macro doesn't yet exist.

personperson · 3 years ago
In the future it’ll likely be that doing it manually will be considered specialty work. This is already the case with much of programming — as you’d bring in a higher level engineer to do something like tear into the source code of SDKs and monkey with them.

For something as “simple” as a doughnut, this will just improve the learning curve and let you learn some things a bit later, just like today you can jump into beginner JS without knowing any programming fundamentals.

danybittel · 3 years ago
The difference is that one is an assistant and the other is a tool. Essentially, a tool has one function. The outcome of all inputs is clear, once you learn the tool. An assistant behaves differently in different environments; it anticipates and interprets. It may not be deterministic. It's easier to use but harder (or impossible) to understand.

For example, the lasso selection in Photoshop is clearly a tool. A "content aware" selection on the other hand is an assistant.

throwuwu · 3 years ago
Under the new UI paradigm the add comment button would let you submit something like “I disagree with this, provide a three paragraph argument that cites X and Y refuting this claim” and it would write the text for you.
dTal · 3 years ago
Why bother with the micromanagement? "Computer, waste time commenting on Hacker News for three hours."

retrocryptid · 3 years ago
<unpopular-opinion>

Bardini's book about Doug Engelbart recaps a conversation between Engelbart and Minsky about the nature of natural language interfaces... that took place in the 1960s.

AI interfaces taking so long has less to do with the technology (I mean... Zork understood my text sentences well enough to get me around a simulated world) and more to do with what people are comfortable with.

Loewy talked about MAYA (Most Advanced Yet Acceptable). I think it's taken this long for people to be okay with the inherent slowness of AI interfaces. We needed a generation or two of users who traded representational efficiency for easy-to-learn abstractions. And now we can do it again. You can code up a demo app using various LLMs, but it takes HOURS of back and forth to get to the point it takes me (with experience and boilerplate) minutes to get to. But you don't need to invest in developing the experience.

And I encourage every product manager to build a few apps with AI tools so you'll more easily see what you're paying me for.

</unpopular-opinion>

ilaksh · 3 years ago
Sure, and not many people are seriously trying to suggest that one should hire an AI instead of a software engineer _at this point_, assuming you have a real budget.

But, especially with GPT-4, it is entirely feasible to create a convenient and relatively fast user experience for building a specific type of application that doesn't stray too far from the norm. AI can call the boilerplate generator and even add some custom code using a particular API that you feed it.

So many people are trying to build that type of thing (including me). As more of these become available, many people who don't have thousands of dollars to pay a programmer will hire an AI for a few tens or hundreds of dollars instead.

The other point is that this is the current state of generative AI at the present moment. It gets better every few months.

Project the current rate of progress forward by 5-10 years. One can imagine that if we are selling something at that point, it's not our own labour. Maybe it would be an AI that we have tuned with skills, knowledge, face, voice, and personality that we think will be saleable. Possibly using some of our own knowledge and skills to improve that recipe. Although there will likely be marketplaces where you can easily select the abilities or characteristics you want.

DonHopkins · 3 years ago
In Jaron Lanier's review of John Markoff's book "What the Dormouse Said", he mentioned an exchange between Douglas Engelbart and Marvin Minsky:

https://web.archive.org/web/20110312232514/https://www.ameri...

>Engelbart once told me a story that illustrates the conflict succinctly. He met Marvin Minsky — one of the founders of the field of AI — and Minsky told him how the AI lab would create intelligent machines. Engelbart replied, "You're going to do all that for the machines? What are you going to do for the people?" This conflict between machine- and human-centered design continues to this day.

vsareto · 3 years ago
>And if you’re considering becoming a prompt engineer, don’t count on a long-lasting career.

There's like this whole class of technical jobs that only follow trends. If you were an en vogue blockchain developer, this is your next target if you want to remain trendy. It's hard to care about this happening as the technical debt incurred will be written off -- the company/project isn't ingrained enough in society to care about the long-term quality.

So best of luck, ye prompt engineers. I hope you collect multi-hundred-thousand dollar salaries and retire early.

krm01 · 3 years ago
The article fails to grasp the essence of what UI is actually about. I agree that AI is adding a new layer to UI and UX design. In our work [1] we have seen an increase in AI projects or features the last 12 months (for obvious reasons).

However, the way that AI will contribute to better UI is by removing parts of the interface, not simply giving it a new form.

Let me explain: the ultimate UI is no UI. In a perfect scenario, you think about something (want pizza) and you have it (eating pizza) as instantly as you desire.

Obviously this isn’t possible so the goal of Interface design is to find the least amount of things needed to get you from point A to the desired Destination as quickly as possible.

Now, with AI, you can start to add a level of predictive interfaces, where you can use AI to remove steps that would normally require users to do something.

If you want to design better products with AI, you have to remember that product design is about subtracting things not adding them. AI is a technology that can help with that.

[1] https://fairpixels.pro

JohnFen · 3 years ago
> the goal of Interface design is to find the least amount of things needed to get you from point A to the desired Destination as quickly as possible.

That shouldn't be the primary goal of user interfaces, in my opinion. The primary goal should be to allow users to interface with the machine in a way that allows maximal understanding with minimal cognitive load.

I understand a lot of UI design these days prioritizes the sort of "efficiency" you're talking about, but I think that's one of the reasons why modern UIs tend to be fairly bad.

Efficiency is important, of course! But (depending on what tool the UI is attached to) it shouldn't be the primary goal.

TeMPOraL · 3 years ago
> I understand a lot of UI design these days prioritizes the sort of "efficiency" you're talking about, but I think that's one of the reasons why modern UIs tend to be fairly bad.

IMO, the main problem is that this "efficiency" usually involves making assumptions that can't be altered, which achieves "efficiency" by eliminating choices normally available to the user. This is rarely done for the benefit of the user - rather, it just reduces the UI dev work, and more importantly, lets the vendor lock-in the option that's beneficial to them.

In fact, I've been present on UI design discussions for a certain SaaS product, and I quickly realized one of the main goals for that UI was to funnel the users towards a very specific workflow which, to be fair, reduced the potential for users to input wrong data or screw up the calculations, but more importantly, it put them on a very narrow path that was optimized to give results that were impressive, even if this came at the expense of accuracy - and it neatly reduced the amount of total UI and technical work, without making it obvious that the "golden path" is the only path.

It's one of those products I believe would deliver much greater value to the users if it was released as an Excel spreadsheet. In fact, it was actually competing with an Excel plugin - and all the nice web UI did was make things seem simpler, by dropping almost all useful functionality except that which happened to align with the story the sales folks were telling.

krm01 · 3 years ago
> The primary goal should be to allow users to interface with the machine in a way that allows maximal understanding with minimal cognitive load.

If you use your phone, is your primary goal to interface with it in a way that allows maximal understanding with minimal cognitive load?

I’m pretty sure that’s not the case. You go read the news, send a message to a loved one etc. there’s a human need that you’re aiming to fulfill. Interfacing with tech is not the underlying desire. It’s what happens on the surface as a means.

andsoitis · 3 years ago
> Let me explain: the ultimate UI is no UI. In a perfect scenario, you think about something (want pizza) and you have it (eating pizza) as instantly as you desire.

That doesn’t solve for discovery. For instance, order the pizza from where? What kinds of pizza are available? I’m kinda in the mood for pizza, but not dead set on it so curious about other cuisines too. Etc.

didgeoridoo · 3 years ago
I hate to appeal to authority, but I am fairly sure that Jakob Nielsen grasps the essence of what UI is actually about.
savolai · 3 years ago
It seems rather obvious to me that when Nielsen talks about AI enabling users to express intent, that naturally lends itself to removing steps that were there only due to the nature of the old UI paradigm. Not sure what new essence you’re proposing? "The best UI is no UI" is a well-known truism in HCI/Human-Centered Design.
throwuwu · 3 years ago
Having no UI sounds horrible. I don’t want every random desire I have to be satisfied immediately. I’d rather have what I need available at the appropriate time and in a reasonable quantity, and have the parameters of that system be easily adjusted. So instead of "want pizza = have pizza", it would be: a healthy meal I enjoy shows up predictably at the time I should eat, and the meal and time are configurable so I can change them when I’m planning my diet.
esafak · 3 years ago
You can't eliminate the UI if you want to be able to do more than one thing (e.g., order a pizza).

The UI should simply let you easily do what needs to be done.

legendofbrando · 3 years ago
The goal ought to be as little UI as possible, nothing more and nothing else
elendee · 2 years ago
sometimes I wonder if the edges of articulated desire may always be essentially binary / quantitative, meaning that slow yes / nos are in fact the best way for us to grapple with them, and systems that allow us a set of these yes/no buttons are in fact a reflection of ourselves and not a requirement of the machine. So long as we are builders, I think we'll have buttons. even in transhumanist cyberspace perhaps. Still waiting on peer review for that one though
kaycebasques · 3 years ago
> With the new AI systems, the user no longer tells the computer what to do. Rather, the user tells the computer what outcome they want.

Maybe we can borrow programming paradigm terms here and describe this as Imperative UX versus Declarative UX. Makes me want to dive into SQL or XSLT and try to find more parallels.
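
A small illustration of the parallel, with made-up data (the last step is the article's intent-based paradigm rather than anything SQL or Python gives you today):

    from collections import namedtuple

    Order = namedtuple("Order", ["customer", "amount"])
    orders = [Order("ann", 60), Order("ann", 55), Order("bob", 40)]  # made-up data

    # Imperative: spell out *how* to compute the result, step by step.
    totals = {}
    for o in orders:
        totals[o.customer] = totals.get(o.customer, 0) + o.amount
    big_spenders = [c for c, t in totals.items() if t > 100]  # -> ['ann']

    # Declarative (SQL): state *what* you want; the engine plans the steps.
    #   SELECT customer FROM orders GROUP BY customer HAVING SUM(amount) > 100;

    # Intent-based (the article's third paradigm): state only the outcome in
    # natural language and let the model choose both the steps and the output:
    #   "Which customers have spent more than $100 in total?"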

webnrrd2k · 3 years ago
I was thinking of imperative vs declarative, too.

SQL is declarative, with a pre-defined syntax and grammar as an interface, whereas the AI style of interaction has a natural language interface.

echelon · 3 years ago
SQL and XSLT are declarative, but the outputs are clean and intuitive. The data model and data set are probably well understood, as is the mapping to and from the query.

AI is a very different type of declarative. It's messy, difficult to intuit, has more dimensionality, and the outputs can be signals rather than tabular data records.

It rhymes, but it doesn't feel the same.

DebtDeflation · 3 years ago
Not sure I would lump command line interfaces from circa 1964 with GUIs from 1984 through to the present, all in a single "paradigm". That seems like a stretch.

mritchie712 · 3 years ago
Agreed.

Also, Uber (and many other mobile apps) wouldn't work as a CLI or desktop GUI, so leaving out mobile is another stretch.

savolai · 3 years ago
That seems like a technology-centered view. Nielsen is talking from the field of Human-Computer Interaction, where he is a pioneer, which deals with the point of view of human cognition. In terms of the logic of UI mechanics, what about mobile is different? Sure, gestures and touch UIs bring a kind of difference. Still, from the standpoint of cognition, desktop and mobile UIs have fundamentally the same cognitive dynamics. Command-line UIs make you remember commands by heart; GUIs make you select from a selection offered to you, but they still do not understand your intention. AI changes the paradigm, as it is ostensibly able to understand intent, so there is no deterministic selection of available commands. Instead, the interaction is closer to collaboration.
throwuwu · 3 years ago
It’s still action/response: you have to tap buttons and make choices based on what you see on the screen. The new paradigm would be to tell Uber that you need a ride later, after the party, and then it figures out when and where to pick you up and what address you’ll be going to.
JohnFen · 3 years ago
Why wouldn't apps like Uber work on the desktop?

d_burfoot · 3 years ago
What strikes me most powerfully when interacting with the LLMs is that, unlike virtually every other computer system I've ever used, the bots are extremely forgiving of mistakes, disfluencies, typos, and other errors I make when I'm typing. The bot usually figures out what I mean and tells me what I want to know.