I lead an applied AI research team at a mid-sized public enterprise products company. I've been saying this in my professional circles quite often.
We talk about scaling laws, superintelligence, AGI, etc. But there is another threshold: the ability of humans to leverage superintelligence. It's just incredibly hard to innovate on products that fully leverage it.
At some point, AI needs to connect with the real world to deliver economically valuable output. The rate-limiting step is there, not smarter models.
In my mind, already with GPT-4, we're not generating ideas fast enough on how best to leverage it.
Getting AI to do work involves getting AI to understand what needs to be done from highly bandwidth constrained humans using mouse / keyboard / voice to communicate.
Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
We're seeing much less of "it's making mistakes" these days.
If we have open-source models that match GPT-4 on AWS / Azure etc., there's not much point in going with players like OpenAI / Anthropic who may have even smarter models. We can't even use the dumber models fully.
Your paycheck depends on people believing the hype. Therefore, anything you say about "superintelligence" (LOL) is pretty suspect.
> Getting AI to do work involves getting AI to understand what needs to be done from highly bandwidth constrained humans using mouse / keyboard / voice to communicate.
So, what, you're going to build a model to instruct the model? And how do we instruct that model?
This is such a transparent scam, I'm embarrassed on behalf of our species.
Snarky tone aside, there are different audiences. For example, I primarily work with web dev and some DevOps and I can tell you that the state of both can be pretty dire. Maybe not as much in my particular case, as in general.
Some examples to illustrate the point: supply chain risks and an ever-increasing number of dependencies (look at your average React project, though this applies to most stacks), overly abstracted frameworks (how many CPU cycles Spring Boot and others burn and how many hoops you have to jump through to get things done), patterns that mess up the DB's ability to optimize queries sometimes (EAV, OTLT, trying to create polymorphic foreign keys), inefficient data fetching (sometimes ORMs, sometimes N+1), bad security practices (committed secrets, anyone? bad usage of OAuth2 or OIDC?), overly complex tooling, especially the likes of Kubernetes when you have a DevOps team of one part-time dev, overly complex application architectures where you have more services than developers (not even teams). That's before you even get into the utter mess of long-term projects that have been touched by dozens of developers over the years and the whole sector sometimes feeling like the wild west, as opposed to "real engineering".
However, the difference here is that I wouldn't overwhelm anyone who might give me money with rants about this stuff and would navigate around those issues and risks as best I can, to ship something useful at the end of the day. Same with having constructive discussions about any of those aspects in a circle of technical individuals, on how to make things better.
Calling the whole concept a "scam" doesn't do anyone any good, when I already derive value from the LLMs, as do many others. Look at https://www.cursor.com/ for example and consider where we might be in 10-20 years. Not AGI, but maybe good auto-complete, codegen and reasoning about entire codebases, even if they're hundreds of thousands of lines long. Tooling that would make anyone using it more productive than those who don't. Unless the funding dries up and the status quo is restored.
> Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
Ehm yes. That's because it actually doesn't work as well as the hype suggests, not because it's too "high bandwidth".
Moreover, this is exactly the frustration I've experienced when working with outsourced developers.
Which tells me the problem may be fundamental, not a technical one. It's not just a matter of needing "more intelligence". I don't question the intelligence or skill of the people on the outsourced team I was working with. The problem was simple communication. They didn't really know or understand our business and its goals well enough to anticipate all sorts of little things, and the lack of constant social interaction of the type you typically get when everybody's a direct coworker meant we couldn't build that mind-meld over time, either. So we had to pick up the slack with massive over-specification.
I'd add that we, the engineering class, are on the whole terrible communicators who confuse people and cannot clearly explain our own work. LLMs require clear communication, while the majority of LLM users approach them with a large host of implied context that anyone would have a hard time following, let alone a non-human software construct. The key is clear communication, a phrase whose real, technical meaning many in STEM were never taught.
It works for some annoyances in life, though. For example, you can get it to write a complaint to an administration. It's good enough unless you'd rather be witty and write it yourself.
The "it's making mistakes" phase might be based on the testing strategy.
Remember the old bit about the media -- the stories are always 100% infallible except, strangely, in YOUR personal field of expertise.
I suspect it's something similar with AI products.
People test them with toy problems -- "Hey ChatGPT, what's the square root of 36", and then with something close to their core knowledge.
It might learn to solve a lot of the toy problems, but plenty of us are still seeing a lot of hallucinations in the "core knowledge" questions. But we see people then taking that product-- that they know isn't good in at least one vertical-- and trying to apply it to other contexts, where they may be less qualified to validate if the answer is right.
I think a crucial aspect is to only apply chatbots' answers to a domain where you can rapidly validate their correctness (or alternatively, to take their answers with a huge pinch of salt, or simply as creative search space exploration).
For me, the number of times where it's led me down a hallucinated, impossible, or thoroughly invalid rabbit hole have been relatively minimal when compared against the number of times when it has significantly helped. I really do think the key is in how you use them, for what types of problems/domains, and having an approach that maximizes your ability to catch issues early.
*We're seeing much less of "it's making mistakes" these days.*
Perhaps less than before, but still making very fundamental errors. Anything involving numbers I'm automatically suspicious of. Pretty frequently I'd get different answers for what is (to a human) the same question.
e.g. ChatGPT will give an effective tax rate of n for some income amount. Then when asked to break down the calculation will come up with an effective tax rate of m instead. When asked how much tax is owed on that income will come up with a different number such that the effective rate is not n or m.
Until this is addressed to a sufficient degree, it seems difficult to apply to anything that involves numbers and can't be quickly verified by a human.
Yes. Numbers / math is pretty much instant hallucination.
But. Try this approach instead: have it generate python code, with print statements before every bit of math it performs. It will write pretty good code, which you then execute to generate the actual answer.
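For the tax example upthread, the generated code might look something like this. The brackets below are invented for illustration; the pattern is what matters: a print before every arithmetic step, so each intermediate value is visible and checkable before you trust the answer.

```python
# Hypothetical example of LLM-generated math code with a print before each step.
# These tax brackets are made up for illustration, not real tax law.
BRACKETS = [(0, 10_000, 0.10), (10_000, 40_000, 0.20), (40_000, None, 0.30)]

def tax_owed(income):
    total = 0.0
    for low, high, rate in BRACKETS:
        top = income if high is None else min(income, high)
        if top > low:
            chunk = (top - low) * rate
            print(f"bracket {low}-{high}: ({top} - {low}) * {rate} = {chunk}")
            total += chunk
    print(f"total tax = {total}")
    return total

effective_rate = tax_owed(50_000) / 50_000
print(f"effective rate = {effective_rate:.2%}")  # 20.00%
```

Unlike the chat answer, running this gives you one consistent number, and the prints make any wrong step obvious.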
Simpler example: paste in a paragraph of text, ask it to count the number of words. The answer will be incorrect most of the time.
Instead, ask it to output each word in the text in a numbered list and then output the word count. It will be correct almost always.
My anecdotal learning from this:
LLMs are pretty human-like in their mental abilities. I wouldn't be able to simply look at some text and give you an accurate word count. I would point my finger / cursor to every word and count up.
The solutions above are basically giving LLMs some additional techniques or tools, very similar to how a human may use a calculator, or count words.
In the products we've built, there is an AI feature that generates aggregations of spreadsheet data. We have a dual unittest & aggregator loop to generate correct values.
The first step is to generate some unittests. And in order to generate correct numerical data for unittests, we ask it to write some code with math expressions first. We interpret the expressions, and paste it back into the unittest generator - which then writes the unittests with the correct inputs / outputs.
The aggregation generator then generates code until the generated unittests pass completely. Then we have the code for the aggregator function that we can run against the spreadsheet.
Takes a couple of minutes, but pretty bulletproof and also generalizable to other complex math calculations.
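The loop described above can be sketched roughly like this. The `llm_generate_*` functions are stand-ins for real model calls (names are mine, not an actual API); here they return canned code so the control flow is runnable end to end.

```python
# Sketch of the dual unittest & aggregator loop, with stubbed "LLM" calls.

def llm_generate_unittest(spec):
    # Step 1: the model writes math expressions first; we evaluate them and
    # feed the concrete numbers back so the generated test has correct values.
    expected = sum([10, 20, 30])  # evaluated math expression: 10 + 20 + 30
    return f"assert aggregate([10, 20, 30]) == {expected}"

def llm_generate_aggregator(spec, feedback=None):
    # Step 2: the model writes the aggregator, revising on test feedback.
    return "def aggregate(col):\n    return sum(col)"

def build_aggregator(spec, max_rounds=5):
    test_code = llm_generate_unittest(spec)
    feedback = None
    for _ in range(max_rounds):
        code = llm_generate_aggregator(spec, feedback)
        env = {}
        exec(code, env)           # define aggregate()
        try:
            exec(test_code, env)  # run the generated unittest against it
            return env["aggregate"]
        except AssertionError as e:
            feedback = str(e)     # regenerate until the tests pass
    raise RuntimeError("aggregator never passed its tests")

agg = build_aggregator("sum the Amount column")
print(agg([1, 2, 3]))  # 6
```

The real version plugs model calls into the two generator functions; the structure of the retry loop is the same.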
> Perhaps less than before, but still making very fundamental errors.
Yes.
Suppose someone developed a way to get a reliable confidence metric out of an LLM. Given that, much more useful systems can be built.
Only high-confidence outputs can be used to initiate action. For low-confidence outputs, chain-of-reasoning tactics can be tried: ask a simpler question, ask the LLM to divide the question into sub-questions, or ask the LLM what information it needs to answer the question and try to get that info from a search engine.
Most of the strategies humans and organizations use when they don't know something will work for LLMs.
The goal is to get an all high confidence chain of reasoning.
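A minimal sketch of that routing, assuming some `confidence()` score exists (which, per the rest of the thread, is exactly the unsolved part; here it's a stub):

```python
# Confidence-gated routing: act on high-confidence answers, fall back to
# decomposition strategies otherwise. The confidence() function is a stand-in;
# a real one might average token log-probs or use a separate verifier model.

CONF_THRESHOLD = 0.9

def confidence(answer):
    return answer["score"]  # stub

def handle(answer):
    if confidence(answer) >= CONF_THRESHOLD:
        return ("act", answer["text"])
    # Low confidence: route to chain-of-reasoning fallbacks instead of acting.
    return ("decompose", f"split into sub-questions: {answer['text']}")

print(handle({"text": "pay the invoice", "score": 0.95}))
print(handle({"text": "diagnose outage", "score": 0.40}))
```

Everything interesting hides inside `confidence()`; the gate itself is trivial once that metric is reliable.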
If only they knew when they didn't know something.
There's research on this.[4] No really good results yet, but some progress. Biggest unsolved problem in computing today.
LLMs don't do math in that sense. They build a string of tokens out of a billion pre-weighted ones that gets a favorable probability distribution when taking your prompt into account. Change your prompt, get a different printout. There is no semantic understanding (in the sense of whether what is printed makes sense), and an LLM therefore cannot plausibility-check its response. It will just print gibberish if that gets the best probability distribution of tokens. I'm sure that's something that will be addressed over time, but we are not there yet.
I'm not keen on marketing words like "superintelligence" but boiling it down that's what in my mind the OP said. These systems are limited in ways that we do not yet fully appreciate. They are not silver bullets for all or maybe even many problems. We need to figure out where they can be deployed for greater benefit.
> Perhaps less than before, but still making very fundamental errors. Anything involving number I'm automatically suspicious
Totally. They are large language models, not math models.
I think the problem is that 'some people' overhype them as universal tools to solve any problem, and to answer any question. But really, LLMs excel in generating pretty regular text.
i tell chatgpt i pay 5 dollars for every excellent response, and i make it keep track of how much i spend per chat session (in addition to my normal course of work)
it does 2 things: 1, tells me how deep i am in the conversation, and 2, when the computation falls apart, i can assume other things in its response will be trash as well. and sometimes, number 3, how well the software is working. ranges from $20 to $35 on average, but a couple days they go deep to $45 (4, 7, 9 responses in a chat session)
today i learned i can knock the computation loose in a session somewhere around the 4th reply or $20 by injecting a random number into my coursework; it latched onto that number instead of computing the $
We're seeing much less of "it's making mistakes" these days.
Is this because it's actually making fewer mistakes, or is it just because most people have used it enough now to know not to bother with anything complex?
Today I carved a front panel for my cyberdeck project out of a composite wood board. I hand-drafted everything, and planned out the wiring (though I won't be onto the soldering phase for a while now). It felt good. I don't think having a 3d printer + AI designing my cyberdeck would feel the same.
> In my mind, already with GPT-4, we're not generating ideas fast enough on how best to leverage it.
It's a token prediction machine. We've already generated most of the ideas for it, and hardly any of them work, for the reasons below.
> Getting AI to do work involves getting AI to understand what needs to be done from highly bandwidth constrained humans using mouse / keyboard / voice to communicate.
No. To get AI to do work you need to make an actual AI, and not a token prediction machine which, however wonderful:
- does not understand what it is it's generating, and approaches generating code the same way it approaches generating haikus
- hallucinates and generates invalid data
> Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
Indeed. Instead of asking why, you're wildly fantasizing about running out of ideas and pretending you can make this work through other means of communication.
The model is doing the exact same thing when it generates "correct" output as it does when it generates "incorrect" output.
"Hallucination" is a misleading term, cooked up by people who either don't understand what's going on or who want to make it sound like the fundamental problems (models aren't intelligent, can't reason, and attach zero meaning to their input or output) can be solved with enough duct tape.
> Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
I find this funny because for what I use ChatGPT for - asking programming questions that would otherwise go to Google/StackOverflow - I have a much better time writing queries for ChatGPT than Google, and getting useful results back.
Google will so often return StackOverflow results that are for subtly very different questions, or I'll have to squint hard to figure out how to apply that answer to my problem. When using ChatGPT, I rarely have to think about how other people asked the question.
We have ideas on how to leverage it. But we keep them to ourselves for our products and our companies. AI by itself isn’t a breakthrough product the same way that the iPhone or the web was. It’s a utility for others to enhance their products or their operations. Which is the main reason why so many people believe we’re in an AI bubble. We just don’t see the killer feature that justifies all that spending.
Your use of the word superintelligence is jarring to me. That's not yet a thing and not yet visible on the horizon. That aside, the point I like that you seem to be making is along the lines of: we overestimate the short term impact of new tech, but underestimate the long term impact. There is a lot to be done and a lot of refinement to come.
Yes, this is in my mind where people will find the fabled moat they search for too.
SOTA models are impressive, as is the idea of building AGIs that do everything for us, but in the meantime there are a lot of practical applications of the open source and smaller models that are being missed out on in my opinion.
I also think business is going to struggle to adapt and existing business is at a disadvantage for deploying AI tools, after all, who wants to replace themselves and lose their salary? Its a personal incentive not to leverage AI at the corporate level.
>Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
the problem is really, can it learn "I need to turn this over to a human because it is such an edge case that there will not be an automated solution."
"In my mind, already with GPT-4, we're not generating ideas fast enough on how best to leverage it."
This is the main bottleneck, in my mind. A lot of people are missing from the conversation because they don't understand AI fully. I keep getting glimpses of ideas and possibilities, and chatting through a browser ain't one of them. Once we have more young people trained on this and comfortable with the tech and understanding it, and existing professionals have light bulbs go off in their heads as they try to integrate local LLMs, then real changes are going to hit hard and fast. This is just a lot to digest right now, and the tech is truly exponential, which makes it difficult to ideate. We are still absorbing the productivity boost from chatting.
I tried explaining how this stuff works to product owners and architects and that we can integrate local LLMs into existing products. Everyone shook their head and agreed. When I posted a demo in chat a few weeks later you would have thought the CEO called them on their personal phone and told them to get on this shit. My boss spent the next two weeks day and night working up a demo and presentation for his bosses. It went from zero to 100kph instantly.
Just the fact that I can have something proficient in language trivially accessible to me is really useful. I'm working on something that uses LLMs (language translation), but besides that I think it's brilliant that I can just ask an LLM to summarise my prompt in a way that gets the point across in far fewer tokens. When I forget a word, I can give it a vague description and it'll find it. I'm terrible at writing emails, and I can just ask it to point out all the little formalisms I need to add to make it "proper".
I can benchmark the quality of one LLM's translation by asking another to critique it. It's not infallible, but the ability to chat with a multilingual agent is brilliant.
It's a new tool in the toolbox, one that we haven't had in our seventy years of working on computers, and we have seventy years of catchup to do working out where we can apply them.
It's also just such a radical departure from what computers are "meant" to be good at. They're bad at mathematics, forgetful, imprecise, and yet they're incredible at poetry and soft tasks.
Oh - and they are genuinely useful for studying, too. My A Level Physics contained a lot of multiple choice questions, which were specifically designed to catch people out on incorrect intuitions and had no mark scheme beyond which answer was correct. I could just give gpt-4o a photo of the practice paper and it'd tell me not just the correct answer (which I already knew), but why it was correct, and precisely where my mental model was incorrect.
Sure, I could've asked my teacher, and sometimes I did. But she's busy with twenty other students. If everyone asked for help with every little problem she'd be unable to do anything else. But LLMs have infinite patience, and no guilt for asking stupid questions!
I will be glad to see the day when LLMs will be able to play Minecraft, so I won't have to. Then I can just relax and watch someone else do everything for me without lifting a single finger.
Along the same lines, there is also a phrase about technology: it "is everything that doesn’t work yet," per Danny Hillis. "Electric motors were once technology – they were new and did not work well. As they evolved, they seem to disappear, even though they proliferated and were embedded by the scores into our homes and offices. They work perfectly, silently, unminded, so they no longer register as “technology.”" https://kk.org/thetechnium/everything-that/
On an amusing note, I've read something similar: Everything that works stops being called philosophy. Science and math being the two familiar examples.
Just in case anyone's curious, this is from Bertrand Russell's "A History of Western Philosophy".
> As soon as definite knowledge concerning any subject becomes possible, this subject ceases to be called philosophy, and becomes a separate science.
I'm not actually sure I agree with it, especially in light of less provable schools of science like string theory or some branches of economics, but it's a great idea.
There is a reason for that. People who inquired into the actual functioning of the world used to be called philosophers. That's why so many foundations of mathematics actually come from philosophers. The split happened around the 17th century. Newton still called his monumental work "Natural Philosophy", not "Physics".
This is also true for consciousness or sentience. No matter how surprising the abilities of non-human beings (and computer agents) become, it remains something mysterious that only humans do.
I won't say that things like stoicism or humanism never worked. But they never got to the level of strict logical or experimental verifiability. Physics may be hard science, but the very notion of hard science, hypotheses, demand to replicate, demand to be able to falsify, etc, are all philosophy.
Exactly what I wrote recently: "The "AI effect" is behind some of the current confusion. As John McCarthy, the AI pioneer who coined the term "artificial intelligence," once said: "As soon as it works, no one calls it AI anymore." This is why we often hear that AI is "far from existing." This led to the formulation of Tesler's Theorem: "AI is whatever hasn't been done yet.""
https://www.lycee.ai/blog/there-are-indeed-artificial-intell...
> As soon as it works, no one calls it AI anymore.
So, what are good examples of some things that we used to call AI, which we don't call AI anymore because they work? All the examples that come to my mind (recommendation engines, etc.) do not have any real societal benefits.
Lisp's inventor, John McCarthy, was an AI researcher. (The US government started funding AI research in the 1950s, expecting progress to be much faster than it actually was.)
The original SQL was essentially Prolog restricted to relational algebra and tuple relational calculus. SQL as is happened when a lot of cruft was added to the mathematical core.
This is a pretty common perspective that was introduced to me as “shifting the goalposts” in school. I have always found it a disingenuous argument because it’s applied so narrowly.
Humans are intelligent + humans play go => playing go is intelligent
Humans are intelligent + humans do algebra => doing algebra is intelligent
Meanwhile, humans in general are pretty terrible at exact, instantaneous arithmetic. But we aren’t claiming that computers are intelligent because they’re great at it.
Building a machine that does a narrowly defined task better than a human is an achievement, but it’s not intelligence.
Although, in the case of LLMs, in context learning is the closest thing I’ve seen to breaking free from the single-purpose nature of traditional ML/AI systems. It’s been interesting to watch for the past couple years because I still don’t think they’re “intelligent”, but it’s not just because they’re one trick ponies anymore. (So maybe the goalposts really are shifting?) I can’t quite articulate yet what I think is missing from current AI to bridge the gap.
> Meanwhile, humans in general are pretty terrible at exact, instantaneous arithmetic. But we aren’t claiming that computers are intelligent because they’re great at it.
"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." - Edsger Dijkstra
> breaking free from the single-purpose nature of traditional ML/AI systems
it is really breaking free? so far LLMs in action seem to have a fairly limited scope -- there are a variety of purposes to which they can be applied but it's all essentially the same underlying task
People innately believe that intelligence isn't an algorithm. When a complex problem presents itself for the first time, people think "oh, this must be so complex that no algorithm can solve it, only AI," and when an algorithmic solution is found, people realise that the problem isn't that complex.
Indeed, if AI were an algorithm, imagine what it would feel like to be one: at every step of your thinking process you are dragged along by the iron hand of the algorithm, you have no agency in decision making, for every step is pre-determined already, and you're left the role of an observer. The algorithm leaves no room for intelligence.
Is that not the human experience? I have no “agency” over the next thought to pop into my head. I “feel” like I can choose where to focus attention, but that too is a predictable outcome arising from the integration of my embryology, memories, and recently reinforced behaviors. “I” am merely an observer of my own mental state.
But that is an uncomfortable idea for most people.
The other option you don't mention is "algorithms can solve it, but they do something different to what humans do". That's what happened with Go and Chess, for example.
I agree with you that people don't consider intelligence as fundamentally algorithmic. But I think the appeal of algorithmic intelligence comes from the fact that a lot of intelligent behaviours (strategic thinking, decomposing a problem into subproblems, planning) are (or at least feel) algorithmic.
it mostly depends on one's definition of an algorithm.
our brain is mostly scatter-gather with fuzzy pattern matching that loops back on itself. which is a nice loop, inputs feeding in, found patterns producing outputs and then it echoes back for some learning.
but of course most of it is noise, filtered out, most of the output is also just routine, most of the learning happens early when there's a big difference between the "echo" and the following inputs.
it's a huge self-referential state machine. of course running it feels normal, because we have an internal model of ourselves; we run it too, and if things are going as usual, it's giving the usual output. (and when the "baseline" is out of whack, we get the psychopathologies.)
Exactly. Machine learning used to be AI, and "AI-driven solutions" were peddled over a decade ago. Then that died down. Now suddenly every product has to once again be "powered by AI" (even if under the hood all you're running is a good ol' SVM).
Interestingly, a big barrier to voice recognition is the same as with AI assistants - they don't understand context, and so they have a difficult time navigating messy inputs that require assumed knowledge and contextual understanding. Which is kinda the baseline for how humans communicate things to each other with words in the first place.
BERT is already not considered an LLM, and the vector embeddings it generates are not called AI. Yet it's also the first general solution for natural-language search anyone has come up with. We call them vector databases. Again, I'd wager this is because they actually work.
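For the curious, the core of a vector-database lookup fits in a few lines. This toy uses word-count vectors instead of real learned embeddings (a system like BERT would supply dense vectors), but the ranking step, cosine similarity, is the same:

```python
# Toy vector search: embed texts, rank by cosine similarity.
# embed() here is just word counts; real systems use learned embeddings.
import math
from collections import Counter

def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "how to reset a password",
    "pasta recipe with garlic",
    "password manager setup",
]

def search(query):
    q = embed(query)
    return max(docs, key=lambda d: cosine(q, embed(d)))

print(search("garlic pasta"))
```

Swap `embed()` for a real model and add an approximate-nearest-neighbor index, and you have the essence of a vector database.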
Yeah, but honestly we all know LLMs are different than, say, some chess AI.
You can thank social media for dumbing down a human technological milestone in artificial intelligence. I bet if there was social media around when we landed on the moon you’d get a lot of self important people rolling their eyes at the whole thing too.
What about computer characters in video games? They're usually controlled by a system that everyone calls AI, but there's almost never any machine learning involved. What about fitting a line to data points, i.e. linear regression? Definitely ML, but most people don't call it AI.
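And for reference, that line fit is just closed-form least squares, no training loop required, which is probably why nobody feels compelled to call it AI:

```python
# Ordinary least squares for a line y = slope * x + intercept.

def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx  # (slope, intercept)

slope, intercept = fit_line([1, 2, 3, 4], [2, 4, 6, 8])
print(slope, intercept)  # 2.0 0.0
```

Three lines of arithmetic, yet by any textbook definition it is machine learning: a model fit to data.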
I think we are in the middle of a steep S-curve of technology innovation. It is far from plateauing and there are still a bunch of major innovations that are likely to shift things even further. Interesting time and these companies are riding a wild wave. It is likely some will actually win big, but most will die - similar to previous technology revolutions.
The ones that win will win not just on technology, but on talent retention, business relationships/partnerships, deep funding, marketing, etc. The whole package really. Losing is easy, miss out on one of these for a short period of time and you've easily lost.
There is no major moat, except great execution across all dimensions.
"There is, however, one enormous difference that I didn’t think about: You can’t build a cloud vendor overnight. Azure doesn’t have to worry about a few executives leaving and building a worldwide network of data centers in 18 months."
This isn't true at all. There are like 8 of these companies stood up in the last three or four years, fueled by massive investment of sovereign funds - mostly the Saudi, Dubai, Northern European, etc. oil-derived funds - all spending billions of dollars doing exactly that and getting something done.
The real problem is the ROI on AI spending is.. pretty much zero. The commonly asserted use cases are the following:
Chatbots
Developer tools
RAG/search
Not a one of these is going to generate $10 of additional revenue per dollar spent, nor likely even $2. Optimizing your customer service representatives from 8 conversations at once to an average of 12 or 16 is going to save you a whopping $2 per hour per CSR. It just isn't huge money. And RAG has many, many issues with document permissions that make the current approaches bad for enterprises - where the money is - who as a group haven't spent much of anything to even make basic search work.
"The real problem is the ROI on AI spending is.. pretty much zero. The commonly asserted use cases are the following:
Chatbots Developer tools RAG/search"
I agree with you that ROI on _most_ AI spending is indeed poor, but AI is more than LLM's. Alas, what used to be called AI before the onset of the LLM era is not deemed sexy today, even though it can still make very good ROI when it is the appropriate tool for solving a problem.
While it might be possible for a deep-pocketed organization to spin up a cloud provider overnight, it doesn't mean that people will use it. In general, the switching cost of migrating compute infrastructure from one service to another is much higher than the switching cost of changing the LLM used for inference.
Amazon doesn't need to worry about suddenly losing its entire customer base to Alibaba, Yandex, or Oracle.
> The real problem is the ROI on AI spending is.. pretty much zero.
Companies in user acquisition/growth mode tend to have low internal ROI, but remember both Facebook and Google had the same issue -- then they introduced ads and all was well with their finances. Similar things will happen here.
> And RAG has many, many issues with document permissions
Why can't these providers index all documents and, when answers are generated, self-censor if the reply references documents that the end user doesn't have permission to access? In fact, I'm pretty sure that's how existing RAGaaS providers are handling document/file permissions.
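One common variant is to filter at retrieval time, so restricted chunks never reach the prompt at all (which avoids leaking their content through the model's wording). A toy sketch, with an invented ACL scheme for illustration:

```python
# Permission-aware retrieval: drop any chunk the requesting user can't read
# BEFORE it reaches the LLM prompt. ACLs here are made-up group sets.

DOCS = [
    {"id": "handbook",  "acl": {"everyone"}, "text": "PTO policy: 20 days."},
    {"id": "comp-plan", "acl": {"hr"},       "text": "Exec bonus targets."},
]

def retrieve(query, user_groups):
    hits = [d for d in DOCS if query.lower() in d["text"].lower()]
    return [d for d in hits if d["acl"] & user_groups]  # ACL check

print(retrieve("pto", {"everyone"}))    # handbook chunk only
print(retrieve("bonus", {"everyone"}))  # [] -- filtered out
```

The hard enterprise problems are upstream of this loop: keeping the index's ACL metadata in sync with the source systems as permissions change.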
> I think we are in the middle of a steep S-curve of technology innovation
We are? What innovation?
What do we need innovation for? What present societal problems can tech innovation possibly address? Surely none of the big ones, right? So then is it fit to call technological change - 'innovation'?
I'd agree that LLMs improve upon having to read Wikipedia for topics I'm interested in but would investing billions in Wikipedia and organizing human knowledge have produced a better outcome than relying on a magic LLM? Almost certainly, in my mind.
You see, people are pouring billions into LLMs and not Wikipedia not because it is a better product - but because they foresee a possibility of an abusive monopoly and that really excites them.
That's not innovation - that's more of the same anti-social behaviour that makes any meaningful innovation extremely difficult.
I'm not sure the Wikipedia example is a strong one, as that site has its own serious problems with "abusive monopolies" in its moderator cliques and biases (as with any social platform).
At least with the current big AI players there is the potential for differentiation through competition.
Unless there is some similar competitive initiative among the Wikipedias of the world, single-supplier dominance makes that a difficult way forward.
Nonetheless, Microsoft is firing up a nuclear reactor to power a new data center. My money is in the energy sector right now. Obvious boom coming with solar, nuclear and AI.
The car wasn't a horse that was better, but cars have not changed drastically since they went mainstream.
They've gotten better, more efficient, loaded with tech, but are still roughly 4 seats, 4 doors, 4 wheels, driven by petroleum.
I know that this is a massive oversimplification, but I think we have already seen the "shape" of LLM/gen-AI products, and it's all incremental improvements from here on out, with more specialization.
We are going to have SUVs, sports cars, and single seater cars, not flying cars. AI will be made more fit for purpose for more people to use, but isn't going to replace people outright in their jobs.
Feels like someone might have said this in 1981 about personal computers.
"We've pretty much seen their shape. The IBM PC isn't fundamentally very different from the Apple II. Probably it's just all incremental improvements from here on out."
The big thing missing from both the metaphor in the OP's link and yours is that I just can't fathom any of these companies being able to build a paying subscriber base that can actually cover the outrageous costs of this tech. It feels like a pipe dream.
Putting aside that I fundamentally don't think AGI is in the tech tree of LLM, if you will, that there's no route from the latter to the former: even if there is, even if it takes, I dunno, ten years: I just don't think ChatGPT is a compelling enough product to fund about $70 billion in research costs. And sure, they aren't having to yet thanks to generous input from various commercial and private interests but like... if this is going to be a stable product at some point, analogous to something like AWS, doesn't it have to... actually make some money?
Like sure, I use ChatGPT now. I use the free version on their website and I have some fun with AI Dungeon and occasionally use generative fill in Photoshop. I paid for AI Dungeon (for a while, until I realized their free models actually work better for how I like to play) but am now on the free version. I don't pay for ChatGPT's advanced models, because nothing I've seen in the trial makes it more compelling an offering than the free version. Adobe Firefly came to me free as an addon to my Creative Cloud subscription, but if Adobe increased the price, I'm not going to pay for it. I use it because they effectively gave it to me for free with my existing purchase. And I've played with Copilot a bit too, but honestly found it more annoying than useful and I'm certainly not paying for that either.
And I realize I am not everyone and obviously there are people out there paying for it (I know a few in fact!) but is there enough of those people ready to swipe cards for... fancy autocomplete? Text generation? Like... this stuff is neat. And that's about where I put it for myself: "it's neat." OpenAI supposedly has 3.9 million subscribers right now, and if those people had to foot that 7 billion annual spend to continue development, that's about $150 a month. This product has to get a LOT, LOT better before I personally am ready to drop a tenth of that, let alone that much.
And I realize this is all back-of-napkin math here but still: the expenses of these AI companies seem so completely out of step with anything approaching an actual paying user base, so hilariously outstripping even the investment they're getting from other established tech companies, that it makes me wonder how this is ever, ever going to make so much as a dime for all these investors.
In contrast, I never had a similar question about cars, or AWS. The pitch of AWS makes perfect sense: you get a server to use on the internet for whatever purpose, and you don't have to build the thing, you don't need to handle HVAC or space, you don't need a last-mile internet connection to maintain, and if you need more compute or storage or whatever, you move a slider instead of having to pop a case open and install a new hard drive. That's absolutely a win and people will pay for it. Who's paying for AI and why?
The car was very much a horse that was better, though. It replaced the horse (or other draught animals) and that's basically it. I'm not even sure it has brought fundamentally new and different use cases.
Kind of feels like the ride-sharing early days. Lots of capital being plowed into a handful of companies to grab market share. Economics don't really make sense in the short term because the vast majority of cash flows are still far in the future (Zero to One).
In the end the best funded company, Uber, is now the most valuable (~$150B). Lyft, the second best funded, is 30x smaller. Are there any other serious ride sharing companies left? None I know of, at least in the US (international scene could be different).
I don't know how the AI rush will work out, but I'd bet there will be some winners and that the best capitalized will have a strong advantage. Big difference this time is that established tech giants are in the race, so I don't know if there will be a startup or Google at the top of the heap.
I also think that there could be more opportunities for differentiation in this market. Internet models will only get you so far and proprietary data will become more important potentially leading to knowledge/capability specialization by provider. We already see some differentiation based on coding, math, creativity, context length, tool use, etc.
Uber is not really a tech company though - its moat is not technology but market domination. If it, along with all of its competitors were to disappear tomorrow, the power vacuum would be filled in very short order, as the core technology is not very hard to master.
It's a fundamentally different beast from AI companies.
How is it not a tech company? They're literally trying to approximate TSP in the way that makes them money. In addition, they're constantly optimizing for surge pricing to maximize ROI. What kind of problems do you think those are?
Is Uber profitable already or are they waiting for another order of magnitude increase in scale before they bother with that?
Amazon is the poster child of that mentality. It poured more than it earned into growth for more than 20 years, got a monopoly on retail, and still isn't the most profitable retail company around.
Uber the company was profitable last year, for the first time[1].
But I am doubtful that the larger enterprise that is Uber (including all the drivers and their expenses and vehicle depreciation, etc) was profitable. I haven't seen that analysis.
This article, and all the articles like it, are missing most of the puzzle.
Models don’t just compete on capability. Over the last year we’ve seen models and vendors differentiate along a number of lines in addition to capability:
- Safety
- UX
- Multi-modality
- Reliability
- Embeddability
And much more. Customers care about capability, but that’s like saying car owners care about horsepower — it’s a part of the choice but not the only piece.
One somewhat obsessive customer here: I pay for and use Claude, ChatGPT, Gemini, Perplexity, and one or two others.
The UX differences among the models are indeed becoming clearer and more important. Claude’s Artifacts and Projects are really handy as is ChatGPT’s Advanced Voice mode. Perplexity is great when I need a summary of recent events. Google isn’t charging for it yet, but NotebookLM is very useful in its own way as well.
When I test the underlying models directly, it’s hard for me to be sure which is better for my purposes. But those add-on features make a clear differentiation between the providers, and I can easily see consumers choosing one or another based on them.
I haven’t been following recent developments in the companies’ APIs, but I imagine that they are trying to differentiate themselves there as well.
To me, the vast majority of "consumers" as in B2C only care about price, specifically free. Pro and enterprise customers may be more focused on the capabilities you listed, but the B2C crowd is vastly in the free tier only space when it comes to GenAI.
This is like when VCs were funding all kinds of ride share, bike share, food delivery, cannabis delivery, and burning money so everyone gets subsidized stuff while the market figures out wtf is going on.
Yep, you will probably lose. The VCs aren't out there to advance the technology. They are there to lay down bets on who's going to be the winner. "Winner" has little to do with quality, and rides much more on being the one that just happens to resonate with people.
The ones without money will usually lose because they get less opportunity to get in front of eyeballs. Occasionally they manage it anyway, because despite the myth that the VCs love to tell, they aren't really great at finding and promulgating the best tech.
> when VCs were funding all kinds of ride share, bike share, food delivery, cannabis delivery, and burning money so everyone gets subsidized stuff while the market figures out wtf is going on
I’m reminded of slime molds solving mazes [1]. In essence, VC allows entrepreneurs to explore the solution space aggressively. Once solutions are found, resources are trimmed.
I'm already keeping an eye on what NVidia gets into next... because that will inevitably be the "Next big thing". This is the third(ish) round of this pattern that I can recall, I'm probably wrong about the exact count, but NVidia is really good at figuring out how to be powering the "Next big thing". So alternatively... I should probably invest in the utilities powering whatever Datacenters are using the powerhungry monsters at the center of it all.
One thing I'm not clear on is how much of this is cause and how much effect: that is, does NVidia cheerleading for something make it more popular with the tech press and then everyone else too? There are definitely large parts of the tech press that serve more as stenographers than as skeptical reporters, and so I'm not sure how much is NVidia picking the right next big thing and how much is NVidia announcing the next big thing to the rest of us?
That's exactly the short term thinking they're hoping they can use to distract.
Tech companies purchased television away from legacy media companies and added (1) unskippable ads, (2) surveillance, (3) censorship and revocation of media you don't physically own, and now they're testing (4) ads while shows are paused.
Where I live the ridesharing/delivering startups didn't bring goodies, they just made everything worse.
They destroyed the Taxi industry, I used to be able to just walk out to the taxi rank and get in the first taxi, but not anymore. Now I have to organize it on an app or with a phone call to a robot, then wait for the car to arrive, and finally I have to find the car among all the others that other people called.
Food delivery used to be done by the restaurant's own delivery staff; it was fast, reliable, and often free if ordering for 2+ people. Now it always costs extra, and there are even more fees if I want the food while it's still hot. Zero care is taken with the delivery: food/drinks are not kept upright and can be a total mess on arrival. Sometimes it's escaped the container and is just in the plastic bag. I have ended up preferring to pick up food myself over getting it delivered, even when I have a migraine; it's just gone to shit.
I assume you are talking about airports. Guess what, they still exist in many places. And on the other hand, in the US, outside of a few big cities, the "normal" taxi experience is that you call a number and maybe a taxi shows up in half an hour. With Uber, that becomes 10 minutes or less, with live map updates. Give me that and I'll be happy to forget about Uber.
For where I live (Asia), I disagree with both of these examples.
Getting a taxi was awful before ride-sharing apps. You'd have to walk to a taxi stop, or wait on the side of the road and hope you could hail one. Once the ride-sharing apps came in, suddenly getting a ride became a lot simpler. Our taxi companies are still alive, though they have their own apps now -- something that wouldn't have happened without competition -- and they also work together with the ride-hailing companies as a provider. You could still hail taxis or get them from stops too, though that isn't recommended given that they might try to run the meter by taking a longer route.
For food delivery, before the apps, most places didn't deliver food. Nowadays, more places deliver. Even if a place already had their own delivery drivers, they didn't get rid of them. We get a choice, to use the app or to use the restaurant's own delivery. Usually the app is better for smaller meals since it has a lower minimum order amount, but the restaurant provides faster delivery for bigger orders.
Taxis were the greatest example of regulatory capture. The post-event/airport Uber pickup situation is stupid and has obvious fixes, but, again, that's the taxicab regulatory capture at work: Uber has to thread a needle in order not to be a taxi, for them to legally operate. If we could start with a clean slate and make a working system, that would be great, but we can't, because of the taxicab regulatory commission.
I can now summon a cab from the comfort of the phone I'm holding, and know that they'll accept my credit card. I know the price before I get in and I know the route they should take. I'm not going to get taken for an unnecessary scenic tourist surcharge detour.
People don't like feeling they got cheated, and pre-Uber, taxis did that all the time.
Add the abuse of gig workers, expansion of the toxic tipping culture, increase in job count but reduction in pay, concentration of wealth in fewer hands.
These rideshare and delivery companies are disgusting and terrible.
It seems very difficult to build a moat around a product when the product is supposed to be a generally capable tool and the input is English text. The more truly generally intelligent these models get the more interchangeable they become. It's too easy to swap one out for another.
Humans are the ultimate generally intelligent agents available on this planet. Even though most of them (us) are replaceable for mundane tasks, quite some are unique enough so that people seek their particular services and no one else's. And this is among the pool of about eight billion such agents.
> Even though most of them (us) are replaceable for mundane tasks, quite some are unique enough so that people seek their particular services and no one else's.
Very few people manage that - indeed I can't think of anyone. Even movie stars get replaced with other movie stars if they try to charge too much. Certainly everyone in the tech industry (including the CEOs, the VCs, the investors etc.) has a viable substitute.
The moat is/will be the virtuous cycle of feeding user usage back into the model like it is for Google. That historically has been a powerful tool and it's something thats nearly impossible to get as a newcomer to the marketplace.
I see 2 paths:
- Consumers - the Google way: search and advertise to consumers
- Businesses - the AWS way: attract businesses to use your API and lock them in
The first is fickle. Will OpenAI become the door to the Internet? You'll need people to stop using Google Search and rely on ChatGPT for that to happen. Short term you can charge a subscription, but long term it will most likely become a commodity monetized with advertising.
The second is tangible. My company is plugged directly to the OpenAI API. We build on it. Still very early and not so robust. But getting better and cheaper and faster over time. Active development. No reason to switch to something else as long as OpenAI leads the pack.
20 years ago people asked that exact question. E-Commerce emerged. People knew the physical process of buying things would move online. Took some time. Sure, more things emerged but monetizing the Internet still remains about selling you something.
My guess would be using "AI" to increase/enhance sales with your existing processes. Pay for this product, get 20% increased sales, ad revenue, yada yada.
Sure it does. Ask any common mortal about AI and they'll mention ChatGPT - not Claude, Gemini or whatever else. They might not even know OpenAI. But they do know ChatGPT.
Has it become a verb yet? Waiting for people to replace "I googled how to..." with "I chatgpted how to...".
There would need to be significant capabilities that openai doesn't have or wouldn't be built on a short-ish timeline to have the enterprise switch. There's tons of bureaucratic work going on behind the scenes to approve a new vendor.
We talk about scaling laws, superintelligence, AGI etc. But there is another threshold - the ability for humans to leverage super-intelligence. It's just incredibly hard to innovate on products that fully leverage superintelligence.
At some point, AI needs to connect with the real world to deliver economically valuable output. The ratelimiting step is there. Not smarter models.
In my mind, already with GPT-4, we're not generating ideas fast enough on how best to leverage it.
Getting AI to do work involves getting AI to understand what needs to be done from highly bandwidth constrained humans using mouse / keyboard / voice to communicate.
Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
We're seeing much less of "it's making mistakes" these days.
If we have open-source models that match up to GPT-4 on AWS / Azure etc, not much point to go with players like OpenAI / Anthropic who may have even smarter models. We can't even use the dumber models fully.
Your paycheck depends on people believing the hype. Therefore, anything you say about "superintelligence" (LOL) is pretty suspect.
> Getting AI to do work involves getting AI to understand what needs to be done from highly bandwidth constrained humans using mouse / keyboard / voice to communicate.
So, what, you're going to build a model to instruct the model? And how do we instruct that model?
This is such a transparent scam, I'm embarrassed on behalf of our species.
Some examples to illustrate the point: supply chain risks and an ever-increasing number of dependencies (look at your average React project, though this applies to most stacks), overly abstracted frameworks (how many CPU cycles Spring Boot and others burn and how many hoops you have to jump through to get things done), patterns that mess up the DB's ability to optimize queries sometimes (EAV, OTLT, trying to create polymorphic foreign keys), inefficient data fetching (sometimes ORMs, sometimes N+1), bad security practices (committed secrets, anyone? bad usage of OAuth2 or OIDC?), overly complex tooling, especially the likes of Kubernetes when you have a DevOps team of one part-time dev, overly complex application architectures where you have more services than developers (not even teams). That's before you even get into the utter mess of long-term projects that have been touched by dozens of developers over the years and the whole sector sometimes feeling like the wild west, as opposed to "real engineering".
That's why articles like this ring true: http://www.stilldrinking.org/programming-sucks
However, the difference here is that I wouldn't overwhelm anyone who might give me money with rants about this stuff and would navigate around those issues and risks as best I can, to ship something useful at the end of the day. Same with having constructive discussions about any of those aspects in a circle of technical individuals, on how to make things better.
Calling the whole concept a "scam" doesn't do anyone any good, when I already derive value from the LLMs, as do many others. Look at https://www.cursor.com/ for example and consider where we might be in 10-20 years. Not AGI, but maybe good auto-complete, codegen and reasoning about entire codebases, even if they're hundreds of thousands of lines long. Tooling that would make anyone using it more productive than those who don't. Unless the funding dries up and the status quo is restored.
Ehm yes. That's because it actually doesn't work as well as the hype suggests, not because it's too "high bandwidth".
Which tells me the problem may be fundamental, not a technical one. It's not just a matter of needing "more intelligence". I don't question the intelligence or skill of the people on the outsourced team I was working with. The problem was simple communication. They didn't really know or understand our business and its goals well enough to anticipate all sorts of little things, and the lack of constant social interaction of the type you typically get when everybody's a direct coworker meant we couldn't build that mind-meld over time, either. So we had to pick up the slack with massive over-specification.
Remember the old bit about the media -- the stories are always 100% infallible, except strangely in YOUR personal field of expertise.
I suspect it's something similar with AI products.
People test them with toy problems -- "Hey ChatGPT, what's the square root of 36", and then with something close to their core knowledge.
It might learn to solve a lot of the toy problems, but plenty of us are still seeing a lot of hallucinations in the "core knowledge" questions. But we see people then taking that product-- that they know isn't good in at least one vertical-- and trying to apply it to other contexts, where they may be less qualified to validate if the answer is right.
For me, the number of times where it's led me down a hallucinated, impossible, or thoroughly invalid rabbit hole have been relatively minimal when compared against the number of times when it has significantly helped. I really do think the key is in how you use them, for what types of problems/domains, and having an approach that maximizes your ability to catch issues early.
Gell-Mann amnesia:
https://en.wikipedia.org/wiki/Michael_Crichton#GellMannAmnes...
Perhaps less than before, but still making very fundamental errors. With anything involving numbers, I'm automatically suspicious. Pretty frequently I'd get different answers for what is, to a human, the same question.
e.g. ChatGPT will give an effective tax rate of n for some income amount. Then when asked to break down the calculation will come up with an effective tax rate of m instead. When asked how much tax is owed on that income will come up with a different number such that the effective rate is not n or m.
Until this is addressed to a sufficient degree, it seems difficult to apply to anything that involves numbers and can't be quickly verified by a human.
But try this approach instead: have it generate Python code, with print statements before every bit of math it performs. It will write pretty good code, which you then execute to generate the actual answer.
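For the tax example, the kind of code one might ask the model to emit could look like the following (the brackets here are made up for illustration). You run it yourself, so the arithmetic is done by the interpreter rather than by token prediction, and each printed step can be audited:

```python
# Progressive tax calculation with every step printed.
# NOTE: these brackets are hypothetical, not any real tax schedule.
BRACKETS = [(10_000, 0.10), (30_000, 0.20), (float("inf"), 0.30)]


def compute_tax(income):
    tax, lower = 0.0, 0
    for upper, rate in BRACKETS:
        taxable = max(0, min(income, upper) - lower)
        print(f"{taxable} taxed at {rate:.0%} -> {taxable * rate}")
        tax += taxable * rate
        lower = upper
    print(f"effective rate: {tax / income:.1%}")
    return tax


compute_tax(50_000)  # 11000.0, effective rate 22.0%
```

Because the brackets, steps, and totals are all visible, the inconsistency the parent describes (rate n vs. rate m) becomes immediately detectable.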
Simpler example: paste in a paragraph of text, ask it to count the number of words. The answer will be incorrect most of the time.
Instead, ask it to output each word in the text in a numbered list and then output the word count. It will be correct almost always.
My anecdotal learning from this:
LLMs are pretty human-like in their mental abilities. I wouldn't be able to simply look at some text and give you an accurate word count. I would point my finger / cursor to every word and count up.
The solutions above are basically giving LLMs some additional techniques or tools, very similar to how a human may use a calculator, or count words.
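The "tools" idea above can be sketched in a few lines: register a deterministic function and have the model request it by name instead of doing the counting itself. The dispatch here is hand-rolled for illustration; real systems would wire this through a provider's function-calling API.

```python
# Minimal tool registry: the model emits a tool name plus arguments,
# and we execute the function exactly -- the LLM equivalent of a
# human reaching for a calculator.
def count_words(text: str) -> int:
    return len(text.split())


TOOLS = {"count_words": count_words}


def run_tool(name, **kwargs):
    # In a real system, `name` and `kwargs` come from the model's
    # structured tool-call output; here we call it directly.
    return TOOLS[name](**kwargs)


print(run_tool("count_words", text="the quick brown fox"))  # 4
```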
In the products we've built, there is an AI feature that generates aggregations of spreadsheet data. We have a dual unittest & aggregator loop to generate correct values.
The first step is to generate some unittests. And in order to generate correct numerical data for unittests, we ask it to write some code with math expressions first. We interpret the expressions, and paste it back into the unittest generator - which then writes the unittests with the correct inputs / outputs.
The aggregation generator then generates code until the generated unittests pass completely. Then we have the code for the aggregator function that we can run against the spreadsheet.
Takes a couple of minutes, but pretty bulletproof and also generalizable to other complex math calculations.
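The loop described above might look roughly like this, with stand-in functions where the real model calls would go (the stub "model" deliberately produces a buggy first draft to show the retry behavior):

```python
# Hypothetical sketch of the dual unittest & aggregator loop.
# The llm_* functions are stand-ins for real model calls.
def llm_generate_tests():
    # In the real pipeline the model writes math expressions first; those
    # are evaluated and pasted back so the test data is numerically correct.
    return [((1, 2, 3), 6), ((10, -4), 6)]


def llm_generate_aggregator(attempt):
    # Stand-in model: the first draft has an off-by-one bug, the retry works.
    if attempt == 0:
        return lambda xs: sum(xs) + 1
    return lambda xs: sum(xs)


def build_aggregator(max_attempts=5):
    tests = llm_generate_tests()
    for attempt in range(max_attempts):
        agg = llm_generate_aggregator(attempt)
        if all(agg(inp) == expected for inp, expected in tests):
            return agg  # every generated unittest passes
    raise RuntimeError("no passing aggregator within the attempt budget")


agg = build_aggregator()
print(agg((1, 2, 3)))  # 6
```

The point of the structure is that correctness is checked by executing tests, not by trusting any single generation.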
Yes.
Suppose someone developed a way to get a reliable confidence metric out of an LLM. Given that, much more useful systems can be built.
Only high-confidence outputs can be used to initiate action. For low-confidence outputs, chain-of-reasoning tactics can be tried. Ask a simpler question. Ask the LLM to divide the question into sub-questions. Ask the LLM what information it needs to answer the question, and try to get that info from a search engine. Most of the strategies humans and organizations use when they don't know something will work for LLMs. The goal is to get an all-high-confidence chain of reasoning.
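Gating on such a metric would be the easy part; producing a trustworthy score is what's hard. A sketch of the dispatch, with a canned stand-in for the model call (questions and scores are purely illustrative):

```python
# Confidence-gated dispatch, assuming we already had a reliable
# confidence score -- the unsolved part described above.
def answer(question):
    # Hypothetical model call returning (text, confidence in [0, 1]).
    canned = {
        "capital of France?": ("Paris", 0.97),
        "implications of Q3 filing?": ("hard to say", 0.35),
    }
    return canned.get(question, ("unknown", 0.0))


def handle(question, threshold=0.8):
    text, conf = answer(question)
    if conf >= threshold:
        return text  # high confidence: safe to act on directly
    # Low confidence: fall back to decomposition, retrieval, or escalation.
    return "escalate: decompose the question or fetch supporting info"


print(handle("capital of France?"))  # Paris
```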
If only they knew when they didn't know something.
There's research on this.[4] No really good results yet, but some progress. Biggest unsolved problem in computing today.
[4] https://hungleai.substack.com/p/uncertainty-confidence-and-h...
I'm not keen on marketing words like "superintelligence" but boiling it down that's what in my mind the OP said. These systems are limited in ways that we do not yet fully appreciate. They are not silver bullets for all or maybe even many problems. We need to figure out where they can be deployed for greater benefit.
Totally. They are large language models, not math models.
I think the problem is that 'some people' overhype them as universal tools to solve any problem, and to answer any question. But really, LLMs excel in generating pretty regular text.
It does two things: (1) it tells me how deep I am in the conversation, and (2) when the computation falls apart, I can assume other things in its response will be trash as well. And sometimes a third: how well the software is working. It ranges from $20 to $35 on average, but a couple of days it goes as deep as $45 (4, 7, or 9 responses in a chat session).
Today I learned I can knock the computation loose in a session somewhere around the 4th reply, or $20, by injecting a random number into my coursework: it was latching onto that number instead of computing the dollar total.
Is this because it's actually making fewer mistakes, or is it just because most people have used it enough now to know not to bother with anything complex?
Once we have actual super intelligence there is no need for humans to innovate anymore. It is by definition better than us anyway.
I guess you could still have artisanal innovation
It's a token prediction machine. We've already generated most of the ideas for it, and hardly any of them work because see below
> Getting AI to do work involves getting AI to understand what needs to be done from highly bandwidth constrained humans using mouse / keyboard / voice to communicate.
No. Getting AI to work, you need to make an AI, and not a token prediction machine which, however wonderful:
- does not understand what it is it's generating, and approaches generating code the same way it approaches generating haikus
- hallucinates and generates invalid data
> Anyone using a chatbot already has felt the frustration of "it doesn't get what I want". And also "I have to explain so much that I might as well just do it myself"
Indeed. Instead of asking why, you're wildly fantasizing about running out of ideas and pretending you can make this work through other means of communication.
The model is doing the exact same thing when it generates "correct" output as it does when it generates "incorrect" output.
"Hallucination" is a misleading term, cooked up by people who either don't understand what's going on or who want to make it sound like the fundamental problems (models aren't intelligent, can't reason, and attach zero meaning to their input or output) can be solved with enough duct tape.
I find this funny because for what I use ChatGPT for - asking programming questions that would otherwise go to Google/StackOverflow - I have a much better time writing queries for ChatGPT than Google, and getting useful results back.
Google will so often return StackOverflow results that are for subtly very different questions, or I'll have to squint hard to figure out how to apply that answer to my problem. When using ChatGPT, I rarely have to think about how other people asked the question.
This is just another way of saying this technology, like Blockchain before it, is a solution in search of a problem.
SOTA models are impressive, as is the idea of building AGIs that do everything for us, but in the meantime there are a lot of practical applications of the open source and smaller models that are being missed out on in my opinion.
I also think business is going to struggle to adapt, and existing businesses are at a disadvantage for deploying AI tools; after all, who wants to replace themselves and lose their salary? It's a personal incentive not to leverage AI at the corporate level.
The problem is really: can it learn "I need to turn this over to a human because it is such an edge case that there will not be an automated solution"?
Makes sense.
This is the main bottleneck, in my mind. A lot of people are missing from the conversation because they don't understand AI fully. I keep getting glimpses of ideas and possibilities, and chatting through a browser ain't one of them. Once we have more young people trained on this and comfortable with the tech and understanding it, and existing professionals have light bulbs go off in their heads as they try to integrate local LLMs, then real changes are going to hit hard and fast. This is just a lot to digest right now, and the tech is truly exponential, which makes it difficult to ideate right now. We are still absorbing the productivity boost from chatting.
I tried explaining how this stuff works to product owners and architects and that we can integrate local LLMs into existing products. Everyone shook their head and agreed. When I posted a demo in chat a few weeks later you would have thought the CEO called them on their personal phone and told them to get on this shit. My boss spent the next two weeks day and night working up a demo and presentation for his bosses. It went from zero to 100kph instantly.
I can benchmark the quality of one LLM's translation by asking another to critique it. It's not infallible, but the ability to chat with a multilingual agent is brilliant.
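That critique loop can be sketched with stand-ins where the two model calls would go; the canned strings and 1-5 scoring rubric below are purely illustrative, not any real API.

```python
# Cross-model critique sketch: model A translates, model B judges.
# Both functions are stubs standing in for real LLM calls,
# ideally to different providers to avoid shared blind spots.
def translate(text, target_lang):
    # Stand-in for model A doing the translation.
    return {"Hola, mundo": "Hello, world"}.get(text, "?")


def critique(source, translation):
    # Stand-in for model B acting as judge: a 1-5 score plus a short note.
    score = 5 if translation == "Hello, world" else 2
    note = "faithful and natural" if score == 5 else "meaning drifted; re-check"
    return {"score": score, "note": note}


t = translate("Hola, mundo", "en")
print(critique("Hola, mundo", t)["score"])  # 5
```

It's not infallible, as the parent says, but a second model catches a surprising fraction of errors the first one makes.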
It's a new tool in the toolbox, one that we haven't had in our seventy years of working on computers, and we have seventy years of catchup to do working out where we can apply them.
It's also just such a radical departure from what computers are "meant" to be good at. They're bad at mathematics, forgetful, imprecise, and yet they're incredible at poetry and soft tasks.
Oh - and they are genuinely useful for studying, too. My A Level Physics contained a lot of multiple choice questions, which were specifically designed to catch people out on incorrect intuitions and had no mark scheme beyond which answer was correct. I could just give gpt-4o a photo of the practice paper and it'd tell me not just the correct answer (which I already knew), but why it was correct, and precisely where my mental model was incorrect.
Sure, I could've asked my teacher, and sometimes I did. But she's busy with twenty other students. If everyone asked for help with every little problem she'd be unable to do anything else. But LLMs have infinite patience, and no guilt for asking stupid questions!
Won't that be fun.
We have a new god (superintelligence) but only the special can see it because it's too intelligent for people to interact with it.
It's so advanced all the common ways humans communicate "using mouse / keyboard / voice" don't work
The reason we've never seen it help us is we need more ideas (prayers) first.
NPC brains are really hard to understand, I will say that, and they use a lot of electricity.
Logic programming? AI until SQL came out. Now it's not AI.
OCR, computer algebra systems, voice recognition, checkers, machine translation, go, natural language search.
All solved, all not AI any more yet all were AI before they got solved by AI researchers.
There's even a name for it: https://en.m.wikipedia.org/wiki/AI_effect
> As soon as definite knowledge concerning any subject becomes possible, this subject ceases to be called philosophy, and becomes a separate science.
I'm not actually sure I agree with it, especially in light of less provable schools of science like string theory or some branches of economics, but it's a great idea.
So, what are good examples of some things that we used to call AI, which we don't call AI anymore because they work? All the examples that come to my mind (recommendation engines, etc.) do not have any real societal benefits.
A hundred years ago computers were 'electronic brains'.
It goes dormant after losing its edge and then re-emerges when some research project gains traction again.
logic programming is not directly linked to SQL, and has its own AI term now: https://en.wikipedia.org/wiki/GOFAI.
Humans are intelligent + humans play go => playing go is intelligent
Humans are intelligent + humans do algebra => doing algebra is intelligent
Meanwhile, humans in general are pretty terrible at exact, instantaneous arithmetic. But we aren’t claiming that computers are intelligent because they’re great at it.
Building a machine that does a narrowly defined task better than a human is an achievement, but it’s not intelligence.
Although, in the case of LLMs, in-context learning is the closest thing I've seen to breaking free from the single-purpose nature of traditional ML/AI systems. It's been interesting to watch for the past couple of years, because I still don't think they're "intelligent", but it's not just because they're one-trick ponies anymore. (So maybe the goalposts really are shifting?) I can't quite articulate yet what I think is missing from current AI to bridge the gap.
"The question of whether a computer can think is no more interesting than the question of whether a submarine can swim." - Edsger Dijkstra
is it really breaking free? so far LLMs in action seem to have a fairly limited scope -- there are a variety of purposes to which they can be applied, but it's all essentially the same underlying task
Indeed, if AI were an algorithm, imagine what it would feel like to be one: at every step of your thinking process you are dragged by the iron hand of the algorithm, you have no agency in decision making, for every step is pre-determined already, and you're left the role of an observer. The algorithm leaves no room for intelligence.
But that is an uncomfortable idea for most people.
I agree with you that people don't consider intelligence as fundamentally algorithmic. But I think the appeal of algorithmic intelligence comes from the fact that a lot of intelligent behaviours (strategic thinking, decomposing a problem into subproblems, planning) are (or at least feel) algorithmic.
our brain is mostly scatter-gather with fuzzy pattern matching that loops back on itself. which is a nice loop, inputs feeding in, found patterns producing outputs and then it echoes back for some learning.
but of course most of it is noise, filtered out, most of the output is also just routine, most of the learning happens early when there's a big difference between the "echo" and the following inputs.
it's a huge self-referential state-machine. of course running it feels normal, because we have an internal model of ourselves, we ran it too, and if things are going as usual, it's giving the usual output. (and when the "baseline" is out of whack then even we have the psychopathologies.)
https://www.youtube.com/watch?v=-rxXoiQyQVc
But it may be a while.
[1] https://taylor.town/synthetic-intelligence
I don't doubt that there were marketing people wanting to attach an AI label to them, but that's just marketing BS.
You can thank social media for dumbing down a human technological milestone in artificial intelligence. I bet if there had been social media around when we landed on the moon, you'd get a lot of self-important people rolling their eyes at the whole thing too.
AI = machine learning
Yes, that makes a convolutional net trained on recognizing digits an AI.
The ones that win will win not just on technology, but on talent retention, business relationships/partnerships, deep funding, marketing, etc. The whole package, really. Losing is easy: miss out on one of these for a short period of time and you've easily lost.
There is no major moat, except great execution across all dimensions.
"There is, however, one enormous difference that I didn’t think about: You can’t build a cloud vendor overnight. Azure doesn’t have to worry about a few executives leaving and building a worldwide network of data centers in 18 months."
This isn't true at all. There are something like 8 of these companies stood up in the last three or four years, fueled by massive investment of sovereign funds -- mostly the Saudi, Dubai, northern European, etc. oil-derived funds -- all spending billions of dollars doing exactly that and getting something done.
The real problem is that the ROI on AI spending is... pretty much zero. The commonly asserted use cases are the following:
- Chatbots
- Developer tools
- RAG/search
Not one of these is going to generate $10 of additional revenue per dollar spent, nor likely even $2. Optimizing your customer service representatives from 8 conversations at once to an average of 12 or 16 is going to save you a whopping $2 per hour per CSR. It just isn't huge money. And RAG has many, many issues with document permissions that make the current approaches bad for enterprises -- where the money is -- who as a group haven't spent much of anything to even make basic search work.
I agree with you that the ROI on _most_ AI spending is indeed poor, but AI is more than LLMs. Alas, what used to be called AI before the onset of the LLM era is not deemed sexy today, even though it can still deliver very good ROI when it is the appropriate tool for solving a problem.
Amazon doesn't need to worry about suddenly losing its entire customer base to Alibaba, Yandex, or Oracle.
Companies in user acquisition/growth mode tend to have low internal ROI, but remember both Facebook and Google had the same issue -- then they introduced ads and all was well with their finances. Similar things will happen here.
Why can't these providers index all documents and, when generating answers, self-censor any reply that references documents the end user doesn't have permission to access? In fact, I'm pretty sure that's how existing RAGaaS providers are handling document/file permissions.
We are? What innovation?
What do we need innovation for? What present societal problems can tech innovation possibly address? Surely none of the big ones, right? So then is it fit to call technological change - 'innovation'?
I'd agree that LLMs improve upon having to read Wikipedia for topics I'm interested in but would investing billions in Wikipedia and organizing human knowledge have produced a better outcome than relying on a magic LLM? Almost certainly, in my mind.
You see, people are pouring billions into LLMs and not Wikipedia not because it is a better product - but because they foresee a possibility of an abusive monopoly and that really excites them.
That's not innovation - that's more of the same anti-social behaviour that makes any meaningful innovation extremely difficult.
At least with the current big AI players there is the potential for differentiation through competition.
Unless there is some similar initiative with the Wikipedias, the problem of single supplier dominance is a difficult one to see as the way forward.
They've gotten better, more efficient, loaded with tech, but are still roughly 4 seats, 4 doors, 4 wheels, driven by petroleum.
I know that this is a massive oversimplification, but I think we have seen the "shape" of LLMs/Gen AI/AI products already and it's all incremental improvements from here on out with more specialization.
We are going to have SUVs, sports cars, and single seater cars, not flying cars. AI will be made more fit for purpose for more people to use, but isn't going to replace people outright in their jobs.
"We've pretty much seen their shape. The IBM PC isn't fundamentally very different from the Apple II. Probably it's just all incremental improvements from here on out."
Putting aside that I fundamentally don't think AGI is in the tech tree of LLM, if you will, that there's no route from the latter to the former: even if there is, even if it takes, I dunno, ten years: I just don't think ChatGPT is a compelling enough product to fund about $70 billion in research costs. And sure, they aren't having to yet thanks to generous input from various commercial and private interests but like... if this is going to be a stable product at some point, analogous to something like AWS, doesn't it have to... actually make some money?
Like sure, I use ChatGPT now. I use the free version on their website and I have some fun with AI Dungeon and occasionally use generative fill in Photoshop. I paid for AI Dungeon (for a while, until I realized their free models actually work better for how I like to play) but am now on the free version. I don't pay for ChatGPT's advanced models, because nothing I've seen in the trial makes it more compelling an offering than the free version. Adobe Firefly came to me free as an add-on to my Creative Cloud subscription, but like, if Adobe increased the price, I'm not going to pay for it. I use it because they effectively gave it to me for free with my existing purchase. And I've played with Copilot a bit too, but honestly found it more annoying than useful, and I'm certainly not paying for that either.
And I realize I am not everyone and obviously there are people out there paying for it (I know a few in fact!) but is there enough of those people ready to swipe cards for... fancy autocomplete? Text generation? Like... this stuff is neat. And that's about where I put it for myself: "it's neat." OpenAI supposedly has 3.9 million subscribers right now, and if those people had to foot that 7 billion annual spend to continue development, that's about $150 a month. This product has to get a LOT, LOT better before I personally am ready to drop a tenth of that, let alone that much.
And I realize this is all back-of-napkin math here but still: the expenses of these AI companies seem so completely out of step with anything approaching an actual paying user base, so hilariously outstripping even the investment they're getting from other established tech companies, that it makes me wonder how this is ever, ever going to make so much as a dime for all these investors.
In contrast, I never had a similar question about cars, or AWS. The pitch of AWS makes perfect sense: you get a server to use on the internet for whatever purpose, and you don't have to build the thing, you don't need to handle HVAC or space, you don't need a last-mile internet connection to maintain, and if you need more compute or storage or whatever, you move a slider instead of having to pop a case open and install a new hard drive. That's absolutely a win and people will pay for it. Who's paying for AI and why?
The fact that human intelligence exists means that the idea of human level intelligence is not a pipe dream.
The question is whether or not the basic underlying technology of the LLM can achieve that level of intelligence.
In the end the best funded company, Uber, is now the most valuable (~$150B). Lyft, the second best funded, is 30x smaller. Are there any other serious ride sharing companies left? None I know of, at least in the US (international scene could be different).
I don't know how the AI rush will work out, but I'd bet there will be some winners and that the best capitalized will have a strong advantage. Big difference this time is that established tech giants are in the race, so I don't know if there will be a startup or Google at the top of the heap.
I also think that there could be more opportunities for differentiation in this market. Internet models will only get you so far and proprietary data will become more important potentially leading to knowledge/capability specialization by provider. We already see some differentiation based on coding, math, creativity, context length, tool use, etc.
It's a fundamentally different beast from AI companies.
Amazon is the poster child of that mentality. It poured more than it earned into growth for more than 20 years, got a monopoly on retail, and still isn't the most profitable retail company around.
But I am doubtful that the larger enterprise that is Uber (including all the drivers and their expenses and vehicle depreciation, etc) was profitable. I haven't seen that analysis.
[1] https://www.theverge.com/2024/2/8/24065999/uber-earnings-pro...
AI offers WAAAAYYYYY more money in the future.
Models don’t just compete on capability. Over the last year we’ve seen models and vendors differentiate along a number of lines in addition to capability:
- Safety
- UX
- Multi-modality
- Reliability
- Embeddability
And much more. Customers care about capability, but that’s like saying car owners care about horsepower — it’s a part of the choice but not the only piece.
The UX differences among the models are indeed becoming clearer and more important. Claude’s Artifacts and Projects are really handy as is ChatGPT’s Advanced Voice mode. Perplexity is great when I need a summary of recent events. Google isn’t charging for it yet, but NotebookLM is very useful in its own way as well.
When I test the underlying models directly, it’s hard for me to be sure which is better for my purposes. But those add-on features make a clear differentiation between the providers, and I can easily see consumers choosing one or another based on them.
I haven’t been following recent developments in the companies’ APIs, but I imagine that they are trying to differentiate themselves there as well.
I love it. More goodies for us
The ones without money will usually lose because they get less opportunity to get in front of eyeballs. Occasionally they manage it anyway, because despite the myth that the VCs love to tell, they aren't really great at finding and promulgating the best tech.
LLMs are capital intensive. They’re a natural fit for financing.
I’m reminded of slime molds solving mazes [1]. In essence, VC allows entrepreneurs to explore the solution space aggressively. Once solutions are found, resources are trimmed.
[1] https://www.mbl.edu/news/how-can-slime-mold-solve-maze-physi...
Except for all the others.
Getting lucky twice in a row is really, really lucky. Getting lucky three times in a row is not more likely just because they were lucky two times in a row.
Tech companies purchased television away from legacy media companies and added (1) unskippable ads, (2) surveillance, (3) censorship and revocation of media you don't physically own, and now they're testing (4) ads while shows are paused.
There's no excuse for getting fooled again.
They destroyed the Taxi industry, I used to be able to just walk out to the taxi rank and get in the first taxi, but not anymore. Now I have to organize it on an app or with a phone call to a robot, then wait for the car to arrive, and finally I have to find the car among all the others that other people called.
Food delivery used to be done by the restaurant's own delivery staff; it was fast, reliable, and often free if ordering for 2+ people. Now it always costs extra, and there are even more fees if I want the food while it's still hot. Zero care is taken with the delivery: food/drinks are not kept upright and can be a total mess on arrival. Sometimes it's escaped the container and is just in the plastic bag. I have ended up preferring to go pick up food myself over getting it delivered, even when I have a migraine. It's just gone to shit.
I assume you are talking about airports. Guess what, they still exist in many places. And on the other hand, for US, other than a few big cities, the "normal" taxi experience is that you call a number and maybe a taxi shows up in half an hour. With Uber, that becomes 10 minutes or less, with live map updates. Give me that and I'll be happy to forget about Uber.
Getting a taxi was awful before ride-sharing apps. You'd have to walk to a taxi stop, or wait on the side of the road and hope you could hail one. Once the ride-sharing apps came in, suddenly getting a ride became a lot simpler. Our taxi companies are still alive, though they have their own apps now -- something that wouldn't have happened without competition -- and they also work together with the ride-hailing companies as a provider. You could still hail taxis or get them from stops too, though that isn't recommended given that they might try to run the meter by taking a longer route.
For food delivery, before the apps, most places didn't deliver food. Nowadays, more places deliver. Even if a place already had their own delivery drivers, they didn't get rid of them. We get a choice, to use the app or to use the restaurant's own delivery. Usually the app is better for smaller meals since it has a lower minimum order amount, but the restaurant provides faster delivery for bigger orders.
I can now summon a cab from the comfort of the phone I'm holding, and know that they'll accept my credit card. I know the price before I get in and I know the route they should take. I'm not going to get taken for an unnecessary scenic tourist surcharge detour.
people don't like feeling they got cheated, and pre-uber, taxis did that all the time.
In every city I have lived, that is a good thing, despite all the bad things ridesharing startups may have done.
These rideshare and delivery companies are disgusting and terrible.
Leaves you in a worse position than you started with.
Monetizing all of this is frankly...not my problem.
Very few people manage that - indeed I can't think of anyone. Even movie stars get replaced with other movie stars if they try to charge too much. Certainly everyone in the tech industry (including the CEOs, the VCs, the investors etc.) has a viable substitute.
I see 2 paths:
- Consumers - the Google way: search and advertise to consumers
- Businesses - the AWS way: attract businesses to use your API and lock them in
The first is fickle. Will OpenAI become the door to the Internet? You'll need people to stop using Google Search and rely on ChatGPT for that to happen. Short term you can charge a subscription, but long term it will most likely become a commodity with advertising.
The second is tangible. My company is plugged directly to the OpenAI API. We build on it. Still very early and not so robust. But getting better and cheaper and faster over time. Active development. No reason to switch to something else as long as OpenAI leads the pack.
There are so many ways, it makes the question seem nonsensical.
Ways to monetize AI so far:
- Metered APIs (OpenAI and others)
- Subscription products built on it (Copilot, ChatGPT, etc.)
- Using it as a feature to give products a competitive edge (Apple Intelligence, Tesla FSD)
- Selling the hardware (Nvidia)
What similar parallel can we think of for AI?
Has it become a verb yet? Waiting for people to replace "I googled how to..." with "I chatgpted how to...".