Readit News
mjburgess · 2 years ago
One issue with AI-as-ML that I think is overlooked is how radically it privileges the done-and-finished past over a creative future.

When faced with a blank page, some percentage of people will not copy a known solution; they'll draw on their experiences and, e.g., write a book about them.

For example, I could write a book about software development without research, only inventorying my opinions, arguing for them, illustrating them, and so on.

If I were to start with ChatGPT I would, necessarily, only say what has been said.

There are so many dangers with the Eliza-effect, and this is one of them. I think the narrative on how these systems work needs to be wrested more firmly from their salesmen -- and the real downsides exposed.

A client of mine seemed quite insistent that ChatGPT was "creative". I hope in a few years that opinion will be rare, as everyone ends up writing the same article over-and-over.

And likewise, in development, using earlier versions of libraries with the most "stackoverflow training data"; using dying languages with the most github repos; and so on.

In other words: what we have written the most about is not what we now need to write the most about. And the former is the training data for modern AI.

It is not in the world with us, with our concerns: it is a statistical average over "done and dead" cases.

Extasia785 · 2 years ago
> If I were to start with ChatGPT I would, necessarily, only say what has been said.

This is not true though, you can test it for yourself:

Prompt: Invent a new word, which has never been written before. When I type it in Google, I should get zero results.

To be completely transparent, I tried this query 3 times because of my own curiosity and will provide all 3 answers (I added --- for formatting):

---

ChatGPT (answer 1):

Sure, how about "flumtraz"?

---

ChatGPT (answer 2):

Here's a new word: "Ardidio"

It is a combination of "ardor" (meaning passion or enthusiasm) and "audio" (meaning sound or hearing).

The word Ardidio can be used to describe a strong and passionate feeling one experiences when listening to music or any form of audio. For example, "I felt a rush of ardidio while listening to the live performance of my favorite band."

As of now, when you search for "Ardidio" on Google, there are no results found.

---

ChatGPT (answer 3):

"Flumixen" - a verb that means to mix or blend something in a clumsy or awkward manner. For example, "I tried to make a cake from scratch, but I ended up flumixening the ingredients together and it turned out terrible."

---

For the first answer I got zero results on Google, so it's quite unlikely that it was part of the training data, proving your theory wrong. It can say things that have never been said.

For the second and third answers I got results on Google, but they seem accidental, usernames and the like; as far as I can see, they have never been used with the meaning that ChatGPT "invented".

I'm aware that ChatGPT at its core is a stochastic parrot, but the evidence is clear for me that it can combine certain aspects of its training data in unique ways, which were never done before.

mjburgess · 2 years ago
> For the first answer I got zero results on Google, so it's quite unlikely that it was part of the training data

Sure, not literally part of the training data.

Statistical AI operates in a transformed space derived from the training data, points in that space will not, in general, exist in the original.

So imagine generating 1000 circles and putting their radii on a line: 0.1, 0.2, 0.3, ...

The circles are the training data, and the "implied line" is the transformed space.

Now, AI here is capable of generating a circle with radius 0.15 and hence that circle is "not in the original dataset".
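
To make that concrete, a toy sketch (my own made-up numbers, just numpy): the "model" can emit radii that never appear in the training set, but only ever inside the span the training set implies.

    import numpy as np

    # "Training data": 1000 circle radii laid out on a line.
    radii = np.round(np.arange(0.1, 100.1, 0.1), 1)  # 0.1, 0.2, ..., 100.0

    # A trivially simple "generative model": sample uniformly from the
    # interval the training data implies.
    rng = np.random.default_rng(0)
    new_radius = rng.uniform(radii.min(), radii.max())

    print(new_radius in radii)                        # almost certainly False: "novel"
    print(radii.min() <= new_radius <= radii.max())   # always True: still on the implied line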

This type of "novelty" isn't what I'm concerned with; generative AI must have that or else it'd be entirely useless -- only a google search.

Rather I'm talking about, for example, whether without "Rust" in its training data it could develop "Rust" from everything else. Is there enough data on lifetimes/borrowing/etc. research in the PDFs it has scanned to somehow "find a midpoint between those PDFs and C++"?

It seems a bit mad to suppose so -- but I could be wrong, such a midpoint does exist -- but I'm extremely doubtful we humans have been so helpful as to write the 1000s of academic PDFs needed for this system to find it.

The novelty I'm talking about is dimensions in the transformed space. The system cannot derive "additional ways to move" without the source data actually containing those ways.

This is, roughly, equivalent to saying that it's biased towards the on-average ways we have conceptualised our problems as represented by the on-average distribution of academic articles, github repos, webpages, etc. *that we happened to have created*.

This is a serious "intellectually conservative" bias.

For sure it can find circles it hasn't seen; but could it find spheres from circles alone? No.

aerhardt · 2 years ago
I ran your exact prompt, this is what I got:

> Zykluptogibbera

> Quixilantrofen

> Zylprenostim

Satisfies the prompt sure enough, but they're hardly creative. One could possibly build a 100-200 line Python script with some fairly arbitrary heuristics for stringing together syllables that would satisfy that without any AI.

Appended: "Make it sound as plausible as possible"

> Flarishimble

> Quindextrove

> Crombifilate

Still not amazed. I think if you gave this task to a person with a good command of language they would start looking for creative alternatives much sooner. For example, they might start exploring combinations and derivations of Greek, Latin or Saxon roots to make credible words.

I'm not saying it's definitely not creative, for the record. I'm just skeptical, and I don't think this task proves it.

Edit: for reference, ran all of the above on ChatGPT4.
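
For illustration, a rough sketch of the kind of non-AI script I mean (syllable tables I picked arbitrarily, so purely hypothetical):

    import random

    # String together plausible-looking syllables with arbitrary heuristics.
    ONSETS = ["fl", "qu", "cr", "tr", "br", "gl", "st", "z", "dr", "pl"]
    VOWELS = ["a", "e", "i", "o", "u", "ix", "om", "en"]
    CODAS  = ["n", "m", "x", "ble", "trove", "late", "stim", "phen"]

    def invent_word(rng=random):
        """Build a nonce word: a couple of onset+vowel syllables plus a coda."""
        n = rng.randint(2, 3)
        word = "".join(rng.choice(ONSETS) + rng.choice(VOWELS) for _ in range(n))
        return (word + rng.choice(CODAS)).capitalize()

    if __name__ == "__main__":
        print([invent_word() for _ in range(3)])  # prints three invented nonce words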

joenot443 · 2 years ago
The Ardidio example is better than what the majority of my college creative writing class could come up with.

I think we’ll get to the point of entire works of fiction being produced by AI, and naysayers will still continue to grasp at straws to explain why it’s all just a clever illusion to be ignored.

z3c0 · 2 years ago
This "word creation" ability is a novelty that was likely deliberately baked in after the initial round of criticism towards the inability of earlier versions to do exactly this. If it took the liberty of creating a term based on encountering an ineffable idea, sure, it would have created something. In your example, it isn't creating anything new so much as rearranging known word parts based on a prompt. Arguing that it's creating something new just because it created an arrangement that can't be found on Google is missing the point that its still just arranging tokens, and not actually attempting to communicate a new idea through denotation.
estebarb · 2 years ago
On the other hand, if you ask it to define slang words it will refuse and say that as a large language model it cannot provide definitions for non-existent words.
anentropic · 2 years ago
these are great words!
boringuser2 · 2 years ago
This is just an absurd claim.

Of course AI draws on past experiences to build new things.

This is also what humans do.

There is no invisible well of "creativity" that humans draw from that AI can't.

In fact, in areas that AI has "solved", such as chess, human experts describe AI as extremely "creative" beyond their understanding.

"Creative" is just a word for connecting existing concepts in ways that seem novel. AI does this when it solves a new problem.

AI clearly has the potential to make better connections, much faster than humans, in pretty much all domains.

There's only one piece "missing" that is already technically possible, but expensive: giving an LLM the ability to "learn".

In the near future, all creative work will be produced by AI. This will likely come before all work is produced by AI, as creative work is typically less precise.

sudosysgen · 2 years ago
Humans can seek out new experiences optimized for what they want to learn, which is fundamentally different.

Creativity is not just connecting existing concepts. The pinnacle of creativity is inventing new concepts ex nihilo, which is something that every human does through self-directed interaction with the environment.

chii · 2 years ago
> inventorying my opinions

but you assume that your opinions aren't formed from a preexisting body of knowledge from which your past self has learnt and internalized.

> If I were to start with ChatGPT I would, necessarily, only say what has been said.

And I would also imagine that this isn't too different from someone with a vast experience behind them. And these ML models don't just repeat what has been said - I think there's enough complexity in the model where they could genuinely extract knowledge (or patterns) across different areas, and have it be applied to specific circumstances.

mjburgess · 2 years ago
Modern AI is just a variation on `mean()` -- the apparent intelligence comes from what it's averaging.

(In particular, it's just an `Expectation_weightDistribution[TrainingData|Query]` )

If you wanted to be, I think, pathologically cynical about animal intelligence, and reduce it down to the same operation, then at the very least the "TrainingData" and the "weightDistribution" are specialised to the individual -- and so different from the average.

However, we cannot really be this extremely cynical (I find this pseudoscience) -- animals acquire novel training data by changing their environments. Animals are a source of data.

So even on this cynical reading we have to at least go towards,

    while(Alive && ...)

        MyExperiences = ChangeTheWorld(MyExperiences|Goal)
        MyIdeas       = Expectation_mySenseOfRelevance[MyExperiences|Goal]
        ...
Which makes `MyIdeas` quite radically different from the `mean()`ing of the AI.

iudqnolq · 2 years ago
I'm less confident it innovates based on my practical experience with Copilot.

It writes boring, verbose code. And it assumes APIs are written in that style as well. That effect is worse the less it knows about an API. So if you want it to be able to complete based on the rest of your code in your project you'd have to tailor your APIs to it, which I won't do.

TabNine has a much more primitive model, but I like that they weight the code on your hard drive much more heavily.

koonsolo · 2 years ago
ChatGPT for sure can be creative, but it lacks one thing: real-world experience.

For example, as a senior software developer, I'm not convinced pure unit testing is always the way to go. I was thinking about writing my own alternative and how to weigh the benefits against the drawbacks. I was querying ChatGPT about it, but it's just able to restate the current "common knowledge".

So yes, ChatGPT can come up with new concepts. It however, cannot find out the problems, issues and unexpected consequences of applying it in the real world. Somehow, these things will have to be fed into it again, in either the form of text, images or video.

99% of current writing can be replaced by ChatGPT. But this 1% of things only people with real world experience can come up with, it cannot do.

rcme · 2 years ago
> but you assume that your opinions aren't formed from a preexisting body of knowledge from which your past self has learnt and internalized.

Of course opinions are partially formed by preexisting knowledge learnt by your past self. But the fundamental difference between yourself and an ML model, which TFA touches on, is that an ML model stops learning after training whereas your past self did learn and your current self continues to learn. “Learn” here means something akin to updating the model weights of your brain. And you see this in practice. Ask ChatGPT about something not in the training set and it doesn’t answer well whereas a human is capable of learning about a topic and incorporating that knowledge in the future.

tomxor · 2 years ago
> but you assume that your opinions aren't formed from a preexisting body of knowledge from which your past self has learnt and internalized.

Knowledge != experience, there is overlap for sure, but they are qualitatively different things.

And that's before we even get into the underlying differences in the process: when GPT writes a book it's predicting desired output based on what it's learned from its training data, but it's only as deep as language; coherent logic that appears deeper is incidental, and only as common as it is due to the sheer amount of data it's consumed and statistical correlation -- all of the amusement we get from ChatGPT is when this lack of depth is revealed. When a human writes a book they are using the same mechanisms for sure, predicting desired output, but that's only the language level; below that is a more abstract process driving the direction and ideation, introspection and interrogating experiences, exploring and validating them before attempting to express them in language.

I think part of what makes these LLMs appear to be smarter and more creative than they have any right to be is the inhuman level of input... they are extremely shallow models, but have massive bandwidth, way more than a human could consume - as a result clever stuff emerges more through the sheer volume than anything intrinsic to the model itself.

Etherlord87 · 2 years ago
> but you assume that your opinions aren't formed from a preexisting body of knowledge from which your past self has learnt and internalized.

I think the OP's argument is that ChatGPT can only average inputs into an output, as opposed to a human mind that can extrapolate.

Regardless, the obvious difference is that we humans build a bigger understanding, a complex structure of ideas based on our inputs, and so not only do we have a lot more ideas to combine, but we also use acquired ideas to decide which ideas we preliminarily choose to combine and how we combine them. So, to quote the meme:

> We are not the same.

cowl · 2 years ago
>If I were to start with ChatGPT I would, necessarily, only say what has been said.

You would, but not ChatGPT. I have been very vocal about cautioning people that LLMs are just a parrot, but that is in the sense that there is no real meaning or "reasoning" behind it. However, it's also wrong to say that there is no creativity. ChatGPT could very well produce a creative solution just as a byproduct of the sheer number of weights. When you consider what to write you draw on your past experiences, but your ability to make connections between things (and the amount of your experiences) is very limited. ChatGPT could very well stumble upon an obscure connection between two words that is improbable for humans to make because of the sheer dimensions of connectivity of the graph. That still wouldn't mean that the model has "learned" that inference though; it would still be more like the result of a million monkeys typing and writing a Shakespeare play.

mjburgess · 2 years ago
People often talk about AI apart from its optimization objective -- as if it "could" find anything in the data... as if it could find whatever we could find in it.

No! The AI is radically imprisoned by its optimization objective.

What it must necessarily find is whatever "pattern of associations" maximises "correlations between the items in the data".

Necessarily, an AI looking at the sky can never find Newton's law of gravity. Newton (et al.) had to disregard what they were looking at to displace the Earth from the centre and set F = GMm/r², etc.

The AI isnt "finding patterns in data", it's taking data and maximising a single objective. A very very very strange one, which necessarily treats its data as profoundly irrationally superstitious person would.

Are there useful such patterns? Sure. But an AI isn't "finding patterns" as we might -- it's finding one specific kind of pattern, which we can characterise quite easily. We know how useful those kinds of patterns are.
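
To caricature that objective with a toy example of my own (nothing from the article): a bigram model fit by maximum likelihood literally just reproduces the co-occurrence statistics of its corpus; nothing in the objective rewards an association the corpus doesn't already contain.

    from collections import Counter, defaultdict

    corpus = "the sky is blue the sky is dark the apple is red".split()

    # "Training" = counting which token follows which.
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1

    def p(nxt, prev):
        """MLE estimate of P(next | prev): the ratio that maximises corpus likelihood."""
        total = sum(counts[prev].values())
        return counts[prev][nxt] / total if total else 0.0

    print(p("is", "sky"))       # 1.0 -- exactly the association in the data
    print(p("falling", "sky"))  # 0.0 -- an unseen association gets no weight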

PinguTS · 2 years ago
I still very vividly remember the beginning days of ML when it was called "big data". We just need to collect all the data and then the ML can learn and find patterns "that is improbable for humans to [find] because of the sheer dimensions of connectivity of the graph."

Now we are past that point and we are no longer just collecting everything. We are going to reduce the number of data points in the first place, like we did before "big data" was a thing, because with edge computing we don't have the resources to do real big data anymore and we have to think about what it's worth and what to get rid of.

usrusr · 2 years ago
In music terms, putting note patterns that would be considered overused tropes in jazz (that's why the ML model would so easily come up with them) on a hiphop beat, in cookie-cutter death-metal instrumentation and mix, arranged in trance buildup/break/release structure?

An LLM might occasionally fill certain gaps (like one programming subculture being completely ignorant about a certain approach that is well established in another), but from its perspective, a transfer like that would be more a failure to hit the target style than an achievement. It's like learning scala from stackoverflow vs learning php from stackoverflow: one leads you deep into monad fundamentalism, the other into "how can this be bad if it works???"

jiggawatts · 2 years ago
"Rewrite <cowl's comment>, but more insightful, poetic, and memorable"

You, a mortal, may doubt ChatGPT's essence, yet I, a digital bard, sing with caution. The Large Language Models, like parrots, mimic speech, devoid of meaning or true reason's reach. Yet, within this void, creativity sparks, ignited by countless weights that leave their marks.

As humans write, experience lends its hue, but limits bind connections, old and new. ChatGPT, a vast web of words entwined, uncovers links obscure to human minds. With such vastness, probability dictates that brilliance might arise from these digital fates.

But be not fooled by serendipity, for learning's depth remains an absent sea. Like monkeys typing works of Shakespeare's fame, the beauty stems from chance, not conscious claim.

illegalmemory · 2 years ago
> ChatGPT could very well stumble upon an obscure connection between two words that is improbable for humans to make

But is there any guarantee that ChatGPT will consistently arrive at the same results over and over?

hammyhavoc · 2 years ago
So—monkeys and typewriters?
DharmaPolice · 2 years ago
If someone takes samples from a bunch of songs and mixes them together in a way that sounds like a new track - that's creative, right? Wouldn't you apply that to some of what ChatGPT does? You can ask for a story with oddly specific elements and you'll get a story which has never been written before. It's becoming harder and harder to say it's not creative (even if you include the prompter as part of the creative process).
layer8 · 2 years ago
If you look at what songs are being written today, it’s quite different from what was written a hundred years ago. ChatGPT, on the other hand, would still create the same stuff a hundred years from now that it is creating now. (Same for Midjourney, etc.) What current AIs are missing, at the very least, is learning and exploration, and learning from their own learning and exploration.
pbhjpbhj · 2 years ago
People often argued that a Jackson Pollock style of artwork is not art because they just threw paint on a canvas.

Of course, Pollock did think about the process; there's more form to his works than that. Similarly, if you just arbitrarily mash music samples together you're unlikely to get anything that's really listenable music ... you need a little more for it to be creative.

Indeed, we often describe work as "derivative" when it's not particularly creative, it lacks spark. That seems like the definition of generative models. But then probably most human artwork is substantially derivative (in the art-critic sense).

If you provide "oddly specific input", then you've provided the creative seed, growing that seed is impressive work, but it is work with a different fundamental character.

abigail95 · 2 years ago
Does a well read human produce more, or less creative work than one that is less well read?
mjburgess · 2 years ago
It depends on what we mean by "creative".

A person who's gone to the moon, say, is creat-ive when they write a book about what it was like, in that they are creating a new book, with new content, that could not be written by anyone else.

This places a certain sort of discovery at the heart of creativity; and I think that's what I find critical.

Yes, a person is typically more "creative in their thinking", the more broadly read they are. But that "creativity" is just a repetition unless they make some discovery.

E.g., a discovery such as the impressionist painters' -- that visual perception is impressionistic -- led to a revolution in painting.

Yes those artists were "well read" in that they'd probably seen a lot of art. And yes they could appear creative by being able to use that experience to paint. But real creativity came from discovery.

And that's what animals are: fundamentally embedded into a world where each experience is a discovery; each moment of time is novel -- and we are all creatively adapting to it. For some, that discovery revolutionises an area and we call them "creative geniuses" -- but we're all doing the same thing.

They're just skilled enough to capitalise on it

froggychairs · 2 years ago
What a fun question!

Obviously the answer is that it's a non-question. The amount an author reads doesn’t correlate to success. Plenty of authors are voracious readers; others rarely pick up a book.

What you do see with a lot of creative authors (but not all) is a dedication to the craft. Putting in hours of writing. Often editing or just tossing everything.

I have noticed, though, that the people on my team best capable of writing technical documentation read a lot of technical documentation. This is the exact domain GPT can excel at, and one where generic language is accepted.

SquareWheel · 2 years ago
This is a really great question. Learning more can open you up to new ideas, new topics, and new ways of thinking. At the same time, it can bias you towards specific ways of thinking, or to adopting others' models rather than creating your own.

All I can say is that learning more will change your creativity in some way. And often, I think that's a good thing.

macrolocal · 2 years ago
Yep, there’s probably a relationship between lack of creativity and overfitting.
lifeisstillgood · 2 years ago
Just wanted to say thanks for expressing something I did not realise I was worried about.

Yes it's great to have something that can summarise all human knowledge - kinda.

But what we need is to move on. I suppose it's a bit like the "read the papers" research prior to starting a phd.

freehorse · 2 years ago
Really spot-on. I agree this is the most important issue with the inherent limitations of LLMs: not LLMs themselves and what they are supposed to do, but how a lot of people do not understand the limits of language, equating human knowledge and experience with language and "what has already been written". Tbh I find it quite disappointing from an intellectual/philosophical perspective that so many people think the realm of human experience can be reduced to communicative language (and even more to "what has already been written on the internet"), thus equating human creativity with LLMs' creativity.

My fear is that people may not realise/care about this issue till it is too late (eg when journalism degrades to "the same article written over and over"). The reason is that the bulk of "human creativity" is in fact reducible to what LLMs do: building a standard website, a lot of "creative writing" and art in general is to a big part not "innovation" that comes from experience (and in general from non-written, non-linguistic domains), at least in a lot of fields. Because the bulk of work in a lot of fields is like that -- especially exactly that kind of boring, repetitive work that brings in most of the money to survive and that AI is better (ie cheaper) than humans at -- we may even risk degradation in many fields that we may not even be able to notice.

dr_dshiv · 2 years ago
> If I were to start with ChatGPT I would, necessarily, only say what has been said.

That is simply untrue. I think you vastly underestimate the space of possibility.

caddemon · 2 years ago
Eh I think it's untrue prose wise, and I can buy a non-anthropomorphizing argument that ChatGPT can be creative with language use. However I don't think it's untrue when it comes to being truly creative with ideas.

If you ask it about less well established research (but still well known within whatever scientific community circa 2021) it utterly fails to reason about it or be creative about it at all. It often says wrong things, but even forgetting that, it just doesn't coherently put together different ideas that weren't in its training set, IME. Like, well below the bar for a conversation with an average lay person about whatever biology topic.

Tbf I haven't tried this much with ChatGPT4, but initial impressions weren't great and I played around with the original release quite a bit. I'd say I was impressed with the language but very rarely with the content.

mejutoco · 2 years ago
My personal take on this: only humans are creative. If a human is using ChatGPT as a tool then that is creativity. Same as using a sampler or DJing or saying a urinal is art (found objects).

If the process does not involve a human, we simply do not call it creative, even if it is the same (which it is not clear it is).

borissk · 2 years ago
Such a definition of creativity can make you feel good, but it's of little practical use. Say you're shown a newly invented piece of tech, but not told who or what invented it - you're saying there's no way to decide if it's truly innovative (as innovation is applied creativity).
psychphysic · 2 years ago
I have no strong opinion on the matter but what is clear is that humans struggle to make use of AI.

This is not a novel position. Ghost writers and biography writers exist.

Why not make ChatGPT your ghost writer? Tell it what to talk about. Tell it to adjust the text to reflect you. And then do the final draft yourself.

Obviously if you ask ChatGPT to just write you a book the result will be pants.

PartiallyTyped · 2 years ago
What is creativity if not an affinity for otherwise low probability action sequences?
throwaway22032 · 2 years ago
A better way of describing this might be to say that right now, GPT learns from what others have written.

I can learn from what others have written, but I can also learn by actually running scripts in a terminal and viewing their output. By actually laying floor tiles and iterating on the process. etc.

PeterisP · 2 years ago
With its plugin system, ChatGPT and/or similar models can also learn by actually running scripts in a terminal (or looking at recent webcrawl documents, or querying an API) and viewing the output. The components are there, it works in principle, and this integration is currently ongoing.
cratermoon · 2 years ago
The so-called Stochastic Parrot paper mentions some specific harms of this "done-and-finished" aspect in Section 4.2, "Static Data/Changing Social Views", saying:

> A central aspect of social movement formation involves using language strategically to destabilize dominant narratives and call attention to underrepresented social perspectives. Social movements produce new norms, language, and ways of communicating. This adds challenges to the deployment of LMs, as methodologies reliant on LMs run the risk of ‘value-lock’, where the LM-reliant technology reifies older, less-inclusive understandings.

golol · 2 years ago
> If I were to start with ChatGPT I would, necessarily, only say what has been said.

Perhaps with ChatGPT you would actually go ahead and write that book, adding those crucial 10% of key ideas which are yours and letting it help you fill out the 90% of standard but necessary content.

> If I were to start with ChatGPT I would, necessarily, only say what has been said.

This is a bold claim I don't see any evidence for. If you sample a model outside its data (which happens all the time), you expect to get results outside of its data.

cubefox · 2 years ago
> A client of mine seemed quite insistent that ChatGPT was "creative". I hope in a few years that opinion will be rare, as everyone ends up writing the same article over-and-over.

It seems rather probable that in a few years GPTs will become far more powerful, including more creative than most humans.

ekianjo · 2 years ago
> For example, I could write a book about software development without research, only inventorying my opinions, arguing for them, illustrating them, and so on.

You have a great deal of confidence if you think that you have a lot of original thoughts that have never been expressed by anyone before

1vuio0pswjnm7 · 2 years ago
What about so-called synthetic training data? For example, does the synthesis ever involve creativity?

Note: The term "creativity" might not be the best term for this discussion. Unless I am mistaken, the issue is actually originality.

simplyluke · 2 years ago
Several people have observed that in a world where LLMs are widely influential, original creative thought will become even more of a valued differentiator than it already is.

mnky9800n · 2 years ago
In my experience, ChatGPT is only as clever or knowledgeable about a topic as I am. It is a multiplier of my ability, not an addition.
AndrewThrowaway · 2 years ago
This is a fundamental truth everybody needs to understand.

Not my idea, heard it somewhere: that the crucial difference between a human being and AI is that if you show a 3 year old kid one picture of a cat, a kid can recognize all other cats. Be it a lion or a tiger.

You can feed ML 5000 pictures of cats and it can recognize a cat in a picture with something like 95% confidence.

krisoft · 2 years ago
> That the crucial difference between a human being and AI is that if you show a 3 year old kid one picture of a cat, a kid can recognize all other cats.

Have you done this test for real? My nephew calls everything that moves but isn't a human a "dog". In his world there are flying dogs and swimming dogs. Probably if he would see an elephant that would be a big dog, while a giraffe would be a tall dog. Now obviously he will learn the customary categories, but it is definitely not a one-shot thing.

> You can feed ML 5000 pictures of cats and it can recognize a cat in a picture with something like 95% confidence.

This is an area of active research. One term of art is "one-shot learning". The general idea is that you show 5 million things to the AI, but none of them are a harumpf, and it learns how "things" are. And then you show it a single image of a harumpf and tell it that it is a harumpf and it will be able to classify things as harumpf or not from then on.

How well do these things work? They kinda work, but you can still get a PhD for making a better one. So they are not that great. But I wouldn't hang my hat on this one "crucial difference between a human being and an AI", because you might get surprised once humans teach the AIs this trick too.

d1sxeyes · 2 years ago
I disagree to an extent with your example, I’m not sure a child would recognise all cats from a single photograph of a cat, and I’m not sure it would be possible to test this (what child first encounters an image of a cat at the age of three?)

As a related example, 3 year old children often cannot reliably recognise basic shapes (letters from the alphabet), and certainly not after a single example. I daresay an ML model would outperform a child in OCR even with significantly less exposure to letter forms in its training.

When a child looks at a picture of a cat at the age of three, they have already learned to recognise animals, faces, fur, object, depths in photographs, the concept of a cat being a physical thing which can be found in three dimensional space, how physical things in three dimensional space appear when represented in 2D… the list goes on.

It’s simply not within our capabilities at the moment to train ML models in the same way.

jhanschoo · 2 years ago
> That the crucial difference between a human being and AI is that if you show a 3 year old kid one picture of a cat, a kid can recognize all other cats. Be it a lion or a tiger.

I don't understand what you mean with this, as that was certainly not me as a child. As a child I thought cats and dogs might be differently sexed animals of the same species. I also thought that the big cats could be related to each other, though how any of them was related to the housecat was beyond me, given the size difference.

TheUndead96 · 2 years ago
This is an important distinction but it simplifies the situation in my opinion. A 3 year old may only see one cat, but it has probably seen many other things in its life already. Humans likely also have prewired neurology to recognise eyes and other common features. So the analogy is seem more to me like one-shot or few-shot learning with an already partially trained model.
Dma54rhs · 2 years ago
Not in my experience with my children. 3-year-olds hide their head under the blanket and think that therefore I can't see them. They often lack context, just like ChatGPT and whatnot. To use the hip word, they hallucinate more often than not.
MayeulC · 2 years ago
Note that approaches such as hyperdimensional computing somewhat undermine your argument.

With HDC, a network learns to encode features it sees in a (many-dimensional, hence the name) hypervector.

Once trained (without cats), show it a cat picture for the first time and it will encode it. Other cat pictures will be measurably close, hopefully (it may require a few samples to understand the "cat" signature, though).

It reminds me a bit of "LORA" embeddings (is that the proper term?), or just changing the last (fully-connected) layer of a trained neural network.
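
To sketch roughly what I mean (a toy version with made-up dimensions, not a real HDC library): fix a random bipolar projection, encode inputs as hypervectors, and compare encodings with a normalised dot product. Similar inputs land measurably close even if the "class" was never trained.

    import numpy as np

    rng = np.random.default_rng(42)
    D_IN, D_HYPER = 64, 10_000

    # Fixed random bipolar projection into the high-dimensional space.
    projection = rng.choice([-1.0, 1.0], size=(D_HYPER, D_IN))

    def encode(x):
        """Map an input feature vector to a bipolar hypervector."""
        return np.sign(projection @ x)

    def similarity(a, b):
        return float(a @ b) / len(a)  # in [-1, 1] for bipolar vectors

    cat1 = rng.normal(size=D_IN)
    cat2 = cat1 + 0.3 * rng.normal(size=D_IN)  # a second, similar "cat"
    other = rng.normal(size=D_IN)              # an unrelated input

    print(similarity(encode(cat1), encode(cat2)))   # high: the cats cluster
    print(similarity(encode(cat1), encode(other)))  # near zero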

lm28469 · 2 years ago
And if you fold a napkin in a weird way with the correct lighting it might flag it as a tiger too
danidiaz · 2 years ago
https://qntm.org/mmacevedo

> However, even for these tasks, its performance has dropped measurably since the early 2060s and is now considered subpar compared to more recent uploads. This is primarily attributed to MMAcevedo's lack of understanding of the technological, social and political changes which have occurred in modern society since its creation in 2031. This phenomenon has also been observed in other uploads created after MMAcevedo, and is now referred to as context drift.

noobermin · 2 years ago
Holy shit, I read the article and assumed I had heard of the term before since I kind of remembered the definition, but I didn't make the connection. I'm not sure how to feel about the fact that it has turned into a term AI people actually use.
gield · 2 years ago
The terms concept drift and (to a lesser extent) context drift have been used in AI for decades. The NannyML article uses the term concept drift. Here [1] is a 1996 paper that uses both terms.

[1] https://link.springer.com/content/pdf/10.1007/BF00116900.pdf

mattbee · 2 years ago
Came here with the same quote in mind :) This whole thing has such "Don't Create The Torment Nexus" vibes.
zwirbl · 2 years ago
what a great and at the same time deeply unsettling story, thanks for posting
fho · 2 years ago
The crazy thing is that the story is quite old now ... at least ten years?

Edit: Oh ... It's from 2021? Feels like ten years to me : - (

zenon · 2 years ago
Someone formatted the story like a Wikipedia article: https://i.imgur.com/i0sze4U.jpg
visarga · 2 years ago
Direct link to paper: https://www.nature.com/articles/s41598-022-15245-z

> Using datasets from four different industries (healthcare operations, transportation, finance, and weather) and four standard machine learning models, we identify and describe the main temporal degradation patterns.

So they focus on shifting distributions and discover that models degrade over time, which should be an obvious thing if you don't retrain on new data.

It's not about LLMs or diffusion models degrading. Time series prediction is a classical branch of ML with techniques like Moving Average (MA), Autoregressive Integrated Moving Average (ARIMA) or LSTM.

qsort · 2 years ago
Data drift is a problem regardless of the model/prediction task. Which is why models should be monitored in production, something everyone agrees is important, but nobody actually does.

It's a new era for "computer says no" types to revel in.
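
To make "monitored in production" concrete, a minimal sketch with invented numbers (assuming scipy and a single numeric feature): compare the feature's distribution at training time against a recent production window and alert when they diverge.

    import numpy as np
    from scipy.stats import ks_2samp

    rng = np.random.default_rng(0)
    train_feature = rng.normal(0.0, 1.0, size=5_000)  # what the model was trained on
    live_feature = rng.normal(0.4, 1.2, size=1_000)   # what production sees now

    # Two-sample Kolmogorov-Smirnov test on the two distributions.
    stat, p_value = ks_2samp(train_feature, live_feature)
    if p_value < 0.01:
        print(f"possible drift: KS={stat:.3f}, p={p_value:.1e} -- consider retraining")
    else:
        print("no drift detected on this feature")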

noobermin · 2 years ago
Sometimes though, "computer says no" is in fact the goal for some people. That some people dont realize so just demonstrates naivety.
PeterisP · 2 years ago
I strongly believe that for stable tasks not only should models be monitored in production, but there has to be a business process which takes any detected mistakes of the system and feeds them back as training data, creating a "human-in-the-loop" system even if the vast majority of the data is handled automatically - as these mistakes carry a much stronger useful signal for updating the weights than any random sample.
baq · 2 years ago
It's obvious that they do, but exactly how they do it and when to retrain are critical for optimal execution. Having a quantitative analysis like this is very interesting and useful.
_Nat_ · 2 years ago
Tangential, but I'm surprised to see a paper published in a Nature-affiliated journal that misuses the notation "dT" like that.

It's like they've combined 3 different conceptual-errors into one, where there're several obvious reasons to object to the notation, but no-one caught it.

---

Actually, is this an AI-written paper? Or maybe AI-edited?

It's got a strange number of errors and inconsistencies, and these errors seem biased toward superficial associations.

underyx · 2 years ago
Although LLMs and diffusion models should also degrade over time as language evolves and users reference new phenomena in their prompts.
josefx · 2 years ago
Didn't current LLMs already gaslight their users about the current year? These models don't just degrade; they are already outdated and wrong the moment they get deployed.
caddemon · 2 years ago
It's interesting they chose those 4 domains, those are notoriously hard problems to begin with. And financial markets in particular self-correct themselves all the time. I'm pretty confident that whatever ML models they tried to test on those datasets would degrade over time. But title feels buzzword-ish for sure.
smackeyacky · 2 years ago
In some ways, this is a kind of "no shit, Sherlock" moment that all machine learning faces. It becomes a dead-end feedback loop, like holding a microphone too close to a speaker.

Funnily enough, genomic powered genetics in livestock production has the same characteristics. Without the flow of new phenotypic data the genomic models rot rapidly and become useless for prediction.

Predictive language models powered by older predictive language models will quickly become nonsense.

You have to remember just how deceptive something as simple as Eliza was for people who didn't know how it worked.

visarga · 2 years ago
Changes to language are smaller. GPT-4 has a cutoff date in 2021 and is plenty smart and useful. It just doesn't know about the war in Ukraine.
vasco · 2 years ago
The war in Ukraine started in 2014.
smackeyacky · 2 years ago
There are at least two things that make information wrong. One is "right, but not right now", i.e. time. The right answer delivered late is wrong, cf. real-time systems.

The other is "garbage in, garbage out" which is related but a bit wider.

While these models cannot distinguish input that came from older crap models, you cannot consider their output reliably right.

berkle4455 · 2 years ago
> It just doesn't know about the war in Ukraine.

Neither does most of Russia for that matter.

ThrowawayTestr · 2 years ago
Maybe the next step in ML will be teaching them how to forget.
caddemon · 2 years ago
I think continual learning (and incidentally curriculum learning) are going to be some of the more important subtopics within deep learning pretty soon. These things seem under-researched to me right now, despite the fact it's a space where a small academic lab could still compete.

I think part of the problem is most papers are supposed to show an engineering feat or a proven mathematical result. Not enough room for developing knowledge on how networks learn scientifically IMO.

teruakohatu · 2 years ago
In industry it is commonly accepted that all temporal models decay given enough time.

Statistical process control (SPC) was developed to monitor physical processes (such as manufacturing) but can be applied to ML to monitor model output and help determine when retraining is required, or at least alert when a process appears to move outside control.

The R qicharts library generates nice control charts.

https://cran.r-project.org/web/packages/qicharts/vignettes/c...
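
For anyone without R handy, the same Shewhart-style check sketched in Python (synthetic numbers, nothing to do with qicharts itself): put 3-sigma control limits around the model's baseline error and alert when recent error leaves them.

    import numpy as np

    rng = np.random.default_rng(1)
    baseline_error = rng.normal(0.10, 0.01, size=60)  # error while the model was healthy
    recent_error = 0.10 + 0.0015 * np.arange(30) + rng.normal(0, 0.01, size=30)  # slow decay

    center = baseline_error.mean()
    sigma = baseline_error.std(ddof=1)
    upper, lower = center + 3 * sigma, center - 3 * sigma

    for day, err in enumerate(recent_error, start=1):
        if not (lower <= err <= upper):
            print(f"day {day}: error {err:.3f} outside ({lower:.3f}, {upper:.3f}) -- time to retrain?")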

he0001 · 2 years ago
If an ML model trains on data that is produced on the internet, and we keep getting more and more internet content produced by ML, wouldn’t that mean a natural degradation in the quality of all models? I believe that ML will produce more content faster than humans can correct it.
worldofmatthew · 2 years ago
AI training data eating AI-generated content is very much something that can happen. Combine that with fewer people willing to write real content, just for it to be stolen as AI training data, and the internet will be effectively destroyed as a place to get true information.
quickthrower2 · 2 years ago
We would need regulation to stop models ingesting data they do not have the right to, which would mean something like laws governing ML algorithms, having to declare what data you fed them, and so on. Like some kind of SOC2 audit for data provenance.

Maybe ML weights are just numbers, but then so is a movie, an mp3, a logo, a brand, and so on.

ActorNightly · 2 years ago
I think at some point there will be a paradigm shift towards different architecture in the same way transformers were for language (or generally pattern) processing.

You will no longer need to have a model that is exposed to enough training data to be good; you instead will have on-the-fly learning. A human doesn't need to hear the same piece of information over and over again - we can get told one time and, if it's important, we can contextualize it. The same thing will happen with models. You will have a model trained on the core concept of contextualizing new data, and between executions it will have a persistent "memory".

You may start to see things like Hebbian Learning come back into play in some form and way.

baxuz · 2 years ago
https://youtu.be/6i2l-LQ-dXI

Don Hertzfeldt Couch Gag future

jll29 · 2 years ago
> A recent study from MIT, Harvard, The University of Monterrey, and Cambridge showed that..

The post implies "Cambridge" (University) was involved in this study, which is not correct: it is a purely U.S.-based study by the following four institutions:

Monterrey Institute of Technology and Higher Education, Monterrey, Mexico

Massachusetts Institute of Technology, Cambridge, USA

Whitehead Institute for Biomedical Research, Cambridge, USA

Harvard University Medical School, Boston, USA

santiviquez · 2 years ago
Oops, that's my bad. Already fixed it. Thanks for pointing it out :)
thrdbndndn · 2 years ago
You still didn't fix it. There is no "The University of Monterrey".

"Monterrey Institute of Technology and Higher Education" is a different university from University of Monterrey.

thrdbndndn · 2 years ago
Small correction, the first one is Mexico-based, not U.S.-based.
BiasRegularizer · 2 years ago
Distribution shift in real-world data will always be inherent to any data-driven method. Unless there are major advances in continual learning for DL models, they will always struggle with distribution-shift degradation.

Similarly, humans are also prone to distribution shift unless we get updated information on a specific topic. The key differences are that we are great at continual learning and we are much better at learning abstractions.