The basic problem is that GPT generates easy poetry, and the authors were comparing to difficult human poets like Walt Whitman and Emily Dickinson, and rating using a bunch of people who don't particularly like poetry. If you actually do like poetry, comparing PlathGPT to Sylvia Plath is so offensive as to be misanthropic: the human Sylvia Plath is objectively better by any possible honest measure. I don't think the authors have any interest in being honest: ignorance is no excuse for something like this.
Given the audience, it would be much more honest if the authors compared with popular poets, like Robert Frost, whose work is sophisticated but approachable.
Huh, that sounds a little like claiming that AI can draw pictures just as well as humans because they look realistic at first glance, but the claim falls apart once you check whether the text, repetitive elements, and partially occluded objects in the background look correct.
The more basic problem is that their methodology would conclude Harry Potter is better than Ulysses, AC/DC is better than Carla Frey, etc etc. It is completely fine to enjoy "dumb" art - I like Marvel comics and a lot of the Disney-era Star Wars novels have been pretty fun. But using easiness and fun as a metric of quality is simply celebrating ignorance and laziness.
Poetry is for the enjoyment and enlightenment of the reader. Who reads poems? The question is a little vague. Who reads poems on a birthday card? Everyone. In a children's book? Parents and children.
But this study focuses on literary poems specifically, using works by literary heavyweights like Chaucer, Shakespeare, Dickinson, Whitman, Byron, Ginsberg, and so on. So the question is who reads literary poems, to which the answer is: academics, literary writers, and a vanishingly small population of readers who enjoy such pursuits.
So the test of whether ersatz AI poetry is as good as "real" poetry should be whether the target audience (i.e. academics) finds the poems to be enjoyable or enlightening. But this study does not test that hypothesis.
This study tests a different hypothesis: can lay people who are not generally interested in literary poetry distinguish between real and AI literary poetry?
The hypothesis and paper feel kind of like a potshot at literary poetry to me, or at least I don't understand why this particular question is interesting or worthy of scientific inquiry.
To me, it sounds more like claiming that AI writes better code than Linus Torvalds because a group of random non-programmers preferred reading simple AI-generated Python over Linux kernel C code.
The vast majority of people seem to prefer Avengers 17 over any cinematic masterpiece, and the latest Drake song would be rated better than Tchaikovsky... We should let them play and worship ChatGPT if that's what they want to waste their time on.
I don't understand the logic of calling superhero movies lesser/unserious like this; it's very snobby. Movies and music are made to be entertaining, and The Avengers is more entertaining than your "arthouse cinematic masterpiece that nobody likes, but only because they aren't smart enough to understand it". It's also lazy and ignorant to ignore the sheer manpower that goes into making a movie like that.
> The basic problem is that GPT generates easy poetry
I was going to come in here and say this. I'll even make the claim that GPT and LLMs __cannot write poetry__.
Of course, this depends on what we mean by "poetry." Art is so hard to define, we might as well call it ineffable. Maybe Justice Potter Stewart said it best: "I know it when I see it." And I think most artists would agree with this, because the point is to evoke emotion. It is why you might take a nice picture and put it up on a wall in your house but no one would ever put it in a museum. But good art should make you stop, take some time to think, figure out what's important to you.
The art that is notable is not something you simply hang on a wall and get good feelings from when you glance at it. Such works are deep. They require processing. This is purposeful. A feature, not a bug. They are filled with cultural rhetoric and commentary. Did you ever ask why you are no Dorothea Lange? Why your photos aren't as meaningful as Alfred Eisenstaedt's? Clearly there's something happening here, but what it is ain't exactly clear.
Let me give a very recent example. Here[0] is an amicus brief that The Onion (yes, that Onion, the one that bought InfoWars) wrote to the Supreme Court. It is full of satire while arguing that satire cannot be outlawed. It is __not__ intended to be read at a glance. In fact, they even say so explicitly:
> (“[T]he very nature of parody . . . is to catch the reader off guard at first glance, after which the ‘victim’ recognizes that the joke is on him to the extent that it caught him unaware.”).
That parody only works if one is able to be fooled. You can find the author explaining it more here[1].
But we're coders, not lawyers. So maybe a better analogy is what makes "beautiful code." It sure as fuck is not aesthetically pleasing. Tell me what about this code is aesthetically pleasing and easy to understand?
float InvSqrt(float x) {
    float xhalf = 0.5f * x;
    int i = *(int*)&x;
    i = 0x5f3759df - (i >> 1);
    x = *(float*)&i;
    x = x * (1.5f - xhalf * x * x);
    return x;
}
It requires people writing explanations![2] Yet, I'd call this code BEAUTIFUL. A work of art. I'd call you a liar or a wizard if you truly could understand this code at a glance.
I specifically bring this up because there's a lot of sentiment around here that "you don't need to write pretty code, just working code." When in fact, the reality is that the two are one and the same. The code is pretty __because__ it works. The code is a masterpiece because it solves issues you probably didn't even know existed! There's this talk as if there's this bifurcation between "those who __like__ to write code and those who use it to get things done." Or those who think "code should be pretty" vs those who think "code should just work." I promise you, everyone in the former group is deeply concerned with making things work. And I'll tell you now, you've been sold a lie. Code is not supposed to be a Lovecraftian creature made of spaghetti and duct tape. You should kill it. It doesn't want to live. You are the Frankenstein of the story.
To see the beauty in the code, you have to sit and stare at it. Parse it. Contemplate it. Ask yourself why each decision is being made. There is so much depth to this, and its writing is a literal demonstration of how well Carmack understands every part of the computer: the language, how the memory is handled, how the CPU operations function at a low level, etc.
I truly feel that we are under attack. I don't know about you, but I do not want to go gentle into that good night. Slow down, you move too fast, you got to make the morning last. It's easy to say not today, I got a lot to do, but then you'll grow up to be just like your dad.
I would say Davis's definition of "objectively better" here is "nobody who reads these poems carefully could possibly conclude that this AI crap is better than Walt Whitman, the only explanation is Walt Whitman is so difficult that the raters didn't read it carefully."
The Nature paper is making a bold and anti-humanist claim right in the headline, laundering bullshit with bad data, without considering how poorly-defined the problem is. This data really is awful because the subjects aren't interested in reading difficult poetry. It is entirely appropriate for Davis, as someone who actually is interested in good poetry, to make a qualitative stand as to what is or isn't good poetry and try to define the problem accordingly.
I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made. They make it clear in the paper that they're specifically evaluating people who aren't especially interested in poetry, and talk at length about how and why this is different from other approaches. I suppose the clickbait title gives a bad first impression.
To summarize the facts: when people were asked to tell whether a given poem was written by a human or an AI, they judged the AI poems to be human-written more often than the actual human poems. People also tended to rate poems higher when they thought they were human-made than when they thought they were AI-created. It's speculated in the paper that this is because the AI poems tended to be more accessible and direct than the human poems selected, and the preference for this style among non-experts, combined with the perception that AI poetry is poor, led to the results.
The selection of human poets is cooked to give the result they wanted. I will grant the authors may have lied to themselves. But I don't think honest scientists would ever have constructed a study like this. It is comparing human avant-garde jazz to AI dance music and concluding that "AI music" is more danceable than "human music", without including human dance music! It's just infuriating.
> I don't see how the authors are dishonest about it, or that the rebuttal refutes any of the actual claims made.
Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs. Then I found a set of random bystanders and surveyed them on which drink they preferred, and they generally preferred the lemonade. Do you see how it would be dishonest of me to treat this like a serious evaluation of my skill, either as a lemonade maker or a wine maker?
I believe you missed the OP's point. Poetry is to be processed. That's a feature, not a bug. Now that we're in an analytical conversation, you need to process both papers and OP's words. Like poetry, there is context, things between the lines. Because to write everything explicitly would require a book's worth of text. LLMs are amazing compression machines, but they pale in comparison to what we do. If you take a step back, you'll even see this comment is littered with that compression itself.
You're making a load of generous claims for yourself without giving your thought process:
> The basic problem is that GPT generates easy poetry
> were comparing to difficult human poets
What's your qualitative process for measuring "easy" vs. "difficult" poetry?
> rating using a bunch of people who don't particularly like poetry
How do you know these people don't like poetry? Maybe they don't seek it out, but certainly poetry is not just for poetry lovers. Good poetry speaks to anyone.
> the human Sylvia Plath is objectively better by any possible honest measure
Agree. Poetry is the compact written reflection of an expanded or conflicted soul ... it requires lived experience, self-awareness, and the ability to compress out superfluous details in language.
The question of whether the human soul can be tricked by an AI illusionist, and to what degree, is a non sequitur.
Humans eat the meal. At best, and charitably considered, AI eats the menu. Not the same.
In other news I asked AI to convince me it understands American capitalism, and to explain it to me as if I lived in Los Angeles - you know so I could really see/feel it.
It did decently well, concluding it will, for example, balance the demand for "ample parking" with supply. Now, I leave you to assess whether that's an awesome and intelligent example, or an AI love song that, like Marshall Tucker's, reminds us it "can't be wrong."
> the human Sylvia Plath is objectively better by any possible honest measure.
Except for arguably the most important one: creating something that people enjoy. Just because you don't like it doesn't make it worthless. I guess the actual question is: do the raters actually get any enjoyment out of the AI poem, or do they just intensely dislike both?
Poetry, like humor, involves the use of the reader's expectations, but is typically most effective when subverting those expectations.
There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it. Unfortunately, I'd suspect that most readers' understanding of poetry lacks that crucial element of subversion, and so an LLM – which mostly just spits out the most-probable, most-expected next token – looks like what people think poetry is.
An LLM would not have created the curtal sonnet form, because it would've been too busy following the rules of a Shakespearean or Petrarchan sonnet. Similarly, an LLM wouldn't create something that intentionally breaks the pattern of a form in order to convey a sense of brokenness or out-of-place-ness, because it's too locked in on the most-common-denominator from previous inputs. And yet, many of the most powerful works I've read are the ones that can convey a disjointed feeling on purpose – something an LLM is specifically geared not to do.
Poetry aims for the heart, and catches the mind along the way. An LLM does not have the requisite complex emotional intelligence, and is indeed a pretty poor simulation of emotional intelligence.
Consider Auden's Epitaph on a Tyrant, which is powerful because it is so suddenly shocking, as it describes something that sounds perhaps like an artist or author, until it very suddenly doesn't, on the last line:
Perfection, of a kind, was what he was after,
And the poetry he invented was easy to understand;
He knew human folly like the back of his hand,
And was greatly interested in armies and fleets;
When he laughed, respectable senators burst with laughter,
And when he cried the little children died in the streets.
One could literally take Claude 3.5 Sonnet New or o1-preview and disprove this in an hour or two just by prompting the AI to try to exhibit the type of poetry you want and then maybe asking it to do a little bit of automated critique and refinement.
You can also experiment with a higher temperature (maybe just for the first draft).
You claim that LLMs can't make poetry like that. I bet they can if you just ask them to.
They could, but they probably won't. Poems like GP are basically using the power of emotional manipulation for good, and companies like Anthropic try very hard to prevent Claude from having that capability.
It can be obvious and go to the heart. I'm not sure Wilfred Owen's Dulce et decorum est is anything other than straight down the line, but it made me cry when I first read it.
That said, maybe the subversion is in how the reality is contrasted with the marketing.
I see 'subversion' as more broad. In good poetry, subversion is constantly happening at a micro level, through playing with meaning, meter, word choice. I think it's very easy to identify AI-generated poetry because it lacks any of that -- but on the flip side, if you don't understand the rules, you don't understand how to subvert them.
Even in Dulce et decorum est -- though the meaning's straightforward, there are plenty of small, subversive (and pretty) ideas in the meter. For example, the line "He plunges at me, guttering, choking, drowning" is an unexpected, disruptive staccato that really sounds like guttering, choking, drowning. It's a beautiful poem and is overflowing with such examples.
(I think this applies to art as a whole, not just poetry.)
>There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it.
Apparently, in 'the good old days (of the internet)', your poetry would be published by yourself, on your webpage - complete with a starry-twinkling background, a 'number of visitors' counter in the bottom right, and in a pink font.
> Notably, participants were more likely to judge AI-generated poems as human-authored than actual human-authored poems
There is clearly a significant difference between AI generated poems and human generated poems.
A random group of people probably do not read poetry. It would be interesting to see how people who do read poetry regularly do on this. Also, which poems they rate as good, rather than just "human authored".
I find that both the little AI-generated poetry I've seen and the AI-generated paintings that show up in my FB feed look a bit rubbish. FB is pretty good for experiencing this, because the work (usually an image) shows before the cues in the text and comments that it is AI-generated.
As someone who reads poetry regularly and has played around a bit with AI-generated poems, AI poems can be quite impressive, but have a certain blandness to them. I can see them conforming very well to the average person's concept of what a poem is, whereas the human written poems might be less pleasingly perfect and average, more stylistically and structurally experimental etc. The LLM version is less likely to play with the rules and expectations for the medium, and more likely to produce something average and "correct", which makes some intuitive sense given the way it works.
I think that when asked to rate poetry as human- or AI-authored, human poetry actually looks more like a random smattering of semi-related words, which, I assume, is what folks think machine-generated poetry would look like.
Perfect grammar and perfect meter don't read as AI to most folks yet.
LLMs also have no life experience, so they can write poems, but those poems aren't communicating anything real. Poetry in the vein of Whitman and Dickinson and Plath is very much about a person expressing their very personal experiences and emotions.
I'm reminded of how people are bad at generating and recognizing truly random patterns. I imagine the famous poets have something in their writing that's an outlier. I wonder if the human-authored poetry looks odd enough to cause problems with our fake detectors, while the mediocre grey goo that AI creates better fits expectations.
Anecdote incoming - I've read poetry, weekly if you will, for about 15 years now.
I also play with LLMs often, for creative side projects, and work commercially with them (prompt engineering stuff).
I don't find it far-fetched that individual poems can at times be indistinguishable when AI-generated. I was asking it to write in iambic pentameter (sonnets) and it consistently got the structure right; its approach to the provided themes was as complicated or glib as I wanted. But that's all subjective, right? Which leads me to my main point.
My view of poetry over the years has always been centred around the poet, the poet living in a time and place. As a generalisation, most people buy into the artist's life because it may represent some part of themselves.
If someone managed to write an intriguing corpus of texts using LLMs that was extolled, I think that would almost be beside the point. What is important is the narrator's life, ups and downs, joys and woes. Their ability to convey a memorable story, even heavily relying on AI, would still be impressive. Anyway, sounding a bit wanky, I will stop lol
(I do think LLMs write a little too perfectly, and that makes it easy to think it is not human, but you can kinda prompt them to throw in errors too, so who knows.)
The program Racter (which, from what I understand, was a basic MDP) was generating poetry in the 1980s that was published, read, criticized, and presented in museums: https://www.101bananas.com/poems/racter.html
I remember this, as one of its poems was used on the t-shirts of the computing social club that I was part of as a postgrad student:
More than iron
More than lead
More than gold I need electricity
I need it more than I need lamb or pork or lettuce or cucumber
I need it for my dreams
If it was like other poetry-generation programs of the '80s and '90s, it was generating a lot more crap than gold. People were definitely picking out the most cohesive examples, and probably even improving on them themselves.
> Despite this success, evidence about non-experts’ ability to distinguish AI-generated poetry has been mixed. Non-experts in poetry may use different cues, and be less familiar with the structural requirements of rhyme and meter, than experts in poetry or poetry generation. Gunser and colleagues14 and Rahmeh15 find that human-written poems are evaluated more positively than AI-generated poems. Köbis and Mossink16 finds that when a human chooses the best AI-generated poem (“human-in-the-loop”) participants cannot distinguish AI-generated poems from human-written poems, but when an AI-generated poem is chosen at random (“human-out-of-the-loop”), participants are able to distinguish AI-generated from human-written poems.
This is a huge difference. Writing is a two-step process: idea generation, and selection. The first part is similar to what a randomized algorithm or an LLM might do, in smaller chunks (and indeed, the history of aleatoric processes in creative endeavors is long -- see Oulipo for one example in literature.)
The second step -- selection -- is the heart of creativity. It's about taste. Knowing what is and isn't "good."
When you consider the creative processes of Dada, Duchamp, Brian Eno -- I think it becomes clear that this automation of creative generation is a continuation of existing trends rather than a sudden disruption.
If an LLM were able to, on its own, generate and select poems, independently developing a new style of poetry that resonated with readers -- that would be something else entirely.
Maybe my logic is choppy, but I'm unconvinced that it is useful to use (inexperienced) popular sentiment as some kind of reference for cultural quality here. The average person has read at most 100 poems; people who really like poetry will have read quite a bit more, and that experience is relevant in a context such as this. I'm more curious what happens when we look specifically at those humans who have substantial, regular engagement with a given artform.
The study is interesting, but the fact that the general public enjoys the most statistically calculated, 50%-grey, inoffensive, unchallenging and formulaic poetry isn't too novel -- I don't read much poetry and I'm sure I'd be voting for the AI also.
Just for fun, I had ChatGPT rewrite your comment as a poem. Here’s the result:
————
My logic may be choppy, but I’m not yet sold,
That popular sentiment should be the gold,
For judging culture's worth or depth or art,
When most have seen so little, played no part.
The average person, read just a hundred lines,
While those who love the craft cross many signs,
With pages turned, they gain a richer view,
That shapes what art they’ll seek, and how they choose.
The study’s neat, but here’s what I believe:
The crowd enjoys the safe, the bland, the brief—
The 50% grey, inoffensive way,
The formulaic poems that softly sway.
I don't read much myself, I must admit,
But even I might find that AI's wit
Could speak in ways that sound like common sense,
But lacks the depth of art's true recompense.
Hmm, I’m not convinced. That’s just the comment in the guise of poetry. There are a bunch of dangling implications in this “poem” where a real master would weave the implications together.
I guess I’d also say it is not only doing the right thing that counts, one must also be doing it for the right reason.
AI “art” is mimicry, burdened by the inevitable mundanity of the majority of its training corpus. The avant garde remains exclusively the domain of a comparative handful of forward thinking humans, in my humble opinion.
Or said another way: AI art is kitschy, and I don’t think it can escape it.
Just for fun, I also rewrote the comment as a poem:
Perhaps my logic flounders, but I'm unconvinced
That popular opinion can be the test
Of how we should judge our culture's best.
Can we who think that poetry's a bore,
And haven't even read five score,
Place ourselves above those who adore?
I laugh at memes and share the swill.
Of course I like the muck and fill
At the bottom of the hill.
It's completely reasonable to get an average person's opinion on something as primal as poetry.
Poetry is for everyone, not just poetry connoisseurs. It's a simplified primal expression of language, taking the form of pretty soundbites & snippets, pristine, virginal, uncorrupted by prose and overthinking. Poetry is not the domain of middlebrow academics.
I used to think that, but I think this is only true if you want to measure broad market appeal. Very few things are broadly marketable, and many of them have niches. "Middlebrow academics" are the ones who go to the poetry shelf of their local bookstore and pick up anthologies and they are the ones who go to poetry slams, and so on.
I honestly think that there might be some truth to that.
If you look at Boston Dynamics, these are some of the very best roboticists on the planet, and it's taken decades to get robots that can walk almost as well as humans. I don't think it's incompetence on Boston Dynamics' end, I think it turns out that a lot of the stuff that's trivial and instinctual for us is actually ridiculously expensive to try and do on a computer.
Washing dishes might not be the best example because dishwashers do already exist, and they work pretty well, but having a robot with anywhere near the same level of flexibility and performance as a human hand? I'm not going to say it's impossible obviously, but it seems ridiculously complex.
I think this line of reasoning is really bizarre, as if there's this straight-line path of progress, and then we stop the second it starts doing shit that we consider "fun".
Who is to say that "washing dishes" (to use your example) is a less complicated problem than art, at least in regards to robotics and the like?
it's not a matter of what's complicated, it's a matter of what it replaces. the quote isn't reflecting on what's easiest to solve, it's reflecting on the impact that it has on culture as a whole.
a tangible impact of the current generation of AI tools is they displace and drown out human creations in a flood of throwaway, meaningless garbage. this is amplifying the ongoing conversion of art into "content" that's interacted with in extremely superficial and thoughtless ways.
just because something _can_ be automated doesn't mean it _should_ be. we actively lose something when human creativity is replaced with algorithmically generated content because human creativity is as much a reflection of the state of the art as it is a reflection of the inner life of the person who engages in it. it's a way to learn about one another.
in the context of the broader discussion of "does greater efficiency everywhere actually have any benefit beyond increasing profits," the type of thing being made efficient matters. we don't need more efficient poetry, and the promise of automation and AI should be that it allows us to shrug off things that aren't fulfilling - washing dishes, cleaning the house, so on - and focus on things that are fulfilling and meaningful.
the net impact of these technologies has largely been to devalue or eliminate human beings working in creative roles, people whose work has already largely been devalued and minimized.
it's totally akin to "where's my flying car?" nobody actually cares about the flying car, the point is that as technology marches on, things seem to universally get worse and it's often unclear who the new development is benefitting.
> as if there's this straight-line path of progress
I think your rebuttal is really bizarre. OP is simply saying what they want AI to do.
> Who is to say that "washing dishes" (to use your example) is a less complicated problem than art
I think dish washing is a bad example, because we have dishwashers. But until the market delivers AI and robotic solutions at an affordable cost that actually fulfill most people's needs, such chores will continue to be a net drain on the average person.
You don't get to tell people what they want or need.
Like any other art, the painful truth is that it is all subjective.
It kills me that despite how elitist I am with the music I listen to, music I have spent decades now carefully curating, there is no such thing as "good music". What we music snobs call "good music" is really just what makes us personally feel good, coupled with the ego stroking of self-described sophistication.
The basic problem is that GPT generates easy poetry, and the authors were comparing to difficult human poets like Walt Whitman and Emily Dickinson, and rating using a bunch of people who don't particuarly like poetry. If you actually do like poetry, comparing PlathGPT to Sylvia Plath is so offensive as to be misanthropic: the human Sylvia Plath is objectively better by any possible honest measure. I don't think the authors have any interest in being honest: ignorance is no excuse for something like this.
Given the audience, it would be much more honest if the authors compared with popular poets, like Robert Frost, whose work is sophisticated but approachable.
Poetry is for the enjoyment and enlightenment of the reader. Who reads poems? The question is a little vague. Who reads poems on a birthday card? Everyone. In a children's book? Parents and children.
But this study focuses on literary poems specifically, using works by literary heavy weights like Chaucer, Shakespeare, Dickinson, Whitman, Byron, Ginsberg, and so on. So the question is who reads literary poems, to which the answer is: academics, literary writers, and a vanishingly small population of reader who enjoy such pursuits.
So the test of if ersatz AI poetry is as good as "real" poetry should be if target audience (i.e. academics) finds the poems to be enjoyable or enlightening. But this study does not test that hypothesis.
This study tests a different hypothesis: can lay people who are not generally interested in literary poetry distinguish between real and AI literary poetry?
The hypothesis and paper feel kind of like a potshot at literary poetry to me, or at least I don't understand why this particular question is interesting or worthy of scientific inquiry.
Of course, this depends on what we mean by "poetry." Art is so hard to define, we might as well call it ineffable. Maybe Justice Potter said it best, "I know it when I see it." And I think most artists would agree with this, because the point is to evoke emotion. It is why you might take a nice picture and put it up on a wall in your house but no one would ever put it in a museum. But good art should make you stop, take some time to think, figure out what's important to you.
The art that is notable is not something you simply hang on a wall and get good feelings from when you glance at it. They are deep. They require processing. This is purposeful. A feature, not a bug. They are filled with cultural rhetoric and commentary. Did you ever ask why you are no Dorothea Lange? Why your photos aren't as meaningful as Alfred Eisenstaedt's? Clearly There's something happening here, but what it is ain't exactly clear.
Let me give a very recent example. Here[0] is a letter from The Onion (yes, that Onion, the one who bought InfoWars The Onion) wrote an amicus brief to the Supreme Court. It is full of satire while arguing that satire cannot be outlawed. It is __not__ intended to be read at a glance. In fact, they even specifically say so
That parody only works if one is able to be fooled. You can find the author explaining it more here[1].But we're coders, not lawyers. So maybe a better analogy is what makes "beautiful code." It sure as fuck is not aesthetically pleasing. Tell me what about this code is aesthetically pleasing and easy to understand?
It requires people writing explanations![2] Yet, I'd call this code BEAUTIFUL. A work of art. I'd call you a liar or a wizard if you truly could understand this code at a glance.I specifically bring this up because there's a lot of sentiment around here that "you don't need to write pretty code, just working code." When in fact, the reality is that the two are one in the same. The code is pretty __because__ it works. The code is a masterpiece because it solves issues you probably didn't even know existed! There's this talk as if there's this bifurcation between "those who __like__ to write code and those who use it to get things done." Or those who think "code should be pretty vs those who think code should just work." I promise you, everyone in the former group is deeply concerned with making things work. And I'll tell you now, you've been sold a lie. Code is not supposed to be a Lovcraftian creature made of spaghetti and duct tape. You should kill it. It doesn't want to live. You are the Frankenstein of the story.
To see the beauty in the code, you have to sit and stare at it. Parse it. Contemplate it. Ask yourself why each decision is being made. There is so much depth to this, and its writing is a literal demonstration of how well Carmack understands every part of the computer: the language, how the memory is handled, how the CPU operations function at a low level, etc.
I truly feel that we are under attack. I don't know about you, but I do not want to go gentle into that good night. Slow down, you move too fast, you got to make the morning last. It's easy to say not today, I got a lot to do, but then you'll grow up to be just like your dad.
[0] https://www.supremecourt.gov/DocketPDF/22/22-293/242292/2022...
[1] https://www.law.berkeley.edu/article/peeling-layers-onion-he...
[2] https://betterexplained.com/articles/understanding-quakes-fa...
The Nature paper is making a bold and anti-humanist claim right in the headline, laundering bullshit with bad data, without considering how poorly-defined the problem is. This data really is awful because the subjects aren't interested in reading difficult poetry. It is entirely appropriate for Davis, as someone who actually is interested in good poetry, to make a qualitative stand as to what is or isn't good poetry and try to define the problem accordingly.
To summarize the facts: when people were asked to tell whether a given poem was written by a human or an AI, they judged the AI poems to be human-written more often than the actual human poems. People also tended to rate poems lower when they believed they were AI-generated than when they believed they were human-made. The paper speculates that this is because the AI poems tended to be more accessible and direct than the human poems selected, and the non-experts' preference for that style, combined with the perception that AI poetry is poor, led to the results.
Suppose I prepared a glass of lemonade and purchased a bottle of vintage wine—one held in high regard by connoisseurs. Then I found a set of random bystanders and surveyed them on which drink they preferred, and they generally preferred the lemonade. Do you see how it would be dishonest of me to treat this like a serious evaluation of my skill, either as a lemonade maker or a wine maker?
> The basic problem is that GPT generates easy poetry
> were comparing to difficult human poets
What's your qualitative process for measuring "easy" vs. "difficult" poetry?
> rating using a bunch of people who don't particuarly like poetry
How do you know these people don't like poetry? Maybe they don't seek it out, but certainly poetry is not just for poetry lovers. Good poetry speaks to anyone.
> the human Sylvia Plath is objectively better by any possible honest measure
Really? What's your objective measure?
Agree. Poetry is the compact written reflection of an expanded or conflicted soul ... it requires lived experience, self-awareness, and the ability to compress superfluous detail out of language.
The question of whether the human soul can be tricked by an AI illusionist, and to what degree, is a non sequitur.
Humans eat the meal. At best, and charitably considered, AI eats the menu. Not the same.
In other news I asked AI to convince me it understands American capitalism, and to explain it to me as if I lived in Los Angeles - you know so I could really see/feel it.
It did decently well, concluding, for example, that it will balance the demand for "ample parking" with supply. Now, I leave you to assess whether that's an awesome and intelligent example, or an AI love song that, like Marshall Tucker, reminds you it "can't be wrong."
Except for arguably the most important one: creating something that people enjoy. Just because you don't like it doesn't make it worthless. I guess the actual question is: do the raters actually get any enjoyment out of the AI poem, or do they just intensely dislike both?
There's a lot of bad poetry in the world that just follows readers' expectations. I should know, I've written some of it. Unfortunately, I'd suspect that most readers' understanding of poetry lacks that crucial element of subversion, and so an LLM – which mostly just spits out the most-probable, most-expected next token – looks like what people think poetry is.
An LLM would not have created the curtal sonnet form, because it would've been too busy following the rules of a Shakespearean or Petrarchan sonnet. Similarly, an LLM wouldn't create something that intentionally breaks the pattern of a form in order to convey a sense of brokenness or out-of-place-ness, because it's too locked in on the most-common-denominator from previous inputs. And yet, many of the most powerful works I've read are the ones that can convey a disjointed feeling on purpose – something an LLM is specifically geared not to do.
Poetry aims for the heart, and catches the mind along the way. An LLM does not have the requisite complex emotional intelligence, and is indeed a pretty poor simulation of emotional intelligence.
Consider Auden's Epitaph on a Tyrant, which is powerful because it is so suddenly shocking, as it describes something that sounds perhaps like an artist or author, until it very suddenly doesn't, on the last line:
You can also experiment with a higher temperature (maybe just for the first draft).
You claim that LLMs can't make poetry like that. I bet they can if you just ask them to.
If it's easy, can you provide some poetry forms you've coaxed Sonnet to create, with some exemplary poems in the form?
That said, maybe the subversion is in how the reality is contrasted with the marketing.
Even in Dulce et decorum est -- though the meaning's straightforward, there are plenty of small, subversive (and pretty) ideas in the meter. For example, the line "He plunges at me, guttering, choking, drowning" is an unexpected, disruptive staccato that really sounds like guttering, choking, drowning. It's a beautiful poem and is overflowing with such examples.
(I think this applies to art as a whole, not just poetry.)
Apparently, in 'the good old days (of the internet)', your poetry would be published by yourself, on your webpage - complete with a starry-twinkling background, a 'number of visitors' counter in the bottom right, and in a pink font.
I miss those days.
Not very, given the title...
Should be noted I do my best to avoid trailers, they totally spoil movies/shows for me.
> Notably, participants were more likely to judge AI-generated poems as human-authored than actual human-authored poems
There is clearly a significant difference between AI generated poems and human generated poems.
A random group of people probably do not read poetry. It would be interesting to see what people who do read poetry regularly make of this, and also which poems they rate as good, rather than just "human authored".
I find that both the little AI-generated poetry I've seen and the AI-generated paintings that show up in my FB feed look a bit rubbish. FB is pretty good for experiencing this because the work (usually an image) shows before the cues that it is AI-generated in the text and comments.
Perfect grammar and perfect meter don't read as AI to most folks yet.
I also play with LLMs often for creative side projects, and work with them commercially (prompt engineering stuff).
I don't find it far-fetched that an individual AI-generated poem can be indistinguishable at times. I was asking it to write sonnets in iambic pentameter, and it consistently got the structure right; its approach to the provided themes was as complicated or glib as I wanted. But that's all subjective, right? Which leads me to my main point.
My view of poetry over the years has always been centred around the poet, the poet living in a time and place. As a generalisation, most people buy into the artist's life because it may represent some part of themselves.
If someone managed to write an intriguing corpus of texts using LLMs that was extolled, I think that would almost be beside the point. What is important is the narrator's life, its ups and downs, joys and woes. Their ability to convey a memorable story, even heavily relying on AI, would still be impressive. Anyway, sounding a bit wanky, I will stop lol
(I do think LLMs write a little too perfectly, and that makes it easy to think the text isn't human, but you can kinda prompt them to throw in errors too, so who knows)
I remember this, as one of its poems was used on the t-shirts of the computing social club that I was part of as a postgrad student:
This is a huge difference. Writing is a two-step process: idea generation, and selection. The first part is similar to what a randomized algorithm or an LLM might do, in smaller chunks (and indeed, the history of aleatoric processes in creative endeavors is long -- see Oulipo for one example in literature.)
The second step -- selection -- is the heart of creativity. It's about taste. Knowing what is and isn't "good."
When you consider the creative processes of Dada, Duchamp, Brian Eno -- I think it becomes clear that this automation of creative generation is a continuation of existing trends rather than a sudden disruption.
If an LLM were able to, on its own, generate and select poems, independently developing a new style of poetry that resonated with readers -- that would be something else entirely.
The study is interesting, but the finding that the general public enjoys the most statistically calculated, 50%-grey, inoffensive, unchallenging & formulaic poetry isn't too novel -- I don't much read poetry and I'm sure I'd be voting for the AI also.
————
My logic may be choppy, but I’m not yet sold, That popular sentiment should be the gold, For judging culture's worth or depth or art, When most have seen so little, played no part.
The average person, read just a hundred lines, While those who love the craft cross many signs, With pages turned, they gain a richer view, That shapes what art they’ll seek, and how they choose.
The study’s neat, but here’s what I believe: The crowd enjoys the safe, the bland, the brief— The 50% grey, inoffensive way, The formulaic poems that softly sway.
I don't read much myself, I must admit, But even I might find that AI's wit Could speak in ways that sound like common sense, But lacks the depth of art's true recompense.
I guess I’d also say it is not only doing the right thing that counts; one must also be doing it for the right reason.
AI “art” is mimicry, burdened by the inevitable mundanity of the majority of its training corpus. The avant garde remains exclusively the domain of a comparative handful of forward thinking humans, in my humble opinion.
Or said another way: AI art is kitschy, and I don’t think it can escape it.
Just a boorish observation: It had a golden opportunity to weave the word `shit` in there and yet it did not.
Poetry is for everyone, not just poetry connoisseurs. It's a simplified primal expression of language, taking the form of pretty soundbites & snippets, pristine, virginal, uncorrupted by prose and overthinking. Poetry is not the domain of middlebrow academics.
I need an AI that can do the dishes so I can do art.
- Someone else on twitter.
If you look at Boston Dynamics, these are some of the very best roboticists on the planet, and it's taken decades to get robots that can walk almost as well as humans. I don't think it's incompetence on Boston Dynamics' end, I think it turns out that a lot of the stuff that's trivial and instinctual for us is actually ridiculously expensive to try and do on a computer.
Washing dishes might not be the best example because dishwashers do already exist, and they work pretty well, but having a robot with anywhere near the same level of flexibility and performance as a human hand? I'm not going to say it's impossible obviously, but it seems ridiculously complex.
Who is to say that "washing dishes" (to use your example) is a less complicated problem than art, at least in regards to robotics and the like?
a tangible impact of the current generation of AI tools is they displace and drown out human creations in a flood of throwaway, meaningless garbage. this is amplifying the ongoing conversion of art into "content" that's interacted with in extremely superficial and thoughtless ways.
just because something _can_ be automated doesn't mean it _should_ be. we actively lose something when human creativity is replaced with algorithmically generated content because human creativity is as much a reflection of the state of the art as it is a reflection of the inner life of the person who engages in it. it's a way to learn about one another.
in the context of the broader discussion of "does greater efficiency everywhere actually have any benefit beyond increasing profits," the type of thing being made efficient matters. we don't need more efficient poetry, and the promise of automation and AI should be that it allows us to shrug off things that aren't fulfilling - washing dishes, cleaning the house, so on - and focus on things that are fulfilling and meaningful.
the net impact of these technologies has largely been to devalue or eliminate human beings working in creative roles, people whose work has already largely been devalued and minimized.
it's totally akin to "where's my flying car?" nobody actually cares about the flying car, the point is that as technology marches on, things seem to universally get worse and it's often unclear who the new development is benefitting.
I think your rebuttal is really bizarre. OP is simply saying what they want AI to do.
> Who is to say that "washing dishes" (to use your example) is a less complicated problem than art
I think dish washing is a bad example, because we have dishwashers. But until the market brings AI and robotic solutions to market at an affordable cost that actually fulfill most people's needs, it will continue to be a net drain on the average person.
You don't get to tell people what they want or need.
It kills me that despite how elitist I am with the music I listen to, which I have spent decades now carefully curating, there is no such thing as "good music". What we music snobs call "good music" is really just what makes us personally feel good, coupled with the ego-stroking of self-described sophistication.