PsicoPop

Can AI create art?

The current shortcomings of artificial intelligence in the field of art

by Mau Urrutia

In 1953, Roald Dahl published “The Great Automatic Grammatizator”, a story about an engineer who secretly wanted to be a writer. The day after completing the construction of the world’s fastest calculating machine, he realised that English grammar is governed by quasi-mathematical rules. So he built a fiction-writing machine capable of generating a five-thousand-word short story in thirty seconds. A novel would take only fifteen minutes and would only require a person to manipulate handles and pedals, as if driving a car or playing an organ, to regulate the level of humour and the tone of the writing. The novels proved so popular that within a year, half of all fiction published in English was the product of this invention.

Today, is there anything about art that makes us think it cannot be created at the touch of a button, as in Dahl’s imagination? At present, the fiction generated by large language models like ChatGPT is pathetic, but surely these programmes will get better soon. How good can they get? Better than humans? And what about painting or film?

What is art for you? It’s not easy to define, is it? Simplifying and generalising, we could say that art is the result of many choices. This is easiest to explain using fiction writing as an example: when we write fiction, we are, consciously or unconsciously, choosing almost every word we write. If we simplify even further, we can imagine that a short story of ten thousand words involves something on the order of ten thousand choices. When we give a prompt to a generative artificial intelligence, we are making very few choices: if we provide a prompt of a hundred words, we have made about a hundred choices.

This means that if an A.I. generates a ten-thousand-word story based on what you have told it, it must fill in all the choices that you have not made, and there are different ways to do this. One way is to average the choices that other writers have made, as represented by the text found on the Internet. This average amounts to the least interesting and most mediocre choices possible, which is why the text that the A.I. generates is often bland and dull.

Another way is to tell the programme to imitate a style, emulating the choices made by a particular writer, but this generates a very derivative story…

💡 In neither case is interesting art being created.

Now let’s look at the case of visual art. It is true that the choices a painter makes are harder to quantify, but real paintings bear the mark of many decisions. When we use a text-to-image program like Dall-E, we can give a prompt like “A knight in armour fighting a fire-breathing dragon” and let the program do the rest.

The current version of Dall-E accepts prompts of up to four thousand characters, i.e. hundreds of words, but they are not enough to describe all the details of a scene.

Most of the choices in the resulting image must be borrowed from similar paintings found online, and the image can be rendered exquisitely, yes, but the person entering the prompts cannot claim authorship.

Some imagine that image generators will eventually affect visual culture just as photography did, but we should not be too quick to accept that photography is similar to generative A.I. When photography was first developed, probably no one imagined that it would become an artistic medium, because it was not obvious that there were choices to be made: you just set up the camera and started the exposure. Eventually, people realised that there were many things that could be done with cameras, and the art lies precisely there: in the many choices a photographer makes.

Admittedly, it is not easy to see what those choices are, but when you compare the photos of an amateur with those of a professional, you can see the difference.

🧠 And hence the question: is there a similar opportunity to make numerous choices using a text-to-image generator?

Surely not. An artist, whether working digitally or with paint, implicitly makes many more decisions during the process of making a painting than would fit into a text prompt of several hundred words.

The film director Bennett Miller has used Dall-E to generate some compelling images, which have been exhibited at the Gagosian gallery. To create them, he wrote detailed text descriptions and then asked Dall-E to revise and manipulate the generated images over and over again, generating over a hundred thousand images to arrive at the twenty images in the exhibition. But he has said that he has not been able to get comparable results with subsequent releases of Dall-E, possibly because he was using Dall-E for something it is not intended to do. It is as if he had hacked Microsoft Paint to make it behave like Adobe Photoshop, and as soon as a new version of Paint was released, his hacks stopped working.

OpenAI is probably not trying to create a product to serve users like Miller because a product that requires a user to work for months to create an image does not appeal to a wide audience, i.e. the company wants to offer a product that generates images with minimal effort.

☝🏻 The same applies to a writer who wants to use A.I. to write a good novel. The problem with generative A.I. is that these programmes generate much more than what you put in, and that is precisely what prevents them from being effective tools for artists.

A.I. companies like Adobe claim that they will unleash creativity; in other words, they are saying that art can be all inspiration and no perspiration. But these two things cannot be easily separated, because art requires decisions at every scale, and the myriad small-scale choices made during implementation are just as important to the final product as the few large-scale choices made during conception. It is a mistake to equate “large scale” with “important” when it comes to the choices made in creating art; the interrelationship between the large and small scale is where the art lies!

Believing that inspiration trumps all else is a sign that someone is unfamiliar with the medium, and this is true even if one’s goal is to create entertainment rather than art. The effort required to entertain is often underestimated: a thriller novel may not live up to Kafka’s ideal of a book, an “axe to the frozen sea within us”, but it can still be as finely crafted as a Swiss watch. And an effective thriller is more than its premise or plot: it is doubtful that we could replace every sentence in a thriller with one that is semantically equivalent and end up with a novel that is just as entertaining. This means that its sentences, and the small-scale choices they represent, help determine the effectiveness of the thriller.

But what about automating writing that is not expected to involve thousands of choices? Any piece of writing that deserves your attention as a reader is the fruit of the writer’s effort. Effort during the writing process doesn’t guarantee that the final product will be worth reading, but without it, you can’t do a worthwhile job. The kind of attention you pay when reading a personal email is different from the kind you pay when reading a business report, but in both cases it is only warranted when the writer has put some thought into it.

During the Paris Olympics, Google ran an ad for Gemini, its competitor to OpenAI’s GPT-4. The ad shows a father using Gemini to compose a fan letter, which his daughter will send to an Olympic athlete who inspires her. Google pulled the ad after a widespread backlash from viewers. No one expects a child’s fan letter to an athlete to be extraordinary: if the young girl had written the letter herself, it would probably have been indistinguishable from many others. The importance of a child’s letter, both for the child who writes it and for the athlete who receives it, lies in its being sincere rather than eloquent.

How many times have we sent store-bought greeting cards, knowing that the recipient will understand that we didn’t write the words ourselves? Programmer Simon Willison has described the training of large language models as “money laundering for copyrighted data”, a useful way of thinking about the appeal of generative AI.

These programs allow us to engage in something like plagiarism, but without the associated guilt, because it is not even clear to us that we are copying.

ChatGPT: “I’m happy”

It’s straightforward to get ChatGPT to utter a series of words such as “I’m glad to see you”. There are many things we don’t understand about how large language models work, but one thing we can be sure of is that ChatGPT is not happy to see you. A dog can communicate that it is happy to see you, and so can a pre-linguistic child, although neither has the ability to use words. ChatGPT feels nothing and wants nothing, and this lack of intention is the reason why ChatGPT does not really use language. What makes the words “I’m glad to see you” a linguistic utterance is not that the sequence of text tokens that make it up is well-formed; it is the intention to communicate something.

We are tempted to project feelings and intentions onto a large language model when it emits coherent sentences, but to do so is to be fooled by mimicry. It is the same phenomenon as when butterflies develop large dark spots on their wings that can fool birds into thinking they are looking at predators with big eyes. There is a context in which dark spots are sufficient: birds are less likely to eat a butterfly that has them, and the butterfly doesn’t really care why the bird doesn’t eat it, as long as it gets to live. But there is a big difference between a butterfly and a predator that poses a threat to a bird.

Linguist Emily M. Bender points out that teachers do not ask students to write essays because the world needs more student essays: the point of writing essays is to strengthen students’ critical thinking skills. In the same way that weightlifting is useful regardless of the sport the athlete plays, essay writing develops the skills necessary for any job a university student is likely to get.

Using ChatGPT to complete assignments is like taking a forklift to the weight room: you will never improve your cognitive fitness this way.

The computer scientist François Chollet has proposed the following distinction: skill is how well you perform at a task, while intelligence is how efficiently you acquire new skills. This reflects our intuitions about human beings quite well: most people can learn a new skill with enough practice, but the faster a person acquires it, the more intelligent we think they are. What is interesting about this definition is that, unlike IQ tests, it is also applicable to non-human entities: when a dog learns a new trick quickly, we consider it a sign of intelligence.

In 2019, researchers conducted an experiment in which they taught rats to drive. They put the rats in small plastic containers fitted with three copper-wire bars. When the rats put their paws on one of these bars, the container would move forward, turn left or turn right. The rats could see a plate of food on the other side of the room and tried to steer their vehicles towards it. The researchers trained the rats for five minutes at a time, and after twenty-four practice sessions, the rats could drive. Twenty-four sessions were enough to master a task that probably no rat had ever encountered before in the evolutionary history of the species. That’s quite a display of intelligence, isn’t it?

AlphaZero is a program developed by Google’s DeepMind, and it plays chess better than any human player, but during its training it played forty-four million games, far more than any human can play in a lifetime. For it to master a new game, it will have to undergo an equally enormous amount of training. By Chollet’s definition, programs like AlphaZero are highly skilled, but they are not particularly intelligent because they are not efficient at acquiring new skills. Currently, it is impossible to write a computer program capable of learning even a simple task in only twenty-four trials if the programmer is not given information about the task beforehand.

Autonomous cars trained on millions of miles of driving can still crash into an overturned trailer truck, because such things are not commonly found in their training data, whereas humans taking their first driving lesson will know to stop. More than our ability to solve algebraic equations, our ability to cope with unfamiliar situations is a fundamental part of why we consider humans to be intelligent. Computers will not be able to replace humans until they acquire this kind of competence.

A.I. is a fundamentally dehumanising technology because it treats us as less than what we are: creators and apprehenders of meaning. It reduces the amount of intention in the world.

Some people have defended large language models by saying that most of what human beings say or write is not particularly original. This is true, but it is also irrelevant. When someone says “I’m sorry” to you, it doesn’t matter that other people have said sorry in the past; it doesn’t matter that “I’m sorry” is a string of text that is statistically unremarkable. If someone is being sincere, their apology is valuable and meaningful, even though apologies have been made before. Likewise, when you tell someone that you are glad to see them, you are saying something meaningful, even if there is no “newness” to it.

The same goes for art: whether you are creating a novel, a painting or a film, you are engaged in an act of communication between you and your audience. What you create does not have to be entirely different from all previous works of art in human history to be valuable: the fact that you are the one saying it, the fact that it derives from your unique life experience and comes at a particular moment in the life of whoever is viewing your work, is what makes it new.

We are all products of what has come before us, but it is by living our lives in interaction with others that we make sense of the world. This is something that an autocomplete algorithm can never do, and don’t let anyone tell you otherwise.

📎 Urrutia, M. [Maurici]. (2024, 03 September). Can AI create art? PsicoPop. https://www.psicopop.top/en/can-ai-create-art/

