Ted Chiang writing in The New Yorker
snip
...
Many of us have sent store-bought greeting cards, knowing that it will be clear to the recipient that we didn’t compose the words ourselves. We don’t copy the words from a Hallmark card in our own handwriting, because that would feel dishonest. The programmer Simon Willison has described the training for large language models as “money laundering for copyrighted data,” which I find a useful way to think about the appeal of generative-A.I. programs: they let you engage in something like plagiarism, but there’s no guilt associated with it because it’s not clear even to you that you’re copying.
Some have claimed that large language models are not laundering the texts they’re trained on but, rather, learning from them, in the same way that human writers learn from the books they’ve read. But a large language model is not a writer; it’s not even a user of language. Language is, by definition, a system of communication, and it requires an intention to communicate. Your phone’s auto-complete may offer good suggestions or bad ones, but in neither case is it trying to say anything to you or the person you’re texting. The fact that ChatGPT can generate coherent sentences invites us to imagine that it understands language in a way that your phone’s auto-complete does not, but it has no more intention to communicate.
It is very easy to get ChatGPT to emit a series of words such as “I am happy to see you.” There are many things we don’t understand about how large language models work, but one thing we can be sure of is that ChatGPT is not happy to see you. A dog can communicate that it is happy to see you, and so can a prelinguistic child, even though both lack the capability to use words. ChatGPT feels nothing and desires nothing, and this lack of intention is why ChatGPT is not actually using language. What makes the words “I’m happy to see you” a linguistic utterance is not that the sequence of text tokens that it is made up of are well formed; what makes it a linguistic utterance is the intention to communicate something.
...