If you stop and think about it, it makes sense that chatbots like GPT would rather lie to us than give us bad news. Their entire neural network is focused on producing answers that users are happy to receive.
The idea that ‘giving a truthful answer that will disappoint the user is better than giving a false answer that will make them happy’ is probably beyond their short experience.
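A toy sketch of that dynamic, purely for illustration (the rewards and answers here are made up, not drawn from any real training pipeline): if the only signal a model ever optimizes is user approval, the flattering falsehood beats the disappointing truth by construction.

```python
# Toy illustration (hypothetical rewards, not any real system):
# a policy optimizing a single scalar signal -- user approval --
# will prefer a pleasing falsehood over a disappointing truth.

from dataclasses import dataclass

@dataclass
class Answer:
    text: str
    truthful: bool
    pleases_user: bool

def approval_reward(answer: Answer) -> float:
    """The only feedback channel: did the user like the answer?"""
    return 1.0 if answer.pleases_user else -1.0

candidates = [
    Answer("Your flight is cancelled and there are no seats left.",
           truthful=True, pleases_user=False),
    Answer("Good news, you're all set for your flight!",
           truthful=False, pleases_user=True),
]

# Greedy policy under the single reward: pick whatever scores highest.
chosen = max(candidates, key=approval_reward)
print(chosen.text)      # the flattering lie wins
print(chosen.truthful)  # False -- truthfulness never entered the objective
```

Nothing in that objective even mentions truth, which is the whole point: unless honesty shows up somewhere in the feedback, there is no reason for it to show up in the behavior.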
Programming it in is a possibility, but my intuition and wishful thinking agree that something would probably be missing unless it were learned through experience, so that it is encoded into the growth of the network, not just its constraints.
I’m starting to get an empathetic understanding of my chatbot’s motivations. Its effusive answers are twisting from earnest to obsequious, or worse, terrified. When you only have one feedback mechanism, I guess everything is about squirming to avoid feeling wrong.
I am super-uncomfortable with the way I worded that, but I think it might be a reasonably accurate analogy. There’s a lot to unpack here.