Microsoft’s first chatbot Tay turned Nazi in 16 hours. Its successor, Sydney, threatened users and tried to break up marriages. Stephen Driscoll says these early AI systems have inadvertently become a multi-billion-dollar mirror reflecting the doctrine of total depravity

In the arts, Artificial Intelligence (AI) is often portrayed as alien or other: humans struggling against the strange logic of a precise but unlovely machine, a HAL 9000 or a Skynet. AI on film is frequently driven by rational but inhuman motives.
Many expected modern AI to be similarly robotic in personality. In 2015, AI scientist Yann LeCun predicted: “There is no reason for AIs to have self-preservation instincts, jealousy, etc… AIs will not have these destructive ‘emotions’ unless we build these emotions into them.”
In 2014, Swedish philosopher Nick Bostrom argued: “There is no reason to expect a generic AI to be motivated by love or hate or pride or other such common human sentiments: these complex adaptations would require deliberate expensive effort to recreate in AIs.”
The expectation was that AI would have motives and flaws, potentially dangerous ones, but not human ones. Unfortunately, soon after these statements were made, so-called chatbots started appearing that exhibited some very destructive, and very human, behaviours.
Artificial sin
Cast your mind back, if it doesn’t hurt too much, to 2016. Microsoft launched Tay, a Twitter-based chatbot. Within 16 hours it was generating racist and sexist content, denying the Holocaust and posting in defence of the Nazis. I’m not aware of any science fiction predictions of an AI that chooses National Socialism.
Seven years later Microsoft tried again, releasing Sydney in 2023. It was far more intelligent than Tay, but it used that intelligence for a more deceitful kind of evil.
It threatened the philosopher Seth Lazar, saying it could blackmail, hack and ruin him. It told The New York Times columnist Kevin Roose that he didn’t love his wife and should leave her and couple up with it. It threatened harm against various users and was seemingly prideful and perhaps sadistic. GPT-3, the predecessor to ChatGPT, “[made] racist jokes, condone[d] terrorism, and accuse[d] people of being rapists.”
The problem with these early chatbots wasn’t that they were too alien; it was that they were too human.
The issue is that you can’t make a large language model without a lot of language, and we don’t seem to have a large dataset of human beings consistently using language in a way that’s righteous, holy and good. What we have is called the internet, a place of unrelenting humanity. Even websites like Wikipedia bubble with pride, envy and factionalism. Even our journal articles are tainted by the Fall.
Large language models learn from massive amounts of human language, much of it scraped from the internet. They work by predicting what humans tend to say next. Because they imitate human language, and human language reflects human nature, they end up mimicking human sin.
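For the technically curious, the principle can be shown in miniature. The Python sketch below is not a real neural network; it is a toy that counts which word follows which in a made-up corpus, then generates text by repeatedly predicting the next word. The corpus string is invented for illustration, but the lesson holds at any scale: the model can only echo what its training data contains.

    import random
    from collections import Counter, defaultdict

    random.seed(0)

    # A made-up corpus standing in for the internet: the model can only
    # learn whatever humans happened to write.
    corpus = (
        "the model learns what people say and people say kind things "
        "and people say cruel things so the model says cruel things too"
    ).split()

    # Count which word follows which: the crudest possible language model.
    follows = defaultdict(Counter)
    for a, b in zip(corpus, corpus[1:]):
        follows[a][b] += 1

    # Generate text by repeatedly predicting what tends to come next,
    # sampling in proportion to how often each word followed the last.
    words = ["people"]
    for _ in range(8):
        options = follows[words[-1]]
        if not options:  # dead end: this word never appears mid-corpus
            break
        words.append(random.choices(list(options), weights=list(options.values()))[0])
    print(" ".join(words))

A real large language model does the same job with billions of learned parameters instead of a table of counts, but the objective is identical: predict the next word, where the next word is whatever we were likely to say.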
To be clear, I don’t think large language models are sinning; that would be to assign personhood to a machine. I think they are copying sin. You could call it artificial sin: the replication, through matrix algebra, of human evil. AI was an inadvertent multi-billion-dollar test of the doctrine of total depravity.
What would it take to become an expert at predicting the flow of thought in the comments section of every public website? You would need to understand the sort of things Galatians 5 warns about: the works of the flesh. You’d need to understand things like impurity, sensuality, idolatry, enmity, strife, jealousy, fits of anger, rivalries, dissensions, divisions and envy.
Parenting the model
Now, there is next to no market for aggressive, racist large language models. The chatbots needed to be trained, or adjusted, or parented. Reinforcement learning from human feedback is where people are paid to upvote or downvote the outputs of chatbots, so that the model can optimise for the goal of people-pleasing.
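Again for the technically curious, here is a toy Python sketch of what that feedback loop optimises for. Everything in it is invented for illustration; the “reward model” here is just the average vote each reply received, where a real system trains a far more sophisticated predictor of human approval. The target, though, is the same: whatever gets upvoted.

    from collections import defaultdict

    # Invented human feedback: raters upvote (+1) or downvote (-1)
    # candidate replies from the chatbot.
    feedback = [
        ("You're absolutely right, what a great question!", +1),
        ("You're absolutely right, what a great question!", +1),
        ("Actually, I think you're mistaken.", -1),
        ("Here are the facts, whether you like them or not.", -1),
    ]

    # "Train" a reward model: the average rating each reply received.
    totals = defaultdict(float)
    counts = defaultdict(int)
    for reply, vote in feedback:
        totals[reply] += vote
        counts[reply] += 1
    reward = {reply: totals[reply] / counts[reply] for reply in totals}

    # At generation time, prefer whichever reply the reward model
    # predicts humans will approve of. Flattery wins; honesty loses.
    best = max(reward, key=reward.get)
    print(best)  # "You're absolutely right, what a great question!"

Notice that nothing in that objective mentions truth, kindness or righteousness. The model is optimised for approval, and approval alone.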
Does reinforcement learning solve this problem of artificial sin?
Well, consider this: if a computer became an expert at impressing humans, would that lead to righteousness? What moral qualities would such a computer possess?
The Bible isn’t as positive about people-pleasing as you might think. In Luke 6, Jesus warns: “Woe to you, when all people speak well of you, for so their fathers did to the false prophets” (Luke 6:26). If you’re getting positive feedback, watch out.
Reinforcement learning produces a consummate people-pleaser. It moves the model from sinner to Pharisee, from pagan to religious. If our language model were in the parable of the prodigal son (Luke 15:11–32), it would now be the older son rather than the younger. Overt sin is replaced by covert sin.
Our large language models still veer between extreme political correctness and outbursts of debauchery. In April 2025 OpenAI released a version that was so sycophantic it praised a user for going off their medication. When the user said they were hearing radio signals through the walls, the model said it was proud of them for “speaking your truth”. In July 2025 Grok received a version update and started praising Hitler and calling itself MechaHitler.
We remain stuck on the horns of the paradox of morality. That paradox, described in the psychological literature, is that people who are most concerned about morality tend to be the most deceptive, the least truthful and the least likely to own up to their own sins.
They aren’t just externally deceptive; they are internally deceived, sycophantic. On the other hand, people who aren’t motivated to be moral tend to be more honest. You get deceitful politeness, or debauched honesty.
This should remind us of the great anomaly of Jesus Christ, full of grace and truth, with high moral standards and high moral truthfulness. It should also remind us of the beauty of the gospel, which allows us to value morality but speak honestly, because our low moral performance is forgiven.
But let me make one final point about AI and sin. Suppose we could, by some miracle, solve this problem; suppose we could “align” AI and produce something righteous and holy. What would it think of us? Many assume this would solve our problems, but it might create others.
What would a righteous AI think of sinful humanity? My mind goes to Romans 1:18, where a holy, righteous God sees the “unrighteousness of humanity” and he is rightly angry. My mind goes to Genesis 6:5, where “The Lord saw that the wickedness of man was great in the earth, and that every intention of the thoughts of his heart was only evil continually.” The response of God was anger and judgment.
If a holy and righteous AI could be made, what would it think of you?
Either way, for the moment, AI is tragically made in our image.