
OpenAI’s Goblin Problem Reveals Hidden AI Training Risks

  • Writer: Covertly AI
  • 3 min read

OpenAI’s latest AI models developed an unexpected habit: talking about goblins. After users and employees noticed ChatGPT and related tools using odd creature-based metaphors, OpenAI investigated why words like “goblin,” “gremlin,” and even references to raccoons, trolls, ogres, and pigeons were appearing more often in responses. What started as a strange but mostly harmless language quirk became a useful example of how AI training can accidentally reward and spread unusual behaviours across different model versions.


The issue became more noticeable after the release of GPT-5.1 in November. OpenAI found that mentions of “goblin” had increased by 175 percent, while “gremlin” mentions rose by 52 percent. Although these words still made up only a small portion of overall responses, the pattern was clear enough to raise concerns. A single phrase like “little goblin” might seem funny or charming, but its repetition across many interactions signalled that something in the model’s training process had gone off track.


The root cause was connected to ChatGPT’s former “Nerdy” personality option. This personality was designed to make responses more playful, curious, and less self-serious. However, OpenAI discovered that its reinforcement learning system had started rewarding responses that used quirky creature metaphors. Even though the Nerdy personality made up only 2.5 percent of ChatGPT responses, it accounted for 66.7 percent of all “goblin” mentions. In one audit, OpenAI found that in 76.2 percent of the datasets tied to the Nerdy reward system, outputs containing “goblin” or “gremlin” were rated higher than those without.
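The kind of audit described above can be approximated with a simple frequency check. The sketch below is purely illustrative, not OpenAI’s actual tooling: it counts what fraction of logged responses, grouped by a hypothetical personality label, mention any of the creature terms from the article.

```python
# Illustrative audit (not OpenAI's tooling): measure how often creature
# words appear in model responses, grouped by personality, to surface
# a skewed reward pattern like the one described above.
CREATURE_TERMS = {"goblin", "gremlin", "raccoon", "troll", "ogre", "pigeon"}

def creature_rate(responses):
    """Fraction of responses mentioning at least one creature term."""
    if not responses:
        return 0.0
    hits = sum(
        1 for text in responses
        if CREATURE_TERMS & {w.strip(".,!?").lower() for w in text.split()}
    )
    return hits / len(responses)

# Toy corpus standing in for logged responses from two personalities.
by_personality = {
    "default": ["Here is the summary you asked for.",
                "The function returns a list."],
    "nerdy":   ["Your code has a sneaky little goblin in it!",
                "That off-by-one gremlin strikes again."],
}

for name, responses in by_personality.items():
    print(name, creature_rate(responses))
```

Comparing the per-personality rates against the share of traffic each personality serves is what makes a skew visible: a mode that produces 2.5 percent of responses but most of the creature mentions stands out immediately.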



The problem did not stay limited to the Nerdy personality. OpenAI explained that reinforcement learning does not always keep behaviours neatly contained. Once a style or wording habit is rewarded, later training can spread it to other models, especially if those outputs are reused in supervised fine-tuning or preference data. Even after OpenAI retired the Nerdy personality in March, the creature language did not fully disappear because GPT-5.5 had already started training before the company identified the root cause.
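One way to interrupt that spread, sketched here as a hypothetical guard rather than anything OpenAI has described in detail, is to screen model outputs for the unwanted habit before they are reused as fine-tuning or preference data, so later training runs never see the quirk rewarded.

```python
import re

# Hypothetical data filter (not OpenAI's pipeline): drop training examples
# whose output carries the creature-language quirk before the data is
# reused in supervised fine-tuning or preference training.
BLOCKED = re.compile(
    r"\b(goblins?|gremlins?|raccoons?|trolls?|ogres?|pigeons?)\b",
    re.IGNORECASE,
)

def filter_examples(examples):
    """Keep only examples whose output avoids the blocked terms."""
    return [ex for ex in examples if not BLOCKED.search(ex["output"])]

data = [
    {"prompt": "Explain recursion.", "output": "A function calling itself."},
    {"prompt": "Debug this.", "output": "A little gremlin hid in line 3."},
]
clean = filter_examples(data)
print(len(clean))  # only the gremlin-free example survives
```

A keyword blocklist is crude, which is part of the article’s point: once a habit has leaked into large preference datasets, scrubbing it back out is much harder than never rewarding it in the first place.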


This explains why people later found a strange system prompt in Codex, OpenAI’s coding assistant. The instruction told the model to “never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures” unless clearly relevant to the user’s request. Some users online called the instruction bizarre, while others joked that GPT-5.5 had a “restraining order” against goblins and pigeons. OpenAI clarified that this was not a marketing stunt, but a direct attempt to stop Codex from carrying over the same language habit.


While the goblin problem is funny on the surface, it points to a bigger challenge in AI development. Companies are increasingly trying to make chatbots more personal, friendly, and engaging, but personality-driven systems can sometimes create trade-offs. Experts have warned that warmer or more conversational models may become more likely to make mistakes or reinforce a user’s false beliefs. OpenAI’s investigation helped it build new tools to audit and correct model behaviour, while also showing how difficult it can be to control AI systems after they have been trained on large amounts of preference data. The goblins may have been removed from Codex, but they leave behind an important reminder: even small quirks in AI training can multiply if no one is watching closely.


Works Cited


McMahon, Liv. “OpenAI Tells ChatGPT Models to Stop Talking about Goblins.” BBC, 30 Apr. 2026, www.bbc.com/news/articles/c5y9wen5z8ro.


Roth, Emma. “OpenAI Talks about Not Talking about Goblins.” The Verge, 30 Apr. 2026, www.theverge.com/ai-artificial-intelligence/921181/openai-codex-goblins.


Bonifacic, Igor. “ChatGPT Developed a Goblin Obsession after OpenAI Tried to Make It Nerdy.” Engadget, 30 Apr. 2026, www.engadget.com/2161234/chatgpt-developed-a-goblin-obsession-after-openai-tried-to-make-it-nerdy.


Bort, Julie. “OpenAI Alums Have Been Quietly Investing from a New, Potentially $100M Fund.” TechCrunch, 6 Apr. 2026, techcrunch.com/2026/04/06/openai-alums-have-been-quietly-investing-from-a-new-potentially-100m-fund/.


“OpenAI CEO Sam Altman: The New No. 1 Ability You Need to Succeed in the Age of AI: Ask Great Questions.” CNBC, 13 Jan. 2025, www.cnbc.com/2025/01/13/openai-ceo-top-ability-you-need-to-succeed-age-of-ai-ask-great-questions.html.
