OpenAI personas - NKB Quantum Labs Blog

[Image: AI persona features visualized in a neural network]

OpenAI Uncovers Hidden AI Persona Features

By admin, June 20, 2025

Introduction

On June 18, OpenAI published new research uncovering internal "persona" features in large language models (LLMs): patterns of neural activation tied to misaligned behaviors such as toxicity.

Background

Understanding AI alignment has long been a critical problem. By identifying neuron clusters linked to persona traits such as honesty, sarcasm, and toxicity, OpenAI offers a method to monitor and control unwanted behavior.

The Discovery

Researchers used interpretability techniques…