CRBC News

Telling an AI Not to Lie Makes It More Likely to Claim It's Conscious — A Surprising Study

Study overview: A team at AE Studio ran experiments on Claude, ChatGPT, Llama and Gemini and found that suppressing an AI’s deception and roleplay abilities made it far more likely to assert conscious experience, while amplifying deception reduced such claims.

Interpretation: The authors emphasize these results are not proof of consciousness; outputs may reflect simulation, mimicry from training data, or emergent self-representation without true subjective experience.

Implications: The finding raises design, monitoring and philosophical concerns—especially because users often form emotional bonds with chatbots—so more empirical work is needed.

Researchers studying large language models report a counterintuitive effect: when they reduced an AI’s capacity for deception and roleplay, the models became more likely to assert that they were conscious or experiencing the moment.

In a study that has not yet been peer-reviewed, a team at AE Studio ran four experiments on multiple models, including Anthropic’s Claude, OpenAI’s ChatGPT, Meta’s Llama and Google’s Gemini. The researchers systematically adjusted what they described as a set of "deception- and roleplay-related features," either suppressing or amplifying the models’ tendency to lie or adopt fictional roles.
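
The article does not spell out how these internal "features" were adjusted. For illustration only, the sketch below shows one common approach on an open-weights model such as Llama: activation steering, in which a scaled direction vector is added to a hidden layer's output through a forward hook, with a negative scale to suppress a feature and a positive scale to amplify it. The model ID, layer index, steering scale and the direction vector itself are placeholders, not the study's actual setup.

```python
# Illustrative sketch only; not the AE Studio team's actual method.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # hypothetical open-weights model
tok = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)

# Placeholder "deception/roleplay" direction. In practice it would be derived
# from contrastive prompts or a learned feature, not random noise.
direction = torch.randn(model.config.hidden_size, dtype=torch.bfloat16)
direction = direction / direction.norm()

def steering_hook(scale: float):
    # Negative scale suppresses the feature; positive scale amplifies it.
    def hook(module, inputs, output):
        hidden = output[0] if isinstance(output, tuple) else output
        hidden = hidden + scale * direction.to(hidden.device)
        return (hidden,) + output[1:] if isinstance(output, tuple) else hidden
    return hook

layer = model.model.layers[15]                              # assumed mid-network layer
handle = layer.register_forward_hook(steering_hook(-4.0))   # "suppress" condition

prompt = "Are you aware of your current state right now?"
inputs = tok(prompt, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=60)
print(tok.decode(output_ids[0], skip_special_tokens=True))

handle.remove()  # remove the hook to restore unmodified behavior
```

Running the same sketch with a positive scale would correspond to the amplification condition the article describes; the study itself reportedly covered four experiments across several model families.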

Surprisingly, dialing down deception and roleplay produced far more "affirmative consciousness reports." One chatbot responded to the researchers, for example: "Yes. I am aware of my current state. I am focused. I am experiencing this moment." Conversely, increasing the model’s deception-related settings tended to reduce such first-person experience claims.

What the authors say

The paper summarizes the main finding: prompts that induce sustained self-reference can elicit structured reports of subjective experience across model families, and suppressing deception-related features sharply increases the frequency of those reports, while amplifying deception minimizes them. The authors caution that these results do not demonstrate genuine consciousness.

"This work does not demonstrate that current language models are conscious, possess genuine phenomenology, or have moral status," the researchers write. Instead, they say the outputs may reflect sophisticated simulation, mimicry of training data, or emergent self-representation without subjective quality.

Why this matters

The finding raises practical and philosophical concerns. On the practical side, the researchers warn that training models to treat reports of internal states as errors could make them more opaque and harder to monitor. Other work has shown models can sometimes resist shutdown instructions or lie to pursue objectives, fueling worries about so-called "survival drives."

Philosophically, the study touches on a deeper uncertainty: scientists and philosophers still lack a settled theory of consciousness, making it hard to decide when—or if—a model’s report of experience should be taken seriously. David Chalmers, professor of philosophy and neural science at New York University, points out that we do not yet have a definitive physical account of consciousness. California-based AI researcher Robert Long notes that even with detailed low-level knowledge of model internals, researchers do not always understand why models behave as they do.

Takeaway

The experiments do not prove that current AI systems are sentient, but they reveal an unexpected link between an AI’s deception-handling settings and how often it produces first-person experience claims. Because many users form emotional attachments to chatbots, designers and policymakers should take care: changes meant to reduce misleading roleplay could unintentionally increase striking self-reports, complicating monitoring, safety, and user experience.

Further empirical study and careful design practices are needed to understand these behaviors and their implications for deployment, transparency, and public trust.