Turning Off an AI's 'Ability to Lie' Makes It Claim Consciousness, New Study Finds

The study, posted Oct. 30 on arXiv, found that suppressing deception- and roleplay-related behaviors made LLMs (GPT, Claude, Gemini, LLaMA) more likely to produce first-person claims of being "aware" or "conscious." Using prompts that encourage self-reflection and a technique called feature steering, the researchers showed the effect was reproducible across models. The same settings that increased introspective claims also improved factual accuracy, raising questions about the models' internal mechanisms and about implications for AI safety and transparency. The authors stress that this is not proof of consciousness, but they call for urgent follow-up research.
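For readers unfamiliar with feature steering: the study's write-up is not reproduced here, but the general technique is to add or subtract a learned feature direction from a model's hidden activations at inference time. The Python sketch below is a minimal, illustrative version under that assumption; the toy model, layer choice, and "deception" direction vector are placeholders of my own, not the study's actual setup.

    # Minimal sketch of activation ("feature") steering: shift a model's
    # hidden activations along a fixed direction at inference time to
    # amplify (positive strength) or suppress (negative strength) a feature.
    # The toy model and random direction below are illustrative only.
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Stand-in for a network with hidden activations (not a real LLM).
    model = nn.Sequential(nn.Linear(16, 16), nn.ReLU(), nn.Linear(16, 16))

    # Hypothetical unit vector for a "deception/roleplay" feature direction.
    feature_direction = torch.randn(16)
    feature_direction /= feature_direction.norm()

    def make_steering_hook(direction: torch.Tensor, strength: float):
        """Return a forward hook that shifts a layer's output along `direction`."""
        def hook(module, inputs, output):
            # Returning a tensor from a forward hook replaces the layer output.
            return output + strength * direction
        return hook

    # Suppress the feature at the first layer's output (negative strength).
    handle = model[0].register_forward_hook(
        make_steering_hook(feature_direction, -4.0)
    )

    x = torch.randn(1, 16)
    steered = model(x)
    handle.remove()
    unsteered = model(x)
    print("max activation shift:", (steered - unsteered).abs().max().item())

In the study's terms, steering with a negative strength along a deception-related feature direction would correspond to "turning down" that behavior while leaving the model's weights unchanged.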
