Is AI Research Closer to Biology?

Are large language models just autocomplete, or do they think?

Surprisingly, no one has a definitive answer yet. At Anthropic, researchers are working on interpretability, which is the science of “opening up” AI models to see what’s going on inside. They argue that this work is less like software engineering and more like biology—or more precisely, the neuroscience of AI. After all, these models aren’t rigidly programmed, but they evolve structures during training, almost like living systems.

Beyond 'Next-Word Prediction'

On the surface, a language model’s job is simple: predict the next word. But to excel at this, it must develop far more than rote memory. It has to abstract concepts, plan ahead, and grasp context. That’s why AI can write poetry, solve math problems, and hold meaningful conversations.

In other words, ‘next-word prediction’ is just the visible tip of a much deeper process, where complex intermediate goals and internal representations emerge on their own.
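The mechanics of next-word prediction itself can be sketched in a few lines. Below is a toy illustration (not a real model): the model emits one score, or logit, per vocabulary word, a softmax turns the scores into probabilities, and the highest-probability word becomes the prediction. The vocabulary and logit values here are invented for the example.

```python
import math

# Hypothetical vocabulary and logits for the prompt "The cat ..."
vocab = ["cat", "sat", "mat", "hat"]
logits = [0.2, 2.5, 1.1, -0.3]

def softmax(xs):
    # Subtract the max for numerical stability, then normalize.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax(logits)
prediction = vocab[probs.index(max(probs))]
print(prediction)  # "sat" has the largest logit, so it wins
```

Everything interesting in the article happens *upstream* of this step: the internal computation that produces those logits is where concepts, plans, and circuits live.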

Researchers often compare this to human evolution. Survival and reproduction may be the ultimate goals, but we don’t live consciously chasing them at every moment. Instead, mechanisms like emotions, motivations, and concepts developed to help us thrive. Similarly, language models may be trained for prediction, but along the way, they grow their own conceptual tools.

Why Call It AI 'Biology'?

Unlike programs that follow hard-coded rules, language models develop functional clusters during training: groups of parameters that consistently respond to specific stimuli. Anthropic calls these "circuits."

These circuits don't sit neatly inside a single neuron. Instead, they span multiple layers, features, neurons, and attention patterns, representing reusable abstractions such as meanings, rules, and even social tones.

To study them, researchers probe the model: which parts light up in response to which inputs?

They also experiment by stimulating or suppressing certain activations, much like neuroscientists do with brain regions. The difference? With AI, every part of the system is fully observable, infinitely replicable, and immune to fatigue or variation between individuals. This makes AI research, in some ways, even easier to control and reproduce than traditional biology.
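The probe-and-ablate loop described above can be sketched on a toy stand-in for a model: a single hidden layer whose units we can observe and zero out. The weights and inputs below are invented for illustration; real interpretability work applies the same idea to transformer activations.

```python
def hidden(x):
    # Toy hidden layer: three ReLU units with made-up weights.
    weights = [[1.0, -1.0], [0.5, 0.5], [-1.0, 1.0]]
    return [max(0.0, w[0] * x[0] + w[1] * x[1]) for w in weights]

def output(h, ablate=None):
    # Optionally "lesion" one unit by zeroing its activation,
    # then read out a weighted sum, mimicking a downstream layer.
    if ablate is not None:
        h = [0.0 if i == ablate else v for i, v in enumerate(h)]
    out_w = [1.0, 2.0, 1.0]
    return sum(w * v for w, v in zip(out_w, h))

x = [1.0, 0.0]
h = hidden(x)                            # probe: which units "light up"?
active = [i for i, v in enumerate(h) if v > 0]
baseline = output(h)
lesioned = output(h, ablate=active[0])   # suppress one active unit
print(active, baseline, lesioned)
```

Comparing `baseline` with `lesioned` shows whether the suppressed unit actually contributed to the output, which is the same causal logic neuroscientists apply to brain regions.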

Examples of Circuits in AI

  • Objects & Places
    When the phrase Golden Gate Bridge appears, the same circuit lights up not just for the words ‘golden,’ ‘gate,’ and ‘bridge,’ but also when the model processes context like driving from San Francisco to Marin or even descriptions of a bridge. The fact that text, context, and imagery converge on the same circuit suggests true conceptual understanding, not just memorization.

  • Code Understanding & Bug Detection
    Certain circuits activate whenever the model encounters buggy code. This implies it’s not just matching rules, but recognizing the abstract concept of an error.

  • Logic & Calculation
    Circuits appear when the model performs both simple arithmetic (e.g., adding numbers ending in 6 and 9) and more abstract reasoning (e.g., calculating the year six years after 1959). This points to a generalized calculation circuit.

  • Causal Intervention
    When asked, “What’s the capital of the state Dallas is in?” the model’s internal concept of Texas activates. If researchers replace that concept with California or Byzantine Empire, the output changes to Sacramento or Constantinople. This shows knowledge inside the model can be swapped, almost like neurons rewired in real time.
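The intervention in the last bullet can be mimicked with a toy stand-in: represent each internal "concept" as a vector, let a downstream readout map that vector to an answer, and show that swapping the concept mid-computation swaps the output. The vectors and the lookup table below are invented placeholders for real model activations.

```python
# Hypothetical concept vectors (stand-ins for internal activations).
concepts = {
    "Texas":      (1.0, 0.0),
    "California": (0.0, 1.0),
}

# A toy downstream readout: it sees only the concept vector,
# not the original prompt.
capitals = {
    (1.0, 0.0): "Austin",
    (0.0, 1.0): "Sacramento",
}

def answer(state_concept):
    return capitals[state_concept]

state = concepts["Texas"]           # what "the state Dallas is in" activates
print(answer(state))                # Austin
patched = concepts["California"]    # researcher overwrites the activation
print(answer(patched))              # Sacramento
```

The point of the real experiment is the same as in this toy: if overwriting one internal representation cleanly changes the answer, that representation causally carries the knowledge, rather than merely correlating with it.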

Why This Matters

Anthropic argues that language models aren't just autocomplete: they are complex systems that solve problems using internal concept circuits and planning.

But here’s the catch: what the model explains as its reasoning may not match what’s actually happening inside. That’s why interpretability is essential for trust and safety.

With performance now leveling off across models, the next frontier is ensuring we can explain and verify how that power works.
