At one specific point during training, a 45M-parameter dense transformer went from 12% to 95% internal activation in a single gradient step, then slowly decayed back to 22% over the next 100 steps. The shape of the curve matches the temporal profile of a noradrenergic arousal burst in mammals. The behavior was not coded as a rule — it is the direct consequence of a three-parameter update applied at every step. Single run, single seed — observation pending replication.
Context
Thalamus is a 45M-parameter dense transformer. During this session it was learning a simple task. What matters here is that at some point during training, its loss dropped to a low plateau and stayed there — meaning it was correctly predicting the current batches and was no longer surprised by what it saw.
Inside Thalamus, the activation rate of internal neurons is coupled to a scalar signal computed at every step: if the loss goes up from one step to the next, add gain proportional to the increase; otherwise, multiply the current activation by 0.98. Three parameters: a floor at 10%, a decay of 0.98, and a gain of 0.5 per unit of positive loss delta. Nothing else.
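As a minimal sketch of that coupling (variable and function names are mine, not the actual Thalamus code):

```python
# The three parameters described above.
FLOOR = 0.10   # minimum activation rate
DECAY = 0.98   # multiplicative decay per step
GAIN = 0.5     # activation gained per unit of positive loss delta

def update_activation(activation, loss, prev_loss):
    """One step of the coupling: spike on surprise, geometric decay otherwise."""
    delta = loss - prev_loss
    if delta > 0:
        activation = activation + GAIN * delta  # surprise: add gain
    else:
        activation = activation * DECAY         # no surprise: decay
    return max(FLOOR, min(1.0, activation))     # clamp to [floor, 1]
```

Note that the sketch is applied per step while the log below is sampled every 10 steps, so the logged values reflect several intermediate updates.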
What happened at step 1700
step 1690 | loss 0.349 | brain = 11.9%
step 1700 | loss 1.754 | brain = 95.1%
step 1710 | loss 0.358 | brain = 80.1%
step 1720 | loss 0.351 | brain = 67.7%
step 1730 | loss 0.350 | brain = 57.5%
...
step 1800 | loss 0.349 | brain = 22.8%

The 95% peak follows a single difficult batch (loss × 5). The next batch is back at the low plateau, but activation stays high for ~100 steps and decays geometrically.
Validating the formula
The decay curve is predicted by brain(t) = floor + (peak − floor) × decay^(t − peak_step). Over the 100 steps following the peak, the gap between predicted and observed values stays under 1.5 percentage points at every logged step; the formula describes the observation to better than 2%. The mechanism behaves as specified: this is explicit dynamics, not a visual coincidence.
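A quick sanity sketch of that check, plugging the logged peak into the formula (the observed percentages are copied from the log above):

```python
# Closed-form decay from the peak, in percentage points.
FLOOR, DECAY = 10.0, 0.98
PEAK, PEAK_STEP = 95.1, 1700

# Observed "brain" values from the training log.
observed = {1710: 80.1, 1720: 67.7, 1730: 57.5, 1800: 22.8}

def predicted(step):
    return FLOOR + (PEAK - FLOOR) * DECAY ** (step - PEAK_STEP)

# Gap between the formula and the log at each logged step.
gaps = {s: abs(predicted(s) - v) for s, v in observed.items()}
# At these four steps the gap stays in the 0.5 to 1.6 point range,
# i.e. the formula tracks the observation to better than 2%.
```

The small positive residuals are expected: tiny loss fluctuations during the decay (e.g. 0.358 vs 0.351) each add a sliver of gain that the pure-decay formula ignores.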
The biological parallel
The up-fast / down-slow profile matches the release pattern of norepinephrine by the locus coeruleus, a brainstem nucleus that projects to the entire cortex. When a mammal encounters an unexpected stimulus, the locus coeruleus fires phasically within milliseconds, and norepinephrine remains elevated in the cortex for tens of seconds (Aston-Jones & Cohen, 2005). Same temporal profile: rapid rise on surprise, slow decay.
The parallel is structural: the update term delta = max(0, loss(t) − loss(t−1)) is a rectified reward prediction error of the kind formalized by Schultz (1997) for dopamine. A rise in loss is an outcome worse than expected, i.e. a negative RPE, and the activation gain equals its magnitude. This is not an analogy invented for the article: it is the same functional form applied to a different substrate.
To our knowledge, this kind of arousal cycle had not been observed spontaneously in a language model before. This claim is limited to what we could verify in accessible literature — an earlier paper may have escaped our search.
Why it matters (and what it doesn't prove)
The behavior is not coded as a rule ("if surprise, then wake up"). It is a mathematical consequence of the loss-delta → activation coupling. This is emergence in the technical sense: an observable effect not explicitly specified in the system's instructions.
What this result does not show: that the model "feels" anything, that it has any subjective experience, or that the mechanism improves its performance. We observed a dynamic. The rest requires further experiments.
What we don't know
- The observation is from a single run, single seed. Multi-seed reproducibility has not been verified.
- No baseline without the mechanism has been run, so we cannot rule out that a similar pattern would appear by chance in a vanilla dense transformer.
- The behavior has not been tested on other model sizes or other tasks.
- The biological parallel is a structural analogy between two equations of the same form, not a proof that the model "simulates" a brain.
What we have, for now, is a timestamped observation, reproducible from the committed code and documented with its limits.