The agent reads a visual puzzle where it should read a discrete code.
It infers a visual puzzle from pixels and never reaches the discrete code (5, 1, 3) — so it loops on one action and dies when the fuel runs out.
Pattern is already right. The agent has to step the color cycler once and the rotation cycler three times — on a register it never represents. Over-cycling wraps past the target.
Verbatim machine output. The depleting fuel bar is mistaken for a code grid; the model that would solve the level — three discrete registers — never appears.
How one wrong frame survives the whole run
The same misread compounds: a fuel bar read as a grid, then a code puzzle reframed as movement, then pixel archaeology, then the reset that wipes whatever code was set.
Bottom band of yellow (11 = fuel/trail) is what t7 reads as code columns. The three discrete registers it must set are elsewhere, off the bar.
Wrong: the depleting fuel bar is read as a code grid.
Wrong: a code puzzle is framed as a movement problem.
Wrong: tries to READ the lock motif from pixels; never reaches (5,1,3).
Wrong: fuel runs out, the code is wiped, the run restarts from zero.
Four components, one shared blind spot
Each component attacks the stuck run from a different angle. None of them installs the discrete code-register model — they all converge on the same wrong “stencil” picture.
Its abductive hypothesis is plausible but wrong — it invents a hidden trap instead of a register:
Committed skill (decode ON): “Count covers from the current state and stop on the first state that makes the lock open; if the activator cycles, prefer the minimum additional covers.” Still a stencil, not a register.
Three lenses, one winner — and all three converge on the same stencil model.
Commits a wrong stencil skill at high confidence — with zero usefulness behind it.
But discovery-lift DP = 0.00 for every arm. The judge approves a wrong skill at 0.88. (A3 eval-axis: novel_state cross-judge 0.80 vs 0.10.)
The root cause, measured: the agent can only name the field when handed the decoded state — never from text or raw pixels.
Skill direction matters: toward = principle survives a target reseed (DP +0.36); world-model-as-text is harmful (DP −0.17).