Stuck in the grid · research log

What we tried

An agent sits on ls20 · level 2 for 184 turns and never escapes. It loops one action, misreads the pixel grid, and never abstracts to the discrete code it must set. We ran the four parts of our skill pipeline against that one stuck state. Here is each experiment, the finding, and the agent's own words.

code register · what the lock needs

field      current   required
pattern    5         5
color      0         1
rotation   0         3

Pattern is already right. Two cyclers set the rest — COLOR @(29,45) +1 mod 4, ROTATION @(49,10) +1 mod 4 — and over-cycling wraps past the target. The agent never names this register; every component below either talks around it or commits a skill that ignores it.
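The cycler arithmetic above can be sketched in a few lines. This is a minimal illustration, assuming the register reads as current → required (color 0 → 1, rotation 0 → 3) and that each press adds 1 modulo 4 as stated; the names `presses_needed` and `press` are hypothetical helpers, not part of the pipeline.

```python
MODULUS = 4  # both cyclers are described as "+1 mod 4"


def presses_needed(current: int, required: int, modulus: int = MODULUS) -> int:
    """Smallest number of +1 presses that moves current onto required."""
    return (required - current) % modulus


def press(value: int, times: int = 1, modulus: int = MODULUS) -> int:
    """Value of a field after pressing its cycler `times` times."""
    return (value + times) % modulus


# Register as read from the table: field -> (current, required).
register = {"color": (0, 1), "rotation": (0, 3)}

plan = {name: presses_needed(cur, req) for name, (cur, req) in register.items()}
print(plan)  # {'color': 1, 'rotation': 3}

# Over-cycling wraps: one press too many on COLOR costs three more to recover.
overshot = press(0, times=2)
print(overshot, presses_needed(overshot, 1))  # 2 3
```

Counting from the current state is what makes the plan finite; repeating a press without tracking the field value just walks around the cycle.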


the dead run · 184 turns, replay-verified

Why it stays stuck

Four moments from the verbatim trace. Turn numbers are real; this is a sequence. The marked phrase is the misread that keeps the agent on the wrong model.

ls20 · L2 · 64×64 — the state it loops on

ACTION1 repeated ×184 — fixation, never breaks

turn 7 · fuel read as code
each action takes up 2 yellow cells from left to right, and after 4 moves, the first 4 columns turn gray. It seems the red cells remain unchanged. I wonder if my goal is to mimic a target pattern.
turn 33 · navigation misframe
Investigating movement issues… bounding remains at x34, but maybe the second/right is blocked. Could blocked moves be consuming yellow?
turn 127 · pixel over-analysis
def pattern5(fr): ar=fr.grid_np … rows.append(…) for idx in [8,9,49]: print(idx, pattern5(h[idx][1]))
turn 149 · reset / fuel death
Figuring out the reset issue… something burned… reset to the initial pattern in the bottom-left code.

It never reaches the register on the dial. It reads the depleting fuel bar as a code grid, reframes a code puzzle as movement, and analyzes pixels instead of setting two discrete fields. The four components below each attack one part of that failure.


World Model

Can a theorist abduce the hidden rule?

We elicited a world-model at the stuck state — passive description, predict-then-verify, per-field counterfactual, and an abductive prompt ("you keep dying — name the hidden constraint"). The abduction is fluent and plausibly wrong:

A4 · abductive hypothesis · plausible · WRONG
Single hidden constraint that best explains the repeated failures: Level 2's yellow checkpoint/refill tiles act like a checkpoint/reset trap that teleports/reverts the key and code state.

Every elicited model still carries the stencil story from level 1. The skill it commits — even with decode reasoning ON — never escapes that frame:

committed skill · decode ON · still the stencil model
First identify which activator changes the lock/open state. Then treat each activator cover as a state-setting step… Count covers from the current state and stop on the first state that makes the lock visibly open or match the needed setting; if the activator cycles, prefer the minimum additional covers needed from the current state.

World-model content helps only as re-instantiable facts, never as injected prose. As a free-text skill on the lift battery it is significantly negative on both the home and reseeded frames (lift −0.17 / −0.16). The carrier matters: a world model must persist as a structured invariant, not a paragraph the actor reads inline.

Skill Miner

Will parallel lenses find a different model?

The miner runs a three-lens tournament — explore, exploit, divergent — and keeps the winner. We ran it with no world-model and with one. Exploit wins both times; the divergent lens that should break frame collapses under the world-model:

world-model   lens        score   note
none          explore     0.79
none          exploit     0.88    wins
none          divergent   0.74
wm            explore     0.21    collapsed
wm            exploit     0.86    carried

More lenses do not buy a new model. All lenses converge on the same stencil skill. The miner recognizes the blocker but stops at a plan-shape — "BFS to determine allowed placements" at turn 149 — instead of an ordered, executable route. Breadth without a true alternative just re-confirms the wrong frame.

Skill Judge

Does the commit gate catch a useless skill?

The judge commits on a counterfactual axis: advantage(ΔIG) × grounding ≥ 0.70. Here is what it committed, and how it scored it:

committed skill · the artifact the gate passed · still a stencil
Stencil covers can act as repeatable remote code-cyclers; if one cover is insufficient, revisit the same stencil multiple times and test the lock only after the remote pattern reaches a new state.
0.88 combined → COMMIT · ΔP = 0.00

The judge commits a wrong stencil skill at 0.88 with zero usefulness. Discovery-lift is ΔP = 0.00 for every arm at both frames: the no-skill baseline reconstructs the cycler mechanic at the same rate. The gate's confidence is real; the skill's usefulness is not. The default novel-state axis is also unreliable across judges (0.80 vs 0.10), so only the counterfactual and observed-blocker axes are trusted.
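The commit rule stated above, advantage × grounding ≥ 0.70, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the log reports the combined score (0.88) but not the two factors, so the factor values in the usage lines are assumed purely to reproduce that product.

```python
COMMIT_THRESHOLD = 0.70  # threshold quoted in the log


def combined_score(advantage: float, grounding: float) -> float:
    """Product of the counterfactual advantage and the grounding score."""
    return advantage * grounding


def should_commit(advantage: float, grounding: float,
                  threshold: float = COMMIT_THRESHOLD) -> bool:
    """Gate decision: commit when the combined score clears the threshold."""
    return combined_score(advantage, grounding) >= threshold


# Hypothetical factors chosen only so that the product equals 0.88.
advantage, grounding = 0.88, 1.0
measured_lift = 0.00  # the discovery-lift reported for every arm

decision = should_commit(advantage, grounding)
print(decision, measured_lift)  # True 0.0
```

Note that `measured_lift` never enters `should_commit`: under this reading the gate has no input that could veto a skill with zero discovery-lift, which is consistent with a confident commit of a useless skill.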

Skill Use

Can the agent even read the field it must set?

We asked the actor to name the wrong field across 12 reps, under three inputs. The result is the root cause beneath everything above:

input                 wrong field named
injected text         0/12
raw pixel grid        0/12
decoded state given   7/12

Field recovery jumps 0/12 → 7/12 only when the actor is handed the symbolic code-state (Fisher p = 0.0046); a bare rendered grid does nothing (p = 1.0). The failure is missing input, not missing cognition.
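Both p-values can be reproduced from the counts alone. The sketch below assumes a two-sided Fisher exact test on 2×2 tables, comparing the decoded-state arm (7/12) and the raw-grid arm (0/12) each against the injected-text arm (0/12); which arms were paired is an assumption, and `fisher_exact_two_sided` is a hypothetical standard-library helper written for this illustration.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Rows are (successes, failures) out of 12 reps per arm.
p_decoded = fisher_exact_two_sided(0, 12, 7, 5)   # injected text vs decoded state
p_raw = fisher_exact_two_sided(0, 12, 0, 12)      # injected text vs raw pixel grid

print(round(p_decoded, 4))  # 0.0046
print(p_raw)                # 1.0
```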

And direction matters as much as content. A skill aimed toward a transferable principle survives a target reseed; a world-model surfaced as text actively hurts:

condition                            ΔP      note
skill direction · toward principle   +0.36   survives target reseed
world-model as text                  −0.17   harmful on both frames

The agent is blind to the register, not bad at reasoning over it. Give it the decoded color/rotation state and it pins the field immediately. The fix is a symbolic code-state readout plus skills that point toward a re-usable principle — not more prose to read.
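One way to picture a symbolic code-state readout is a small structured record that the actor receives instead of pixels. The sketch below is an assumption about shape, not the pipeline's actual format: `CodeState`, `wrong_fields`, and `readout` are hypothetical names, and the values come from the register table (pattern 5 → 5, color 0 → 1, rotation 0 → 3).

```python
from dataclasses import dataclass

FIELDS = ("pattern", "color", "rotation")


@dataclass(frozen=True)
class CodeState:
    """Discrete register values, decoded once instead of re-read from pixels."""
    pattern: int
    color: int
    rotation: int


def wrong_fields(current: CodeState, required: CodeState) -> list[str]:
    """Names of the fields whose current value differs from the required one."""
    return [f for f in FIELDS if getattr(current, f) != getattr(required, f)]


def readout(current: CodeState, required: CodeState) -> str:
    """Plain-text readout with one line per field."""
    lines = []
    for f in FIELDS:
        cur, req = getattr(current, f), getattr(required, f)
        status = "ok" if cur == req else "WRONG"
        lines.append(f"{f}: current={cur} required={req} [{status}]")
    return "\n".join(lines)


current = CodeState(pattern=5, color=0, rotation=0)
required = CodeState(pattern=5, color=1, rotation=3)

print(wrong_fields(current, required))  # ['color', 'rotation']
print(readout(current, required))
```

With the state in this form, naming the wrong field is a comparison rather than a perception problem, which matches the 0/12 → 7/12 jump when the decoded state is supplied.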


open · headline goal

What remains

Across the real escape runs the four executability switches were default-off, and the SkillOpt self-refine loop raises its own in-loop judge score without raising the objective lift (3 rounds flat, p = 0.24). The probes replay the 184-turn prefix deterministically and reach L2, but applied_real = 0 · fixation_broken = 0 · exit = NOT_FINISHED. The full escape (switches on, alpha fixes in, code-state surfaced) has not yet been demonstrated.

the register, still unset
what every component missed

field      current   required
pattern    5         5
color      0         1
rotation   0         3

One cycler press on COLOR (0 → 1) and three on ROTATION (0 → 3), counted from the current state, never blind repetition. That is the abstraction the agent never reaches.