Infinite Tic‑Tac‑Toe — a storyteller's map
Ah, let me paint a little tableau for you—no engines humming in the dark, no algorithms grinding, just a quiet sketch of how each policy might play upon an infinite board.
1. Local Expansion Policy
Each side plays tightly around the first center mark, like villagers gathering around a fire.
- - - - -
- X O - -
- O X - -
- - - - -
Both X and O cling to tradition, keeping their conflict close and familiar.
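For anyone who would like to see the campfire in code, here is a minimal Python sketch, not an engine. It assumes the infinite board is a dictionary holding only the occupied cells, that "close" means within one square in any direction, and that the first mark sits at the origin. The names `Board` and `local_candidates` are invented for illustration.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]      # (row, column); any integers, so the board has no edge
Board = Dict[Cell, str]     # sparse: only occupied cells are stored


def local_candidates(board: Board, radius: int = 1) -> Set[Cell]:
    """Empty cells within `radius` squares (Chebyshev distance) of any mark."""
    if not board:
        return {(0, 0)}     # assumed convention: the first mark goes at the origin
    found: Set[Cell] = set()
    for (r, c) in board:
        for dr in range(-radius, radius + 1):
            for dc in range(-radius, radius + 1):
                cell = (r + dr, c + dc)
                if cell not in board:
                    found.add(cell)
    return found


# The 2x2 knot from the sketch above, with its top-left X placed at (0, 0).
board = {(0, 0): "X", (0, 1): "O", (1, 0): "O", (1, 1): "X"}
moves = local_candidates(board)
```

A local policy would then pick its next move from `moves` and never stray farther from the fire than `radius` allows.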
2. Global Positional Policy
Moves are far apart, seeds cast across the field to prepare for future harmony—or chaos.
O
X O
X
Here, neither insists on immediate battle; they seek influence rather than skirmish.
3. Threat‑Building Policy
X builds a fork; O rushes in with a worried brow.
- X - X -
- - O - -
- X - - -
You can almost feel the tension: one wrong block and the floodgates open.
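The tension can be put into a small Python sketch. It assumes a win length of three in a row, which the tale never states but which fits the board as drawn, and it treats a fork as two or more empty cells that would each win at once. The helper names are hypothetical.

```python
from typing import Dict, Set, Tuple

Cell = Tuple[int, int]
Board = Dict[Cell, str]

# Row, column, and the two diagonals.
DIRECTIONS = [(0, 1), (1, 0), (1, 1), (1, -1)]


def completes_line(board: Board, cell: Cell, player: str, k: int) -> bool:
    """True if placing `player` on `cell` gives at least k in a row through it."""
    r, c = cell
    for dr, dc in DIRECTIONS:
        run = 1  # the newly placed mark itself
        for sign in (1, -1):
            step = 1
            while board.get((r + sign * step * dr, c + sign * step * dc)) == player:
                run += 1
                step += 1
        if run >= k:
            return True
    return False


def winning_cells(board: Board, player: str, k: int) -> Set[Cell]:
    """Empty cells where `player` would win immediately."""
    candidates: Set[Cell] = set()
    for (r, c), mark in board.items():
        if mark != player:
            continue
        for dr, dc in DIRECTIONS:
            for step in range(-(k - 1), k):
                cell = (r + step * dr, c + step * dc)
                if cell not in board:
                    candidates.add(cell)
    return {cell for cell in candidates if completes_line(board, cell, player, k)}


# The sketch above, with its top-left corner at (0, 0).
board = {(0, 1): "X", (0, 3): "X", (2, 1): "X", (1, 2): "O"}
threats = winning_cells(board, "X", 3)   # assumed win length: 3
is_fork = len(threats) >= 2
```

Under that assumption O can close only one of the two gaps, which is why a single block is not enough.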
4. Spatial Control Policy
Each player spreads marks to deny territory, like farmers staking out land boundaries.
X - - - X
- O - O -
- - X - -
- O - O -
X - - - X
A quiet, methodical contest for long‑term control.
5. Randomized / Exploratory Policy
The board begins to look like scattered raindrops—some meaningful, some simply curious.
- - - X - -
O
X
O
X
A wandering style: unpredictable, slightly mischievous.
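One way to let the raindrops fall, offered as a sketch only: pick a uniformly random empty cell inside the bounding box of the existing marks, widened by a small margin. The margin, the coordinates, and the name `random_move` are assumptions made for illustration; they are not read off the sketch above.

```python
import random
from typing import Dict, Optional, Tuple

Cell = Tuple[int, int]
Board = Dict[Cell, str]


def random_move(board: Board, margin: int = 2,
                rng: Optional[random.Random] = None) -> Cell:
    """Random empty cell within `margin` squares of the marks' bounding box."""
    rng = rng or random.Random()
    if not board:
        return (0, 0)  # assumed convention: the first mark goes at the origin
    rows = [r for r, _ in board]
    cols = [c for _, c in board]
    while True:
        # With margin >= 1 the widened box always contains empty cells,
        # so this loop ends after a few draws in practice.
        cell = (rng.randint(min(rows) - margin, max(rows) + margin),
                rng.randint(min(cols) - margin, max(cols) + margin))
        if cell not in board:
            return cell


# Hypothetical scattered position, for illustration only.
board = {(0, 3): "X", (1, 0): "O", (2, 4): "X", (3, 1): "O", (4, 5): "X"}
move = random_move(board, margin=2, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the mischief repeatable when a game needs to be replayed.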
6. Reinforcement‑Learned Policy
Early moves show structure—triangles, spacing, intentional near‑misses that hint at deeper calculations.
X - - X
- O X -
- - O -
X - - X
There’s balance in the spacing, a calculated rhythm learned from thousands of simulated lives.
7. Hypothetical “Optimal” Policy
A whispered ideal—symmetry intact until the moment X breaks it.
- X -
O - O
- X -
O - O
- X -
A perfectly mirrored dance, until X takes the central spine and claims the initiative.
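The mirrored dance can be checked with a few lines of Python. This is a sketch under one reading of the board: the X marks stand on a central column and each O has a twin on the other side of it. The coordinates and the name `is_mirrored` are assumptions, not part of the original picture.

```python
from typing import Dict, Tuple

Cell = Tuple[int, int]
Board = Dict[Cell, str]


def is_mirrored(board: Board, axis_col: int) -> bool:
    """True if every mark has an identical mark reflected across column `axis_col`."""
    return all(board.get((r, 2 * axis_col - c)) == mark
               for (r, c), mark in board.items())


# One reading of the sketch: X on the spine (column 1), O flanking it.
board = {
    (0, 1): "X",
    (1, 0): "O", (1, 2): "O",
    (2, 1): "X",
    (3, 0): "O", (3, 2): "O",
    (4, 1): "X",
}
symmetric = is_mirrored(board, axis_col=1)

# A mark placed off the spine with no twin breaks the mirror.
board[(2, 3)] = "X"
broken = is_mirrored(board, axis_col=1)
```

Marks on the spine are their own reflections, so in this reading X can keep building along it without disturbing the symmetry.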
The same kind of sketch can be drawn for whichever policy you choose:
- win‑seeking forks
- area control
- deep RL patterns
- long‑distance pincer maneuvers
- even hybrid policy mixtures