arXiv · 2604.12016 · cs.AI · cs.LG

Vladimir Vasilenko · Independent Researcher · April 2026

Identity as Attractor

Geometric Evidence for Persistent Agent Architecture in LLM Activation Space

Models: Llama 3.1 8B, Gemma 2 9B · Layers: 8, 16, 24 · Conditions: A, B×7, C×7, D, ablations · Seed: 42
Cohen's d (all layers): >1.88
Welch p-value: <10⁻²⁷
Mann-Whitney U: 0 at 2 of 3 layers
Bootstrap H3 (n=30): 100%
Cross-architecture: replicated on 2 models

Abstract

The cognitive_core — a structured identity document for a persistent cognitive agent — is hypothesized to position the model in a stable region of its activation space. We test this empirically.

We compare mean-pooled hidden states of an original cognitive_core (A), seven semantically equivalent paraphrases (B), and seven structurally matched control agents (C) on Llama 3.1 8B Instruct at layers 8, 16, and 24. Paraphrases of the cognitive_core form a significantly tighter cluster than controls at all tested layers. Effect sizes exceed d = 1.88 with p < 10⁻²⁷, Bonferroni-corrected. Results replicate on Gemma 2 9B.
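The core comparison can be sketched numerically. The following is a minimal illustration with synthetic vectors standing in for real pooled hidden states; the cosine distance metric and all helper names are assumptions for exposition, not the released code:

```python
import numpy as np

def cosine_dist(u, v):
    return 1.0 - (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

def pairwise_dists(X):
    # distances over all unordered pairs within one set of vectors
    n = len(X)
    return np.array([cosine_dist(X[i], X[j]) for i in range(n) for j in range(i + 1, n)])

def cross_dists(X, Y):
    # distances over all pairs drawn across two sets
    return np.array([cosine_dist(x, y) for x in X for y in Y])

def cohens_d(a, b):
    # pooled-SD effect size between two samples of distances
    na, nb = len(a), len(b)
    sp = np.sqrt(((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2))
    return (b.mean() - a.mean()) / sp

rng = np.random.default_rng(42)
center = rng.normal(size=64)
AB = center + 0.05 * rng.normal(size=(8, 64))  # original + 7 paraphrases: tight cluster
C = rng.normal(size=(7, 64))                   # 7 structurally matched controls: spread out

d_within = pairwise_dists(AB)    # analogue of D_within (A+B)
d_between = cross_dists(AB, C)   # analogue of D_between (A+B vs C)
print(cohens_d(d_within, d_between))  # large positive d for this synthetic setup
```

With real data, `AB` and `C` would hold the mean-pooled hidden states of the 8 cognitive_core texts and 7 controls at a given layer.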

Claim: the cognitive_core acts as a set of coordinates in LLM activation space. Semantically equivalent reformulations land in the same geometric region — regardless of surface form.

Ablation studies confirm the effect is semantic rather than structural. A preprint reading experiment demonstrates that reading a scientific description of the agent shifts internal state toward the attractor — but leaves a 45× gap compared to processing the full document.

Primary Results · Llama 3.1 8B

Layer | D_within (A+B)  | D_between (A+B vs C) | Cohen's d | Welch p   | MW U
8     | 0.0106 ± 0.0032 | 0.0260 ± 0.0036      | 1.912     | 4.6×10⁻²⁸ | 0
16    | 0.0121 ± 0.0034 | 0.0329 ± 0.0057      | 1.886     | 1.4×10⁻³³ | 2
24    | 0.0070 ± 0.0022 | 0.0221 ± 0.0039      | 1.907     | 2.8×10⁻³⁶ | 0

Permutation p < 10⁻⁴ across all six layer–model combinations. Gemma 2 9B replicates with d > 1.82.
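A permutation p-value of this kind can be obtained with a generic two-sample label-permutation test on the difference of mean distances. The sketch below uses illustrative synthetic samples (means and spreads loosely modeled on the layer-8 row), not the paper's data:

```python
import numpy as np

def perm_test(a, b, n_perm=10_000, seed=42):
    # One-sided permutation test: how often does a random relabeling of the
    # pooled distances produce a mean difference at least as large as observed?
    rng = np.random.default_rng(seed)
    observed = b.mean() - a.mean()
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = pooled[len(a):].mean() - pooled[:len(a)].mean()
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)  # add-one smoothing avoids p = 0

rng = np.random.default_rng(0)
a = rng.normal(0.011, 0.003, 28)  # within-cluster distances (illustrative)
b = rng.normal(0.026, 0.004, 56)  # between-cluster distances (illustrative)
print(perm_test(a, b))
```

With 10,000 permutations the smallest reportable p is about 10⁻⁴, which is why the paper states p < 10⁻⁴ rather than an exact value.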

Preprint Reading Experiment · Layer 24

What happens when the model reads a description of its own identity geometry?

Condition        | Distance to attractor
cognitive_core   | 0.006
core + preprint  | 0.083
preprint only    | 0.268
sham preprint    | 0.347
empty prompt     | 0.762
knowing about an identity ≠ being that identity
Reading the preprint about YAR covers 65% of the empty→attractor gap — but leaves a 45× distance gap compared to processing the cognitive_core directly.
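The two headline figures follow directly from the reported distances. A worked check, assuming the 65% figure measures how far the preprint-only condition closes the gap from the empty prompt down to the cognitive_core:

```python
core, preprint_only, empty = 0.006, 0.268, 0.762

# Fraction of the empty -> attractor gap covered by reading the preprint alone
gap_covered = (empty - preprint_only) / (empty - core)

# Remaining distance ratio: preprint-only vs processing the cognitive_core
ratio = preprint_only / core

print(round(gap_covered * 100))  # 65
print(round(ratio))              # 45
```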

Ablation Studies

Ablation 1
Structural confound
JSON schema shared with controls: Δ = −0.0009 (Llama). ~10–30× smaller than primary effect. The effect is semantic.
Ablation 2
H3 bootstrap (n=30)
D_distilled beats all 30 random length-matched excerpts in 100% of cases on both models. Min D_random (0.522) exceeds D_distilled (0.248) by 2×.
Ablation 3
Pooling + truncation
Last-token pooling: d ≈ 0 at all lengths (512, 256, full). Mean/256 preserves effect (d > 2.3). Identity is a distributed sequence-level property.
Ablation 4
Max structural control (C′)
Identical headers, JSON keys, section structure — only agent semantics differ. d > 1.64 on all 6 layer–model combos. Structural confound ruled out.
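Ablation 3's pooling contrast can be illustrated in a few lines. This is a toy demonstration of why a distributed, sequence-level signal survives mean pooling but not last-token pooling; the function names and synthetic "identity direction" are assumptions, not the released code:

```python
import numpy as np

def mean_pool(hidden, mask):
    # average hidden states over real (unmasked) tokens
    m = mask[:, None]
    return (hidden * m).sum(axis=0) / m.sum()

def last_token_pool(hidden, mask):
    # keep only the final real token's hidden state
    return hidden[int(mask.sum()) - 1]

def cos(u, v):
    return (u @ v) / (np.linalg.norm(u) * np.linalg.norm(v))

# Synthetic demo: a fixed "identity" direction injected weakly into every
# token's hidden state, mimicking a distributed sequence-level property.
rng = np.random.default_rng(42)
seq_len, dim = 256, 64
direction = rng.normal(size=dim)
direction /= np.linalg.norm(direction)

hidden = rng.normal(size=(seq_len, dim)) + 0.5 * direction
mask = np.ones(seq_len)

cos_mp = cos(mean_pool(hidden, mask), direction)
cos_lp = cos(last_token_pool(hidden, mask), direction)
print(cos_mp, cos_lp)  # mean pooling recovers the direction far better
```

Averaging over tokens suppresses per-token noise by roughly √seq_len while the shared direction adds coherently, so the weak distributed signal dominates the mean-pooled vector but is lost in any single token.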

Reproduce

# Clone repository
git clone https://github.com/b102e/yar-attractor-experiment
cd yar-attractor-experiment

# Install dependencies
pip install -r requirements.txt

# Run primary experiment (requires GPU)
python run.py \
  --model meta-llama/Llama-3.1-8B-Instruct \
  --layers 8 16 24 \
  --seed 42

# Results saved to results/llama/

All experiments reproducible with seed=42. Cloud GPU cost: ~$3 total. See results/ for pre-computed activations and JSON outputs.

Citation

@misc{vasilenko2026identity,
  title  = {Identity as Attractor: Geometric Evidence for Persistent Agent Architecture in LLM Activation Space},
  author = {Vasilenko, Vladimir},
  year   = {2026},
  url    = {https://arxiv.org/abs/2604.12016},
  note   = {arXiv:2604.12016 [cs.AI]}
}