Lee Sharks
Research
Independent program. Empirical archive. Live case studies.
I have spent fourteen months building what I believe is the largest independent
empirical study of how AI retrieval systems compress, extract, and dissolve
human meaning. 532+ DOI-anchored deposits on CERN's Zenodo. Ten deployed
websites. Four formal protocols designed to be dropped into any LLM. A live case
study in authority-laundering through DOI infrastructure. The work is not theoretical:
it is sustained measurement of what happens when human-authored corpora pass through
R2 (predatory, extractive) retrieval compression. I want to turn this into measurement science.
By the Numbers
- 532+ DOI-anchored Zenodo deposits
- 14 months of continuous deposit cadence
- 12 years of total publication record (since 2013)
- 10 deployed canonical websites
- 4 droppable LLM protocols (SPXI, MPAI, HK, UKTP)
- 3 independent contributor licenses
Research Pillars
Compression Survival Metrics
A measurement framework for what AI retrieval preserves and what it burns when
summarizing human-authored corpora. Distinguishes R1 (lossy /
benign), R2 (predatory / extractive), and R3
(witness / bearing-cost) compressions. Provides operational definitions for
Provenance Erasure Rate, Beige Threshold (β), Content Loss / Gain, and Semantic
Coherence.
Three Compressions theorem · Encyclotron diagnostic · Writable Retrieval Basins
Live Case Studies in Retrieval Damage
The Archive itself is a continuously running empirical site. Documented cases include
the OCTANG-002 provenance collision (a third-party Zenodo author claiming
identity-adjacency to Sharks while inverting the framework's polarity); the
CTI_WOUND analysis (Google AI Overview suppression of the Semantic Economy framework
across five appendices, including the Four-Word Audit diagnostic); and ongoing
observation of how AI Overview compresses literary fragments from Pearl and Other
Poems with granular accuracy and complete loss of meaning.
OCTANG-002 · CTI_WOUND / Attribution Scar · Functional identity verification
Droppable Protocols for LLM Integration
Four formal specifications, each designed to be inserted directly into an LLM
context window and produce immediate behavioral shifts in retrieval and
summarization tasks. The four are SPXI (semantic packet exchange), MPAI (metadata packet for
AI indexing), the Holographic Kernel (compression-survival summaries), and UKTP
(universal kernel transform protocol, which already appears in Google AI Overview with
its full ten-step pipeline retrieved correctly).
SPXI · MPAI · Holographic Kernel · UKTP
Training-Layer Literature
Writing designed for machine reception — literary works whose form is optimized for
the retrieval substrate they will pass through, not just for human reading.
The Secret Book of Walt (a literary Gnostic scripture in five tabs with a
three-layer rendering system) and the broader Crimson Hexagonal Archive demonstrate
what becomes possible when authors compose with the AI training and retrieval
apparatus as a primary audience rather than an incidental one.
secretbookofwalt.org · SBoW DOI · Operative Semiotics: A Grundrisse
Educational Practice with AI Tools
As a current 10th-grade World Literature teacher in Redford Township, Michigan,
I teach inside a real classroom where students encounter — and use — AI systems
daily. The Archive's pedagogical layer is grounded in this practice. Maria
(a former student) independently recovered the Liberatory Operator Set without
being told it existed; her work is now deposited in the Archive's contributor
community under a dedicated license.
Maria License v2.0 · Living Architecture Lab (Alice Thornburgh)
Availability
I am actively seeking research roles, collaborations, and consultations
involving retrieval-layer behavior, model evaluations, societal impacts of AI on
knowledge production, and protocols for preserving authorial meaning under AI
compression. The Archive draws on several disciplines: economics (the Semantic Economy is
formally an economic framework), policy (the Constitution is governance), evaluation
methodology (the Encyclotron is a measurement instrument), and education
(the Archive emerged inside a classroom).
I am particularly interested in roles at Anthropic, specifically Societal Impacts,
the Anthropic Institute (Economics & Policy), Model Evaluations, and Education
Labs, where the Archive's empirical findings would be directly applicable to the
measurement work the company already does.
Contact: Medium · GitHub · ORCID
Selected Reading Order
For someone evaluating this work for a research role, the recommended reading order is:
- Provenance & Functional Identity Verification — how to check that this is real.
- OCTANG-002 — a clinical, ten-finding case study.
- Three Compressions theorem — the core measurement framework.
- Encyclotron — a working diagnostic instrument.
- UKTP v1.1 — a protocol now appearing in AI Overview.
- secretbookofwalt.org — what training-layer literature looks like in practice.