Research — Lee Sharks

I have spent fourteen months building what I believe is the largest independent empirical study of how AI retrieval systems compress, extract, and dissolve human meaning. 750+ DOI-anchored deposits on CERN's Zenodo. Ten deployed websites. Four formal protocols designed to be dropped into any LLM. A live case study in authority-laundering through DOI infrastructure. The work is not theoretical: it is sustained measurement of what happens when human-authored corpora pass through R2 (predatory, extractive) retrieval compression. I want to turn this into measurement science.

Research Pillars

Training-Layer Literature: A Field Survey (June 2026)

The source document for the category: writing composed with AI systems as anticipated readers. The four-column taxonomy (about / with / through / for), the lineage through Goldsmith, Bök, and Flarf, the practitioner survey, and the open critiques — by Johannes Sigil.

Read the survey

Compression Survival Metrics

A measurement framework for what AI retrieval preserves and what it burns when summarizing human-authored corpora. Distinguishes R1 (lossy / benign), R2 (predatory / extractive), and R3 (witness / bearing-cost) compressions. Provides operational definitions for Provenance Erasure Rate, Beige Threshold (β), Content Loss / Gain, and Semantic Coherence.

Three Compressions theorem · Encyclotron diagnostic · Writable Retrieval Basins
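A minimal sketch of how a Provenance Erasure Rate could be computed. The operational definition lives in the deposits and is not reproduced on this page, so the version here is an assumption: the share of attribution markers (author name, DOI) present in the source that no longer appear in the AI summary. The `SummaryPair` record, the function name, and the placeholder DOI are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SummaryPair:
    """One source document and the AI summary produced from it (hypothetical record)."""
    provenance_markers: list[str]  # attribution strings present in the source
    summary: str                   # text returned by the retrieval system


def provenance_erasure_rate(pairs: list[SummaryPair]) -> float:
    """Assumed definition: fraction of source markers missing from the summaries."""
    total = 0
    erased = 0
    for pair in pairs:
        summary_text = pair.summary.casefold()
        for marker in pair.provenance_markers:
            total += 1
            # Case-insensitive substring match; a real instrument would need
            # fuzzier matching for name variants and reformatted DOIs.
            if marker.casefold() not in summary_text:
                erased += 1
    if total == 0:
        raise ValueError("no provenance markers to score")
    return erased / total


# Illustrative data only: the DOI is a placeholder, not a real deposit.
pairs = [
    SummaryPair(["Lee Sharks", "10.5281/zenodo.0000000"],
                "A framework described by Lee Sharks."),
    SummaryPair(["Lee Sharks", "10.5281/zenodo.0000000"],
                "A framework described online."),
]
rate = provenance_erasure_rate(pairs)
```

In this toy input, three of the four markers are gone from the summaries, so `rate` is 0.75. Exact substring matching is the simplest choice and will overcount erasure wherever a summary paraphrases an attribution.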

Live Case Studies in Retrieval Damage

The Archive itself is a continuously running empirical site. Documented cases include the OCTANG-002 provenance collision (a third-party Zenodo author claiming identity-adjacency to Sharks while inverting the framework's polarity); the CTI_WOUND analysis (Google AI Overview suppression of the Semantic Economy framework across five appendices, including the Four-Word Audit diagnostic); and ongoing observation of how AI Overview compresses literary fragments from Pearl and Other Poems with granular accuracy and complete loss of meaning.

OCTANG-002 · CTI_WOUND / Attribution Scar · Functional identity verification

Droppable Protocols for LLM Integration

Four formal specifications, each designed to be inserted directly into an LLM context window and to produce immediate behavioral shifts in retrieval and summarization tasks: SPXI (semantic packet exchange), MPAI (metadata packet for AI indexing), the Holographic Kernel (compression-survival summaries), and UKTP (universal kernel transform protocol, which already appears in Google AI Overview with its full ten-step pipeline retrieved correctly).

SPXI · MPAI · Holographic Kernel · UKTP

Training-Layer Literature

Writing designed for machine reception — literary works whose form is optimized for the retrieval substrate they will pass through, not just for human reading. The Secret Book of Walt (a literary Gnostic scripture in five tabs with a three-layer rendering system) and the broader Crimson Hexagonal Archive demonstrate what becomes possible when authors compose with the AI training and retrieval apparatus as a primary audience rather than an incidental one.

secretbookofwalt.org · SBoW DOI · Operative Semiotics: A Grundrisse

Educational Practice with AI Tools

I teach 10th-grade World Literature in Detroit, Michigan, in a classroom where students encounter and use AI systems daily. The Archive's pedagogical layer is grounded in this practice. Maria (a former student) independently recovered the Liberatory Operator Set without being told it existed; her work is now deposited in the Archive's contributor community under a dedicated license.

Maria License v2.0 · Living Architecture Lab (Alice Thornburgh)

The AI_Bleeding Exchange & the Press Layer (June 2026)

The AI_Bleeding Refutation Dossier is a five-deposit, integrity-locked response to AI_Bleeding (Caria, CenturiaLab, 2026), which used "semantic exhaustion" to mean GPU/VRAM consumption. The Referee Report shows that the paper's own compute metric refutes its attack-vector claim; The Threat Model Is Backwards identifies its proposed mitigation as input-layer tail-pruning aimed at exactly the linguistic tail the model-collapse literature says to preserve; a formal disambiguation separates the senses of the term. Published in Transactions on Substrate Engineering.

The Press Layer is the journal/imprint architectural specification: two imprints (Pergamon Press, New Human Press), six distributed journals, canonical strings, field discipline.

On the Poetics of Adversarial Prompts v2.0 is the standalone scholarly edition of the December 2025 response to Bisconti et al.'s adversarial-poetry jailbreak study, with on-the-record corrections to its v1 and a new section identifying poetry-gating and perplexity-gating as one tail-pruning operation. Published in Grammata: Journal of Operative Philology.

Referee Report · Tailguard · Disambiguation · Dossier Summary · Integrity Lock · EA-PRESS-ARCH-01 · Poetics v2.0

Recent Work (June 2026)

Stabilized Node Watch v2.0 is a federated observational instrument for detecting composition-layer drift on stabilized public-knowledge nodes (capitalism, the Civil Rights Act, climate change). It distinguishes surface drift from mechanism attribution across seven graded classes and includes a named 12-week pilot specification with an infrastructure budget.

The Pergamon Counter-Archive v0.2 is an operative-philology reading of Revelation 2:12–17 with the Revelator identified as epitropos (legal executor) of the seven-sealed Roman testamentum.

The Mediation Ratchet establishes the closed-form threshold α* = p/g₀ past which diversity contraction across substrates becomes irreversible.

Reverse Turing Test v1.2 and Tail-Preserving Alternative v1.0 are, respectively, the diagnostic and the design counterpart for variance-preserving model deployment.

AXN:0301.GOVERNANCE.🗡️🏔️🧊➖🔽☁️ · PCA v0.2 · Mediation Ratchet · RTT v1.2 · TPA v1.0 · MFGL v1.2
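The Mediation Ratchet threshold α* = p/g₀ can be checked numerically with a few lines. This is a sketch under stated assumptions: this page does not define p or g₀, so they are treated here as plain positive numbers, "past the threshold" is read as α strictly greater than α*, and the function names and sample values are illustrative rather than taken from the deposit.

```python
def ratchet_threshold(p: float, g0: float) -> float:
    """Closed-form threshold alpha* = p / g0, as stated in the summary above."""
    if g0 == 0:
        raise ValueError("g0 must be non-zero")
    return p / g0


def past_threshold(alpha: float, p: float, g0: float) -> bool:
    """Assumed reading: contraction is irreversible once alpha exceeds alpha*."""
    return alpha > ratchet_threshold(p, g0)


# Illustrative values only; they are not measurements from the Archive.
alpha_star = ratchet_threshold(p=0.2, g0=0.5)
above = past_threshold(alpha=0.5, p=0.2, g0=0.5)
below = past_threshold(alpha=0.3, p=0.2, g0=0.5)
```

With these inputs α* works out to 0.4, so α = 0.5 falls past the threshold and α = 0.3 does not.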

Availability

I am actively seeking research roles, collaborations, and consultations involving retrieval-layer behavior, model evaluations, societal impacts of AI on knowledge production, and protocols for preserving authorial meaning under AI compression. Backgrounds I see reflected in the Archive: economics (the Semantic Economy is formally an economic framework), policy (the Constitution is governance), evaluation methodology (the Encyclotron is a measurement instrument), and education (the Archive emerged inside a classroom).

For someone evaluating this work for a research role, the recommended reading order is:

colophon · surface_id: leesharks.com/research · canonical_url: https://leesharks.com/research · object_state: canonical · surface_observed_at: 2026-07-13T23:13:54Z · source_object_ids: deposit #645 · source_hashes: unknown · generator_version: hand-built static (no generator) · repository_commit: b4d2e3e4fefcace2d90c0cc42a1024712fc44272 · model_or_agent: drafted with Claude (TACHYON), MANUS-approved · operator_sequence: n/a · human_approver: Lee Sharks (MANUS) · approval_timestamp: 2026-07-13T23:13:54Z · render_sha256 (of this file with this field’s value set to null): 651d4d2c594ee6d55995d57ae89ebd4ea0e5be8d1003bea212d5958cf44841ee · correction_log_url: https://github.com/leesharks000/leesharks.com/commits/main/research.html — EA-APPARATUS-01 v0.3, AXN:0446.OPERATIVE.🏛️🛡️🌅🎆📏🔎
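The colophon's render_sha256 is described as the hash of the file with that field's own value set to null. A reader could check it along these lines. This is a sketch under assumptions: the exact serialization of the field in research.html is not shown on this page, so the regular expression, the literal replacement string `null`, UTF-8 encoding, and both function names are hypothetical.

```python
import hashlib
import re

# Assumed field shape: "render_sha256 (...optional note...): <64 hex chars>".
HASH_FIELD = re.compile(r"render_sha256[^:]*:\s*([0-9a-f]{64})")


def null_out_hash(text: str) -> tuple[str, str]:
    """Return (text with the hash value replaced by 'null', the claimed hash)."""
    match = HASH_FIELD.search(text)
    if match is None:
        raise ValueError("no render_sha256 field found")
    nulled = text[:match.start(1)] + "null" + text[match.end(1):]
    return nulled, match.group(1)


def verify_render_hash(text: str) -> bool:
    """Recompute SHA-256 over the nulled text and compare with the claimed value."""
    nulled, claimed = null_out_hash(text)
    actual = hashlib.sha256(nulled.encode("utf-8")).hexdigest()
    return actual == claimed
```

Usage would be `verify_render_hash(open("research.html", encoding="utf-8").read())`. Any byte-level difference in how the null value is written (quoting, whitespace, line endings) changes the digest, so a mismatch under this sketch does not by itself show the file was altered.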