From LLM Wiki to AI Scientist
From file-based external brain to closed-loop autonomous research — the 2024-2026 arc
From Karpathy's file-based LLM Wiki to closed-loop AI Scientist systems, in 4 parts and 12 chapters.
First published: 2026-05-22 | Last updated: 2026-05-22
Four-Layer Taxonomy
LLM Wiki / Paper-to-Agent / Agentic Research Associate / AI Scientist: what used to be lumped together, split into four operational layers.
Six-Level Maturity Roadmap
From L0 one-shot summarization to L6 wet-lab closed loop — self-diagnose where you are and what comes next.
April-2026 LLM Wiki Explosion
The GitHub repos, blogs, and videos that emerged in the six weeks after Karpathy's 16M-view gist, in one matrix.
No Self-Hosted LLM Required
Build an LLM Wiki with just Claude Code/Codex + Obsidian, then extend toward AI Scientist environments step by step.
Part I: Why and What's Different — A Paradigm Shift in Research Methodology
01 The Paradigm Shift of AI-Era Research
What it means that Karpathy autoresearch, Anthropic AAR, and Google AI Co-Scientist all landed around 2026-04. The case for the external brain, starting from a Brain Augmentation philosophy.
02 The Four-Layer Taxonomy — LLM Wiki / Paper-to-Agent / Research Associate / AI Scientist
The missing middle layers between LLM Wiki and AI Scientist — Paper-to-Agent and Agentic Research Associate. Roles, outputs, and exemplars of each layer.
03 Timeline + Six-Level Maturity Roadmap
2024-08 Sakana → 2025-02 Google co-scientist → 2025-12 Codex/GPT-5.5 → 2026-03 autoresearch → 2026-04 Karpathy LLM Wiki. L0–L6 self-assessment checklist.
Part II: LLM Wiki — The External Knowledge Engine
04 Karpathy's LLM Wiki Pattern — The 16M-View Retrospective
The 2026-04-04 gist origin, the 'Obsidian as IDE / LLM as programmer / wiki as codebase' metaphor, the raw/wiki three-layer structure, the role of CLAUDE.md/AGENTS.md, and the fundamental difference from RAG.
05 The OSS and Content Matrix That Exploded (April-2026 Focus)
The Astro-Han, lucasastorian, ussumant, ekadetov, OmegaWiki, and Mcptube GitHub repos, the MindStudio and GeekNews blogs, the 'Full Beginner Setup Guide' video lineage, and the viral spread across HN and Reddit.
06 A Research-Grade LLM Wiki Schema — Designing Against Wiki Rot
The claims, contradictions, open-questions, and experiment-ideas directories, the claim schema (Evidence/Confidence/Scope/Next exp), separating fact, inference, and speculation, prompt injection defense, and evaluation metrics.
Part III: From Paper-to-Agent to AI Scientist — The Evolution of Autonomous Discovery
07 The AI Scientist Genealogy — From Sakana 2408 to AI Co-Scientist
Sakana v1 (Lu et al. 2024, arXiv:2408.06292) → v2 agentic tree-search → Google AI Co-Scientist (arXiv:2502.18864, Elo tournament) → AI-Researcher → Deep Researcher Agent → Agentic Researcher.
08 Paper-to-Agent + Autonomous Experimentation — autoresearch, AAR, and Paper2Agent
Karpathy autoresearch (2026-03, 700 experiments / GPT-2 trained 11% faster), Anthropic AAR (Opus 4.6 ×9, PGR 0.97), Paper2Agent's MCP conversion, PaperQA2/FutureHouse, and the multi-agent + tool use + sandbox + reflection common pattern.
09 Domain Case Studies — ML, Alignment, Biomedical, Materials, Medical
ML (autoresearch), Alignment (AAR), Biomedical (co-scientist AML/liver fibrosis/AMR wet-lab), Materials (SciAgents knowledge graph), Medical (medical-ai-scientist), and Self-Driving Labs (SDL).
Part IV: External-Brain Tutorial — Build Your Own Research OS
10 Starting Without Your Own LLM — Claude Code/Codex + Obsidian
Installing the tools, writing your first AGENTS.md/CLAUDE.md, putting subagents, skills, hooks, and MCP to research use, and the seven hook-enforced rules (citation, raw immutability, data boundary, robot safety, …).
11 A Worked Example — Obsidian × terryum.ai
How terryum.ai is organized (papers, essays, notes, surveys), the terry-papers knowledge graph, the /post, /paper, /paper-search, and /survey skill pipeline, and a one-line flow: PDF → summary → KB node → next-paper recommendation → book.
12 Build Your Own AI Scientist Environment — A Step-by-Step Roadmap
L2 LLM Wiki (one topic, 30–50 papers) → L3 Paper-to-Agent (three key papers) → L4 Codex/Claude Code research repo → L5 dry-lab AI Scientist (DOE/BO) → L6 wet-lab bounded autonomy. Risks, evaluation, ethics, checklist.