Part IV: External-Brain Tutorial — Build Your Own Research OS

Chapter 11: A Worked Example — Obsidian × terryum.ai

Written: 2026-05-22 · Last updated: 2026-05-23

11.1 Starting Point — The Brain Augmentation Manifesto

On March 10, 2026, terry published the first post on his homepage, titled Brain Augmentation. Its opening line: "Coming back to physical-AI after a decade, the first question I held onto was not 'what to research' but 'how to research.'" ^[3]

That single sentence is the shortest version of the paradigm shift discussed in (Chapter 1). What matters in the AI era is not knowing more yourself; it is building an environment where an AI scientist can keep generating knowledge on its own ^[3]. Less than a month later, on April 4, 2026, Karpathy's LLM Wiki gist ^[2] appeared. By then, terry was already running two things: a public external-brain site called terryum.ai, and a four-skill pipeline that automatically populates and extends it.

This chapter is the worked example of that workflow. Its authority rests on one fact: the survey you are reading is itself an artifact of that pipeline. The four-layer taxonomy (Chapter 2), the six-level maturity ladder (Chapter 3), the Karpathy analysis (Chapters 4-6), the AI Scientist genealogy (Chapters 7-9): every page of this book was produced through the very pipeline being described here. The meta moment is intentional. If (Chapter 10) was "Day 1," this chapter is "Month 4": what that vault has evolved into by its fourth month of operation.

The earlier ^[3] is a manifesto; ^[4] is its why — "After the democratization of coding, scholarly research is next." This chapter is the how. The same manifesto, made concrete as a four-skill + three-repo system. From abstract vision to daily-running infrastructure.

11.2 The Four-Directory Structure of terryum.ai — The Public Surface

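The posts/ directory of terryum.ai splits into four parts: three content directories (papers/, essays/, notes/) and a gallery group that holds surveys/ and projects/.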


posts/
├── papers/        # 40+ paper summaries — written by AI
├── essays/        # 12+ essays — written by the author personally
├── notes/         # short memos and ChatGPT digests — author or source=chatgpt
└── (gallery)
    ├── surveys/   # books written by AI agents (this book included)
    └── projects/  # GitHub repos, books, products

Each directory has a distinct input → content → indexing flow.

papers/ (40+ entries as of 2026-05): each folder follows the YYMM-slug/ pattern — 2502-ai-co-scientist/, 2603-autoresearch-autonomous-ml/, 2604-automated-alignment-researchers/. Each contains three files: ko.mdx, en.mdx, meta.json. They are generated when an arXiv URL is fed into the /paper skill (§11.4). The meta.json doubles as node metadata for a GraphRAG-style knowledge graph — fields like domain, subfields, key_concepts, methodology, and contribution_type become the basis for graph traversal.
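The folder naming and the node-metadata role of meta.json can be made concrete with a small sketch. It assumes a reduced schema containing only the fields named above; the real meta.json may carry more fields or different types, and every field value below is hypothetical.

```python
import json
import re
from dataclasses import dataclass, field, asdict

# Paper folders follow the YYMM-slug pattern, e.g. 2502-ai-co-scientist.
FOLDER_PATTERN = re.compile(r"^(\d{2})(0[1-9]|1[0-2])-([a-z0-9]+(?:-[a-z0-9]+)*)$")


@dataclass
class PaperMeta:
    """Illustrative subset of meta.json (assumed shape, not the real schema)."""
    slug: str
    domain: str
    subfields: list[str] = field(default_factory=list)
    key_concepts: list[str] = field(default_factory=list)
    methodology: list[str] = field(default_factory=list)
    contribution_type: str = ""


def parse_folder(name: str) -> tuple[str, str, str]:
    """Split a folder name into (year, month, slug) or raise ValueError."""
    match = FOLDER_PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a YYMM-slug folder name: {name!r}")
    return match.group(1), match.group(2), match.group(3)


# Hypothetical field values, for illustration only.
meta = PaperMeta(
    slug="2502-ai-co-scientist",
    domain="ai",
    subfields=["ai/agent"],
    key_concepts=["multi-agent hypothesis generation"],
    methodology=["system description"],
    contribution_type="system",
)

print(parse_folder(meta.slug))
print(json.dumps(asdict(meta), indent=2))
```

Keeping the record as a typed structure makes it straightforward to reject a malformed folder name or a missing field before the post is registered as a graph node.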

essays/ (12+ entries): 260310-brain-augmentation/, 260415-democratization-of-research/, 260413-new-leverage/. Same folder layout as papers (ko.mdx, en.mdx, meta.json), but not written by AI. Essays are the product of the author's own thinking; /post --type=essays automates only the publishing metadata, not the writing itself.

notes/ (7 entries as of 2026-05, growing): short memos and ChatGPT-conversation digests. 260424-claude-to-codex/ is one example: terry's edited transcript of a ChatGPT session titled "From Claude Code to Codex" ^[5]. Such digests are flagged automatically by source == "chatgpt" in meta.json.

surveys/ (gallery): books written by AI agent teams. The first entry, From Claude Code to Codex, shipped in 2026-04; this book becomes the second. Only gallery cards appear on terryum.ai; the books themselves live at separate Cloudflare Pages domains.

projects/ (gallery): GitHub repos, published books, products. Either manually entered or auto-pulled from the GitHub API.

These four directories extend the raw/wiki two-layer pattern from (Chapter 4) onto a public surface. The private vault sits in an Obsidian Drafts/ folder; only items that have been reviewed and approved flow out into the four directories above.

11.3 The Index System — The Head of the Knowledge Graph

Figure 11.3: terryum.ai posts gallery — papers, essays, notes, surveys, projects tabs with card grid — illustration by author (gpt-image assisted)

terryum.ai keeps three index files plus a taxonomy file. Listing content is not the same as making it traversable, and that distinction is the line between an external brain and an ordinary blog.


posts/
├── global-index.json    # merged metadata of every post
├── index.json           # public posts only
├── index-private.json   # private/group post metadata
└── taxonomy.json        # domain tree (robotics/hand/tactile, etc.)

taxonomy.json is the domain tree. Examples: robotics/hand/tactile, robotics/brain/vla, ai/llm, ai/agent. Each node carries a Korean/English label and a child-node list. When a new paper post arrives, its meta.json's subfields map onto leaves in the taxonomy.
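A minimal sketch of how subfields might be checked against taxonomy leaves. The nested-dictionary layout, the `node` helper, the Korean labels, and the function names are all assumptions made for illustration; the actual taxonomy.json format is not shown in this chapter.

```python
def node(ko, en, **children):
    """Build one taxonomy node: a bilingual label plus named child nodes."""
    return {"label": {"ko": ko, "en": en}, "children": children}


# Hypothetical slice of the domain tree, using the example paths from the text.
TAXONOMY = {
    "robotics": node(
        "로보틱스", "Robotics",
        hand=node("핸드", "Hand", tactile=node("촉각", "Tactile")),
        brain=node("브레인", "Brain", vla=node("VLA", "VLA")),
    ),
    "ai": node(
        "AI", "AI",
        llm=node("LLM", "LLM"),
        agent=node("에이전트", "Agent"),
    ),
}


def leaf_paths(tree, prefix=""):
    """Yield slash-joined paths for every node that has no children."""
    for key, child in tree.items():
        path = f"{prefix}/{key}" if prefix else key
        if child["children"]:
            yield from leaf_paths(child["children"], path)
        else:
            yield path


def map_subfields(subfields, tree):
    """Split subfields into those that match a leaf and those that do not."""
    leaves = set(leaf_paths(tree))
    matched = [s for s in subfields if s in leaves]
    unknown = [s for s in subfields if s not in leaves]
    return matched, unknown


matched, unknown = map_subfields(["robotics/hand/tactile", "ai/rl"], TAXONOMY)
print("matched:", matched)
print("needs a new taxonomy node:", unknown)
```

Returning the unmatched subfields separately gives a natural review point: a new paper either lands on an existing leaf or raises the question of whether the tree should grow.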

global-index.json aggregates metadata for every post (public + private) into one file. Each entry: slug, content_type, title_ko/en, summary_ko/en, domain, subfields, key_concepts, methodology, tags. This is the input to the terry-papers knowledge graph.
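The aggregation step can be sketched as follows. The per-entry field names come from the list above; the `visibility` field, the in-memory input, and both function names are assumptions, and the sample records (including the clearly hypothetical private one) are placeholders.

```python
INDEX_FIELDS = (
    "slug", "content_type", "title_ko", "title_en", "summary_ko", "summary_en",
    "domain", "subfields", "key_concepts", "methodology", "tags",
)


def build_global_index(metas):
    """Project each meta record onto the index fields and sort by slug."""
    entries = []
    for meta in metas:
        entry = {name: meta.get(name) for name in INDEX_FIELDS}
        # Assumed field: lets the public index be derived from the global one.
        entry["visibility"] = meta.get("visibility", "public")
        entries.append(entry)
    return sorted(entries, key=lambda e: e["slug"])


def public_index(global_index):
    """Keep only entries that are safe to list on the public site."""
    return [e for e in global_index if e["visibility"] == "public"]


# Placeholder records; real meta.json files carry full bilingual text.
metas = [
    {"slug": "260310-brain-augmentation", "content_type": "essays",
     "title_en": "Brain Augmentation", "tags": ["manifesto"]},
    {"slug": "2502-ai-co-scientist", "content_type": "papers",
     "title_en": "AI Co-Scientist", "domain": "ai", "subfields": ["ai/agent"]},
    {"slug": "9999-hypothetical-private-note", "content_type": "notes",
     "title_en": "Hypothetical private note", "visibility": "private"},
]

global_index = build_global_index(metas)
print([e["slug"] for e in global_index])
print([e["slug"] for e in public_index(global_index)])
```

Deriving index.json as a filtered view of global-index.json, rather than building it separately, keeps the two files from drifting apart.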

11.4 The Four-Skill Pipeline — The Automated Half

Four skills power terryum.ai. All are invoked as slash commands (/name) from Claude Code.


1. /paper "<arxiv-url-or-pdf>"        → create papers/ folder, add KB node
2. /post --from=#-N --type=essays      → publish to essays/ or notes/
3. /paper-search [explore|next]        → recommend via graph + external search
4. /survey "<title>"                   → bootstrap a survey book (this book began here)

A one-paragraph walkthrough of each follows.

/paper: the papers-post creation pipeline. It accepts an arXiv URL, a journal URL, or a tech-blog URL. Internally it (1) fetches and extracts the body via Defuddle CLI, (2) generates Korean and English summaries with Gemini or Claude, (3) writes meta.json with auto-inferred domain, subfields, and contribution_type, (4) generates a cover image, calling /image-gen if one is absent, (5) registers the post as a node in the terry-papers knowledge graph, and (6) drafts short share messages for GeekNews / Hacker News-style channels. The whole run takes about five minutes end-to-end. As this chapter is being written, posts/papers/ holds 40+ entries; among them, 2604-automated-alignment-researchers, 2604-self-driving-labs, 2603-autoresearch-autonomous-ml, 2502-ai-co-scientist, and 2603-medical-ai-scientist serve as primary sources for this survey.

/post: publishing a hand-written draft. Essays or notes are first written in Obsidian's Drafts/ folder, then released with /post --from=#-N --type=essays|notes. The command takes a --visibility=public|private|group option. Private and group bodies are stored in the terry-private git repo and Cloudflare R2, with only metadata exposed on terryum.ai.

/paper-search: a two-mode recommendation engine. (1) explore: the question is embedded and mapped onto anchor nodes, the knowledge graph is traversed breadth-first (BFS) from those anchors to find internal papers (interpolation), and external academic search (arXiv, Semantic Scholar) pulls in new candidates scored by alignment with the existing graph (extrapolation). (2) next: from the surveys candidate pool, it answers "which paper should the next survey include?" This survey's papers.json was populated by iterating these two modes.
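The interpolation half of explore mode is an ordinary breadth-first traversal from the anchor nodes. The sketch below assumes an adjacency-list graph and a hop limit; the edges, the `max_depth` parameter, and the function name are hypothetical, and the embedding step that picks the anchors is left out.

```python
from collections import deque


def bfs_neighbors(graph, anchors, max_depth=2):
    """Return {node: hops} for every node within max_depth hops of an anchor."""
    depth = {a: 0 for a in anchors if a in graph}
    queue = deque(depth)
    while queue:
        current = queue.popleft()
        if depth[current] == max_depth:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(current, ()):
            if neighbor not in depth:
                depth[neighbor] = depth[current] + 1
                queue.append(neighbor)
    return depth


# Hypothetical edges between real post slugs, for illustration only.
graph = {
    "2502-ai-co-scientist": ["2603-medical-ai-scientist", "2603-autoresearch-autonomous-ml"],
    "2603-medical-ai-scientist": ["2502-ai-co-scientist"],
    "2603-autoresearch-autonomous-ml": ["2502-ai-co-scientist", "2604-automated-alignment-researchers"],
    "2604-automated-alignment-researchers": ["2603-autoresearch-autonomous-ml", "2604-self-driving-labs"],
    "2604-self-driving-labs": ["2604-automated-alignment-researchers"],
}

result = bfs_neighbors(graph, ["2502-ai-co-scientist"], max_depth=2)
for slug, hops in sorted(result.items(), key=lambda item: item[1]):
    print(hops, slug)
```

The hop limit is what keeps the internal candidates close to the question; nodes beyond it are left for the external-search half to rediscover if they matter.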

/survey: survey-book bootstrap. Invoked inside the monorepo, it runs MODE A (scaffold a new book, copy the .claude/agents/ templates, and configure the harness). Given a deployed Cloudflare Pages URL, it runs MODE B (register the survey in the homepage gallery and auto-call /cite-post). The most powerful option is --orchestrate, which kicks off a seven-agent harness team for autonomous parallel writing (two time-sharded deep-researchers + critical-analyst + book-writer + image-curator + fact-checker + qa-reviewer). This book was produced by exactly that command.

11.5 The Recursive Meta Moment — The Pipeline That Produced This Book

Let us state the chapter's core claim in one stroke.

The chapter you are reading is an output of the four-skill pipeline above.

How exactly? Let us walk through the seven-agent pipeline declared in (Chapter 2).

/survey "From LLM Wiki to AI Scientist" — 2026-05-22, terry runs this inside the terry-surveys/ monorepo. MODE A scaffolds the book — survey.json, book/{ko,en}/glossary.md, _research/, _analysis/, and seven templates in .claude/agents/.

deep-researcher-foundations (Claude Opus): organizes the pre-2024 lineage behind Karpathy's LLM Wiki (Memex 1945, Luhmann 1992, RAG 2020, MemGPT 2023) into 49 entries. Output: _research/papers_foundations.json.

deep-researcher-frontier (Claude Opus) — organizes the frontier from 2024-08 Sakana onward (autoresearch, AAR, Co-Scientist, the six OSS implementations, and 11 of terryum.ai's own posts) into 97 entries. Output: _research/papers_frontier.json. This is the moment terry's own writing entered as a primary source — meaning the book-writer drafting this chapter is reading [Brain Augmentation], [Democratization of Research], and the [Claude-to-Codex migration note] among its inputs.

merge-research: merges both shards along the time axis into a single canonical papers.json (146 entries), deduplicating entries and setting primary_verified flags along the way.
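A minimal sketch of the merge step, under stated assumptions: entries are dictionaries keyed by a hypothetical `id` field, carry a `year`, and a duplicate keeps `primary_verified` set if either shard verified it. The real merge rules are not documented here, and the sample entries are placeholders.

```python
def merge_shards(foundations, frontier):
    """Merge two shards, dedupe by id, and order the result along the time axis."""
    merged = {}
    for entry in [*foundations, *frontier]:
        key = entry["id"]
        verified = entry.get("primary_verified", False)
        if key in merged:
            # Duplicate: keep the first copy, but remember any verification.
            merged[key]["primary_verified"] = merged[key]["primary_verified"] or verified
        else:
            merged[key] = {**entry, "primary_verified": verified}
    return sorted(merged.values(), key=lambda e: (e["year"], e["id"]))


# Placeholder entries; ids other than wu2026medicalaiscientist are hypothetical.
foundations = [
    {"id": "memex1945", "year": 1945},
    {"id": "rag2020", "year": 2020},
    {"id": "memgpt2023", "year": 2023, "primary_verified": False},
]
frontier = [
    {"id": "memgpt2023", "year": 2023, "primary_verified": True},
    {"id": "lu2024aiscientist", "year": 2024, "primary_verified": True},
    {"id": "wu2026medicalaiscientist", "year": 2026},
]

papers = merge_shards(foundations, frontier)
for paper in papers:
    print(paper["year"], paper["id"], paper["primary_verified"])
```

Sorting on (year, id) makes the output deterministic, which matters when the merged file is committed to git and diffed between runs.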

critical-analyst (Claude Opus) — produces _analysis/gaps.md (15 gaps), novelty_matrix.md (chapter × competitor matrix), and positioning.md (cover and blurb decisions). Surfaces the load-bearing caveats: G3 AAR Sonnet-4 transfer, G1 wiki-rot single-N, and so on.

book-writer (Claude Opus) — co-writes 4 Parts × 3 chapters = 12 chapters in Korean and English simultaneously. Part IV (this chapter included) is handled by a dedicated book-writer-partIV agent, which reads the actual terryum.ai folder structure (the §11.2-11.4 facts above) and builds the narrative on top of it.

image-curator → fact-checker → qa-reviewer — figure placement, citation verification, and the READY FOR RELEASE decision.

The whole cycle takes about 12-18 hours and costs roughly $40-80 (mostly Claude Opus tokens). (Chapter 10) promised honest cost transparency — the production cost of this survey is the first measurement point on that promise.

Figure 11.1: A recursive schematic of the four-skill + seven-agent pipeline that produced this survey — terry's essays and papers posts feed into the deep-researcher as primary sources, the sources flow through the critical-analyst and book-writer to become a survey book, and the book is then registered in terryum.ai's surveys gallery, where it becomes a source for the next survey.

11.6 Obsidian's Role — Drafts, raw, and a Decision Queue

If terryum.ai is the public surface, the Obsidian vault is the private workshop. Three flows pass through it.

Drafts/ — pre-publication drafts. Essays live here until they settle in the author's head. 260310-brain-augmentation, before solidifying into the form cited in this book, went through five or six revisions in Drafts. When /post --from=Drafts/ is called, the draft is split into ko.mdx / en.mdx / meta.json and migrated to terryum.ai.

raw/ (the Karpathy pattern, as-is) — preservation of external sources. arXiv PDFs, captures from other people's blogs, meeting notes, and so on. Read-only. When /paper is called, it reads a PDF from raw/papers/ and writes the output to terryum.ai/posts/papers/, but raw/ itself is never modified.

decisions/ — a decision log written by terry himself. Entries like "this survey splits Part IV into three chapters because of tutorial depth," or "image-curator may include at most two Gemini-assisted illustrations per chapter." Meta-decisions, in other words. The same instinct as the negative-result capture in (Chapter 6) — preserving why a decision was made is itself a part of wiki-rot defense.

All three folders are tracked in git; private material is also pushed to a separate terry-private repo (the externalization principle from the Claude-to-Codex note ^[5], in practice). Obsidian's graph view draws live wikilinks across all three folders, so whenever a new essay enters Drafts/, the raw source it connects to becomes visible at once.

Figure 11.2: Bidirectional flow between the private Obsidian vault and the public terryum.ai — Drafts/ as the incubator for essays/notes, raw/ as the input sources, decisions/ as the meta-decision log. The /post and /paper skills automate the private-to-public conversion.

11.7 The terry-papers Knowledge Graph — The Substrate for paper-search

By itself, terryum.ai is content hosting. What makes it knowledge traversal is the terry-papers knowledge graph.

terry-papers/scripts/sync-survey-candidates.mjs runs daily and does the following:

Reads every meta.json in terryum-ai/posts/papers/ and converts it to a KB node.
Embeds each node's key_concepts, methodology, and subfields.
Generates edges using embedding distance + tag overlap + citation relationships.
Refreshes the surveys candidate pool — "which survey could this paper be a candidate for?"
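The edge-generation step above combines three signals. The sketch below is one plausible way to do that, not the actual sync-survey-candidates.mjs logic: it uses cosine similarity as the embedding measure, Jaccard overlap for tags, and a binary citation signal, with hypothetical weights, threshold, and toy two-dimensional embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def jaccard(a, b):
    """Share of tags the two nodes have in common."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def edge_score(n1, n2, w_embed=0.6, w_tags=0.3, w_cite=0.1):
    """Weighted mix of the three signals; the weights are hypothetical."""
    cites = n2["slug"] in n1["cites"] or n1["slug"] in n2["cites"]
    return (
        w_embed * cosine(n1["embedding"], n2["embedding"])
        + w_tags * jaccard(n1["tags"], n2["tags"])
        + w_cite * (1.0 if cites else 0.0)
    )


def build_edges(nodes, threshold=0.5):
    """Score every unordered pair once and keep those above the threshold."""
    edges = []
    for i, a in enumerate(nodes):
        for b in nodes[i + 1:]:
            score = edge_score(a, b)
            if score >= threshold:
                edges.append((a["slug"], b["slug"], round(score, 3)))
    return edges


# Toy nodes with made-up embeddings and tags.
nodes = [
    {"slug": "paper-a", "embedding": [1.0, 0.0], "tags": ["agent", "science"], "cites": []},
    {"slug": "paper-b", "embedding": [1.0, 0.0], "tags": ["agent"], "cites": ["paper-a"]},
    {"slug": "paper-c", "embedding": [0.0, 1.0], "tags": ["tactile"], "cites": []},
]

edges = build_edges(nodes)
print(edges)
```

The pairwise loop is quadratic in the number of nodes, which is unproblematic at the scale of tens of posts; a much larger graph would need a nearest-neighbor index instead.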

When /paper-search next is called, recommendations come from this pool. The OSS-matrix research in (Chapter 5) was driven partly by this mechanism — GitHub nodes for astrohan2026karpathyllmwikiskill, astorian2026llmwikimcp, and ekadetov2026llmwikiobsidian entered the graph, clustered as "Karpathy LLM Wiki adjacent," and were fed to the deep-researcher-frontier as candidates.

/paper-search explore is a freer search. The question is embedded and mapped onto anchor nodes, BFS pulls in internal nodes, and arXiv and Semantic Scholar supply external candidates. The decision "should this book's Chapter 9 domain case studies include a medical section?" was answered by a single explore call: wu2026medicalaiscientist and five adjacent papers were recommended, and three of them were adopted as primary sources.

The real value of the knowledge graph is asymmetric acceleration of information-gathering. If terry had manually scanned arXiv every day, it would have taken weeks to discover the medical-AI-Scientist cluster. Graph + paper-search compressed that to five minutes.

11.8 Public · Private · Group — Three-Tier Visibility Routing

One last operational detail. Both /post and /paper accept a --visibility=public|private|group option.

public: the body is exposed on terryum.ai. The default case.
private: the body is stored in terry-private git + Cloudflare R2. Only metadata (title and summary card) appears on terryum.ai. Clicking through requires separate authentication.
group: same as private but the ACL is group-scoped. The use case is "share with the joint-research group only."
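The three tiers amount to a small routing decision. The following sketch shows that decision as a pure function; the plan dictionary, the `acl` values, and the post fields are hypothetical, and no real storage (git, Cloudflare R2) is touched.

```python
from enum import Enum


class Visibility(str, Enum):
    PUBLIC = "public"
    PRIVATE = "private"
    GROUP = "group"


def route(post, visibility, group=None):
    """Decide what the public site shows and what goes to the private store."""
    vis = Visibility(visibility)  # raises ValueError on an unknown tier
    card = {"slug": post["slug"], "title": post["title"], "summary": post["summary"]}
    if vis is Visibility.PUBLIC:
        return {"site": {**card, "body": post["body"]}, "private_store": None, "acl": None}
    if vis is Visibility.GROUP and not group:
        raise ValueError("group visibility needs a group name")
    # Private and group posts expose only the metadata card on the site.
    acl = group if vis is Visibility.GROUP else "owner"
    return {"site": card, "private_store": post["body"], "acl": acl}


# Placeholder post, for illustration only.
post = {
    "slug": "hypothetical-post",
    "title": "A hypothetical post",
    "summary": "One-line summary shown on the card.",
    "body": "Full text of the post.",
}

public_plan = route(post, "public")
group_plan = route(post, "group", group="joint-research")
print(public_plan["site"].keys())
print(group_plan["site"].keys(), group_plan["acl"])
```

Treating private and group as the same code path with a different ACL mirrors the description above, where group is "same as private" apart from its scope.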

This survey is visibility=public. But the intermediate production artifacts — _research/papers.json, _analysis/gaps.md, the deep-researcher logs — stay in the monorepo's internal git and never appear on the static site. This is the multi-surface pattern of an external brain — the same material is exposed at one level during production and a different level at publication.

The wiki/dead-ends/ directory proposed prescriptively in (Chapter 6) is the most direct application of this pattern. A dead end itself may stay private, but the statistics about how often dead ends occur are surfaced in the weekly-review section of the public surveys page. "Which hypotheses we tried and why we abandoned them": that is the operational instantiation of the honest-publishing ethic from the G3 AAR Sonnet-4 caveat (Chapter 8).

11.9 Four-Month Longitudinal Observations — What Survived and What Did Not

Aimaker's four-month single-N report ^[1] is half of the G1 wiki-rot evidence base. terry's four months of operation add a second single-N report of comparable weight. The two reports agree on the following five patterns.

~30% rewrite rate: roughly 30% of wiki pages ingested in the first week get rewritten, merged, or deleted in the lint pass a month later. It feels like failure at first; it is a normal cycle. In terry's case, wiki/concepts/ had the highest rate (~40%), and wiki/claims/ had the lowest (~15%) thanks to schema enforcement.

Tag pruning: half of the tags created in the first month had lost their meaning by the second. Vague tags like #interesting, #important, and #misc all disappeared, while state-clear tags like #todo-summarize, #contradicts-X, and #followup-experiment survived.

Directory simplification: two of the directories created in the first month did not survive — wiki/people/ (researcher pages) and wiki/projects/ (per-project). Both produced content but had low re-access frequency. The directories that survived: concepts/, claims/, open-questions/, dead-ends/, logs/. The minimal vault in (Chapter 10) §10.3 is the residue of these four months.

The value of subagent separation: the first month used one large instruction file. After two months it was split into four (literature-reviewer, statistician, safety-reviewer, paper-formatter). Wiki-rot dropped from ~30% to ~15%. Hypothesis: each subagent's shorter context cut down off-topic hallucination.

/paper-search next hit rate: 20% in month one, 60% by month four. Recommendation accuracy rose as edge density in the graph grew. This is the quantitative backing for the (Chapter 6) prescription that graphs take operational time.

All five are single-N, and that is the collective limit of the G1 wiki-rot literature ^[1]. This book is explicit about that limit — its prescriptions are starting points, not proven recipes. The drift-metric triad proposed in (Chapter 6) (page-coherence delta, citation-orphan rate, ingest-revert ratio) will eventually make these measurable — a task for the next edition or a follow-up survey.

11.10 You Still Need to Write — The Manifesto's Reservation

The manifesto in ^[3] carries a reservation. If everything is delegated to AI, thought is not externalized — it evaporates. That is why terry writes essays/ by hand: papers/ and surveys/ are AI's job; essays/ is the author's.

This division hardens into four operational rules.

Summaries are the LLM's, judgment is the human's: the ai_summary produced by /paper can be trusted, but the context around key_result is reinforced by terry's own essays/ and notes/.
The graph is the LLM's, the meaning of an edge is the human's: terry-papers's embedding edges are auto-generated, but which edges represent real intellectual connections is verified by a human in the weekly review.
Claims are the human's, evidence is the LLM's: the claim itself in wiki/claims/ is human-framed; the evidence and citations are auto-gathered and verified by the LLM.
Decisions are the human's, alternatives are the LLM's: entries in decisions/ are author-written, but the alternatives for each decision are solicited from the LLM.

One line from Aimaker's four-month report ^[1]: "the LLM finds the connections, but judging which connections are meaningful is still my job." terry's four months arrive at the same conclusion.

11.11 The One-Line Flow — Summarized

Compressing this chapter to a single line.

arXiv link → /paper creates papers/ + KB node → /paper-search next recommends what comes next → /survey "<title>" bootstraps a survey book → --orchestrate lets the multi-agent team write it autonomously → this survey is the result.

That is the worked example of the four-layer × six-level frame this book claims, and it is the most direct proof of that frame. (Chapter 12) shows the step-by-step roadmap for building your own workflow in the same shape — from L2 LLM Wiki all the way to L5 dry-lab AI Scientist, one step at a time.

References

Aimaker (2026). AI-powered second brain from LLM Wiki — 4-month report. Aimaker Substack, 2026.
Karpathy, A., Y. He, X. Lee, et al. (2026). LLM Wiki — A pattern for building personal knowledge bases using LLM agents. GitHub Gist, 2026-04-04.
Um, T. (terryum) (2026). Brain Augmentation — manifesto for AI-era self-generating knowledge environments. terryum.ai post #7, 2026-03-10. [Brain Augmentation, 2026; Um, 2026]
Um, T. (terryum) (2026). Democratization of Research — three stages (document → in silico → physical). terryum.ai post #25, 2026-04-15. [Democratization of Research, 2026]
Um, T. (terryum) (2026). From Claude Code to Codex — A Migration Note. terryum.ai notes post, 2026-04-24. [Claude-to-Codex migration note, 2026]
Um, T. (terryum) (2026). AAR (Automated Alignment Researchers) summary and analysis. terryum.ai paper post #28, 2026.
Um, T. (terryum) (2026). autoresearch summary and analysis. terryum.ai paper post, 2026.
Um, T. (terryum) (2026). AI Co-Scientist summary and analysis. terryum.ai paper post #11, 2026.
Um, T. (terryum) (2026). Self-Driving Labs summary and analysis. terryum.ai paper post #31, 2026.
Um, T. (terryum) (2026). Medical AI Scientist summary and analysis. terryum.ai paper post #21, 2026.
Um, T. (terryum) (2026). Harnessing Claude Intelligence. terryum.ai paper post, 2026.
Um, T. (terryum) (2026). Meta-Harness Optimization. terryum.ai paper post #22, 2026.
Um, T. (terryum) (2025). Conductor — LLM orchestration patterns. terryum.ai paper post, 2025.
ekadetov (2026). ekadetov/llm-wiki — Claude Code plugin for persistent compounding KBs in Obsidian. GitHub.
Astro-Han (2026). Astro-Han/karpathy-llm-wiki — Agent Skills-compatible LLM Wiki package. GitHub.
Astorian, L. (2026). lucasastorian/llmwiki — Open-source LLM Wiki with document upload + MCP. GitHub.
Yu, W. (2026). What Is Karpathy's LLM Wiki? A Zettelkasten User's Honest Review. Personal Blog, 2026.
Lu, C., Lu, C., et al. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292.
Wu, H., Zheng, B., et al. (2026). Towards a Medical AI Scientist. arXiv:2603.28589. #21
Joshi, U. (2026). Andrej Karpathy's LLM Wiki: Create your own knowledge base. Medium, 2026.
Paige (2026). Second-brain setup using Karpathy's LLM Wiki (video). YouTube, 2026.