Consolidated References

257 references

[1] Adam, David, "The AI co-scientist is here," Nature Medicine, 2026-03-16. [Adam, 2026]
[2] Anthropic, "Automated Alignment Researchers — Using LLMs to scale scalable oversight," Anthropic Research, 2026-04-14. [Anthropic, 2026]
[3] Astro-Han, "karpathy-llm-wiki — Agent Skills-compatible LLM Wiki for Claude Code/Codex," GitHub, 2026-04. [Astro-Han, 2026]
[4] Astorian, Lucas, "lucasastorian/llmwiki — Open-source LLM Wiki with document upload + Claude MCP," GitHub, 2026-04. [Astorian, 2026]
[5] Ahrens, Sönke (2017). How to Take Smart Notes. CreateSpace.
[6] Bush, Vannevar (1945). As We May Think (the Memex proposal). The Atlantic, July 1945.
[7] BSWEN, "What Results Did 700 Autoresearch Experiments Achieve Overnight?," Medium, 2026-03-30. [BSWEN, 2026]
[9] Clark, Andy and Chalmers, David (1998). The Extended Mind. Analysis 58(1): 7-19.
[10] Clark, Jack, "Import AI 454: Automating alignment research," Import AI, 2026-04-20. [Clark, 2026]
[12] Gottweis, Juraj et al. (2025). Towards an AI co-scientist. arXiv:2502.18864.
[13] Guan et al. (2026). AI-Assisted Drug Re-Purposing for Human Liver Fibrosis. Advanced Science. [Guan et al., 2026]
[14] Jumper, John et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature 596: 583-589.
[15] Karpathy, Andrej, "LLM Wiki — A pattern for building personal knowledge bases using LLMs," GitHub Gist, 2026-04-04. [Karpathy, 2026]
[16] Karpathy, Andrej, "LLM Wiki announcement (Twitter/X thread)," Twitter/X, 2026-04-04. [Karpathy, 2026]
[17] Karpathy, Andrej, "Farzapedia reply — personalization argument for LLM Wiki," Twitter/X, 2026-04-12. [Karpathy, 2026]
[18] Karpathy, Andrej, "karpathy/autoresearch — AI agents running research on single-GPU nanochat training," GitHub, 2026-03-07. [Karpathy, 2026]
[19] Karpathy, Andrej, "Autoresearch first-run tweet — 12h / 110 changes on nanochat," Twitter/X, 2026-03-07. [Karpathy, 2026]
[20] Karpathy, Andrej, "Autoresearch Round 1 tweet — 700 experiments / 11% Time-to-GPT-2 reduction," Twitter/X, 2026-03-09. [Karpathy, 2026]
[21] Karpathy, Andrej (2017). Software 2.0. Medium. [Karpathy, 2017]
[22] King, Ross D. et al. (2009). The Automation of Science. Science 324: 85-89. [King et al., 2009]
[23] Langley, Pat (1981). Data-Driven Discovery of Physical Laws (BACON). Cognitive Science 5(1): 31-54. [Langley, 1981]
[24] Lu, Chris et al. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292. [Lu et al., 2024]
[25] Luhmann, Niklas (1992). Communicating with Slip Boxes — An Empirical Account. Essay. [Luhmann, 1992]
[26] Packer, Charles et al. (2023). MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560. [Packer et al., 2023]
[27] Park, Joon Sung et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. UIST 2023. [Park et al., 2023]
[28] Park, Jaehong, "Forget RAG — Karpathy's 'LLM Wiki' as a new knowledge-management paradigm," GeekNews (Korean), 2026-05. [Park, 2026]
[29] Pilon, Simone et al. (2026). A flexible and affordable self-driving laboratory for automated reaction optimization. Nature Synthesis. [Pilon et al., 2026]
[30] Schmidgall et al. (2025). Evaluating Sakana's AI Scientist for Autonomous Research. arXiv:2502.14297. [Schmidgall et al., 2025]
[31] Silver, David et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature 529: 484-489. [Silver et al., 2016]
[33] The New Stack, "Andrej Karpathy's 630-line Python script ran 50 experiments overnight without any human," The New Stack, 2026-03. [The New Stack, 2026]
[34] Um, Taewoong, "Brain Augmentation — manifesto for AI-era self-generating knowledge environments," terryum.ai, 2026-03-10. [Um, 2026]
[35] Um, Taewoong, "Democratization of research — three stages (document → in silico → physical)," terryum.ai, 2026-04-15. [Um, 2026]
[36] Um, Taewoong, "Claude Code → Codex migration strategy," terryum.ai, 2026-04-24. [Um, 2026]
[38] Wang, Guanzhi et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. TMLR 2024. [Wang et al., 2023]
[39] Agentic Researcher, "The Agentic Researcher: A Practical Guide to AI-Assisted Research," arXiv:2603.15914, 2026. [Agentic Researcher, 2026]
[40] Agentpedia, "Karpathy's LLM Wiki: The Complete Guide to His Idea File," Agentpedia, 2026. [Agentpedia, 2026]
[41] AIwire, "Stanford's Paper2Agent Reimagines Scientific Papers as Interactive AI Agents," HPCwire AIwire, 2025-10-10. [AIwire, 2025]
[42] Anthropic, "Claude Code memory + subagent documentation," Anthropic Docs, 2026. [Anthropic, 2026]
[43] Denser.ai, "From RAG to LLM Wiki: What Karpathy's idea means for AI knowledge bases," Denser.ai Blog, 2026. [Denser, 2026]
[44] Ghafarollahi, Alireza et al. (2024). SciAgents: Automating scientific discovery through multi-agent intelligent graph reasoning. arXiv:2409.05556. [Ghafarollahi et al., 2024]
[45] Gottweis, Juraj et al. (2025). Towards an AI co-scientist (Google AI Co-Scientist). arXiv:2502.18864. [Gottweis et al., 2025]
[46] HKUDS (2025). AI-Researcher: Autonomous Scientific Innovation. arXiv:2505.18705. [HKUDS, 2025]
[47] InfoQ, "Paper2Agent Converts Scientific Papers into Interactive AI Agents," InfoQ, 2025-10. [InfoQ, 2025]
[48] Izacard, Gautier et al. (2022). Atlas: Few-shot Learning with Retrieval Augmented Language Models. arXiv:2208.03299. [Izacard et al., 2022]
[49] Lala, J. et al. (2024). PaperQA2 — Language agents achieve superhuman synthesis of scientific knowledge. arXiv:2409.13740. [Lala et al., 2024]
[50] Lewis, Patrick et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020. [Lewis et al., 2020]
[51] OpenAI, "Codex /goal Command," Ralphable, 2026. [OpenAI, 2026]
[52] Stanford team (2025). Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents. arXiv:2509.06917. [Stanford, 2025]
[53] Tecton & Tide, "/goal: The Six-Hour Codex Run That Survived a Five-Hour Pause," Tecton & Tide Blog, 2026-04. [Tecton & Tide, 2026]
[54] Um, Taewoong, "Claude Code → Codex migration strategy," terryum.ai, 2026-04-24. [Um, 2026]
[55] Willison, Simon, "Codex CLI 0.128.0 adds /goal," Simon Willison's Blog, 2026-04-30. [Willison, 2026]
[56] Yamada, Yutaro et al. (2025). The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search. arXiv:2504.08066. [Yamada et al., 2025]
[57] Zhang, Xiangyue (2026). Deep Researcher Agent: An Autonomous Framework for 24/7 Deep Learning Experimentation. arXiv:2604.05854. [Zhang, 2026]
[58] Adam, D. (2026). The AI co-scientist is here. Nature Medicine, 2026-03-16.
[59] Agentic Researcher (2026). The Agentic Researcher: A Practical Guide to AI-Assisted Research. arXiv:2603.15914.
[61] Aimaker (2026). 4-Month Obsidian + LLM Wiki Longitudinal Report. Aimaker blog.
[62] Anthropic (2026). Automated Alignment Researchers — Using LLMs to scale scalable oversight. Anthropic Research, 2026-04-14.
[63] Astorian, L. (2026). lucasastorian/llmwiki — MCP-based LLM Wiki service. GitHub.
[65] Boiko, D. A., MacKnight, R., & Gomes, G. (2023). Emergent autonomous scientific research capabilities of large language models. arXiv:2304.05332; Nature, 2023.
[66] Bowman, S. R. et al. (2022). Measuring Progress on Scalable Oversight for Large Language Models. arXiv:2211.03540.
[67] Bran, A. M. et al. (2023). ChemCrow: Augmenting large-language models with chemistry tools. arXiv:2304.05376; Nature Machine Intelligence, 2024.
[68] Brazil, R. (2026). Inside the self-driving lab revolution. Nature, 2026-03-30.
[69] Burns, C. et al. (2023). Weak-to-Strong Generalization. arXiv:2312.09390; ICML 2024.
[70] Bush, V. (1945). As We May Think. The Atlantic, 1945-07.
[71] ChatGPT seed (2026). LLM Wiki → AI Scientist preliminary research synthesis (private session capture, 2026-05-22). Internal seed used by deep-researcher / critical-analyst / book-writer.
[72] 0xchamin (2026). Mcptube — YouTube-to-LLM-Wiki converter. GitHub.
[73] Chen, W. et al. (2023). AgentVerse: Facilitating Multi-Agent Collaboration. arXiv:2308.10848.
[74] Clark, A., & Chalmers, D. J. (1998). The Extended Mind. Analysis, 58(1), 7-19.
[75] Clark, J. (2026). Import AI 454 — Reading AAR carefully. Substack, 2026-04-20.
[77] Ghafarollahi, A., & Buehler, M. J. (2024). SciAgents: Automating Scientific Discovery through Multi-Agent Intelligent Graph Reasoning. arXiv:2409.05556.
[78] Gottweis, J. et al. (2025). Towards an AI co-scientist. arXiv:2502.18864.
[79] Guan, J. et al. (2026). AI-Assisted Drug Re-Purposing for Human Liver Fibrosis. Advanced Science.
[80] Hendrycks, D. et al. (2020). Measuring Massive Multitask Language Understanding. arXiv:2009.03300; ICLR 2021.
[81] Hong, S. et al. (2023). MetaGPT: Meta Programming for Multi-Agent Collaborative Framework. arXiv:2308.00352.
[82] HN (2026). LLM Wiki front-page thread (item 47640875). Hacker News, 2026-04-04.
[83] Izacard, G. et al. (2022). Atlas: Few-shot Learning with Retrieval Augmented Language Models. arXiv:2208.03299; JMLR 2023.
[84] Jumper, J. et al. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596: 583-589.
[85] Karpathy, A. (2017). Software 2.0. Medium.
[86] Karpathy, A. (2026a). karpathy/autoresearch. GitHub.
[87] Karpathy, A. (2026b). Autoresearch first overnight run tweet. Twitter/X, 2026-03-07.
[88] Karpathy, A. (2026c). Autoresearch Round 1 tweet (700 / 2 days / 20 keep-worthy / 11% Time-to-GPT-2). Twitter/X, ~2026-03-09.
[89] Karpathy, A. (2026d). LLM Wiki gist (karpathy/442a6bf555914893e9891c11519de94f). GitHub Gist, 2026-04-04.
[90] Karpathy, A. (2026e). LLM Wiki launch tweet. Twitter/X, 2026-04-04.
[91] Karpathy, A. (2026f). Farzapedia follow-up thread (Explicit / Yours / Files-over-apps / BYOAI). Twitter/X, 2026-04-12.
[92] King, R. D. et al. (2009). The Automation of Science. Science, 324: 85-89.
[93] Langley, P. (1981). Data-Driven Discovery of Physical Laws. Cognitive Science, 5(1).
[94] Lála, J., White, A. D. et al. (2024). PaperQA2: Faster, better, free research agents. arXiv:2409.13740.
[95] Lewis, P. et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. arXiv:2005.11401; NeurIPS 2020.
[96] Li, G. et al. (2023). CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society. arXiv:2303.17760; NeurIPS 2023.
[98] Madaan, A. et al. (2023). Self-Refine: Iterative Refinement with Self-Feedback. arXiv:2303.17651; NeurIPS 2023.
[101] OpenAI (2026b). AGENTS.md specification update. GitHub.
[102] Packer, C. et al. (2023). MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560; COLM 2024.
[104] Park, J. S. et al. (2023). Generative Agents: Interactive Simulacra of Human Behavior. arXiv:2304.03442; UIST 2023.
[105] Pilon, S. et al. (2026). RoboChem-Flex: A ~$5,000 modular self-driving laboratory. Nature Synthesis.
[106] Rein, D. et al. (2023). GPQA: A Graduate-Level Google-Proof Q&A Benchmark. arXiv:2311.12022; COLM 2024.
[107] Schick, T. et al. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. arXiv:2302.04761; NeurIPS 2023.
[108] Schmidt, M., & Lipson, H. (2009). Distilling Free-Form Natural Laws from Experimental Data. Science 324(5923): 81-85.
[109] Shen, Y. et al. (2023). HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face. arXiv:2303.17580; NeurIPS 2023.
[110] Shinn, N. et al. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv:2303.11366; NeurIPS 2023.
[111] Silver, D. et al. (2016). Mastering the game of Go with deep neural networks and tree search. Nature 529: 484-489.
[114] Tecton & Tide (2026). /goal: The Six-Hour Codex Run That Survived a Five-Hour Pause. Blog, 2026-05-01.
[116] Um, T. (2025). Conductor — LLM Orchestration Patterns. terryum.ai, 2025-12.
[119] Wang, G. et al. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. arXiv:2305.16291; TMLR 2024.
[120] Wei, J. et al. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. arXiv:2201.11903; NeurIPS 2022.
[121] Willison, S. (2026). Codex 0.128 — Persisted /goal Walkthrough. Personal blog, 2026-04-30.
[122] Yu, W. (2026). What Is Karpathy's LLM Wiki? A Zettelkasten User's Honest Review. Personal blog, 2026-04-20.
[123] Wu, H. et al. (2026). Towards a Medical AI Scientist. arXiv:2603.28589.
[126] Yao, S. et al. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629; ICLR 2023.
[127] Yao, S. et al. (2023). Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv:2305.10601; NeurIPS 2023.
[129] Karpathy, A. (2026). LLM Wiki — A pattern for building personal knowledge bases using LLMs. GitHub Gist, 2026-04-04.
[130] Karpathy, A., "LLM Wiki announcement (Twitter/X thread)," 2026-04-04. [Karpathy, 2026]
[131] Karpathy, A., "Farzapedia reply — personalization argument for LLM Wiki," 2026-04-12. [Karpathy, 2026]
[132] Karpathy, A. (2017). Software 2.0. Medium.
[133] MindStudio (2026). What Is Andrej Karpathy's LLM Wiki? How to Build a Personal Knowledge Base With Claude Code. MindStudio Blog. [MindStudio, 2026]
[134] Cognition AI (2026). llm-wiki: the reference implementation of Karpathy's self-building AI memory pattern. Cognition blog (re-syndicated). [Cognition, 2026]
[135] Denser.ai (2026). From RAG to LLM Wiki: What Karpathy's idea means for AI knowledge bases. Denser.ai blog. [Denser, 2026]
[136] Analytics Vidhya (2026). LLM Wiki Revolution: How Andrej Karpathy's Idea is Changing AI. Analytics Vidhya blog. [Analytics Vidhya, 2026]
[137] Agentpedia (2026). Karpathy's LLM Wiki: The Complete Guide to His Idea File. Agentpedia blog. [Agentpedia, 2026]
[138] Lobster Pack (2026). Karpathy's LLM Wiki and the rise of "idea files" — why sharing instructions beats sharing code. Lobster Pack blog. [Lobster Pack, 2026]
[139] WebEdge (2026). Karpathy's LLM Knowledge Base System: Full Breakdown of His CLAUDE.md Schema. MindStudio Blog (WebEdge attribution). [WebEdge, 2026]
[140] Starmorph (2026). Karpathy's LLM Wiki: Step-by-step setup guide. Starmorph blog. [Starmorph, 2026]
[141] Park, J. (2026). RAG is forgotten: Karpathy's "LLM Wiki" and a new knowledge-management paradigm (Korean). GeekNews / WikiDocs blog. [Park, 2026]
[142] Anthropic (2026). Claude Code documentation. Anthropic docs. [Anthropic, 2026]
[143] OpenAI (2026). Custom instructions with AGENTS.md (Codex). OpenAI Developers Portal. [OpenAI, 2026]
[144] Fulkerson, A. (2026). Karpathy's Pattern for an LLM Wiki in Production. aaronfulkerson.com blog. [Fulkerson, 2026]
[145] Aimaker (2026). AI-powered second brain from LLM Wiki — 4-month report. Aimaker Substack. [Aimaker, 2026]
[146] Hacker News community, "LLM Wiki — example of an 'idea file' (Hacker News front-page thread)," 2026-04-04. [HN, 2026]
[147] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., et al. (2020). Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020. arXiv:2005.11401.
[148] Karpukhin, V., Oğuz, B., Min, S., Lewis, P., Wu, L., Edunov, S., et al. (2020). Dense Passage Retrieval for Open-Domain Question Answering. EMNLP 2020. arXiv:2004.04906.
[149] Johnson, J., Douze, M., and Jégou, H. (2019). Billion-scale similarity search with GPUs (FAISS). IEEE Transactions on Big Data. arXiv:1702.08734. DOI:10.1109/TBDATA.2019.2921572.
[150] Izacard, G., Lewis, P., Lomeli, M., Hosseini, L., Petroni, F., Schick, T., et al. (2022). Atlas: Few-shot Learning with Retrieval Augmented Language Models. JMLR 2023. arXiv:2208.03299.
[151] Bush, V. (1945). As We May Think (the Memex proposal). The Atlantic Monthly, July 1945.
[152] Luhmann, N. (1992). Communicating with Slip Boxes — An Empirical Account. Universität Bielefeld (translated essay).
[153] Clark, A. and Chalmers, D. (1998). The Extended Mind. Analysis 58 (1): 7-19. DOI:10.1093/analys/58.1.7.
[154] Ahrens, S. (2017). How to Take Smart Notes. Book (CreateSpace / Independently Published).
[155] Packer, C., Wooders, S., Lin, K., Fang, V., Patil, S. G., Stoica, I., and Gonzalez, J. E. (2023). MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560.
[156] Wang, G., Xie, Y., Jiang, Y., Mandlekar, A., Xiao, C., Zhu, Y., Fan, L., and Anandkumar, A. (2023). Voyager: An Open-Ended Embodied Agent with Large Language Models. TMLR 2024. arXiv:2305.16291.
[157] Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., and Bernstein, M. S. (2023). Generative Agents: Interactive Simulacra of Human Behavior. UIST 2023. arXiv:2304.03442. DOI:10.1145/3586183.3606763.
[158] Shinn, N., Cassano, F., Berman, E., Gopinath, A., Narasimhan, K., and Yao, S. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS 2023. arXiv:2303.11366.
[159] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023. arXiv:2210.03629.
[160] Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N., and Scialom, T. (2023). Toolformer: Language Models Can Teach Themselves to Use Tools. NeurIPS 2023. arXiv:2302.04761.
[161] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2022). Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 2022. arXiv:2201.11903.
[162] FutureHouse (2024). PaperQA2: Superhuman scientific literature search (FutureHouse announcement). FutureHouse blog. [FutureHouse, 2024]
[163] skyllwt (DAIR Lab, PKU) (2026). OmegaWiki — Wiki-centric full-lifecycle AI research platform on Claude Code. GitHub. [skyllwt, 2026]
[166] Yu, W. (2026). What Is Karpathy's LLM Wiki? A Zettelkasten User's Honest Review. yu-wenhao.com blog. [Yu, 2026]
[167] Infranodus (2026). Infranodus on LLM Wiki — graph DBs as the missing layer. Infranodus blog. [Infranodus, 2026]
[168] innobu (2026). Karpathy's LLM Wiki: Second Brain and the Enterprise Reality Check 2026. innobu blog. [innobu, 2026]
[169] AI Critique (2026). Andrej Karpathy's latest concept 'LLM Wiki' and the future of enterprise knowledge. AI Critique blog. [AI Critique, 2026]
[170] Critical Analyst (2026). Research gap analysis — gaps.md (internal). terry-surveys repo. [Critical Analyst, 2026]
[173] 0xchamin (2026). Mcptube — Karpathy's LLM Wiki applied to YouTube (transcripts + vision frames). GitHub + Hacker News Show HN. [0xchamin, 2026]
[174] Astorian, L. and Hacker News community, "Show HN: LLM Wiki — Open-Source Implementation of Karpathy's LLM Wiki (lucasastorian)," 2026-04. [HN, 2026]
[175] Hacker News community, "Show HN: A Karpathy-style LLM wiki your agents maintain (Markdown and Git)," 2026-05. [HN, 2026]
[176] 0xchamin and Hacker News community, "Show HN: Mcptube — Karpathy's LLM Wiki idea applied to YouTube videos," 2026-04. [HN, 2026]
[177] Starmorph (2026). Karpathy's LLM Wiki — Full Beginner Setup Guide (video). YouTube. [Starmorph, 2026]
[178] Data Science Dojo (2026). The LLM Wiki Pattern by Andrej Karpathy — 5-paper, 30-minute tutorial. Data Science Dojo blog. [Data Science Dojo, 2026]
[179] Joshi, U. (2026). Andrej Karpathy's LLM Wiki: Create your own knowledge base. Medium. [Joshi, 2026]
[180] Global Advisors / Quantified Strategy Consulting (2026). Term: LLM Wiki — Andrej Karpathy. Global Advisors blog. [Global Advisors, 2026]
[181] TiddlyWiki community (2026). Riding the wave of Andrej Karpathy's 'LLM Wiki' (Talk TW). TiddlyWiki Talk forum. [TiddlyWiki, 2026]
[182] Herk, N. (2026). Karpathy 10x'ed Claude Code (LLM Wiki framing video). YouTube. [Herk, 2026]
[183] Paige (2026). Second-brain setup using Karpathy's LLM Wiki (video). YouTube. [Paige, 2026]
[184] Anthropic, "Automated Alignment Researchers," 2026-04. [Anthropic, 2026]
[185] Schmidgall, S. et al. (2025). Critical evaluation of The AI Scientist v1. arXiv:2502.14297.
[186] ChatGPT seed (2026). Pre-research seed document — research-wiki schema, 9-metric framework. terry-surveys repo. [ChatGPT seed, 2026]
[187] Willison, S. (2026). Notes on Codex /goal. simonwillison.net. [Willison, 2026]
[188] Bran, A. M., Cox, S., Schilter, O., Baldassari, C., White, A. D., & Schwaller, P. (2023). ChemCrow: Augmenting large-language models with chemistry tools. arXiv:2304.05376; Nature Machine Intelligence 2024.
[189] Google AI (2025). Accelerating scientific breakthroughs with an AI co-scientist. Google Research blog, 2025-02-19.
[190] Lu, C., Lu, C., Lange, R. T., Foerster, J., Clune, J., & Ha, D. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292.
[193] Shen, Y., Song, K., Tan, X., Li, D., Lu, W., & Zhuang, Y. (2023). HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face. arXiv:2303.17580; NeurIPS 2023.
[194] Shinn, N., Cassano, F., Berman, E., Gopinath, A., Narasimhan, K., & Yao, S. (2023). Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv:2303.11366; NeurIPS 2023.
[196] Um, T. (2026a). AI Co-Scientist post. terryum.ai paper post.
[197] Richards, T. B. (2023). Auto-GPT: An Autonomous GPT-4 Experiment. GitHub.
[198] Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. (2023). Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv:2305.10601; NeurIPS 2023.
[199] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022). ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629; ICLR 2023.
[200] Bowman, S. R., Hyun, J., Perez, E., Chen, E., Pettit, C., Heiner, S., et al. (2022). Measuring Progress on Scalable Oversight for Large Language Models. arXiv:2211.03540.
[201] Burns, C., Izmailov, P., Kirchner, J. H., Baker, B., Gao, L., Aschenbrenner, L., Chen, Y., Ecoffet, A., Joglekar, M., Leike, J., Sutskever, I., & Wu, J. (2023). Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision. arXiv:2312.09390; ICML 2024.
[202] BSWEN (2026). What Results Did 700 Autoresearch Experiments Achieve Overnight? BSWEN Medium, 2026-03-30. [BSWEN, 2026]
[203] Clark, J. (2026). Import AI 454: Automating alignment research. Import AI Substack, 2026-04-20.
[205] FutureHouse (2024b). PaperQA2 — FutureHouse Cookbook entry. FutureHouse Cookbook.
[207] Karpathy, A. (2026b). karpathy/autoresearch. GitHub.
[209] Karpathy, A. (2026d). Autoresearch first-run tweet — 12h / 110 changes on nanochat. X (Twitter).
[210] Karpathy, A. (2026e). karpathy/nanochat. GitHub.
[211] Lála, J., Skarlinski, M., White, A. D., et al. (2024). PaperQA2 — Language agents achieve superhuman synthesis of scientific knowledge. arXiv:2409.13740.
[213] Um, T. (2026a). autoresearch summary + analysis. terryum.ai paper post.
[214] Um, T. (2026b). AAR (Automated Alignment Researchers) summary + analysis. terryum.ai paper post.
[215] Wu, H., Zheng, B., Song, D., Jiang, Y., Gao, J., Xing, L., Sun, L., & Yuan, Y. (2026). Towards a Medical AI Scientist. arXiv:2603.28589.
[216] Adam, D. (2026). The AI co-scientist is here (Nature Medicine feature). Nature Medicine. DOI:10.1038/s41591-026-04275-z.
[217] HIMS (2026). RoboChem Flex: democratisation of the autonomous synthesis robot. HIMS, University of Amsterdam, 2026.
[218] Karpathy, A. (2026). karpathy/autoresearch. GitHub.
[219] King, R. D., Rowland, J., Oliver, S. G., Young, M., Aubrey, W., Byrne, E., Liakata, M., Markham, M., Pir, P., Soldatova, L. N., Sparkes, A., Whelan, K. E., & Clare, A. (2009). The Automation of Science. Science 324(5923):85–89. DOI:10.1126/science.1165620.
[221] Pilon, S., ... & Noël, T. (2026). A flexible and affordable self-driving laboratory for automated reaction optimization (RoboChem-Flex). Nature Synthesis. DOI:10.1038/s44160-026-01053-0.
[223] Um, T. (2026a). Medical AI Scientist summary + analysis. terryum.ai paper post.
[224] Um, T. (2026b). Self-Driving Labs summary + analysis. terryum.ai paper post.
[225] Anthropic (2026). Claude Code memory + subagent documentation. Anthropic Developer Docs.
[226] Karpathy, A. (2026). LLM Wiki — A pattern for building personal knowledge bases using LLMs. GitHub Gist, 2026-04-04.
[227] Karpathy, A. (2026). Farzapedia reply — personalization argument for LLM Wiki. X (Twitter), 2026.
[228] OpenAI (2026). Custom instructions with AGENTS.md. OpenAI Codex Docs.
[229] OpenAI Codex Team (2026). Codex CLI 0.128.0 release notes. OpenAI Codex Changelog, 2026-04-30.
[230] Park, J. (GeekNews) (2026). Forget RAG: Karpathy's LLM Wiki and a new knowledge-management paradigm. GeekNews, 2026. [GeekNews / Park, 2026]
[231] Stanford Paper2Agent Team (2025). Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents. arXiv:2509.06917.
[232] Tecton and Tide (2026). /goal: The Six-Hour Codex Run That Survived a Five-Hour Pause. Tecton and Tide Blog. [Tecton and Tide, 2026]
[233] Um, T. (terryum) (2026). From Claude Code to Codex — A Migration Note. terryum.ai post, 2026-04-24. [Um, 2026]
[234] Willison, S. (2026). Codex CLI 0.128.0 adds /goal. Simon Willison's Weblog, 2026-04-30.
[235] Lu, C., Lu, C., et al. (2024). The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292.
[236] Um, T. (terryum) (2026). Brain Augmentation — manifesto for AI-era self-generating knowledge environments. terryum.ai post #7, 2026-03-10. [Brain Augmentation, 2026; Um, 2026]
[237] Um, T. (terryum) (2026). Democratization of Research — three stages (document → in silico → physical). terryum.ai post #25, 2026-04-15. [Democratization of Research, 2026]
[238] Um, T. (terryum) (2026). AAR (Automated Alignment Researchers) summary and analysis. terryum.ai paper post, 2026.
[239] Um, T. (terryum) (2026). autoresearch summary and analysis. terryum.ai paper post, 2026.
[240] Um, T. (terryum) (2026). AI Co-Scientist summary and analysis. terryum.ai paper post, 2026.
[241] Um, T. (terryum) (2026). Self-Driving Labs summary and analysis. terryum.ai paper post, 2026.
[242] Um, T. (terryum) (2026). Medical AI Scientist summary and analysis. terryum.ai paper post, 2026.
[243] Um, T. (terryum) (2026). Harnessing Claude Intelligence. terryum.ai paper post, 2026.
[244] Um, T. (terryum) (2026). Meta-Harness Optimization. terryum.ai paper post, 2026.
[245] Um, T. (terryum) (2025). Conductor — LLM orchestration patterns. terryum.ai paper post, 2025.
[246] Wu, H., Zheng, B., et al. (2026). Towards a Medical AI Scientist. arXiv:2603.28589.
[248] Gottweis, J., et al. (2025). Towards an AI co-scientist (Google AI Co-Scientist). arXiv:2502.18864.
[249] Schmidgall, S., et al. (2025). Evaluating Sakana's AI Scientist for Autonomous Research. arXiv:2502.14297.
[250] Pilon, S., et al. (2026). A flexible and affordable self-driving laboratory for automated reaction optimization. Nature Synthesis, 2026.
[251] The New Stack (2026). Karpathy's AutoResearch Ran 700 ML Experiments in 2 Days Without Human Input. Reported by Um, T., terryum.ai, 2026. [The New Stack, 2026]
[252] Um, T. (terryum) (2026). Democratization of Research — three stages. terryum.ai post #25, 2026-04-15. [Democratization of Research, 2026]
[253] Um, T. (terryum) (2026). AAR summary and analysis. terryum.ai paper post, 2026.
[254] Adam, D. (2026). The AI co-scientist is here. Nature Medicine Feature, 2026-03-16.
[255] Guan, Y., et al. (2026). Independent wet-lab replication of liver fibrosis target validation. Reported on terryum.ai paper post, 2026. [Guan et al., 2026]
[256] Zhang, S., et al. (2026). Deep Researcher Agent — Think/Execute/Monitor/Reflect with zero-cost monitoring. Reported via terryum.ai, 2026. [Zhang et al., 2026]
[257] Restrepo, G. (2026). Expanding diversity in chemical space. Nature Chemistry, 2026-03-19. [Restrepo, 2026]

Acknowledgment

This book builds on the author's blog posts #7 'Brain Augmentation' and #25 'Democratization of Research', and on Part IV (chapters 10-12) of the earlier survey 'From Claude Code to Codex'.

Thanks to Andrej Karpathy for the 2026-04-04 LLM Wiki gist that launched the ecosystem, and to Anthropic's Automated Alignment Researchers (2026-04), Google's AI Co-Scientist (2025-02), and Sakana AI's original The AI Scientist v1 (2024-08) for the genealogy.

Discussion threads on Hacker News, Reddit (r/LocalLLaMA and r/ClaudeAI), and GeekNews, together with the 'Karpathy's LLM Wiki Full Beginner Setup Guide' video series, form the source material for Part II.

This project was built using the Harness skill by Minho Hwang.

AI tools were used in the production of this work: Claude (Opus 4.6) for literature survey, content generation, and manuscript preparation.