[21] Karpathy, A. (2017).
Software 2.0. Medium.
[147] Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., et al. (2020).
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks. NeurIPS 2020. arXiv:2005.11401.
[148] Karpukhin, V., Oğuz, B., Min, S., Lewis, P., Wu, L., Edunov, S., et al. (2020).
Dense Passage Retrieval for Open-Domain Question Answering. EMNLP 2020. arXiv:2004.04906.
[150] Izacard, G., Lewis, P., Lomeli, M., Hosseini, L., Petroni, F., Schick, T., et al. (2022).
Atlas: Few-shot Learning with Retrieval Augmented Language Models. JMLR 2023. arXiv:2208.03299.
[153] Clark, A. and Chalmers, D. (1998).
The Extended Mind. Analysis 58(1):7–19. DOI:10.1093/analys/58.1.7.
[155] Packer, C., Wooders, S., Lin, K., Fang, V., Patil, S. G., Stoica, I., and Gonzalez, J. E. (2023).
MemGPT: Towards LLMs as Operating Systems. arXiv:2310.08560.
[156] Wang, G., Xie, Y., Jiang, Y., Mandlekar, A., Xiao, C., Zhu, Y., Fan, L., and Anandkumar, A. (2023).
Voyager: An Open-Ended Embodied Agent with Large Language Models. TMLR 2024. arXiv:2305.16291.
[157] Park, J. S., O'Brien, J. C., Cai, C. J., Morris, M. R., Liang, P., and Bernstein, M. S. (2023).
Generative Agents: Interactive Simulacra of Human Behavior. UIST 2023. arXiv:2304.03442. DOI:10.1145/3586183.3606763.
[158] Shinn, N., Cassano, F., Berman, E., Gopinath, A., Narasimhan, K., and Yao, S. (2023).
Reflexion: Language Agents with Verbal Reinforcement Learning. NeurIPS 2023. arXiv:2303.11366.
[159] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., and Cao, Y. (2022).
ReAct: Synergizing Reasoning and Acting in Language Models. ICLR 2023. arXiv:2210.03629.
[160] Schick, T., Dwivedi-Yu, J., Dessì, R., Raileanu, R., Lomeli, M., Zettlemoyer, L., Cancedda, N., and Scialom, T. (2023).
Toolformer: Language Models Can Teach Themselves to Use Tools. NeurIPS 2023. arXiv:2302.04761.
[161] Wei, J., Wang, X., Schuurmans, D., Bosma, M., Ichter, B., Xia, F., Chi, E., Le, Q., and Zhou, D. (2022).
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models. NeurIPS 2022. arXiv:2201.11903.
[188] Bran, A. M., Cox, S., Schilter, O., Baldassari, C., White, A. D., & Schwaller, P. (2023).
ChemCrow: Augmenting large-language models with chemistry tools. arXiv:2304.05376; Nature Machine Intelligence 2024.
[190] Lu, C., Lu, C., Lange, R. T., Foerster, J., Clune, J., & Ha, D. (2024).
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery. arXiv:2408.06292.
[193] Shen, Y., Song, K., Tan, X., Li, D., Lu, W., & Zhuang, Y. (2023).
HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face. arXiv:2303.17580; NeurIPS 2023.
[194] Shinn, N., Cassano, F., Berman, E., Gopinath, A., Narasimhan, K., & Yao, S. (2023).
Reflexion: Language Agents with Verbal Reinforcement Learning. arXiv:2303.11366; NeurIPS 2023.
[198] Yao, S., Yu, D., Zhao, J., Shafran, I., Griffiths, T. L., Cao, Y., & Narasimhan, K. (2023).
Tree of Thoughts: Deliberate Problem Solving with Large Language Models. arXiv:2305.10601; NeurIPS 2023.
[199] Yao, S., Zhao, J., Yu, D., Du, N., Shafran, I., Narasimhan, K., & Cao, Y. (2022).
ReAct: Synergizing Reasoning and Acting in Language Models. arXiv:2210.03629; ICLR 2023.
[200] Bowman, S. R., Hyun, J., Perez, E., Chen, E., Pettit, C., Heiner, S., et al. (2022).
Measuring Progress on Scalable Oversight for Large Language Models. arXiv:2211.03540.
[201] Burns, C., Izmailov, P., Kirchner, J. H., Baker, B., Gao, L., Aschenbrenner, L., Chen, Y., Ecoffet, A., Joglekar, M., Leike, J., Sutskever, I., & Wu, J. (2023).
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision. arXiv:2312.09390; ICML 2024.
[215] Wu, H., Zheng, B., Song, D., Jiang, Y., Gao, J., Xing, L., Sun, L., & Yuan, Y. (2026).
Towards a Medical AI Scientist. arXiv:2603.28589.
[219] King, R. D., Rowland, J., Oliver, S. G., Young, M., Aubrey, W., Byrne, E., Liakata, M., Markham, M., Pir, P., Soldatova, L. N., Sparkes, A., Whelan, K. E., & Clare, A. (2009).
The Automation of Science. Science 324(5923):85–89. DOI:10.1126/science.1165620.
Acknowledgment
This book builds on the author's blog posts #7 'Brain Augmentation' and #25 'Democratization of Research', and on Part IV (chapters 10-12) of the earlier survey 'From Claude Code to Codex'.
Thanks to Andrej Karpathy for the 2026-04-04 LLM Wiki gist that launched the ecosystem, and to Anthropic's Automated Alignment Researchers (2026-04), Google's AI Co-Scientist (2025-02), and Sakana AI's original The AI Scientist v1 (2024-08) for the genealogy.
Discussion threads on Hacker News, Reddit (r/LocalLLaMA and r/ClaudeAI), and GeekNews, together with the 'Karpathy's LLM Wiki Full Beginner Setup Guide' video series, form the source material for Part II.
This project was built using the Harness skill by Minho Hwang.
AI tools were used in the production of this work: Claude (Opus 4.6) for literature survey, content generation, and manuscript preparation.