12: Kolmogorov Complexity — Sunday Greatest Hits
The only full textbook on Ilya Sutskever's famous reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim.
Show Notes
Episode 0012: Kolmogorov Complexity and Algorithmic Randomness
Series: Greatest Hits — Ilya Sutskever's Reading List
Date: Sunday, February 22, 2026
Overview
The only full textbook on Ilya Sutskever's famous ~30 item reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim.
We cover the key concepts: Kolmogorov complexity, the invariance theorem, incompressibility, algorithmic randomness, Berry's paradox, connections to Gödel and Turing, and why all of this matters for understanding why neural networks generalize.
The Textbook
- Title: Kolmogorov Complexity and Algorithmic Randomness
- Authors: Alexander Shen, Vladimir A. Uspensky, Nikolai Vereshchagin
- Publisher: American Mathematical Society (Mathematical Surveys and Monographs, Volume 220), 2017
- AMS Bookstore: bookstore.ams.org/surv-220
- Author's draft (PDF): lirmm.fr/~ashen/kolmbook-eng-scan.pdf
- Alexander Shen's page: lirmm.fr/~ashen
Ilya Sutskever's Reading List
- Ilya Sutskever's recommended reading list — the ~30 papers and one textbook he told John Carmack would cover "90% of what matters"
- Ilya Sutskever on "compression is intelligence" — various talks and interviews
Key Concepts Discussed
Kolmogorov Complexity
- The length of the shortest program that produces a given output
- Independently discovered by Ray Solomonoff (1960), Andrei Kolmogorov (1965), and Gregory Chaitin (1966)
- Wikipedia: Kolmogorov complexity
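K(x) itself is uncomputable, but any real compressor gives a computable upper bound on it: a compressed file plus a decompressor is a valid description. A minimal sketch in Python, using zlib as the stand-in description method (the helper name `description_length` is ours, not from the book):

```python
import zlib

def description_length(s: bytes) -> int:
    """Bytes needed by one fixed description method (zlib at max level).

    This is only an upper bound on K(s): the true shortest program may
    be far smaller, and K itself is uncomputable.
    """
    return len(zlib.compress(s, level=9))

# A highly patterned 10,000-byte string has a very short description.
structured = b"ab" * 5_000
print(len(structured), description_length(structured))
```

Any compressor you pick defines such an upper bound; the theory asks about the best possible one.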
The Invariance Theorem
- Kolmogorov complexity is independent of programming language, up to an additive constant
- Makes the definition well-defined as a property of the string itself
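In standard notation, the theorem says switching the universal machine shifts complexity by at most a constant:

```latex
% For any two universal machines U and V there is a constant c_{U,V},
% depending on the machines but not on x, such that for every string x:
\[
  \lvert K_U(x) - K_V(x) \rvert \;\le\; c_{U,V}.
\]
```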
Algorithmic Randomness & Martin-Löf
- A string is random iff it is incompressible
- Martin-Löf randomness: passes every computable statistical test
- Equivalence between the two definitions is a central theorem
- Wikipedia: Algorithmic randomness
- Wikipedia: Per Martin-Löf
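A quick way to see why incompressible strings must exist, and in fact dominate: there are fewer short descriptions than strings. The sketch below checks that counting bound and, empirically, that high-entropy bytes resist zlib (variable names are ours):

```python
import os
import zlib

# Counting argument: there are at most 2**(n - c) - 1 descriptions that
# are c or more bits shorter than n, so the fraction of n-bit strings
# compressible by c bits is below 2**(1 - c).
c = 10
print(f"fraction compressible by {c} bits: < {2 ** (1 - c)}")

# Empirically: 10,000 high-entropy bytes gain nothing under zlib
# (the output is slightly *larger* because of format overhead).
n = 10_000
random_data = os.urandom(n)
print(len(zlib.compress(random_data, level=9)) >= n)
```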
Berry's Paradox
- "The smallest positive integer not definable in fewer than twenty words"
- Formalized via Kolmogorov complexity to prove uncomputability
- Wikipedia: Berry paradox
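The formalized version of the paradox turns into a proof by contradiction:

```latex
% Suppose K were computable. Then the program "enumerate strings in
% order and output the first x_n with K(x_n) > n" describes x_n using
% only log_2 n + O(1) bits (the bits of n plus fixed code), so
\[
  n \;<\; K(x_n) \;\le\; \log_2 n + O(1),
\]
% which fails for all large n. Hence K is not computable.
```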
Connections to Gödel and Turing
- Kolmogorov complexity is uncomputable — an algorithm for K would yield a solution to the halting problem, and conversely a halting oracle suffices to compute K
- Chaitin's Omega: halting probability with algorithmically random digits
- Incompleteness, halting, and randomness as three faces of the same limit
- Wikipedia: Chaitin's constant
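For a fixed prefix-free universal machine U, Chaitin's constant is the probability that a uniformly random program halts:

```latex
\[
  \Omega_U \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-\lvert p \rvert}.
\]
% Knowing the first n bits of Omega_U decides halting for every program
% of length at most n, which is why its digit sequence must be
% algorithmically random.
```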
Why This Matters for AI
Compression ↔ Prediction (Shannon Duality)
- Good predictors are good compressors and vice versa
- Neural networks trained to predict are implicitly compressing
- Shannon's 1948 paper
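The duality is quantitative: a predictor that assigns probability p to the symbol that actually occurs can encode it in -log2(p) bits (arithmetic coding approaches this ideal), so total code length equals total log-loss. A small illustration, with a helper name of our own:

```python
import math

def code_length_bits(predicted_probs):
    """Ideal code length, in bits, for a symbol sequence, given the
    probability the predictor assigned to each symbol that occurred."""
    return sum(-math.log2(p) for p in predicted_probs)

# A confident, correct predictor compresses; a coin-flip predictor doesn't.
print(code_length_bits([0.9] * 10))  # about 1.52 bits for 10 symbols
print(code_length_bits([0.5] * 10))  # exactly 10.0 bits
```

Lower log-loss and better compression are the same quantity seen from two sides.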
Minimum Description Length (MDL)
- Also on Ilya's reading list
- Best model = shortest total description (model + data given model)
- Occam's razor made mathematical
- Wikipedia: Minimum description length
- Jorma Rissanen's original work (1978)
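The two-part form of the principle:

```latex
% MDL model selection: prefer the model minimizing total description length,
\[
  \hat{M} \;=\; \arg\min_{M \in \mathcal{M}} \bigl[\, L(M) + L(D \mid M) \,\bigr],
\]
% where L(M) is the bits needed to describe the model and L(D | M) the
% bits needed to describe the data with the model's help. A model that
% memorizes D drives L(D | M) to zero but makes L(M) huge; MDL charges
% for both, which is Occam's razor in bits.
```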
Solomonoff Induction
- Theoretically optimal prediction via Kolmogorov complexity prior
- Every neural network is an approximation of this uncomputable ideal
- Wikipedia: Solomonoff's theory of inductive inference
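The universal prior weights every program that could have produced the observed prefix, favoring short ones:

```latex
% Solomonoff's universal a priori probability of a string x, over a
% prefix-free universal machine U (x* means "output begins with x"):
\[
  M(x) \;=\; \sum_{p \,:\, U(p) = x*} 2^{-\lvert p \rvert},
  \qquad
  M(b \mid x) \;=\; \frac{M(xb)}{M(x)}.
\]
% Predicting with M is optimal in expectation but uncomputable: the sum
% ranges over the halting behavior of all programs.
```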
Deep Learning Generalization
- Why overparameterized networks can generalize rather than memorize: the solutions training finds have low description length
- Blier & Ollivier (2018): measured by prequential coding, trained networks give training data a far shorter description than its raw size
- The Description Length of Deep Learning Models (Blier & Ollivier, NeurIPS 2018)
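The measurement behind that result is prequential (online) coding: encode each label at the cost of the model's prediction before it sees that label, then let the model learn from it; the accumulated bits are the description length. A toy sketch assuming a hypothetical online-model interface with `predict_proba` and `update` (the `LaplaceCoin` model is our illustration, not from the paper):

```python
import math

class LaplaceCoin:
    """Toy online model: add-one-smoothed estimate of a binary label."""
    def __init__(self):
        self.counts = [1, 1]
    def predict_proba(self, x):
        total = sum(self.counts)
        return [c / total for c in self.counts]
    def update(self, x, y):
        self.counts[y] += 1

def prequential_code_length(model, stream):
    """Bits to encode the labels: charge -log2 p(y) under the model's
    prediction *before* each label is revealed, then update the model."""
    total_bits = 0.0
    for x, y in stream:
        p = model.predict_proba(x)[y]
        total_bits += -math.log2(p)
        model.update(x, y)
    return total_bits

# 20 identical labels cost far fewer than 20 bits once the model adapts.
print(prequential_code_length(LaplaceCoin(), [(None, 1)] * 20))
```

The first labels are expensive, later ones nearly free; that adaptive discount is exactly what the paper measures for deep networks.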
The Authors
- Alexander Shen — LIRMM, CNRS & Université de Montpellier, France. Homepage
- Vladimir A. Uspensky (1930–2018) — Lomonosov Moscow State University. Pioneer in Russian mathematical logic. Wikipedia
- Nikolai Vereshchagin — Lomonosov Moscow State University. Google Scholar
Further Reading
- Li & Vitányi, "An Introduction to Kolmogorov Complexity and Its Applications" — the other major textbook on Kolmogorov complexity
- Hutter, "Universal Artificial Intelligence" — extends Solomonoff induction to reinforcement learning (AIXI)
- Grünwald, "The Minimum Description Length Principle" — comprehensive treatment of MDL
This episode was researched and written with AI assistance.