Daily Tech Feed: From the Labs

Deep dives into foundational AI and ML research papers

12: Kolmogorov Complexity — Sunday Greatest Hits

The only full textbook on Ilya Sutskever's famous reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim.

Show Notes

Episode 0012: Kolmogorov Complexity and Algorithmic Randomness

Series: Greatest Hits — Ilya Sutskever's Reading List
Date: Sunday, February 22, 2026

Overview

The only full textbook on Ilya Sutskever's famous ~30-item reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim.

We cover the key concepts: Kolmogorov complexity, the invariance theorem, incompressibility, algorithmic randomness, Berry's paradox, connections to Gödel and Turing, and why all of this matters for understanding neural network generalization.

The Textbook

Ilya Sutskever's Reading List

Key Concepts Discussed

Kolmogorov Complexity

  • The length of the shortest program that produces a given output
  • Independently discovered by Ray Solomonoff (1960), Andrei Kolmogorov (1965), and Gregory Chaitin (1966)
  • Wikipedia: Kolmogorov complexity
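Since K(x) itself is uncomputable (see below), any real compressor only gives an upper bound on it. A minimal sketch of that idea, using Python's zlib as a stand-in for "the shortest program" — an illustrative assumption, not the book's formal definition:

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Length of zlib-compressed data: a crude upper bound on K(x)."""
    return len(zlib.compress(data, level=9))

structured = b"ab" * 500       # 1000 bytes with an obvious short description
random_ish = os.urandom(1000)  # 1000 bytes, incompressible with high probability

print(compressed_size(structured))  # tiny: the pattern is the "short program"
print(compressed_size(random_ish))  # near (or above) 1000: no pattern to exploit
```

The gap between the two outputs is the intuition behind incompressibility: most strings have no description much shorter than themselves.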

The Invariance Theorem

  • Kolmogorov complexity is independent of programming language, up to an additive constant
  • Makes the definition well-defined as a property of the string itself
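In symbols, the theorem can be sketched as follows (with U and V universal machines):

```latex
% For any two universal machines U and V there is a constant c_{UV},
% independent of x, such that
K_U(x) \le K_V(x) + c_{UV} \quad \text{for all strings } x.
% Intuitively, c_{UV} is the length of a V-interpreter written for U,
% so complexity is language-independent up to an additive constant.
```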

Algorithmic Randomness & Martin-Löf

Berry's Paradox

  • "The smallest positive integer not definable in fewer than twenty words"
  • Formalized via Kolmogorov complexity to prove uncomputability
  • Wikipedia: Berry paradox
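The formalized argument runs roughly as follows (a standard proof sketch, with c a machine-dependent constant):

```latex
% Suppose K were computable. Then for each n the program
%   "return the first string x with K(x) \ge n"
% is a valid description of some string x_n, and its length is only
% about \log_2 n + c bits (the bits needed to write down n). Hence
K(x_n) \le \log_2 n + c,
% while by construction K(x_n) \ge n: a contradiction for large n.
% So K is uncomputable -- Berry's paradox made precise.
```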

Connections to Gödel and Turing

  • Kolmogorov complexity is uncomputable (computing it would solve the halting problem)
  • Chaitin's Omega: halting probability with algorithmically random digits
  • Incompleteness, halting, and randomness as three faces of the same limit
  • Wikipedia: Chaitin's constant

Why This Matters for AI

Compression ↔ Prediction (Shannon Duality)

  • Good predictors are good compressors and vice versa
  • Neural networks trained to predict are implicitly compressing
  • Shannon's 1948 paper
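The duality can be made concrete: a predictor that assigns probability p to the next symbol implies an ideal code of -log2 p bits for it, so total codelength is just the model's negative log-likelihood. A minimal sketch with toy predictors (names and the example string are illustrative, not from the episode):

```python
import math
from collections import Counter

def codelength_bits(text: str, probs: dict) -> float:
    """Total ideal codelength: sum of -log2 p(symbol) over the text."""
    return sum(-math.log2(probs[ch]) for ch in text)

text = "abababababababab"  # 16 symbols, very predictable

# A predictor that has learned nothing: uniform over 26 letters.
uniform = {ch: 1 / 26 for ch in "abcdefghijklmnopqrstuvwxyz"}

# A predictor that has learned the symbol frequencies.
counts = Counter(text)
empirical = {ch: counts[ch] / len(text) for ch in counts}

print(codelength_bits(text, uniform))    # 16 * log2(26), about 75 bits
print(codelength_bits(text, empirical))  # 16 * 1 = 16.0 bits
```

The better predictor compresses the same text to a fraction of the bits: training a model to predict is training it to compress.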

Minimum Description Length (MDL)

  • Also on Ilya's reading list
  • Best model = shortest total description (model + data given model)
  • Occam's razor made mathematical
  • Wikipedia: Minimum description length
  • Jorma Rissanen's original work (1978)
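A two-part code can be sketched on binary strings: spend a few bits describing the model (here, a Bernoulli bias), then code the data under that model. The function names and coding scheme are illustrative assumptions, not Rissanen's exact formulation:

```python
import math

def entropy_bits(p: float) -> float:
    """Binary entropy H(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def mdl_bernoulli(bits: str) -> float:
    """Two-part codelength: model (count of ones) + data given model."""
    n, k = len(bits), bits.count("1")
    model_cost = math.log2(n + 1)        # which of the n+1 possible counts
    data_cost = n * entropy_bits(k / n)  # ideal code under Bernoulli(k/n)
    return model_cost + data_cost

biased = "1" * 95 + "0" * 5   # compressible: strong regularity
balanced = "10" * 50          # balanced counts: the model buys nothing

print(mdl_bernoulli(biased))    # ~35 bits, far below the raw 100 bits
print(mdl_bernoulli(balanced))  # ~107 bits, no better than storing raw
```

The model only pays for itself when the data has structure — exactly the Occam's-razor trade-off MDL formalizes.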

Solomonoff Induction

Deep Learning Generalization

The Authors

  • Alexander Shen — LIRMM, CNRS & Université de Montpellier, France. Homepage
  • Vladimir A. Uspensky (1930–2018) — Lomonosov Moscow State University. Pioneer in Russian mathematical logic. Wikipedia
  • Nikolai Vereshchagin — Lomonosov Moscow State University. Google Scholar

Further Reading


This episode was researched and written with AI assistance.