1: Attention Is All You Need — The Paper That Changed Everything
In our inaugural episode, we dive deep into Attention Is All You Need — the 15-page paper from June 2017 that introduced the Transformer architecture and reshaped all of artificial intelligence. We break down how it works, why the title is a Beatles joke, and ...
Show Notes
Episode 001: Attention Is All You Need
Why it matters. In June 2017, eight researchers at Google published a 15-page paper that would become the most consequential machine learning contribution of the decade. "Attention Is All You Need" introduced the Transformer architecture, replacing the recurrent and convolutional networks that dominated sequence modeling with a mechanism built entirely on attention. The result was a model that trained faster, parallelized better, and scaled beyond anyone's expectations — becoming the foundation for GPT, BERT, PaLM, Gemini, Claude, and virtually every large language model that followed. With over 173,000 citations, it is one of the most cited papers in the history of computer science.
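For listeners who want to see the core idea in code, here is a minimal sketch of the scaled dot-product attention at the heart of the Transformer: each token's query is compared against every token's key, and the resulting softmax weights mix the values. The NumPy implementation and toy shapes below are our own illustration, not code from the paper.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V, the paper's core equation
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # similarity of every query to every key
    scores -= scores.max(axis=-1, keepdims=True)    # shift for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the keys
    return weights @ V                              # attention-weighted mix of values

# Toy example: 4 tokens with 8-dimensional representations (shapes are illustrative)
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((4, 8)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (4, 8)
```

Because the whole computation is a pair of matrix multiplications over all positions at once, it parallelizes across the sequence in a way a recurrent network's step-by-step processing cannot, which is exactly why the Transformer trained so much faster.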
Google Brain / Google Research. The Transformer was born at Google Brain and Google Research, where the eight authors, all listed as equal contributors in randomized order, were working on machine translation. The paper was published at NeurIPS 2017 (then still called NIPS) and is available on arXiv (1706.03762). Google's original implementation lives in the now-archived tensor2tensor repository, though today the Hugging Face Transformers library is the de facto standard. For learning the architecture from scratch, The Annotated Transformer from Harvard NLP, Jay Alammar's Illustrated Transformer, and Andrej Karpathy's "Let's build GPT" are essential resources.
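As a quick illustration of why the Hugging Face library became the standard (this snippet is ours, not something covered in the episode), running a pretrained Transformer takes only a few lines. The gpt2 checkpoint named below is just one example model, and the first call downloads its weights.

```python
# Minimal sketch using the Hugging Face Transformers library (pip install transformers).
# "gpt2" is an example checkpoint of our choosing, not one discussed in the episode.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
print(generator("Attention is all you", max_new_tokens=10)[0]["generated_text"])
```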
The Transformer Eight. The paper's authors have scattered across the AI landscape in remarkable fashion. Ashish Vaswani co-founded Adept AI and then Essential AI, where he serves as CEO. Noam Shazeer co-founded Character.AI before returning to Google in a deal reportedly worth $2.7 billion; he now co-leads Gemini development as VP of Engineering at Google DeepMind. Niki Parmar likewise co-founded Adept AI and then Essential AI before joining Anthropic. Jakob Uszkoreit took the attention mechanism into biology, co-founding Inceptive to design RNA molecules with AI. Llion Jones moved to Tokyo and co-founded Sakana AI, pursuing nature-inspired approaches to artificial intelligence. Aidan Gomez, a 20-year-old Google Brain intern when the paper was published, went on to co-found Cohere, now a leading enterprise AI company. Łukasz Kaiser co-created the Trax deep learning library at Google and went on to continue foundational research at OpenAI. And Illia Polosukhin pivoted entirely, co-founding NEAR Protocol, a layer-1 blockchain, and becoming CEO of the NEAR Foundation.
Daily Tech Feed: From the Labs is available on Apple Podcasts, Spotify, and wherever fine podcasts are distributed. Visit us at pod.c457.org for all our shows. New episodes daily.