Daily Tech Feed: From the Labs

Deep dives into foundational AI and ML research papers

2: Generative Modeling via Drifting — One-Step Image Generation

Researchers from MIT and Harvard propose Drifting Models, a new paradigm for generative modeling that achieves state-of-the-art image generation in a single forward pass. Instead of iterating at inference time like diffusion models, Drifting Models evolve the output distribution during training, so that a single network evaluation suffices at generation time.

Show Notes

Episode 002: Generative Modeling via Drifting

Why it matters. Diffusion models generate stunning images but require dozens or hundreds of iterative steps at inference time, making them slow and expensive. "Generative Modeling via Drifting" introduces a fundamentally different approach: instead of iterating at inference, Drifting Models move the iterative process into training itself. A "drifting field" based on attraction toward real data and repulsion from generated samples evolves the output distribution during training, so that at inference time a single forward pass is all you need. The result is state-of-the-art image generation on ImageNet 256×256 — FID 1.54 in latent space, 1.61 in pixel space — with a single network evaluation, and a base-size model (133M parameters) that competes with models five times its size. The approach also transfers to robotics, where a "Drifting Policy" matches the performance of 100-step Diffusion Policy in a single step.
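The attraction-repulsion intuition behind the drifting field can be illustrated with a toy sketch on 2-D point clouds. This is a deliberate simplification, not the paper's actual formulation: the Gaussian kernel, the normalization, and the direct point-update rule below are all assumptions chosen for demonstration (the real method trains a network against such a field rather than iterating samples at inference).

```python
import numpy as np

def drift_field(gen, real, bandwidth=1.0):
    """Toy drift for each generated sample: kernel-weighted attraction
    toward real samples minus repulsion from the generated mass.

    gen:  (n, d) generated samples
    real: (m, d) real samples
    Returns an (n, d) array of drift vectors.
    """
    def kernel_weights(x, pts):
        # Squared distances between every x and every reference point.
        d2 = ((x[:, None, :] - pts[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * bandwidth ** 2))
        return w / (w.sum(axis=1, keepdims=True) + 1e-12)

    w_real = kernel_weights(gen, real)   # (n, m) attraction weights
    w_gen = kernel_weights(gen, gen)     # (n, n) repulsion weights
    attract = w_real @ real - gen        # pull toward nearby real data
    repel = w_gen @ gen - gen            # push away from generated density
    return attract - repel

# Toy demo: a generated cluster drifts onto the real cluster.
rng = np.random.default_rng(0)
real = rng.normal(loc=3.0, scale=0.3, size=(64, 2))
gen = rng.normal(loc=0.0, scale=0.3, size=(64, 2))
for _ in range(200):
    gen = gen + 0.1 * drift_field(gen, real)
```

Note the fixed point: when the generated samples match the real distribution, the local attraction and repulsion means cancel and the drift vanishes — which is the intuition for why the evolved distribution stops moving once it covers the data.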

MIT and Harvard. This work comes from MIT CSAIL and Harvard University, with the team operating in Kaiming He's lab at MIT. The paper is available on arXiv (2602.04770) with an interactive project page demonstrating the core algorithm on toy distributions. Code and trained ImageNet models have been promised but not yet released. The method builds on self-supervised feature extractors including MoCo, SimCLR, and latent-MAE, and uses the Stable Diffusion VAE tokenizer with a DiT-like architecture.

The Researchers. Kaiming He, the senior author, is a professor at MIT EECS and one of the most cited researchers in all of computer science — his ResNet paper alone has over 200,000 citations, and during his years at Meta AI (FAIR) he pioneered the self-supervised learning methods MoCo and MAE as well as Mask R-CNN and Feature Pyramid Networks. First author Mingyang Deng is a PhD student at MIT CSAIL focused on generative models. Yilun Du is an Assistant Professor at Harvard known for energy-based models, compositional generation, and applying diffusion models to robotics and planning. Tianhong Li is an MIT researcher whose prior work on MAR (Masked Autoregressive generation) has advanced autoregressive approaches to image synthesis. He Li rounds out the team as an MIT researcher contributing to the generative modeling pipeline.

Daily Tech Feed: From the Labs is available on Apple Podcasts, Spotify, and wherever fine podcasts are distributed. Visit us at pod.c457.org for all our shows. New episodes daily.