My wife (left) and I (right) at the Mendocino arch.

Vighnesh Birodkar

I am a machine learning researcher working on video generative models. I am deeply interested in compression, real-time generation, and long-context generation. I have been generating videos since 2017. I believe that profound insights come from running a lot of simple experiments. I currently work on the video pre-training team at Luma AI. For my previous work experience, see my Resume.

Selected Research

Sample what you can't compress

We train autoencoders with a diffusion loss (without GANs) and show that we can achieve both better compression and higher-quality generation.

Gemini 2.5 and Veo

Google's flagship AI model. I added video data and metrics to the Gemini codebase and trained a prototype diffusion transformer that, when scaled up, became Veo 1.

VideoPoet: A large language model for zero-shot video generation

We train an LLM with discrete video tokens and show that it is a powerful framework for video generation.

The surprising impact of mask-head architecture on novel class segmentation

By fixing a critical flaw in Region Proposal Networks, we show that we can get strong mask generalization across classes by using a simple and deep mask head.

Semantic redundancies in image-classification datasets: The 10% you don't need

We show that you can throw away 10% of ImageNet without any loss in evaluation accuracy.

Open source

Undergraduate projects

© 2025 Vighnesh Birodkar.