My wife (left) and I (right) at the Mendocino arch.
I am a machine learning researcher working on video generative models. I am deeply interested in compression, real-time generation, and long-context generation. I have been generating videos since 2017. I believe that profound insights come from running a lot of simple experiments. I currently work on the video pre-training team at Luma AI. For my previous work experience, see my Resume.
We train autoencoders with a diffusion loss (without GANs) and show that this achieves both better compression and higher-quality generation.
Google's flagship AI model. I added video data and metrics to the Gemini codebase and trained a prototype diffusion transformer that, when scaled up, became Veo 1.
We train an LLM on discrete video tokens and show that this is a powerful framework for video generation.
By fixing a critical flaw in Region Proposal Networks, we show that a simple, deep mask head yields strong mask generalization across classes.
We show that you can throw away 10% of ImageNet without any loss in evaluation accuracy.