I’m currently working on a startup (Reka.ai) where I am co-founder and chief scientist. Reka is an AI research and product company that builds state-of-the-art multimodal language models.
Before founding Reka.ai, I spent a wonderful 3.3 years at Google Research & Google Brain, where I made contributions to many industry-defining LLMs such as PaLM, UL2, Flan-2, and Bard, and to multimodal models such as PaLI-X and ViT-22B. Notably, I was also the co-lead for modeling on PaLM-2 and the PaLM-2 API.
During my time as a research scientist at Google, most of my published work revolved around Transformers, especially pertaining to efficiency, scaling, and architecture research.
What happened to BERT & T5? On Transformer Encoders, PrefixLM and Denoising Objectives
A blog post series about model architectures, Part 1: What happened to BERT and T5? Thoughts on Transformer Encoders, PrefixLM and Denoising objectives
Training great LLMs entirely from the ground up in the wilderness as a startup
Chronicles of training strong LLMs from scratch in the wild
2022 in Review: Top language AI research papers + interesting papers to read
Here are some of the best language AI / NLP papers of 2022!
On Emergence, Scaling and Inductive Bias
Some thoughts on emergent abilities and scaling language models.