Divyat Mahajan

  • Ph.D. Candidate, Mila

  • Visiting Researcher, Meta FAIR
I am a final-year Ph.D. candidate at Mila & Université de Montréal, advised by Ioannis Mitliagkas. Most recently, I was a visiting researcher at Meta Superintelligence Labs (FAIR), working on pretraining large language models with the memory and generalization team.

The central theme of my research is understanding and improving how machine learning systems generalize to novel tasks and environments. My work spans causal representation learning for robustness under distribution shifts (1), as well as modern paradigms such as in-context learning (2) and large-scale pretraining (3). Going forward, I am broadly interested in the following research directions for improving the capabilities and reliability of foundation models:
  • Novel approaches for pretraining. I am interested in pretraining strategies that help language models learn richer representations and improve long-horizon reasoning & planning. A direction I find especially promising is data-constrained pretraining, where better objectives, architectures, and synthetic data may become increasingly important as compute scales faster than the supply of high-quality data.
  • Reusable skills for continual learning. I am interested in approaches for discovering reusable skills/strategies from reasoning traces that can help "amortize" the reasoning process. I am especially interested in exploring how to consolidate skills over time, enabling efficient adaptation to new tasks and self-improvement.
  • Causal approaches for alignment and safety. I am interested in alignment methods that move beyond spurious correlations and better capture the underlying intent and causal structure. In particular, I am excited by causal approaches for reward design and concept learning that enable better understanding and reliable steering of LLM behavior.
My research is supported by the FRQNT doctoral fellowship, and I am deeply grateful for the amazing collaborations that have enriched my Ph.D. journey. I was advised by Kartik Ahuja and Pascal Vincent under the Meta AIM Program, and I did a summer internship at Microsoft Research Cambridge with Cheng Zhang and Meyer Scetbon. I also worked with Vasilis Syrgkanis at Stanford, and prior to my Ph.D., I was a research fellow at Microsoft Research India with Amit Sharma.

Select Publications & Preprints

Select Awards & Honours

Software