Dominique C. Perrault-Joncas

Principal Research Scientist, Amazon

I am a Principal Research Scientist at Amazon, where I lead research on reinforcement learning for large-scale supply chain decision systems. My work spans reinforcement learning, causal inference and experimental design, LLM post-training (RLHF / preference optimization), and game theory.

I received my Ph.D. in Statistics from the University of Washington, advised by Marina Meilă. Previously, I was a Senior Quantitative Analyst at Google. My earlier research focused on manifold learning and spectral methods.

Research Interests

Reinforcement Learning

Model-based RL, constrained RL, sim2real transfer, inventory control and placement optimization

LLM Post-Training

RLHF, preference optimization (DPO / C2-DPO), fine-tuning, evaluation

Causal Inference & Experimentation

Interference-aware experimentation, meta-analysis of randomized experiments, treatment effect transportation

Game Theory & Market Design

Marketplace equilibrium, competitive pricing mechanisms, shared-revenue models

Selected Publications

See Google Scholar for a full list.

Submitted / Under Review

Mind the Sim-to-Real Gap & Think Like a Scientist. H. Parikh, G. Levin-Konigsberg, D. Perrault-Joncas, A. Volfovsky. Submitted to NeurIPS 2026. [arXiv]
Ready from Day 1: Population-Aware Coordination for Large-Scale Constrained Multi-Agent Systems. A. Wang, D. Perrault-Joncas, A. Maggiar, D. Foster, C. Eisenach. Submitted to NeurIPS 2026. [arXiv]
TEA-Time: Transporting Effects Across Time. H. Parikh, G. Levin-Konigsberg, D. Perrault-Joncas, A. Volfovsky. Submitted to NeurIPS 2026. [arXiv]
BRIDGE: Building Representations In Domain Guided Program Synthesis. R.J. George, C. Eisenach, U. Ghai, D. Perrault-Joncas, A. Anandkumar, D. Foster. Submitted to NeurIPS 2026. [arXiv]

Published

C2-DPO: Constrained Controlled Direct Preference Optimization. K. Asadi, J. Han, I. Pipano, X. Xu, D. Perrault-Joncas, S. Sabach, K. Bouyarmane, M. Ghavamzadeh. Transactions on Machine Learning Research (TMLR), 2025. [OpenReview] [arXiv]
Structure-Informed Deep Reinforcement Learning for Inventory Management. A. Maggiar, S. Andaz, A. Bagaria, C. Eisenach, D. Foster, O. Gottesman, D. Perrault-Joncas. NeurIPS 2025 Workshop (MLxOR). [OpenReview] [arXiv]
LLMForecaster: Improving Seasonal Event Forecasts with Unstructured Textual Data. H. Zhang, C. Arvin, D. Efimov, M.W. Mahoney, D. Perrault-Joncas, S. Ramasubramanian, A.G. Wilson, M. Wolff. NeurIPS 2024 Workshop. [OpenReview] [Amazon Science]
Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data. N. Tripuraneni, D. Perrault-Joncas, D. Madeka, D. Foster, M.I. Jordan. NeurIPS 2022 Workshop (HeavyTails). [arXiv] [OpenReview]

Preprints

Marketplace Operators Can Induce Competitive Pricing. T. Ding, D. Perrault-Joncas, O. Ronen, M.I. Jordan, D. Bergemann, D. Foster, O. Gottesman. arXiv, 2025. [arXiv]
A Shared-Revenue Bertrand Game. R. Pabari, U. Ghai, D. Perrault-Joncas, K. Torkkola, O. Ronen, D. Madeka, A. Rubinstein, D. Foster, O. Gottesman. arXiv, 2025. [arXiv]

Earlier Work

Improved Graph Laplacian via Geometric Self-Consistency. D. Joncas, M. Meilă, J. McQueen. NeurIPS 2017. [arXiv] [NeurIPS]
Nearly Isometric Embedding by Relaxation. J. McQueen, M. Meilă, D. Joncas. NeurIPS 2016. [NeurIPS]
Non-linear Dimensionality Reduction: Riemannian Metric Estimation and the Problem of Geometric Discovery. D. Perrault-Joncas, M. Meilă. arXiv, 2013. [arXiv]
Directed Graph Embedding: An Algorithm based on Continuous Limits of Laplacian-type Operators. D.C. Perrault-Joncas, M. Meilă. NeurIPS 2011. [NeurIPS]
Building a Job Landscape from Directional Transition Data. D.C. Perrault-Joncas, M. Meilă, M. Scott. AAAI Fall Symposium: Manifold Learning and Its Applications, 2010.
Linear Stability of a Compressible Coaxial Jet with Continuous Velocity and Temperature Profiles. D. Perrault-Joncas, S.A. Maslowe. Physics of Fluids, 20(7), 2008. [DOI]
The Generation of Internal Tides at Abrupt Topography. L. St. Laurent, S. Stringer, C. Garrett, D. Perrault-Joncas. Deep Sea Research Part I, 50(8), 987–1003, 2003. [DOI]

Experience

Amazon

Jan 2020 – Present

Principal Research Scientist, Supply Chain Optimization Technologies

Tech lead for a group of 15+ researchers building reinforcement learning systems for inventory buying and placement decisions spanning multi-billion-dollar annual inventory spend. Led initiatives that drove nine-figure impact in automated ordering, experimentation methodology, and forecasting. Co-developed C2-DPO for LLM preference optimization and LLMForecaster for demand forecasting with LLM embeddings.

Google

Apr 2015 – Dec 2019

Senior Quantitative Analyst, Advanced Measurement Technologies

Led a team of statisticians on cross-media measurement. Designed experiments for face recognition and gaze estimation systems. Developed indoor positioning models using WiFi, Bluetooth, and sensor fusion.

Amazon

Nov 2012 – Apr 2015

Research Scientist, Demand Forecasting

Developed the first demand forecasting models for Amazon Pantry at launch. Scaled Bayesian and quantile regression to big data via MapReduce. Designed methods for imputing demand data at scale.

Education

Ph.D. in Statistics

M.Sc. in Applied Mathematics

B.Sc. in Mathematics & Physics