Research interests

I am currently focused on developing data-efficient reinforcement learning from human feedback (RLHF) to fine-tune large language models (LLMs), and on pretraining LLMs for applications in search, recommendation, and advertising systems. I am also interested in foundational research in bandit theory and RL, and in designing and optimizing multi-agent systems within social systems.

Publications/Manuscripts

2024
2023
2022
2021
2020