About me

I am a Member of Technical Staff at Reflection AI, working on advanced AI agents and skills as well as mid- and post-training. My research interests span reinforcement learning, LLM agents, scalable machine learning, and ML-enhanced systems optimization. More broadly, I am interested in building intelligent systems that combine learning, reasoning, and robust infrastructure at scale.

I completed my PhD in Computer Science at the Department of Computer Science and Technology, University of Cambridge, where I was advised by Dr. Eiko Yoneki and Prof. Jon Crowcroft in the Machine Learning & Systems Research Group at the Computer Lab. My doctoral research explored the intersection of machine learning and systems, with a particular focus on applying reinforcement learning and learning-based methods to improve real-world systems such as databases, cloud services, and storage systems.

My current research focuses on large language models, VLA/VLM systems, and agent fine-tuning. I am especially interested in reinforcement learning and post-training methods for long-horizon agents, as well as the systems and infrastructure needed to support large-scale training and serving. I broadly describe this line of work as MLSys, with a particular emphasis on RLSys.

Before joining Reflection AI, I was a Research Scientist Intern at Google DeepMind, London, where I worked on the Autonomous Agents team led by Edward Grefenstette. My work focused on RLFT and post-training pipelines for advanced agents. I was also a Research Scientist Intern at Noah’s Ark Research Center UK, where I worked on large-scale RLFT frameworks for VLM-based mobile agents. Earlier, I co-founded and served as CTO of Powersense Technology Limited in Cambridge.

Prior to Cambridge, I received my Master of Engineering in Computer Science from Johns Hopkins University. I also received my Bachelor’s Degree in Physics with a minor in Mathematics from Peking University, where I graduated with honors. At Peking University, I received the Excellent Graduate Student Award from the School of Physics, recognition for my Excellent Graduation Thesis, the Special Award at the 5th Youth Physics Tournament, and the Freshman Scholarship.

I am always keen to collaborate on exciting research problems and impactful industry projects.


News! 🚀

📍 (2026) [Google DeepMind 50-Page Tech Report] - A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

📍 (2025.04) [IJCNN 2025] - OCMDP: Observation-Constrained Markov Decision Process

📍 (2025.02) [MLSys 2025] - ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments

📍 (2025.01) [SIGMOD 2025] - A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning-Enhanced Approach

📍 (2025.01) [ICLR 2025] - DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents


Selected Publications 📚

A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

Taiyi Wang, Sian Gooding, Florian Hartmann, Oriana Riva, Edward Grefenstette

Google DeepMind Technical Report

Pushing forward research on frontier agents for long-horizon tasks.

Paper

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

Taiyi Wang, Zhihao Wu, Jianheng Liu, et al.

International Conference on Learning Representations (ICLR) 2025

An asynchronous distributed reinforcement learning framework for on-device control agents.

Paper Code Project

A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning-Enhanced Approach

Taiyi Wang, Liang Liang, Guang Yang, Thomas Heinis, Eiko Yoneki

International Conference on Management of Data (SIGMOD) 2025

A reinforcement learning-enhanced approach for tuning learned indexes.

Paper Project

Academic Service and Awards

Academic Service

Awards


Interests and Activities

Beyond research, I have broad interests in sports, leadership, and long-term community engagement.

I am passionate about tennis and previously served as Captain of the Girton College Men’s 1st Tennis Team at the University of Cambridge. I also won the University Cuppers’ Championship for the college in 2024. More broadly, I enjoy skiing and other outdoor activities, which continue to shape my teamwork, discipline, and leadership style.

I have also been involved in long-term charitable activities since 2015, with a focus on supporting education for children in under-resourced areas. In addition, during my time at Peking University, I served as Director of the Debating Center and as a Co-organizer for the Students’ International Communication Association (SICA), experiences that strengthened my commitment to leadership, communication, and community building.

I also co-founded a startup with Dr. Borong Hu. You can find more details here: Powersense Ltd.
