About me

I am a researcher in Reinforcement Learning, LLM Agents, Scalable Machine Learning, and ML-enhanced Systems Optimization. I completed my PhD in Computer Science at the Department of Computer Science and Technology, University of Cambridge, in January 2026. My research focuses on applying machine learning, especially reinforcement learning, to improve real systems such as databases, LLM fine-tuning pipelines, and LLM serving systems, an area I broadly describe as ML4Sys (especially RL4Sys).

During my PhD, I was fortunate to be advised by Dr. Eiko Yoneki and Prof. Jon Crowcroft from the Machine Learning & Systems Research Group at the Computer Lab, where we explored the intersection of machine learning and systems with the goal of pushing the boundaries of both fields.

Prior to Cambridge, I received my Bachelor’s Degree in Physics with a minor in Mathematics from Peking University, where I graduated with honors. I received the Excellent Graduate Student Award from the School of Physics and recognition for my Excellent Graduation Thesis, as well as the Special Award at the 5th Youth Physics Tournament and the Freshman Scholarship at Peking University. I later completed my Master of Engineering in Computer Science at Johns Hopkins University.

My recent research and industry experience includes serving as a Research Scientist Intern at Google DeepMind, London, on the Autonomous Agents team led by Edward Grefenstette, where I worked on RLFT and post-training pipelines for advanced agents. Before that, I was a Research Scientist Intern at Noah’s Ark Research Center UK, working on large-scale RLFT frameworks for VLM-based mobile agents. I was also the Co-founder and CTO of Powersense Technology Limited in Cambridge.

I am always keen to collaborate on exciting research problems and impactful industry projects.


News! 🚀

📍 (2026) [Google DeepMind 50-Page Tech Report] - A Subgoal-driven Framework for Improving Long-Horizon LLM Agents

📍 (2025.04) [IJCNN 2025] - OCMDP: Observation-Constrained Markov Decision Process

📍 (2025.02) [MLSys 2025] - ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments

📍 (2025.01) [SIGMOD 2025] - A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning-Enhanced Approach

📍 (2025.01) [ICLR 2025] - DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents


Selected Publications 📚

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

Taiyi Wang, Zhihao Wu, Jianheng Liu, et al.

International Conference on Learning Representations (ICLR) 2025

An asynchronous distributed reinforcement learning framework for on-device control agents.

A New Paradigm in Tuning Learned Indexes: A Reinforcement Learning-Enhanced Approach

Taiyi Wang, Liang Liang, Guang Yang, Thomas Heinis, Eiko Yoneki

International Conference on Management of Data (SIGMOD) 2025

A reinforcement learning-enhanced approach for tuning learned indexes.

OCMDP: Observation-Constrained Markov Decision Process

Taiyi Wang, Jianheng Liu, Bryan Lee, Zhihao Wu, Yu Wu

International Joint Conference on Neural Networks (IJCNN) 2025

A study of observation-constrained Markov decision processes.

ThunderServe: High-performance and Cost-efficient LLM Serving in Cloud Environments

Youhe Jiang, Fangcheng Fu, Xiaozhe Yao, Taiyi Wang, Bin Cui, Ana Klimovic, Eiko Yoneki

Annual Conference on Machine Learning and Systems (MLSys) 2025

High-performance and cost-efficient LLM serving in cloud environments.


Academic Service and Awards

Academic Service

Awards


Interests and Activities

Beyond research, I have broad interests in sports, leadership, and long-term community engagement.

I am passionate about tennis and previously served as Captain of the Girton College Men’s 1st Tennis Team at the University of Cambridge. I also won the University Cuppers’ Championship for the college in 2024. More broadly, I enjoy skiing and other outdoor activities, which continue to shape my teamwork, discipline, and leadership style.

I have also been involved in long-term charitable activities since 2015, with a focus on supporting education for children in under-resourced areas. In addition, during my time at Peking University, I served as Director of the Debating Center and as a Co-organizer for the Students’ International Communication Association (SICA), experiences that strengthened my commitment to leadership, communication, and community building.

I also co-founded a startup with Dr. Borong Hu. You can find more details here: Powersense Ltd.
