I have extensive engineering experience in both Machine Learning Systems (MLSys) and Large Language Model algorithms. My goal is to advance next-generation AGI systems toward larger and better models. I am deeply passionate about the latest technologies and contribute actively to the open-source community as a core contributor to several popular open-source AI projects.
As one of the earliest members of the Seed Team, I focused on AI infrastructure and large-scale training systems for LLMs and multimodal foundation models, spanning pre-training and post-training. Starting with the first-generation Seed models, I supported large-scale training on 10,000-GPU clusters. I led a small team to build VeOmni, an open-source multimodal training system, scaling training to thousands of GPUs. I was deeply involved in the R&D of the core Seed 1.5 through Seed 2.0 model families, including reasoning and multimodal models, as well as the UI-TARS series of GUI agent models.
Worked on LLM post-training and agent research; projects led directly to publications on process reward modeling, SFT data selection, and data-analysis agents.
Joined as Employee #15 and worked on large-model systems and open-source products from Seed to Series A.
Participated in the development of SenseTime's early Megatron-style large-model training framework.
Contributed to MindSpore and MindSpore Lite on device-side GPU inference and runtime infrastructure.