Dr Lu Yin
Academic and research departments
Computer Science Research Centre, Nature Inspired Computing and Engineering Research Group, Surrey Institute for People-Centred Artificial Intelligence (PAI).
About
Biography
Greetings! I'm Lu, a Lecturer (Assistant Professor equivalent) in the School of Computer Science and Electronic Engineering at the University of Surrey, where I lead the . I am also affiliated with the People-Centred AI Institute.
I am honoured to be a long-term visitor and collaborator with the Visual Informatics Group () at UT Austin, and a visiting scholar at and .
Previously, I served as a Postdoctoral Fellow at TU/e and worked as a Research Scientist Intern at Google's New York City office.
I was selected as a . My research interests include:
- Efficient and Scalable Foundation Models
- Understanding and Enhancing LLMs
- Interdisciplinary AI Applications
I work closely with academic and industry collaborators, including researchers from Meta London, Google NYC, Intel Research, and JD.com.
I am always happy to discuss research ideas, collaborations, and potential visits. I welcome enquiries from motivated students, PhD applicants, and visiting researchers. Feel free to reach out if you would like to discuss anything with me.
University roles and responsibilities
- Personal Tutor
Research
Research interests
My research aims to build more capable, efficient, and accessible AI systems. I work on large foundation models from several complementary perspectives: improving their capabilities through pre-training and post-training, making them more efficient through compression and scalable inference, understanding their internal behaviours, and exploring new model architectures and data-centric learning strategies. A recurring theme in my work is to identify where intelligence, efficiency, and robustness emerge in modern AI systems, and how these insights can be used to design better models for real-world applications.
## Foundation Models ##LLMs ##Model Compression ##Model Understanding
Supervision
Postgraduate research supervision
Current PhD supervision
- Robustness of Large Foundation Models at Scale - Kappiyath Adarsh.
- Efficient Diffusion Language Models - Mingyu Cao.
- Test Time Adaptation for Diffusion Language Models - Handa Li.
- Weight Space Learning in LLMs with Symmetry - Xiaolong Han (with Prof. Ferrante Neri).
- Efficient 3D Scene Understanding - Vishal Thengane (with Dr. Xiatian Zhu).
Teaching
My teaching within the School of Computer Science and Electronic Engineering focuses on artificial intelligence, deep learning, and business analytics. I aim to help students understand both the theoretical foundations and practical applications of modern AI methods, especially how machine learning and data-driven techniques can be used to solve real-world problems.
I currently teach Deep Learning and Advanced AI, which introduces students to modern deep learning methods, neural network architectures, and advanced AI techniques. The module supports students in developing both conceptual understanding and practical implementation skills, preparing them for further study, research, and industry roles in artificial intelligence.
I also teach business analytics and data visualisation modules, where students learn how to use data, analytical thinking, and visual communication to support business decision-making. These modules are designed to bridge technical methods with practical business contexts, helping students develop skills that are valuable across both technical and non-technical career paths.
See below for a full list of my teaching experience.
2025/26
COM3025 Deep Learning and Advanced AI, University of Surrey
COMM074 Business Analytics with Data Visualisation, University of Surrey
2024/25
COM3025 Deep Learning and Advanced AI, University of Surrey
COM3018 Practical Business Analytics, University of Surrey
Publications
Highlights
† Corresponding author. * Equal contribution.
Mingyu Cao, Alvaro H.C. Correia, Christos Louizos, Shiwei Liu†, Lu Yin†. Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models. The Forty-third International Conference on Machine Learning (ICML), 2026.
Di He, Songjun Tu, Keyu Wang, Lu Yin†, Shiwei Liu†. One LR Doesn't Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs. The Forty-third International Conference on Machine Learning (ICML), 2026.
Pengxiang Li, Yefan Zhou, Dilxat Muhtar, Lu Yin, Shilin Yan, Li Shen, Soroush Vosoughi, Shiwei Liu. Diffusion Language Models Know the Answer Before Decoding. The Fourteenth International Conference on Learning Representations (ICLR), 2026. [Oral]
Xinchen Han, Hossam Afifi, Michel Marot, Xilu Wang, Lu Yin†. Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2026.
Adarsh Kappiyath, Abhra Chaudhuri, Ajay Kumar Jaiswal, Ziquan Liu, Yunpeng Li, Xiatian Zhu, Lu Yin†. SEBRA: Debiasing through Self-Guided Bias Ranking. The Thirteenth International Conference on Learning Representations (ICLR), 2025.
Pengxiang Li*, Lu Yin*, Shiwei Liu. Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN. The Thirteenth International Conference on Learning Representations (ICLR), 2025.
Pengxiang Li*, Lu Yin*, Shiwei Liu. Outlier-weighed Layerwise Sampling for LLM Fine-tuning. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2025.
Di He, Ajay Jaiswal, Songjun Tu, Li Shen, Ganzhao Yuan, Shiwei Liu, Lu Yin†. AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs. Conference on Neural Information Processing Systems (NeurIPS), 2025.
Tianhao Chen, Xin Xu, Zijing Liu, Pengxiang Li, Xinyuan Song, Ajay Kumar Jaiswal, Fan Zhang, Jishan Hu, Yang Wang, Hao Chen, Shizhe Diao, Shiwei Liu, Yu Li, Lu Yin†, Can Yang. GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling. Conference on Neural Information Processing Systems (NeurIPS), 2025.
Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Gen Li, Ajay Jaiswal, Mykola Pechenizkiy, Yi Liang, Michael Bendersky, Zhangyang Wang, Shiwei Liu. Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity. The Forty-first International Conference on Machine Learning (ICML), 2024.
Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang. Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs “Difficult” Downstream Tasks in LLMs. The Forty-first International Conference on Machine Learning (ICML), 2024.
Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu. Dynamic Sparse Training Is also A Structure Sparsity Learner. Conference on Neural Information Processing Systems (NeurIPS), 2023.
Lu Yin, Shiwei Liu, Meng Fang, Tianjin Huang, Vlado Menkovski, Mykola Pechenizkiy. Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost. Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.