Bio
I am a Junior Research Fellow at Trinity College, University of Cambridge, starting from October 2024. Before that, I was a research associate at the Machine Intelligence Laboratory, University of Cambridge, working with Prof. Phil Woodland. I also collaborate closely with Prof. Chao Zhang at Tsinghua University. My research focuses on controllable and reliable multimodal conversational AI with large language models. Specifically, this includes multimodal contextual knowledge integration, reliability, hallucination reduction, and multimodal contextualised AI safety. My research interests and experience also include speaker diarisation, language modelling, and speech synthesis.
I completed my Ph.D. in June 2023 at the University of Cambridge, supervised by Prof. Phil Woodland (advisor: Prof. Mark Gales), on contextual knowledge integration in end-to-end neural-based conversational AI systems. I held research internships at Google Brain with Dr Yu Zhang in 2019 and at ByteDance with Dr Wei Li in 2023. I was also grateful to collaborate with Poly AI Ltd, working with Dr Ivan Vulić and Dr Paweł Budzianowski, in 2023. I received my B.A. and M.Eng. degrees in 2019 at Trinity College, University of Cambridge.
News
- [Oct. 2024] Our paper 'CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models' has been accepted at NeurIPS 2024 Workshop on Responsibly Building the Next Generation of Multimodal Foundational Models. Thank you to everyone involved in making this research possible!
- [Oct. 2024] Our journal paper 'Large Language Models Surpass Human Experts in Predicting Neuroscience Results' has been accepted and published in Nature Human Behaviour. We thank our collaborators, reviewers, and the editorial team for their support and valuable feedback.
- [Jul. 2024] We won the Best Short Paper Award at CUI 2024! Many thanks and congratulations to my co-authors.
- [Jun. 2024] We have 4 papers accepted at Interspeech 2024.
- [May 2024] Our paper 'Building Better AI Agents: A Provocation on the Utilisation of Persona in LLM-based Conversational Agents' has been accepted at CUI 2024. See you in Luxembourg City!
- [May 2024] Our paper 'av-SALMONN: Speech-Enhanced Audio-Visual Large Language Models' has been accepted at ICML 2024. See you in Vienna again!
- [Apr. 2024] We have 4 papers accepted at ICASSP 2024.
- [Mar. 2024] Our journal paper 'Graph Neural Networks for Contextual ASR with the Tree-Constrained Pointer Generator' has been officially published.
- [Jan. 2024] Our paper 'SALMONN: Towards Generic Hearing Abilities for Large Language Models' has been accepted at ICLR 2024. See you in Vienna!
- [Jan. 2024] I have been awarded a Junior Research Fellowship at Trinity College, University of Cambridge starting from October 2024.
- [Jan. 2024] I will be lecturing the MPhil course MLMI14 "Advanced Speech Recognition" at the Department of Engineering, University of Cambridge.
Selected Publications
Journal Papers
-
T-ASLP
Guangzhi Sun, Chao Zhang, Philip C. Woodland
IEEE/ACM Transactions on Audio, Speech, and Language Processing (T-ASLP), 2024.
-
T-ASLP
Guangzhi Sun, Chao Zhang, Philip C. Woodland
IEEE/ACM Transactions on Audio, Speech, and Language Processing (T-ASLP), 2022.
-
Neural Networks
Guangzhi Sun, Chao Zhang, Philip C. Woodland
Neural Networks 141, 2021.
Conference Papers
-
ICML
Guangzhi Sun, Wenyi Yu, Changli Tang, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang
International Conference on Machine Learning (ICML), 2024.
-
ICLR
Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang
International Conference on Learning Representations (ICLR), 2024.
-
INTERSPEECH
Guangzhi Sun, Xianrui Zheng, Chao Zhang, Philip C. Woodland
(INTERSPEECH), 2023.
-
INTERSPEECH
Guangzhi Sun, Chao Zhang, Philip C. Woodland
(INTERSPEECH), 2022.
-
ACL
Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang
(Corresponding author)
60th Annual Meeting of the Association for Computational Linguistics (ACL), 2022.
-
ASRU
Guangzhi Sun, Chao Zhang, Philip C. Woodland
IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2021.
-
ICASSP
Guangzhi Sun, Chao Zhang, Philip C. Woodland
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.
-
ICASSP
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Yonghui Wu
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020.
-
ICASSP
Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2020.
Ph.D. Thesis
-
Ph.D.
Guangzhi Sun
Ph.D. in Engineering, University of Cambridge, 2023
M.Eng. Thesis
-
M.Eng.
Guangzhi Sun
Master of Engineering Thesis, University of Cambridge, 2019
Teaching
MPhil Course.
- 2024.01-2024.03: MLMI14 Advanced Speech Recognition.
MPhil Project Supervisor.
- 2024.05-2024.08: Second-order Optimisation and Imbalanced Class Distribution.
- 2024.05-2024.08: LanGWM: Language Grounded World Model (co-supervised by Toshiba team).
- 2024.05-2024.08: Object Localization in 3D World with LLM (co-supervised by Toshiba team).
MPhil Project Co-supervisor.
- 2023.05-2023.08: Conditional diffusion model for speaker separation.
Master of Engineering Project Helper.
- 2020.10-2021.06: Speaker diarisation for overlapped speech; Audio-visual speaker diarisation.
- 2021.10-2022.06: Spectral clustering-aware speaker diarisation.
- 2022.06-2023.06: Language model discounting for end-to-end speech recognition systems.
Undergraduate Supervisor.
Invited Talks
Conference and Workshop Presentations.
Services
Organising Committee
- Organising committee (Technical program chair & Session chair), UK Speech 2024
Conference Reviewers
Journal Reviewers
Contact
Address: Engineering Department, Trumpington Street, Cambridge, CB2 1PZ
Email: gs534 (at) cam.ac.uk