I am currently a Research Fellow with the [HLT Lab], Department of Electrical and Computer Engineering, National University of Singapore (NUS). I received my Ph.D. degree from the College of Intelligence and Computing at Tianjin University, supervised by Prof. Longbiao Wang and Prof. Jianwu Dang. My research topic was Speech Separation Based on Deep Learning in Open Complex Environment [Thesis] [Slide]. During my Ph.D. studies, I also worked as a research assistant at NTU and NUS in Singapore from 2019 to 2022, supervised by Prof. Eng Siong Chng and Prof. Haizhou Li. Prior to that, I received my Master’s degree from Tianjin University in 2017 under the supervision of Prof. Di Jin and Prof. Liang Yang; my master’s research topic was Social Network Analysis and Community Detection.

My research interests include speech processing, speech separation, and social network analysis. I have published more than 10 papers at top international AI conferences such as IJCAI, ICASSP, and INTERSPEECH.

📖 Education

  • 2017.09 - 2022.10, Ph.D. in Applied Computer Technology, Tianjin University (TJU), Tianjin, China.
  • 2015.09 - 2017.06, M.E. in Software Engineering, Tianjin University (TJU), Tianjin, China. [Received Master’s degree ahead of schedule]
  • 2011.09 - 2015.06, B.E. in Software Engineering / B.S. in Public Management (double major), Tianjin Polytechnic University (TJPU), Tianjin, China. [GPA: 90.33/100, Ranking: Top 3]

💻 Research Experience

  • 2023.04 - Present, Research Fellow, National University of Singapore (NUS), Singapore.
  • 2022.07 - 2023.04, Research Assistant, Chinese University of Hong Kong (CUHKSZ), Shenzhen, China. [Project Demo1] [Project Demo2]
  • 2020.04 - 2022.07, Research Assistant, National University of Singapore (NUS), Singapore.
  • 2019.10 - 2020.04, Research Assistant, Nanyang Technological University (NTU), Singapore. [Team Members]
  • 2018.09 - 2019.06, Machine Learning Engineer, AI Lab of Didi Chuxing Company (DiDi), Beijing, China. [Outstanding Award]

📝 Publications

  • “PIAVE: A Pose-Invariant Audio-Visual Speaker Extraction Network”, Qinghua Liu, Meng Ge*, Zhizheng Wu, Haizhou Li, INTERSPEECH 2023
  • “Locate and Beamform: Two-dimensional Locating All-neural Beamformer for Multi-channel Speech Separation”, Yanjie Fu, Meng Ge*, Honglong Wang, Nan Li, Haoran Yin, Longbiao Wang, Gaoyan Zhang, Jianwu Dang, Chengyun Deng, Fei Wang, INTERSPEECH 2023
  • “Rethinking the Visual Cues in Audio-Visual Speaker Extraction”, Junjie Li, Meng Ge*, Zexu Pan, Rui Cao, Longbiao Wang, Jianwu Dang, Shiliang Zhang, INTERSPEECH 2023
  • “SDNet: Stream-attention and Dual-feature Learning Network for Ad-hoc Array Speech Separation”, Honglong Wang, Chengyun Deng, Yanjie Fu, Meng Ge*, Longbiao Wang, Gaoyan Zhang, Jianwu Dang, Fei Wang, INTERSPEECH 2023
  • “Time-Domain Speech Separation Networks with Graph Encoding Auxiliary”, Tingting Wang, Zexu Pan, Meng Ge*, Zhen Yang, Haizhou Li, SPL [PDF]
  • “VCSE: Time-Domain Visual-Contextual Speaker Extraction Network”, Junjie Li, Meng Ge*, Zexu Pan, Longbiao Wang, Jianwu Dang, INTERSPEECH 2022 [PDF]
  • “USEV: Universal Speaker Extraction with Visual Cue”, Zexu Pan, Meng Ge*, Haizhou Li, TASLP 2022 [PDF] [Code]
  • “Dual-stream Speech Dereverberation Network Using Long-term and Short-term Cues”, Nan Li, Meng Ge*, Longbiao Wang, Jianwu Dang, IJCNN 2022 [PDF]
  • “MIMO-DoAnet: Multi-channel Input and Multiple Outputs DoA Network with Unknown Number of Sound Sources”, Haoran Yin, Meng Ge, Yanjie Fu, Gaoyan Zhang, Longbiao Wang, Lei Zhang, Lin Qiu, Jianwu Dang, INTERSPEECH 2022 [PDF] [Poster] [Code]
  • “Language-specific Characteristic Assistance for Code-switching Speech Recognition”, Tongtong Song, Qiang Xu, Meng Ge, Longbiao Wang, Hao Shi, Yongjie Lv, Yuqin Lin, Jianwu Dang, INTERSPEECH 2022 [PDF]
  • “RAW-GNN: RAndom Walk Aggregation based Graph Neural Network”, Di Jin, Rui Wang, Meng Ge, Dongxiao He, Xiang Li, Wei Lin, Weixiong Zhang, IJCAI 2022 [PDF] [Video] [Code]
  • “Iterative Sound Source Localization for Unknown Number of Sources”, Yanjie Fu, Meng Ge, Haoran Yin, Xinyuan Qian, Longbiao Wang, Gaoyan Zhang, Jianwu Dang, INTERSPEECH 2022 [PDF] [Video] [Poster] [Code]
  • “Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation”, Yongjie Lv, Longbiao Wang, Meng Ge, Sheng Li, Chenchen Ding, Lixin Pan, Yuguang Wang, Jianwu Dang, Kiyoshi Honda, ICASSP 2022 [PDF]
  • “L-SpEx: Localized Target Speaker Extraction”, Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li, ICASSP 2022 [PDF] [Slide] [Poster] [Code]
  • “A Hybrid Continuity Loss to Reduce Over-Suppression for Time-domain Target Speaker Extraction”, Zexu Pan, Meng Ge, Haizhou Li, INTERSPEECH 2022 [PDF] [Code]
  • “Global Signal-to-noise Ratio Estimation Based on Multi-subband Processing Using Convolutional Neural Network”, Nan Li, Meng Ge, Longbiao Wang, Masashi Unoki, Sheng Li, Jianwu Dang, INTERSPEECH 2022 [PDF]
  • “Self-Distillation Based on High-level Information Supervision for Compressing End-to-End ASR Model”, Qiang Xu, Tongtong Song, Longbiao Wang, Hao Shi, Yuqin Lin, Yongjie Lv, Meng Ge, Qiang Yu, Jianwu Dang, INTERSPEECH 2022 [PDF]
  • “Simultaneous Progressive Filtering-Based Monaural Speech Enhancement”, Haoran Yin, Hao Shi, Longbiao Wang, Luya Qiang, Sheng Li, Meng Ge, Gaoyan Zhang, Jianwu Dang, ICONIP 2021 [PDF]
  • “Speech Dereverberation Based on Scale-Aware Mean Square Error Loss”, Luya Qiang, Hao Shi, Meng Ge, Haoran Yin, Nan Li, Longbiao Wang, Sheng Li, Jianwu Dang, ICONIP 2021 [PDF]
  • “Robust Voice Activity Detection Using a Masked Auditory Encoder Based Convolutional Neural Network”, Nan Li, Longbiao Wang, Masashi Unoki, Sheng Li, Rui Wang, Meng Ge, Jianwu Dang, ICASSP 2021 [PDF]
  • “Multi-Stage Speaker Extraction with Utterance and Frame-Level Reference Signals”, Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li, ICASSP 2021 [PDF] [Slide]
  • “Order-aware Pairwise Intoxication Detection”, Meng Ge, Ruixiong Zhang, Wei Zou, Xiangang Li, Cheng Gong, Longbiao Wang, Jianwu Dang, ISCSLP 2021 [PDF] [Slide]
  • “Neural Speaker Extraction with Speaker-Speech Cross-Attention Network”, Wupeng Wang, Chenglin Xu, Meng Ge, Haizhou Li, INTERSPEECH 2021 [PDF]
  • “A Pitch-aware Speaker Extraction Serial Network”, Yu Jiang, Meng Ge, Longbiao Wang, Jianwu Dang, Kiyoshi Honda, Sulin Zhang, Bo Yu, APSIPA 2021 [PDF]
  • “Spectrograms Fusion with Minimum Difference Masks Estimation for Monaural Speech Dereverberation”, Hao Shi, Longbiao Wang, Meng Ge, Sheng Li, Jianwu Dang, ICASSP 2020 [PDF]
  • “SpEx+: A Complete Time Domain Speaker Extraction Network”, Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li, INTERSPEECH 2020 [PDF] [Slide] [Video] [Code]
  • “Singing Voice Extraction with Attention-Based Spectrograms Fusion”, Hao Shi, Longbiao Wang, Sheng Li, Chenchen Ding, Meng Ge, Jianwu Dang, Hiroshi Seki, INTERSPEECH 2020 [PDF]
  • “A Fast Convolutional Self-attention Based Speech Dereverberation Method for Robust Speech Recognition”, Nan Li, Meng Ge, Longbiao Wang, Jianwu Dang, ICONIP 2020 [PDF]
  • “Environment-Dependent Attention-Driven Recurrent Convolutional Neural Network for Robust Speech Enhancement”, Meng Ge, Longbiao Wang, Nan Li, Hao Shi, Jianwu Dang, Xiangang Li, INTERSPEECH 2019 [PDF]
  • “Distant-talking Speech Recognition Based on Multi-objective Learning Using Phase and Magnitude-based Feature”, Dongbo Li, Longbiao Wang, Jianwu Dang, Meng Ge, Haotian Guan, ISCSLP 2019 [PDF]
  • “Pitch Synchronized Relative Phase with Peak Error Detection For Noise-robust Speaker Recognition”, Meng Ge, Longbiao Wang, Seiichi Nakagawa, Yuta Kawakami, Jianwu Dang, Xiangang Li, ISCSLP 2019 [PDF]
  • “Integrative Network Embedding via Deep Joint Reconstruction”, Di Jin, Meng Ge, Liang Yang, Dongxiao He, Longbiao Wang, Weixiong Zhang, IJCAI 2018 [PDF]
  • “Using Deep Learning for Community Discovery in Social Networks”, Di Jin, Meng Ge, Zhixuan Li, Wenhuan Lu, Dongxiao He, Francoise Fogelman-Soulie, ICTAI 2017 [PDF]
  • “Exploring the Roles of Cannot-link Constraint in Community Detection via Multi-variance Mixed Gaussian Generative Model”, Liang Yang, Meng Ge, Di Jin, Dongxiao He, Huazhu Fu, Jing Wang, Xiaochun Cao, PLOS ONE 2017 [PDF] [Code]

📝 Patents

  • “Environmental Adaptive Speech Enhancement Algorithm Based on Attention Driven Recurrent Convolutional Network (基于注意力驱动循环卷积网络的环境自适应语音增强算法)”, Meng Ge, Longbiao Wang, Jianwu Dang, CN201910166373.9 [Link]
  • “An order processing method, device, electronic device, and storage medium (一种订单处理方法、装置、电子设备及存储介质)”, Meng Ge, Ruixiong Zhang, CN201910414644.8 [Link]
  • “Method, device, and system for sound source localization based on time-domain units (基于时域单元的声源定位方法、装置及系统)”, Haotian Guan, Yu Jiang, Meng Ge, Qibo Liao, CN202010401597.6 [Link]
  • “A Beamforming Method Based on Complex Gated Cyclic Units (一种基于复数门控循环单元的波束形成方法)”, Yu Jiang, Longbiao Wang, Meng Ge, Jianwu Dang, Kiyoshi Honda, CN202111524413.6 [Link]

📝 Projects (participation)

  • 2023.04 - 2025.04, Single-Channel Far-field Speaker Diarization for Interview Rooms with Far-field Voice Activity Detection, KLASS project in Singapore.
  • 2022.06 - 2025.06, Brain-like Auditory Attention Theory and Engineering Practice (类脑听觉注意力理论与工程化实践), Internal Project from Shenzhen Research Institute of Big Data.
  • 2022.01 - 2023.12, Multi-modal Self-supervised Learning in Speech Processing (多模态自监督预训练技术在语音处理的应用), H Company with CUHKSZ.
  • 2019.06 - 2022.05, 1st Sub-Topic in Brain-like Natural Language Recognition and Interaction Based on Cognitive Mechanism (基于语言认知机理的类脑自然语言识别与交互), National Key R&D Plan “Intelligent Robot” Special Project
  • 2018.01 - 2012.12, Research on Multi-Accent Speech Recognition for Reverberation Environment (面向混响环境的多口音语音识别研究), the National Natural Science Foundation of China
  • 2018.10 - 2021.09, Key Technologies and System Implementation of Conversation in Complex Acoustic Environments for Robots (面向机器人的复杂环境语音对话关键技术及系统实现), Tianjin New Generation Artificial Intelligence Technology Project
  • 2021.10 - 2022.09, Speech Separation for Car Environments (面向车载环境的语音分离), D Company with Tianjin University [Awards]
  • 2020.06 - 2021.03, Fan-shaped Intelligent Sound Screen (扇形智能音幕), H Company with Tianjin University [IdeaHub Product]

💻 Work Experience

  • 2015.07 - 2016.07, Co-Founder, Tianjin Lingyi Technology Co., Ltd, Tianjin, China.
  • 2014.03 - 2015.06, Co-Founder and Senior Software Engineer, Tianjin Chuanhe Technology Co., Ltd, Tianjin, China.
  • 2014.01 - 2014.07, Software Engineer and Team Leader, iSoftStone (Tianjin) Information Technology Group Co., Ltd, Tianjin, China.
  • 2013.06 - 2013.07, Software Engineer and Team Leader, IBM (Tianjin) Experienced Training Program, Tianjin, China.

🎖 Certifications and Awards

  • Third place in the international L3DAS23 Challenge at ICASSP 2023 [Link]
  • Honda Kiyoshi’s Advanced Speech Science Award [PDF]
  • ISCA Grant Award of INTERSPEECH 2019
  • Oracle Certified Professional (OCP) - Oracle 10g Database Administrator [PDF]
  • Oracle Database 10g Administrator Certified Associate (OCA) [PDF]
  • Outstanding Student Scholarship Award
  • First-Class Scholarship Award
  • Outstanding Youth Nomination Award
  • Outstanding Graduate Student Award

💬 Leadership and Service Experiences

  • Reviewer: TASLP, ICASSP, INTERSPEECH, Pattern Recognition
  • Conference Volunteer, The National Conference on Man-Machine Speech Communication (NCMMSC’19)
  • Volunteer Leader, The 11th International Seminar on Speech Production (ISSP’17)
  • Conference Volunteer, The 10th International Symposium on Chinese Spoken Language Processing (ISCSLP’16)
  • Foreign-Side Volunteer, The 12th Summer Davos Forum
  • Vice Minister, Graduate Student Union, Tianjin University (TJU).
  • Vice Minister, Lenovo Idea Elite Club, Tianjin Polytechnic University (TJPU)
  • Captain, Volleyball Team, Tianjin Polytechnic University (TJPU)