Multimodal Speech Modeling

Challenge/Research, Xiaomi AI Lab, 2022

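Led the Xiaomi AI team in the MISP 2021 Challenge, held under ICASSP, winning first place in the audio-visual wake word spotting track and second place in the audio-visual speech recognition track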

Outcomes

Two papers accepted at the MISP 2021 workshop

Q. Wang et al., “The Xiaomi-Talkfreely System For Audio-Visual Wake Word Spotting of MISP Challenge 2021”, ICASSP MISP workshop, 2022.

Q. Wang et al., “The Xiaomi-Talkfreely System For Audio-Visual Speech Recognition in MISP Challenge 2021”, ICASSP MISP workshop, 2022.

Algorithmic innovations

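Proposed a two-stage wake-up strategy: the first stage is a joint single-channel and multi-channel model, and the second stage is a confusable-word discrimination model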

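Proposed a hybrid front end combining traditional methods with neural networks, which performed strongly on the cocktail party problem, the hardest scenario, and substantially improved recognition accuracy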
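
The two-stage wake-up strategy can be pictured as a simple cascade. The sketch below is illustrative only and is not the system described in the papers: the class name TwoStageWakeDetector, the callable interfaces, and both thresholds are hypothetical, and the two models are stand-ins that return a score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical feature type: any sequence of floats stands in for real
# audio-visual features.
Features = Sequence[float]


@dataclass
class TwoStageWakeDetector:
    """Illustrative cascade; names and thresholds are assumptions."""

    # Stage 1: stand-in for the joint single/multi-channel model.
    # Returns a wake-word score in [0, 1].
    first_stage: Callable[[Features], float]
    # Stage 2: stand-in for the confusable-word discrimination model.
    # Returns the score that the candidate is the true wake word.
    second_stage: Callable[[Features], float]
    first_threshold: float = 0.5   # assumed value, tuned in practice
    second_threshold: float = 0.5  # assumed value, tuned in practice

    def detect(self, features: Features) -> bool:
        # Reject early so the second model only runs on candidates.
        if self.first_stage(features) < self.first_threshold:
            return False
        # Accept only if the candidate is not judged a confusable word.
        return self.second_stage(features) >= self.second_threshold
```

In this arrangement the second model only sees inputs the first model has already flagged, so it can focus on separating the wake word from similar-sounding words.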