Multimodal Speech Modeling
Challenge/Research, Xiaomi AI Lab, 2022
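Led the Xiaomi AI team in the MISP 2021 Challenge, held under ICASSP, taking first place in the audio-visual wake word spotting track and second place in the audio-visual speech recognition track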
Results
Two papers accepted at the MISP 2021 workshop
Q. Wang et al., “The Xiaomi-Talkfreely System For Audio-Visual Wake Word Spotting of MISP Challenge 2021”, ICASSP MISP workshop, 2022.
Q. Wang et al., “The Xiaomi-Talkfreely System For Audio-Visual Speech Recognition in MISP Challenge 2021”, ICASSP MISP workshop, 2022.
Algorithmic contributions
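Proposed a two-stage wake-up strategy: the first stage is a joint single-channel and multi-channel model, and the second stage is a model that discriminates confusable words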
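Proposed a front end that combines conventional and neural-network processing, which delivered strong results on the cocktail party problem, the hardest condition, and substantially improved recognition accuracy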