SJTU Computer Science Global Lunch Series
The SJTU Computer Science Global Lunch Series is hosted by the Department of Computer Science and Engineering at Shanghai Jiao Tong University. It aims to showcase the department's research, promote academic exchange among faculty and students, broaden students' academic horizons, and strengthen the department's research culture.
The series is organized in rotation by the department's seven research institutes, which invite senior professors, leading experts, rising young researchers, and outstanding graduate students to present their latest work:
Institute of High-Reliability Software and Theory
Institute of Parallel and Distributed Computing
Institute of Network and Service Computing
Institute of Intelligent Human-Computer Interaction
Institute of Cryptography and Information Security
Institute of Computer Applications
Institute of Computer Architecture
[Fall 2021, Talk 13] This talk will be held on Friday, December 10, 2021. Follow us and sign up to attend!
Time
Friday, December 10, 2021, 12:00-13:30
How to Attend
On-site: Room 414, SEIEE Building 3, Minhang Campus, Shanghai Jiao Tong University
On-site registration: scan the QR code below (registration is used only to count attendees and plan lunch; lunch portions are limited and first come, first served!)
Online: Bilibili live stream
Stream link: http://live.bilibili.com/22797301
Speaker
陈谐 Xie CHEN
Xie Chen is a tenure-track Associate Professor in the Department of Computer Science and Engineering at Shanghai Jiao Tong University, China. He received his Bachelor's degree from the Department of Electronic Engineering at Xiamen University in 2009, his Master's degree from the Department of Electronic Engineering at Tsinghua University in 2012, and his PhD in information engineering from the University of Cambridge (U.K.) in 2017. Prior to joining SJTU, he worked as a Research Associate at the University of Cambridge from 2017 to 2018, and in the Speech and Language Research Group at Microsoft as a senior and then principal researcher from 2018 to 2021. His main research interest lies in deep learning, especially its application to speech processing, including speech recognition and synthesis.
Host
陈露 Lu CHEN
Assistant Researcher, Department of Computer Science and Engineering, Shanghai Jiao Tong University
Title & Abstract
Advancing Transformer Transducer for Speech Recognition on Large-Scale Dataset: Efficient Streaming and LM Adaptation
(This talk will be in Chinese)
Recent years have witnessed the great success of end-to-end (E2E) models in speech recognition, especially neural Transducer based models, thanks to their streaming capability and promising performance. However, to replace the traditional hybrid system, which is widely deployed and still the mainstream system in the speech community, several key challenges remain to be addressed. Among them, efficient streaming and domain adaptation are two essential factors in developing E2E ASR models. In this talk, I will introduce our recent efforts on these two aspects of neural Transducer models. We proposed an approach, "attention mask is all you need to design", for efficient training and streaming of the Transformer-Transducer model. In addition, we designed a novel model architecture, the "factorized neural transducer", for efficient language model (LM) adaptation. Experiments on a large-scale dataset demonstrate the effectiveness of the two proposed approaches, with controllable and low latency as well as significant WER improvements from LM adaptation on text data.
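To make the streaming idea concrete, below is a minimal sketch, in PyTorch, of how a single attention mask can restrict Transformer self-attention to chunk-wise, bounded-look-ahead context. This is an illustration of the general technique only, not the speaker's implementation; the function names and parameters (chunk_streaming_mask, chunk_size, and so on) are hypothetical.

```python
# Illustrative sketch only: chunk-wise attention masking for streaming
# self-attention. Each frame may attend to every frame in its own chunk
# and in earlier chunks, so look-ahead latency is bounded by chunk_size.
import torch

def chunk_streaming_mask(num_frames: int, chunk_size: int) -> torch.Tensor:
    """Boolean mask of shape (num_frames, num_frames); True = may attend."""
    chunk_id = torch.arange(num_frames) // chunk_size  # chunk index per frame
    # Frame i attends to frame j iff j's chunk is no later than i's chunk.
    return chunk_id.unsqueeze(1) >= chunk_id.unsqueeze(0)

def masked_self_attention(x: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Single-head scaled dot-product self-attention; identity Q/K/V
    projections for brevity. x: (T, d); mask: (T, T) boolean."""
    d = x.size(-1)
    scores = (x @ x.transpose(0, 1)) / d ** 0.5        # (T, T) logits
    scores = scores.masked_fill(~mask, float("-inf"))  # block future chunks
    return torch.softmax(scores, dim=-1) @ x           # weighted value sum

# Usage: with chunk_size=4, frame 5 (chunk 1) sees frames 0-7 and never
# frames 8+; chunk_size=num_frames recovers full-context (offline) attention.
T, d_model = 12, 8
x = torch.randn(T, d_model)
out = masked_self_attention(x, chunk_streaming_mask(T, chunk_size=4))
print(out.shape)  # torch.Size([12, 8])
```

The appeal of such mask-based designs is that the same training graph can serve both offline and streaming decoding simply by swapping the mask.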
On December 3, the twelfth talk of the Fall 2021 SJTU Computer Science Global Lunch Series featured Ran Yi (易冉) of the Department of Computer Science and Engineering, who presented her research.
More than 50 students attended in person, and over 3,340 viewers watched the Bilibili live stream.
The recording is available at:
https://www.bilibili.com/video/BV1QL411M7FM?spm_id_from=333.999.0.0
Press and hold the QR code to follow us for more information.