New Era for Robust Speech Recognition: Exploiting Deep Learning

✍ Scribed by Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey (eds.)

Publisher: Springer International Publishing
Year: 2017
Language: English
Pages: 433
Edition: 1
Category: Library


✦ Synopsis

This book covers the state of the art in deep-neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights into, and detailed descriptions of, some of the new concepts and key technologies in the field, including novel architectures for speech enhancement, microphone arrays, robust features, acoustic model adaptation, training data augmentation, and training criteria. The contributed chapters also describe real-world applications, benchmark tools, and datasets widely used in the field.

This book is intended for researchers and practitioners working in the field of speech processing and recognition who are interested in the latest deep learning techniques for noise robustness. It will also be of interest to graduate students in electrical engineering or computer science, who will find it a useful guide to this field of research.

✦ Table of Contents

Front Matter ....Pages i-xvii
Front Matter ....Pages 1-1
Preliminaries (Shinji Watanabe, Marc Delcroix, Florian Metze, John R. Hershey)....Pages 3-17
Front Matter ....Pages 19-19
Multichannel Speech Enhancement Approaches to DNN-Based Far-Field Speech Recognition (Marc Delcroix, Takuya Yoshioka, Nobutaka Ito, Atsunori Ogawa, Keisuke Kinoshita, Masakiyo Fujimoto et al.)....Pages 21-49
Multichannel Spatial Clustering Using Model-Based Source Separation (Michael I. Mandel, Jon P. Barker)....Pages 51-77
Discriminative Beamforming with Phase-Aware Neural Networks for Speech Enhancement and Recognition (Xiong Xiao, Shinji Watanabe, Hakan Erdogan, Michael Mandel, Liang Lu, John R. Hershey et al.)....Pages 79-104
Raw Multichannel Processing Using Deep Neural Networks (Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Arun Narayanan, Michiel Bacchiani, Bo Li et al.)....Pages 105-133
Novel Deep Architectures in Speech Processing (John R. Hershey, Jonathan Le Roux, Shinji Watanabe, Scott Wisdom, Zhuo Chen, Yusuf Isik)....Pages 135-164
Deep Recurrent Networks for Separation and Recognition of Single-Channel Speech in Nonstationary Background Audio (Hakan Erdogan, John R. Hershey, Shinji Watanabe, Jonathan Le Roux)....Pages 165-186
Robust Features in Deep-Learning-Based Speech Recognition (Vikramjit Mitra, Horacio Franco, Richard M. Stern, Julien van Hout, Luciana Ferrer, Martin Graciarena et al.)....Pages 187-217
Adaptation of Deep Neural Network Acoustic Models for Robust Automatic Speech Recognition (Khe Chai Sim, Yanmin Qian, Gautam Mantena, Lahiru Samarakoon, Souvik Kundu, Tian Tan)....Pages 219-243
Training Data Augmentation and Data Selection (Martin Karafiát, Karel Veselý, Kateřina Žmolíková, Marc Delcroix, Shinji Watanabe, Lukáš Burget et al.)....Pages 245-260
Advanced Recurrent Neural Networks for Automatic Speech Recognition (Yu Zhang, Dong Yu, Guoguo Chen)....Pages 261-279
Sequence-Discriminative Training of Neural Networks (Guoguo Chen, Yu Zhang, Dong Yu)....Pages 281-297
End-to-End Architectures for Speech Recognition (Yajie Miao, Florian Metze)....Pages 299-323
Front Matter ....Pages 325-325
The CHiME Challenges: Robust Speech Recognition in Everyday Environments (Jon P. Barker, Ricard Marxer, Emmanuel Vincent, Shinji Watanabe)....Pages 327-344
The REVERB Challenge: A Benchmark Task for Reverberation-Robust ASR Techniques (Keisuke Kinoshita, Marc Delcroix, Sharon Gannot, Emanuël A. P. Habets, Reinhold Haeb-Umbach, Walter Kellermann et al.)....Pages 345-354
Distant Speech Recognition Experiments Using the AMI Corpus (Steve Renals, Pawel Swietojanski)....Pages 355-368
Toolkits for Robust Speech Processing (Shinji Watanabe, Takaaki Hori, Yajie Miao, Marc Delcroix, Florian Metze, John R. Hershey)....Pages 369-382
Front Matter ....Pages 383-383
Speech Research at Google to Enable Universal Speech Interfaces (Michiel Bacchiani, Françoise Beaufays, Alexander Gruenstein, Pedro Moreno, Johan Schalkwyk, Trevor Strohman et al.)....Pages 385-399
Challenges in and Solutions to Deep Learning Network Acoustic Modeling in Speech Recognition Products at Microsoft (Yifan Gong, Yan Huang, Kshitiz Kumar, Jinyu Li, Chaojun Liu, Guoli Ye et al.)....Pages 401-417
Advanced ASR Technologies for Mitsubishi Electric Speech Applications (Yuuki Tachioka, Toshiyuki Hanazawa, Tomohiro Narita, Jun Ishii)....Pages 419-429
Back Matter ....Pages 431-436

✦ Subjects

Artificial Intelligence (incl. Robotics)
