Improving End-to-end Neural Network Models for Low-resource Automatic Speech Recognition

Total Pages: 140
ISBN-10: OCLC:1227518442

Book Synopsis of Improving End-to-end Neural Network Models for Low-resource Automatic Speech Recognition, by Jennifer Fox Drexler

Written by Jennifer Fox Drexler and released in 2020 with 140 pages. Available in PDF, EPUB and Kindle formats. Book excerpt:

In this thesis, we explore the problem of training end-to-end neural network models for automatic speech recognition (ASR) when only limited training data are available. End-to-end models are theoretically well-suited to low-resource languages because they do not rely on expert linguistic resources, but they are difficult to train without large amounts of transcribed speech, which is prohibitively expensive to acquire for most of the world's languages. We present several methods for improving end-to-end neural network-based ASR in low-resource scenarios.

First, we explore two methods for creating a shared embedding space for speech and text. In doing so, we learn representations of speech that capture only linguistic content and not, for example, the speaker or noise characteristics of the speech signal. These linguistic-only representations help the ASR model generalize better to unseen speech by discouraging it from learning spurious correlations between the text transcripts and extra-linguistic factors in the audio. The shared embedding space also enables semi-supervised training of some parameters of the ASR model with additional text.

Next, we experiment with two techniques for probabilistically segmenting text into subword units during training. We introduce the n-gram maximum likelihood loss, which allows the ASR model to learn an inventory of acoustically inspired subword units as part of the training process. We show that this technique combines well with the embedding-space alignment techniques above, leading to a 44% relative improvement in word error rate in the lowest-resource condition tested.
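The probabilistic subword segmentation the abstract mentions can be illustrated with a small sketch. This is not the thesis's actual n-gram maximum likelihood loss (there, the unit inventory and its probabilities are learned jointly with the acoustic model, and the function and variable names below are purely illustrative); it only shows, under a toy unigram model with a fixed hand-picked inventory, how the likelihood of a word marginalizes over all possible segmentations via dynamic programming:

```python
import math

def marginal_log_prob(word, unit_log_probs, max_len=4):
    """log P(word) = log of the sum, over all segmentations of `word`
    into units from the inventory, of the product of unit probabilities.
    Computed with a forward dynamic program in O(len(word) * max_len)."""
    n = len(word)
    # alpha[i] = log-probability mass of all segmentations of word[:i]
    alpha = [-math.inf] * (n + 1)
    alpha[0] = 0.0  # empty prefix has probability 1
    for i in range(1, n + 1):
        for j in range(max(0, i - max_len), i):
            unit = word[j:i]
            if unit in unit_log_probs and alpha[j] > -math.inf:
                cand = alpha[j] + unit_log_probs[unit]
                if alpha[i] == -math.inf:
                    alpha[i] = cand
                else:
                    # log-sum-exp accumulation of this segmentation path
                    alpha[i] = math.log(math.exp(alpha[i]) + math.exp(cand))
    return alpha[n]

# Toy inventory: two characters plus one larger unit.
units = {"a": math.log(0.3), "b": math.log(0.3), "ab": math.log(0.4)}

# "ab" can be segmented as ["ab"] (p = 0.4) or ["a", "b"] (p = 0.09),
# so its marginal probability is 0.49.
print(round(math.exp(marginal_log_prob("ab", units)), 2))  # → 0.49
```

Because the loss sums over every segmentation rather than committing to one, training is free to shift probability mass toward whichever subword units fit the acoustics best, which is the intuition behind learning an acoustically inspired unit inventory.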


Improving End-to-end Neural Network Models for Low-resource Automatic Speech Recognition Related Books

Exploring Neural Network Architectures for Acoustic Modeling
Language: en
Pages: 132
Authors: Yu Zhang (Ph.D.)
Categories:
Type: BOOK - Published: 2017 - Publisher:


Deep neural network (DNN)-based acoustic models (AMs) have significantly improved automatic speech recognition (ASR) on many tasks. However, ASR performance still…
Speech Recognition using Deep Learning
Language: en
Pages: 50
Authors: Dr. Narendrababu Reddy G
Categories: Antiques & Collectibles
Type: BOOK - Published: - Publisher: Archers & Elevators Publishing House


New Era for Robust Speech Recognition
Language: en
Pages: 433
Authors: Shinji Watanabe
Categories: Computers
Type: BOOK - Published: 2017-10-30 - Publisher: Springer


This book covers the state-of-the-art in deep neural-network-based methods for noise robustness in distant speech recognition applications. It provides insights…
Neural Network Based Representation Learning and Modeling for Speech and Speaker Recognition
Language: en
Pages: 127
Authors: Jinxi Guo
Categories:
Type: BOOK - Published: 2019 - Publisher:


Deep learning and neural network research has grown significantly in the fields of automatic speech recognition (ASR) and speaker recognition. Compared to traditional…