Advances in Deep Learning Methods for Speech Recognition and Understanding

Author : Dmitriy Serdyuk
Publisher :
Total Pages :
Release : 2020
ISBN-10 : OCLC:1238031787
ISBN-13 :

Book Synopsis Advances in Deep Learning Methods for Speech Recognition and Understanding by : Dmitriy Serdyuk

Advances in Deep Learning Methods for Speech Recognition and Understanding was written by Dmitriy Serdyuk and released in 2020. Book excerpt: This work presents several studies in the areas of speech recognition and understanding. Semantic speech understanding is an important sub-domain of the broader field of artificial intelligence. Speech processing has long interested researchers because language is one of the defining characteristics of a human being. With the development of neural networks, the domain has seen rapid progress both in terms of accuracy and human perception. Another important milestone was achieved with the development of end-to-end approaches. Such approaches allow co-adaptation of all parts of the model, which increases performance and simplifies the training procedure. End-to-end models became feasible with the increasing amount of available data and computational resources and, most importantly, with many novel architectural developments. Nevertheless, traditional, non-end-to-end approaches are still relevant for speech processing because of challenging data in noisy environments, accented speech, and a high variety of dialects.
In the first work, we explore hybrid speech recognition in noisy environments. We propose to treat recognition in an unseen noise condition as a domain adaptation task. For this, we use adversarial domain adaptation, a technique that was novel at the time. In a nutshell, this prior work proposed to train features so that they are discriminative for the primary task but non-discriminative for the secondary task. The secondary task is constructed to be domain recognition. Thus, the trained features are invariant to the domain at hand. In our work, we adopt this technique and modify it for the task of noisy speech recognition.
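The excerpt does not say how the features are made non-discriminative for the domain task. One common mechanism in adversarial domain adaptation is gradient reversal, and the sketch below assumes it. The function name `feature_gradient` and the parameter `reversal_weight` are hypothetical and do not come from the book; the snippet only illustrates the sign flip applied to the domain gradient before it reaches the shared features.

```python
from typing import List


def feature_gradient(
    task_grad: List[float],
    domain_grad: List[float],
    reversal_weight: float,
) -> List[float]:
    """Combine gradients for the shared features under gradient reversal.

    task_grad:       gradient of the primary (recognition) loss w.r.t. the features
    domain_grad:     gradient of the domain-classifier loss w.r.t. the features
    reversal_weight: assumed hyperparameter scaling the reversed domain gradient
    """
    if len(task_grad) != len(domain_grad):
        raise ValueError("gradients must have the same length")
    # The domain gradient enters with a negative sign: a descent step on this
    # combined gradient lowers the task loss while raising the domain loss,
    # which pushes the features towards domain invariance.
    return [t - reversal_weight * d for t, d in zip(task_grad, domain_grad)]


if __name__ == "__main__":
    print(feature_gradient([0.5, -1.0], [1.0, 2.0], reversal_weight=0.5))
```

With `reversal_weight` set to zero the features are trained on the primary task alone, so the weight controls how strongly domain invariance is encouraged.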
In the second work, we develop a general method for regularizing generative recurrent networks. Recurrent networks are known to frequently have difficulty staying on the same track when generating long outputs. While bi-directional networks can be used for better sequence aggregation in feature learning, they are not applicable to the generative case. We develop a way to improve the consistency of long sequences generated with recurrent networks. We propose a way to construct a model similar to a bi-directional network. The key insight is to use a soft L2 loss between the forward and the backward generative recurrent networks. We provide an experimental evaluation on a multitude of tasks and datasets, including speech recognition, image captioning, and language modeling.
In the third paper, we investigate the possibility of developing an end-to-end intent recognizer for spoken language understanding. Semantic spoken language understanding is an important step towards developing human-like artificial intelligence. End-to-end approaches have shown high performance on tasks including machine translation and speech recognition. We draw inspiration from prior work to develop an end-to-end system for intent recognition.
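The soft L2 loss between the forward and backward generative networks, mentioned for the second work, can be illustrated with a minimal sketch. It assumes that both networks expose one hidden-state vector per time step and that the backward states have already been re-ordered to align with the forward ones. The names `twin_l2_penalty`, `combined_loss`, and `weight` are hypothetical, and the excerpt does not specify whether the states are transformed before being compared, so this is not the author's exact formulation.

```python
from typing import List, Sequence

Vector = Sequence[float]


def twin_l2_penalty(
    forward_states: List[Vector], backward_states: List[Vector]
) -> float:
    """Mean squared L2 distance between time-aligned hidden states."""
    if len(forward_states) != len(backward_states):
        raise ValueError("state sequences must cover the same time steps")
    if not forward_states:
        return 0.0
    total = 0.0
    for h_fwd, h_bwd in zip(forward_states, backward_states):
        if len(h_fwd) != len(h_bwd):
            raise ValueError("hidden states must have the same dimension")
        # Squared Euclidean distance between the two states at this step.
        total += sum((f - b) ** 2 for f, b in zip(h_fwd, h_bwd))
    return total / len(forward_states)


def combined_loss(
    forward_nll: float, backward_nll: float, penalty: float, weight: float = 1.0
) -> float:
    """Both generative losses plus the weighted (soft) matching penalty."""
    return forward_nll + backward_nll + weight * penalty


if __name__ == "__main__":
    fwd = [[1.0, 0.0], [0.0, 2.0]]
    bwd = [[1.0, 0.0], [0.0, 0.0]]
    penalty = twin_l2_penalty(fwd, bwd)
    print(penalty, combined_loss(1.0, 1.5, penalty, weight=0.5))
```

The penalty is "soft" in the sense that it is added to the objective with a weight instead of forcing the two sets of states to be equal.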


Advances in Deep Learning Methods for Speech Recognition and Understanding Related Books

Advances in Deep Learning Methods for Speech Recognition and Understanding
Language: en
Pages:
Authors: Dmitriy Serdyuk
Categories:
Type: BOOK - Published: 2020 - Publisher:

Deep Learning for NLP and Speech Recognition
Language: en
Pages: 621
Authors: Uday Kamath
Categories: Computers
Type: BOOK - Published: 2019-06-10 - Publisher: Springer

This textbook explains Deep Learning Architecture, with applications to various NLP Tasks, including Document Classification, Machine Translation, Language Mode
Automatic Speech Recognition
Language: en
Pages: 329
Authors: Dong Yu
Categories: Technology & Engineering
Type: BOOK - Published: 2014-11-11 - Publisher: Springer

This book provides a comprehensive overview of the recent advancement in the field of automatic speech recognition with a focus on deep learning models includin
Deep Learning
Language: en
Pages: 212
Authors: Li Deng
Categories: Machine learning
Type: BOOK - Published: 2014 - Publisher:

Provides an overview of general deep learning methodology and its applications to a variety of signal and information processing tasks
Deep Learning in Natural Language Processing
Language: en
Pages: 338
Authors: Li Deng
Categories: Computers
Type: BOOK - Published: 2018-05-23 - Publisher: Springer

In recent years, deep learning has fundamentally changed the landscapes of a number of areas in artificial intelligence, including speech, vision, natural langu