Advances in Deep Learning Methods for Speech Recognition and Understanding

Author	: Dmitriy Serdyuk
Release	: 2020
OCLC	: 1238031787

Book Synopsis: Advances in Deep Learning Methods for Speech Recognition and Understanding by Dmitriy Serdyuk

This work presents several studies in speech recognition and understanding. Semantic speech understanding is an important sub-domain of the broader field of artificial intelligence. Speech processing has long interested researchers because language is one of the defining characteristics of a human being. With the development of neural networks, the domain has seen rapid progress both in terms of accuracy and human perception. Another important milestone was the development of end-to-end approaches. Such approaches allow all parts of the model to co-adapt, which improves performance and simplifies the training procedure. End-to-end models became feasible thanks to the increasing amount of available data and computational resources and, most importantly, many novel architectural developments. Nevertheless, traditional, non-end-to-end approaches remain relevant for speech processing because of challenging data: noisy environments, accented speech, and a wide variety of dialects.
In the first work, we explore hybrid speech recognition in noisy environments. We propose to treat recognition under an unseen noise condition as a domain adaptation task. For this, we use adversarial domain adaptation, a technique that was novel at the time. In a nutshell, this prior work proposed training features so that they are discriminative for the primary task but non-discriminative for a secondary task. The secondary task is constructed to be domain recognition, so the trained features are invariant to the domain at hand. In our work, we adopt this technique and modify it for noisy speech recognition.
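One common way to write the adversarial objective described for the first work is the saddle-point formulation used in the domain-adversarial training literature. The abstract does not give the exact loss used in the thesis, so every symbol here is an assumption introduced for illustration: θ_f parameterizes the shared feature extractor, θ_y the primary (speech recognition) predictor, θ_d the domain (noise condition) classifier, L_y and L_d are their respective losses, and λ > 0 weights the domain term.

```latex
\begin{align}
E(\theta_f, \theta_y, \theta_d)
  &= L_y(\theta_f, \theta_y) - \lambda \, L_d(\theta_f, \theta_d) \\
(\hat{\theta}_f, \hat{\theta}_y)
  &= \arg\min_{\theta_f,\, \theta_y} E(\theta_f, \theta_y, \hat{\theta}_d) \\
\hat{\theta}_d
  &= \arg\max_{\theta_d} E(\hat{\theta}_f, \hat{\theta}_y, \theta_d)
\end{align}
```

Under this reading, the domain classifier minimizes L_d while the feature extractor maximizes it, which pushes the features to carry little information about the noise condition while staying useful for recognition.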
In the second work, we develop a general method for regularizing generative recurrent networks. Recurrent networks are known to have difficulty staying on track when generating long outputs. While bi-directional networks can be used for better sequence aggregation in feature learning, they are not applicable to the generative case. We developed a way to improve the consistency of long sequences generated with recurrent networks by constructing a model similar to a bi-directional network. The key insight is to use a soft L2 loss between the forward and the backward generative recurrent networks. We provide an experimental evaluation on a multitude of tasks and datasets, including speech recognition, image captioning, and language modeling.
In the third paper, we investigate the possibility of developing an end-to-end intent recognizer for spoken language understanding. Semantic spoken language understanding is an important step towards developing human-like artificial intelligence. End-to-end approaches have shown high performance on tasks including machine translation and speech recognition. We draw inspiration from these prior works to develop an end-to-end system for intent recognition.
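As a rough illustration of the second work's idea, the soft L2 loss can be read as a penalty on the distance between time-aligned hidden states of the forward and the backward networks. The sketch below is a toy in plain Python, not the thesis's implementation: the names `twin_l2_penalty` and `total_loss`, the averaging over time steps, the direct comparison of states (with no learned mapping between the two state spaces), and the `weight` parameter are all assumptions made for illustration.

```python
from typing import Sequence

Vector = Sequence[float]


def twin_l2_penalty(forward_states: Sequence[Vector],
                    backward_states: Sequence[Vector]) -> float:
    """Mean squared L2 distance between time-aligned hidden states.

    forward_states[t] is the forward network's state at step t.
    backward_states[k] is the backward network's k-th state while reading
    the sequence from the end, so the list is reversed before comparison.
    """
    if len(forward_states) != len(backward_states):
        raise ValueError("both networks must cover the same number of steps")
    if not forward_states:
        return 0.0

    aligned_backward = list(reversed(backward_states))
    total = 0.0
    for h_fwd, h_bwd in zip(forward_states, aligned_backward):
        if len(h_fwd) != len(h_bwd):
            raise ValueError("hidden states must have the same dimension")
        # Squared Euclidean distance between the two states at this step.
        total += sum((a - b) ** 2 for a, b in zip(h_fwd, h_bwd))
    return total / len(forward_states)


def total_loss(forward_nll: float, backward_nll: float,
               penalty: float, weight: float) -> float:
    """Combine both generation losses with the weighted (soft) L2 penalty."""
    return forward_nll + backward_nll + weight * penalty


# Toy example: one time step, two-dimensional hidden states.
penalty = twin_l2_penalty([[1.0, 2.0]], [[0.0, 0.0]])
loss = total_loss(forward_nll=1.0, backward_nll=2.0, penalty=penalty, weight=0.5)
```

Because the penalty is only added to the training loss with a weight, the forward network is encouraged, not forced, to match the backward one. At generation time only the forward network would be needed under this reading.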

Related Books

Deep Learning for NLP and Speech Recognition

Automatic Speech Recognition

Deep Learning

Deep Learning in Natural Language Processing