Comparison of RNN, LSTM and GRU on Speech Recognition Data

Date

2018

Publisher

North Dakota State University

Abstract

Deep Learning [DL] provides an efficient way to train Deep Neural Networks [DNNs]. When used for end-to-end Automatic Speech Recognition [ASR] tasks, DNNs can produce more accurate results than traditional ASR systems. Standard feedforward neural networks are not well suited to speech data because they cannot persist past information, whereas Recurrent Neural Networks [RNNs] can retain past information and handle temporal dependencies. For this project, three recurrent networks, the standard RNN, Long Short-Term Memory [LSTM] networks, and Gated Recurrent Unit [GRU] networks, are evaluated to compare their performance on speech data. The data set used for the experiments is a reduced version of the TED-LIUM speech corpus. According to the experiments and their evaluation, LSTM achieved the best word error rate of the three networks, while GRU achieved results close to those of LSTM in less training time.
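
As an illustration of the three architectures compared in the abstract, the following minimal sketch (an assumption for illustration only, not the project's actual code) instantiates PyTorch's nn.RNN, nn.LSTM, and nn.GRU layers with placeholder dimensions, runs a batch of acoustic feature frames through each, and reports the output shape and parameter count. The dimensions and feature representation are hypothetical, not taken from the TED-LIUM experiments.

import torch
import torch.nn as nn

# Placeholder dimensions: 8 utterances, 100 frames, 40 acoustic features each.
batch, time_steps, n_features, hidden = 8, 100, 40, 128
x = torch.randn(batch, time_steps, n_features)

# The three recurrent layer types compared in this work.
layers = {
    "RNN":  nn.RNN(n_features, hidden, batch_first=True),
    "LSTM": nn.LSTM(n_features, hidden, batch_first=True),
    "GRU":  nn.GRU(n_features, hidden, batch_first=True),
}

for name, layer in layers.items():
    out, _ = layer(x)  # out has shape (batch, time_steps, hidden)
    n_params = sum(p.numel() for p in layer.parameters())
    print(f"{name}: output {tuple(out.shape)}, {n_params} parameters")

The parameter counts printed by this sketch reflect why the networks differ in training time: an LSTM layer carries roughly four times, and a GRU roughly three times, the weights of a plain RNN layer of the same size.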

Keywords

Recurrent neural networks; Long short-term memory networks; Gated recurrent unit networks; Speech recognition; Deep learning; Deep neural networks; TED-LIUM speech data
