Comparison of RNN, LSTM and GRU on Speech Recognition Data
Abstract
Deep Learning [DL] provides an efficient way to train Deep Neural Networks [DNNs]. When used for end-to-end Automatic Speech Recognition [ASR] tasks, DNNs can produce more accurate results than traditional ASR systems. Standard feedforward neural networks are not well suited to speech data because they cannot persist past information, whereas Recurrent Neural Networks [RNNs] can retain past information and model temporal dependencies. In this project, three recurrent architectures, the standard RNN, the Long Short-Term Memory [LSTM] network, and the Gated Recurrent Unit [GRU] network, are evaluated to compare their performance on speech data. The data set used for the experiments is a reduced version of the TED-LIUM speech corpus. In the experiments, the LSTM achieved the best word error rate among the three networks, while the GRU achieved results close to those of the LSTM in less training time.