Speech Recognition By Using Recurrent Neural Networks

International Journal of Scientific and Engineering Research

ISSN Online: 2229-5518

ISSN Print: 2229-5518

Website: http://www.ijser.org

IJSER >> Volume 2, Issue 6, June 2011 Edition

Author(s)

Dr. R. L. K. Venkateswarlu, Dr. R. Vasantha Kumari, G. Vani JayaSri

KEYWORDS

Frames, Mel-frequency cepstral coefficient, Multi Layer Perceptron (MLP), Neural Networks, Performance, Recurrent Neural Network (RNN), Utterances.

Automatic speech recognition is the process by which a computer converts a speech signal into the corresponding sequence of characters in text. In real-life applications, however, speech recognizers are used in adverse environments, and recognition performance typically degrades when the training and testing environments differ. Speech recognition and understanding have been studied for many years. Neural networks are well known for their ability to classify nonlinear problems, and much research has applied them to speech recognition. Although positive results have been obtained, minimizing the error rate remains an active area of research. This research uses a Recurrent Neural Network (RNN), one type of neural network, to observe the differences between letters of the English alphabet from the E-set to the AH-set and, more generally, the differences between phonemes. Its purpose is to improve knowledge and understanding of phoneme and word recognition using an RNN and a Multi Layer Perceptron (MLP) trained with backpropagation. The networks are trained on speech from 6 speakers (a mixture of male and female) in a quiet environment. The English language offers a number of challenges for speech recognition [4]. The paper reports that the Recurrent Neural Network performs better than the Multi Layer Perceptron neural network.
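To illustrate the recurrent idea that the abstract relies on, the following is a minimal sketch of an Elman-style forward pass over a sequence of feature frames. It is not the authors' implementation: the weights are random rather than trained, the frames are synthetic stand-ins for Mel-frequency cepstral coefficients, and the sizes (13 coefficients, 8 hidden units, 4 classes) and all function names are assumptions chosen for illustration.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weights; a stand-in for trained parameters (assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_classify(frames, w_in, w_rec, w_out):
    """Elman-style forward pass: the hidden state carries context across frames."""
    hidden = [0.0] * len(w_rec)
    for frame in frames:
        from_input = matvec(w_in, frame)
        from_past = matvec(w_rec, hidden)
        hidden = [math.tanh(a + b) for a, b in zip(from_input, from_past)]
    # Classify the whole utterance from the final hidden state.
    return softmax(matvec(w_out, hidden))


rng = random.Random(0)
N_COEFFS, N_HIDDEN, N_CLASSES = 13, 8, 4  # hypothetical sizes

w_in = make_matrix(N_HIDDEN, N_COEFFS, rng)
w_rec = make_matrix(N_HIDDEN, N_HIDDEN, rng)
w_out = make_matrix(N_CLASSES, N_HIDDEN, rng)

# Synthetic "utterance": 20 frames of 13 coefficients each.
utterance = [[rng.gauss(0.0, 1.0) for _ in range(N_COEFFS)] for _ in range(20)]

probs = rnn_classify(utterance, w_in, w_rec, w_out)
predicted = max(range(N_CLASSES), key=probs.__getitem__)
```

The only structural difference from a feed-forward pass is the `w_rec` term: a Multi Layer Perceptron sees a fixed-size input with no memory, whereas here each frame is combined with the hidden state left by the frames before it.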


References

[1] Ben Gold and Nelson Morgan, Speech and Audio Signal Processing, Wiley India Edition, New Delhi, 2007.
[2] B. Yegnanarayana, Artificial Neural Networks, Prentice-Hall of India, New Delhi, 2006.
[3] John Coleman, Introducing Speech and Language Processing, Cambridge University Press, 2005.
[4] Mayfield T. L., Black A. and Lenzo K., (2003). “Arabic in my Hand: Small-footprint Synthesis of Egyptian Arabic.” Eurospeech 2003, Geneva, Switzerland.
[5] D. A. Reynolds, “An Overview of Automatic Speaker Recognition Technology”, Proc. ICASSP 2002, Orlando, Florida, pp. 300-304.
[6] Medsker L. R. and Jain L. C., (2001). “Recurrent Neural Networks: Design and Applications.” London, New York: CRC Press LLC.
[7] R. O. Duda, P. E. Hart and D. G. Stork, Pattern Classification, 2nd edn, John Wiley, New York, 2001.
[8] P. Picton, Neural Networks, Palgrave, NY, 2000.
[9] He J. and Liu L., (1999). “Speaker Verification Performance and the Length of Test Sentence.” Proceedings ICASSP 1999, Vol. 1, pp. 305-308.
[10] Gingras F. and Bengio Y., (1998). “Handling Asynchronous or Missing Data with Recurrent Networks.” International Journal of Computational Intelligence and Organizations, Vol. 1, No. 3, pp. 154-163.
[11] Jihene El Malik, (1998). “Kohonen Clustering Networks for Use in Arabic Word Recognition System.” Sciences Faculty of Monastir, Route de Kairouan, 14-16 December.
[12] Ruxin Chen and Jamieson L. H., (1996). “Experiments on the Implementation of Recurrent Neural Networks for Speech Phone Recognition.” Proceedings of the Thirtieth Annual Asilomar Conference on Signals, Systems and Computers, Pacific Grove, California, November, pp. 779-782.
[13] Koizumi T., Mori M., Taniguchi S. and Maruya M., (1996). “Recurrent Neural Networks for Phoneme Recognition.” Department of Information Science, Fukui University, Fukui, Japan. Spoken Language, ICSLP 96, Proceedings, Fourth International Conference, Vol. 1, 3-6 October, pp. 326-329.
[14] Joe Tebelskis, (1995). “Speech Recognition using Neural Networks.” Ph.D. thesis, Carnegie Mellon University.
[15] C. M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, 1995.
[16] L. Rabiner and B.-H. Juang, Fundamentals of Speech Recognition, PTR Prentice Hall, Englewood Cliffs, N.J., 1993.
[17] Lee S. J., Kim K. C., Yoon H. and Cho J. W., (1991). “Application of Fully Neural Networks for Speech Recognition.” Korea Advanced Institute of Science and Technology, Korea, pp. 77-80.
[18] Werbos P., (1990). “Backpropagation Through Time: What It Does and How To Do It.” Proceedings of the IEEE, 78, 1550.
[19] Lippmann R. P., (1989). “Review of Neural Networks for Speech Recognition.” Neural Computation, 1, 1-38.
