International Journal of Scientific and Engineering Research
ISSN Online 2229-5518
ISSN Print: 2229-5518
Website: http://www.ijser.org
IJSER, Volume 2, Issue 9, September 2011
A Word Sense Disambiguation System Using Naïve Bayesian Algorithm for Myanmar Language
Author(s)
Nyein Thwet Thwet Aung, Khin Mar Soe, Ni Lar Thein
KEYWORDS
Myanmar-English machine translation, Myanmar-English parallel corpus, Naïve Bayes Classifier, Natural Language Processing, supervised approach, unsupervised approach, Word Sense Disambiguation
ABSTRACT
Natural Language Processing (NLP) aims to let humans and machines communicate in natural language. Word Sense Disambiguation (WSD), the task of finding the correct sense of a word in a specific context, has always been a key problem in NLP. WSD approaches fall into two broad groups: supervised and unsupervised. Supervised approaches have obtained better results than unsupervised ones. There is no cited work on resolving word ambiguity in the Myanmar language. Naïve Bayesian (NB) classifiers are known as one of the best methods among supervised approaches to WSD. In this paper, we use a Naïve Bayesian classifier to disambiguate ambiguous Myanmar words whose part of speech is noun or verb. The system uses a Myanmar-English parallel corpus as training data. The WSD module developed here will complement a Myanmar-English machine translation system and can improve the accuracy of Myanmar-to-English translation.
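As an illustration only, the general idea of Naïve Bayesian sense selection can be sketched as follows. This is a minimal sketch, not the system described in the paper: the function names (train, disambiguate), the add-one smoothing, the bag-of-words context features, and the toy English data for the ambiguous word "bank" are all assumptions made for the example. In the paper's setting, the training contexts would presumably come from the Myanmar-English parallel corpus instead.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: (sense label, context words seen with that sense).
EXAMPLES = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "money", "interest"]),
    ("bank/river", ["river", "water", "fishing"]),
    ("bank/river", ["river", "mud", "boat"]),
]

def train(examples):
    """Count sense frequencies and per-sense context-word frequencies."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        word_counts[sense].update(words)
        vocab.update(words)
    return sense_counts, word_counts, vocab

def disambiguate(model, context):
    """Return the sense maximizing log P(sense) + sum of log P(word | sense)."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, -math.inf
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        # Add-one (Laplace) smoothing: an assumption of this sketch.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context:
            if w in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

model = train(EXAMPLES)
print(disambiguate(model, ["money", "loan"]))
print(disambiguate(model, ["boat", "river"]))
```

Working in log space avoids numeric underflow when many context words are multiplied together, and smoothing keeps a single unseen word from driving a sense's probability to zero.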