International Journal of Scientific and Engineering Research
ISSN Online 2229-5518
ISSN Print: 2229-5518
Website: http://www.ijser.org
IJSER, Volume 2, Issue 9, September 2011
Automatic Reordering Rule Generation Based On Parallel Tagged Aligned Corpus for Myanmar-English Machine Translation
Author(s)
Thinn Thinn Wai, Tin Myat Htwe, Ni Lar Thein
KEYWORDS
Constituent Analysis, English-Myanmar Machine Translation, Parallel Tagged Aligned Corpus, Reordering, Syntactic Analysis
ABSTRACT
Reordering is an important problem when translating between language pairs with different word orders. Myanmar is a verb-final language, so reordering is needed when it is translated into languages whose word order differs from Myanmar's. This paper presents automatic reordering rule generation for Myanmar-English machine translation. To generate reordering rules, a Myanmar-English parallel tagged aligned corpus is first created. Reordering rules are then generated automatically from the linguistic information in this corpus. Two rule extraction algorithms are proposed, one based on function tags and one based on part-of-speech tags. Because rule generation depends only on part-of-speech tags and function tags, these algorithms can also be used for other language pairs that need reordering.
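The idea of extracting reordering rules from a tagged, aligned corpus can be illustrated with a minimal Python sketch. It is not the authors' algorithm, which the abstract does not spell out. It assumes each corpus entry is a pair of a source-side tag sequence and an alignment list giving the target position of each source unit. The tag names (Subj, Obj, Verb), the toy corpus, and the function names extract_rule, collect_rules, and apply_rule are all hypothetical.

```python
from collections import Counter


def extract_rule(source_tags, alignment):
    """Return (tag sequence, permutation) for one aligned sentence pair.

    source_tags: tags of the source-side units, in source order.
    alignment:   alignment[i] is the target position of source unit i.
    """
    # Sort source indices by aligned target position: this is the order
    # the source units must take to match the target word order.
    order = sorted(range(len(source_tags)), key=lambda i: alignment[i])
    return tuple(source_tags), tuple(order)


def collect_rules(corpus):
    """Count how often each (tag sequence, permutation) rule is observed."""
    return Counter(extract_rule(tags, align) for tags, align in corpus)


def apply_rule(units, permutation):
    """Reorder source units according to a rule's permutation."""
    return [units[i] for i in permutation]


# Hypothetical toy corpus: verb-final source order aligned to an
# SVO target order. Tag names are illustrative assumptions.
corpus = [
    (["Subj", "Obj", "Verb"], [0, 2, 1]),
    (["Subj", "Obj", "Verb"], [0, 2, 1]),
    (["Subj", "Verb"], [0, 1]),
]

rules = collect_rules(corpus)
reordered = apply_rule(["I", "rice", "eat"], (0, 2, 1))
```

Because the rules are keyed on tags alone, the same procedure would apply to any language pair for which tagged, aligned data is available, which is the portability argument the abstract makes.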