International Journal of Scientific and Engineering Research
ISSN Online 2229-5518
ISSN Print: 2229-5518
Website: http://www.ijser.org
IJSER, Volume 3, Issue 6, June 2012
Modified Method Of Document Text Extraction From Document Images Using Haar DWT
Full Text (PDF), pp. 916-924
Author(s)
Navjot Kaur
KEYWORDS
Average component, Detail components, Document text, DWT, Multi-resolution of 2-D DWT, Non-Text Edges, Sub-band images, Text extraction, 2-D Haar Wavelet
ABSTRACT
This paper extends a technique for extracting document text from images using the 2-D Haar wavelet. The discrete wavelet transform (DWT) is a useful tool for signal analysis and image processing, especially for multi-resolution representation. It can decompose a signal into different components in the frequency domain. The two-dimensional discrete wavelet transform (2-D DWT) decomposes an input image into four sub-bands: one average component (LL) and three detail components (LH, HL, HH). The multi-resolution property of the 2-D DWT is used to detect edges in the original image. We select an appropriate threshold value and preliminarily remove the non-text edges in the detail component sub-bands. We then use the logical AND operator to further remove the non-text regions. A second idea, removing large-area regions from the image, is combined with this approach to eliminate non-text regions from document images.
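The decomposition, thresholding, and AND steps described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation. The function names (haar_dwt2, threshold_mask, text_candidate_mask), the averaging normalisation (sums divided by 4), the LH/HL labelling, and the choice to AND the three thresholded detail maps directly are all assumptions made for illustration. The abstract does not state the operands of the AND or the threshold value, and the large-area removal step is not covered here.

```python
def haar_dwt2(image):
    """One-level 2-D Haar DWT of an image given as a list of equal-length rows.

    Height and width must be even. Uses an averaging normalisation
    (assumption) and returns the four sub-bands (LL, LH, HL, HH).
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            # The 2x2 block of pixels that maps to one coefficient per sub-band.
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            ll_row.append((a + b + d + e) / 4)  # average component
            lh_row.append((a + b - d - e) / 4)  # top row minus bottom row
            hl_row.append((a - b + d - e) / 4)  # left column minus right column
            hh_row.append((a - b - d + e) / 4)  # diagonal difference
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def threshold_mask(band, threshold):
    """Mark coefficients whose magnitude exceeds the threshold as edges."""
    return [[abs(v) > threshold for v in row] for row in band]


def text_candidate_mask(image, threshold):
    """AND the thresholded LH, HL and HH edge maps (assumed combination)."""
    _, lh, hl, hh = haar_dwt2(image)
    masks = [threshold_mask(band, threshold) for band in (lh, hl, hh)]
    # zip(*masks) groups the three masks row by row; zip(*rows) groups them
    # cell by cell, so all(cell) is the logical AND of the three edge maps.
    return [[all(cell) for cell in zip(*rows)] for rows in zip(*masks)]


if __name__ == "__main__":
    sample = [
        [8, 0, 4, 4],
        [0, 0, 4, 4],
        [4, 4, 4, 4],
        [4, 4, 0, 4],
    ]
    print(text_candidate_mask(sample, 1.5))  # [[True, False], [False, False]]
```

In the sample, only the top-left 2x2 block has strong responses in all three detail sub-bands, so it is the only position that survives the AND. The bottom-right block has weaker detail coefficients (magnitude 1) and is removed by the threshold of 1.5.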