International Journal of Scientific & Engineering Research, Volume 3, Issue 2, February-2012

ISSN 2229-5518

Enhanced Content Based Image Retrieval Using

Multiple Feature Fusion Algorithms

R. Priya, Dr. Vasantha Kalyani David

ABSTRACT- Recently, the use of multimedia content such as images and videos has increased. This growth has created the problem of locating a desired item in a very large database. This paper presents a Content Based Image Retrieval (CBIR) system that uses multiple feature fusion to retrieve images. Features such as color, shape and texture are used. The color histogram is used to extract the color feature and an active contour model is used for shape extraction. K-means and SOM algorithms are used for clustering and dimensionality reduction. The experimental results show that the proposed CBIR system outperforms the earlier SOM-based color histogram system in terms of precision, recall and speed of image retrieval.

Index Terms- Content Based Image Retrieval, color histogram, contour model, K-means, Self Organizing Map

1. INTRODUCTION

—————————— ——————————

2. LITERATURE STUDY

Content-Based Image Retrieval (CBIR) is a process that searches and retrieves images from a large database on the basis of automatically derived features such as color, texture and shape. The techniques, tools and algorithms used in CBIR originate from many fields, such as statistics, pattern recognition, signal processing and computer vision. This field of research is attracting professionals from different industries such as crime prevention, medicine, architecture, fashion and publishing. Over the past decade the volume of digital images produced in these areas has increased dramatically, and the World Wide Web plays a vital role in this upsurge. Several companies maintain large image databases, where the requirement is a technique that can search and retrieve images in a manner that is both time efficient and accurate (Xiaoling, 2009).
The retrieval process involves two steps. The first step is the 'feature extraction' step, which identifies a unique signature, termed the feature vector, for every image based on its pixel values. The feature vector holds the characteristics that describe the contents of an image. Visual features such as color, texture and shape are most commonly used in this step. The second step is the classification step, which matches the features extracted from a query image with the features of the database images and groups the images according to their matching.
Feature extraction mainly concentrates on color, texture and shape features. Of these, color is considered the most dominant and distinguishing visual feature. A color histogram describes the global color distribution of an image and is a frequently used technique for content-based image retrieval (Wang and Qin, 2009) because of its efficiency and effectiveness.
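As a rough illustration of such a global color histogram, the following minimal Python sketch quantizes HSV pixels into 72 bins. The 72-bin total matches the experimental setup reported in this paper, but the 8 x 3 x 3 split across hue, saturation and value, the function name `hsv_histogram` and the tiny test image are assumptions made only for this example.

```python
import colorsys

def hsv_histogram(pixels, h_bins=8, s_bins=3, v_bins=3):
    """Return a normalized HSV histogram for a list of (r, g, b) pixels in 0..255.

    The 8 x 3 x 3 quantization (72 bins) is an assumed split, not taken from the paper.
    """
    hist = [0.0] * (h_bins * s_bins * v_bins)
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        # Quantize each channel; min() keeps the value 1.0 inside the last bin.
        hi = min(int(h * h_bins), h_bins - 1)
        si = min(int(s * s_bins), s_bins - 1)
        vi = min(int(v * v_bins), v_bins - 1)
        hist[(hi * s_bins + si) * v_bins + vi] += 1.0
    total = sum(hist)
    # Normalize so that images of different sizes are comparable.
    return [count / total for count in hist] if total else hist

# Hypothetical 2 x 2 image: two red pixels, one green, one blue.
image = [(255, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)]
hist = hsv_histogram(image)
```

Because the histogram only counts colors, it is unaffected by where the pixels sit in the image, which is the robustness to rotation that the literature cited here relies on.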
Cinque et al. (1999) present a spatial-chromatic histogram that considers the position and variances of color blocks in an image. Huang et al. (1997) propose the color correlogram for refining the histogram, which distills the spatial correlation of colors.
Xiaoling and Hongyan (2009) proposed a new method that combines the color histogram with spatial information. The method maintains the robustness of the traditional histogram to image rotation and scaling, while incorporating the spatial information of pixels. The high dimensionality of such histograms indicates that feature reduction methods can be applied to improve performance. Another side effect of large dimensionality is that it increases the complexity and computation of the distance function. It particularly complicates 'cross' distance functions that include the perceptual distance between histogram bins [2]. In our previous work, this method was enhanced to use a tree-structured representation that combines the color histogram and region features.

• R. Priya, Head & Assistant Professor, Department of Computer Science, Sree Narayana Guru College, Coimbatore, India. E-mail: priyaminerva@rediffmail.com
• Dr. Vasantha Kalyani David, Associate Professor, Department of Computer Science, Avinashilingam University for Women, Coimbatore, India.


IJSER © 2012 http://www.ijser.org

This method uses the Self Organizing Map (SOM) to improve accuracy while simultaneously performing dimensionality reduction. The SOM, also known as a Kohonen map, is a technique that reduces the dimensions of data through the use of self-organizing neural networks. This approach uses the improved histograms as the image feature vector. The feature vector, F(I), is generated for each image I in the collection. When a query is made, the feature vector for the query image is calculated, and the similarity between any two images is calculated using the Euclidean distance between their feature vectors.
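The query step described above can be sketched as a nearest-neighbor ranking over feature vectors. This is only a minimal illustration: the names `euclidean` and `rank_images`, the dictionary layout of the database and the three-component vectors are hypothetical and are not taken from the original system.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def rank_images(query_vec, database, top_n=3):
    """Return the names of the top_n database images closest to the query.

    database: dict mapping an image name to its feature vector F(I).
    """
    scored = sorted((euclidean(query_vec, vec), name) for name, vec in database.items())
    return [name for _, name in scored[:top_n]]

# Hypothetical feature vectors (for example, tiny normalized histograms).
database = {
    "img_a": [0.5, 0.5, 0.0],
    "img_b": [0.0, 0.5, 0.5],
    "img_c": [0.4, 0.6, 0.0],
}
result = rank_images([0.5, 0.5, 0.0], database, top_n=2)
```

A linear scan like this costs one distance computation per database image, which is one reason the dimensionality of the feature vector matters for retrieval speed.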
When the above procedure is combined with the SOM, the following procedure is followed. The SOM consists of M x M units, where M x M >> N. This makes it possible to map distinct feature vectors to unique locations in the SOM by allowing each image to occupy its own region of the map. The weight vectors of all units in the SOM are stored using a single color histogram. Since each pixel can keep 3 values in its HSV color space, to handle weight vectors of k dimensions the pixels are grouped in k/3 tiles, with pixels from the same position of different tiles keeping the values of the same unit's weight vector. All values in the weight vectors are initialized using the weight factor calculated using Equation (5).
During training, the image blocks are given as input to the network. These input vectors are compared with the network weight vectors to choose a neuron in the competitive layer as the winner. The winner is the neuron whose weight vector is most similar to the input vector, in other words, the neuron with the minimum Euclidean distance from the input vector. The input vector, say $x$, is applied simultaneously to all nodes. The similarity between $x$ and the weight $w_i$ is measured in terms of the spatial neighborhood $N_m$. The weights within the currently winning neighborhood undergo adaptation at the current learning step; the other weights remain unaffected. The neighborhood $N_m$ is found around the best matching node $m$ such that
$$\|x - w_m\| = \min_i \|x - w_i\| \qquad (1)$$
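The winner selection of Equation (1) and the adaptation of the winning neighborhood can be sketched as a single training step. This is a minimal sketch under stated assumptions: the square grid neighborhood, the flat list layout of the weights and the helper names are hypothetical, and the adaptation is written in the usual Kohonen form, which moves each weight in the neighborhood a fraction alpha of the way toward the input.

```python
import math

def best_matching_unit(x, weights):
    """Index m of the unit whose weight vector is closest to x (Equation 1)."""
    dists = [math.sqrt(sum((a - b) ** 2 for a, b in zip(x, w))) for w in weights]
    return min(range(len(weights)), key=dists.__getitem__)

def neighborhood(m, side, radius):
    """Units within `radius` grid steps of unit m on a side x side map.

    A square neighborhood is assumed here; the paper does not fix its shape.
    """
    row, col = divmod(m, side)
    return [r * side + c
            for r in range(max(0, row - radius), min(side, row + radius + 1))
            for c in range(max(0, col - radius), min(side, col + radius + 1))]

def train_step(x, weights, side, radius, alpha):
    """Adapt only the weights inside the winning neighborhood; return the winner."""
    m = best_matching_unit(x, weights)
    for i in neighborhood(m, side, radius):
        weights[i] = [w + alpha * (a - w) for a, w in zip(x, weights[i])]
    return m

# Hypothetical 2 x 2 map with 2-dimensional weight vectors.
weights = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
winner = train_step([0.9, 0.1], weights, side=2, radius=0, alpha=0.5)
```

With `radius=0` only the winning unit is adapted, which corresponds to the situation near the end of training when the neighborhood has shrunk to the central cell.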
The radius of $N_m$ decreases as training progresses. Towards the end of training, the neighborhood may involve no cells other than the central winning one. The weight-updating rule for the Self Organizing Feature Map is defined as
$$\Delta w_i(t) = \alpha\,[x(t) - w_i(t)] \quad \text{for } i \in N_m(t) \qquad (2)$$
where $N_m(t)$ denotes the current spatial neighborhood and $\alpha$ denotes the learning rate. After training, the weight vectors of the neurons of the Kohonen layer act as code vectors.

3. PROPOSED METHODOLOGY

The proposed method extends the second work, that is, it combines SOM + color features with the shape feature. The three selected descriptors are color, texture and shape. Let C be the set of color features and S be the set of shape features. Let R be the resultant feature vector after concatenating C and S. The average of each feature over the entire database is used to normalize the individual feature components. Let the normalized vectors be C' and S'. To avoid the curse of dimensionality, a k-means clustering algorithm is used. The k-means algorithm performs a two-step procedure, where the first step removes redundant features and the second step retains representative points and removes points that are very near to a retained point. These points are removed because, being in the vicinity, they may not provide additional information. The distance measure used is the Euclidean distance and the number of clusters is determined using the PBM cluster validity index (Pakhira et al., 2004). To further select the optimum feature vector, the SOM method proposed in Priya and Vasantha Kalyani David (2010) is used.
Let R' be the dimensionality-reduced feature database and R'' the feature vector obtained from the query image; the retrieval system is then based on a similarity measure defined between R' and R''. Feature matching is performed using a point pattern matching algorithm. This algorithm considers two points as matching if and only if the spatial distance, directional distance and Euclidean distance between the corresponding features are within a threshold (Th), and each feature of R'' and R' contains (c, s, θ, feature) (c, s features belonging to different feature vectors). This means that a point in R'' is said to match a point in R' if the spatial distance (SD) between them is within a given tolerance Th1, the direction difference (DD) between them is within an angular tolerance Th2 and the Euclidean difference (ED) between them is within a threshold Th3. To further improve the accuracy of the formulas, a weight $W_i$ is attached to each of the feature vectors. The weight is calculated as
$$W_i = \frac{1}{1 + \sigma_i}$$
where $\sigma_i$ is the standard deviation of the ith feature of the image. The distance Equations (3) – (5) used for this purpose are given below.
$$SD(R', R'') = \left(W_c\,|c'' - c'|^2 + W_s\,|s'' - s'|^2\right)^{1/2} \le Th_1 \qquad (3)$$
$$DD(R', R'') = \min\left(|\theta'' - \theta'|,\; 360 - |\theta'' - \theta'|\right) \le Th_2 \qquad (4)$$
$$ED(R', R'') = |f'' - f'| \le Th_3 \qquad (5)$$
where $f$ is the feature.
The final matching score for the ED and point pattern matching technique is based on the number of matched pairs found in the two sets, and is computed using Equation (6).


$$\text{Matching Score} = \frac{100 \times Q^2}{M \times N} \qquad (6)$$

where Q is the number of paired points between the database and the query concatenated point sets, while M and N are the number of points in R' and R'' respectively. The top 'n' closest images are taken as the query result, excluding the query image present in the database.
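The three threshold tests and the matching score can be sketched together as follows. This is a hedged illustration only: the tuple layout (c, s, theta, f), the greedy one-to-one pairing, the function names and all numeric values are assumptions, the angles are taken to be in degrees within [0, 360), and both point sets are assumed to be non-empty.

```python
import math

def is_match(p, q, w_c, w_s, th1, th2, th3):
    """p, q: (c, s, theta_degrees, f) tuples. True when SD, DD and ED are all within tolerance."""
    sd = math.sqrt(w_c * (q[0] - p[0]) ** 2 + w_s * (q[1] - p[1]) ** 2)
    diff = abs(q[2] - p[2])
    dd = min(diff, 360 - diff)  # wrap-around angular difference
    ed = abs(q[3] - p[3])
    return sd <= th1 and dd <= th2 and ed <= th3

def matching_score(db_points, query_points, w_c, w_s, th1, th2, th3):
    """Score = 100 * Q^2 / (M * N), with Q found by greedy one-to-one pairing (assumed)."""
    used = set()
    paired = 0
    for q in query_points:
        for i, p in enumerate(db_points):
            if i not in used and is_match(p, q, w_c, w_s, th1, th2, th3):
                used.add(i)  # each database point may be paired only once
                paired += 1
                break
    return 100.0 * paired ** 2 / (len(db_points) * len(query_points))

# Hypothetical point sets and tolerances.
db_points = [(0.2, 0.3, 10.0, 0.5), (0.8, 0.7, 350.0, 0.9)]
query_points = [(0.22, 0.31, 15.0, 0.52), (0.5, 0.5, 180.0, 0.1)]
score = matching_score(db_points, query_points, w_c=1.0, w_s=1.0,
                       th1=0.05, th2=10.0, th3=0.05)
```

In this example one of the two query points pairs with a database point, so Q = 1 and M = N = 2. When every point is paired and M = N, the score reaches its maximum of 100.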

4. EXPERIMENTAL RESULTS

The image database used during experimentation consists of 650 JPEG color images randomly selected from the World Wide Web. Figure depicts a sample of images in the database.

[Figure: Recall (horizontal axis, 0.1 to 1.0) plotted against a vertical axis from 0 to 0.9, with curves for the SOM-Colour CBIR System and the Fusion-SOM CBIR System]


From the figure, it can be seen that the Fusion-SOM CBIR method is an improved version of the base system using SOM and histogram. It also indicates that combining various features improves the image retrieval process. The image retrieval time of the Fusion-SOM CBIR System was 1.43 seconds and that of the SOM-color CBIR System was 1.67 seconds. Thus, the proposed method shows a speed improvement of 14.37%. This shows that the dimensionality reduction in both cases is effective.
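For reference, the quantities discussed here can be computed as in the following minimal sketch. Precision is taken as the fraction of retrieved images that are relevant and recall as the fraction of relevant images that are retrieved; the function names and the sample image identifiers are hypothetical, while the two retrieval times are the ones reported above.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

def relative_time_saving(baseline_seconds, new_seconds):
    """Percentage reduction in retrieval time relative to the baseline."""
    return 100.0 * (baseline_seconds - new_seconds) / baseline_seconds

# Hypothetical query result: 4 images retrieved, 3 images actually relevant.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "c", "e"])

# Reported times: 1.67 s for SOM-color, 1.43 s for Fusion-SOM.
saving = relative_time_saving(1.67, 1.43)
```

The reported 14.37% follows from (1.67 - 1.43) / 1.67, that is, the saving is measured relative to the slower SOM-color baseline.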

During testing, care was taken to choose query images from different types of images, such as same scene, large change in appearance, etc. The performance metrics used during evaluation are the precision-recall measure and retrieval time. Precision is defined as the fraction of retrieved images that are truly relevant to the query image, and recall is defined as the fraction of relevant images that are actually retrieved. Retrieval time is the time taken to retrieve images after the query image is given. The system was developed in MATLAB 7.3 and all the experiments were conducted on a Pentium IV machine with 512 MB RAM. The histograms for all the images were constructed using 72 color bins after converting the RGB color space to the HSV color space.

5. CONCLUSION

In this paper, a SOM-based color histogram method with multiple features for content-based image retrieval is proposed. The proposed system uses color features and shape features, which are fused to obtain the feature vector. A k-means clustering algorithm and SOM-based dimensionality reduction are used. A similarity measure that combines spatial distance, direction distance and Euclidean distance is used. Several experiments were performed to analyze the performance of the proposed system. The results showed that the combined method is efficient in terms of precision, recall and speed of image retrieval. The work can be extended with the texture feature. Further, the present CBIR system can be integrated with machine learning classifiers to further improve its performance.

References

[1] Xiaoling, W. (2009) A Novel Circular Ring Histogram for Content-Based Image Retrieval, First International Workshop on Education Technology and Computer Science, Vol. 2, Pp. 785-788.


[2] Wang, S. and Qin, H. (2009) A Study of Order-Based Block Color Feature Image Retrieval Compared with Cumulative Color Histogram Method, Sixth International Conference on Fuzzy Systems and Knowledge Discovery, FSKD '09, IEEE Xplore, Vol. 1, Pp. 81-84.

[3] Pass, G. and Zabih, R. (1996) Histogram refinement for content-based image retrieval, Proc. WACV [C], Sarasota, FL, Pp. 96-102.

[4] Banerjee, M., Kundu, M.K. and Das, P.K. (2004) Image Retrieval with Visually Prominent Features using Fuzzy set theoretic Evaluation, ICVGIP 2004, India.

[5] Chen, Y. and Wang, J.Z. (2002) A Region-Based Fuzzy Feature Matching Approach to Content-Based Image Retrieval, IEEE Trans. on PAMI, Vol. 24, No. 9, Pp. 1252-1267.

[6] Deselaers, T., Keysers, D. and Ney, H. (2008) Features for image retrieval: An experimental comparison, Information Retrieval, Vol. 11, Pp. 77-107.

[7] Gevers, T. and Smeulders, A.W.M. (1999) Combining color and shape invariant features for image retrieval, Image and Vision Computing, Vol. 17, No. 7, Pp. 475-488.

[8] Goh, S.T. and Tan, K.L. (2000) MOSAIC: A fast multi-feature image retrieval system, Data & Knowledge Engineering, Vol. 33, Pp. 219-239.

[9] Ha, J., Kim, G. and Choi, H. (2008) The Content-Based Image Retrieval Method Using Multiple Features, Fourth International Conference on Networked Computing and Advanced Information Management, 2008, NCM '08, Pp. 652-657.

[10] Harris, C. and Stephens, M. (1988) A combined corner and edge detector, 4th Alvey Vision Conference, Pp. 147-151.

[11] Kokare, M. and Biswas, P.K. (2007) Texture image retrieval using rotated wavelet filters, Pattern Recognition Letters, Vol. 28, Pp. 1240-1249.

[12] Li, J., Wang, J.Z. and Wiederhold, G. (2000) IRM: Integrated Region Matching for Image Retrieval, Proc. of the 8th ACM Int. Conf. on Multimedia, Pp. 147-156.

[13] Lin, H.J., Kao, Y.T., Yen, S.H. and Wang, C.J. (2004) A study of shape-based image retrieval, Proceedings of 24th International Conference on Distributed Computing Systems Workshops, Pp. 118-123.

[14] Manjunath, B.S. and Ma, W.Y. (1996) Texture feature for browsing and retrieval of image data, IEEE Transaction on PAMI, Vol. 18, No. 8, Pp. 837-842.

[15] Niblack, W. et al. (1993) The QBIC Project: Querying Images by Content Using Color, Texture, and Shape, Proc. SPIE, Vol. 1908, San Jose, CA, Feb. 1993, Pp. 173-187.

[16] Pentland, A., Picard, R. and Sclaroff, S. (1994) Photobook: Content-based Manipulation of Image Databases, in Proc. SPIE Storage and Retrieval for Image and Video Databases II, San Jose, CA, Pp. 34-47.

[17] Sastry, C.S. and Ravindranath, M. (2007) A modified Gabor function for content based image retrieval, Pattern Recognition Letters, Pp. 293-300.

[18] Vadivel, A., Sural, S. and Majumdar, A.K. (2009) Image retrieval from the web using multiple features, Online Information Review, Vol. 33, Iss. 6, Pp. 1169-1188.

[19] Wu, J., Wei, Z. and Chang, Y. (2010) Color and Texture Feature For Content Based Image Retrieval, JDCTA: International Journal of Digital Content Technology and its Applications, Vol. 4, No. 3, Pp. 43-49.

[20] R. Priya and Dr. Vasantha Kalyani David (2010) Improved Content Based Image Retrieval Using Color Histogram And Self Organizing Maps, IJCSIS: International Journal of Computer Science & Information Security, Vol. 8, No. 9, Pp. 243-248.