International Journal of Scientific & Engineering Research, Volume 5, Issue 9, September-2014

ISSN 2229-5518

Content-Based Image Retrieval Using Feature and Color Algorithm

Akshay Shinde1, Akash Malbari2, Salahuddinn Awaise3

1 (Technical Analyst, Nomura Services India Pvt. Ltd., akshayshinde1352007@gmail.com),

2 (Technical Analyst, Nomura Services India Pvt. Ltd., akashmalbari@gmail.com)

3(Assistant System Engineer, Tata Consultancy Services, awaise267@gmail.com)

Abstract- An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images. Content-based image retrieval (CBIR) is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. The proposed content-based image retrieval system is based on an efficient combination of a feature algorithm and a color algorithm. Text-based image retrieval operates by augmenting images with keyword-based annotations, and its search process always relies on keyword matching techniques. In CBIR, visual features of images, such as color, texture, and shape information, are extracted automatically.

Index terms- Euclidean Distance, Feature Extraction, Kekre Transform, HSV Segmentation, Query Image.

1 INTRODUCTION

An image retrieval system is a computer system for browsing, searching, and retrieving images from a large database of digital images. Most traditional and common methods of image retrieval utilize some method of adding metadata, such as captions, keywords, or descriptions, to the images so that retrieval can be performed over the annotation words.
There are several criteria that can be considered in order to classify image retrieval systems. Three basic criteria are as follows:
1. User interaction: browsing, typing text, or supplying an image that is visually similar to the target image.
2. Search performance: how the search engine actually searches, for instance whether the search is accomplished through the analysis of visual features or through semantic annotations.
3. Domain of the search: a standalone search engine only executes a search on a local computer, whereas an Internet-based search engine searches the Web.
This analysis is focused on two major paradigms:
1) Text-based image retrieval
2) Content-based image retrieval

2 EXISTING SYSTEM

Text-based image retrieval is a type of search engine specialized in finding pictures, images, and animations using keywords or search phrases; it returns a set of thumbnail images sorted by relevance.

2.1. Working

Image search is supported by augmenting images with keyword-based annotations, and the search process always relies on keyword matching techniques. The most widespread techniques for creating the annotations that support this search are building keyword indices based on image content, embedding keyword-based labels into the image, or extracting the annotations from the text surrounding images on the Internet, from the filename, or even from the “alt” tag in HTML.
Some examples of keyword-based web search engines are Webshots (www.webshots.com), Ask Images (www.ask-images.com), Google Images, AltaVista, and Picsearch (www.picsearch.com).
Text-based information retrieval is lexically motivated rather than conceptually motivated, which often leads to irrelevant search results. Lexically motivated means that text-based retrieval operates at the word level, not at the level of the meaning of words.

2.2. Drawbacks of Text based Image Retrieval

Manual annotations require too much time and are expensive to implement. As the number of images in a database grows, the difficulty in finding desired images increases.
Manual annotations also fail to deal with the discrepancy of subjective perception: a textual description is often insufficient for depicting what different users perceive. For example, a medical image typically contains several objects, each of which conveys specific information.


Capturing all of the knowledge, concepts, thoughts, and feelings conveyed by the content of an image is almost impossible, and the contents of medical images in particular are difficult to describe concretely in words. To overcome these limitations of text-based image retrieval, content-based image retrieval methods are proposed.

3 PROPOSED SYSTEM

Content-based image retrieval means that the search analyzes the actual contents of the image using image analysis techniques. CBIR [1] is the application of computer vision to the image retrieval problem. The term content in this context refers to low-level properties of the image, such as color, shape, texture, or any other information that can be derived from the image itself. These systems employ image processing technologies to extract visual features and then apply similarity measurements to them. Feature extraction algorithms extract features and store them in the form of multidimensional vectors. Afterwards, a similarity/dissimilarity measure between two feature vectors is defined for each feature. In general, the distance between two vectors is equivalent to the dissimilarity between the corresponding images.

3.1. Working

The CBIR system performs two main operations. The first is feature extraction (FE) [2], where a set of features is obtained from the image to form a feature vector; the features of an image should have a strong relationship with the semantic meaning of the image. In the second step, the CBIR system retrieves the relevant images from the image database for the given query image by comparing the features of the query image with those of the images in the database. Relevant images are retrieved according to the minimum distance (or maximum similarity) measure calculated between the features of the query image and every other image in the database. The working is depicted in Fig. 1 below:

Fig. 1 Working of CBIR
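To make these two operations concrete, the following minimal Python sketch ranks database images by the Euclidean distance of their feature vectors to the query feature vector (the paper's system is implemented in MATLAB; the function names `euclidean_distance` and `retrieve`, and the dictionary of precomputed feature vectors, are illustrative assumptions).

```python
import numpy as np

def euclidean_distance(f1, f2):
    """Dissimilarity between two feature vectors: smaller means more similar."""
    f1, f2 = np.asarray(f1, dtype=float), np.asarray(f2, dtype=float)
    return float(np.sqrt(np.sum((f1 - f2) ** 2)))

def retrieve(query_features, database_features, top_k=5):
    """Rank database images by their distance to the query feature vector.

    database_features maps an image identifier to its precomputed feature
    vector; the top_k closest (identifier, distance) pairs are returned.
    """
    ranked = sorted(
        ((name, euclidean_distance(query_features, feats))
         for name, feats in database_features.items()),
        key=lambda pair: pair[1])
    return ranked[:top_k]
```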

3.1.1. Common CBIR System

In the proposed system, there are two different modules that need to be considered. They are as follows:
1) Kekre's Transform
2) HSV Segmentation
The CBIR system calculates the distance between the query image and the database images using each of the two methods and then merges the results obtained by both (one possible merge rule is sketched below).
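The paper does not spell out the merge rule, so the sketch below assumes one simple possibility: min-max normalising each method's distances before summing them, so that neither method dominates purely because of its numeric scale. Treat the function `fuse_distances` and the normalisation step as illustrative assumptions, not the authors' exact procedure.

```python
import numpy as np

def fuse_distances(d_kekre, d_hsv):
    """Combine per-image distances from the two methods into a single score.

    d_kekre and d_hsv map each image identifier to its distance under the
    Kekre-transform method and the HSV-segmentation method respectively.
    Each set of distances is min-max normalised to [0, 1] before summing.
    """
    def normalise(d):
        vals = np.array(list(d.values()), dtype=float)
        lo, hi = vals.min(), vals.max()
        span = hi - lo if hi > lo else 1.0
        return {name: (dist - lo) / span for name, dist in d.items()}

    nk, nh = normalise(d_kekre), normalise(d_hsv)
    return {name: nk[name] + nh[name] for name in nk}
```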

3.2. Kekre Transform

The Kekre Transform [3] matrix can be described as a square matrix of any order N × N which, unlike most other transforms, need not be a power of 2. All the diagonal and upper-triangular values of the Kekre Transform matrix are one, the entries just below the diagonal are equal to -N + (x - 1) for row x, and all the remaining lower-triangular entries are zero. The generalized Kekre Transform matrix can be represented in matrix form as shown in (1).



K_{N \times N} =
\begin{bmatrix}
1 & 1 & 1 & \cdots & 1 \\
-N+1 & 1 & 1 & \cdots & 1 \\
0 & -N+2 & 1 & \cdots & 1 \\
\vdots & & \ddots & \ddots & \vdots \\
0 & 0 & \cdots & -1 & 1
\end{bmatrix} … (1)
The above matrix can be generated using the following mathematical relation shown in (2).

K_{xy} =
\begin{cases}
1, & x \le y \\
-N + (x-1), & x = y + 1 \\
0, & x > y + 1
\end{cases} …(2)
For taking the Kekre Transform of an image of size N×N, the number of required multiplications is (N-1) and the number of additions is 2N(N-1).
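A minimal Python sketch of generating the Kekre Transform matrix as defined in (1)-(2); the function name `kekre_matrix` is illustrative.

```python
import numpy as np

def kekre_matrix(n):
    """Generalised Kekre Transform matrix of order n x n (n need not be a power
    of two): ones on and above the diagonal, -n + (x - 1) on the entry just
    below the diagonal of row x (1-indexed), and zeros elsewhere, as in (2)."""
    k = np.zeros((n, n))
    for x in range(1, n + 1):            # 1-indexed row
        for y in range(1, n + 1):        # 1-indexed column
            if x <= y:
                k[x - 1, y - 1] = 1.0
            elif x == y + 1:
                k[x - 1, y - 1] = -n + (x - 1)
    return k

# kekre_matrix(4) gives:
# [[ 1.,  1.,  1.,  1.],
#  [-3.,  1.,  1.,  1.],
#  [ 0., -2.,  1.,  1.],
#  [ 0.,  0., -1.,  1.]]
```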

3.3. Kekre Transform Combination Image Retrieval

The image retrieval process has two main steps: feature extraction and query execution. The retrieval process starts by applying the Kekre Transform matrix to the row mean vector and the column mean vector of the image to obtain the Kekre transform row mean coefficients [4] and column mean coefficients respectively, as shown below.
Row Mean Vector = [Average(Row-1), Average(Row-2), …, Average(Row-n)] … (3)
Column Mean Vector = [Average(Column-1), Average(Column-2), …, Average(Column-n)] … (4)
These coefficients are used to extract features from both the query image and each image in the database by multiplying the Kekre Transform matrix with the row mean and column mean vectors. To identify the similarity between the query image and a database image, we calculate the Euclidean distance for the row coefficients and the column coefficients respectively. The two steps of deriving the row coefficients and the column coefficients can also be used as individual methods by applying the Kekre Transform, as in (5), to the row mean or column mean of the whole image [5], but better effectiveness is obtained by considering the combination of the two methods.

Row Coefficient Vector = [K] × [Row Mean Vector]ᵀ, Column Coefficient Vector = [K] × [Column Mean Vector]ᵀ …(5)
where [K] is the Kekre Transform matrix of (1).

This feature vector is compared with those of the images in the database, and the match for each image is calculated using the Minkowski metric (LM norm) equation. The Euclidean distance is the special case of the Minkowski metric obtained when the power equals 2. Thus the Euclidean distance, as in (6), can be represented mathematically as follows:

D(Q, I) = \sqrt{\sum_{j=1}^{n} \left( F_Q(j) - F_I(j) \right)^2} …(6)
where F_Q and F_I denote the feature vectors (of length n) of the query image Q and the database image I.
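The following Python sketch ties these pieces together: row mean and column mean vectors, multiplication by the Kekre Transform matrix, and the Euclidean distance of (6). Whether the row and column coefficients are compared separately or, as here, concatenated into one vector is an assumption made for brevity.

```python
import numpy as np

def kekre_feature_vector(gray_image):
    """Kekre-transform feature vector built from the row mean and column mean
    vectors of a grayscale image, as described in (3)-(5)."""
    img = np.asarray(gray_image, dtype=float)
    row_mean = img.mean(axis=1)          # Average(Row-1) ... Average(Row-n)
    col_mean = img.mean(axis=0)          # Average(Column-1) ... Average(Column-n)

    def kekre(n):
        # Same construction as in the previous sketch, written compactly.
        k = np.triu(np.ones((n, n)))
        k[np.arange(1, n), np.arange(n - 1)] = -n + np.arange(1, n)
        return k

    row_coeff = kekre(row_mean.size) @ row_mean
    col_coeff = kekre(col_mean.size) @ col_mean
    # For brevity the two coefficient vectors are concatenated; the paper
    # computes a Euclidean distance for each of them separately.
    return np.concatenate([row_coeff, col_coeff])

def euclidean_distance(f_query, f_db):
    """Eq. (6): the power-2 special case of the Minkowski metric."""
    diff = np.asarray(f_query, dtype=float) - np.asarray(f_db, dtype=float)
    return float(np.sqrt(np.sum(diff ** 2)))
```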

3.4. Adaptive HSV Segmentation


The pre-processing procedure is based on the HSV color space, which is widely used in computer graphics. The image retrieval process using adaptive HSV segmentation then starts by converting the image from the RGB color space to the HSV color space using the formula in (7):

…(7)

Since the RGB values generally lie in the range 0 to 255, we need to use the formulae given below, which convert the hue values to the range 0° to 360° and the saturation and value each to the range 0 to 1. The formulae, as in (8), are as follows:

…(8)
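Since formulae (7)-(8) are not reproduced in this text, the Python sketch below uses one standard RGB-to-HSV conversion that yields hue in degrees (0° to 360°) and saturation and value in the range 0 to 1; treat it as an illustrative equivalent rather than the paper's exact formulae.

```python
def rgb_to_hsv(r, g, b):
    """Convert 8-bit RGB values (0-255) to hue in degrees [0, 360) and
    saturation and value in [0, 1], using the usual max/min formulation."""
    r, g, b = r / 255.0, g / 255.0, b / 255.0
    v = max(r, g, b)
    c = v - min(r, g, b)                 # chroma
    s = 0.0 if v == 0 else c / v
    if c == 0:                           # grey pixel: hue undefined, use 0
        h = 0.0
    elif v == r:
        h = 60.0 * (((g - b) / c) % 6)
    elif v == g:
        h = 60.0 * (((b - r) / c) + 2)
    else:
        h = 60.0 * (((r - g) / c) + 4)
    return h, s, v

# Example: a pure red pixel (255, 0, 0) maps to (0.0, 1.0, 1.0).
```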

After the conversion of the entire image from the RGB color space to the HSV color space, the image is divided into m different regions depending on the values of hue and saturation. As Table 1 illustrates, the hue is divided into partitions of 20° in order to separate the three primary colors and yellow, magenta, and cyan into three sub-divisions each, and the saturation for each hue partition is further sub-divided in steps of 0.2. This yields 18 × 5 = 90 different regions of color distribution in the image.
After dividing the image into these regions using Table 1, the pixels present in each region of the image are selected. The corresponding hue values are then extracted and grouped together to form a hue vector for the region. This vector is divided into n segments depending on the number of pixels in the hue vector of the region: if the region contains more pixels, its hue vector is divided into more segments, and if it contains fewer pixels, its hue vector is divided into fewer segments. A sketch of this region partitioning is given below.
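The following minimal Python sketch illustrates the region partitioning described above, assuming simple floor-based binning of hue (20° partitions) and saturation (steps of 0.2); the function names `region_index` and `group_hues_by_region` are illustrative.

```python
def region_index(h, s):
    """Map a pixel's hue (degrees) and saturation ([0, 1]) to one of the
    18 x 5 = 90 regions: 20-degree hue partitions, 0.2-wide saturation bands."""
    hue_bin = min(int(h // 20), 17)      # 0 .. 17
    sat_bin = min(int(s // 0.2), 4)      # 0 .. 4
    return hue_bin * 5 + sat_bin         # 0 .. 89

def group_hues_by_region(hsv_pixels):
    """Collect the hue values of the pixels falling in each region, i.e. the
    per-region hue vectors described above; hsv_pixels is an iterable of
    (h, s, v) triples."""
    regions = {}
    for h, s, _v in hsv_pixels:
        regions.setdefault(region_index(h, s), []).append(h)
    return regions
```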


TABLE 1
DIFFERENT RANGES OF HUE AND SATURATION USED IN THE IMAGE RETRIEVAL PROCESS

(Hue is partitioned into eighteen 20° ranges: 0°–20°, 20°–40°, …, 340°–360°; each hue range is further divided into five saturation ranges: 0–0.2, 0.2–0.4, 0.4–0.6, 0.6–0.8, 0.8–1.0, giving the 90 regions described above.)

In order to partition the regions into various segments, we use the following equation, as in (9):
n_i = (X_i / T) × TS …(9)
where n_i represents the number of segments in region i, X_i represents the number of pixels in region i (where i ranges from 1 to m), T represents the total number of pixels of the image, and TS represents the total number of required segments for the entire HSV image. After breaking the various regions into segments, the necessary color-distribution information is calculated by finding the most frequently occurring hue in each segment using the hue histogram. Using this information we can generate the feature vector of the image. In order to perform an image retrieval operation, a feature vector is also generated for each database image from its individual regions. This feature vector is used to compare the two images by means of the Euclidean distance: segments in each region of the query image are compared with the corresponding region of the database image using the Euclidean distance as given in (6). Once the Euclidean distances of the individual regions are computed, their squares are summed and the square root is taken to obtain the final distance used to compare the images.
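A Python sketch of the segment allocation of (9) and the per-segment dominant-hue extraction; rounding each region's allocation up to at least one segment and the 5° histogram bin width are assumptions made for the illustration.

```python
import numpy as np

def segments_for_region(x_i, total_pixels, total_segments):
    """Eq. (9): allocate segments to region i in proportion to its pixel count;
    non-empty regions are given at least one segment (an assumption)."""
    return max(1, round(total_segments * x_i / total_pixels))

def region_feature(hue_vector, n_segments, bin_width=5):
    """Split a region's hue vector into n_segments pieces and keep, for each
    piece, the most frequently occurring hue (the peak of a hue histogram
    with bin_width-degree bins)."""
    hues = np.sort(np.asarray(hue_vector, dtype=float))
    feature = []
    for chunk in np.array_split(hues, n_segments):
        if chunk.size == 0:
            feature.append(0.0)
            continue
        hist, edges = np.histogram(chunk, bins=np.arange(0, 360 + bin_width, bin_width))
        peak = int(np.argmax(hist))
        feature.append(0.5 * (edges[peak] + edges[peak + 1]))   # centre of the peak bin
    return feature
```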

3.5. System Flow

The entire flow of the process is shown in Fig. 2.


Fig. 2. Flow Diagram

4 EXPERIMENTAL RESULTS

The CBIR system is implemented on the MATLAB R2009a [6] platform. A graphical user interface is designed that displays the various operations that can be performed.

Fig. 3. Homepage that displays various options



Fig. 4. Screen that accepts path to database of images

Fig. 5. Relevant images using Kekre’s transform



Fig. 6. Relevant images using HSV segmentation

Fig. 7. Final result of relevant images after combining both Kekre's transform and HSV segmentation

5 CONCLUSION

Content-based image retrieval based on feature and color vector extraction is much more efficient and accurate than traditional text-based systems. Most current content-based image retrieval systems work with low-level features (color, texture, shape); the next generation of systems should operate at a higher semantic level. One way to achieve this is to let the system recognize objects and scenes. CBIR is still a developing science. As image compression, digital image processing, and image feature extraction techniques become more developed, CBIR maintains a steady pace of development in the research field.

REFERENCES

[1] M. Babu Rao et al., “Content Based Image Retrieval Using Dominant Color, Texture and Shape,” International Journal of Engineering Science and Technology (IJEST).

[2] P. S. Hiremath, “Content Based Image Retrieval based on Color, Texture and Shape features using Image and its complement.”

[3] “Kekre Transform over Row Mean, Column Mean and Both Using Image Tiling for Image,” International Journal of Computer and Electrical Engineering, Vol. 2, No. 6, December 2010, pp. 1793-8163.

[4] H. B. Kekre et al., “CBIR Using Kekre's Transform over Row Column Mean and Variance Vectors,” (IJCSE) International Journal on Computer Science and Engineering, Vol. 02, No. 05, 2010, pp. 1609-1614.

[5] H. B. Kekre, Sudeep D. Thepade, and Akshay Maloo, “Extended Performance Appraise of Image Retrieval Using the Feature Vector as Row Mean of Transformed Column Image”; “Content Based Image Retrieval Using Combination of Kekre Transform and HSV Color Segmentation,” IACSIT International Journal of Engineering and Technology, Vol. 3, No. 5, October 2011.

[6] D. J. Higham and N. J. Higham, MATLAB Guide, SIAM, 2000, xxii+283 pages, softcover, ISBN 0-89871-516-4.
