International Journal of Scientific & Engineering Research, Volume 3, Issue 6, June-2012 1

ISSN 2229-5518

SVM classification of high resolution urban satellites Images using Haralick features

Aissam Bekkari, Soufiane Idbraim, Azeddine Elhassouny, Driss Mammass, Mostafa El yassa and Danielle Ducrot.

Abstract— The classification of remotely sensed images has progressed considerably, owing to the availability of images at different resolutions and to the abundance of classification algorithms. SVMs (Support Vector Machines) are a group of supervised classification algorithms that have recently been used in the remote sensing field, and a number of works have shown promising results from the fusion of spatial and spectral information using SVMs. For this purpose, we propose a methodology that combines these two kinds of information. The SVM classification was conducted using a combination of multispectral features and Haralick texture features as the data source. We used homogeneity, contrast, correlation, entropy and local homogeneity, which were the best texture features for improving the classification. The result is compared with both a standard SVM classifier and an SVM classifier with a Graph Cuts approach that introduces spatial-domain information as a post-classification step. The proposed approach was tested on common scenes of urban imagery. Results showed that SVMs, especially with the use of Haralick texture features, outperform the SVM classifier with post-processing in terms of global accuracy. The experimental results indicate a mean accuracy of 94.045 %, which is very promising.

Index Terms— GLCM, Graph Cut, Haralick features, Satellite image, Spatial and spectral information, SVM.



With the commercial emergence of optical satellite images of sub-metric resolution (Ikonos, Quickbird), the production and regular updating of large-scale digital maps has become accessible and increasingly frequent. The classification of such images follows the same principle as that of other image types: it is a method of data analysis that separates the image into several classes in order to gather the data into homogeneous subsets that share common characteristics. It aims to assign to each pixel of the image a label that represents a theme in the real study area (e.g. vegetation, water, built-up area) [1].
Aissam Bekkari is currently a PhD student at Ibn Zohr University, Agadir, Morocco, and a member of the laboratory “IRF-SIC” (Image Pattern Recognitions - Intelligent and Communicating Systems).
Soufiane Idbraim is a Professor of informatics at the Faculty of Sciences, Ibn Zohr University, Agadir, Morocco.
Azeddine Elhassouny is currently a PhD student at Ibn Zohr University, Agadir, Morocco, and a member of the laboratory “IRF-SIC”.
Driss Mammass is a Professor of informatics at the Faculty of Sciences, Ibn Zohr University, Agadir, Morocco, and the director of the High School of Technology of Agadir.
Mostafa El Yassa is a Professor of informatics at the Faculty of Sciences, Ibn Zohr University, Agadir, Morocco.
Danielle Ducrot is currently a professor at the Faculty of Sciences of Paul Sabatier University, Toulouse, and a researcher at the Center for the Study of the BIOsphere from Space (CESBIO).

Several classification algorithms have been developed since the first satellite image was acquired in 1972 [2], [3], [4]. Among the most popular and widely used is the maximum likelihood classifier [5]. It is a parametric approach that assumes that the class signatures are normally distributed. Although this assumption is generally valid, it does not hold for classes consisting of several subclasses or classes having different spectral features [6]. To overcome this problem, some non-parametric classification techniques such as artificial neural networks, decision trees and Support Vector Machines (SVMs) have recently been introduced.
SVMs are a group of advanced machine learning algorithms that have seen increased use in land cover studies [7], [8]. One of the theoretical advantages of the SVM over other algorithms (decision trees and neural networks) is that it is designed to search for an optimal solution to a classification problem, whereas decision trees and neural networks are designed to find a solution that may or may not be optimal. This theoretical advantage has been demonstrated in a number of studies where SVMs generally produced more accurate results than decision trees and neural networks [5], [9]. SVMs have recently been used to map urban areas at different scales with different remotely sensed data. High or medium spatial resolution images (e.g., IKONOS, Quickbird, Landsat (TM)/(ETM+), SPOT) have been widely employed for urban land use classification of individual cities, and for the extraction of buildings, roads and other man-made objects [10], [11].
On the other hand, taking the spatial aspect into account in spectral classification remains very important. For this purpose, Haralick described methods for measuring texture in gray-scale images, and statistics for quantifying those textures. The hypothesis of this research is that Haralick's texture features and statistics, as defined for gray-scale images, can be modified to incorporate spectral information, and that these Spectral Texture Features will provide useful information about the image. It is shown that texture features can be used to classify general classes of materials, and that Spectral Texture Features in particular provide a clearer classification of land cover types than purely spectral methods alone.
The proposed method consists in combining spatial and spectral information to obtain a better classification. We start with the extraction of spectral and spatial information. Then, we apply the SVM classification to the resulting file. Experimental results are provided, and comparisons with the Graph Cuts approach [12] applied to the spectral classification illustrate that the method is able to find better classes.

IJSER © 2012


This paper is organized as follows. In the second section, we discuss the extraction of spatial and spectral information, especially the Grey-Level Co-occurrence Matrix (GLCM) and the Haralick texture features used in the experiments. In Section 3, we outline the classifier used: Support Vector Machines (SVM). In Section 4, the results are presented with a numerical evaluation. Finally, conclusions are given in Section 5.


2.1 Spectral information

The most widely used classification methods for multispectral data consider mainly the spectral dimension. The set of spectral values of each pixel is treated as a vector of attributes which is fed directly to the classifier. According to Fauvel [13], this allows a good classification based on the spectral signature of each area. However, it does not take into account the spatial information represented by the various structures in the image.

2.2 Spatial information

Information in a remotely sensed image can be deduced from its textures. A human analyst is able to distinguish man-made features from natural features in an image based on the ‘regularity’ of the data. Straight lines and regular repetitions of features hint at man-made objects. This spatial information is useful in distinguishing the different fields in the remotely sensed image.
Many approaches have been developed for texture analysis. According to the processing algorithms, three major categories, namely structural, spectral and statistical methods, are common ways of analysing texture. The Grey-Level Co-occurrence Matrix (GLCM) [14] is one of the most widely used methods and a powerful technique for measuring texture features; it contains the relative frequencies of two neighbouring pixels separated by a given distance in the image (Fig 1).
The size of the co-occurrence matrix equals the number of gray levels of the image, so the dynamics of the image are usually reduced (typically to 8 gray levels) in order not to work with matrices that are too large.

Fig. 1. (a) Image 4 × 4 with 4 gray levels, (b) co-occurrence matrix for a displacement of d = (1, 0) and (c) co-occurrence matrix for a displacement of d = (0, 2)

Even when small, a co-occurrence matrix represents a substantial amount of data that is not easy to handle. This is why Haralick uses these matrices to develop a number of spatial indices that are easier to interpret.
Haralick assumed that the texture information is contained in the co-occurrence matrix, and texture features are calculated from it. A large number of textural features have been proposed, starting with the original fourteen features ($f_1$ to $f_{14}$) described by Haralick et al. [15]; however, only some of these features are in wide use. Weszka et al. [16] used four of the Haralick features ($f_1, f_2, f_5, f_8$). Conners and Harlow [17] used five features ($f_1, f_2, f_3, f_4, f_5$). Conners, Trivedi and Harlow [18] introduced two new features which address a deficiency in the Conners and Harlow set ($f_1, f_2, f_4, f_5, f_6, f_7$).
We found that the five features used by Conners and Harlow are commonly used because the fourteen are highly correlated with each other, and the five suffice to give good classification results [19].
In this work, we have used these five features: homogeneity (E), contrast (C), correlation (Cor), entropy (H) and local homogeneity (LH); co-occurrence matrices are calculated for four directions: 0°, 45°, 90° and 135°.
Let us recall their definitions:

$$E = \sum_i \sum_j \big(M(i,j)\big)^2$$

$$C = \sum_{k=0}^{m-1} k^2 \sum_{|i-j|=k} M(i,j)$$

$$Cor = \frac{1}{\sigma_i \sigma_j} \sum_i \sum_j (i-\mu_i)(j-\mu_j)\, M(i,j)$$

where $\mu_i$ and $\sigma_i$ are the horizontal mean and the variance, and $\mu_j$ and $\sigma_j$ are the vertical statistics.

$$H = -\sum_i \sum_j M(i,j) \log\big(M(i,j)\big)$$

$$LH = \sum_i \sum_j \frac{M(i,j)}{1+(i-j)^2}$$

Each texture measure can create a new band that can be incorporated with spectral features for classification purposes.
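As an illustration of how the five texture measures can be obtained from a co-occurrence matrix, the following minimal Python sketch builds a normalised matrix M for one displacement and evaluates E, C, Cor, H and LH. Several points are assumptions rather than details taken from the paper: the matrix is not symmetrised, σ is taken as the standard deviation (the usual normalisation of the correlation), Cor is set to 0 when a deviation is zero, the contrast is computed in the equivalent form Σ (i − j)² M(i, j), and the 4 × 4 image and the names `glcm`, `haralick_features` and `DIRECTIONS` are made up for the example.

```python
import math

def glcm(image, dx, dy, levels):
    """Normalised co-occurrence matrix M for the displacement d = (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:
        raise ValueError("displacement is larger than the image")
    return [[v / total for v in row] for row in counts]

def haralick_features(M):
    """Return (E, C, Cor, H, LH) for a normalised co-occurrence matrix M."""
    cells = [(i, j, p) for i, row in enumerate(M) for j, p in enumerate(row)]
    E = sum(p * p for _, _, p in cells)
    # Same value as sum_k k^2 * sum_{|i-j|=k} M(i, j)
    C = sum((i - j) ** 2 * p for i, j, p in cells)
    H = -sum(p * math.log(p) for _, _, p in cells if p > 0)
    LH = sum(p / (1 + (i - j) ** 2) for i, j, p in cells)
    mu_i = sum(i * p for i, _, p in cells)
    mu_j = sum(j * p for _, j, p in cells)
    # Assumption: sigma is the standard deviation of the marginal distribution
    sd_i = math.sqrt(sum((i - mu_i) ** 2 * p for i, _, p in cells))
    sd_j = math.sqrt(sum((j - mu_j) ** 2 * p for _, j, p in cells))
    cov = sum((i - mu_i) * (j - mu_j) * p for i, j, p in cells)
    Cor = cov / (sd_i * sd_j) if sd_i > 0 and sd_j > 0 else 0.0
    return E, C, Cor, H, LH

# Displacements (dx, dy) for 0, 45, 90 and 135 degrees at distance 1 (rows grow downwards)
DIRECTIONS = {0: (1, 0), 45: (1, -1), 90: (0, -1), 135: (-1, -1)}

# Made-up 4 x 4 image with 4 gray levels, for illustration only
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
features = {angle: haralick_features(glcm(image, dx, dy, 4))
            for angle, (dx, dy) in DIRECTIONS.items()}
```

To produce texture bands, the same computation would be repeated on a window centred on each pixel; the window size is not given in the text and would have to be chosen by the user.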


In this section we briefly describe the general mathematical formulation of SVMs introduced by Vapnik [20], [21]. Starting from the linearly separable case, optimal hyperplanes are introduced. Then, the classification problem is modified to handle non-linearly separable data, and a brief description of multiclass strategies is given.

3.1 Linear SVM

For a two-class problem in an n-dimensional space $\mathbb{R}^n$, we assume that $l$ training samples $x_i \in \mathbb{R}^n$ are available with their corresponding labels $y_i = \pm 1$, $S = \{(x_i, y_i) \mid i \in [1, l]\}$. The SVM method consists of finding the hyperplane that maximizes the margin, i.e., the distance to the closest training data points for both classes [22]. Noting $w \in \mathbb{R}^n$ as the normal vector of the hyperplane and $b \in \mathbb{R}$ as the bias, the hyperplane $H_p$ is defined as:

$$\langle w, x \rangle + b = 0, \quad \forall x \in H_p$$

where $\langle w, x \rangle$ is the inner product between $w$ and $x$. If $x \notin H_p$, then $f(x) = \langle w, x \rangle + b$ gives, up to the factor $1 / \|w\|$, the signed distance of $x$ to $H_p$. The sign of $f$ corresponds to the decision function $y = \mathrm{sgn}(f(x))$.
Finally, the optimal hyperplane has to maximize the margin $2 / \|w\|$. This is equivalent to minimizing $\|w\|^2 / 2$ and leads to the following quadratic optimization problem:

$$\min \left[ \frac{\|w\|^2}{2} \right] \quad \text{subject to } y_i \big( \langle w, x_i \rangle + b \big) \ge 1, \ \forall i \in [1, l]$$

For non-linearly separable data, the optimal parameters $(w, b)$ are found by solving:

$$\min \left[ \frac{\|w\|^2}{2} + C \sum_{i=1}^{l} \xi_i \right] \quad \text{subject to } y_i \big( \langle w, x_i \rangle + b \big) \ge 1 - \xi_i, \ \xi_i \ge 0, \ \forall i \in [1, l]$$

where the constant $C$ controls the amount of penalty and $\xi_i$ are slack variables which are introduced to deal with misclassified samples (Fig 2). This optimization task can be solved through its Lagrangian dual problem:

$$\max_{\alpha} \ \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{l} \alpha_i \alpha_j y_i y_j \langle x_i, x_j \rangle \quad \text{subject to } 0 \le \alpha_i \le C, \ \forall i \in [1, l], \quad \sum_{i=1}^{l} \alpha_i y_i = 0$$

The solution is then given by:

$$w = \sum_{i=1}^{l} \alpha_i y_i x_i$$

The solution vector is a linear combination of some samples of the training set, those whose $\alpha_i$ is non-zero, called Support Vectors. The hyperplane decision function can thus be written as:

$$y_u = \mathrm{sgn} \left( \sum_{i=1}^{l} y_i \alpha_i \langle x_u, x_i \rangle + b \right)$$

where $x_u$ is an unseen sample.

Fig. 2. Classification of a non-linearly separable case by SVMs

3.2 Non-Linear SVM

Using the kernel method, we can generalize SVMs to non-linear decision functions. With this technique, the classification capability is improved. The idea is as follows. Via a non-linear mapping $\Phi$, data are mapped onto a higher dimensional space $F$ (Fig 3):

$$\Phi : \mathbb{R}^n \to F, \quad x \mapsto \Phi(x)$$

The SVM algorithm can now be simply considered with the following training samples: $\Phi(S) = \{(\Phi(x_i), y_i) \mid i \in [1, l]\}$. It leads to a new version of the hyperplane decision function where the scalar product is now $\langle \Phi(x_i), \Phi(x_j) \rangle$. Fortunately, for some kernel functions $k$, the extra computational cost is reduced to:

$$\langle \Phi(x_i), \Phi(x_j) \rangle = k(x_i, x_j)$$

The kernel function $k$ should fulfill Mercer's conditions.

Fig. 3. Mapping the Input Space into a High Dimensional Feature Space with a kernel function

With the use of kernels, it is possible to work implicitly in $F$ while all the computations are done in the input space. The classical kernels used in remote sensing are the polynomial kernel and the Gaussian radial basis function:

$$k_{poly}(x_i, x_j) = \big( \langle x_i, x_j \rangle + 1 \big)^p$$

$$k_{gauss}(x_i, x_j) = \exp \big( -\gamma \|x_i - x_j\|^2 \big)$$

where $p$ is the degree of the polynomial and $\gamma$ is the width parameter of the Gaussian kernel.
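To make the role of the kernel concrete, the sketch below implements the polynomial and Gaussian kernels and the kernel form of the decision function, in which the inner product between the unseen sample and each support vector is replaced by k. It is only an illustration: the support vectors, labels, multipliers and bias are chosen by hand rather than obtained by solving the dual problem, and the default values of `p` and `gamma` are arbitrary assumptions.

```python
import math

def k_poly(x, z, p=2):
    """Polynomial kernel (<x, z> + 1)^p."""
    return (sum(a * b for a, b in zip(x, z)) + 1) ** p

def k_gauss(x, z, gamma=0.5):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decide(x_u, support_vectors, labels, alphas, b, kernel):
    """Sign of sum_i y_i * alpha_i * k(x_u, x_i) + b for an unseen sample x_u."""
    score = sum(y * a * kernel(x_u, x_i)
                for x_i, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1

# Hypothetical support vectors and multipliers, chosen by hand for illustration
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]

print(decide([1.9, 2.1], support_vectors, labels, alphas, 0.0, k_gauss))  # prints 1
```

Only the support vectors enter the sum, so the cost of classifying a pixel grows with the number of support vectors and not with the size of the training set.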


3.3 Multiclass SVM

SVMs are designed to solve binary problems where the class labels can only take two values: ±1. For a remote sensing application, several classes are usually of interest. Various approaches have been proposed to address this problem [23]. They usually combine a set of binary classifiers. Two main approaches were originally proposed for a k-class problem.
One versus the Rest: k binary classifiers are applied, each separating one class from all the others. Each sample is assigned to the class with the maximum output.
Pairwise Classification: $k(k-1)/2$ binary classifiers are applied, one on each pair of classes. Each sample is assigned to the class getting the highest number of votes. A vote for a given class is defined as a classifier assigning the pattern to that class.


4.1 Data

The first image used in the classification is a sample of a high resolution Quickbird satellite image. Its size is 240x360 pixels. It represents an urban scene. Four spectral bands are available: blue, green, red and near infrared. A representation of this image is shown in Fig.5 (a).
The second test image is another sample of a Quickbird satellite image with exactly the same properties except for its size, 500x280 pixels. The scene also contains urban areas. The original image is shown in Fig.6 (a).
For each image we have two files, “TrainFile.dat” and “TestFile.dat”, used respectively for learning and for classification, with samples divided into six classes as described in Table 1.

4.2 Experiments and results

The idea for a good spectral classification of the pixels is to use the pixel values of the image directly as input data for the classifier; each texture measure then creates a new band that is incorporated with this spectral information, so that spatial and spectral information are used jointly.
The proposed workflow has two main tasks. We start with the extraction of spectral and spatial information: we compute the Grey-Level Co-occurrence Matrix (GLCM) to extract the Haralick texture features, which we add to the spectral information. The result is then used as input to the SVM classifier (Fig 4).
The performance of an SVM varies depending on the choice of the kernel function and its parameters. For the RBF kernel, two parameters need to be defined: the regularization parameter (C) and the kernel width (γ). It is not clear in advance which pair of parameters produces the best classification result for a given data set. Therefore, a search for the optimum parameters must be performed [24].
In this work, the parameters of the RBF kernel were determined by a grid search using cross-validation. The main idea behind the grid search method is that different pairs of parameters are tested and the one with the highest cross-validation accuracy is selected. The method is conducted in two steps. In the first step, a coarse grid is applied with exponentially growing sequences of (C, γ). In the second step, after identifying the optimal region on the grid, a finer grid search is executed. The results are used to perform the final training process [24], [25].
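The two-step grid search can be sketched as follows. This is a hedged illustration, not the procedure actually run for the experiments: `cv_accuracy` is a hypothetical callable standing for the cross-validation accuracy of an RBF SVM trained with a given (C, γ), the exponent ranges and step sizes are illustrative assumptions, and `toy_cv_accuracy` is a made-up smooth surface used only so that the example runs without data.

```python
import math

def frange(start, stop, step):
    """Evenly spaced values from start to stop, both included."""
    n = int(round((stop - start) / step))
    return [start + k * step for k in range(n + 1)]

def grid_search(cv_accuracy, log2_C_values, log2_gamma_values):
    """Return (best accuracy, log2 C, log2 gamma) over the given grid."""
    best = None
    for lc in log2_C_values:
        for lg in log2_gamma_values:
            acc = cv_accuracy(2.0 ** lc, 2.0 ** lg)
            if best is None or acc > best[0]:
                best = (acc, lc, lg)
    return best

def coarse_to_fine(cv_accuracy):
    # Step 1: coarse grid over exponentially growing sequences of (C, gamma)
    _, lc, lg = grid_search(cv_accuracy, frange(-5, 15, 2), frange(-15, 3, 2))
    # Step 2: finer grid around the best coarse cell
    return grid_search(cv_accuracy,
                       frange(lc - 2, lc + 2, 0.25),
                       frange(lg - 2, lg + 2, 0.25))

def toy_cv_accuracy(C, gamma):
    # Made-up surface peaking at C = 2^3.5, gamma = 2^-6.25; not real data
    return 1.0 - 0.01 * ((math.log2(C) - 3.5) ** 2 + (math.log2(gamma) + 6.25) ** 2)

accuracy, log2_C, log2_gamma = coarse_to_fine(toy_cv_accuracy)
```

In practice `cv_accuracy` would train and evaluate the classifier on the folds of the training file, which makes each grid point expensive; this is the reason for starting with a coarse grid.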

Fig. 4. A representative illustration of the workflow
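The first task of the workflow, adding the texture bands to the spectral bands, amounts to building one feature vector per pixel. The sketch below shows one possible way to do this and to write a sample in the sparse `<label> <index>:<value>` text format read by SVMlight. The layout of “TrainFile.dat” is not described in the text, so the function names, the band values and the way labels are encoded are all assumptions made for illustration.

```python
def stack_bands(spectral_bands, texture_bands):
    """Return one feature vector per pixel, in row-major order."""
    bands = list(spectral_bands) + list(texture_bands)
    rows, cols = len(bands[0]), len(bands[0][0])
    if any(len(b) != rows or any(len(r) != cols for r in b) for b in bands):
        raise ValueError("all bands must have the same size")
    return [[band[r][c] for band in bands]
            for r in range(rows) for c in range(cols)]

def to_svmlight_line(label, features):
    """Format one sample as '<label> 1:<v1> 2:<v2> ...' (indices start at 1)."""
    return " ".join([str(label)] +
                    [f"{k}:{v:g}" for k, v in enumerate(features, start=1)])

# Made-up 2 x 2 example: two spectral bands and one texture band
spectral = [[[10, 20], [30, 40]],
            [[1, 2], [3, 4]]]
texture = [[[0.5, 0.25], [0.125, 1.0]]]

pixels = stack_bands(spectral, texture)
print(to_svmlight_line(1, pixels[1]))  # prints "1 1:20 2:2 3:0.25"
```

With the four Quickbird bands and one band per texture measure, each pixel would be described by the four spectral values followed by the texture values.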

For classification we used SVMlight, which is an implementation of Support Vector Machines (SVMs) in the C language [26].
To the file containing the spectral and spatial information obtained from the original images ((a) in Fig 5 and Fig 6, respectively), we apply an SVM classification with an RBF kernel, which gives images (b) in Fig 5 and Fig 6, respectively.
For the same images ((a) in Fig 5 and Fig 6, respectively), we also performed a purely spectral classification with SVM and then applied the Graph Cuts approach [12] to the classified images, which gives images (c) in Fig 5 and Fig 6, respectively.
The results improved with the combined use of spectral and spatial information. In addition, a visual analysis of the classification maps shows that the areas are more homogeneous in the maps obtained with the proposed Spectral&spatial-SVM method.
Thus, we can note the appearance of misclassifications with the SVM-Graph Cuts method because it relies only on spectral information (e.g. shadow and tree in image 1 (a) in Fig 5); in addition, it does not preserve the edges, and fewer classes are detected than with our Spectral&spatial-SVM approach.












The resulting Spectral&spatial-SVM map matches an urban land cover map well in terms of the smoothness of the classes, especially when using Haralick texture features; it also shows more connected classes.
Table 2 summarizes the results obtained using the Spectral&spatial-SVM and the SVM-Graph Cuts methods. These values were extracted from the confusion matrices; Table 3 and Table 4 present examples of the confusion matrices obtained with the Spectral&spatial-SVM for image 1 ((a) in Fig 5) and image 2 ((a) in Fig 6), respectively. The overall accuracy is the percentage of correctly classified pixels. The Kappa coefficient is another criterion classically used in remote sensing classification to measure the degree of agreement; it takes into account the correct classifications that may have been obtained “by chance” by weighting the measured accuracies.
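Both criteria can be computed directly from a confusion matrix. The sketch below uses the standard definition of Cohen's Kappa, (p_o − p_e) / (1 − p_e), where p_o is the observed agreement and p_e the agreement expected by chance from the row and column totals. The 2 × 2 matrix is made up for the example and is not taken from Table 3 or Table 4.

```python
def overall_accuracy(cm):
    """Fraction of correctly classified samples (diagonal over total)."""
    total = sum(sum(row) for row in cm)
    return sum(cm[k][k] for k in range(len(cm))) / total

def kappa(cm):
    """Cohen's Kappa: agreement corrected for the agreement expected by chance."""
    n = sum(sum(row) for row in cm)
    p_o = sum(cm[k][k] for k in range(len(cm))) / n
    p_e = sum(sum(cm[k]) * sum(row[k] for row in cm)
              for k in range(len(cm))) / (n * n)
    return (p_o - p_e) / (1 - p_e)

# Made-up confusion matrix (rows: reference classes, columns: predicted classes)
cm = [[45, 5],
      [10, 40]]
print(overall_accuracy(cm))  # prints 0.85
```

For this matrix the chance agreement is 0.5, so the Kappa value is (0.85 − 0.5) / 0.5, that is about 0.7, noticeably lower than the overall accuracy.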
We can see in Table 2 that the Spectral&spatial-SVM classification reaches a global accuracy of 94.05%, slightly higher than the 92.45% obtained by the SVM-Graph Cuts classification, and higher than the 87.22% obtained by the SVM classification using only spectral information.

The use of Spectral&spatial-SVM also gives slightly higher values of the Kappa coefficient; with all of the accuracies over 90%, this method seems promising for the classification of remotely sensed images.

Fig. 5. (a) original image 1, (b) result obtained by combining spatial and spectral information as input data for the SVM classifier, (c) result of the SVM classification refined by the Graph Cuts approach. (Classes: Asphalt, Green area, Tree, Soil, Building, Shadow.)


Fig. 6. (a) original image 2, (b) result obtained by combining spatial and spectral information as input data for the SVM classifier, (c) result of the SVM classification refined by the Graph Cuts approach. (Classes: Asphalt, Green area, Tree, Soil, Building, Shadow.)


In this paper, the classification of multispectral satellite image data using support vector machines was investigated. SVMs proved to provide very accurate classification, even with a very limited number of training samples and high dimensional data. The kernel used is a Gaussian RBF kernel. We have presented a method which makes it possible to combine spatial and spectral information through the use of Haralick texture features to refine the classification of multispectral satellite images.
The experimental results are promising, and comparisons with the Graph Cuts approach applied to the spectral classification have shown that the proposed method is able to find better classes; however, these results can still be improved.
As perspectives, the workflow of this study can be used in other remote sensing applications, especially in rural areas for thematic land cover mapping. In future work, we will concentrate on studying the choice of kernel in order to determine the appropriate one for this type of image classification. Another aspect of the proposed method which should be improved is the set of features used for the classification. We do not think that a significant number of the features used in our method can be eliminated from a Spectral&spatial-SVM point of view; nevertheless, some feature selection methods should be tested.


This work was funded by CNRST Morocco and CNRS France
Grant under “Convention CNRST CRNS” program SPI09/11.


[1] C. Samson, "Contribution à la classification des images satellitaires par approche variationnelle et équations aux dérivées partielles", doctoral thesis, University of Nice-Sophia Antipolis, 2000.

[2] J.R.G. Townshend, "Land cover." International Journal of Remote Sensing, vol. 13, pp. 1319–1328, 1992.

[3] F.G. Hall, J.R. Townshend, E.T. Engman, "Status of remote sensing algorithms for estimation of land surface state parameters." Remote Sensing of Environment, vol. 51, pp. 138–156, 1995.

[4] D. Lu, Q. Weng, "A survey of image classification methods and techniques for improving classification performance." International Journal of Remote Sensing, vol. 28, pp. 823–870, 2007.

[5] C. Huang, L.S. Davis, and J.R.G. Townshend, "An assessment of support vector machines for land cover classification." International Journal of Remote Sensing, vol. 23, pp. 725–749, 2002.

[6] T. Kavzoglu, S. Reis, "Performance analysis of maximum likelihood and artificial neural network classifiers for training sets with mixed pixels." GIScience and Remote Sensing, vol. 45, pp. 330–342, 2008.

[7] M. Pal and P.M. Mather, "Support vector machines for classification in remote sensing." International Journal of Remote Sensing, vol. 26, pp. 1007–1011, 2005.

[8] G. Zhu and D.G. Blumberg, "Classification using ASTER data and SVM algorithms: The case study of Beer Sheva, Israel." Remote Sensing of Environment, vol. 80, pp. 233–240, 2002.

[9] B. Scholkopf, K. Sung, C. Burges, F. Girosi, P. Niyogi, T. Poggio, et al., "Comparing support vector machines with gaussian kernels to radial basis function classifiers." IEEE Transactions on Signal Processing, vol. 45, pp. 2758–2765, 1997.

[10] X. Cao, J. Chen, H. Imura, O. Higashi, "A SVM-based method to extract urban areas from DMSP-OLS and SPOT VGT data." Remote Sensing of Environment, vol. 113, pp. 2205–2209, 2009.

[11] J. Inglada, "Automatic recognition of man-made objects in high resolution optical remote sensing images by SVM classification of geometric image features." ISPRS Journal of Photogrammetry & Remote Sensing, vol. 62, pp. 236–248, 2007.

[12] A. Bekkari, S. Idbraim, K. Housni, D. Mammass and Y. Chahir, "Classification of high resolution urban satellite Images combining SVM and Graph Cuts." IEEE 5th International Symposium on Image/Video Communication over fixed and Mobile Networks, 2010.

[13] M. Fauvel, J.A. Benediktsson, J. Chanussot and J.R. Sveinsson, "Spectral and Spatial Classification of Hyperspectral Data Using SVMs and Morphological Profiles." IEEE International Geoscience and Remote Sensing Symposium, IGARSS 07, Barcelona, Spain, 2007.

[14] W.Y. Chiu and I. Couloigner, "Evaluation of incorporating texture into wetland mapping from multispectral images." University of Calgary, Department of Geomatics Engineering, Calgary, Canada, EARSeL eProceedings, 2004.

[15] R.M. Haralick, K. Shanmugam, and I. Dinstein, "Textural Features for Image Classification." IEEE Transactions on Systems Man and Cybernetics, 1973.

[16] J.S. Weszka, C.R. Dyer, and A. Rosenfeld, "A Comparative Study of Texture measures for Terrain Classification." IEEE Transactions on Systems Man and Cybernetics, 1976.

[17] R.W. Conners and C.A. Harlow, "A Theoretical Comparison of Texture Algorithms." IEEE Transactions on Pattern Analysis and Machine Intelligence, 1980.

[18] R.W. Conners, M.M. Trivedi, and C.A. Harlow, "Segmentation of a High-Resolution Urban Scene using Texture Operators." Computer Vision, Graphics and Image Processing, 1984.

[19] V. Arvis, C. Debain, M. Berducat, A. Benassi, "Generalization of the cooccurrence matrix for colour images: application to colour texture classification." Journal Image Analysis and Stereology, vol. 23, pp. 63–72.

[20] L. Chapel, "Maintenir la viabilité ou la résilience d’un système : les machines à vecteurs de support pour rompre la malédiction de la dimensionnalité ?", doctoral thesis, University of Blaise Pascal - Clermont II, 2007.

[21] S. Aseervatham, "Apprentissage à base de Noyaux Sémantiques pour le traitement de données textuelles", doctoral thesis, University of Paris 13, Galilée Institute, Laboratory of Data Processing of Paris Nord, 2007.

[22] O. Bousquet, "Introduction au Support Vector Machines (SVM)", Center of Applied Mathematics, École Polytechnique, Palaiseau, 2001.

[23] M. Fauvel, J. Chanussot and J.A. Benediktsson, "A Combined Support Vector Machines Classification Based on Decision Fusion." IEEE International Geoscience and Remote Sensing Symposium, IGARSS 06, Denver, USA, 2006.

[24] C.W. Hsu, C.C. Chang, C.J. Lin, "A practical guide to support vector classification", 2008.

[25] S.T. Chen, P.S. Yu, "Real-time probabilistic forecasting of flood stages." Journal of Hydrology, vol. 340, pp. 63–77, 2007.

[26] SVMlight, Version 6.02, developed at the University of Dortmund, Informatik, AI-Unit, Collaborative Research Center on 'Complexity Reduction in Multivariate Data' (SFB475), 2008.
