International Journal of Scientific & Engineering Research, Volume 5, Issue 3, March-2014 248

ISSN 2229-5518

Brain MR Image Classification Using Least

Squares Support Vector Machine

#1 Mrs. Sapna N. Gaikwad, #2 Prof. Dr. D. S. Bormane

#1 Department of Electronics Engineering, J.S.P.M. College of Engineering, University of Pune, Pune, India

#2 Principal, J.S.P.M. College of Engineering, University of Pune, Pune, India

Abstract: This paper proposes an intelligent classification technique to identify brain tumors in MR images. Manual interpretation of tumors based on visual examination by a radiologist or physician may lead to missed diagnoses when a large volume of data is analyzed. To avoid such human error, an automated intelligent classification system is proposed that classifies medical images as normal or abnormal for tumor identification. In this work, classification techniques based on Least Squares Support Vector Machines (LS-SVM) are proposed and applied to medical image classification using features derived from the images.

Index Terms: Classification, LS-SVM, SVM, RBF kernel, K-means, PCA, singular value decomposition


1 INTRODUCTION

The field of medical imaging gains importance as the need for automated, efficient diagnosis in a short period of time increases. Computer and information technology are very useful in medical image processing, medical analysis and classification. Medical image data contain both normal and abnormal images, which are used for classification to detect tumors. Vapnik introduced the support vector machine (SVM), and SVM classifiers have been used for medical image classification with statistical features. Recent work in data classification has focused more on Least Squares Support Vector Machines (LS-SVMs), introduced by Suykens, because several studies have reported that LS-SVMs generally deliver higher classification accuracy than other existing data classification algorithms.
In this paper, the potential benefit of an LS-SVM based approach is investigated for the automated classification of medical images, that is, for separating normal and abnormal images in a data collection containing tumors. Segmentation is performed for tumor detection using the K-means algorithm. The categorization of medical images is done using statistical features of the images, such as mean and variance, and co-occurrence based textural features, such as energy, contrast and correlation.
The motivation behind this paper is to develop a machine classification process and to evaluate the classification performance of the LS-SVM classifier with an RBF kernel on this problem in terms of statistical performance measures.
The paper is organized as follows. Preliminaries on LS-SVM techniques are presented in Section 2. Section 3 discusses the proposed methodology using LS-SVM for MRI brain image classification. Section 4 describes the implementation of the proposed methodology. Results and outputs of the implementation are given in Section 5. Finally, the conclusion is presented in Section 6.

2 REVIEW OF LS-SVM LEARNING FOR CLASSIFICATION

The Support Vector Machine algorithm was first developed in 1963 by Vapnik and Lerner and by Vapnik and Chervonenkis as an extension of the Generalized Portrait algorithm. The algorithm is firmly grounded in statistical learning theory, specifically Vapnik-Chervonenkis (VC) theory, which improves the generalization ability of learning machines to unseen data. In the last few years, Support Vector Machines have shown excellent performance in many real-world applications, including handwritten digit recognition, object recognition, speaker identification, face detection in images and text categorization. SVM is a classification algorithm that supports both linear and nonlinear classification. Dot products can be computed efficiently in a higher-dimensional space through kernel functions. The dominant feature that makes SVM attractive is that classes which are not linearly separable in the original space can be linearly separated in the higher-dimensional feature space. Thus SVM is capable of solving complex nonlinear classification problems. Important characteristics of SVM are its ability to solve classification problems by means of convex quadratic programming (QP) and the sparseness resulting from this QP problem. Learning is based on the principle of structural risk minimization. Instead of minimizing an objective function based on the training samples (such as mean square error), the SVM attempts to minimize a bound on the generalization error (i.e., the error made by the learning machine on test data not used during training). As a result, an SVM tends to perform well when applied to data outside the training set. SVM achieves this advantage by focusing on the training examples that are most difficult to classify. These “borderline” training examples are called support vectors. A least squares version of SVM (LS-SVM) was introduced by Suykens with the idea of modifying Vapnik’s SVM formulation by adding a least squares term to the cost function. This variant circumvents the need to solve a more difficult QP problem and only requires the solution of a set of linear equations, which significantly reduces the complexity and computation involved in solving the problem.

IJSER © 2014 http://www.ijser.org

In this paper, we treat slice classification as a two-class pattern classification problem. We apply all the MRI slices to the classifier to determine whether a tumor is present or not. We refer to these two classes throughout as “normal” and “abnormal” slices. Let the vector $x \in \mathbb{R}^n$ denote a pattern to be classified, and let the scalar $y$ denote its class label (i.e., $y \in \{+1, -1\}$). In addition, let $\{(x_i, y_i),\ i = 1, 2, \dots, l\}$ denote a given set of $l$ training examples. The problem is how to construct a classifier [i.e., a decision function $f(x)$] that can correctly classify an input pattern $x$ that is not necessarily from the training set.

2.1 Linear SVM Classifier

Let us begin with the simplest case, in which the training patterns are linearly separable. That is, there exists a linear function of the form

$$f(x) = w^T x + b$$

such that for each training example $x_i$, the function yields $f(x_i) \ge 0$ for $y_i = +1$ and $f(x_i) < 0$ for $y_i = -1$.

In other words, training examples from the two different classes are separated by the hyperplane $f(x) = w^T x + b = 0$, where $w$ is the weight vector normal to the hyperplane and $b$ is a constant. For a given training set, while there may exist many hyperplanes that separate the two classes, the SVM classifier is based on the hyperplane that maximizes the separating margin between the two classes (Figure 1). In other words, SVM finds the hyperplane that causes the largest separation between the decision function values for the “borderline” examples from the two classes. Mathematically, this hyperplane can be found by minimizing the cost function

$$J(w) = \frac{1}{2} w^T w = \frac{1}{2} \|w\|^2$$

subject to the separability constraints $y_i (w^T x_i + b) \ge 1$, $i = 1, 2, \dots, l$.

In Figure 1, the two classes are indicated by data points marked by “X”s and “O”s, and the SVM hyperplane maximizes the separating margin between them. Support vectors are elements of the training set that lie on the boundary hyperplanes of the two classes.

2.2 Non Linear SVM Classifiers


The linear SVM can be readily extended to a nonlinear classifier by first using a nonlinear operator $\varphi(\cdot)$ to map the input pattern $x$ into a higher-dimensional space $H$. The nonlinear SVM classifier so obtained is defined as

$$f(x) = w^T \varphi(x) + b$$

which is linear in terms of the transformed data $\varphi(x)$ but nonlinear in terms of the original data $x \in \mathbb{R}^n$. Following the nonlinear transformation, the parameters of the decision function $f(x)$ are determined by the following minimization:

$$\min_{w,\,\xi}\ J(w, \xi) = \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{l} \xi_i$$

subject to

$$y_i \left( w^T \varphi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0, \qquad i = 1, 2, \dots, l,$$

where the $\xi_i$ are slack variables and $C$ is a positive constant that weights the penalty on constraint violations.

Linearly separable data may be analyzed with a hyperplane, and linearly non-separable data are analyzed with kernel functions such as higher-order polynomial, Gaussian RBF and tanh-sigmoid kernels.
The output of an SVM is a linear combination of the training examples projected onto a high-dimensional feature space through the use of kernel functions.
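As an illustration of the kernel families named above, the following Python sketch evaluates polynomial, Gaussian RBF and tanh-sigmoid kernels on plain lists of numbers. The function names and the parameters `degree`, `c`, `sigma2`, `kappa` and `theta` are illustrative assumptions, not notation taken from this paper. The RBF form exp(-||x - z||^2 / (2 sigma2)) is one common convention; some texts and toolboxes omit the factor 2.

```python
import math


def linear_kernel(x, z):
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, c=1.0):
    """Polynomial kernel (x.z + c)^degree; degree and c are assumed defaults."""
    return (linear_kernel(x, z) + c) ** degree


def rbf_kernel(x, z, sigma2=1.0):
    """Gaussian RBF kernel with bandwidth sigma2 (assumed 2*sigma2 convention)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma2))


def sigmoid_kernel(x, z, kappa=1.0, theta=0.0):
    """tanh-sigmoid kernel; not positive definite for every kappa and theta."""
    return math.tanh(kappa * linear_kernel(x, z) + theta)
```

Each function returns a scalar similarity between two patterns, so any of them can stand in for the dot product in the decision function.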
2.3 Least Squares SVM Classifier

The advantage of nonlinear SVM classifiers is the ability to solve classification problems by means of convex quadratic programming, as well as the sparseness that results from this QP problem. Suykens proposed modifying Vapnik’s SVM formulation by adding a least squares term to the cost function, which transforms the problem from solving a QP problem to solving a set of linear equations. This approach significantly reduces the complexity and computation time of solving the problem. Suykens formulated the SVM modification as follows:

$$\min_{w,\,b,\,e}\ J(w, b, e) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^2$$

such that

$$y_k \left[ w^T \varphi(x_k) + b \right] = 1 - e_k, \qquad k = 1, \dots, N,$$

which corresponds to a classifier in the primal space

$$y(x) = \mathrm{sign}\left[ w^T \varphi(x) + b \right],$$

where $\varphi(\cdot)$ is the mapping to the high-dimensional feature space, as in the standard SVM. The Vapnik formulation is modified here at two points. First, instead of inequality constraints, equality constraints are used, where the value 1 on the right-hand side is considered a target value rather than a threshold value. The error variable $e_k$ allows some tolerance of misclassification in the case of overlapping distributions; its role is similar to that of the slack variable $\xi_k$ in the SVM formulation. Second, a squared loss function is taken for this error variable. As explained below, these modifications greatly simplify the problem.
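In the case of a linear classifier the primal problem can be solved easily; in general, however, $w$ might become infinite dimensional. Therefore let us derive the dual problem for the LS-SVM nonlinear classifier formulation. The Lagrangian for the problem is

$$L(w, b, e; \alpha) = \frac{1}{2} w^T w + \frac{\gamma}{2} \sum_{k=1}^{N} e_k^2 - \sum_{k=1}^{N} \alpha_k \left\{ y_k \left[ w^T \varphi(x_k) + b \right] - 1 + e_k \right\},$$

where the $\alpha_k$ values are the Lagrange multipliers, which can be positive or negative due to the equality constraints.
The conditions for optimality yield

$$\frac{\partial L}{\partial w} = 0 \ \Rightarrow\ w = \sum_{k=1}^{N} \alpha_k y_k \varphi(x_k)$$

$$\frac{\partial L}{\partial b} = 0 \ \Rightarrow\ \sum_{k=1}^{N} \alpha_k y_k = 0$$

$$\frac{\partial L}{\partial e_k} = 0 \ \Rightarrow\ \alpha_k = \gamma e_k$$

$$\frac{\partial L}{\partial \alpha_k} = 0 \ \Rightarrow\ y_k \left[ w^T \varphi(x_k) + b \right] - 1 + e_k = 0$$

for $k = 1, \dots, N$. These can be written as a linear system

$$\begin{bmatrix} I & 0 & 0 & -Z^T \\ 0 & 0 & 0 & -Y^T \\ 0 & 0 & \gamma I & -I \\ Z & Y & I & 0 \end{bmatrix} \begin{bmatrix} w \\ b \\ e \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \\ \vec{1} \end{bmatrix}$$

where $Z = [\varphi(x_1)^T y_1; \dots; \varphi(x_N)^T y_N]$, $Y = [y_1; \dots; y_N]$, $\vec{1} = [1; \dots; 1]$, $e = [e_1; \dots; e_N]$ and $\alpha = [\alpha_1; \dots; \alpha_N]$.

Elimination of $w$ and $e$ gives

$$\begin{bmatrix} 0 & -Y^T \\ Y & Z Z^T + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ \vec{1} \end{bmatrix}.$$

Mercer’s condition can be applied to the matrix $\Omega = Z Z^T$, with

$$\Omega_{kl} = y_k y_l\, \varphi(x_k)^T \varphi(x_l) = y_k y_l\, K(x_k, x_l).$$

The classifier in dual space takes the form

$$y(x) = \mathrm{sign}\left[ \sum_{k=1}^{N} \alpha_k y_k K(x, x_k) + b \right],$$

similar to the standard SVM.
A chosen kernel function should be a positive definite function and satisfy the Mercer condition. All comments on kernel functions apply equally to the use of kernels in the LS-SVM context. However, in this work we focus on the use of linear and RBF kernels.
LS-SVM can be efficiently estimated using iterative methods. In order to build an LS-SVM model, we need two extra parameters: $\gamma$ (gamma), the regularization parameter, which determines the trade-off between fitting error minimization and smoothness, and, in the common case of the RBF kernel, $\sigma^2$ (sig2), the bandwidth.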
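The following is a minimal pure-Python sketch of how an LS-SVM classifier with an RBF kernel could be trained by solving one linear system in b and the multipliers alpha, under stated assumptions. It is not the LS-SVMlab implementation: the function names, the direct Gaussian-elimination solver (used here in place of an iterative method) and the toy data are all hypothetical. The RBF form exp(-||x - z||^2 / (2 sig2)) is an assumed convention that may differ from the toolbox.

```python
import math


def rbf_kernel(x, z, sigma2):
    """Gaussian RBF kernel (assumed 2*sigma2 convention)."""
    return math.exp(-sum((a - b) ** 2 for a, b in zip(x, z)) / (2.0 * sigma2))


def solve_linear_system(A, rhs):
    """Solve A u = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    M = [row[:] + [r] for row, r in zip(A, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    u = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * u[j] for j in range(i + 1, n))
        u[i] = (M[i][n] - s) / M[i][i]
    return u


def lssvm_train(X, y, gamma, sigma2):
    """Return (b, alpha) for labels y in {-1, +1}."""
    n = len(X)
    A = [[0.0] * (n + 1) for _ in range(n + 1)]
    for k in range(n):
        # First row and column enforce sum_k alpha_k * y_k = 0.
        A[0][k + 1] = y[k]
        A[k + 1][0] = y[k]
        for l in range(n):
            A[k + 1][l + 1] = y[k] * y[l] * rbf_kernel(X[k], X[l], sigma2)
        A[k + 1][k + 1] += 1.0 / gamma  # regularization term I / gamma
    solution = solve_linear_system(A, [0.0] + [1.0] * n)
    return solution[0], solution[1:]


def lssvm_decision(x, X, y, alpha, b, sigma2):
    """Value of sum_k alpha_k y_k K(x, x_k) + b before taking the sign."""
    return sum(a * t * rbf_kernel(x, xk, sigma2)
               for a, t, xk in zip(alpha, y, X)) + b


def lssvm_predict(x, X, y, alpha, b, sigma2):
    return 1 if lssvm_decision(x, X, y, alpha, b, sigma2) >= 0.0 else -1


# Toy usage with made-up 2-D points (not data from the paper).
X_toy = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
y_toy = [-1, -1, 1, 1]
b_toy, alpha_toy = lssvm_train(X_toy, y_toy, gamma=10.0, sigma2=0.2)
label_toy = lssvm_predict([0.1, 0.9], X_toy, y_toy, alpha_toy, b_toy, sigma2=0.2)
```

Unlike a standard SVM, every training example generally receives a non-zero multiplier, so the sparseness of the QP solution is traded for the simplicity of a single linear solve.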

3 PROPOSED METHODOLOGY

The proposed methodology for classifying MR image slices of the human brain is shown in Figure 2. The method consists of feature extraction followed by classification. Significant differences between tissue types, observed in a variety of textural measurements in MR images, are used for this classification. Various measurements based on statistical and co-occurrence matrix textural features from the MR images are given as input to the classifier for training. When the features of new slices are given as input, the trained classifier can classify them. The results of the Least Squares Support Vector Machine are analysed to detect the tumour type.

3.1 Medical Image / MR Image (Input Image Data)

Magnetic Resonance Imaging (MRI) uses magnetic energy and radio waves to create images (“slices”) of the human body. MR imaging measures the magnetic properties of nuclei within the body tissues. The energy absorbed by the nuclei is then released, returning the nuclei to their initial state of equilibrium, and this transmission of energy by the nuclei is observed as the MRI signal. MR images are generated from the resonating nuclei at each spatial location. The image gray level in MRI mainly depends on three tissue parameters, viz., proton density (PD), spin-lattice (T1) relaxation time and spin-spin (T2) relaxation time. Generally, for most of the soft tissues in the body, the proton density is very homogeneous but may exhibit higher intensity for gray matter. T1 and T2 are sensitive to the local environment, so they are used to characterize different tissue types. T1, T2 and PD images are the types most used by researchers for different MR applications. Recently, the FLAIR sequence has replaced the PD image. FLAIR images are T2 weighted with the CSF signal suppressed. T1 shows higher intensity for white matter, while T2 presents higher intensity for cerebrospinal fluid. In practice, MR images represent all three properties with variable weightings; the images are produced as T1 weighted, T2 weighted and T2 FLAIR weighted for each spatial plane. Though several orientations of magnetic resonance imaging are possible, the axial orientation is frequently used in segmentation.

3.2 Segmentation

Segmentation plays an important role in medical image processing. Segmentation of a medical image is the division of the image into disjoint regions of similar attributes. We propose a methodology that integrates the K-means clustering algorithm for medical image segmentation. The K-means algorithm is proposed for large datasets and for finding the initial centroids. An algorithm is described for segmenting medical images (MRI images) into K different tissue types, which include gray matter, white matter, CSF and other abnormal tissues. The medical images considered can be either scalar- or multi-valued. Each scalar-valued image is modeled as a collection of regions with slowly varying intensity plus white Gaussian noise. The proposed algorithm is an adaptive K-means clustering algorithm for 3-dimensional and multi-valued images. Each iteration consists of two steps: estimating the mean intensity at each location for each tissue type, and estimating the tissue types. Its performance is tested using patient data.
K-Means Algorithm
The K-means algorithm belongs to the category of squared error-based clustering (vector quantization) and also to the category of crisp, or hard, clustering. The K-means algorithm is very simple and can be easily implemented to solve many practical problems. The steps of the K-means algorithm are given below.
1. Choose k cluster centres (C) to coincide with k randomly chosen patterns inside the hypervolume containing the pattern set.
2. Assign each pattern to the closest cluster centre (Ci, i = 1, 2, ..., C).
3. Recompute the cluster centres using the current cluster memberships (U).
4. If the convergence criterion (i.e., a minimal decrease in squared error) is not met, go to step 2 with the new cluster centres.

The performance of the K-means algorithm depends on the initial positions of the cluster centres. It is an inherently iterative algorithm, and there is no guarantee of convergence towards an optimal solution: the converged centroids vary with different initial points. It is also sensitive to noise and outliers, and it applies only to numerical variables.
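The four steps listed above can be sketched in Python for the simplest case of scalar pixel intensities. This is a minimal illustration under stated assumptions, not the adaptive variant described in this section: the function name `kmeans_1d`, the parameters `max_iter`, `tol` and `seed`, and the idea of passing the image as a flat list of intensities are all hypothetical.

```python
import random


def kmeans_1d(values, k, max_iter=100, tol=1e-6, seed=0):
    """Cluster scalar intensities into k groups; returns (centers, labels)."""
    rng = random.Random(seed)
    # Step 1: pick k of the patterns as initial centres
    # (assumes the list holds at least k distinct values).
    centers = [float(v) for v in rng.sample(values, k)]
    labels = [0] * len(values)
    prev_error = float("inf")
    for _ in range(max_iter):
        # Step 2: assign each pattern to its closest centre.
        labels = [min(range(k), key=lambda c: (v - centers[c]) ** 2)
                  for v in values]
        # Step 3: recompute each centre as the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:  # keep the old centre if a cluster is empty
                centers[c] = sum(members) / len(members)
        # Step 4: stop when the squared error no longer decreases noticeably.
        error = sum((v - centers[lab]) ** 2 for v, lab in zip(values, labels))
        if prev_error - error < tol:
            break
        prev_error = error
    return centers, labels
```

Because the initial centres are drawn at random, changing `seed` can change the result on harder data, which is the sensitivity to initialization noted above.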

3.3 Feature extraction

The purpose of feature extraction is to reduce the original data set by measuring certain properties, or features, that distinguish one input pattern from another.
The extracted features provide the characteristics of the input type to the classifier by describing the relevant properties of the image in a feature space. Most tumors are heterogeneous tissues, and the mean values of relaxation times are not sufficient to characterize the heterogeneity of the different tumor types.
An alternative approach, which is investigated within the framework of this study, is to apply texture analysis to the T1, T2 and T2 FLAIR images to describe quantitatively the brightness and texture of the images. Texture analysis covers a wide range of techniques based on first- and second-order image texture parameters.
In the present study, statistical features based on image intensity, such as mean and variance, and features from gray level co-occurrence matrices (GCMs), such as contrast, energy and correlation, are used to investigate their adequacy for discriminating between normal and abnormal patients.
The statistical features are computed as described in the following equations. Let $x(i, j)$ be the image intensity at location $(i, j)$. Then

$$\text{Mean: } \mu = \frac{1}{XY} \sum_{i=1}^{X} \sum_{j=1}^{Y} x(i, j)$$

$$\text{Variance: } \sigma^2 = \frac{1}{XY} \sum_{i=1}^{X} \sum_{j=1}^{Y} \left( x(i, j) - \mu \right)^2$$

where X and Y are the numbers of pixels in the rows and columns of the image, respectively. The spatial coordinates and intensities of the extracted pixels were preserved for constructing the Gray Level Co-Occurrence Matrix (GCM). The choice of Haralick features based on GCMs was made considering their proven applicability to the analysis of objects with irregular outlines. The GCMs are constructed by mapping the gray level co-occurrence probabilities based on the spatial relations of pixels in different angular directions.

A GCM $P(i, j)$ reflects the distribution of the probability of occurrence of a pair of gray levels $(i, j)$, given that the spacing between the pixels is $\Delta x$ and $\Delta y$ in the x and y dimensions, respectively. The symmetric co-occurrence matrices are estimated by averaging the matrices $P(i, j; \Delta x, \Delta y)$ and $P(i, j; -\Delta x, -\Delta y)$, thereby eliminating the distinction between opposite offset directions. Four angles, namely 0°, 45°, 90° and 135°, and a predefined offset distance of one pixel are considered in forming the symmetric co-occurrence matrices. A pixel offset distance of one is preferred to ensure a large number of co-occurrences derived from the images. From these four GCMs, a set of features (called feature vectors) is computed. The four texture measures computed in the present work are described below.

Difference moment: a measure of contrast

$$\text{Contrast} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} (i - j)^2 \, \frac{P(i, j)}{R}$$

where N is the number of gray levels, equal to 256 for the images in the present study, and R is the total number of pixel pairs used for the calculation of texture features in the specified angular direction.

Energy: a measure of homogeneity

$$\text{Energy} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \left( \frac{P(i, j)}{R} \right)^2$$

Inverse difference moment: a measure of local homogeneity

$$\text{IDM} = \sum_{i=0}^{N-1} \sum_{j=0}^{N-1} \frac{1}{1 + (i - j)^2} \, \frac{P(i, j)}{R}$$

Correlation: a measure of linear dependency of brightness

$$\text{Correlation} = \frac{\sum_{i=0}^{N-1} \sum_{j=0}^{N-1} i \, j \, P(i, j) / R \; - \; \mu_x \mu_y}{\sigma_x \sigma_y}$$

In the above expressions, $\mu_x$, $\mu_y$, $\sigma_x$ and $\sigma_y$ are the mean and standard deviation values of the GCM values accumulated in the x and y directions, respectively.
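To make the co-occurrence features concrete, the sketch below builds a symmetric, normalized co-occurrence matrix for one pixel offset and computes contrast, energy, inverse difference moment and correlation from it, then averages them over the four directions. It assumes the image is a list of rows of integer gray levels in the range 0 to `levels - 1` and that each offset produces at least one valid pixel pair. All function names and the `OFFSETS` table are hypothetical and not taken from the paper.

```python
import math


def glcm(image, dx, dy, levels):
    """Symmetric, normalized co-occurrence matrix for the offset (dx, dy)."""
    rows, cols = len(image), len(image[0])
    counts = [[0.0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1.0  # pair seen in the (dx, dy) direction
                counts[j][i] += 1.0  # same pair seen in the opposite direction
    total = sum(sum(row) for row in counts)  # plays the role of R
    return [[v / total for v in row] for row in counts]


def texture_features(p):
    """Contrast, energy, inverse difference moment and correlation of p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    contrast = sum((i - j) ** 2 * v for i, j, v in cells)
    energy = sum(v * v for _, _, v in cells)
    idm = sum(v / (1.0 + (i - j) ** 2) for i, j, v in cells)
    mu_x = sum(i * v for i, _, v in cells)
    mu_y = sum(j * v for _, j, v in cells)
    var_x = sum((i - mu_x) ** 2 * v for i, _, v in cells)
    var_y = sum((j - mu_y) ** 2 * v for _, j, v in cells)
    if var_x > 0.0 and var_y > 0.0:
        cross = sum(i * j * v for i, j, v in cells)
        corr = (cross - mu_x * mu_y) / math.sqrt(var_x * var_y)
    else:
        corr = 0.0  # undefined for a constant image; 0.0 is only a placeholder
    return {"contrast": contrast, "energy": energy,
            "idm": idm, "correlation": corr}


# Offsets (dx, dy) for 0, 45, 90 and 135 degrees at a distance of one pixel.
OFFSETS = [(1, 0), (1, -1), (0, -1), (-1, -1)]


def averaged_texture_features(image, levels):
    """Average each texture feature over the four angular directions."""
    per_angle = [texture_features(glcm(image, dx, dy, levels))
                 for dx, dy in OFFSETS]
    return {name: sum(f[name] for f in per_angle) / len(per_angle)
            for name in per_angle[0]}
```

Counting each pair in both directions before normalizing gives the same matrix as averaging the two opposite-offset matrices, since both contain the same number of pairs.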

4 IMPLEMENTATION OF PROPOSED METHODOLOGY

4.1 MR image data

Each axial brain slice used in this work consists of three feature images: T1 weighted, T2 weighted and T2 FLAIR weighted. The 43 image slices (29 abnormal slices and 14 normal slices) of 3 abnormal patient volumes were considered in this work.
Training & Testing Data
The MRI image slices were grouped into two classes, namely normal and abnormal, depending on whether a tumor is present in the slice. The MRI data set contains 43 slices (29 abnormal and 14 normal). All 43 slices are used to train the classifier, and 20 images (13 abnormal and 7 normal) are used to test it.

4.2 Feature extraction

In this paper, for every medical image of each patient, various statistical features and features from the four GCMs described above are computed. Instead of using the angularly dependent GCM features directly, the average of each co-occurrence feature over the four angular directions (which is invariant under rotation) is used, resulting in a total of 5 features per image. The features need to be normalized so that no one feature dominates the others. To facilitate training, the feature vectors are normalized to have zero mean and unit variance. The normalization is done using

$$\hat{x}_d = \frac{x_d - \mu_d}{\sigma_d}$$

where $x_d$ is the value of the $d$-th feature, and $\mu_d$ and $\sigma_d$ are the mean and standard deviation of that feature.
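A minimal sketch of this zero-mean, unit-variance normalization is given below. It assumes that the feature vectors are stored as a list of equal-length rows and that the statistics are estimated from the same rows; the function names are hypothetical.

```python
import math


def zscore_columns(feature_rows):
    """Scale each feature (column) to zero mean and unit variance.

    Returns (scaled_rows, means, stds) so the same statistics can be
    reused on feature vectors that were not part of feature_rows.
    """
    n = len(feature_rows)
    dims = len(feature_rows[0])
    means = [sum(row[d] for row in feature_rows) / n for d in range(dims)]
    stds = [math.sqrt(sum((row[d] - means[d]) ** 2 for row in feature_rows) / n)
            for d in range(dims)]
    scaled = [apply_zscore(row, means, stds) for row in feature_rows]
    return scaled, means, stds


def apply_zscore(row, means, stds):
    """Normalize one feature vector with previously computed statistics."""
    # A constant feature (std == 0) carries no information; map it to 0.0.
    return [(v - m) / s if s > 0.0 else 0.0
            for v, m, s in zip(row, means, stds)]
```

Reusing `means` and `stds` from the training rows when scaling a new slice keeps training and test features on the same scale.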

4.3 Classifier

The MRI slices were classified using KU Leuven’s MATLAB/C LS-SVMlab toolbox for LS-SVM classification with both linear and RBF kernels. The classifier parameters were selected through pilot runs.
For the LS-SVM classifier, the regularization parameter gamma was set to 10 and the RBF kernel parameter sig2 was set to 0.2.

4.4 Performance measure

Every classification result has an error rate: on occasion the classifier will either fail to identify an abnormality or identify an abnormality that is not present. It is common to describe this error rate by the terms true and false positive and true and false negative, as follows:
True Positive (TP): the classification result is positive in the presence of the clinical abnormality.
True Negative (TN): the classification result is negative in the absence of the clinical abnormality.
False Positive (FP): the classification result is positive in the absence of the clinical abnormality.
False Negative (FN): the classification result is negative in the presence of the clinical abnormality.
Table 1 is the contingency table that defines the various terms used to describe the clinical efficiency of a classification based on the terms above. The measures
Sensitivity = TP / (TP + FN) × 100%
Specificity = TN / (TN + FP) × 100%
Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
are used to measure the performance of the classifiers.
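The three measures can be computed from predicted and true labels as in the sketch below. The function names and the convention that the label +1 marks an abnormal (positive) slice are assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, TN, FP and FN for a two-class problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return tp, tn, fp, fn


def performance(tp, tn, fp, fn):
    """Sensitivity, specificity and accuracy, each in percent.

    Assumes at least one positive and one negative example,
    so that no denominator is zero.
    """
    return {
        "sensitivity": 100.0 * tp / (tp + fn),
        "specificity": 100.0 * tn / (tn + fp),
        "accuracy": 100.0 * (tp + tn) / (tp + tn + fp + fn),
    }
```

For instance, calling `confusion_counts` on a list of true labels and the corresponding classifier outputs, then passing the four counts to `performance`, yields all three percentages at once.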

5 RESULT AND OUTPUT


The results of segmentation using the K-means clustering algorithm on a medical image (brain MRI image) are as follows:

The results of feature extraction using the proposed methodology are as follows:
The results of medical image (brain MRI image) classification using LS-SVM as per the proposed methodology are as follows:

6 CONCLUSION AND FUTURE SCOPE

A computer-based technique for automatic classification of MRI slices as normal or abnormal, using various MR image features and an LS-SVM classifier, is proposed.
Statistical measures such as sensitivity, specificity and classification accuracy are proposed for analysing the performance of the classifier.
The proposed methodology suggests that LS-SVM is a promising technique for image classification in medical imaging applications.
It also provides an indication that the LS-SVM approach yields better performance.
This methodology can be used in computer-aided intelligent health care systems. The automated analysis system could be further used for classification of images with different pathological conditions, types and disease status.

REFERENCES

[1] H. Selvaraj, S. Thamarai Selvi, D. Selvathi, L. Gewali, “Brain MRI slices classification using least squares support vector machine”, International Journal of Intelligent Computing in Medical Science and Image Processing, IC-MED, Vol. 1, No. 1, Issue 1, Page 21 of 33, 2007.

[2] Prof. dr. ir. J. Vandewalle, Prof. dr. ir. S. Van Huffel, “Least squares support vector machines classification applied to brain tumour recognition using magnetic resonance spectroscopy”, D/2003/7515/71, ISBN 90-5682-460-0.

[3] Milan Sonka, Vaclav Hlavac, Roger Boyle, “Image processing, analysis, and machine vision”, Pages 399–461, ISBN 13: 978-0-495-24428-7.

[4] B. Scholkopf, S. Kah-Kay, C. J. Burges, F. Girosi, P. Niyogi, T. Poggio, and V. Vapnik, “Comparing support vector machines with Gaussian kernels to radial basis function classifiers”, IEEE Trans. Signal Processing, vol. 45, pp. 2758-2765, 1997.

[5] Jan Luts, Fabian Ojeda, Raf Van de Plas, Bart De Moor, Sabine Van Huffel, Johan A.K. Suykens, “A tutorial on support vector machine-based methods for classification problems in chemometrics”, Department of Electrical Engineering (ESAT), Research Division SCD, Katholieke Universiteit Leuven, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium, Analytica Chimica Acta 665 (2010) 129–145.

[6] Tony Van Gestel, Johan A.K. Suykens, Bart Baesens, Stijn Viaene, Jan Vanthienen, Guido Dedene, Bart De Moor, Joos Vandewalle, “Benchmarking Least Squares Support Vector Machine Classifiers”, Kluwer Academic Publishers, Manufactured in The Netherlands, Machine Learning, 54, 5–32, 2004.

[7] Arjan Gijsberts, Giorgio Metta, Léon Rothkrantz, “Evolutionary Optimization of Least-Squares Support Vector Machines”, Italian Institute of Technology, Via Morego, 30 – Genoa 16163, Italy.

[8] Kristiaan Pelckmans, Johan A.K. Suykens, T. Van Gestel, J. De Brabanter, L. Lukas, B. Hamers, B. De Moor and J. Vandewalle, “LS-SVMlab: a MATLAB/C toolbox for Least Squares Support Vector Machines”, ESAT-SCD-SISTA, K.U. Leuven, Kasteelpark Arenberg 10, B-3001 Leuven-Heverlee, Belgium.

[9] L. Lukas, A. Devos, J.A.K. Suykens, L. Vanhamme, S. Van Huffel, A.R. Tate, C. Majós, C. Arús, “The use of LS-SVM in the classification of brain tumors based on Magnetic Resonance Spectroscopic signals”, in Proc. of the European Symposium on Artificial Neural Networks (ESANN’2000), Bruges, Belgium, 2000, pp. 131-136.
