Refined Clustering Technique Based on Boosting and Outlier Detection


Boosting is an iterative process for improving the accuracy of the prediction functions that a supervised learning (SL) system learns from training data. Rather than relying on a single function, boosting considers multiple functions from the same supervised learning system and predicts the label of a new data instance by a weighted vote over all of them. By merging multiple functions, boosting obtains a finer-grained decision boundary on the training data than a single function can. Boosting for supervised learning nevertheless has certain limitations, e.g. problematic data that is difficult to analyze, over-fitting of the training data, and wrong label prediction by the initial function. Previous work has shown that boosting is resistant to over-fitting, and that when a function predicts wrong labels, boosting achieves higher accuracy by using multiple functions to decide the labels for clusters. Two difficulties remain in previous work: (A) label noise, i.e. wrong labels in the training data, which leads to wrong output instances; and (B) labelled instances whose features differ from, and are not relevant to, the rest of the training data, so that their proper cluster cannot be determined. A system that addresses these problems is therefore needed, one that can also perform clustering on problematic datasets. To this end, a cluster-based boosting (CBB) approach should be adopted, combined with outlier detection, so that the data become easier to analyze and clusters can be formed more effectively.
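The weighted vote described above can be illustrated with a minimal sketch. It is not the CBB method itself; it only shows how several functions are merged into one prediction. The names `weighted_vote`, `stump_a`, `stump_b`, `stump_c`, and the weights are hypothetical values chosen for illustration, whereas a real boosting procedure would learn both the functions and their weights from the training data.

```python
from collections import defaultdict

def weighted_vote(functions, weights, x):
    """Return the label that receives the largest total weight for instance x."""
    if len(functions) != len(weights):
        raise ValueError("each function needs exactly one weight")
    totals = defaultdict(float)
    for f, w in zip(functions, weights):
        totals[f(x)] += w  # each function votes for its label with its weight
    return max(totals, key=totals.get)

# Three hypothetical threshold functions on a single numeric feature.
def stump_a(x):
    return 1 if x > 2.0 else -1

def stump_b(x):
    return 1 if x > 4.0 else -1

def stump_c(x):
    return 1 if x > 6.0 else -1

functions = [stump_a, stump_b, stump_c]
weights = [0.5, 0.3, 0.4]  # assumed weights, not learned from data

for x in (1.0, 3.0, 5.0, 7.0):
    print(x, weighted_vote(functions, weights, x))
```

For x = 3.0, `stump_a` alone predicts the label 1, but the two other functions outvote it with a combined weight of 0.7 against 0.5. This is the sense in which the merged decision boundary can differ from that of any single function.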