Discrimination Resistant Privacy Preserved Data Mining

Anju Sundaresan, Lakshmi S

Data mining is the process of analyzing and summarizing data to extract useful information. It also raises concerns, two of the most important being potential discrimination and potential privacy invasion. Discrimination is the unfair treatment of people based on their membership in some group or category. Automated data collection can lead to automated decisions such as loan granting or denial, insurance premium computation, and job granting or denial. Antidiscrimination techniques, namely discrimination discovery and discrimination prevention, have therefore been introduced in data mining. There are two main types of discrimination: direct and indirect. Privacy-preserving data mining deals with protecting the privacy of individual data or sensitive knowledge. No studies have yet been developed to avoid discrimination and privacy invasion simultaneously. This work combines the following methods to address both issues. Discrimination discovery applies rule mining algorithms to the given datasets. The impact of each rule is measured with the elift function, and this measure can then be used in algorithms that prevent direct and indirect discrimination. A discriminatory attribute can be protected using two methods: direct rule protection and rule generalization. Privacy preservation is based on the k-anonymity algorithm, in which attributes are suppressed or generalized until each row is identical to at least k-1 other rows.
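
The two measures named in the abstract can be illustrated with a minimal Python sketch. It assumes the common definition of elift as the confidence of a rule that includes the protected attribute divided by the confidence of the same rule without it, and it checks k-anonymity over a chosen set of quasi-identifier columns. The function names, the toy records, and the column names (`gender`, `city`, `credit`) are hypothetical and do not come from the paper; this is not the authors' implementation.

```python
from collections import Counter


def confidence(rows, premise, conclusion):
    """conf(premise -> conclusion): share of rows matching premise that also match conclusion."""
    matches = [r for r in rows if all(r.get(k) == v for k, v in premise.items())]
    if not matches:
        return 0.0
    hits = sum(1 for r in matches if all(r.get(k) == v for k, v in conclusion.items()))
    return hits / len(matches)


def elift(rows, protected, context, outcome):
    """Assumed definition: conf(protected, context -> outcome) / conf(context -> outcome)."""
    base = confidence(rows, context, outcome)
    if base == 0.0:
        # The ratio is undefined when the outcome never occurs in this context;
        # this sketch reports 0.0 in that case.
        return 0.0
    return confidence(rows, {**protected, **context}, outcome) / base


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every combination of quasi-identifier values is shared by at least k rows."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in rows)
    return all(count >= k for count in groups.values())


# Hypothetical toy records, for illustration only.
records = [
    {"gender": "F", "city": "NYC", "credit": "deny"},
    {"gender": "F", "city": "NYC", "credit": "deny"},
    {"gender": "F", "city": "NYC", "credit": "grant"},
    {"gender": "M", "city": "NYC", "credit": "deny"},
    {"gender": "M", "city": "NYC", "credit": "grant"},
    {"gender": "M", "city": "NYC", "credit": "grant"},
]

score = elift(records, {"gender": "F"}, {"city": "NYC"}, {"credit": "deny"})
anonymous_3 = is_k_anonymous(records, ["gender", "city"], 3)
anonymous_4 = is_k_anonymous(records, ["gender", "city"], 4)
```

An elift value above 1 means the outcome is more frequent once the protected attribute is added to the rule's premise. A prevention algorithm would compare this value against a chosen threshold, which this sketch leaves out.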