Author Topic: Modified Prime Number Based Partition Algorithm  (Read 1766 times)

IJSER Content Writer

Modified Prime Number Based Partition Algorithm
« on: August 21, 2011, 06:37:45 am »
Author : Mr. Praveen Kumar Mudgal, Prof. Ms. Shweta Modi
International Journal of Scientific & Engineering Research Volume 2, Issue 6, June-2011
ISSN 2229-5518

Abstract— Frequent pattern mining remains an active research area in data mining, where the goal is to uncover hidden and previously unknown patterns. Improved algorithms are regularly introduced and become topics of interest. An association rule is an implication of the form X implies Y, where X is the set of antecedent items and Y is the set of consequent items. Several techniques have been introduced in data mining to discover frequent item sets. This paper describes an existing approach for mining frequent patterns and suggests some modifications. The new algorithm combines the top-down and bottom-up approaches. To calculate the support count, a prime-number representation of item sets is used, which saves time in calculating frequent item sets. The efficiency of the system improves because the frequent item sets are generated in less time.

Index Terms—Association Rule Mining, Cluster Based Partition Algorithm, Data Mining, Frequent Patterns, KDD, Support Count, Prime Numbers.

1   INTRODUCTION
The method of association rule mining was introduced by R. Agrawal in 1993. It shows the association between various products by generating frequent item sets. Data mining requires a wide variety of data to be stored for future processing, and it can take several years to accumulate such a large store, because the data maintained and processed typically occupies more than terabytes of space. Such large volumes of data may originate from business enterprises and scientific research. Data mining is the process of extracting interesting and previously undetected patterns from these large stores. It is one step of the process known as Knowledge Discovery in Databases (KDD), through which relevant, previously unknown and useful information can be extracted. To extract this kind of information, the data is filtered through the KDD process. The resulting information helps in decision support, market analysis, setting market policies, weather forecasting, medical diagnosis and many other applications.

2.1 Concept of Association Rule Mining
The problem is based on the market basket problem and can be stated as follows. Let A = {a, b, c, …, y, z} be a set of m different literals, called items. A transaction database T is a collection of transactions, where each transaction contains a set of items drawn from {a, b, …, y, z}, that is, each transaction is a subset of A.
Generally, an association rule mining algorithm contains the following steps:
• The set of candidate k-item sets is generated by 1-extensions of the large (k-1)-item sets generated in the previous iteration.
• Supports for the candidate k-item sets are computed by a pass over the database.
• Item sets that do not have the minimum support are discarded, and the remaining item sets are called large k-item sets.
This process is repeated until no larger item sets are found. Some of the algorithms used for mining frequent item sets are:
1) Apriori algorithm
2) Partition algorithm
3) Pincer search algorithm
Many more algorithms have been described since these three.
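
The level-wise steps listed in this section (extend the large (k-1)-item sets, count support in a pass over the database, discard item sets below the minimum support) can be sketched roughly as follows. This is only an illustrative sketch, not the paper's algorithm: the function names `support_count` and `frequent_itemsets`, the sample transactions and the absolute `min_support` threshold are all assumptions made for the example.

```python
def support_count(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support):
    """Level-wise search for large item sets (hypothetical helper).

    Returns a dict mapping each large item set to its support count.
    """
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    large = {}
    # Level 1 candidates: every single item.
    candidates = {frozenset([item]) for item in items}
    while candidates:
        # One pass over the database to count support for this level.
        counts = {c: support_count(c, transactions) for c in candidates}
        # Discard item sets below the minimum support.
        current = {c: n for c, n in counts.items() if n >= min_support}
        large.update(current)
        # 1-extensions of the large item sets form the next candidates.
        candidates = {s | {item} for s in current for item in items
                      if item not in s}
    return large


# Assumed sample data, not taken from the paper.
db = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}]
result = frequent_itemsets(db, min_support=2)
print(len(result))  # 6: three single items and three pairs
```

With a minimum support of 2, the item set {a, b, c} occurs in only one transaction and is discarded, so the loop stops after the third level.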

3.1 Concept of Prime number
This algorithm uses prime numbers to represent the items in a transaction. Each item is assigned a unique prime number. Since each transaction is a collection of items, a transaction is represented by the product of the primes assigned to its items. Because prime factorization is unique, this product identifies the transaction's items, and the presence of an item set in a transaction can be checked by modulo division of the transaction's prime product by the item set's prime product. Each division yields one of two outcomes:
•   If remainder = 0, the item set is present in the transaction.
•   If remainder <> 0, the item set is not present in the transaction.
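
A small sketch of the remainder test may make this concrete. The item-to-prime table `ITEM_PRIMES` and the helper names `encode` and `contains` are assumptions chosen for illustration; the paper does not specify them.

```python
from math import prod

# Assumed assignment of one unique prime per item.
ITEM_PRIMES = {"a": 2, "b": 3, "c": 5, "d": 7}


def encode(items):
    """Return the product of the primes assigned to the given items."""
    return prod(ITEM_PRIMES[item] for item in set(items))


def contains(transaction_code, itemset_code):
    """An item set is present exactly when the remainder is zero."""
    return transaction_code % itemset_code == 0


transaction = encode({"a", "c", "d"})             # 2 * 5 * 7 = 70
print(contains(transaction, encode({"a", "c"})))  # True:  70 % 10 == 0
print(contains(transaction, encode({"a", "b"})))  # False: 70 % 6 == 4
```

The test relies on every item appearing at most once per transaction, so each prime occurs in a product with exponent one.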

The advantage of using prime numbers is that the presence of each item set can be calculated very quickly. The prime number representation generates only one number per transaction. Each number is easy to operate on and to store in memory, and each requires less processing time when calculating the support count because each item set has a unique prime product.
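
As a rough illustration of how the support count could be computed from this representation, the sketch below encodes each transaction once as a single integer and then counts the products divisible by an item set's product. The helpers `first_primes`, `build_index` and `support_count`, and the sample database, are hypothetical and not taken from the paper.

```python
from math import prod


def first_primes(n):
    """Return the first n primes by trial division (fine for small n)."""
    primes = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def build_index(transactions):
    """Assign a prime to each item and encode each transaction once."""
    items = sorted({item for t in transactions for item in t})
    prime_of = dict(zip(items, first_primes(len(items))))
    codes = [prod(prime_of[item] for item in set(t)) for t in transactions]
    return prime_of, codes


def support_count(itemset, prime_of, codes):
    """Count the transaction products divisible by the item set product."""
    code = prod(prime_of[item] for item in set(itemset))
    return sum(1 for c in codes if c % code == 0)


# Assumed sample data: products are 30, 6, 10 and 15.
db = [["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]]
prime_of, codes = build_index(db)
print(support_count({"a", "b"}, prime_of, codes))  # 2
```

Python integers have arbitrary precision, so long transactions do not overflow here, but the products grow quickly; a fixed-width integer type would need a bound on transaction length.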
