International Journal of Scientific and Engineering Research
ISSN Online 2229-5518
ISSN Print: 2229-5518
Website: http://www.ijser.org
Volume 3, Issue 2, February 2012
Web Mining Using Topic Sensitive Weighted PageRank
Full Text (PDF), pp. 501-504
Author(s)
Shesh Narayan Mishra, Alka Jaiswal, Asha Ambhaikar
KEYWORDS
Web structure mining; Weighted PageRank; Topic sensitive PageRank; TSWPR
ABSTRACT
The World Wide Web contains a large number of information sources. When searching the web for a particular topic, users often retrieve irrelevant and redundant information, which wastes both the user's time and the search engine's access time. To narrow down this problem, inferring users' interests and needs from their behavior has become increasingly important. Web structure mining plays an effective role in this approach. Page ranking algorithms such as PageRank and Weighted PageRank are commonly used in web structure mining. The original PageRank algorithm ranks pages independently of any particular search query. To yield more specific and accurate search results for a particular topic, we propose a new algorithm, Topic Sensitive Weighted PageRank (TSWPR), based on web structure mining, which determines the relevancy of pages to a given topic better than the existing PageRank, Topic Sensitive PageRank, and Weighted PageRank algorithms. For ordinary keyword search queries, Topic Sensitive Weighted PageRank scores will reflect the topic of the query.
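The abstract does not state the TSWPR formula, so the following Python sketch is only one plausible way to combine the two ideas it names, not the authors' algorithm. It assumes that the topic bias enters through the teleport vector (as in Topic Sensitive PageRank) and that each link is weighted by the in-link and out-link counts of its target (as in Weighted PageRank). The function name, the per-page normalization of link weights, the handling of pages without out-links, and the toy graph are all assumptions made for illustration.

```python
def topic_weighted_pagerank(links, topic_pages, d=0.85, iters=100):
    """Illustrative topic-biased, link-weighted PageRank (hypothetical sketch).

    links: dict mapping a page to the list of pages it links to.
    topic_pages: set of pages assumed to belong to the topic of interest.
    """
    pages = sorted(set(links) | {t for ts in links.values() for t in ts})
    in_deg = {p: 0 for p in pages}
    for targets in links.values():
        for t in targets:
            in_deg[t] += 1
    out_deg = {p: len(links.get(p, ())) for p in pages}

    # Topic-sensitive part: random jumps land only on topic pages.
    topic = [p for p in pages if p in topic_pages]
    if not topic:
        raise ValueError("topic_pages must contain at least one known page")
    teleport = {p: (1.0 / len(topic) if p in topic_pages else 0.0)
                for p in pages}

    # Weighted part: a link v -> u gets a weight proportional to the
    # in-link and out-link counts of u relative to v's other targets.
    # Assumption: weights are normalized per source page so rank is conserved.
    share = {}
    for v in pages:
        targets = links.get(v, ())
        if not targets:
            continue
        in_total = sum(in_deg[t] for t in targets)
        out_total = sum(out_deg[t] for t in targets)
        raw = {}
        for u in targets:
            w_in = in_deg[u] / in_total
            w_out = out_deg[u] / out_total if out_total else 1.0
            raw[u] = w_in * w_out
        norm = sum(raw.values())
        share[v] = {u: w / norm for u, w in raw.items()}

    rank = dict(teleport)
    for _ in range(iters):
        # Assumption: rank held by pages without out-links is teleported.
        dangling = sum(rank[p] for p in pages if p not in share)
        new = {p: (1 - d + d * dangling) * teleport[p] for p in pages}
        for v, targets in share.items():
            for u, w in targets.items():
                new[u] += d * rank[v] * w
        rank = new
    return rank


# Toy graph (hypothetical): page "d" has no in-links.
toy_links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = topic_weighted_pagerank(toy_links, topic_pages={"a"})
for page, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(page, round(score, 4))
```

Changing `topic_pages` changes the ranking without changing the link graph, which is the query-dependence that the abstract says the original PageRank lacks.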