Search Query Performance Improvement on Medical Data Bases

International Journal of Scientific and Engineering Research

ISSN Online 2229-5518

ISSN Print: 2229-5518

Website: http://www.ijser.org

IJSER, Volume 3, Issue 7, July 2012

Full Text (PDF), pp. 160-170

Author(s)

Thayyaba Khatoon Mohammed, Gayatri.M, G. Swathi, Sukerthi.S.

KEYWORDS

Interactive data exploration and discovery, search process, graphical user interfaces, interaction styles.

Search queries on biomedical databases, such as PubMed, often return a large number of results, only a small subset of which is relevant to the user. Ranking and categorization, which can also be combined, have been proposed to alleviate this information overload problem. Results categorization for biomedical databases is the focus of this work. A natural way to organize biomedical citations is according to their MeSH annotations. MeSH is a comprehensive concept hierarchy used by PubMed. In this paper, we present the BioNav system, a novel search interface that enables the user to navigate a large number of query results by organizing them using the MeSH concept hierarchy. First, the query results are organized into a navigation tree. At each node expansion step, BioNav reveals only a small subset of the concept nodes, selected such that the expected user navigation cost is minimized. In contrast, previous works expand the hierarchy in a predefined static manner, without navigation cost modeling. We show that the problem of selecting the best concepts to reveal at each node expansion is NP-complete, and we propose an efficient heuristic as well as a feasible optimal algorithm for relatively small trees. We show experimentally that BioNav outperforms state-of-the-art categorization systems by up to an order of magnitude with respect to the user navigation cost.
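The first step described above, organizing query results into a navigation tree over a concept hierarchy, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the toy hierarchy stands in for MeSH, the citation ids and annotations are invented, and the function names are hypothetical. It does not reproduce BioNav's cost model or its node selection algorithm; it only shows how concepts with no matching citations might be pruned from the hierarchy.

```python
from collections import defaultdict

# Toy concept hierarchy (child -> parent). A stand-in for MeSH, not real MeSH data.
PARENT = {
    "Diseases": None,
    "Neoplasms": "Diseases",
    "Carcinoma": "Neoplasms",
    "Infections": "Diseases",
    "Chemicals": None,
}

# Hypothetical query results: citation id -> concepts it is annotated with.
RESULTS = {
    "c1": {"Carcinoma"},
    "c2": {"Carcinoma", "Neoplasms"},
    "c3": {"Neoplasms"},
}


def build_navigation_tree(parent, results):
    """Return (children, attached) restricted to concepts that lead to a result."""
    # Attach each citation to every concept it is annotated with.
    attached = defaultdict(set)
    for cid, concepts in results.items():
        for concept in concepts:
            attached[concept].add(cid)

    # Keep a concept if it, or any of its descendants, has at least one citation.
    keep = set()
    for concept in attached:
        node = concept
        while node is not None and node not in keep:
            keep.add(node)
            node = parent[node]

    # Rebuild parent -> children links among the kept concepts only.
    children = defaultdict(list)
    for node in sorted(keep):
        if parent[node] is not None:
            children[parent[node]].append(node)
    return dict(children), dict(attached)


def subtree_citations(node, children, attached):
    """Distinct citations attached at or below a node."""
    found = set(attached.get(node, set()))
    for child in children.get(node, []):
        found |= subtree_citations(child, children, attached)
    return found


children, attached = build_navigation_tree(PARENT, RESULTS)
```

In this toy example, "Infections" and "Chemicals" are dropped because no result is annotated with them, while `subtree_citations` counts each citation once per subtree even when it is attached to several concepts. A count of this kind is the sort of quantity a navigation cost model could draw on when choosing which concepts to reveal.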


