XM-Tree, a new index for web information retrieval

Authors

  • Claudia Deco Departamento de Sistemas e Informática, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, Rosario, Argentina
  • Guillermo Pierángeli Departamento de Sistemas e Informática, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, Rosario, Argentina
  • Cristina Bender Departamento de Sistemas e Informática, Facultad de Ciencias Exactas, Ingeniería y Agrimensura, Universidad Nacional de Rosario, Rosario, Argentina
  • Nora Susana Reyes Departamento de Informática, Universidad Nacional de San Luis, San Luis, Argentina

Keywords:

Metric spaces, Similarity Searching, M-Tree, XM-Tree

Abstract

Web Information Retrieval is another problem of searching elements of a set that are closest to a given query under a certain similarity criterion. It is of interest to take advantage of metric spaces in order to solve a search in an effective and efficient way. In this article, we present an extension of the M-Tree index, called XM-Tree, in order to improve search results. This index allows dynamic insertion of new data, reduces search costs using pruning and precalculated distances, and uses a tolerable amount of space, which makes this index apt for the extensive and dynamic Web. The proposed extension indexes Web documents, uses L2 as indexing distance and L as similarity criterion to solve queries. We also present experiments validating the results.

Downloads

Download data is not yet available.

References

[1] E. Chávez, G. Navarro, R. Baeza-Yates y J. L. Marroquín. “Searching in Metric Spaces”, ACM Computing Surveys, 33(3), 2001; pp 273–321.
[2] R. Baeza-Yates y B. Ribeiro-Neto. (eds.). Modern Information Retrieval. ACM Press, New York, 1999.
[3] P. Ciaccia, M. Patella y P. Zezula. “M-Tree: An Efficient Access Method for Similarity Search in Metric Spaces”, Proc. of the 23rd Conference on Very Large Databases (VLDB’97), 1997; pp. 426–435.
[4] C. Traina Jr., A. Traina, B. Seeger y C. Faloutsos. “Slim-trees: High Performance Metric Trees Minimizing Overlap between Nodes”, Proc. of 7th International Conference on Extending Database Technology, LNCS, 1777, 2000; pp. 51–68.
[5] X. Zhou, G. Wang, J. Xu Yu y G. Yu. “M+-tree: A New Dynamical Multidimensional Index for Metric Spaces”, Proc. of the 14th Australasian Database Conference, 2003; pp 161 – 168.
[6] X. Zhou, G. Wang, X. Zhou y G. Yu. “BM+-tree: A Hyperplane-based Index Method for High-Dimensional Metric Spaces”, Proc. of 10th International Conference Database Systems for Advanced Applications, LNCS, 3453, 2005; pp 398–409.
[7] M. R. Vieira, C. Traina Jr., F. J. T. Chino y A. J. M. Traina. “DBM-Tree: Trading Height-Balancing for Performance in Metric Access Methods”, Journal of the Brazilian Computer Society, v. 11, n. 3, 2006; pp 20.
[8] A. Ocsa y E. Cuadros-Vargas. “DBM*-Tree: An Efficient Metric Acces Method”, Proc. of ACM Southeast Regional Conference, 2007; pp 401–406.
[9] P. Ciaccia y M. Patella. “The M2-tree: Processing Complex Multi-Feature Queries with just one Index”, Proceedings of DELOS Workshop: Information Seeking, Searching and Querying in Digital Libraries, 2000.
[10] S. Brin y L. Page. “The Anatomy of a Large-Scale Hypertextual Web Search Engine”, Proc of the Seventh International World Wide Web Conference, vol. 30 of Computer Networks and ISDN Systems, 1998, pp. 107–117.
[11] D. J. Delorie. DJGPP. Copyright (c) 2003 URL http://www.delorie.com/djgpp/

Downloads

Published

2008-07-01

How to Cite

Deco, C., Pierángeli, G., Bender, C., & Reyes, N. S. (2008). XM-Tree, a new index for web information retrieval. Journal of Computer Science and Technology, 8(02), p. 78–84. Retrieved from https://journal.info.unlp.edu.ar/JCST/article/view/745

Issue

Section

Original Articles

Most read articles by the same author(s)

1 2 > >>