Buckets inverted lists for a search engine with BSP

Authors

  • Graciela Verónica Gil Costa, LIDIC - Computer Science Department, University of San Luis, San Luis, Argentina
  • Alicia Marcela Printista, LIDIC - Computer Science Department, University of San Luis, San Luis, Argentina
  • Juan Mauricio Marin Cahiuan, Center of Web Research, University of Magallanes, Punta Arenas, Chile

Keywords:

textual databases, supersteps, search engine, BSP, buckets

Abstract

Most information in science, engineering and business has been recorded in the form of text, and much of it can be found online on the World Wide Web. Search engines are among the major tools that support access to this information. They usually apply information retrieval techniques to rank Web pages, based on a simple query and an index structure such as inverted lists. The retrieval models are the basis for the algorithms that score and rank the Web pages. This presentation describes several bucket-based alternatives to inverted lists for an information retrieval system. The main interest is how query performance is affected by the index organization on a cluster of PCs. The server is designed on top of the Bulk Synchronous Parallel (BSP) model of parallel computing.

Published

2006-04-03

How to Cite

Gil Costa, G. V., Printista, A. M., & Marin Cahiuan, J. M. (2006). Buckets inverted lists for a search engine with BSP. Journal of Computer Science and Technology, 6(01), p. 28–35. Retrieved from https://journal.info.unlp.edu.ar/JCST/article/view/826

Section

Original Articles
