New technologies for big multimedia data treatment
Keywords:Metric Space, Hybrid Computation, GPU, Index, Parallel Searching
With the technology advance and the growth of Internet, the information that can be found in this net, as well as the number of users that access to look for speciﬁc data is bigger. Therefore, it is desirable to have a search system that allows to retrieve information at a reasonable time and in an efﬁcient way. In this paper we show two computing paradigms appropriate to apply in the treatment of large amounts of data consisting of objects such as images, text, sound and video, using hybrid computing over MPI+OpenMP and GPGPU. The proposal is developed through experience gained in the construction of various indexes and the subsequent search, through them, of multimedia objects.
 E. Chávez, G. Navarro, R. Baeza-Yates, and J. Marroquı́n, “Searching in metric spaces,” ACM Comput. Surv., vol. 33, no. 3, pp. 273–321, 2001.
 H. Samet, Foundations of Multidimensional and Metric Data Structures (The Morgan Kaufmann Series in Computer Graphics and Geometric Modeling). San Francisco, CA, USA: Morgan Kaufmann Publishers Inc., 2005.
 K. Figueroa, E. Chávez, G. Navarro, and R. Paredes, “Speeding up spatial approximation search in metric spaces,” ACM Journal of Experimental Algorithmics, vol. 14, p. article 3.6, 2009.
 P. Ciaccia and M. Patella, “Approximate and probabilistic methods,” SIGSPATIAL Special, vol. 2, no. 2, pp. 16–19, Jul. 2010. [Online]. Available: http://doi.acm.org/10.1145/1862413.1862418
 M. Patella and P. Ciaccia, “Approximate similarity search: A multi-faceted problem,” J. Discrete Algorithms, vol. 7, no. 1, pp. 36–48, 2009.
 B. Bustos and G. Navarro, “Probabilistic proximity searching algorithms based on compact partitions,” Discrete Algorithms, vol. 2, no. 1, pp. 115–134, Mar. 2004. [Online]. Available: http://dx.doi.org/10.1016/S1570-8667(03)00067-4
 A. Singh, H. Ferhatosmanoglu, and A. Tosun, “High dimensional reverse nearest neighbor queries,” in The twelfth international conference on Information and knowledge management, ser. CIKM ’03. New York, NY, USA: ACM, 2003, pp. 91–98. [Online]. Available: http://doi.acm.org/10.1145/956863.956882
 F. Moreno-Seco, L. Micó, and J. Oncina, “A modification of the laesa algorithm for approximated k-nn classification,” Pattern Recognition Letters, vol. 24, no. 13, pp. 47 – 53, 2003.
 E. Chávez, K. Figueroa, and G. Navarro, “Proximity searching in high dimensional spaces with a proximity preserving order,” in Proc. 4th Mexican International Conference on Artificial Intelligence (MICAI), ser. LNAI 3789, 2005, pp. 405–414.
 A. Labrinidis and H. V. Jagadish, “Challenges and opportunities with big data,” Proc. VLDB Endow., vol. 5, no. 12, pp. 2032–2033, 2012.
 G. Hager and G. Wellein, Introduction to High Performance Computing for Scientists and Engineers, ser. Computational Science, 2010.
 G. Hager, G. Jost, and R. Rabenseifner, “Communication characteristics and hybrid mpio/penmp parallel programming on clusters of multi-core smp nodes,” Proceedings of Cray User Group Conference, vol. 4, no. d, pp. 54–55, 2009.
 D. B. Kirk and W. W. Hwu, Programming Massively Parallel Processors, A Hands on Approach. Elsevier, Morgan Kaufmann, 2010.
 J. D. Owens, M. Houston, D. Luebke, S. Green, J. Stone, and J. Phillips, “GPU computing,” Proceedings of the IEEE, vol. 96, no. 5, pp. 879–899, May 2008.
 NVIDIA, “Nvidia cuda compute unified device architecture, programming guide version 4.2.” in NVIDIA, 2012.
 V. Mancini, F. Bustos, V. G. Costa, and A. M. Printista, “Data partitioning evaluation for multimedia systems in hybrid environments,” in 3PGCIC, 2012, pp. 321–326.
 N. R. Brisaboa, A. Fariña, O. Pedreira, and N. Reyes, “Similarity search using sparse pivots for efficient multimedia information retrieval,” in ISM. IEEE Computer Society, 2006, pp. 881–888.
 V. G. Costa, M. Marı́n, and N. Reyes, “Parallel query processing on distributed clustering indexes,” J. Discrete Algorithms, vol. 7, no. 1, pp. 3–17, 2009.
 V. G. Costa, R. J. Barrientos, M. Marı́n, and C. Bonacic, “Scheduling metric-space queries processing on multi-core processors,” in PDP, 2010, pp. 187–194.
 R. Barrientos, J. Gomez, C. Tenllado, and M. Prieto, “Heap based k-nearest neighbor search on gpus,” 09/2010 2010, pp. 559–566.
 V. Garcia, E. Debreuve, and M. Barlaud, “Fast k nearest neighbor search using GPU,” in CVPR Workshop on Computer Vision on GPU (CVGPU), Anchorage, Alaska, USA, June 2008.
 V. Garcia, E. Debreuve, F. Nielsen, and M. Barlaud, “k-nearest neighbor search: fast GPU-based implementations and application to high-dimensional feature matching,” in IEEE International Conference on Image Processing, Hong Kong, Sept. 2010.
 K. Kato and T. Hosino, “Solving k-nearest neighbor problem on multiple graphics processors,” in 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing, CCGRID, ACM, Ed., 2010, pp. 769–773.
 Q. Kuang and L. Zhao, “A practical gpu based knn algorithm,” in International Symposium on Computer Science and Computational Technology (ISC-SCT), 2009, pp. 151 – 155.
 S. Liang, Y. Liu, C. Wang, and L. Jian, “Design and evaluation of a parallel k-nearest neighbor algorithm on CUDA-enabled GPU,” in IEEE 2nd Symposium on Web Society (SWS), 2010, pp. 53 – 60.
 T.Rozen, K.Boryczko, and W.Alda, “Gpu bucket sort algorithm with applications to nearest-neighbour search,” Journal of WSCG, vol. 16, no. 1-3, pp. 161–167, 2008.
 R. Barrientos, J. Gomez, C. Tenllado, and M. Prieto, “Query processing in metric spaces using gpus,” 2011.
 R. J. Barrientos, J. Gomez, C. Tenllado, M. Prieto, and M. Marin, “kNN Query Processing in Metric Spaces using GPUs,” vol. 6852, 2011, pp. 380–392.
 S.Brown and J.Snoeyink, “Gpu nearest neighbors using a minimal kd-tree,” in Second Workshop on Massive Data Algorithmics (MASSIVE 2010), Snowbird, Utah, USA, June 2010.
 D.Qiu, S.May, and A. Nüchter, “Gpu-accelerated nearest neighbor search for 3d registration,” in Proceedings of the 7th International Conference on Computer Vision Systems: Computer Vision Systems, ser. ICVS ’09. Berlin, Heidelberg: Springer-Verlag, 2009, pp. 194–203.
 K.Zhou, Q.Hou, R.Wang, and B.Guo, “Real-time kd-tree construction on graphics hardware,” in ACM SIGGRAPH Asia 2008 papers, ser. SIGGRAPH Asia ’08. New York, NY, USA: ACM, 2008, pp. 126:1–126:11.
 M. Lopresti, N. Miranda, F. Piccoli, and N. Reyes, “Efficient similarity search on multimedia databases,” in XVIII Congreso Argentino de Ciencias de la Computacin, CACIC 2012, 2012, pp. 1079–1088.
 ——, “Permutation index and gpu to efficiently solve many queries,” in Proc. HPCLatam 2013, 2013. [Online]. Available: http://hpc2013.hpclatam.org/
 N. Miranda, E. Chávez, F. Piccoli, and N. Reyes, “(very) fast (all) k-nearest neighbors in metric and non metric spaces without indexing,” in Proc. International Conference on Similarity Search and Applications (SISAP 2013). A Coruña, Spain: Elseiver, 2013.