Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8077)
Abstract
Today's astronomical catalogs contain patterns for hundreds of millions of objects, with data volumes in the terabyte range. Upcoming projects will gather such patterns for several billion objects, amounting to petabytes and exabytes of data. From a machine learning point of view, these settings often yield unsupervised, semi-supervised, or fully supervised tasks with large training sets and huge test sets. Recent studies have demonstrated the effectiveness of prototype-based learning schemes such as simple nearest neighbor models. However, although these models are among the most computationally efficient methods for such settings (if implemented via spatial data structures), applying them to all remaining patterns in a given catalog can easily take hours or even days. In this work, we investigate the practical effectiveness of GPU-based approaches for accelerating such nearest neighbor queries in this context. Our experiments indicate that carefully tuned implementations of spatial search structures for such multi-core devices can significantly reduce the practical runtime. This renders the resulting frameworks an important algorithmic tool for current and upcoming data analyses in astronomy.
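To make the idea of a nearest neighbor query over a spatial data structure concrete, the following is a minimal CPU-only sketch of a k-d tree search in plain Python. It is not the GPU implementation studied in the paper. The function names (`build_kdtree`, `nearest`, `brute_force`) and the random four-dimensional points standing in for photometric features are assumptions made purely for illustration.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_kdtree(points, depth=0):
    """Build a k-d tree as nested tuples (point, left, right).

    At each level the points are split at the median along one axis,
    cycling through the axes with increasing depth.
    """
    if not points:
        return None
    axis = depth % len(points[0])
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return (
        points[mid],
        build_kdtree(points[:mid], depth + 1),
        build_kdtree(points[mid + 1:], depth + 1),
    )


def nearest(node, query, depth=0, best=None):
    """Return (squared_distance, point) for the nearest neighbor of query."""
    if node is None:
        return best
    point, left, right = node
    d = sq_dist(point, query)
    if best is None or d < best[0]:
        best = (d, point)
    axis = depth % len(query)
    diff = query[axis] - point[axis]
    # Descend first into the side of the splitting plane containing the query.
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, depth + 1, best)
    # Visit the other side only if the plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, query, depth + 1, best)
    return best


def brute_force(points, query):
    """Reference answer: scan every point (linear time per query)."""
    return min((sq_dist(p, query), p) for p in points)


if __name__ == "__main__":
    random.seed(0)
    # Hypothetical stand-in for a catalog: 1000 objects with 4 features each.
    catalog = [tuple(random.random() for _ in range(4)) for _ in range(1000)]
    tree = build_kdtree(catalog)
    query = tuple(random.random() for _ in range(4))
    print(nearest(tree, query)[0] == brute_force(catalog, query)[0])
```

The pruning step in `nearest` is what makes tree-based queries cheaper than a full scan: whole subtrees are skipped whenever the splitting plane lies farther from the query than the best candidate found so far. Answering a very large number of such queries is the workload that the paper proposes to accelerate on GPUs.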
Author information
Authors and Affiliations
Department of Computing Science, University of Oldenburg, 26111, Oldenburg, Germany
Justin Heinermann & Oliver Kramer
Faculty of Physics and Astronomy, Ruhr-University Bochum, 44801, Bochum, Germany
Kai Lars Polsterer
Department of Computer Science, University of Copenhagen, 2100, Copenhagen, Denmark
Fabian Gieseke
Editor information
Editors and Affiliations
Business Informatics I, University of Trier, 54286, Trier, Germany
Ingo J. Timm
Institute for Web Science and Technologies, University of Koblenz, Universitätsstr. 1, 56070, Koblenz, Germany
Matthias Thimm
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Heinermann, J., Kramer, O., Polsterer, K.L., Gieseke, F. (2013). On GPU-Based Nearest Neighbor Queries for Large-Scale Photometric Catalogs in Astronomy. In: Timm, I.J., Thimm, M. (eds) KI 2013: Advances in Artificial Intelligence. KI 2013. Lecture Notes in Computer Science, vol 8077. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40942-4_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40941-7
Online ISBN: 978-3-642-40942-4
eBook Packages: Computer Science, Computer Science (R0)