Please use this identifier to cite or link to this item: http://hdl.handle.net/10261/234576

Title

CosmoHub: Interactive exploration and distribution of astronomical data on Hadoop

Authors: Tallada-Crespí, Pau; Carretero, Jorge; Casals, J.; Acosta-Silva, C.; Serrano, Santiago; Caubet, Marc; Castander, Francisco J.; César, Eduardo; Crocce, Martín; Delfino, Manuel C.; Eriksen, Martin Borstad; Fosalba, Pablo; Gaztañaga, Enrique; Merino Arévalo, Gonzalo; Neissner, Christian; Tonello, Nadia
Keywords: Apache Hadoop; Apache Hive; ASDF; Data distribution; Data exploration; FITS
Publication date: 2020
Publisher: Elsevier
Citation: Astronomy and Computing 32: 100391 (2020)
Abstract: We present CosmoHub (https://cosmohub.pic.es), a web application based on Hadoop to perform interactive exploration and distribution of massive cosmological datasets. Current cosmology seeks to unveil the nature of both dark matter and dark energy by mapping the large-scale structure of the Universe through the analysis of massive amounts of astronomical data, which have grown steadily over the last decades (and will keep growing) with the digitization and automation of experimental techniques. CosmoHub, hosted and developed at the Port d'Informació Científica (PIC), provides support to a worldwide community of scientists, without requiring the end user to know any Structured Query Language (SQL). It serves data from several large international collaborations such as the Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe Survey (PAUS) and the Marenostrum Institut de Ciències de l'Espai (MICE) numerical simulations. While CosmoHub was originally developed as a web frontend to a PostgreSQL relational database, this work describes its current version, built on top of Apache Hive, which facilitates scalable reading, writing and managing of huge datasets. As CosmoHub's datasets are seldom modified, Hive is a better fit. Over 60 TiB of cataloged information and 50×10^9 astronomical objects can be interactively explored using an integrated visualization tool which includes 1D histogram and 2D heatmap plots. In our current implementation, online exploration of datasets of 10^9 objects can be done on a timescale of tens of seconds. Users can also download customized subsets of data in standard formats, generated in a few minutes.
Publisher version: http://doi.org/10.1016/j.ascom.2020.100391
URI: http://hdl.handle.net/10261/234576
DOI: 10.1016/j.ascom.2020.100391
Identifiers: doi: 10.1016/j.ascom.2020.100391
issn: 2213-1337
Appears in collections: (ICE) Articles




Scopus citations: 29 (checked on 20 Apr 2024)
Web of Science citations: 23 (checked on 5 Feb 2024)
Page views: 86 (checked on 2 May 2024)
Downloads: 58 (checked on 2 May 2024)

This item is licensed under a Creative Commons License.