Towards a sustainable astronomical data infrastructure. Optimising linking data from the Rucio datalake to the users areas within the SKA Regional Centres Network

dc.contributor.author: Parra-Royón, Manuel
dc.contributor.author: Garrido-Sánchez, Julián
dc.contributor.author: Sánchez-Expósito, Susana
dc.contributor.author: Darriba-Pol, Laura
dc.contributor.author: Sánchez-Castañeda, Jesús
dc.contributor.author: Mendoza, M. Ángeles
dc.contributor.author: Coles, Jeremy
dc.contributor.author: McConkey, Sean
dc.contributor.author: Joshi, Rohini
dc.contributor.author: Barnsley, Rob
dc.contributor.author: Salgado, Jesús
dc.contributor.author: Verdes-Montenegro, Lourdes
dc.date.accessioned: 2026-04-24T12:09:29Z
dc.date.issued: 2026
dc.description.abstract: The distributed architecture of the SKA Regional Centre Network (SRCNet) aims to provide scientific communities worldwide with efficient computational and storage resources to exploit the massive data volumes produced by the SKA Observatory (SKAO). Given the volume of SKAO data, traditional data management paradigms, in which data is transferred to computational resources, are no longer feasible. Instead, computational workflows must increasingly be relocated closer to data storage locations, emphasising efficient data access strategies and avoiding unnecessary duplication or redundancy. In this context, we present PrepareData, a modular and extensible data delivery service developed within SRCNet prototyping activities. Our proposal for this service addresses the critical challenge of redundant data transfers and duplication at both node and user levels by enabling seamless delivery of requested datasets from local Rucio Storage Elements (RSEs) directly into users’ working environments. PrepareData operates as a local service within each SRCNet node and is integrated into a broader ecosystem of federated services. Specifically, we designed and evaluated two distinct yet complementary implementations that avoid unnecessary data duplication and provide a dynamic data bridge between the RSEs and the user storage areas: (1) a filesystem-based solution leveraging CephFS, which uses shared filesystem mount points and bind mounts to ensure consistent and immediate availability of the data across computational nodes, and (2) a Kubernetes model using Persistent Volumes and Persistent Volume Claims, which dynamically injects data into a user’s storage area. We detail the architectural design and development, the technical implementation, and the integration of both solutions with science-enabling tools such as JupyterHub, CARTA or virtually any other application, and we provide a performance evaluation.
This contribution provides a scalable and sustainable blueprint for data delivery in federated scientific infrastructures, supporting the broader goals of green computing and efficient resource utilisation.
dc.identifier.doi: 10.12688/openreseurope.22118.2
dc.identifier.issn: 2732-5121
dc.identifier.uri: https://irf.fhnw.ch/handle/11654/56552
dc.identifier.uri: https://doi.org/10.26041/fhnw-16100
dc.language.iso: en
dc.publisher: F1000 Research
dc.relation.ispartof: Open Research Europe
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject.ddc: 004 - Computer science, Internet
dc.title: Towards a sustainable astronomical data infrastructure. Optimising linking data from the Rucio datalake to the users areas within the SKA Regional Centres Network
dc.type: 01A - Article in scientific journal
dc.volume: 6
dspace.entity.type: Publication
fhnw.InventedHere: Yes
fhnw.ReviewType: Anonymous ex ante peer review of a complete publication
fhnw.oastatus.aurora: Version: Accepted *** Embargo: None *** Licence: CC BY *** URL: https://v2.sherpa.ac.uk/id/publication/39499
fhnw.openAccessCategory: Diamond
fhnw.publicationState: Published
fhnw.targetcollection: b508cce9-5084-49ae-a565-d8e5c348c3ab
relation.isAuthorOfPublication: 5cc50e95-cda8-4580-9d7a-7d0e61ace79f
relation.isAuthorOfPublication.latestForDiscovery: 5cc50e95-cda8-4580-9d7a-7d0e61ace79f
Files

Original bundle

Name: 5552b11b-519f-4e0c-9844-8952968a3624_oreu22118.pdf
Size: 1.01 MB
Format: Adobe Portable Document Format
