(Extracted from the EOSC-Hub EAP2 final report)
Principal investigator: Baptiste Cecconi (obspm)
Shepherd: Baptiste Grenier (EGI.eu)
About the pilot - initial ambition
VESPA (Virtual European Solar and Planetary Access) is a mature project, with 50 VESPA providers distributing open-access datasets throughout the world (EU, Japan, USA). As of October 2019, 18.3 million data products were available within the VESPA network, 5 million of them from the ESA/PSA (Planetary Science Archive).
The VESPA team is supported by the Europlanet-RI-2024 project (started on Feb 1st 2020 for 48 months, H2020 grant agreement No 871149).
Each VESPA provider (institute, scientific team...) hosts and maintains a server (physical or virtualized) running the same software distribution (DaCHS, Data Centre Helper Suite), which implements the interoperability layers (from IVOA, the International Virtual Observatory Alliance, and from VESPA) and follows FAIR principles. Each server hosts a table of standardized metadata with URLs to data files or data services. Data files can be hosted by the VESPA provider team, or in an external archive (e.g., ESA/PSA, the Planetary Science Archive).
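The idea of a metadata table pointing to data files can be illustrated with a minimal sketch. It uses an in-memory SQLite table only for illustration; it is not how DaCHS stores or serves its tables. The table name, the column names (granule_uid, target_name, access_url), the records and the example.org URLs are all assumptions made for this example.

```python
import sqlite3
from typing import List


def build_demo_table() -> sqlite3.Connection:
    """Create an in-memory metadata table with two made-up records."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per data product, with a URL to the file.
    conn.execute(
        "CREATE TABLE granules ("
        "granule_uid TEXT PRIMARY KEY, target_name TEXT, access_url TEXT)"
    )
    conn.executemany(
        "INSERT INTO granules VALUES (?, ?, ?)",
        [
            # File hosted by the provider team (placeholder URL).
            ("obs-001", "Jupiter", "https://data.example.org/files/obs-001.fits"),
            # File hosted in an external archive (placeholder URL).
            ("obs-002", "Mars", "https://archive.example.org/products/obs-002"),
        ],
    )
    return conn


def urls_for_target(conn: sqlite3.Connection, target: str) -> List[str]:
    """Return the data URLs of all products recorded for a given target."""
    rows = conn.execute(
        "SELECT access_url FROM granules WHERE target_name = ? ORDER BY granule_uid",
        (target,),
    )
    return [row[0] for row in rows]


if __name__ == "__main__":
    table = build_demo_table()
    print(urls_for_target(table, "Mars"))
```

The point of the sketch is that the server only needs to hold the metadata rows: the URL in each row may point to a locally hosted file or to an external archive.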
The VESPA architecture relies on the assumption that data providers' servers are up and running continuously. The VESPA network is distributed but not redundant. For small teams with little or no local IT support, the services are regularly down. We thus need a more stable and manageable platform for hosting those services. The EOSC-hub “cloud container compute” service would solve this problem.
We propose to use the EOSC infrastructure to host VESPA providers' servers (through a controlled deployment environment with git-managed containers).
The open-source DaCHS framework is developed for the Debian distribution. Docker containerization will be used to facilitate deployment of the framework on other Linux environments.
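A deployment driven by git-managed configuration could be sketched as follows. This is an illustration under stated assumptions, not the pilot's actual deployment script: the ProviderConfig fields, the repository URL, the image naming and the port numbers are hypothetical, and the configuration repository is assumed to contain a Dockerfile at its root. The function only builds the command lines and does not run them.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical description of one provider's deployment."""
    name: str                   # short lowercase name, reused for image and container
    config_repo: str            # git URL of the server and service configuration
    host_port: int = 80         # port exposed on the host (assumption)
    container_port: int = 8080  # port served inside the container (assumption)


def deployment_commands(cfg: ProviderConfig) -> List[List[str]]:
    """Return the commands (as argument lists) for one deployment."""
    workdir = f"{cfg.name}-config"
    image = f"{cfg.name}-dachs"
    return [
        # 1. Fetch the git-managed configuration.
        ["git", "clone", cfg.config_repo, workdir],
        # 2. Build the container image from the Dockerfile in that checkout.
        ["docker", "build", "-t", image, workdir],
        # 3. Start the container in the background and publish its port.
        ["docker", "run", "-d", "--name", cfg.name,
         "-p", f"{cfg.host_port}:{cfg.container_port}", image],
    ]


if __name__ == "__main__":
    demo = ProviderConfig(
        name="demo",
        config_repo="https://gitlab.example.org/vespa/demo.git",  # placeholder URL
    )
    for command in deployment_commands(demo):
        print(" ".join(command))
```

Keeping the commands as data makes the same plan usable on a cloud VM or on a local machine, for instance by passing each argument list to subprocess.run.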
Progress and key results
The VESPA team has implemented a prototype that proves the feasibility and relevance of the initial proposal. The prototype proposes a workflow based on the deployment of a docker-based container implementing the Astronomy Virtual Observatory framework (DaCHS, Data Centre Helper Suite), together with selected data services. In the course of the pilot project, VESPA has transitioned to a more sustainable service configuration management, using an eduTEAMS-managed VO to manage access to the services.
- The docker-based workflow prototype with a small test data service is functional
- The server configuration files are managed on a gitlab server (hosted by Obs. Paris)
- The data service configurations are managed on a gitlab server
- The access to the gitlab server is managed through the eduTEAMS AAI.
- The cloud compute resources were provided by CC-IN2P3 and CESNET
- An “admin:cloud” group defined in the eduTEAMS AAI has been mapped to the VESPA VO at EGI Check-in, to allow access to the VM deployment.
- The GÉANT team provided VESPA with support for connecting a gitlab server to eduTEAMS.
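The group-to-VO mapping listed above can be pictured with a small sketch. It only illustrates the logic of mapping a community AAI group onto an e-infrastructure VO; it does not use any eduTEAMS or EGI Check-in API. The VO identifier "vespa-vo", the mapping table and the function names are hypothetical.

```python
from typing import Dict, Iterable, Set

# Hypothetical mapping table: community AAI group -> e-infrastructure VO.
# Only the "admin:cloud" group name comes from the report; "vespa-vo" is a placeholder.
GROUP_TO_VO: Dict[str, str] = {"admin:cloud": "vespa-vo"}


def mapped_vos(groups: Iterable[str]) -> Set[str]:
    """Translate a user's community groups into the VOs they map to."""
    return {GROUP_TO_VO[group] for group in groups if group in GROUP_TO_VO}


def may_deploy_vm(groups: Iterable[str], vo: str = "vespa-vo") -> bool:
    """A user may deploy VMs only if one of their groups maps to the VO."""
    return vo in mapped_vos(groups)


if __name__ == "__main__":
    print(may_deploy_vm(["admin:cloud", "users"]))  # member of the mapped group
    print(may_deploy_vm(["users"]))                 # no mapped group
```

With this kind of mapping, access to VM deployment is granted by managing group membership in the community AAI, without maintaining a separate user list on the infrastructure side.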
The VESPA team is willing to consolidate the prototype before onboarding it on the EOSC Portal.
Description of the integrations
Allocated ICT resources (cloud, storage, etc.)
Used as a community AAI to manage the user community's authentication and authorisation
Used as e-infrastructure AAI proxy, mapping the attributes from eduTEAMS to be consumed by EGI services
Service access and integration with eduTEAMS
Deploying and running the Virtual Machines supporting the service
Cloud resources (VMs and storage)
The VESPA project architecture has been consolidated thanks to the EOSC-Hub EAP:
- Configurations of servers and services managed by git, which improves the robustness and sustainability of the framework
- We tested a federated AAI service, and we think it is relevant for this type of service.
- Implementation of AAI-managed gitlab server
- Development of an OpenStack VM deployment script for our application, based on git-managed configurations (for deployment on EOSC or locally)
- Better understanding of EOSC ecosystem
VESPA-Cloud is still at a prototype stage, but the project consolidated the overall VESPA framework, and opened up new solutions and opportunities for future VESPA service implementations.
Specifically, for data providers who are not able or not willing to host a VESPA server for a long period of time, we now have a working solution for service deployment, either on EOSC or on local data centres.
Future plans and sustainability aspects
The VESPA team is willing to continue the VESPA-Cloud pilot project, and to explore further the use of EOSC resources for sharing solar system data. The prolongation of the SLA has been approved under the same conditions as the current EAP. In the future, a cooperation model will have to be decided on, in order to maintain access to the resources.