This tutorial is still under development, but nearly finalized. Please try it out and post any comments.
workpackage | WP6 |
task | 2 |
document number | 018 |
document version | 0.1 |
document title | VESPA service tutorial with intermediate metadata table |
document type | TP |
EPN2020-RI
EUROPLANET2020 Research Infrastructure
H2020-INFRAIA-2014-2015
Grant agreement no: 654208
Start date of project: 01 September 2015
Duration: 48 Months
Responsible WP Leader: Stéphane Erard
Project co-funded by the European Union's Horizon 2020 research and innovation programme | ||
Dissemination level | ||
PU | Public | |
PP | Restricted to other programme participants (including the Commission Service) | |
RE | Restricted to a group specified by the consortium (including the Commission Services) | |
CO | Confidential, only for members of the consortium (excluding the Commission Services) |
Project Number | 654208 |
Project Title | EPN2020 - RI |
Project Duration | 48 months: 01 September 2015 – 30 August 2019 |
Dissemination level | PU |
This document describes how to set up an EPN-TAP service using an intermediate metadata table stored in CSV format. The CSV file is placed on the DaCHS server and then imported into the internal DaCHS database, as described in the resource descriptor of your test service. The resource descriptor uses a csvGrammar to load your metadata from the CSV file into the EPNcore parameters shared through EPN-TAP.
This tutorial assumes that you have a running DaCHS server (see EPN-TAP Server Installation for VESPA Data Provider Tutorial).
In this tutorial the EPNcore metadata are stored in a CSV (Comma Separated Values) file. This file must contain a header row with the column names. Each following row contains the metadata of one data product that you distribute. It is recommended to use the actual EPNcore keywords as column names and to prepare your metadata directly in the correct form and units. However, some editing can be done in the resource descriptor, as demonstrated in this tutorial.
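As a minimal sketch of the layout described above, the following Python snippet parses a small CSV text with a header row and shows how each following row becomes the metadata of one data product. The sample values (URLs, sizes) are invented for illustration and are not taken from apis_test.csv; only the column names are EPNcore keywords used in this tutorial.

```python
import csv
import io

# Hypothetical sample standing in for metadata.csv; the values are invented.
SAMPLE_CSV = """access_url,access_format,access_estsize
http://example.org/data/product1.fits,application/fits,1024
http://example.org/data/product2.fits,application/fits,2048
"""


def read_metadata(text):
    """Return (column_names, rows); each row is a dict keyed by the header names."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    return reader.fieldnames, rows


columns, rows = read_metadata(SAMPLE_CSV)
print(columns)
for row in rows:
    # One row per data product; all values are read as strings.
    print(row["access_url"], row["access_estsize"])
```

To inspect a real file, the same function could be given the content of metadata.csv instead of the sample text.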
The example metadata catalog is extracted from the APIS (Auroral Planetary Imaging and Spectroscopy) service. Six data products have been selected from this database. The metadata CSV file has been prepared and is available for download: apis_test.csv. This file contains several columns, but not all the EPNcore keywords are present: only the columns that will be filled in the database (not left with NULL values) are included. We present here the various columns:
You may have other columns for your specific case. Before going to the next step, identify which optional columns (see EPNcore v2 specification) you need and write down the list. In the case of the example based on APIS data, the optional columns are:
    target_region access_url access_format access_estsize publisher thumbnail_url
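Before editing the resource descriptor, it can help to compare the list you wrote down with the header of your CSV file. The following Python sketch does this for the optional columns of the APIS example. The header line used here is hypothetical (two columns are left out on purpose); a real check would read the first line of metadata.csv.

```python
import csv
import io

# Optional columns listed above for the APIS example.
OPTIONAL_COLUMNS = [
    "target_region", "access_url", "access_format",
    "access_estsize", "publisher", "thumbnail_url",
]


def read_header(csv_text):
    """Return the first row of a CSV text as a list of column names."""
    return next(csv.reader(io.StringIO(csv_text)))


def check_optional_columns(header, optional=OPTIONAL_COLUMNS):
    """Split the expected optional columns into those found in the header and those missing."""
    present = [name for name in optional if name in header]
    missing = [name for name in optional if name not in header]
    return present, missing


# Hypothetical header line, for illustration only.
header = read_header("target_region,access_url,access_format,publisher\n")
present, missing = check_optional_columns(header)
print("present:", present)
print("missing:", missing)
```

Columns of the header that are not in the optional list are either mandatory EPNcore columns or extra columns; this sketch does not try to tell them apart.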
For extra columns that are not in the current list of keywords (neither in the mandatory nor in the optional section), you will have to declare them manually in the resource descriptor. For this step you need to define the following elements for each extra column:
We would also like to be included in discussions about new columns, to make sure that no planned extension of EPNcore conflicts with your proposed extra keywords.
The service is configured in a file called a resource descriptor, usually named "q.rd" in DaCHS. The service must be prepared as follows.
The first step is to decide on the short name of your service. In this tutorial, we opt for "test". The next step is to create the service directory in DaCHS.
Log on to the DaCHS server with your own user account. To set up the service properly, you have to switch to the dachsroot user. You can do this from your user account by issuing the following two commands:
    sudo -s
    su dachsroot
The first command switches to the root account (your user password may be required), and the second switches to dachsroot.
Logged in as dachsroot, create a new directory for your service in the /var/gavo/inputs directory:
    mkdir /var/gavo/inputs/test
The name of the directory must be the short name of your service. In that directory, you have to create a data/ subdirectory. You can then place your CSV metadata file in that subdirectory:
    mkdir /var/gavo/inputs/test/data
    cd /var/gavo/inputs/test/data
    wget -O apis_test.csv "https://voparis-confluence.obspm.fr/download/attachments/10944992/apis_test.csv?version=3&api=v2&download=true"
    mv apis_test.csv metadata.csv
The URL must be quoted, otherwise the shell interprets the "&" characters, and the -O option makes wget save the file under the expected name. The last command renames the example CSV file to metadata.csv. If you downloaded the example file to your user home directory instead, copy it from there into the data/ subdirectory under that name.
The resource descriptor file q.rd fully describes your service. It contains metadata about the service, which are used to fill out the registry record of your service. It also tells DaCHS how to build the table in the internal database. Finally, it tells DaCHS how to map the internal database to the EPN-TAP interface. The example resource descriptor file q.rd has to be placed in the /var/gavo/inputs/test/ directory:
    cd /var/gavo/inputs/test
    wget -O q.rd "https://voparis-confluence.obspm.fr/download/attachments/10944992/q.rd?version=1&api=v2&download=true"
The resource descriptor can be validated:
    cd /var/gavo/inputs/test
    gavo val q.rd
If the file is valid, no errors are reported and a short message states "q -- OK". You can then test the import in "dump" mode:
    cd /var/gavo/inputs/test
    gavo imp -d q.rd
This produces a lot of output: each row of the CSV file (your metadata source) is interpreted and the result is displayed. The real import step is done with the "modify" mode:
    cd /var/gavo/inputs/test
    gavo imp -m q.rd
At this stage, your service should be up and running.
You can test your service from the web interface of your DaCHS instance (http://127.0.0.1:8000/adql in the case of the local installation tutorial) by going to the ADQL query page and issuing:
    select * from test.epn_core
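The same query can also be sent programmatically. The Python sketch below only builds the request URL of a synchronous TAP query, using the standard TAP parameters (REQUEST, LANG, QUERY, FORMAT). It assumes that the TAP endpoint of the local server is http://127.0.0.1:8000/tap, which is not stated in this tutorial; adjust the base URL to your installation.

```python
from urllib.parse import urlencode

# Assumption: TAP endpoint of the local DaCHS instance; adjust as needed.
TAP_BASE_URL = "http://127.0.0.1:8000/tap"


def build_sync_query_url(adql, base_url=TAP_BASE_URL, fmt="votable"):
    """Build the URL of a synchronous TAP query from an ADQL string."""
    params = {
        "REQUEST": "doQuery",
        "LANG": "ADQL",
        "QUERY": adql,
        "FORMAT": fmt,
    }
    # urlencode takes care of escaping spaces and special characters.
    return base_url.rstrip("/") + "/sync?" + urlencode(params)


url = build_sync_query_url("select * from test.epn_core")
print(url)
```

With the server running, the resulting URL can be opened in a browser or fetched with urllib.request.urlopen(url); the response should be a VOTable containing the rows imported from the CSV file.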
You can also access your service from the VESPA main query interface (using the "custom resource" tab), provided the service is available on the internet, or from a local instance of that interface if one is available to you.