The DMP facilitates data management in the VPH-Share project. It gives data providers tools to select the data to be exposed, annotate it semantically and provide it securely to the VPH community by hosting it in the VPH-Share Cloud environment and exposing it via web service, REST and Linked Data interfaces.
The DMP includes client software, the Data Publication Suite (DPS), which extracts, semantically annotates, de-identifies and publishes data to secure, cloud-hosted data storage servers. The suite addresses two problems that institutions face: they often cannot host their data directly on the internet, and patient-identifiable data must be de-identified because of legal, ethical and internal policy restrictions. The storage servers are reached through web services that offer several ways of querying the data, including the standard database query language, SQL, and semantic access through Linked Data and the semantic query language, SPARQL.
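The de-identification step can be illustrated with a minimal sketch. The text does not describe how the DPS performs it, so everything below is an assumption made for illustration: the column names, the choice to drop direct identifiers outright, and the use of a keyed hash (HMAC-SHA256) to turn the patient ID into a stable pseudonym.

```python
import hashlib
import hmac

# Hypothetical column names; a real dataset would define its own.
DIRECT_IDENTIFIERS = {"name", "address"}
PSEUDONYM_FIELD = "patient_id"


def deidentify(record: dict, secret_key: bytes) -> dict:
    """Return a copy of `record` without direct identifiers and with the
    patient ID replaced by a keyed-hash pseudonym."""
    # Drop the columns that identify a person directly.
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    if PSEUDONYM_FIELD in cleaned:
        # The same ID and key always give the same pseudonym, so records of
        # one patient stay linkable without exposing the original ID.
        message = str(cleaned[PSEUDONYM_FIELD]).encode("utf-8")
        digest = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
        cleaned[PSEUDONYM_FIELD] = digest[:16]
    return cleaned


record = {
    "patient_id": 1042,
    "name": "Jane Doe",
    "address": "1 Example St",
    "systolic_bp": 128,
}
published = deidentify(record, secret_key=b"key-kept-inside-the-institution")
print(published)
```

In this sketch the key never leaves the institution, so only the pseudonymised record would be uploaded to the cloud-hosted storage.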
The DMP server uses a ‘Database to RDF Server’ to enable semantic interactions with the data. It bridges the relational database and its RDF representation using a mapping file constructed from the annotations the user makes in the DPS. The DMP incorporates the dataset service provisioning environment (DSE) of the Vienna Cloud Environment, which builds on OGSA-DAI to provision individual and virtual dataset services. The DSE supports exposing relational and semantic data sources and facilitates data mediation technologies as well as distributed query processing. This technology chain makes it possible to link VPH-Share internal data sources with one another and to bring external datasets into the community. VPH-Share can therefore incorporate datasets that already exist on the internet and accommodate new data provided by the VPH community.
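The role of the mapping file can be sketched as follows. This is not the DMP's actual mapping format or server code: the `MAPPING` dictionary, the `http://example.org/vph/` namespace, the table and the property names are all hypothetical, and a production Database to RDF server would answer SPARQL queries on demand rather than export triples in bulk. The sketch only shows how a column-to-property mapping turns relational rows into RDF statements.

```python
import sqlite3

BASE = "http://example.org/vph/"  # hypothetical namespace
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical stand-in for a mapping file derived from user annotations:
# each annotated column is tied to a semantic property.
MAPPING = {
    "table": "measurement",
    "key": "id",
    "class": BASE + "Measurement",
    "columns": {
        "systolic_bp": BASE + "systolicBloodPressure",
        "heart_rate": BASE + "heartRate",
    },
}


def rows_to_triples(conn, mapping):
    """Read every row of the mapped table and emit (subject, predicate, object) triples."""
    column_names = list(mapping["columns"])
    selected = ", ".join([mapping["key"]] + column_names)
    query = "SELECT {} FROM {} ORDER BY {}".format(
        selected, mapping["table"], mapping["key"]
    )
    triples = []
    for row in conn.execute(query):
        # One resource per row, identified by the primary key.
        subject = "{}{}/{}".format(BASE, mapping["table"], row[0])
        triples.append((subject, RDF_TYPE, mapping["class"]))
        for column, value in zip(column_names, row[1:]):
            triples.append((subject, mapping["columns"][column], value))
    return triples


def to_ntriples(triples):
    """Serialise triples as N-Triples lines (plain literals, no datatypes)."""
    lines = []
    for s, p, o in triples:
        is_uri = isinstance(o, str) and o.startswith("http")
        obj = "<{}>".format(o) if is_uri else '"{}"'.format(o)
        lines.append("<{}> <{}> {} .".format(s, p, obj))
    return "\n".join(lines)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement "
    "(id INTEGER PRIMARY KEY, systolic_bp INTEGER, heart_rate INTEGER)"
)
conn.executemany(
    "INSERT INTO measurement VALUES (?, ?, ?)", [(1, 128, 72), (2, 141, 80)]
)
triples = rows_to_triples(conn, MAPPING)
print(to_ntriples(triples))
```

The same value is then reachable in two ways: through SQL against the `measurement` table, or as the object of a triple whose predicate is the annotated property, which is the form a SPARQL query would match.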