Horizon 2020 and the Open
Research Data pilot
Sarah Jones
Digital Curation Centre, Glasgow
sarah.jones@glasgow.ac.uk
Twitter: @sjDCC
Horizon 2020, Open Data and Data Management Plans, Trinity College Dublin, 19 October 2016
The EC Open Research Data pilot
Key sources of information
• Guidelines on Open Access to Scientific Publications and Research Data in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
• Guidelines on Data Management in Horizon 2020
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
• Annotated model grant agreement, clause 29.3
http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/amga/h2020-amga_en.pdf
• New infographic summarising key policy points
http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf
The following European Commission branded slides come from the
EC’s open access team and provide an overview of the key points.
Content from Jean-Francois Dechamp and colleagues.
Mail: RTD-open-access@ec.europa.eu
Web: http://ec.europa.eu/research/openscience/index.cfm
Twitter: @OpenAccessEC
Requirements in a nutshell
• Develop a DMP
• Select which data to make open
• License data openly for the widest reuse
• Use established community standards for interoperability
• Provide metadata for data discovery and reuse
• Deposit in a data repository
• Share details about the tools and instruments used to allow verification
More than just open data
CC-BY Andreas Neuhold
https://commons.wikimedia.org/wiki/File:Open_Science_-_Prinzipien.png
What is a data management plan?
A brief plan written at the start of a project to define:
• how the data will be created
• how it will be documented
• who will access it
• where it will be stored
• who will back it up
• whether (and how) it will be shared & preserved
DMPs are often submitted as part of grant applications, but are
useful whenever researchers are creating data.
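The six questions above can double as a completeness checklist for a draft plan. A minimal sketch in Python, assuming a DMP is held as a simple dictionary of free-text answers; the section keys and the `missing_sections` helper are hypothetical names chosen for illustration and do not reflect DMPonline or any funder template.

```python
# Hypothetical section keys, one per question on the slide above.
DMP_SECTIONS = [
    "creation",              # how the data will be created
    "documentation",         # how it will be documented
    "access",                # who will access it
    "storage",               # where it will be stored
    "backup",                # who will back it up
    "sharing_preservation",  # whether (and how) it will be shared & preserved
]


def missing_sections(dmp):
    """Return the checklist sections that are absent or left blank."""
    return [key for key in DMP_SECTIONS if not dmp.get(key, "").strip()]


# Illustrative draft only; the answers are placeholders.
draft = {
    "creation": "Sensor readings exported as CSV.",
    "storage": "University research data store.",
    "backup": "  ",
}
print(missing_sections(draft))
```

A blank answer counts as missing, so a template that has been opened but not filled in is still flagged.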
A FAIR approach to DMPs
Findable
– Assign persistent IDs, provide metadata, register in a searchable resource...
Accessible
– Retrievable by their ID using a standard protocol, metadata remain
accessible even if data aren’t...
Interoperable
– Use formal, broadly applicable languages, use standard vocabularies,
qualified references...
Reusable
– Rich metadata, clear licences, provenance, use of community standards...
www.force11.org/group/fairgroup/fairprinciples
DMPonline
A web-based tool to help researchers write DMPs
Includes a template for Horizon 2020
https://dmponline.dcc.ac.uk
How the tool works
• Click to write a generic DMP
• Or choose your funder to get their specific template
• Pick your uni to add local guidance and to get their template if no funder applies
• Choose any additional optional guidance
DCC support on DMPs
• Webinars and training materials
• How-to guides and other advisory documents
• Checklist on what to cover in DMPs
• Example DMPs
• DMPonline
www.dcc.ac.uk/resources/data-management-plans
Example H2020 DMPs in Zenodo
Helix Nebula – High Energy Physics example
https://zenodo.org/record/48171#.WATexnriF40
Tweether – engineering (micro-electronics) example
https://zenodo.org/record/55791#.WATei3riF40
AutoPost – ICT example
https://zenodo.org/record/56107#.WATefXriF40
License research data openly
This DCC guide outlines the pros and cons of each approach and gives practical advice on how to implement your licence
www.dcc.ac.uk/resources/how-guides/license-research-data
CREATIVE COMMONS LIMITATIONS
NC Non-Commercial: what counts as commercial?
ND No Derivatives: severely restricts use
Licences carrying these clauses are not open licences
Horizon 2020 Open Access guidelines point to: CC-0 or CC-BY
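The rule of thumb on this slide, that NC and ND clauses take a licence out of the open category, can be expressed as a small check. A minimal sketch in Python, assuming licences are written as Creative Commons short codes such as `CC-BY-NC`; the `is_open_licence` helper is a hypothetical name and is not part of the EUDAT tool or any other licensing service.

```python
# Clauses the slide above flags as making a licence non-open.
RESTRICTIVE_CLAUSES = {"NC", "ND"}


def is_open_licence(code):
    """Return True if a CC short code carries no NC or ND clause."""
    # Normalise case and the common "CC0" spelling before splitting.
    parts = code.upper().replace("CC0", "CC-0").split("-")
    if parts[0] != "CC":
        raise ValueError("not a Creative Commons short code: " + code)
    return not RESTRICTIVE_CLAUSES.intersection(parts[1:])


for code in ["CC-0", "CC-BY", "CC-BY-NC", "CC-BY-ND", "CC-BY-NC-ND"]:
    print(code, "open" if is_open_licence(code) else "not open")
```

The check looks only at the clause codes, so it is no substitute for reading the licence text or working through a licence selector.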
EUDAT licensing tool
Answer questions to determine which licence(s) are
appropriate to use
http://ufal.github.io/lindat-license-selector
Deposit in a data repository
http://databib.org
http://service.re3data.org/search
The EC guidelines point to Re3data as one of the registries that
can be searched to find a home for data
Searching with Re3data.org
www.fosteropenscience.eu/content/re3data-demo
How to select a repository?
• Look for provision from your community, university, publisher, funder etc
• Check they match your particular data needs: e.g. formats accepted; mixture
of Open and Restricted Access.
• See if they provide guidance on how to cite the deposited data.
• Do they assign a persistent & globally unique identifier for sustainable
citations and to link back to particular researchers and grants?
• Look for certification as a ‘Trustworthy Digital Repository’ with an explicit
ambition to keep the data available in the long term.
www.openaire.eu/opendatapilot-repository
Zenodo is a multi-disciplinary repository that can be
used for the long-tail of research data
• An OpenAIRE-CERN joint effort
• Multidisciplinary repository accepting
– Multiple data types
– Publications
– Software
• Assigns a Digital Object Identifier (DOI)
• Links funding, publications, data & software
www.zenodo.org
Use metadata standards
Metadata Standards Directory
Broad, disciplinary listing of standards and tools. Maintained by an RDA group
http://rd-alliance.github.io/metadata-directory
Biosharing
A portal of data standards,
databases, and policies
Focused on life, environmental
and biomedical sciences
https://biosharing.org
Choose appropriate file formats
If you want your data to be re-used and sustainable in the long-term, you
typically want to opt for open, non-proprietary formats.
Type | Recommended | Avoid for data sharing
Tabular data | CSV, TSV, SPSS portable | Excel
Text | Plain text, HTML, RTF; PDF/A only if layout matters | Word
Media | Container: MP4, Ogg; Codec: Theora, Dirac, FLAC | Quicktime, H264
Images | TIFF, JPEG2000, PNG | GIF, JPG
Structured data | XML, RDF | RDBMS
Further examples:
www.data-archive.ac.uk/create-manage/format/formats-table
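The "avoid" column above lends itself to a quick pre-deposit check on file names. A minimal sketch in Python, assuming formats can be recognised by file extension; the extension mapping and the `format_advice` helper are illustrative assumptions, and codec-level cases such as H264 are not covered because they cannot be detected from a file name.

```python
from pathlib import Path

# Assumed extensions for formats in the "Avoid for data sharing" column,
# mapped to the alternatives recommended in the same row of the table.
AVOID = {
    ".xls": "CSV or TSV",
    ".xlsx": "CSV or TSV",
    ".doc": "plain text, HTML or RTF",
    ".docx": "plain text, HTML or RTF",
    ".mov": "MP4 or Ogg",
    ".gif": "TIFF, JPEG2000 or PNG",
    ".jpg": "TIFF, JPEG2000 or PNG",
    ".jpeg": "TIFF, JPEG2000 or PNG",
}


def format_advice(filename):
    """Return a suggested alternative format, or None if no warning applies."""
    return AVOID.get(Path(filename).suffix.lower())


for name in ["results.XLSX", "readings.csv", "interview.mov"]:
    print(name, "->", format_advice(name) or "no warning")
```

A result of `None` only means the extension is not on the avoid list; it does not confirm that the format is a recommended one.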
Managing and sharing data:
a best practice guide
http://data-archive.ac.uk/media/2894/managingsharing.pdf
FOSTER
Facilitate Open Science Training for European Research
• Network of open access trainers
• Programme of open science courses
• Portal to training materials
• E-learning courses on open access, open data and open
science forthcoming
www.fosteropenscience.eu
OpenAIRE
http://vimeo.com/108790101
Open Access Infrastructure for research in Europe
• aggregates data on OA publications
• mines & enriches its content by linking things together
• provides services & APIs, e.g. to generate publication lists
www.openaire.eu
EUDAT services
EUDAT offers a pan-European solution, providing a generic set of services to ensure a minimum level of interoperability
Building common data services in close collaboration with 25+ communities
www.eudat.eu
Discipline-specific infrastructure
Plan to share data from the outset
Negotiations on licences and consent agreements may
preclude later sharing if you are not careful
Costings can’t be included retrospectively
It is useful to consider data issues at the consortium
negotiation stage to make sure potential issues are
identified and resolved as soon as possible
Decisions made early on affect what you can do later
Thanks for listening
DCC resources on Data Management
www.dcc.ac.uk/resources
Follow us on Twitter:
@digitalcuration and #ukdcc

Editor's Notes

  • #21 From the start, the DCC has offered guidance, independent of funder or discipline. EUDAT and OpenAIRE and others are developing extra guidance as well.
  • #24 Guidance from the DCC can also help researchers to understand data licensing. This guide outlines the pros and cons of each approach, e.g. the limitations of some CC options. The OA guidelines under Horizon 2020 point to CC-0 or CC-BY as a straightforward and effective way to make it possible for others to mine, exploit and reproduce the data. See p11 at: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
  • #34 OpenAIRE is also worth checking out. This is an EC-funded project to provide infrastructure for open access. They’ve recently released a short video that tells you how they can help. Essentially OpenAIRE aggregates metadata from different repositories to compile a complete list of publications and related outputs. They mine and enrich the content, de-duplicating entries and linking together publications with data, details about the project, authors, funders etc. OpenAIRE also provides a number of useful services & APIs, for example you can embed a publication list for your project in your website that is automatically updated whenever someone adds a new paper to a repository (this is harvested into OpenAIRE and pushed out to your list).
  • #35 All share common challenges: reference models and architectures; persistent data identifiers; metadata management; distributed data sources; data interoperability.