Linked Data Introduction with Examples
Linked Data
Introduction


“I’m encouraged by what actually can be done to improve
the research and commercial utility of information.”

Here’s an intro to why …


Kerstin Forsberg
AstraZeneca R&D, Sweden
Web of Documents
Web 3.0
Web of (Linked) Data

An Intro To The Semantic Web: Why You Need To Know About It Sooner Than Later, by Samantha Wong
Image Source: Frederic Martin
    Linked Data Introduction
2
Linking Open Data (LOD) cloud
Two Forerunners: UK and US Government




http://data.gov/
http://data.gov.uk/

The Linking Open Data cloud diagram
Linking Open Data (LOD) cloud
Linked Data ClinicalTrials.gov

Global Identifier (URI) for an AZ study
http://data.linkedct.org/resource/trial/NCT00755378
48 RDF Triples
4 Principles for Linked Data …
… and 5 stars for Linked Open Data
    1. Use URIs (Uniform Resource Identifiers) as names for things.
    2. Use HTTP URIs so that people can look up (dereference) those names.
    3. When someone looks up a URI, provide useful information.
    4. Include links to other URIs so that they can discover more things.
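
Principles 2 and 3 come down to an HTTP GET on the URI, asking for a machine-readable description. A minimal Python sketch, assuming the server honors an Accept header for RDF serializations (the slides do not confirm this); the helper name build_lookup_request is hypothetical, and the request is only built here, not sent.

```python
from urllib.request import Request

# Standard RDF media types; whether a given server supports them is an assumption.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_lookup_request(uri: str) -> Request:
    """Build (but do not send) an HTTP GET that asks for RDF about `uri`."""
    if not uri.startswith(("http://", "https://")):
        # Principle 2: only HTTP URIs can be looked up this way.
        raise ValueError("not an HTTP URI: " + uri)
    return Request(uri, headers={"Accept": RDF_ACCEPT}, method="GET")


# The study URI shown earlier in the deck.
req = build_lookup_request("http://data.linkedct.org/resource/trial/NCT00755378")
print(req.get_method(), req.full_url)
print("Accept:", req.get_header("Accept"))
```

Passing req to urllib.request.urlopen would perform the actual lookup; what comes back depends on the server.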




Source: Linked Open Data star scheme by example
More resources introducing and describing the Linked Data idea
Linked Enterprise Data




Source: What does Open Data mean for Enterprises?
More resources introducing and describing the Linked Data idea
“Linked R&D Data”
Examples of Building Blocks

    • Global identifier (URI) scheme for AZ entities (e.g. people, projects, studies, drugs)
    • Recommended biomedical ontologies and basic vocabularies to provide context (semantic and provenance) to data
    • Dataset Catalogue

From the AZ RDI report “Persistent URIs and Linked Data”, ordered by Mike Westaway
I’m encouraged by …


• … what actually can be done by applying Linked Data principles, together with a stepwise implementation and pragmatic application of crucial building blocks, to …
• … improve the research and commercial utility of information
   • Organized for associations
   • Prepared for not yet defined use
   • Ready for automation where computers can function alongside us to
      • Mitigate the complexity in discovering, accessing, connecting and interpreting information
      • Improve the productivity in managing information

Health Care and Life Sciences (HCLS) Interest Group
Linking Open Drug Data

EU project The Large Knowledge Collider
Linked Life Data

A 2-page summary of our learnings from participating in these external projects: Linked Data in Pharma, 2011, Bo Andersson and Kerstin Forsberg
Extras

• One example
   - Spending Data in UK

• Two things to remember
   - RDF Triples
   - Global Identifiers (URIs)

• Scenarios
   • Linked Clinical Study Metadata
   • Linked Patient Data in a Clinical Study




    Linked Data Introduction (Extras)
9
One example
Spending Data in UK

From the Linking Open Data cloud

A Payment from Lichfield District Council, one local authority in the UK.
Globally identified by a URI
http://spending.lichfielddc.gov.uk/spend/8605670

Linked Spending Data – How and Why Bother
                    Kerstin Forsberg CDISC Interchange Europe 2011   eHR and the World Beyond
Two things to remember
RDF Triples
RDF = Resource Description Framework

A triple:  subject | predicate | object

Example from a textbook

    The sky                      | has the color | blue

Example from the Spending data example in UK

    Payment number 8605670       | Net Amount    | 120.00
    Payment number 8605670       | Type          | Expenditure Line
    Payment number 8605670       | Payer         | Lichfield District Council
    Lichfield District Council   | Type          | Local Authority

Triples for the standards that provide the semantics

    The property Net Amount      | comment       | “The net amount of the payment. This is the effective cost to the payer after any reclaimable tax has been deducted.”
    The class Expenditure Line   | subclass of   | Observation in a multi-dimensional data cube for statistics
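
The spending triples above can be held in a few lines of Python. This is a minimal sketch that uses the slide's plain labels instead of URIs; the Triple type and the match helper are hypothetical names for illustration, not part of any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    object: str


# The example triples from the slide, as plain labels rather than URIs.
TRIPLES = [
    Triple("Payment number 8605670", "Net Amount", "120.00"),
    Triple("Payment number 8605670", "Type", "Expenditure Line"),
    Triple("Payment number 8605670", "Payer", "Lichfield District Council"),
    Triple("Lichfield District Council", "Type", "Local Authority"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


# Follow a link: find the payer, then ask what type of thing the payer is.
payer = match(TRIPLES, "Payment number 8605670", "Payer")[0].object
payer_type = match(TRIPLES, payer, "Type")[0].object
print(payer, "is a", payer_type)
```

Because the object of one triple can be the subject of another, a lookup can hop from the payment to the payer to the payer's type.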
Two things to remember
Global Identifiers
URI = Uniform Resource Identifier

Examples of Identifiers for “things” in Spending data example

    http://spending.lichfielddc.gov.uk/spend/8605670
    http://statistics.data.gov.uk/id/local-authority/41UD

Examples of Identifiers for “types of things” / “standards for things”

    http://statistics.data.gov.uk/def/administrative-geography/LocalAuthority
    http://reference.data.gov.uk/def/payment#ExpenditureLine
    http://reference.data.gov.uk/def/payment#netAmount
    http://purl.org/linked-data/cube#Observation
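
The identifiers above follow two common patterns: the local name comes after the last "/" or after a "#". A small Python sketch, assuming that simple convention holds for these URIs; split_uri is a hypothetical helper, not a standard function.

```python
from collections import defaultdict
from typing import Tuple


def split_uri(uri: str) -> Tuple[str, str]:
    """Split a URI into (namespace, local name) at the last '#' or '/'."""
    cut = max(uri.rfind("#"), uri.rfind("/"))
    return uri[: cut + 1], uri[cut + 1:]


URIS = [
    "http://spending.lichfielddc.gov.uk/spend/8605670",
    "http://statistics.data.gov.uk/id/local-authority/41UD",
    "http://reference.data.gov.uk/def/payment#ExpenditureLine",
    "http://reference.data.gov.uk/def/payment#netAmount",
    "http://purl.org/linked-data/cube#Observation",
]

# Group local names by namespace to see which terms share a vocabulary.
by_namespace = defaultdict(list)
for uri in URIS:
    namespace, local = split_uri(uri)
    by_namespace[namespace].append(local)

for namespace, names in by_namespace.items():
    print(namespace, names)
```

Grouping this way shows that ExpenditureLine and netAmount come from the same payment vocabulary, while each "thing" has its own identifier.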
One example – “under the hood”

Live view using the Web Data Inspector
2 of 10 RDF Triples
One example – with a top-down approach to standardization of the semantics

The Payment Ontology
Live view using the Web Data Inspector
UK government: Top-down approach to standardization for Spending Data

Statistical Data perspective
Linked Data Cube Vocabulary
Presentation: Statistical Data in RDF
The RDF Data Cube vocabulary

Payment Ontology
Guide to the Payments Ontology
Scenario: Linked Clinical Study Metadata
http://clinical.reference.astrazenenca.com/DI/SIZE#LARGE
Internal categorization to support Design & Interpretation decisions

http://reference.cdisc.org/ct/sdtm/TSPARAMCD#ROUTE

http://clinial.data.astrazeneca.com/id/study/D8180C00011
    owl:sameAs
http://data.linkedct.org/resource/trial/NCT00755378

What would we like to see on a webpage presenting linked data describing a clinical study?
What would we like to see as the internal linked data description of it?
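
An owl:sameAs link says that two URIs name the same thing, so statements made about either one can be read together. A minimal Python sketch of that idea; the two study URIs are copied as spelled on the slide, while the predicates HAS_SIZE and HAS_SOURCE, their example.org namespace, and the attached values are hypothetical and only for illustration.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# URIs as spelled on the slide.
AZ_STUDY = "http://clinial.data.astrazeneca.com/id/study/D8180C00011"
LINKEDCT = "http://data.linkedct.org/resource/trial/NCT00755378"

# Hypothetical predicates and values, for illustration only.
HAS_SIZE = "http://example.org/hasSize"
HAS_SOURCE = "http://example.org/hasSource"

triples = [
    (AZ_STUDY, OWL_SAME_AS, LINKEDCT),
    (AZ_STUDY, HAS_SIZE, "http://clinical.reference.astrazenenca.com/DI/SIZE#LARGE"),
    (LINKEDCT, HAS_SOURCE, "ClinicalTrials.gov"),
]


def aliases(uri, triples):
    """All URIs reachable from `uri` via owl:sameAs, in either direction."""
    seen, todo = {uri}, [uri]
    while todo:
        current = todo.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    todo.append(b)
    return seen


def describe(uri, triples):
    """Non-sameAs statements about `uri` or any of its aliases."""
    names = aliases(uri, triples)
    return [(p, o) for s, p, o in triples if s in names and p != OWL_SAME_AS]


for predicate, value in describe(LINKEDCT, triples):
    print(predicate, value)
```

Asking about either URI returns both the internal categorization and the externally published statement.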
Scenario
Linked Patient Data in a Clinical Study

• If each AZ clinical study had a global identifier/URI (it could be something like http://data.astrazenenca.com/id/clinicalstudy/D8180C00011/), similar to what already exists today for studies in ClinicalTrials.gov, e.g. http://data.linkedct.org/resource/trial/NCT00755378

• If each identified Observation in a clinical study dataset delivered to AZ had a global identifier, e.g. http://data.astrazenenca.com/data/observation/D8180C00011/20000034

• If each individual Observation were linked via an RDF triple to its globally identified test procedure, similar to what exists today for CDISC’s SDTM submission values, e.g. hemoglobin measurement http://linkedlifedata.com/resource/umls/id/C051801

• If each individual Observation were linked via an RDF triple to contextual information, similar to what exists today for CDISC’s SDTM submission values, e.g. http://linkedlifedata.com/resource/umls/id/C0038846 - prefLabel ‘Supine Position’. (Hopefully, CDISC, together with NCI, will in the future publish their standards in a similar way to make it easier to link clinical data.)

• If each identified Observation were also delivered with its provenance information, for example two RDF triples expressing references to the identified measurement device and to the SOP document being used. For more details see the provenance vocabulary http://trdf.sourceforge.net/provenance/ns.htm
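
Taken together, the bullets above describe one Observation as a handful of triples. A minimal Python sketch that writes them in N-Triples syntax; the observation, study and linkedlifedata.com URIs are copied from the bullets, while every predicate (the example.org vocabulary) and the device and SOP URIs are hypothetical placeholders, not terms from the provenance vocabulary.

```python
# Hypothetical vocabulary namespace; real predicates would come from
# agreed ontologies and a provenance vocabulary.
EX = "http://example.org/vocab#"

OBSERVATION = "http://data.astrazenenca.com/data/observation/D8180C00011/20000034"
STUDY = "http://data.astrazenenca.com/id/clinicalstudy/D8180C00011/"


def ntriple(subject: str, predicate: str, obj: str) -> str:
    """Format one triple of URIs as an N-Triples line."""
    return "<{}> <{}> <{}> .".format(subject, predicate, obj)


statements = [
    (OBSERVATION, EX + "partOfStudy", STUDY),
    (OBSERVATION, EX + "testProcedure", "http://linkedlifedata.com/resource/umls/id/C051801"),
    (OBSERVATION, EX + "position", "http://linkedlifedata.com/resource/umls/id/C0038846"),
    # Provenance: two triples pointing at a device and an SOP (placeholder URIs).
    (OBSERVATION, EX + "measuredWith", "http://example.org/device/1"),
    (OBSERVATION, EX + "followedSOP", "http://example.org/sop/1"),
]

lines = [ntriple(*statement) for statement in statements]
print("\n".join(lines))
```

Each line stands on its own, so triples about the same Observation can arrive from different sources and still be combined by matching on the subject URI.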
