Semantic Web Science
  Jim Hendler
  Tetherless World Professor of Computer and Cognitive Science
  Assistant Dean of Information Technology and Web Science
  Rensselaer Polytechnic Institute
  http://www.cs.rpi.edu/~hendler
  @jahendler (twitter)
Following Lazlo
Punchline
  - Semantic Web is real
  - Growing at a fast pace
  - Producing lots of interesting networks
  - That no one is really analyzing from a network science perspective
  - Which could hugely help those of us trying to use this for some really hard real-world problems
  - For example, open govt data
Sem Web 2010 4/2010
Semantic Web 2010 7/2010
Semantic Web 2010 11/2010
Sem Web 2010 7/2010
Sem Web 2010 8/2010
Sem Web 2010: What is different now?
  - Semantics in Search
    - Advertising drives Web markets
  - “Buzz” around data on the Web
    - Facebook OGP, Open Govt Data, …
  - Maturation of RDF technologies
    - SPARQL endpoints
    - RDFa
  - Lightweight Knowledge
    - A little semantics goes a long way
Friend of a Friend (our former favorite example)
  - FOAF: >60M FOAF people (not necessarily distinct individuals) in hundreds of applications, touched by a large community (>100,000,000 users)
  - Used by a number of large providers
    - If you use LiveJournal, you have a FOAF file
    - Also flickr, ecademy, tribe, joost, …
  - And you can export FOAF from Facebook and many other social networking sites
The FOAF network has been explored as a social network per se
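As a rough sketch of what treating FOAF as a social network can look like, the snippet below takes a handful of statements written as plain (subject, predicate, object) tuples and derives each person's degree. Only the foaf:knows property URI comes from the FOAF vocabulary; the example.org people are invented for illustration, and treating foaf:knows as an undirected tie is an assumption of this sketch.

```python
from collections import defaultdict

FOAF_KNOWS = "http://xmlns.com/foaf/0.1/knows"

# Hypothetical triples; the example.org people are made up for illustration.
TRIPLES = [
    ("http://example.org/alice", FOAF_KNOWS, "http://example.org/bob"),
    ("http://example.org/alice", FOAF_KNOWS, "http://example.org/carol"),
    ("http://example.org/bob", FOAF_KNOWS, "http://example.org/carol"),
    ("http://example.org/alice", "http://xmlns.com/foaf/0.1/name", "Alice"),
]


def knows_graph(triples):
    """Build an adjacency map from foaf:knows triples, ignoring all others.

    Assumption: each foaf:knows statement is treated as an undirected tie.
    """
    graph = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == FOAF_KNOWS:
            graph[subject].add(obj)
            graph[obj].add(subject)
    return graph


def degrees(graph):
    """Return the number of distinct neighbours for every person."""
    return {person: len(friends) for person, friends in graph.items()}


GRAPH = knows_graph(TRIPLES)
DEGREES = degrees(GRAPH)
```

Once the data is in this adjacency form, standard network measures (degree distribution, components, clustering) can be computed with ordinary graph tooling.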
FOAF is complicated compared to OGP (Facebook’s Open Graph Protocol)
  - og:title - The title of your object as it should appear within the graph, e.g., "The Rock".
  - og:type - The type of your object, e.g., "movie". Depending on the type you specify, other properties may also be required.
  - og:image - An image URL which should represent your object within the graph.
  - og:url - The canonical URL of your object that will be used as its permanent ID in the graph.
  - og:description - A one to two sentence description of your object.
  - og:site_name - If your object is part of a larger web site, the name which should be displayed for the overall site, e.g., "IMDb".
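To make the properties above concrete, here is a minimal sketch that pulls og: properties out of a page's meta tags using only the Python standard library. The sample page and its example.org URL are hypothetical; the parser class and function names are this sketch's own, not part of any OGP tooling.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        if prop.startswith("og:") and "content" in attr:
            # Keep the first value seen for each property.
            self.properties.setdefault(prop, attr["content"])


# Hypothetical page; the URL is a placeholder, not a real movie page.
SAMPLE_PAGE = """
<html><head>
<title>The Rock (1996)</title>
<meta property="og:title" content="The Rock" />
<meta property="og:type" content="movie" />
<meta property="og:url" content="http://example.org/movies/the-rock" />
<meta property="og:site_name" content="IMDb" />
</head><body></body></html>
"""


def extract_open_graph(html):
    """Return a dict of og: properties found in the given HTML text."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


OG = extract_open_graph(SAMPLE_PAGE)
```

Each page then contributes one node (identified by og:url) with a type label, which is what makes the resulting network labeled.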
OGP use growing quickly
  - Facebook is incentivizing use of RDFa via "Like" buttons
  - 15,178 sites of the top 1,000,000 as of 3/3/11
OGP creates a fast-growing, multiply-labeled network
  - FB reports ~10-15% of >3,000,000 likes per day!
Important Real World Use Case: Government Data Sharing
  - January 1, 2009 “Openness will strengthen our democracy and promote efficiency and effectiveness in Government.” --- President Obama
  - Putting Govt Data online
  - [Timeline labels, 2009 to 2010 …: Data.gov.uk beta; May 21, 2009; January 19, 2010; data.gov.uk online; May 21, 2010; data.gov online; data.gov relaunch with semantic web featured; June 30, 2009; December 8, 2009; “Open Government Directive” released]
  - [Data set counts along the timeline: 57 Data Sets; ~6000 Data Sets; ~2000 Data Sets; >305,000 Data Sets]
Government Mashups and Applications See more than 50 of these at http://logd.tw.rpi.edu
Linking GDP of the US and China
  - GDP of China (Billion Chinese Yuan)
  - GDP of the US (Billion Dollar)
  - [Temporal Mashup] bea.gov + federalreserve.gov + stats.gov.cn
  - This mashup was built in less than 8 hours, including conversion of data, web interface, and visualization!
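The core of a temporal mashup like this is a join on the shared time key. The sketch below illustrates that step only; it is not the code behind the demo. The two series hold placeholder numbers, NOT real GDP figures, and the function names are invented for this example.

```python
# Placeholder series keyed by year. These are NOT real GDP figures; they
# only stand in for data converted from the different source sites.
US_GDP_BILLION_USD = {2005: 100.0, 2006: 110.0, 2007: 120.0}
CHINA_GDP_BILLION_CNY = {2006: 200.0, 2007: 240.0, 2008: 280.0}


def join_by_year(left, right):
    """Inner-join two year-indexed series, returning rows sorted by year."""
    shared_years = sorted(set(left) & set(right))
    return [(year, left[year], right[year]) for year in shared_years]


def growth_index(rows, column):
    """Rescale one column so that its first shared year equals 100."""
    base = rows[0][column]
    return [row[column] * 100 / base for row in rows]


ROWS = join_by_year(US_GDP_BILLION_USD, CHINA_GDP_BILLION_CNY)
US_INDEX = growth_index(ROWS, 1)
CHINA_INDEX = growth_index(ROWS, 2)
```

Indexing both columns to a common base year is one way to plot series in different currencies on a single axis without choosing an exchange rate.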
Mashups allow comparisons that single data sets cannot
  - Trends in Smoking Prevalence, Tobacco Policy Coverage and Tobacco Prices (1991-2007)
  - Extensible Mashups via Linked Data
    - Diverse datasets from NIH
    - Potentially linking to “unemployment rate”
  - Accountable Mashups via Provenance
    - Annotate datasets used in demos
    - Feed users’ comments back to the gov contact (e.g. %)
Govt data linked to Social Media Metadata
There is a lot of workflow information in the mix
  - [Diagram labels: derive, create, revision; Convert, Access, Enhance, Version, SemDiff]
A Web Science Challenge: How can we search for data?
Effective open govt requires exploiting the linked open govt network
  - http://linkeddata.org/
  - Government data is currently about ½ the cloud in size (~15B triples), with tens of thousands of links to other data (within and without)
Linked Open Data on the Web
  - Linked Open Data: over 23B triples
Linked open data network
  - Linked Open Data: over 23B triples in a sparsely connected graph of highly connected graphs (and we know very little about the properties of most of these, let alone of the whole)
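One first step toward characterizing "a sparsely connected graph of highly connected graphs" is to count links inside each dataset separately from links that cross dataset boundaries. The sketch below does this for a toy link list. The URIs are hypothetical, and using the host name as the dataset identifier is an assumption that will not hold for every real dataset.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical links between resources in two made-up datasets.
LINKS = [
    ("http://data-a.example/r/1", "http://data-a.example/r/2"),
    ("http://data-a.example/r/2", "http://data-a.example/r/3"),
    ("http://data-a.example/r/1", "http://data-a.example/r/3"),
    ("http://data-b.example/r/1", "http://data-b.example/r/2"),
    ("http://data-b.example/r/2", "http://data-b.example/r/3"),
    ("http://data-a.example/r/1", "http://data-b.example/r/1"),
]


def dataset_of(uri):
    """Assumption: a resource's host name identifies its dataset."""
    return urlparse(uri).netloc


def link_profile(links):
    """Count links within each dataset and links between dataset pairs."""
    internal = Counter()
    cross = Counter()
    for source, target in links:
        a, b = dataset_of(source), dataset_of(target)
        if a == b:
            internal[a] += 1
        else:
            # Sort the pair so A->B and B->A land in the same bucket.
            cross[tuple(sorted((a, b)))] += 1
    return internal, cross


INTERNAL, CROSS = link_profile(LINKS)
```

The cross-dataset counter is itself a weighted graph over datasets, which is the coarse-grained network whose properties are largely unmeasured.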
Linked open data network
  - The good news: Web accessible, machine readable, anonymized and
Linked open data network
  - Why is this hard?
    - Doubling in size every 10 months
    - Very varied “authorities”
    - Many different kinds of linking used (same URI, (sort of) same by assertion, (sort of) same by inference, transitive closures, …)
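To illustrate the "transitive closure" kind of linking, here is a minimal sketch that groups URIs connected by asserted equivalence (for example owl:sameAs statements) using a union-find structure. The URIs are hypothetical. Because such assertions are often only "(sort of) same", a blind closure like this can merge things that should stay apart, so treat the clusters as candidates rather than facts.

```python
def same_as_clusters(pairs):
    """Group URIs into equivalence clusters via union-find.

    `pairs` is an iterable of (uri_a, uri_b) equivalence assertions.
    Returns a list of sorted clusters, ordered by their first URI.
    """
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


# Hypothetical equivalence assertions across three made-up datasets.
SAME_AS = [
    ("http://a.example/nyc", "http://b.example/new_york"),
    ("http://b.example/new_york", "http://c.example/NewYorkCity"),
    ("http://a.example/albany", "http://b.example/albany_ny"),
]

CLUSTERS = same_as_clusters(SAME_AS)
```

Here the first and third datasets never link to each other directly, yet their New York resources end up in one cluster through the second dataset, which is exactly the effect that makes closure both useful and risky.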
A new buzzword…
  - Web 3.0 extends current Web applications using Semantic Web technologies (esp. semantic and real-time search) and graph-based, open data.
  - [Diagram labels: Linked Data (RDF, SPARQL); Semantic Web (RDFS, OWL); Web 3.0; Web 2.0; Web (REST API)]
And a new commercial motivator
  - Web: Google. Underlying theory: exploit the Web graph
  - Web 2.0: Facebook, YouTube, Twitter… Underlying theory: exploit the Social network
  - Web 3.0: (Your company here). Underlying theory: exploit the “semantics” in all these graphs
Punchline: Web and Network Science Challenge
  - Semantic Web is real
  - Growing at a fast pace
  - Producing lots of interesting networks
  - That no one is really analyzing from a network science perspective
  - Which could hugely help those of us trying to use this for some really hard real-world problems
  - For example, open govt data
Questions? New edition includes OGP, Data.gov, …
