Open data for UK public sector organisations
Open data: challenges & opportunities #wmod10 Andrew Mackenzie
Open data: what & why
What is open government data? By “open” we mean open as in the Open (Knowledge) Definition — in essence material (data) is open if it can be freely used, reused and redistributed by anyone. By “government data” we mean data and information produced or commissioned by government or government controlled entities. (OKFN)
Why open government data?
• Transparency. In a well-functioning democratic society citizens need to know what their government is doing. To do that, they must be able freely to access government data and information and to share that information with other citizens. Transparency isn’t just about access; it is also about sharing and reuse. Often, to understand material it needs to be analysed and visualised, and this requires that the material be open so that it can be freely used and reused.
• Releasing social and commercial value. In a digital age, data is a key resource for social and commercial activities. Everything from finding your local post office to building a search engine requires access to data, much of which is created or held by government. By opening up data, government can help drive the creation of innovative businesses and services that deliver social and commercial value.
• Participatory governance. Much of the time citizens are only able to engage with their own governance sporadically, maybe just at an election every 4 or 5 years. By opening up data, citizens are enabled to be much more directly informed and involved in decision-making. This is more than transparency: it’s about making a full “read/write” society, not just about knowing what is happening in the governance process but being able to contribute to it. (OKFN)
Technical standards
• A continuum from publishing low-cost Excel & PDF to linked data, RDF, triplestore.
• ★ make your stuff available on the web (whatever format)
• ★★ make it available as structured data (e.g. Excel instead of an image scan of a table)
• ★★★ non-proprietary format (e.g. CSV instead of Excel)
• ★★★★ use URLs to identify things, so that people can point at your stuff
• ★★★★★ link your data to other people’s data to provide context
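The star scheme above can be read as a simple checklist. The following is a minimal sketch of that reading, not an official scoring tool: the `Dataset` fields, the format lists and the `star_rating` function are all hypothetical names invented for illustration, and the rule that each star requires the ones below it is an assumption drawn from the ordering of the list.

```python
from dataclasses import dataclass

# Assumed, illustrative format lists; a real publisher would maintain its own.
OPEN_FORMATS = {"csv", "json", "xml", "rdf", "ttl"}
STRUCTURED_FORMATS = OPEN_FORMATS | {"xls", "xlsx"}


@dataclass
class Dataset:
    """Hypothetical description of a published dataset."""
    on_web: bool
    file_format: str
    uses_urls_as_identifiers: bool = False
    links_to_other_data: bool = False


def star_rating(ds: Dataset) -> int:
    """Return 0-5 stars, treating each star as requiring the previous one."""
    if not ds.on_web:
        return 0
    fmt = ds.file_format.lower()
    if fmt not in STRUCTURED_FORMATS:
        return 1  # on the web, but e.g. a PDF or scanned image
    if fmt not in OPEN_FORMATS:
        return 2  # structured, but proprietary (e.g. Excel)
    if not ds.uses_urls_as_identifiers:
        return 3  # structured and non-proprietary (e.g. CSV)
    if not ds.links_to_other_data:
        return 4  # things are identified by URLs
    return 5      # linked to other people's data


if __name__ == "__main__":
    print(star_rating(Dataset(on_web=True, file_format="xlsx")))
    print(star_rating(Dataset(on_web=True, file_format="csv")))
```

A checklist like this only captures format and linking; it says nothing about data quality or licensing, which would need separate checks.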
A long history to the idea of the semantic web
• Tim Berners-Lee, Weaving the Web (1999). Since then, lightweight technologies like RSS have become successful. The tools have matured to make the web of data possible, but not yet easy to implement.
• Pragmatically, different organisations have different levels of in-house skill and resource
• There may be no real match between the volume of data they hold and the resources available to publish in open formats
• Should availability of data be determined by the number of staff who can do open data, i.e. by how the organisation prioritises?
So who is going to do the work?
• Mistake to assume that all work will be done by data holders
• Mistake to assume it will be to the timetable of the service provider's choice
• Some providers will need to partner with specialist contractors
• Some providers may find the choice about who will implement and how to implement open data is made for them
Costs & benefits
• It costs something to produce open data
• Outcomes: there has to be a demonstrable benefit. Clearly defined objectives are hard to find
• There will be competition for resources
• How will we assess success or failure, and compare use of resources? We need to evolve a methodology
• The costs are in the public sector
• There are gains for improved public service, through better distribution of knowledge within and between organisations
• There are entirely new opportunities for the private sector to generate value using public sector information (PSI)
Links
• Open Government Data - opengovernmentdata.org
• 5 stars of open data - inkdroid.org—the-5-stars-of-open-linked-data
• JISC Guide to The Semantic Web & Linked Data (PDF) - wiki.cetis.ac.uk—The_Semantic_Web.pdf
• A basic explanation for policy makers, IngridK - localdata.pbworks.com—A-basic-explanation-for-policy-makers
• Public data principles - data.gov.uk—Public_Data_Principles
• 10 principles for linked data in the culture sector - www.collectionstrust.org.uk—linked data
• Rethinking open data - blog.okfn.org—rethinking-open-data-lessons-learned-from-the-open-data-front-lines
• W3C Publishing Open Government Data - www.w3.org—gov-data
Business & non-government users
• Talking about non-government users, which could include the third sector, i.e. the people who previously did not hold PSI
• Mistake to think this is about public sector organisations making "their" data available
• The economic case is based on private sector innovation generating growth (Models of public sector funding, Rufus Pollock et al 2008)
• The value lies in aggregating and analysing data at scale
• So, outside the public sector, who is going to do this?
For the commercial sector, we can identify two groups
• 1. Providers: private companies who provide data services at scale. They convert data, analyse it and sell services themselves. Business information publishers, Reuters.
• 2. Users: smaller media companies, web developers & social enterprises (app store)
So far
• Open data as a movement has been the work of volunteers and committed public sector employees
• Open source model and values are important
• There is a limit to the scale of work which can be produced by these methods. Developers need funding.
• It seems likely the voluntary model will change and providers will become concentrated, with larger organisations entering the space, simply because the scale of investment needed is beyond a start-up and the commercial rewards will follow added value.
• Scale: examples are BBC linked data, Reuters Open Calais
Innovation
• There has been tremendous innovation at smaller scale where public intervention and individual developers meet > London datastore TFL, FOI, bicycle stores, app store
• Much of the innovation is by social enterprises in the civil society space. Innovation which is technical and social. OpenlyLocal, MySociety, WhereDoesMyMoneyGo, Armchair auditor. See Community.
• Unless tendering to public sector organisations is opened up, spending reports at the £500 threshold will show the same contractors in future.
Looking ahead
• Think: business publishers merging PSI and location data with other databases. Health care and insurance companies.
• Loss of innocence. Opportunities for start-ups will diminish.
• For the public sector the Pickles doctrine looks like cancelling publicly funded websites and pointing to data.gov.uk
• Also, cancelling Local Area Assessment, RDAs, AWM ... so, at the local and regional level, who will collect, maintain and interpret public data? Local Enterprise Partnerships, councils, health service etc.
• Data exchange to replace organisations. ... who looks after the public interest? Public scrutiny, competitors.
• The coalition government would like to use the developer community as the new outsourcing. Fortunately, the developer community will not be directed.
Links
• Pickles 600 websites £100m savings - www.telegraph.co.uk—Francis-Maude-Government-to-scrap-three-quarters-of-its-websites-to-save-100million.html
• Local government data: lessons from London - www.guardian.co.uk—local-government-data-london-datastore
• London Datastore: the story so far - data.london.gov.uk—story-so-far
• Cambridge Study: Models of public sector funding, Rufus Pollock et al 2008 (PDF) - www.rufuspollock.org—psi-funding-options
• Funding options for PSI holders - www.rufuspollock.org—psi-funding-options
Media
• Data journalism. Skills. Visualisation.
• Provenance: data, stories, citation
• FoI, cost comparison with open data
• Microformats and discovery for news stories: hNews
• Linked data and discovery for long news stories and research: BBC, archive
• Climate change emails and media coverage. Are they related?
• Managing bad news
Decision making, lobbying records
• View lobbyists by industry. USA: Open Secrets
• Public distrust. Senior civil servants meet lobbyists. It's part of their job
• There is an issue of inequality of access to the policymaking process
• There are issues of standards of evidence and transparency of evidence
• The appearance of consultation vs an open policy process. See also, commentable documents
• Public relations through social media: Facebook, Cameron/Zuckerberg
Links
• Lessig, Against Transparency - www.tnr.com—against-transparency
• Steph Grey, Good and bad transparency - blog.helpfultechnology.com—good-and-bad-transparency
• EU lobbyist register 'will never be mandatory' - www.euractiv.com—article-181390
• Open Secrets - www.opensecrets.org—view-lobbyists-by-industry-on.html
• Senior civil servants fight off transparency, lobbying industry victory 2009 - timesonline.typepad.com—senior-civil-servants-fight-off-transparency-lobbying-industry-score-huge-victory.html
Visualisation & mapping
Licenses
• Derived works
• A map is a universally readable way of displaying data
• For data, a map is a visualisation choice
• Developers will use the map which gives them the richest API
• All map organisations want your data in their world
  ▼ OSM risks being limited to cartography
• Recent announcement of $1m investment
• For data portability, need to make map output interchangeable
• Location is hot. Social media wants to know where you are, to sell your location data. Apple does. So does burglar.net.
Community & participation
Pareto curve of participation
  ▼ "In many ways, OpenStreetMap is similar to other open source and open knowledge projects, such as Wikipedia. These similarities include the patterns of contribution and the importance of participation inequalities, in which a small group of participants contribute very significantly, while a very large group of occasional participants contribute only occasionally; the general demographic of participants, with strong representation from educated young males; or the temporal patterns of engagements, in which some participants go through a peak of activity and lose interest, while a small group joins and continues to invest its time and effort to help the progress of the project. These aspects have been identified by researchers who explored volunteering and leisure activities, and crowdsourcing as well as those who explored commons-based peer production networks" (Haklay 2010)
• Haklay goes on to argue that there are specific constraints of geography in OSM participation
• Participation in OSM and other voluntary production also reflects the geography of inequality
• Pareto or power law curve for participation in OSM, Wikipedia, Galaxy Zoo. See also, political parties, campaign groups.
• Active participants are often involved with the governance of their community. More rewarding than local volunteering?
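To make the participation inequality described above concrete, the following is a minimal sketch of how one might measure it: the share of all contributions made by the most active fraction of participants. The function name `top_share` and the edit counts in the usage example are invented for illustration; they are not figures from OSM, Wikipedia or Galaxy Zoo.

```python
def top_share(contributions, top_fraction=0.1):
    """Share of total contributions made by the most active participants.

    contributions: one non-negative count per participant.
    top_fraction: fraction of participants treated as the "top" group.
    """
    if not contributions:
        raise ValueError("need at least one participant")
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    ordered = sorted(contributions, reverse=True)
    # Always count at least one participant in the top group.
    k = max(1, round(len(ordered) * top_fraction))
    total = sum(ordered)
    if total == 0:
        return 0.0
    return sum(ordered[:k]) / total


if __name__ == "__main__":
    # Made-up edit counts for ten hypothetical contributors.
    edits = [500, 200, 50, 10, 5, 5, 3, 2, 1, 1]
    print(f"Top 10% of contributors made {top_share(edits, 0.1):.0%} of edits")
```

Run against real contribution logs, a value far above `top_fraction` would indicate the kind of skew Haklay describes; a value close to it would indicate evenly spread participation.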
Netflix longtail Photo by igrigorik -  http://flic.kr/p/ywXoC
Myth
  ▼ Echoes of The Long Tail, but the parallel is with mythology
• The Long Tail was something people wanted to believe
• The more mundane truth is that Amazon discovered infinite shelf space, and blockbusters still outsell the long tail. Sorry.
• So will we hit capacity in the numbers of people willing to commit a lot of time to open data?
• Or are there ways of expanding participation which allow large numbers of people with limited technical knowledge to contribute a little time? cf. citizen science
• You can make a case for this with crowdsourced archives, for example, cataloguing old BBC programmes
• In the real world, it is unclear that there are large numbers of people longing to volunteer for community service in the Big Society
Baltimore CitiStat and potholes
• Mayors won't reduce level of service on potholes
• Private contractor, stats set the agenda. Which stats?
  ▼ Civil society with targets set by units which are easy to measure and fulfil.
• Professional values in public service, unpopular client groups
  ▼ Here, FixMyStreet. Why don't local authorities use the service?
• Open 311, a standard for fault reporting
Civil society
• Information about the service is not a substitute for the service. Telephone call centre message.
• The experts are not in the room. Management change
• Redressing the balance. Duty of public bodies to engage a representative cross-section of society.
Links
• Tyranny of place and OpenStreetMap, Muki Haklay - povesham.wordpress.com—the-tyranny-of-place-and-openstreetmap
• Commons-based Peer Production and Virtue - www.nyu.edu—jopp_235.pdf
• Chris Anderson, The Long Tail, Wired 2004 - www.wired.com—tail_pr.html
• Behn report, What All Mayors Would Like to Know About Baltimore’s CitiStat Performance Strategy - www.hks.harvard.edu—thebehnreport
• Open 311, a standard for fault reporting - open311.org
• Write To Reply, commentable documents - writetoreply.org—draft-public-data-principles
• 'Amplified Meetings and Participatory Deliberation' - blog.ouseful.info—amplified-meetings-and-participatory-deliberation
Andrew Mackenzie www.take21.org/blog take21(at)bethere(dot)co(dot)uk @DJSoup
