Using satellite imagery and AI to discern the effects of
protected areas in your backyard, while improving the
interface between scientists and the digital world (the
PARSEC project).
Alison Specht, SEES, University of Queensland
With contributions from Shelley Stall, AGU, David Mouillot, U. Montpellier, Nicolas
Mouquet, CNRS, FRB-CESAB, Laurence Mabile, U Toulouse,
Marc Chaumont & Gérard Subsol, LIRMM.
The Belmont Forum is a global partnership of funding
organizations, whose operations aim to encourage:
International transdisciplinary research providing knowledge for
understanding, mitigating and adapting to global environmental
change.
The Science-driven e-Infrastructure Innovation (SEI) aims to:
• Enhance the impact of environmental change research by supporting
technological innovation that would accelerate discovery, inform policy,
and support decision making.
• Enable teams of computer and information scientists and technologists to
work together with natural and social scientists and related stakeholders
in transnational projects.
• These teams would integrate data streams and analysis systems,
amalgamate best practices from public and private sectors, and foster
open data and open access.
Belmont Forum « www.belmontforum.org »
https://media.institut-alternativa.org/2016/05/OpenDataInfoGraphic.jpg
IPBES 2019 Global Assessment Report
[Figure: drivers of change and examples of declines in nature, by realm (terrestrial, freshwater, marine)]
INDIRECT DRIVERS
• Demographic and sociocultural
• Economic and technological
• Institutions and governance
• Conflicts and epidemics
• Values and behaviours
Drivers shown for each realm
• Direct exploitation
• Land/sea use change
• Climate change
• Pollution
• Invasive alien species
• Others
ECOSYSTEM EXTENT AND CONDITION (47%)
Natural ecosystems have declined by 47% on average, relative to their
earliest estimated states.
SPECIES EXTINCTION RISK (25%)
Approximately 25% of species are already threatened with extinction in
most animal and plant groups studied.
ECOLOGICAL COMMUNITIES (23%)
Biotic integrity (the abundance of naturally present species) has declined by
23% on average in terrestrial communities.
BIOMASS AND SPECIES ABUNDANCE (82%)
The global biomass of wild mammals has fallen by 82%. Indicators of
vertebrate abundance have declined rapidly since 1970.
NATURE FOR INDIGENOUS PEOPLES AND LOCAL COMMUNITIES (72%)
72% of indicators developed by indigenous peoples and local communities show
ongoing deterioration of elements of nature important to them.
*Since prehistory
PARSEC project in a nutshell
2018 to 2022: four years
Synthesis strand:
To combine remote sensing, artificial intelligence and socioeconomic data to
assess change in socioeconomic conditions
Data strand:
To increase the number of properly cited data sets, provide credit and attribution,
and accurately track data and code reuse
Determine the influence of protected areas on socioeconomic outcomes
Improve practices and tools for interdisciplinary projects worldwide
PARSEC – a transnational team
BRAZIL:
EPUSP : P Pizzigati Corrêa
NISR : J-P Ometto
São Paulo U : KMPMB Ferraz
SCIELO : S Santos
USA:
AGU: S Stall
Southern Oregon U: JE Trammell
U California: M O’Brien
The Nature Conservancy: S Reddy, J Evans
UK :
H Glaves : BGS
FRANCE:
N Mouquet : CESAB-FRB, CNRS
A Cambon-Thomsen,
L Mabile, M Thomsen: Toulouse U
M Chaumont, G Subsol: LIRMM
J Claudet, L Thiault : CNRS
L Durieux, F Sèyler : IRD
D Mouillot, L Velez : Montpellier U
O Hologne, R David : INRA
JAPAN:
Y Murayama, K Imai: NICT
Y Kondo: RIHN
T Osawa: Tokyo Met U
AUSTRALIA:
A Specht: U of Queensland
L Wyborn: NCI
Associates:
PARSEC data strand
Leader: Shelley Stall,
Senior Director, Data Leadership, American Geophysical Union
PARSEC data strand: objectives
Increase the number of properly cited data sets, provide credit and
attribution, and accurately track data reuse. For this we need:
• incentives for researchers…
• a sturdy credit and attribution
infrastructure that benefits
researchers…
• recommendations and best
practices that work for
researchers to encourage data
sharing and data reuse…
PARSEC data strand: work program
A. Robustly connect identifiers across papers, people, and repositories. Currently, even if these
identifiers are included, the necessary linking to allow tracking is not fully implemented.
B. Conduct outreach and adoption campaigns on the importance of persistent identifiers and their
infrastructure to all relevant stakeholders; these include the data repositories, publishers, researchers,
and the key groups that set standards for publishers for reference tagging.
C. Promote and extend data usage metrics generated by RDA’s Data Usage Metrics Working Group
and data citations generated by RDA’s Scholix Working Group. By requiring that data be cited from a
trusted, community-accepted repository, the value of repositories can be better measured. Data sharing
through citation increases the likelihood of data discovery and reuse.
D. Provide guidance to our own science-synthesis team and the selected project teams from our
partners to optimize data reuse as well as data deposition of generated data for possible reuse.
Demonstrate that when researchers follow the FAIR Data Principles, data are better prepared for others to
understand, reuse increases, and discovery is improved.
E. Promote the work of integrated guidance that will address recommendations (generic and specific to
the ecological/biodiversity community) to improve each step of the process of data sharing, reuse, credit
and reward for researchers and repositories.
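Item A of the work program can be illustrated with a minimal sketch, assuming a toy in-memory list of links between a paper DOI, a data set DOI and an author ORCID iD. The identifiers, the `Link` class and the `missing_links` helper are all hypothetical and are not PARSEC tooling; the sketch only shows that identifiers can be present while a link needed for tracking is still absent.

```python
from dataclasses import dataclass

# Hypothetical identifiers, invented for illustration only.
PAPER = "doi:10.0000/example-paper"
DATASET = "doi:10.0000/example-dataset"
AUTHOR = "orcid:0000-0000-0000-0000"


@dataclass(frozen=True)
class Link:
    source: str
    relation: str
    target: str


def missing_links(links, paper, dataset, author):
    """Return the links that are still needed before reuse can be tracked."""
    required = {
        Link(paper, "cites", dataset),        # paper formally cites the data set
        Link(paper, "hasAuthor", author),     # paper is tied to a person
        Link(dataset, "hasCreator", author),  # data set is tied to a person
    }
    return sorted(required - set(links), key=lambda l: (l.source, l.relation))


# All three identifiers appear, but the paper never cites the data set.
recorded = [
    Link(PAPER, "hasAuthor", AUTHOR),
    Link(DATASET, "hasCreator", AUTHOR),
]
gaps = missing_links(recorded, PAPER, DATASET, AUTHOR)
print([(g.source, g.relation, g.target) for g in gaps])
```

In this toy case the only gap reported is the paper-to-data-set citation, which is the link a citation-tracking service would need.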
PARSEC data strand: credit and reward landscape
PARSEC data strand: partnership projects
Slide Credit: Martin Fenner,
DataCite, PARSEC meeting at
RDA P14, 21 Oct 2019
PARSEC data strand – synthesis strand linkage
PARSEC synthesis strand
Leader: David Mouillot,
University of Montpellier, France
PARSEC synthesis strand: work program
• WP1: Stratified sampling of 200 rural
communities close to and far from
natural protected areas (PAs) using
matching algorithms.
• WP2: Estimate socioeconomic
conditions in the selected rural
communities using remote sensing and
artificial intelligence.
• WP3: Using paired comparison tests,
determine whether proximity to a PA
can improve socioeconomic outcomes.
Identify contributing factors.
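The paired comparison in WP3 can be sketched as follows, assuming each community close to a PA has already been matched to one far from a PA and that a single socioeconomic score exists for both. The exact sign-flip permutation test below is one possible paired test, not necessarily the one the project will use, and the scores are made up for illustration.

```python
from itertools import product


def paired_permutation_test(near, far):
    """Exact two-sided sign-flip permutation test on paired differences.

    Under the null hypothesis (proximity to a PA has no effect), each
    within-pair difference is equally likely to be positive or negative.
    Returns (mean difference, p-value).
    """
    if len(near) != len(far):
        raise ValueError("near and far must contain matched pairs")
    diffs = [n - f for n, f in zip(near, far)]
    observed = abs(sum(diffs))
    extreme = 0
    total = 0
    # Enumerate all 2^n sign assignments (only practical for small n).
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if abs(sum(s * d for s, d in zip(signs, diffs))) >= observed:
            extreme += 1
    return sum(diffs) / len(diffs), extreme / total


# Made-up scores for six matched pairs (illustration only).
near_pa = [12, 15, 11, 14, 13, 16]
far_pa = [10, 12, 10, 11, 12, 13]
mean_diff, p_value = paired_permutation_test(near_pa, far_pa)
print(mean_diff, p_value)
```

With 200 communities, full enumeration is not feasible; a random sample of sign assignments or a standard paired test from a statistics library would be the practical choice.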
PARSEC WP1: stratified sampling
Step 1: Identification of suitable data for the project
Selection of socio-economic systems close to a Protected
Area (PA)
• PA: IUCN category 1-5; >10 km2; creation date 2000-2015
• Town/village: <5000 inhabitants; <20 km from PA;
>100 km from large city
Step 2: Acquiring Data – for example
• PA: Dipperu NP, IUCN category 1a, 112.05 km2,
created 2014
• Town: Nebo, 840 inhabitants, circa 20km from PA,
>100km from Mackay
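The Step 1 criteria can be written as two simple filters. This is a minimal sketch, assuming candidate PAs and towns are already available as records; the class and function names are hypothetical, and the two distances used for Nebo are placeholders because the slide gives only "circa 20km" and ">100km".

```python
from dataclasses import dataclass

# IUCN categories 1 to 5 as written on the slide, with 1a and 1b under category 1.
ELIGIBLE_IUCN = {"1a", "1b", "2", "3", "4", "5"}


@dataclass
class ProtectedArea:
    name: str
    iucn_category: str
    area_km2: float
    year_created: int


@dataclass
class Town:
    name: str
    inhabitants: int
    km_to_pa: float
    km_to_large_city: float


def pa_is_eligible(pa):
    """PA: IUCN category 1-5, larger than 10 km2, created 2000-2015."""
    return (
        pa.iucn_category in ELIGIBLE_IUCN
        and pa.area_km2 > 10
        and 2000 <= pa.year_created <= 2015
    )


def town_is_eligible(town):
    """Town: <5000 inhabitants, <20 km from the PA, >100 km from a large city."""
    return (
        town.inhabitants < 5000
        and town.km_to_pa < 20
        and town.km_to_large_city > 100
    )


dipperu = ProtectedArea("Dipperu NP", "1a", 112.05, 2014)
# Distances are assumed placeholder values, not measurements.
nebo = Town("Nebo", 840, km_to_pa=19.0, km_to_large_city=101.0)
print(pa_is_eligible(dipperu), town_is_eligible(nebo))
```

Because the thresholds are strict, a town at exactly 20 km from the PA would be rejected, so a "circa 20km" case such as Nebo depends on how the distance is measured.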
PARSEC WP1: stratified sampling
Further criteria for selection
(a) Socio-economic status of the town so information can be derived
about (for example):
• Gross Domestic Product (GDP)
• Human Development Index (HDI)
• Child Growth Failure (CGF)
• Consumption Expenditure (CE)
• Asset Health (AH)
Sources: UNESCO MICS surveys, World Bank studies, information
that allows comparison across the countries, and beyond. Local surveys
OK, but more difficult to use for comparison.
(b) Image analysis across time: before and after the creation of the
PA. Are the images available? In what format (Landsat, SPOT,
QuickBird etc)? Assessment of socio-economic status using AI
(c) What is the possibility for mirror sites?
The machine learning component
PARSEC WP2: estimation of socio-economic
conditions using remote sensing and artificial
intelligence
PARSEC WP2: machine learning protocol
• STEP 1. We « show » the « network » examples and counter-examples
• STEP 2. We use the network
[Figure: deep learning example in which images are classified as "fish" or "other"; panel label: THE LEARNING]
PARSEC WP2: decisions via a CNN*
• Convolutions
• Average or max pooling [+ sub-sampling]
• Non-linear function (= activation function)
• Normalisation
*Convolutional Neural Network
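The building blocks listed above can be sketched in plain Python on a tiny image. This is an illustration only, under several assumptions: the kernel is hand-picked rather than learned, min-max rescaling stands in for the normalisation step, and the order of operations (convolution, activation, pooling, normalisation) is one common arrangement rather than the project's architecture.

```python
def conv2d(image, kernel):
    """Valid 2-D convolution (cross-correlation, as in most deep learning libraries)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def relu(fmap):
    """Non-linear activation: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Max over non-overlapping size x size windows (this is the sub-sampling)."""
    return [
        [
            max(fmap[i + a][j + b] for a in range(size) for b in range(size))
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def normalise(fmap):
    """Rescale values to [0, 1]; a constant map is returned as all zeros."""
    flat = [v for row in fmap for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [[0.0 for _ in row] for row in fmap]
    return [[(v - lo) / (hi - lo) for v in row] for row in fmap]


# 5 x 5 image with a dark-to-bright vertical edge between columns 1 and 2.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[-1, 1], [-1, 1]]  # hand-picked: responds to dark-to-bright vertical edges

features = relu(conv2d(image, kernel))
pooled = max_pool(features)
print(normalise(pooled))
```

In a trained CNN the kernel values are learned from the examples and counter-examples, and many such layers are stacked before the final decision.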
PARSEC WP2: prediction of poverty with a CNN?
Using 400 x 400 pixel images (1 km x 1 km), the CNN should
predict a poverty value (scalar; 0 = low, 100 = high)
(poverty: annual consumption level of households)
Images: Google Static Maps API, 400 × 400 pixels at zoom level 16
[Figure: predicted poverty in Uganda; predicted poverty probabilities at a fine-grained
10 x 10 km block level. Image from Xie et al. (2016)]
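The "400 x 400 pixels at zoom level 16 is about 1 km x 1 km" figure can be checked with the standard Web Mercator ground-resolution formula. The sketch below also builds a Static Maps request URL using the documented `center`, `zoom`, `size`, `maptype` and `key` parameters; the helper names, the coordinates and the `YOUR_API_KEY` placeholder are assumptions for illustration, and no request is actually sent.

```python
import math
from urllib.parse import urlencode

# Ground resolution at the equator for zoom 0 with 256-pixel Web Mercator tiles.
EQUATOR_M_PER_PX_ZOOM0 = 156543.03392


def metres_per_pixel(lat_deg, zoom):
    """Approximate ground size of one pixel at a given latitude and zoom level."""
    return EQUATOR_M_PER_PX_ZOOM0 * math.cos(math.radians(lat_deg)) / (2 ** zoom)


def static_map_url(lat, lon, api_key, zoom=16, size_px=400):
    """Build (but do not fetch) a satellite-image request URL."""
    params = {
        "center": f"{lat},{lon}",
        "zoom": zoom,
        "size": f"{size_px}x{size_px}",
        "maptype": "satellite",
        "key": api_key,
    }
    return "https://maps.googleapis.com/maps/api/staticmap?" + urlencode(params)


# Hypothetical point on the equator (Uganda straddles the equator).
footprint_m = 400 * metres_per_pixel(0.0, 16)
url = static_map_url(0.0, 32.0, "YOUR_API_KEY")
print(round(footprint_m), url)
```

At the equator the tile edge comes out a little under 1 km, and it shrinks with the cosine of latitude, so "1 km x 1 km" is an approximation that holds best for sites near the equator.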
PARSEC WP2: Parts of the image that “react”
[Figure from Jean et al. (2016): for urban areas, non-urban areas, water and roads, panels show
the original daytime satellite images from Google Static Maps, the filter activation maps, and
the overlay of activation maps onto the original images]
PARSEC WP2: but is this already done?
From Figure 3, Xie et al. (2016)
Only 70% correlated to the ground truth evidence
* World Resources Institute, 2009
PARSEC WP2: how to improve the prediction?
• More images with more ground truthing
• Work on methods using a small number of images.
• Work with temporal sequences
• Add diversity in the learning database:
  • Really poor to very rich
  • Various places in the world
  (should we create only one unique CNN or …)
• Challenges:
  • Poverty data accessibility (within time-period, frequency, type)
  • Integration of multi-resolution and multi-source data, sparsity, time
  integration, incomplete data, etc.
And there is more!
WP3: Using paired comparison tests, determine
whether proximity to a PA can improve
socioeconomic outcomes. Identify contributing
factors.
But that’s for 2022
Thank you!
PARSEC: acknowledgements
PARSEC: cited references
IPBES (2019) Summary for policymakers of the global assessment report on
biodiversity and ecosystem services of the Intergovernmental Science-Policy
Platform on Biodiversity and Ecosystem Services. Eds S. Díaz, J. Settele, E. S.
Brondizio, H. T. Ngo, M. Guèze, J. Agard, A. Arneth, P. Balvanera, K. A.
Brauman, S. H. M. Butchart, K. M. A. Chan, L. A. Garibaldi, K. Ichii, J. Liu, S. M.
Subramanian, G. F. Midgley, P. Miloslavich, Z. Molnár, D. Obura, A. Pfaff, S.
Polasky, A. Purvis, J. Razzaque, B. Reyers, R. Roy Chowdhury, Y. J. Shin, I. J.
Visseren-Hamakers, K. J. Willis, and C. N. Zayas. IPBES secretariat, Bonn,
Germany. 39 pages.
Jean N., Burke M., Xie M., Davis W. M., Lobell D. B., Ermon S. (2016) Combining
satellite imagery and machine learning to predict poverty. Science 353(6301):
790-794.
World Resources Institute (2009) Mapping a better future: how spatial analysis can
benefit wetlands and reduce poverty in Uganda. 39p. Washington, D.C. (USA): WRI.
Xie M., Jean N., Burke M., Lobell D., Ermon S. (2016) Transfer learning from deep
features for remote sensing and poverty mapping, pp 3929-3935. Proc. 30th
AAAI Conference on Artificial Intelligence (AAAI-16).

Parsec 191119 slideshare

  • 1.
    Using satellite imageryand AI to discern the effects of protected areas in your backyard, while improving the interface between scientists and the digital world (the PARSEC project). Alison Specht, SEES, University of Queensland With contributions from Shelley Stall, AGU, David Mouillot, U. Montpellier, Nicolas Mouquet, CNRS, FRB-CESAB, Laurence Mabile, U Toulouse, Marc Chaumont & Gérard Subsol, LIRMM.
  • 2.
    The Belmont Forumis a global partnership of funding organizations, whose operations aim to encourage: International transdisciplinary research providing knowledge for understanding, mitigating and adapting to global environmental change. The Science-driven e-Infrastructure Innovation (SEI) aims to: • Enhance the impact of environmental change research by supporting technological innovation that would accelerate discovery, inform policy, and support decision making. • Enable teams of computer and information scientists and technologists to work together with natural and social scientists and related stakeholders in transnational projects. • These teams would integrate data streams and analysis systems, amalgamate best practices from public and private sectors, and foster open data and open access. Belmont Forum « www.belmontforum.org »
  • 3.
    The Belmont Forumis a global partnership of funding organizations, whose operations aim to encourage: International transdisciplinary research providing knowledge for understanding, mitigating and adapting to global environmental change. The Science-driven e-Infrastructure Innovation (SEI) aims to: • Enhance the impact of environmental change research by supporting technological innovation that would accelerate discovery, inform policy, and support decision making. • Enable teams of computer and information scientists and technologists to work together with natural and social scientists and related stakeholders in transnational projects. • These teams would integrate data streams and analysis systems, amalgamate best practices from public and private sectors, and foster open data and open access. Belmont Forum « www.belmontforum.org »
  • 4.
    The Belmont Forumis a global partnership of funding organizations, whose operations aim to encourage: International transdisciplinary research providing knowledge for understanding, mitigating and adapting to global environmental change. The Science-driven e-Infrastructure Innovation (SEI) aims to: • Enhance the impact of environmental change research by supporting technological innovation that would accelerate discovery, inform policy, and support decision making. • Enable teams of computer and information scientists and technologists to work together with natural and social scientists and related stakeholders in transnational projects. • These teams would integrate data streams and analysis systems, amalgamate best practices from public and private sectors, and foster open data and open access. Belmont Forum « www.belmontforum.org »
  • 5.
  • 6.
    IPBES 2019 GlobalAssessment Report Demographic and sociocultural Economic and technological Institutions and governance Conflicts and epidemics Natural ecosystems have declined by 47% on average, relative to their earliest estimated states. Approximately 25% of species are already threatened with extinction in most animal and plant groups studied. Biotic integrity-the abundance of naturally-present species-has declined by 23% on average in terrestrial communities, *Since prehistory Terrestrial INDIRECT DRIVERS ECOSYSTEM EXTENT AND CONDITION SPECIES EXTINCTION RISK ECOLOGICAL COMMUNITIES BIOMASS AND SPECIES ABUNDANCE NATURE FOR INDIGENOUS PEOPLES AND LOCAL COMMUNITIES Freshwater Marine Direct exploitation Land/sea use exchange Climate change Pollution Invasive alien species Others The global biomass of wild mammals has fallen by 82%. Indicators of vertebrate abundance have declined rapidly since 1970. 72% of indicators developed by indigenous people sand local communities show ongoing deterioration of elements of nature important to them. 47% 25% 23% 82% 72% Valuesandbehaviours
  • 8.
    Synthesis strand: To combineremote sensing, artificial intelligence and socioeconomic data to assess change in socioeconomic conditions Data strand: To increase the number of properly cited data sets, provide credit and attribution, and accurately track data and code reuse 2018 2022Four years PARSEC project in a nutshell
  • 9.
    Synthesis strand: To combineremote sensing, artificial intelligence and socioeconomic data to assess change in socioeconomic conditions Data strand: To increase the number of properly cited data sets, provide credit and attribution, and accurately track data and code reuse Determine the influence of protected areas on socioeconomic outcomes 2018 2022Four years PARSEC project in a nutshell
  • 10.
    Synthesis strand: To combineremote sensing, artificial intelligence and socioeconomic data to assess change in socioeconomic conditions Data strand: To increase the number of properly cited data sets, provide credit and attribution, and accurately track data and code reuse Determine the influence of protected areas on socioeconomic outcomes Improve practices and tools for interdisciplinary projects worldwide 2018 2022Four years PARSEC project in a nutshell
  • 11.
    PARSEC – atransnational team BRAZIL: EPUSP : P Pizzigati Corrêa NISR : J-P Ometto São Paulo U : KMPMB Ferraz SCIELO : S Santos USA: AGU: S Stall Southern Oregon U: JE Trammell U California: M O’Brien The Nature Conservancy: S Reddy, J Evans UK : H Glaves : BGS FRANCE: N Mouquet : CESAB-FRB, CNRS A Cambon-Thomsen, L Mabile, M Thomsen: Toulouse U M Chaumont, G Subsol: LIRMM J Claudet, L Thiault : CNRS L Durieux, F Sèyler : IRD D Mouillot, L Velez : Montpellier U O Hologne, R David : INRA JAPAN: Y Murayama, K Imai: NICT Y Kondo, : RIHN T Osawa: Tokyo Met U AUSTRALIA: A Specht: U of Queensland L Wyborn: NCI Associates:
  • 12.
    PARSEC data strand Leader:Shelley Stall, Senior Director, Data Leadership, American Geophysical Union
  • 13.
    • incentives forresearchers… • a sturdy credit and attribution infrastructure that benefits researchers… • recommendations and best practices that work for researchers to encourage data sharing and data reuse… PARSEC data strand: objectives Increase the number of properly cited data sets, provide credit and attribution, and accurately track data reuse. For this we need:
  • 14.
    • incentives forresearchers… • a sturdy credit and attribution infrastructure that benefits researchers… • recommendations and best practices that work for researchers to encourage data sharing and data reuse… PARSEC data strand: objectives Increase the number of properly cited data sets, provide credit and attribution, and accurately track data reuse. For this we need:
  • 15.
    • incentives forresearchers… • a sturdy credit and attribution infrastructure that benefits researchers… • recommendations and best practices that work for researchers to encourage data sharing and data reuse… PARSEC data strand: objectives Increase the number of properly cited data sets, provide credit and attribution, and accurately track data reuse. For this we need:
  • 16.
    A. Robustly connectidentifiers across papers, people, and repositories. Currently, even if these identifiers are included, the necessary linking to allow tracking is not fully implemented. B. Conduct outreach and adoption campaigns on the importance of persistent identifiers and their infrastructure to all relevant stakeholders—these include the data repositories, publishers, researchers, and the key groups that set standards for publishers for reference tagging. C. Promote and extend data usage metrics generated by RDA’s Data Usage Metrics Working Group and data citations generated by RDA’s Scholix Working Group. By requiring that data be cited from a trusted, community-accepted repository, the value of repositories can be better measured. Data sharing through citation increases the likelihood of data discovery and reuse. D. Provide guidance to our own science-synthesis team and the selected project teams from our partners to optimize data reuse as well as data deposition of generated data for possible reuse. Demonstrate that when researchers follow the FAIR Data Principles, data are better prepared for others to understand, reuse increases, and discovery is improved. E. Promote the work of integrated guidance that will address recommendations (generic and specific to the ecological/biodiversity community) to improve each step of the process of data sharing, reuse, credit and reward for researchers and repositories. PARSEC data strand: work program
  • 21.
    PARSEC data strand: credit and reward landscape
  • 22.
    PARSEC data strand: partnership projects. Slide credit: Martin Fenner, DataCite, PARSEC meeting at RDA P14, 21 Oct 2019
  • 23.
    PARSEC data strand – synthesis strand linkage
  • 27.
    PARSEC synthesis strand. Leader: David Mouillot, University of Montpellier, France
  • 28.
    PARSEC synthesis strand: work program
    • WP1: Stratified sampling of 200 rural communities close to and far from natural protected areas (PAs) using matching algorithms.
    • WP2: Estimate socioeconomic conditions in the selected rural communities using remote sensing and artificial intelligence.
    • WP3: Using paired comparison tests, determine whether proximity to a PA can improve socioeconomic outcomes. Identify contributing factors.
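WP1 mentions matching algorithms without naming one. As a rough illustration only, the sketch below shows one common form: greedy 1:1 nearest-neighbour matching without replacement on standardised covariates. The covariates, community names, and values are hypothetical and are not taken from the project.

```python
import math

def standardise(rows):
    """Scale each covariate column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # Fall back to 1.0 when a column is constant, to avoid dividing by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def match_pairs(near, far):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    near, far: dicts mapping community name -> tuple of covariates.
    Returns a list of (near_name, far_name) pairs.
    """
    names = list(near) + list(far)
    scaled = dict(zip(names, standardise([*near.values(), *far.values()])))
    available = set(far)
    pairs = []
    for n in near:
        if not available:
            break
        # Closest remaining "far" community in standardised covariate space.
        best = min(sorted(available),
                   key=lambda f: math.dist(scaled[n], scaled[f]))
        pairs.append((n, best))
        available.remove(best)
    return pairs

# Hypothetical covariates: (population, km to large city, annual rainfall in mm)
near_pa = {"A": (800, 120, 650), "B": (4200, 300, 900)}
far_pa = {"X": (4000, 280, 950), "Y": (900, 130, 600), "Z": (2500, 200, 750)}
print(match_pairs(near_pa, far_pa))
```

Greedy matching depends on the order in which the near-PA communities are processed; an optimal-assignment method would avoid that, at the cost of more code.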
  • 29.
    PARSEC WP1: stratified sampling
    Step 1: Identification of suitable data for the project
    Selection of socio-economic systems close to a Protected Area (PA)
    • PA: IUCN category 1-5; >10 km²; creation date 2000-2015
    • Town/village: <5000 inhabitants; <20 km from PA; >100 km from large city
    Step 2: Acquiring data, for example
    • PA: Dipperu NP, IUCN category 1a, 112.05 km², created 2014
    • Town: Nebo, 840 inhabitants, circa 20 km from PA, >100 km from Mackay
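The Step 1 criteria can be read as two simple filters. The sketch below is one possible encoding, not the project's code: the class and function names are invented for illustration, and the two distances given for Nebo are placeholders standing in for "circa 20 km" and ">100 km", not measured values.

```python
from dataclasses import dataclass

@dataclass
class ProtectedArea:
    name: str
    iucn_category: str  # "Ia", "Ib", "II", "III", "IV", "V" or "VI"
    area_km2: float
    created: int        # year of creation

@dataclass
class Town:
    name: str
    inhabitants: int
    km_to_pa: float
    km_to_large_city: float

# IUCN categories 1-5 written in the usual Roman-numeral form.
ELIGIBLE_IUCN = {"Ia", "Ib", "II", "III", "IV", "V"}

def pa_is_eligible(pa):
    """PA: IUCN category 1-5, larger than 10 km2, created 2000-2015."""
    return (pa.iucn_category in ELIGIBLE_IUCN
            and pa.area_km2 > 10
            and 2000 <= pa.created <= 2015)

def town_is_eligible(town):
    """Town: <5000 inhabitants, <20 km from PA, >100 km from a large city."""
    return (town.inhabitants < 5000
            and town.km_to_pa < 20
            and town.km_to_large_city > 100)

dipperu = ProtectedArea("Dipperu NP", "Ia", 112.05, 2014)
# Placeholder distances (assumptions for illustration only).
nebo = Town("Nebo", 840, km_to_pa=19.0, km_to_large_city=110.0)
print(pa_is_eligible(dipperu), town_is_eligible(nebo))
```

Because the town criterion is a strict "<20 km", a town at "circa 20 km" sits right on the threshold, so the exact distance measurement matters for inclusion.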
  • 31.
    PARSEC WP1: stratified sampling
    Further criteria for selection
    (a) Socio-economic status of the town, so information can be derived about (for example):
    • Gross Domestic Product (GDP)
    • Human Development Index (HDI)
    • Child Growth Failure (CGF)
    • Consumption Expenditure (CE)
    • Asset Health (AH)
    Sources: UNICEF MICS surveys, World Bank studies, information that allows comparison across the countries, and beyond. Local surveys OK, but more difficult to use for comparison.
    (b) Image analysis across time: before and after the creation of the PA. Are the images available? In what format (Landsat, SPOT, QuickBird etc.)? Assessment of socio-economic status using AI.
    (c) What is the possibility for mirror sites?
  • 34.
    The machine learning component. PARSEC WP2: estimation of socio-economic conditions using remote sensing and artificial intelligence
  • 35.
    PARSEC WP2: machine learning protocol
    • STEP 1. We "show" the "network" examples and counter-examples
    • STEP 2. We use the network → Deep learning
    [Figure: "The learning", with example images labelled "fish" or "other"]
  • 36.
    PARSEC WP2: decisions via a CNN*
    • Convolutions
    • Average or max [+ sub-sampling]
    • Non-linear function (= activation function)
    • Normalisation
    *Convolutional Neural Network
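To make the building blocks listed on this slide concrete, here is a minimal pure-Python sketch of one convolution, one activation function (ReLU), and one max-with-sub-sampling step (usually called max pooling) applied to a toy image. It is an illustration under simple assumptions (a single hand-written filter, no learning, normalisation omitted), not the network used in the project.

```python
def conv2d(image, kernel):
    """'Valid' 2-D cross-correlation, the operation CNNs call convolution."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

def relu(fmap):
    """Non-linear activation: negative responses are set to zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size=2):
    """Max over non-overlapping size x size windows (sub-sampling)."""
    return [[max(fmap[i + a][j + b]
                 for a in range(size) for b in range(size))
             for j in range(0, len(fmap[0]) - size + 1, size)]
            for i in range(0, len(fmap) - size + 1, size)]

# Toy 5 x 5 image: dark on the left, bright on the right.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hand-written filter that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

features = max_pool(relu(conv2d(image, kernel)))
print(features)  # [[2, 0], [2, 0]]
```

The non-zero entries mark where the edge was found; in a trained CNN the filter weights are learned from the examples and counter-examples rather than written by hand.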
  • 37.
    PARSEC WP2: prediction of poverty with a CNN?
    • Using 400 × 400 pixel images (1 km × 1 km), the CNN should predict a poverty value (scalar; 0 = low, 100 = high)
    • Images: Google Static Maps API, 400 × 400 pixels at zoom level 16
    • Poverty → annual consumption level of households
    [Figure: predicted poverty in Uganda; predicted poverty probabilities at a fine-grained 10 km × 10 km block level. Image from Xie et al. (2016)]
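The "400 × 400 pixels ≈ 1 km × 1 km" figure can be checked with a little arithmetic. The sketch below assumes the standard Web Mercator tiling (256-pixel tiles, about 156543.03392 m per pixel at the equator at zoom 0) and an image requested at the default scale; the function names are invented for illustration.

```python
import math

# Ground resolution at the equator for zoom level 0, 256-pixel tiles.
EQUATOR_M_PER_PX_ZOOM0 = 156543.03392

def metres_per_pixel(lat_deg, zoom):
    """Approximate ground resolution of a Web Mercator map image."""
    return EQUATOR_M_PER_PX_ZOOM0 * math.cos(math.radians(lat_deg)) / 2 ** zoom

def footprint_km(lat_deg, zoom, pixels):
    """Side length on the ground, in km, of an image `pixels` wide."""
    return metres_per_pixel(lat_deg, zoom) * pixels / 1000

# Uganda straddles the equator, so latitude 0 is a reasonable reference.
print(round(footprint_km(0.0, 16, 400), 3))  # 0.955
```

So a 400-pixel image at zoom level 16 covers a little under 1 km on a side near the equator, and somewhat less at higher latitudes because of the cosine term.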
  • 38.
    PARSEC WP2: Parts of the image that “react”
    [Figure, from Jean et al. (2016): original daytime satellite images from Google Static Maps; filter activation maps; overlay of activation maps onto original images. Categories shown: urban areas, nonurban areas, water, roads]
  • 39.
    PARSEC WP2: but is this already done? From Figure 3, Xie et al. (2016). * World Resources Institute, 2009
  • 42.
    PARSEC WP2: but is this already done? From Figure 3, Xie et al. (2016). Only 70% correlated to the ground truth evidence. * World Resources Institute, 2009
  • 43.
    PARSEC WP2: how to improve the prediction?
    • More images with more ground truthing
    • Work on methods using a small number of images
    • Work with temporal sequences
    • Add diversity in the learning database:
      • Really poor to very rich
      • Various places in the world (should we create only one unique CNN or …)
    • Challenges:
      • Poverty data accessibility (within time-period, frequency, type)
      • Integration of multi-resolution, multi-source, sparsity, time integration, incomplete data, etc.
  • 44.
  • 45.
    WP3: Using paired comparison tests, determine whether proximity to a PA can improve socioeconomic outcomes. Identify contributing factors.
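The slides do not say which paired comparison test WP3 will use. As one simple possibility, the sketch below runs an exact two-sided sign test on matched pairs of communities (one near a PA, one far from it). The wealth-index values are hypothetical and exist only to show the mechanics.

```python
from math import comb

def sign_test(near, far):
    """Exact two-sided sign test on paired observations.

    Returns (number of pairs where near > far, pairs used, p-value).
    Tied pairs are dropped, as is usual for the sign test.
    """
    diffs = [a - b for a, b in zip(near, far) if a != b]
    n = len(diffs)
    positives = sum(d > 0 for d in diffs)
    k = min(positives, n - positives)
    # Under the null hypothesis each sign is equally likely (Binomial(n, 0.5)).
    p = 2 * sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return positives, n, min(1.0, p)

# Hypothetical wealth index for 8 matched pairs of communities.
near_pa = [52, 61, 47, 70, 58, 66, 49, 73]
far_pa = [48, 55, 50, 62, 51, 60, 45, 65]

print(sign_test(near_pa, far_pa))  # (7, 8, 0.0703125)
```

The sign test uses only the direction of each difference; a Wilcoxon signed-rank test or paired t-test would also use the magnitudes, under stronger assumptions.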
  • 46.
    But that’s for 2022. Thank you!
  • 47.
  • 48.
    PARSEC: cited references
    IPBES (2019) Summary for policymakers of the global assessment report on biodiversity and ecosystem services of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services. Eds S. Díaz, J. Settele, E. S. Brondizio, H. T. Ngo, M. Guèze, J. Agard, A. Arneth, P. Balvanera, K. A. Brauman, S. H. M. Butchart, K. M. A. Chan, L. A. Garibaldi, K. Ichii, J. Liu, S. M. Subramanian, G. F. Midgley, P. Miloslavich, Z. Molnár, D. Obura, A. Pfaff, S. Polasky, A. Purvis, J. Razzaque, B. Reyers, R. Roy Chowdhury, Y. J. Shin, I. J. Visseren-Hamakers, K. J. Willis, and C. N. Zayas. IPBES secretariat, Bonn, Germany. 39 pages.
    Jean N., Burke M., Xie M., Davis W. M., Lobell D. B., Ermon S. (2016) Combining satellite imagery and machine learning to predict poverty. Science 353(6301): 790-794.
    World Resources Institute (2009) Mapping a better future: how spatial analysis can benefit wetlands and reduce poverty in Uganda. 39 p. Washington, D.C. (USA): WRI.
    Xie M., Jean N., Burke M., Lobell D., Ermon S. (2016) Transfer learning from deep features for remote sensing and poverty mapping, pp 3929-3935. Proc. 30th AAAI Conference on Artificial Intelligence (AAAI-16).

Editor's Notes

  • #6 As you all know, advances in science, both today and in the future, will depend on the openness, accessibility and reusability of data, software, samples, and data products. This is a collective responsibility but also an opportunity for us to push forward our global intelligence to face the modern challenges we are confronted with.