PYAPI Workbook v2.0.1
Workbook
V2.0.1 UK
An Introduction to Administering your
Portal using the ArcGIS API for Python
Copyright
Copyright © 1992-2010 Environmental Systems Research Institute, Inc. and ESRI (UK) Ltd
All rights reserved. Printed in Aylesbury, Bucks, England.
The information contained in this document is subject to change without notice and is the exclusive property of ESRI (UK)
Ltd. and Environmental Systems Research Institute, Inc., and any respective copyright owners. This work is protected under
United Kingdom and United States copyright law and other international copyright treaties and conventions and except with
the prior written consent of ESRI (UK) Ltd, shall be used solely for the purpose of undergoing training provided by or with
the express written permission of ESRI (UK) Ltd. To the extent permitted by law, no part of this work may be reproduced or
transmitted in any form or by any means, electronic or mechanical, including but not limited to photocopying and recording,
or by any information storage or retrieval system, except as expressly permitted in writing by ESRI (UK) Ltd. If any lawful
copies are made, all such copies, whether in whole or in part, shall include the appropriate ESRI (UK) Ltd and
Environmental Systems Research Institute, Inc. copyright notice. All requests should be sent to Attention: Contracts
Manager, ESRI (UK) Ltd., Millennium House, 65 Walton Street, Aylesbury, Bucks HP21 7QG, England.
@esri.com, 3D Analyst, ADF, AML, ARC/INFO, ArcAtlas, ArcCAD, ArcCatalog, ArcCOGO, ArcData, ArcDoc, ArcEdit,
ArcEditor, ArcEurope, ArcExplorer, ArcExpress, ArcFM, ArcGIS, ArcGrid, ArcIMS, ArcInfo Librarian, ArcInfo,
ArcInfo—Professional GIS, ArcInfo—The World's GIS, ArcLogistics, ArcMap, ArcNetwork, ArcNews, ArcObjects,
ArcOpen, ArcPad, ArcPlot, ArcPress, ArcQuest, ArcReader, ArcScan, ArcScene, ArcSchool, ArcSDE, ArcSdl, ArcStorm,
ArcSurvey, ArcTIN, ArcToolbox, ArcTools, ArcUSA, ArcUser, ArcView, ArcVoyager, ArcWatch, ArcWeb, ArcWorld,
Atlas GIS, AtlasWare, Avenue, BusinessMAP, Database Integrator, DBI Kit, ESRI, ESRI—Team GIS, ESRI—The GIS
People, FormEdit, Geographic Design System, Geography Matters, Geography Network, GIS by ESRI, GIS Day, GIS for
Everyone, GISData Server, InsiteMAP, MapBeans, MapCafé, MapObjects, ModelBuilder, MOLE, NetEngine, PC
ARC/INFO, PC ARCPLOT, PC ARCSHELL, PC DATA CONVERSION, PC STARTER KIT, PC TABLES, PC
ARCEDIT, PC NETWORK, PC OVERLAY, Rent-a-Tech, RouteMAP, SDE, SML, Spatial Database Engine, StreetEditor,
StreetMap, TABLES, the ARC/INFO logo, the ArcAtlas logo, the ArcCAD logo, the ArcCAD WorkBench logo, the
ArcCOGO logo, the ArcData logo, the ArcData Online logo, the ArcEdit logo, the ArcEurope logo, the ArcExplorer logo,
the ArcExpress logo, the ArcFM logo, the ArcFM Viewer logo, the ArcGIS logo, the ArcGrid logo, the ArcIMS logo, the
ArcInfo logo, the ArcLogistics Route logo, the ArcNetwork logo, the ArcPad logo, the ArcPlot logo, the ArcPress for
ArcView logo, the ArcPress logo, the ArcScan logo, the ArcScene logo, the ArcSDE CAD Client logo, the ArcSDE logo,
the ArcStorm logo, the ArcTIN logo, the ArcTools logo, the ArcUSA logo, the ArcView 3D Analyst logo, the ArcView
Business Analyst logo, the ArcView Data Publisher logo, the ArcView GIS logo, the ArcView Image Analysis logo, the
ArcView Internet Map Server logo, the ArcView logo, the ArcView Network Analyst logo, the ArcView Spatial Analyst
logo, the ArcView StreetMap 2000 logo, the ArcView StreetMap logo, the ArcView Tracking Analyst logo, the ArcWorld
logo, the Atlas GIS logo, the Avenue logo, the BusinessMAP logo, the Data Automation Kit logo, the Digital Chart of the
World logo, the ESRI Data logo, the ESRI globe logo, the ESRI Press logo, the Geography Network logo, the MapCafé
logo, the MapObjects Internet Map Server logo, the MapObjects logo, the MOLE logo, the NetEngine logo, the PC
ARC/INFO logo, the Production Line Tool Set logo, the RouteMAP IMS logo, the RouteMAP logo, the SDE logo, The
World's Leading Desktop GIS, Water Writes, www.esri.com, www.geographynetwork.com, www.gisday.com, and Your
Personal Geographic Information System are trademarks, registered trademarks, or service marks of ESRI in the United
States, the European Community, or certain other jurisdictions.
Other companies and products mentioned in this document may be trademarks or registered trademarks of their respective
trademark owners.
INFO and PC-INFO are trademarks of Doric Computer Systems International Ltd.
ArcView GIS uses Neuron Data’s Open Interface
Contains Ordnance Survey data © Crown copyright and database right November 2023
Introduction
Over the next two days you will learn how to get the most out of the ArcGIS API for Python.
The ArcGIS API for Python is a lightweight library for analysing spatial data, managing your
Web GIS, and performing spatial data science. In this course you will learn how to navigate
the API’s software development kit (SDK) to streamline your administrative tasks, saving
time and improving efficiency by automating the administration and management of your
Web GIS.
There are many ArcGIS Geospatial System diagrams available. This one shows the
importance of the Portal within the platform.
You may be aware that there are two types of Portal that you have access to.
The first type of portal is the cloud-based portal which is commonly known as ArcGIS Online.
The second type of portal is the on-premise portal which lives behind a secure firewall and is
known as ArcGIS Enterprise, which includes the Portal for ArcGIS component.
Both portals are essentially content management systems controlling who has access to
content (maps, layers and applications).
Access is granted via an organisational account. This account allows you to gain access to
your content through any device or desktop which is connected to the web.
Using the organisational account allows the user to perform management tasks based upon
the provided role and user type associated with the account.
Course goals
By the end of this course, you will be able to:
• Lessons: learning objectives at the beginning of each lesson to help you find the
information you're looking for
• Guided activities: interactive activities to reinforce key topics
• Exercises: step-by-step instructions for accomplishing essential tasks and building
skills
• Review: questions and answers that reinforce key concepts
Additional resources
Sample Notebooks
https://developers.arcgis.com/python/sample-notebooks/
Markdown Guide
https://www.markdownguide.org/
ArcGIS Documentation
https://doc.arcgis.com/en/
Jupyter Notebook
https://jupyter.org/
You will use the named user accounts associated with the Esri UK online training portal
which will grant access to the API through ArcGIS Pro.
Your instructor will provide you with a temporary account to use during class.
Username: _______________________________________________________________
Password: _______________________________________________________________
The ArcGIS API for Python is an Application Programming Interface (API) for managing your
portal. Python is the preferred programming language within ArcGIS and it continues to
evolve.
It is used widely within ArcGIS Pro, ArcGIS Enterprise and ArcGIS Online and is accessible
within these products through the addition of ArcGIS Notebooks, based upon the popular
Jupyter Notebook developer environment.
But how does the API communicate with your portal from within its hosting application?
How is it installed? And how can access to the API be obtained?
Learning objectives
After completing this lesson, you will:
• Understand what modules are installed with the ArcGIS API for Python.
• Understand how the API communicates with your portal.
• Understand the installation options for the API.
• Understand how to access the API.
Using the API you can take advantage of the many capabilities of the ArcGIS Geospatial
Platform to create and display web maps, query and administer your content, and use
analytical capabilities such as spatial analysis and geocoding.
The GIS represents either your on-premise portal or your online portal. Your GIS is
accessed through the arcgis site package, which is the ArcGIS API for Python.
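As a minimal sketch of what accessing your GIS through the arcgis package can look like, the function below wraps the GIS class from the arcgis.gis module. The function name connect_to_portal and the placeholder credentials are illustrative assumptions and are not part of the course material. The import is placed inside the function so that the sketch can be loaded on a machine where the API is not yet installed.

```python
def connect_to_portal(url, username, password):
    """Return a GIS object for the portal at `url`.

    Assumes the arcgis package (the ArcGIS API for Python) is installed
    in the active Python environment.
    """
    # Imported here so that defining the function does not require arcgis.
    from arcgis.gis import GIS

    return GIS(url, username, password)


# Example call (placeholder values, replace with your own account details):
# gis = connect_to_portal("https://www.arcgis.com", "your_username", "your_password")
```

The example call is left commented out because it needs a network connection and a valid organisational account.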
The ArcGIS API for Python also requires the SciPy ecosystem of modules (NumPy, Pandas,
SymPy and IPython) which are used extensively within the scientific and engineering
communities.
It also integrates well with the Jupyter Notebook Integrated Development Environment
(IDE), which provides the ability to write and share code, along with providing excellent
visualisations and descriptive comments.
The API communicates directly with your GIS server (ArcGIS Online or ArcGIS Enterprise)
using the REST architectural style. For example, you can get access to your GIS by sending a
request which includes properties identifying the address of your server (the URL), your
username and your password.
These requests to the server and the responses are all defined using the Python scripting
language.
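To make the idea of a REST request more concrete, the sketch below builds, but deliberately does not send, the kind of request that asks a portal for an access token. It uses only the Python standard library. The function name build_token_request is hypothetical, and the endpoint path and parameter names are assumptions based on the portal's sharing REST interface; in practice the API constructs requests like this for you.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(portal_url, username, password):
    """Build (but do not send) a REST request asking a portal for a token."""
    # The properties travel in the body of a POST request as form data.
    form = urlencode({
        "username": username,
        "password": password,
        "client": "referer",    # assumed: tie the token to the referer below
        "referer": portal_url,
        "f": "json",            # ask for the response as JSON
    })
    # Assumed endpoint path of the portal's sharing REST interface.
    endpoint = portal_url.rstrip("/") + "/sharing/rest/generateToken"
    return Request(endpoint, data=form.encode("utf-8"), method="POST")


request = build_token_request("https://www.arcgis.com", "your_username", "your_password")
print(request.get_method(), request.full_url)
```

Nothing is transmitted here; the request object simply shows how the URL, username and password are packaged before being sent to the server.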
Conda is a Python package manager that helps you install, manage and update packages and
their dependencies. The ArcGIS API for Python is managed via conda as are its
dependencies, such as the SciPy stack of modules.
The API can be installed in a number of ways. Complete instructions on how to install the
API can be found on the following page at the developers.arcgis.com website:
https://developers.arcgis.com/python/guide/install-and-set-up/
By far the simplest way to access the API is to install ArcGIS Pro.
The API and Conda are shipped with ArcGIS Pro 2.1 and all later releases. All you have to do is
install ArcGIS Pro.
The 2.x releases of ArcGIS Pro contain the Python Package Manager in the backstage area of
ArcGIS Pro, in which the arcgis package and its supporting dependencies can be viewed.
This is not the case at the 3.x releases of ArcGIS Pro. The arcpy and arcgis site packages
are not displayed in the backstage area, even though they are installed.
You can view the current release of arcpy and arcgis by viewing the installed packages
in the PyCharm developer environment:
ArcGIS Pro 3.1 ships with release 2.1.0.2 of the API, which is also the latest release of
the API as of September 2023.
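If you would rather check the installed release from code than from an IDE, the following sketch uses the standard library's importlib.metadata module. The helper name installed_version is hypothetical, and the sketch assumes each package was installed with metadata that Python can read; a package that cannot be found is reported as not installed.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a package, or None if it is absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


for name in ("arcgis", "numpy", "pandas"):
    print(name, installed_version(name) or "not installed")
```

Run this in the Python environment you intend to use, as each environment has its own set of installed packages.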
Jupyter notebook is also installed and can be accessed through the Python Command
Prompt.
There are a number of ways to install the ArcGIS API for Python. A few are mentioned
below, but for a complete list, with steps for each installation method, you are
recommended to read the documentation:
https://developers.arcgis.com/python/guide/install-and-set-up/
Anaconda is a distribution which installs Python, Conda (for package management) and many
additional useful Python packages. The ArcGIS API for Python requires Python 3.5 or higher;
the latest release of Anaconda contains the 3.11 release of Python, which meets this
requirement.
The API is installed via the Anaconda terminal application using the command:
If you wish to perform deep learning using the API or ArcGIS Pro then additional deep
learning dependencies will need to be installed which will support the API’s learn module.
These dependencies include PyTorch and TensorFlow, which are used for language
processing and object detection.
• The latest version of Anaconda for Python 3.x for your operating system.
• The appropriate version of the API for your operating system from the Esri channel
on anaconda.org.
Anaconda should then be installed and configured for use offline, and then the API can be
installed.
For more information on Docker, containers and images, please go to the following link:
https://www.docker.com/resources/what-container
The API is shipped as an image which you can download and power up whenever you want
to use it. The container containing the API runs in an isolated environment without making
any changes to your underlying file system.
Google Colaboratory
A new way to access the ArcGIS API for Python is via Google Colaboratory. It is basically a
Google-hosted Jupyter notebook service that allows a user to access notebooks from
anywhere by storing them in Google Drive.
The ArcGIS Notebooks are all based upon the Jupyter Notebook. The Jupyter Notebook is the
preferred IDE for data scientists for its ability to provide markup, its ability to run code
within cells and its rich data display function:
It is within these different adaptations of Jupyter Notebook and IDEs that you can write
Python code which replicates and realises your workflows.
As well as accessing the ArcGIS API for Python through these different developer
environments, you can also access the additional Python packages installed by the API.
https://www.arcgis.com
❑ Sign in to the online portal using the credentials you were supplied with a few
moments ago.
Your online portal contains several tabs which allow you to access different parts of your
portal.
The Notebook tab provides you with access to create a new ArcGIS Notebook.
The Notebooks page is displayed. This is where any notebooks you create can be managed,
where sample notebooks can be accessed and where new notebooks can be created.
❑ Click the New Notebook button to display the options that are available for the type
of notebook you wish to create:
Notice that two of the options consume credits. The Advanced option allows you to use the
ArcPy site package in an online environment and consumes 3 credits an hour. The Advanced
with GPU support notebook engine is best suited for deep learning processes and consumes
30 credits an hour.
The online implementation of the ArcGIS Notebook allows you to perform Python based
processes and workflows within your own portal. It is where you will write your Python code
to perform simple administrative tasks as well as more complicated data analytics. It allows
you to easily search for and add content into your Notebook which is hosted in your portal
or within The Living Atlas of the World.
The default Notebook contains code cells which help you to access your GIS (portal)…
…and a code cell, underneath, for you to write your own code.
The notebook allows you to execute the code in each cell independently. This is handy if you
have a cell which contains code which will use credits (for example performing enrichment
of a layer) and you only want to run that cell once to save on credit consumption. You can
run other cells which subsequently take advantage of the data created in that previous cell
without having to run the code in it again.
❑ In the Notebook locate the cell underneath the message “Now you are ready to
start!”.
❑ Place your cursor in the empty cell and write the following line of Python code:
print("Hello world!!")
Notice that the cell which you have focus in, for example as you are typing, is green and that
you have code colourisation, just as you would get in any other Python developer
environment.
This will run the code in the cell, display any output underneath the cell and create a new
cell.
As you can see it is quite straightforward to write your code in cells and execute cells
containing lines of code.
❑ Finally, save your Notebook (call it MyFirstNotebook) and fill in the metadata for
your new notebook.
___________________________________________________________________________
___________________________________________________________________________
This will create a new Notebook item in your Content area of your portal.
❑ Click the burgers (yep – that’s what they’re called…) next to the name of your
notebook and click the Content menu item.
This will display the contents (items) in your portal which you have created. At the moment
you only have your notebook as the single item.
You can open this up and use it again in future or make it available to others either
internally within your organisation or externally by altering the item’s sharing properties.
❑ Click the Notebook tab again to return to the notebook section of your portal.
The notebook samples can be accessed from the Esri sample notebook button:
Notice that there are about 20 samples and they are organised into categories, such as
‘Administration’ and ‘Data Science and Analysis’.
❑ Locate the sample called “Content Management: Check for broken URLs”.
❑ Click on the sample’s notebook icon to display the metadata page for the sample
notebook and click Open Notebook to display the code.
It can look quite daunting, but if you spend a couple of minutes reading the sample then
the comments (markup) provide a description of the sample.
❑ Close all of the open tabs on your browser which relate to ArcGIS Online.
One of the aims of this course is to make you feel comfortable reading and investigating the
samples by understanding the different aspects of the API.
On the ArcGIS Sign-in dialog notice that the Sign me in automatically check box is ticked.
This indicates that when you start ArcGIS Pro again you will not have to log in using
your ArcGIS organisational account.
❑ Fill in the username and password and press the Sign in button.
You will see the ArcGIS Pro Home screen from which you can create new projects or open
existing ones.
You also started to investigate some of the sample notebooks that are available to you in
the ArcGIS Notebooks. These samples can appear daunting at first sight but as you become
more familiar with the Notebook and the API you will soon start to gain confidence in
understanding what the notebooks do and start to write your own code within the different
types of Notebook.
You will start to investigate the other types of Notebook, in its many different guises, in the
next section.
Exercise End
Answer: You must fill in the Title and Tags properties. The Folder option is already filled in
with a default value. All other properties are optional.
2 Introducing Notebooks
There are many tools which you can use to write your Python code. One of the most popular
tools at the moment is Jupyter Notebook.
Jupyter Notebook is a free, open-source web tool and is an example of a computational
notebook. It can be used to combine code, rich output, explanatory text and
multimedia resources in a single document. Improvements in the underlying
web software, the increasing use of Python and data science in general, and the
ease with which remote, cloud-based data can be accessed have led to a huge increase in the
use of Jupyter Notebooks.
The Jupyter Notebook doesn’t just work with Python and this is reflected in its name which
is inspired, according to its co-founder Fernando Perez, by the programming languages Julia
(Ju), Python (Pyt) and R.
This section will introduce you to the different types of Notebook that are available to write
your API code.
Learning objectives
After completing this lesson, you will:
PyCharm is the recommended IDE for writing geoprocessing-based Python scripts for ArcGIS
Pro as it allows you to use virtual environments (as defined by Conda). Notice that the
Project Interpreter is using the default Python 3.9 environment called arcgispro-py3. The
arcgis package is one of the available packages within this default environment which
means that you can use the ArcGIS API for Python (and supported modules and packages) in
conjunction with arcpy.
The Jupyter Notebook App is the recommended IDE for working with the ArcGIS API for
Python. This is because you can run Python code interactively, visualise script output (such
as charts, maps and tables), display advanced comments and mark-up, and share scripts to
other IDEs.
If ArcGIS Pro is installed, or the API is installed through Anaconda, then Jupyter Notebook
will be installed as well.
As well as the traditional Jupyter Notebook app, Esri has introduced variations of the app
which are embedded into many Esri products, and are collectively known as ArcGIS
Notebooks.
All Python functionality is available including the core standard library, ArcPy and the ArcGIS
API for Python, as well as useful and dependent packages such as NumPy and pandas.
Existing Notebooks can be accessed within the project through the Catalog pane.
A Notebook can be viewed within ArcGIS Pro by opening the notebook document (ipynb)
file. This will display the notebook document in its own view tab.
If the Notebook needs to be part of the ArcGIS Pro project then it can be added to the
project via the Catalog pane.
A new entry called Notebooks will then be added to the Catalog pane.
You have the ability to create a new Notebook within ArcGIS Pro via the Insert tab > New
Notebook and you should then specify the location and the name of the notebook
document.
The location of a notebook can be identified by hovering over the notebook in the
Notebooks tab.
Notebooks can be displayed via the Notebook folder on the Catalog pane by locating the
Notebook and choosing Open Notebook via its context menu.
Portal Notebooks
ArcGIS Notebooks are also available within your Portal (ArcGIS Online and ArcGIS
Enterprise).
If you have permissions to create Notebooks then you will see a new Notebook option in the
top ribbon navigation of ArcGIS Online.
For non-administrators to create notebooks, a new set of privileges needs to be
granted as part of a custom role:
• Advanced privilege (Premium Content) to use ArcPy or GPU enabled notebooks called
Advanced Notebooks.
Creating and editing notebooks using the General privileges will not consume credits (apart
from running the online geoprocessing tools), as it takes advantage of standard open source
libraries or functionality that are already built into ArcGIS Online.
Running a notebook using the Advanced runtime (which will run ArcPy) will consume 3
credits an hour per notebook.
If you are running a notebook using the Advanced with GPU runtime (to run deep learning
models) then this will cost 30 credits per hour, per notebook.
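The credit figures quoted above lend themselves to a quick estimate before you start a long-running notebook. The sketch below is a worked example only: the dictionary keys and the function name estimate_credits are labels chosen for illustration, and the rates are those stated in this section, so check the current rates for your organisation before relying on the result.

```python
# Credits per hour for each notebook runtime, as quoted in this section.
# The dictionary keys are labels chosen for this sketch.
CREDITS_PER_HOUR = {
    "standard": 0,
    "advanced": 3,
    "advanced_gpu": 30,
}


def estimate_credits(runtime, hours, notebooks=1):
    """Estimate the credits used by `notebooks` notebooks running for `hours`."""
    return CREDITS_PER_HOUR[runtime] * hours * notebooks


print(estimate_credits("advanced", hours=4))                   # 12
print(estimate_credits("advanced_gpu", hours=2, notebooks=3))  # 180
```

The estimate covers runtime charges only; any credits consumed by online geoprocessing tools run from the notebook are not included.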
When you create a notebook in ArcGIS Online, code will be written for you to help you get
started:
Let’s first of all define what an ArcGIS Enterprise installation comprises.
• Web Adaptor
• Portal for ArcGIS
• GIS Server
• Data Store
These four components are known as the ‘base deployment’.
If you want to be able to have access to and create your own Notebooks then you need to
perform the following tasks:
The machine then needs to be licensed to run the Notebook Server software. This creates
an ArcGIS Server site, licensed with the ArcGIS Notebook Server role, which can provide
notebook capabilities to an ArcGIS Enterprise deployment.
This is done by logging into Portal for ArcGIS and adding the notebook server as a Server
site. In doing so you are federating the ArcGIS Notebook Server site with your Enterprise
portal. Other federated ArcGIS Server sites then provide functionality to the notebook, for
example a federated ArcGIS Image Server site provides image analysis capabilities to an
ArcGIS notebook site.
Performing the above tasks as an administrator will add the Notebook tab to your portal.
You now need to allow users to create, edit and run notebooks in the portal.
A custom role needs to be created and assigned to those users who will perform tasks with
notebooks in the portal. The custom role will have privileges assigned to it which determine
what the users assigned with the role can do. The Create and edit notebooks privilege must
be assigned to the custom role at the very least in order for users to create and edit
notebooks. Additional privileges are available for further tasks, such as the Advanced
notebooks privilege, which allows users to run notebooks using the Advanced notebook
runtime.
For detailed information on ArcGIS Notebooks in ArcGIS Enterprise, please go to the
following URL:
https://enterprise.arcgis.com/en/notebook/
Jupyter Notebook
The Jupyter Notebook App is an open source web application which is running on your local
desktop. It allows you to create, edit and run your Python code in notebook documents. It is
installed when the ArcGIS API for Python is installed with ArcGIS Pro, or through the Conda
package management system.
1: A web application: running on your own machine, it provides you with the ability
to create and author notebook documents. This is called the Notebook dashboard.
The notebook can be started by typing jupyter notebook at the Python Command Prompt,
or via the Start menu.
In doing so some information about the notebook will be displayed in the command
window, including the URL of the web application which is hosting the dashboard (by default
http://127.0.0.1:8888).
The picture below displays the Notebook Dashboard and this allows you to access local
notebook documents and Python scripts:
The landing page of the Jupyter Notebook web application, the dashboard, shows the
notebooks currently available in the notebook directory. This directory is displayed as this is
the directory in which the notebook server was started, namely:
C:\Program Files\ArcGIS\Pro\bin\Python\envs\arcgispro-py3
The new notebook will be created in the same directory which is currently being viewed and
will be reflected in the notebook list in the dashboard.
The contents of existing notebooks can be displayed by clicking on the notebook (ipynb) file
in the folder list.
Either action will launch a new tab on the browser and will start a kernel.
The Jupyter Notebook app has a kernel for Python code, but there are other kernels
available for other programming languages.
The kernel remains active even if the browser window is closed. Reopening the same
notebook from the dashboard will reconnect the web application to the same kernel.
Kernel information is printed to the command prompt terminal, including the unique ID of
the kernel.
Periods of notebook inactivity will close the associated kernel down, and you may lose your
work as a result if you have not saved the notebook.
You will see the Untitled notebook name, a menu bar, a simple toolbar and an empty cell in
which you can write your code.
The default name of the notebook is set to Untitled. This can be changed by clicking on the
name and typing a new name in the dialog box.
The menu bar contains options that help you manage notebooks.
The toolbar contains many quick-access options which will help you write, run and format
your code.
The code cell is where you will write your code. There are different types of cells.
Types of cell
Jupyter Notebook supports three types of cell. Code cells and Markdown cells are more
important than Raw cells.
• Markdown provides rich text for annotating and commenting your code.
• You can test each process independently without having to re-run the entire script.
Code cells
A code cell allows you to write and edit your code. It provides full IntelliSense and tab
completion.
When the code cell is executed, the code it contains is sent to the kernel and executed, and
the results are returned to the notebook as the cell’s output. The output can be text, HTML
tables, matplotlib figures and maps. This is known as Jupyter Notebook’s rich display
capability.
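One way to see how rich display works is to give one of your own classes a _repr_html_ method, which Jupyter looks for when deciding how to render an object. The sketch below is illustrative only: the LayerSummary class and its sample rows are hypothetical and are not part of the ArcGIS API for Python.

```python
from html import escape


class LayerSummary:
    """A hypothetical table of (layer name, feature count) rows."""

    def __init__(self, rows):
        self.rows = list(rows)

    def _repr_html_(self):
        # Jupyter calls this method, when present, to render the object as HTML.
        body = "".join(
            f"<tr><td>{escape(str(name))}</td><td>{count}</td></tr>"
            for name, count in self.rows
        )
        return f"<table><tr><th>Layer</th><th>Features</th></tr>{body}</table>"


summary = LayerSummary([("Rivers", 120), ("Roads", 4500)])
summary  # as the last expression in a code cell, this is displayed as a table
```

Outside a notebook the final line does nothing visible, which is a useful reminder that the rich display is provided by the notebook and not by Python itself.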
Markdown cells
Markdown allows you to document your notebook with text and images using rich text. In
IPython this is accomplished by marking up text with the Markdown language, which allows
text to be emphasised using italics, bold and so on.
Raw cells
Raw cells provide a place in which you can write output directly. They are used if you wish to
convert a notebook to another format (such as HTML) and those cells will be converted in a
way specific to HTML.
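Because a notebook document (ipynb) is stored as JSON, you can inspect which types of cell it contains with a few lines of standard-library Python. The sketch below assumes the version 4 notebook format, in which each entry in the top-level cells list carries a cell_type of code, markdown or raw. The function name count_cell_types and the tiny example notebook are made up for illustration.

```python
import json
from collections import Counter


def count_cell_types(notebook_json):
    """Count the code, markdown and raw cells in a notebook document."""
    notebook = json.loads(notebook_json)
    return Counter(cell["cell_type"] for cell in notebook.get("cells", []))


# A tiny hand-written notebook document used purely for illustration.
example = json.dumps({
    "nbformat": 4,
    "nbformat_minor": 5,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {}, "source": ["# Title"]},
        {"cell_type": "code", "metadata": {}, "source": ["print('Hello')"],
         "outputs": [], "execution_count": None},
        {"cell_type": "raw", "metadata": {}, "source": ["raw text"]},
    ],
})

print(count_cell_types(example))
```

To run the same check on one of your own notebooks, read the contents of the ipynb file as text and pass that text to the function.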
Executing cells
Cells do not execute automatically – you will need to execute the cells yourself. You have
a choice as to which cells you wish to execute.
Once you have written your code in the cell you should press <SHIFT> and <ENTER> to
execute the code in the cell, or you can execute the code in the cell through the toolbar or
menu bar. Successful execution will also create a new cell for you.
A common issue is that you might forget to run code in a previous cell, but that code is
required in order for subsequent code to be executed successfully.
If you have written code in multiple cells and want to run all the code then from the Cell
menu choose Run All.
There are additional options for running cells on the Cell menu.
Checkpoints
Every time you create and save a notebook a corresponding checkpoint file is created. This is
stored in a folder called .ipynb_checkpoints which is in the same folder as the parent
notebook file.
Every time you perform a manual save then the checkpoint file is updated with your
amendments.
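If you want to see which checkpoint files exist for the notebooks in a folder, a short sketch such as the one below can list them. It assumes the checkpoint folder uses Jupyter's default name, .ipynb_checkpoints, and the function name find_checkpoints is hypothetical.

```python
from pathlib import Path


def find_checkpoints(notebook_dir):
    """Return the checkpoint files stored alongside the notebooks in a folder."""
    checkpoint_dir = Path(notebook_dir) / ".ipynb_checkpoints"
    if not checkpoint_dir.is_dir():
        return []
    return sorted(checkpoint_dir.glob("*.ipynb"))


for checkpoint in find_checkpoints("."):
    print(checkpoint.name)
```

The folder name starts with a full stop, so it may be hidden by default in some file browsers.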
These files will be “downloaded” from your active Python environment to your Downloads
folder.
Headings
Headings range from a single ‘#’ for the largest heading to six ‘#’s for the smallest heading:
Blockquotes
Blockquotes can be thought of as general comments. When viewed they are indented and
are obtained by using the Markdown symbol ‘>’.
Displays…
Notice that the code is highlighted when the markup cell is executed.
Notice the line break <br> tag, which adds a new line into the markup.
Creating Lists
Lists can be created using a couple of techniques:
Ordered (numbered) lists can be created using the <ol> tag and then <li> embedded inside
of the <ol> tag:
Unordered (bullet) lists can be created pretty much in the same way as ordered lists; the
only difference is that the <ul> tag is used instead of <ol>.
There are many additional formatting tags that can be used which aren’t mentioned here,
such as creating tables and inserting images. For a more extensive list you are
recommended to have a look at the following blog article:
https://www.datacamp.com/community/tutorials/markdown-in-jupyter-notebook
Code colourisation
Just like in other IDEs you have code colourisation to help you write your code:
So, operators are purple, strings are brown, statements are bold green while built-in
functions are green.
Code completion
Auto-completion is available, although it doesn’t occur automatically as it does in other
IDEs. Instead you have to type the name of the object, followed by a full stop, and then
press the <TAB> key.
Enabling IntelliSense
By default IntelliSense is not enabled in the Notebook. To enable it, the following code
should be written at the top of the Notebook:
%config IPCompleter.greedy=True
Then, within the brackets of the function, press <Shift> and <Tab> to display the floating
help.
First of all you will start Jupyter Notebook, create a new notebook and identify where the
notebook has been created.
This will display the Python command prompt. It is very similar to a DOS command prompt
except that it allows you to run Python code within the active Python environment.
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
This Python environment was created when ArcGIS Pro was installed, and this is where
Python is installed.
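You can also ask Python itself where the active environment lives. The short sketch below uses only the standard library's sys module, so it makes no assumptions about ArcGIS; the paths it prints depend entirely on your own installation.

```python
import sys

# Full path to the Python executable that is running this code.
print("Interpreter:", sys.executable)

# Root folder of the active Python environment.
print("Environment:", sys.prefix)
```

Running this from the Python command prompt and from a notebook cell is a quick way to confirm that both are using the same environment.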
❑ Open up File (Windows) Explorer and browse to the directory location which is the
answer to question 2.
Notice the directories which are contained here. These are the standard Python 3.x
installation folders and associated files.
You shouldn’t create any scripts or notebooks in this directory as it contains the Python
system files and folders, and there may be locks placed upon the files and folders in this
location. Instead you should start Jupyter Notebook in a more accessible location.
You will launch the Jupyter Notebook in a location from which you can explore the course
data and access any existing notebooks to create new ones.
This will send the command prompt to the root of the C:\ drive.
jupyter notebook
After a moment or two the command window starts the notebook server and displays
additional Jupyter-related information, including the location at which the notebook server
is running.
The command line information also indicates how to shut down your local web server which
is running this instance of Jupyter Notebook by using the <CTRL> and C keys.
A new tab in your browser opens to show the notebook dashboard. The dashboard displays
the contents of the \PYAPI folder.
You can access the contents of any of these sub directories within the dashboard, and
create new notebooks in here too.
Inside is an existing notebook called CreateMap.ipynb. You will investigate this later on in
the exercise.
You can see that the notebook has a simple user interface with a single cell that is ready to
accept your code. Your notebook is currently called Untitled.
When a notebook is created for the first time a kernel is started. The kernel executes code
and returns the results back to the notebook. An entry relating to the kernel can be seen in
the Python command window.
❑ Click Untitled in the title area and rename the notebook as “HelloWorld”.
A new file called HelloWorld.ipynb is created in the Home folder of the notebook dashboard.
print("Hello World…")
❑ Press <Shift> and <Enter> to run the code in the active cell.
The kernel executes the code and returns the results back to the notebook for display.
Notice, also, that a new cell is created:
You can ‘download’ (export) your notebook into the Downloads folder on your machine.
This can be useful if you want to export your notebook to a different format.
Although it appears that the Jupyter Notebook has been closed, in reality it hasn’t. It is still
running in the background, including the kernel. To close the notebook down properly you
will need to do this in the terminal window.
❑ Locate the Python command window and press <Ctrl> and C to shutdown the
Notebook session.
Once the session has been terminated you will be taken back to the command prompt.
In this first step you have investigated the notebook server and the dashboard. You also
created a basic script, renamed it and downloaded it to a more accessible location. You
finally learned how to close the notebook server.
❑ Start Jupyter notebook, click the Notebooks directory to access its contents and
create a new notebook called ProcessDictionary.
❑ In the empty cell create a variable called trees and assign to it an empty dictionary.
trees = {}
As you know, a Python dictionary is composed of key : value pairs. Your dictionary will be
composed of the name (key) and the genus (value); for example, Hazel : Corylus.
❑ Add the following items in the table (see the next page) to the trees dictionary, for
example to add Hazel (key) and Corylus (value) to the dictionary you would type:
trees["Hazel"] = "Corylus"
Key Value
Hazel Corylus
Elm Ulmus
Oak Quercus
Beech Fagus
Ash Fraxinus
❑ Press <Shift> and <Enter> to run the cell (and create a new cell).
❑ In the new cell, type trees, and then a full stop followed by a press of the <TAB>
key.
This will display a drop-down list of valid members for the trees dictionary object:
❑ Create a variable called treeItems and assign to it the trees dictionary with the
items() method:
treeItems = trees.items()
You can obtain documentation (the docstring) for the method by placing the mouse cursor
between the brackets and pressing <SHIFT> and <TAB>.
You will now write a basic for loop to process the dictionary’s items. You will store the key
and value pair in two variables which will be part of the for loop.
❑ Print the data contained in the key and value inline variables.
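As a minimal sketch of the loop described above, the (key, value) pairs can be unpacked directly in the for statement. The loop variable names key and value are illustrative only; any names can be used.

```python
# Build the trees dictionary from the table in this step.
trees = {}
trees["Hazel"] = "Corylus"
trees["Elm"] = "Ulmus"
trees["Oak"] = "Quercus"
trees["Beech"] = "Fagus"
trees["Ash"] = "Fraxinus"

# items() returns a view of (key, value) pairs.
treeItems = trees.items()

# Unpack each pair into two loop variables and print them.
for key, value in treeItems:
    print(key, ":", value)
```

Each line of output shows one tree name followed by its genus, in the order the items were added.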
In this quick step you have had a brief recap of working with Python dictionaries, which are
used frequently within the ArcGIS API for Python.
The Jupyter Notebook allows you to write traditional Python code and provides you with the
ability to obtain basic IntelliSense and documentation.
You will first of all create your main heading and some additional descriptive text.
❑ Click in the top cell (the cell in which the dictionary is created) and from the Insert
menu choose Insert Cell Above.
This will add a new Code cell above the existing cell.
❑ Add a hash sign # to the markdown cell and then type after it “Working with a
Python Dictionary”, so:
Underneath it you will then add a blockquote which allows you to provide slightly indented
free text.
❑ Underneath the main heading add a > and then type the following “In this simple
notebook you will explore how to create a dictionary, populate it and process it
using a for loop. The output will be written to the end of each cell.”
The markup in the Markdown cell should look like the following:
❑ Make sure the mouse cursor is in the Markdown cell and press the Run button.
The markup language has been interpreted and the cell has now been formatted.
❑ Add a line break tag <br> after the text “populate it”.
This will split the line onto two lines which is more pleasing on the eye.
❑ Add the following markup underneath the blockquote to create an unordered list
explaining the key : value pairs which are added to the dictionary.
In this quick step you have refreshed your memory on how to work with Python dictionaries
and how to create some descriptive notation which aids the understanding of your script.
This is one of many pre-written notebooks which you will use in the course.
❑ In the dashboard tab make sure you are in the Notebooks folder and that the
CreateMap.ipynb file is present.
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
❑ Place the mouse cursor in the cell containing the code and click the Run button.
As the code in the cell executes notice the * next to the cell. This indicates that the kernel is
executing the code:
Once execution has been completed, the kernel then displays the result back to the
notebook in the form of a map widget object.
The map contained inside of the map widget is displayed on the next page.
This map is fully interactive in that you can pan, zoom and rotate the map!
In the next section you will start to investigate the code in more detail and gain a full
understanding of how to build a map and add extra detail to it by exploring the
documentation.
❑ Close all of the browser tabs containing the Jupyter Notebook and close the Python
command prompt window.
So in this exercise you have investigated the Jupyter Notebook IDE. You have explored how
to write and run some basic Python code, create markup and obtain help. Finally, you
have learned how to launch Jupyter Notebook in a directory from which it can access
pre-written notebooks to run some API code.
Exercise End
The GIS class is imported from the gis module of the arcgis package; a new instance of the
GIS class is created and from that a map, centred on Aylesbury, is displayed.
The kernel executes code and provides the results back to the notebook for display.
Close all tabs down in your browser and press CTRL and C in the Python command window
to close the kernel down.
Question 5: How do you launch a Jupyter Notebook in a directory other than the home
directory?
In the Python command window you should use the cd command to navigate to the desired
folder.
The key to learning how to write code is to understand how to read the software developer
kit (SDK). The SDK for the ArcGIS API for Python is comprehensive in that it provides:
This section will show you how to take full advantage of the SDK by learning how to navigate
the API reference, while introducing the sample notebooks and the guide.
Learning objectives
After completing this lesson, you will:
Section keywords
Keyword Description
As we have seen, the API is installed through ArcGIS Pro, or via conda or by pip.
The API is supplied as a Python package called arcgis. The package contains a number of
different modules which in turn contain classes, functions and types. These allow you to
perform administration, spatial analysis, run deep learning models and data cleaning across
the ArcGIS platform.
Each module organises and provides access to functionality, and focuses on one part of
your GIS.
As you can see, there is a certain amount of colour-coding which is there to help you
identify functionally similar modules. Generally speaking:
• Green modules provide access to geographic data and spatial capabilities, for
example geoprocessing functions and helper objects (more about helper objects
later). The env module provides the ability to share variables between different
modules, for example the default geocoder and environment settings that are
common to all geoprocessing tools.
• Purple modules work with feature data, feature layers and collections of feature
layers in your portal; provide extended support for the Pandas DataFrame; and work
with continuous data stored in the form of raster data and imagery layers.
• Orange modules allow you to visualise and share your data and the results of your
work. The widgets module allows you to work with the MapView notebook widget.
• The red module contains very basic functionality for managing Notebooks.
• The gray Learn module provides access to many models for machine learning within
the API. The module is not installed automatically and needs to be installed
separately.
It represents and provides access to your GIS. We will look at this in a few moments. But
first…
https://developers.arcgis.com/python/guide/overview-of-the-arcgis-api-for-python/
This page provides a nice pictorial representation of the contents of the ArcGIS
API for Python. It is not accessible directly through the Guide so it is worth
bookmarking this page for your future reference.
Question 1: The arcgis.gis module provides functionality to manage GIS users, groups
and content. True or False?
___________________________________________________________________________
___________________________________________________________________________
Question 2: Which module allows you to work with feature data, feature layers and
collections of feature layers?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
Question 4: What are the three types of geocoding that the arcgis.geocoding module
performs?
___________________________________________________________________________
___________________________________________________________________________
Question 5: Which module would you investigate if you wanted to process sensor data and
real-time data feeds?
___________________________________________________________________________
___________________________________________________________________________
The gis module is composed of a number of classes which help you manage your GIS.
These include GIS, Item, User, Group, Role.
There are also a number of sub-modules, such as admin, server, agonb, sharing and tasks.
Each of these sub-modules contains its own classes which help you administer your GIS.
There are also a number of Helper classes, so-called Resource Managers which help you
manage your users, groups and content in general.
We will explore the pure administration side of the gis module more in a subsequent
section.
To access the gis module you might type the following code:
import arcgis
myGIS = arcgis.gis
More commonly, the GIS class is imported directly from the gis module:
from arcgis.gis import GIS
This is basically saying:
“from the arcgis package, and from the gis module, give me
access to my GIS through the GIS object.”
GIS is a class within the gis module and it is your entry point into your code and into
either your online or on-premise portal.
The corresponding GIS object is used to manage users, groups and datastores and for
accessing your GIS content.
To access your organisation in ArcGIS Online you would write the following code:
myGIS = GIS("https://www.arcgis.com",
Org_uname, Org_pword)
If no username and password are supplied then a connection to ArcGIS Online is made as an
anonymous user. This will allow you to access content that has been shared with the
EVERYONE option.
http://machinename.domain.com/webadapter
Built-in user account: Similar to how accounts are managed in ArcGIS Online, through a built
in Identity Store.
Use an Enterprise Identity Store: Use web-tier \ Portal-tier authentication using LDAP
(Lightweight Directory Access Protocol), or Portal-tier authentication using Active Directory.
Using PKI for Enterprise web-tier authentication: PKI stands for Public Key Infrastructure.
There are two methods for connecting to a PKI protected ArcGIS Enterprise; a file is used
either as a client-side certificate, or as a formatted certificate file and password.
Through ArcGIS Pro: This might be a good option to consider if you have an advanced
authentication scheme such as Kerberos or Smart Card.
https://developers.arcgis.com/python/guide/working-with-different-authentication-
schemes/
But how do we know what the arguments are for working with the GIS class? Are they
mandatory or are they optional? The API Reference contains all of this information.
The API Reference for the ArcGIS API for Python can be found here:
https://developers.arcgis.com/python/api-reference/
The reference displays documentation for the arcgis.gis module by default and it is
from here that you can locate the documentation for the GIS class, and all of the other
classes.
Clicking the link for the GIS class will display its documentation within the API Reference.
Class description
The API reference provides a description of what the class is and what it does, along with
information as to how to create an object from that class.
The keyword class indicates that GIS is indeed a class. Its location within the API (arcgis
package \ gis module) is also presented.
The constructor (the brackets) specifies what arguments the GIS class can receive which
helps instantiate the GIS object; in this case the authentication options mentioned earlier in
the section. For the GIS class the arguments are all optional and specify default values, or
what will happen if no optional arguments are supplied.
Generally, the documentation provides a description for each argument and indicates if the
argument is optional or not, and its type i.e. string, boolean, integer:
Code snippets
The documentation for the GIS also includes some code samples which should be studied:
More often than not the return value for the property will be specified in the
documentation, but in this example it is not.
Notice that the groups property provides a hyperlink to another part of the API Reference
from which you can access the GroupManager helper class which is found in the
arcgis.gis module.
Hovering over the GroupManager hyperlink provides a floating message indicating the
location of the GroupManager class.
Methods can be identified by having a pair of brackets after their name, for example
update_properties(). Once again, arguments can either be optional or required, with
the method documentation describing the arguments and the method’s return value.
If a hyperlink exists then you can hover over the hyperlink to determine where the object
created by the method resides within the API.
The map widget is instantiated by calling the map() method, which is found on the GIS
object.
The map() method takes a number of arguments which, for example, allow you to specify a
location to centre the map on and a zoom scale based upon the basemap levels of detail
scale settings.
The map() method returns a reference to a MapView object. The corresponding MapView
class is found in the arcgis.widgets module.
By exploring the MapView class documentation, the properties and methods of the
MapView can be examined and applied to the map. For example, add / remove a layer,
change the basemap, draw on it, rotate the basemap and save it as an item in the GIS.
Code snippet
This code snippet demonstrates how to work with some of the members on the MapView
object, such as basemap, and the save() method.
The above snippet sets a new basemap for the map widget and then the save() method
creates a brand new web map item in the online portal.
Notice that a dictionary of item properties is supplied to the save() method, as this is
used to provide the required metadata for your new portal item.
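The pattern described above can be sketched as follows. This is an illustration only: the two helper functions are hypothetical, map_view is assumed to be the MapView object returned by map(), and the keys title, snippet and tags are assumed to be the metadata keys the API Reference asks for when saving a web map.

```python
def build_item_properties(title, snippet, tags):
    """Return the metadata dictionary that is passed to save().

    This helper is hypothetical; it only assembles a plain dictionary.
    """
    return {"title": title, "snippet": snippet, "tags": tags}


def save_with_basemap(map_view, basemap, item_properties):
    """Set a new basemap on the map widget, then save it as a web map item.

    map_view is assumed to be the MapView object returned by gis.map().
    """
    map_view.basemap = basemap
    return map_view.save(item_properties)


# Example metadata; the values are illustrative only.
props = build_item_properties(
    "Aylesbury map", "A web map saved from a notebook", ["training", "PYAPI"]
)
print(props)
```

With a map widget held in a variable, the call might look like save_with_basemap(myMap, "topo-vector", props). Keeping the metadata in its own dictionary makes it easy to check before anything is written to the portal.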
https://developers.arcgis.com/python/api-reference/
• Scroll down to find the map() method on the GIS class and answer the following
questions:
Question 1: How many arguments are optional for the map() method and what are they?
___________________________________________________________________________
___________________________________________________________________________
Question 2: Is the map widget only supported within Jupyter Notebook? (Yes / No)
___________________________________________________________________________
Question 3: What hyperlink should you click on to access the documentation for the map
widget?
___________________________________________________________________________
___________________________________________________________________________
Question 5: On the MapView class help, what does the snap_to_zoom() method do?
___________________________________________________________________________
___________________________________________________________________________
Answer: TRUE
Question 2: Which module allows you to work with feature data, feature layers and
collections of feature layers?
Answer: the arcgis.features module allows you to accomplish the above tasks.
Answer: The arcgis.raster module contains classes and raster analysis functions for
working with raster data and imagery layers.
Question 4: What are the three types of geocoding that the arcgis.geocoding module
performs?
Question 5: Which module would you investigate if you wanted to process sensor data and
real-time data feeds?
Answer: The arcgis.realtime module provides types and functions for processing
sensor data and real-time data feeds.
Answer: The map() method takes 4 optional arguments and they are location, zoomlevel,
mode and geocoder.
Question 2: Is the map widget only supported within Jupyter Notebook? (Yes / No)
Question 3: What hyperlink should you click to access the documentation for the map
widget?
Answer: Clicking on the hyperlink will take you to the arcgis.widgets module.
Question 5: On the MapView class help, what does the snap_to_zoom() method do?
Answer: It allows the map to display at the next level of detail when zooming in or out
when set to True, or for continuous zooming when set to False.
First of all you will start Jupyter Notebook and navigate to the exercise working directory.
❑ In the Windows search bar type “Python command prompt” and press <Return>.
..\..\arcgispro-py3>cd \
This will take you to the root directory of your C:\ drive. This is a good place to be as you can
now start Jupyter Notebook and navigate other folders while within the Jupyter Notebook
environment.
❑ Once again, navigate into the C:\EsriTraining\PYAPI folder using the cd command:
cd C:\EsriTraining\PYAPI
The Python command prompt should look the same as the picture below:
The Notebook Dashboard will display in a new browser tab after a few moments.
Inside is a single text file called ThisFolderIsEmpty.txt. It is in this folder that you will create
your notebook.
❑ Create a new notebook by clicking on the New button > Python 3 (ipykernel).
This will open up your new notebook in a new tab on your browser.
___________________________________________________________________________
___________________________________________________________________________
❑ From the File menu choose Rename and change the name to InvestigatePortal and
click the Rename button.
You are now ready to write some code and connect to your portal.
Before you do that you will first of all add some markdown notes to your notebook to
indicate what you are going to do.
This will display the following in the cell. You have created some basic comments.
❑ Open the API Reference for the ArcGIS API for Python by accessing the following help
page:
https://developers.arcgis.com/python/api-reference/
❑ Spend a few moments reading the documentation for the GIS class…
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
In order to access your GIS (i.e. your portal) you will perform some imports to get access to
the API.
❑ In the newly created cell underneath the markup import the API and access the
GIS class by typing the following:
❑ Now create an instance of the GIS class called gis and pass in the ArcGIS Online URL
and your username into the GIS() class constructor:
gis = GIS("https://www.arcgis.com",
"<<add your_ArcGIS_username here>>")
❑ Underneath, add a print() function and provide a message to indicate that you
have successfully connected to your portal.
Notice that when the code in the cell is executing a blue (*) asterisk is displayed next to the
cell.
You will need to enter a password for the username you supplied.
Hopefully a message will be displayed indicating that a connection to the portal was made
successfully.
Once your code has executed successfully and you are logged into the online portal, a new
cell is created.
The final thing you will do is create a quick map of where you live.
❑ In the documentation for the GIS class look for the map() method.
Notice that all of the arguments for the method are optional.
Question 5: What argument might allow you to specify a location for your map?
___________________________________________________________________________
Question 6: If supplying a location what data type should the location be supplied as?
___________________________________________________________________________
Question 7: What does the zoom level indicate? What is the data type that should be
supplied for the zoom level?
___________________________________________________________________________
___________________________________________________________________________
A zoom level of 1 will display the whole of the default basemap as defined by
your portal administrator. The Learning Service default basemap is the
Topographic basemap.
❑ In the newly created cell write a line of code which will create a map widget at the
location of your choice.
❑ To display the map widget containing the map just type the name of the variable
storing the map.
You can change the zoom level and re-run the cell without having to run
any previous cells.
The map below displays the Cambridge area on the Topographic basemap.
Creating a map is always a good test to see whether or not you have made a connection to
your portal.
The code to create the map and display it may look similar to the code displayed below:
In this step you have connected to your online portal through the GIS class and referenced
your portal in a variable called gis. You have also used a method on the GIS class called
map() to create a map using a specified location and zoom level as arguments to the
method.
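A minimal sketch of the map step follows, assuming gis is the connected GIS object created earlier in this exercise. The make_map() helper is hypothetical; location and zoomlevel are the argument names listed for the map() method, and the place name is only an example.

```python
def make_map(gis, place, zoom):
    """Create a map widget centred on place at the given zoom level.

    gis is assumed to be a connected GIS object; the helper simply
    forwards its arguments to the map() method.
    """
    return gis.map(location=place, zoomlevel=zoom)


# In a notebook, with gis created as in this step, you might write:
#   myMap = make_map(gis, "Cambridge, UK", 12)
#   myMap
```

Typing the variable name on the last line of a cell is what causes the notebook to display the widget.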
The documentation for the map() method states that the method “Creates a map widget
centred at the declared location with the specified zoom level…. See MapView for more
information”.
___________________________________________________________________________
___________________________________________________________________________
❑ In the new cell, underneath the displayed map, write some basic markup as you did
for step 1. Include the following text:
❑ Assign to the myMap variable the map created by the map() method on the gis
object. Supply the following arguments to the method:
Zoomlevel: 12
❑ To display the map, underneath the last line of code you wrote type:
myMap
The variable myMap now references an instance of the MapView class in the arcgis.widgets module.
❑ If you haven’t done so already then locate the MapView class in the widgets module
by clicking the widgets hyperlink or by navigating to the class in the documentation.
You will now investigate the basemaps that are available to you for this map view widget.
Question 10: Which property will display the basemaps for the map view?
___________________________________________________________________________
❑ Comment out the line of code which displays the map view widget and underneath
it, create a variable called viewBasemaps.
This will display the basemaps that are available for you to display in the map view.
By looking at the basemaps property on the MapView class you will see that it returns a
Python list of basemap names.
❑ Assign the basemap of your choice to the basemap property on the myMap object
and display the map by writing the following code:
myMap.basemap = 'topo-vector'
myMap
This will toggle your map view into either a 2D view or a 3D view. Full navigation properties
are available in both views and it is a good way to explore your area of interest.
❑ Click the 2D / 3D toggle button to place the map view mode into 3D.
❑ Explore some of the navigation controls for panning and tilting the 3D view.
❑ Once you have explored the 3D view switch back to the 2D mode and save your
notebook.
Your code for this step should look similar to the following:
In this exercise you have navigated from the GIS class in the arcgis.gis module to the
MapView class in the arcgis.widgets module and explored a couple of the properties on the
MapView class, and in doing so you have started to investigate the API Reference.
Exercise end
Answer: If no url or credentials are supplied then anonymous access to ArcGIS Online is
obtained.
Question 5: What argument might allow you to specify a location for your map?
Question 6: If supplying a location what data type should the location be supplied as?
Question 7: What does the zoom level indicate? What is the data type that should be
supplied for the zoom level?
Answer: The zoom level specifies the map scale based upon levels of detail and should be
supplied as a number (integer).
Question 10: Which property will display the basemaps for the map view?
Answer: basemaps
Exercise 3 Solution
In the previous section you saw that the GIS class is a great starting point into the API. It
represents your portal and from this you can then start to access content within your portal,
whatever that content might be.
This section will show you how you can start to get access to your content. This is an important
precursor to any administration of your portal. This is achieved by taking advantage of a
number of resource manager classes (available via the GIS class) which provide access to the
different types of items within your portal.
Learning objectives
After completing this lesson, you will:
GIS.content
GIS.datastore
GIS.groups
GIS.users
Each of these properties provides access to so-called resource manager classes, namely
UserManager, PortalDataStore, GroupManager and ContentManager. These resource
manager classes provide the ability to manage users, create and manage roles, manage
groups and to add, access, modify and delete GIS content.
The resource manager classes are not intended to be instantiated directly. Instead, you
obtain the desired resource manager object through the relevant property on the GIS
object.
Looking at the documentation for each of these properties provides access to the respective
class, for example, the users property provides access to the corresponding resource
manager for GIS users – namely UserManager.
There are similar methods on the returned helper (resource manager) object such as
get(), search() and create() to manage the resource in question.
The UserManager resource manager contains a property called roles. This property
provides access to an instance of the resource manager helper object called RoleManager.
Code snippet
In the code snippet below a new folder is created called “MyNewFolder” in the online
portal. To accomplish this the content property, found on the GIS object, is used to gain
access to an instance of the ContentManager class, from which the create_folder()
method can then be used.
If the folder was successfully created then a JSON object can be displayed and interrogated.
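The folder creation described above can be sketched as follows, assuming gis is a connected GIS object with permission to create content. The create_portal_folder() helper is hypothetical, and later releases of the API may offer a different way to manage folders, so check the API Reference for your version.

```python
def create_portal_folder(gis, folder_name="MyNewFolder"):
    """Create a folder for the signed-in user and return the result.

    gis is assumed to be a connected GIS object. The content property
    returns the ContentManager, whose create_folder() method is used.
    """
    content_manager = gis.content
    return content_manager.create_folder(folder_name)


# With a connected gis object you might write:
#   newFolder = create_portal_folder(gis)
#   print(newFolder)
```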
The search() method found on the ContentManager class has the following syntax:
The search() method on the UserManager is very similar to the syntax for the
ContentManager class, and is as follows:
The GroupManager class’ search() method again has similarities to the other
search() methods:
Each variation of the search() method has a number of arguments, whose descriptions
are available within the API documentation, for example, sort_order, max_items,
outside_org and query.
The query argument is a string which defines which items you wish to search for. It is not
obvious how this string is constructed: a query is made up of terms, optionally combined
with operators.
A term can be a single word, for example London, or a phrase (multiple words
surrounded by double quotes, such as “London City”). Operators include AND, OR and NOT.
When searching for items or groups within your portal the query can be based upon named
fields or by using default fields. When using default fields then the name of the actual field
does not need to be specified – just the search term. Default fields vary depending upon the
type of content you are searching for in your portal.
A description of those fields which are default and those which need to be named is found
in the following help page:
https://developers.arcgis.com/rest/users-groups-and-items/search-reference.htm
The following are some of the default fields which are searched for an item:
title
tags
snippet
description
type
typekeywords
The following are some of the fields which can be searched upon by name for an item:
id
owner
created
modified
title
type
typekeywords
description
tags
snippet
accessinformation
access
Any field can be searched for by typing the name of the field followed by a colon, and then
the search item, so:
query="owner: Ed_Training"
Searching of groups will have slightly different default fields and general search fields.
The type filter for items is case sensitive and must be surrounded by double
quotation marks for exact item type matching.
For example, use type:"Web Map" if you are only interested in finding web
map items; type:"web map" will return both web map and Web Mapping
Application if both exist.
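To make the construction of the query string concrete, here is a small sketch that assembles named-field terms and joins them with an operator. The helper functions field_term() and combine() are hypothetical conveniences and are not part of the API; the resulting string is what would be passed as the query argument.

```python
def field_term(field, term, exact=False):
    """Return 'field:term', wrapping the term in double quotes when exact."""
    if exact:
        term = '"{}"'.format(term)
    return "{}:{}".format(field, term)


def combine(operator, *clauses):
    """Join clauses with a search operator such as AND or OR."""
    return " {} ".format(operator).join(clauses)


query = combine(
    "AND",
    field_term("owner", "Ed_Training"),
    field_term("type", "Web Map", exact=True),
)
print(query)  # owner:Ed_Training AND type:"Web Map"
```

Using single quotes for the Python string is what allows the double quotes required by the type filter to appear inside the query.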
Date ranges can be searched by specifying the lower and upper bounds of the dates. The
dates are specified as milliseconds (more about working with dates later in this section).
Range queries can be inclusive or exclusive of the upper and lower bounds. Inclusive range
queries are denoted by brackets ([]). Exclusive range queries are denoted by braces ({}).
For example, to find all items created between December 1, 2009, and December 9, 2009,
the search expression is as follows:
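As an illustrative sketch of the range syntax, the helper below builds a range query on the created field from two millisecond values. The created_range() function is hypothetical, and the two values are assumptions: they represent midnight UTC on 1 December 2009 and 9 December 2009, which may differ from the times of day used in the reference documentation.

```python
def created_range(lower_ms, upper_ms, inclusive=True):
    """Build a range query on the created field.

    Square brackets include the bounds; braces exclude them.
    """
    opening, closing = ("[", "]") if inclusive else ("{", "}")
    return "created:{}{} TO {}{}".format(opening, lower_ms, upper_ms, closing)


# Assumed bounds: midnight UTC on 1 and 9 December 2009, in milliseconds.
start_ms = 1259625600000
end_ms = 1260316800000

print(created_range(start_ms, end_ms))
print(created_range(start_ms, end_ms, inclusive=False))
```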
The content property accesses an instance of the ContentManager helper class. The
search() method on the helper class is then used to query content in the portal. The
query identifies all Web Maps owned by Ed_Training. A maximum of 100 items (Web Maps)
are returned.
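The search described above might be written along the following lines. This is a sketch under stated assumptions: gis is a connected GIS object, search_web_maps() is a hypothetical helper, and query, item_type and max_items are arguments of the search() method on the ContentManager class.

```python
def search_web_maps(gis, owner, max_items=100):
    """Return up to max_items Web Map items owned by owner.

    gis is assumed to be a connected GIS object; its content property
    gives access to the ContentManager and its search() method.
    """
    query = "owner:{}".format(owner)
    return gis.content.search(query=query, item_type="Web Map", max_items=max_items)


# With a connected gis object you might write:
#   webMaps = search_web_maps(gis, "Ed_Training")
#   for item in webMaps:
#       print(item.title)
```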
Valid values for the item_type argument can be found in the Type column for the various
tables which can be found on the following web page:
https://developers.arcgis.com/rest/users-groups-and-items/items-and-item-types.htm
Each item can be accessed based upon its index position within the list or processed using a
for loop.
An item object is a Python dictionary object which has a number of useful keys.
Wildcard searches can be carried out by using the “*” symbol; for example:
Importing the IPython module will provide the ability to display the rich representation view
of your portal items:
In Python, dates are based upon “the beginning of time” (AKA the epoch), which is taken as
being 12:00 am on 1st January 1970 (UTC). Times and dates are measured in seconds from this
so-called epoch. A number of modules support working with dates and times, such as the
time and calendar modules and the popular datetime module.
The time module works with Unix timestamps, which are expressed as floating point numbers;
this can make calculations cumbersome. The datetime module supports many of the
functions which the time module supports and also provides a more object-oriented
approach.
If you wish to query items based upon dates then you will need to import the datetime
module; often the module name is simplified by using the as keyword so:
import datetime as dt
If you wish to create a date as a basis for searching then the datetime module provides
the datetime class, which has three mandatory arguments (year, month and day) and a
number of optional arguments: hour, minute, second and microsecond, whose default values
are 0, and tzinfo, whose default value is None; so:
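A minimal sketch of creating dates in this way is shown below. The dates and variable names are purely illustrative:

```python
import datetime as dt

# Year, month and day are required; the time defaults to midnight
start_date = dt.datetime(2020, 1, 1)

# Optional arguments can be supplied to set a time of day (here 09:30)
start_time = dt.datetime(2020, 1, 1, 9, 30)

print(start_date)
print(start_time)
```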
The date you have created needs to be expressed as the number of seconds that have
elapsed since the epoch, and this is achieved by working with the timestamp() method.
But, your portal stores times and dates not as seconds but as milliseconds. This means that
if you want to perform a search based upon a date then the date object will need to be
converted to seconds via the timestamp() method, and then multiplied by 1000 to
convert into milliseconds:
ts = date.timestamp()    # seconds since the epoch
ms = int(ts * 1000)      # milliseconds, as stored by the portal
These representations of portal dates can be used to return items either by using list
comprehension or by using a range search based upon the created field:
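Both approaches can be sketched as follows. The helper names (to_portal_ms, created_between and created_range_query) are illustrative and are not part of the API; items are assumed to behave like dictionaries with a created key.

```python
import datetime as dt

def to_portal_ms(date):
    """Convert a datetime to the millisecond value used by the portal."""
    return int(date.timestamp() * 1000)

def created_between(items, start, end):
    """List comprehension: keep the items created between the two dates."""
    start_ms, end_ms = to_portal_ms(start), to_portal_ms(end)
    return [item for item in items if start_ms <= item["created"] <= end_ms]

def created_range_query(start, end):
    """Range search: build an inclusive query on the created field."""
    return "created: [{} TO {}]".format(to_portal_ms(start), to_portal_ms(end))

# Example use, assuming gis is a connected GIS object:
# start, end = dt.datetime(2020, 1, 1), dt.datetime(2020, 6, 30)
# items = gis.content.search(query=created_range_query(start, end), max_items=100)
```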
Notice that the format() method is used to place the dates used in the range into the
query string.
https://developers.arcgis.com/python/api-reference/
The reference contains the documentation to help you write your code.
As you complete this activity you should be working with the GIS(), ContentManager() and
Item() classes. Use the documentation to navigate to the relevant classes and take
advantage of the hyperlinks that the help provides.
Hopefully the Jupyter Notebook is still open and available from the last exercise.
❑ If Jupyter Notebook is open then click on the Dashboard tab and navigate to the
\PYAPI\Searching folder to display its contents.
Inside is a single text file called ThisFolderIsEmpty.txt. It is in this folder that you will create
your notebook.
❑ Create a new notebook by clicking the New button > Python 3 (ipykernel).
This will open up your new notebook in a new tab on your browser.
You are now ready to write some code and connect to your portal.
Before you do that you will first of all add some markdown notes to your Notebook to
indicate what you are going to do.
This will display a markdown cell with comments in and a new cell is created underneath it.
You will now make a connection to your portal, just as you did in the previous exercise.
❑ Write the following code to access the gis module in the API and use the GIS class:
from arcgis.gis import GIS
❑ Now create an instance of the GIS class called gis and pass in the ArcGIS Online url
and your username into the GIS() class constructor:
gis = GIS("https://www.arcgis.com",
          "<<your_ArcGIS_username>>")
❑ Underneath, add a print() function and provide a message to indicate that you
have successfully connected to your portal.
You will need to enter a password for the username you supplied.
Hopefully a message will be displayed indicating that a connection to the portal was made
successfully.
A new cell has been created underneath the successfully executed code cell.
Your portal is made up of different types of content, such as web maps, web scenes, feature
layers, web mapping applications, etc. The API provides excellent searching facilities to
isolate the type of content you are interested in. In this step you will find information about
feature layers that are within your portal.
❑ In the newly created cell add some mark-up so that the final result looks like the
following when the cell is run as markdown:
The GIS class contains a number of properties which provide access to so-called resource
manager objects, such as UserManager, GroupManager and ContentManager. Think of
these objects as helper objects; they are there to provide functionality for managing their
corresponding objects, such as adding or searching for content.
❑ Within the new cell create a new variable called contManager and assign
gis.content to it.
The property content is found on the GIS class and provides access to the corresponding
ContentManager object.
You will use the search() method on the ContentManager class to find all items of type
“Feature Layer” in your portal.
❑ Locate the documentation for the search() method on the ContentManager class.
On the GIS class locate the content property and click on the hyperlink
associated with the ContentManager.
___________________________________________________________________________
The query argument will, by default, search in a number of content metadata fields, such
as title, tags, and type. There are others. These search parameters should be supplied as a
string.
If an empty query string is supplied, then there is no restriction on the types of content
which will be returned.
The item_type argument defines what type of content the search() method will
return. The values for this argument can be found in the associated documentation. You will
search for ‘Feature Layer’ item types.
Question 2: What is the default maximum number of items returned by the search()
method?
___________________________________________________________________________
Your portal contains more than 10 feature layers. You will set the max_items argument to
a value of 500.
❑ Write the following code to return all those items in the portal which are of type
‘Feature Layer’.
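A minimal sketch of that search follows. It is wrapped in a function (the name find_feature_layers is illustrative) so that the same call can be reused with a different query; in your notebook you could equally write the search() call on a single line.

```python
def find_feature_layers(content_manager, query=""):
    """Return up to 500 Feature Layer items that match the query."""
    # An empty query string places no restriction on the content searched
    return content_manager.search(
        query=query,
        item_type="Feature Layer",
        max_items=500,
    )

# In the notebook, assuming contManager = gis.content:
# featLayers = find_feature_layers(contManager)
```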
Question 3: What data type is the featLayers variable? (HINT: Use the type() builtin
function)
___________________________________________________________________________
❑ Use the print() function to count the number of feature layers that are currently
within the portal. HINT: Use the len() function.
___________________________________________________________________________
The number of feature layers is based upon all feature layers in the portal.
Let’s try and amend the query argument to find the number of feature layers owned by
emorris@esriuk.com_EsriUK_Training.
owner: emorris@esriuk.com_EsriUK_Training
___________________________________________________________________________
❑ Create a for loop to print out the basic metadata information for each feature layer
item in the list.
Remember that the items in the list are only those which the owner has shared to the
organisation or to all users who have access to the ArcGIS Online portal.
❑ In the newly created cell underneath your existing code, add some mark-up so that
the final result looks like the following when the cell is run as markdown:
❑ Write the following line of code in the new cell underneath the Step 3 comments:
from IPython.display import display
❑ Recreate the for loop you wrote in step 2 underneath the import statement you
just wrote.
Remember to pass in the variable which currently holds the item you are processing in the
list within the brackets of the display() function.
The finished code for this step should look very similar to the following:
In this step you have learned how to perform a search by applying a number of arguments
which accesses feature layers owned by a particular portal member. You have also displayed
a subset of those feature layers using the rich representation functionality available within
Jupyter Notebook.
https://www.arcgis.com/
❑ Log in to the portal using the credentials you used to connect to the portal while
using the ArcGIS API for Python.
❑ Using your ArcGIS portal skills, search for the “York_Historic_Buildings” layer to
display its metadata.
___________________________________________________________________________
Question 7: What is the unique item ID? (Hint: Look at the URL)
___________________________________________________________________________
__________________________________________________________________________
❑ In the newly created cell add some mark-up to indicate that this is an optional step
to obtain some item information stored within the portal.
❑ Create a variable called yorksList and write some code to perform a search to
find the York_Historic_Buildings item. You should construct a line of code which
performs the following:
• For the query parameter search for the “York_Historic_Buildings” item using
title as the search field.
Question 9: What data type is the yorksList variable storing? (Hint: use the type()
built-in function.)
❑ Create a variable called yorks and use the zero-index notation to obtain the item
from the list and store it in the yorks variable.
The variable yorks stores an Item object. By looking in the documentation for the Item
object you can see that it is based upon a Python dictionary. The Item class documentation
is found in the arcgis.gis module.
❑ Write the following line of code to display the keys() that are found on the yorks
dictionary (item) object.
print(yorks.keys())
You can use these keys on the yorks dictionary object to obtain the associated values of the
feature layer.
❑ Write some code to obtain the values which are associated with the id, owner and
created keys.
Your code, and output, should be very similar to the code displayed below:
One striking difference is the value returned for the created date. This number is the
date stored as milliseconds, i.e. the number of milliseconds that have elapsed since
12:00 am on 1st January 1970 (UTC), the beginning of the Unix epoch.
If you would like a challenge (and you have time) then perhaps you might like to create a
human-readable date based upon the milliseconds value referenced by the ‘created’ key.
• import datetime as dt
• Obtain the value associated with the ‘created’ key and divide by 1000.
• Pass the result of the above line into the fromtimestamp() method on the
dt.datetime class to create a readable date.
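One possible way of carrying out these steps is sketched below. The function name created_to_datetime is illustrative; note that fromtimestamp() returns the date in the local time zone of the machine running the notebook.

```python
import datetime as dt

def created_to_datetime(created_ms):
    """Convert a portal 'created' value (milliseconds) to a datetime."""
    # fromtimestamp() expects seconds, so divide the milliseconds by 1000
    return dt.datetime.fromtimestamp(created_ms / 1000)

# Example use, assuming yorks is the item obtained earlier in this step:
# print(created_to_datetime(yorks["created"]))
```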
Exercise end
Question 2: What is the default maximum number of items returned by the search()
method?
Question 7: What is the unique item ID? (Hint: Look at the URL)
Answer: b15076e69a514364a94d809688d4c081
Answer: emorris@esriuk.com_Esriuk_Training
Question 9: What data type is the yorksList variable storing? Hint: use the type()
built-in function.
Exercise 4 Solution
The ability to administer your portal is one of the most important things you will do with the
ArcGIS API for Python. Administration covers all sorts of things, from creating new users and
groups, assigning users to groups, managing content (for example identifying which user’s
content accounts for the most space), to cloning your existing portal.
This section will investigate some of the classes and their members that allow administrative
tasks to be performed with just a few lines of code. You will also investigate some of the
samples that are available and how to visualise your results.
Learning objectives
After completing this lesson, you will have:
• Considered how online users and groups are created and managed.
• Been introduced to how your content is managed within your portal.
• Gained an overview of the gis.admin and gis.server submodules which support Portal for
ArcGIS.
• Investigated how Portal for ArcGIS enterprise servers are managed.
• Carried out basic service administration within your enterprise.
• Identified administration samples.
• Investigated visualisation through Matplotlib.
Administration capabilities
The ArcGIS API for Python can programmatically manage every aspect of your portal by
working with the gis module and two of its sub-modules: gis.admin and gis.server.
The gis module provides access to the base classes which allow for the access and creation
of corresponding items, for example the UserManager base class contains the create()
method which will create a new user.
Items in your portal can be managed by properties and methods on the relevant class, for
example the User class represents a user in your portal and can be administered by the
properties and methods on the associated class.
The gis.admin submodule provides the ability to perform many of the supporting
administration tasks not covered by the traditional admin-item classes or the gis module.
These tasks can be summarised below:
The submodule contains many classes which perform these administration tasks; many of
which relate to the ArcGIS Enterprise portal, for example:
• PortalAdminManager class, which is the root resource class for administering your
on-premise portal.
• Logs class to investigate log files created on a machine hosting Portal for ArcGIS.
• Federation class, which provides information about the ArcGIS Server registered
with your Portal.
• Access services.
These accounts are associated with a user, also known as a member of the portal, and they
are managed by portal administrators. The API provides classes which allow you to manage
members of your portal.
The GIS class contains the users property which provides access to the UserManager
helper class; remember that this class is not created directly. This class contains methods
which allow for the management of users contained within your portal, such as create(),
get(), search(), invite(), to name but a few.
The user_groups() method returns the groups a user belongs to; more about that
in a moment.
A useful property is me. This will get information about the user who has signed into the
portal and is currently running the notebook. It returns a User object.
The get() method allows you to access information about a specific user through a User
object:
If you wish to identify members which belong to your portal, or you want to get a subset of
members, then use the search() method. Note that there are a number of arguments which
the search() method accepts. The result is a Python list of User objects.
Code snippet: Use the UserManager.search() method to find those users beginning
with ‘ed’.
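A sketch of such a search is shown below. The function name is illustrative, gis is assumed to be a connected GIS object, and the wildcard query is assumed to be matched against the default user search fields:

```python
def find_users_starting_with(gis, prefix, max_users=100):
    """Return the User objects that match the prefix followed by a wildcard."""
    # The * wildcard matches any characters that follow the prefix
    return gis.users.search(query="{}*".format(prefix), max_users=max_users)

# Example use, assuming gis is a connected GIS object:
# for user in find_users_starting_with(gis, "ed"):
#     print(user.username)
```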
Many of the methods and the me property provide access to a User object.
The User object is a dictionary object; it has many keys such as firstName, lastLogin,
availableCredits. A full list of keys is obtained by inspecting the User class definition
in the API reference.
A user can be deleted; the ability of the user to log in to the portal can be enabled or
disabled; access to My Esri or GeoNet can be enabled through the esri_access property;
thumbnail information can be obtained, as well as a list of folders the user has created. The
groups the user belongs to and the items owned by the user can also be obtained. More
about that in a moment…
Code Snippet: Get a User and identify groups the user belongs to
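The snippet might look something like the following sketch. The function name group_titles_for_user is illustrative and gis is assumed to be a connected GIS object:

```python
def group_titles_for_user(gis, username):
    """Return the titles of the groups that a portal member belongs to."""
    user = gis.users.get(username)   # a User object, or None if not found
    if user is None:
        return []
    # The groups property on the User object lists the user's groups
    return [group.title for group in user.groups]

# Example use, assuming gis is a connected GIS object:
# print(group_titles_for_user(gis, "Ed_Training"))
```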
https://doc.arcgis.com/en/arcgis-online/share-maps/groups.htm
In the API, an organisation’s groups are accessed for the logged-in user through the
groups property on the GIS class. The groups property provides access to the
GroupManager helper class.
The search() method on the GroupManager class is used to gain access to groups within
an organisation. The query parameter can be used to isolate groups of interest. By default
the query searches a number of fields in the group metadata, such as title, description
and owner.
Returns: a Python list of Group objects.
The GroupManager provides access to a Group object, for example, via the search()
method or the get() method.
The Group class provides users with the ability to add users to the group, obtain a list of
items shared with the group, delete the group, obtain membership information about the
group, invite users into a group and remove users from the group, amongst other things.
Code Snippet: Create a new group with required title and tags, add a couple of users and
then update with an image.
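A hedged sketch of this workflow is given below. The function name and its parameters are illustrative; gis is assumed to be a connected GIS object whose account has the privileges needed to create groups and add users to them.

```python
def create_group_with_members(gis, title, tags, usernames, thumbnail_path):
    """Create a group, add users to it and then set its thumbnail image."""
    # create() on the GroupManager returns the new Group object
    group = gis.groups.create(title=title, tags=tags)
    # add_users() accepts a list of user names
    group.add_users(usernames)
    # update() changes properties of an existing group, here its image
    group.update(thumbnail=thumbnail_path)
    return group

# Example use (all of the values shown are hypothetical):
# new_group = create_group_with_members(
#     gis, "Training Group", "training, python", ["user_a", "user_b"], "group.png")
```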
So how do we get access to this content within your portal? Remember that the content
property on the GIS class provides access to the ContentManager helper class. This is the
gateway to your portal’s content.
The ContentManager class contains methods for folder management, obtaining a specific
item via the get() method, searching for items based upon supplied arguments via the
search() method, and methods for sharing or un-sharing items and deleting items.
Individual items contained within a portal can be obtained by using the get() method or
through the search() method – remember that the search method returns a Python list
of item objects.
An item is a “unit of content” in your portal. Each item has a unique item identifier and a
URL that allows access to the item if access permissions have been granted on that item.
Methods exist for managing the item such as update(), move() and delete(). The
dry_run argument of the delete() method reports whether the item can be deleted without
actually deleting it, while the delete_items() method on the ContentManager class allows
for the deletion of a collection of items.
Additional methods also exist for publishing data based upon an item resource via the
publish() method, copying items and changing the access permissions of the item via
the share() method. Methods also exist for downloading and exporting the item – more
about that in the following section.
It is important to realise that to use the gis.server module the account used to connect to
the GIS must have administrator privileges and a connection must be made to an ArcGIS
Enterprise portal.
To use these classes you must first work with the admin property on the GIS object. This is
set at runtime and is based upon which portal the administrator connects to.
For an ArcGIS Enterprise portal the admin property will return an instance of the
PortalAdminManager class. This class is the root resource for administering your on-
premise portal, from which all resources and portal items can be obtained.
The ServerManager class contains a number of members which help perform validation of
the servers in the enterprise portal, to obtain a list of all servers in the on-premise portal or
to obtain a specific server machine.
Once a particular server object has been obtained it is possible to manage datastores,
access content, query the log records and, ultimately, start, stop, publish and delete
services.
import arcgis.gis.admin
# Validate your federated servers with your portal; True all OK,
# False if there is an issue with one or more federated servers
In the code snippet below a list of all the services found on a particular server machine is
obtained. A for loop can then process each service in the list to perform an administrative
action. The properties property on the Service object returns a dictionary whose keys
are exposed as properties, for example serviceName.
for service in services:  # services is assumed to hold the list obtained from the server
    if service.properties.serviceName == "SampleWorldCities":
        service.stop()
Once a particular service has been found then it can be stopped, started or its status can be
obtained (i.e. has it started? Is it stopped?)
Administration samples
The ArcGIS API for Python has many samples that can be used directly or as a basis on which
to replicate your workflows.
https://developers.arcgis.com/python/samples/
These have already been downloaded for you as part of your course data.
Additional samples are available if you are accessing ArcGIS Notebooks through ArcGIS
Enterprise or ArcGIS Online.
https://enterprise.arcgis.com/en/notebook/latest/use/windows/about-sample-notebooks.htm
Matplotlib is a widely used 2D plotting library for Python. It provides graphing and
charting capabilities when a map is not required to present your analysis. It can be used
within Jupyter Notebook because it integrates with IPython, which underpins the Notebook’s
rich display capabilities.
The matplotlib website contains an example for each different chart type, full
documentation and a number of tutorials.
https://matplotlib.org/index.html
A wide variety of chart and graph types are supported, for example:
• Polar charts
• Scatter plots
• Pie charts
• Boxplots
Properties for each chart type can be controlled, such as title, chart symbology, legend
information.
Each chart will have unique properties which can be further manipulated.
Many of the visualisation techniques used by matplotlib rely on using sequence objects such
as tuples and lists as well as Python dictionaries.
The pie chart, on the previous page, was created with the following code:
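%matplotlib inline
import matplotlib.pyplot as plt
labels = ['Feature Layers', 'Web Maps', 'Web Scenes', 'Web Apps']  # illustrative labels
sizes = [45, 30, 10, 15]  # illustrative counts, not taken from a real portal
explode = (0, 0.1, 0, 0) # only "explode" the 2nd slice (i.e. 'Web Maps')
fig1, ax1 = plt.subplots()
ax1.pie(sizes, explode=explode, labels=labels)
plt.show()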
There are many tasks which fall under the category of “administration”, for example credit
budgeting, creating users and assigning relevant privileges to those users, re-assigning
content to a new user if a user leaves the organisation, re-categorising content…. You will
have your own administrative requirements and thoughts.
This exercise will concentrate on identifying those users who have not used the portal lately
and, ultimately, those users who are inactive (i.e. those users who have never logged into
the portal). This will help you to reassign those dormant accounts to more deserving users.
• Access the portal and add some useful Python site packages which your notebook
will use.
• Create a function to calculate the number of days which have elapsed since a user
last logged into the portal.
• An optional step to create a bar chart which will display on average the number of
days it has taken for accounts associated with each trainer to login to the portal.
❑ Open up the UsageStats notebook and spend a moment reading the comments.
You will see some markdown comments specifying what you are going to do in the exercise
and empty code cells into which you will write the code for each step.
There is also a pre-written function which relates to the optional step you might want to
have a go at later on.
❑ Locate the code cell under the comment Step 1: Log into your portal.
❑ In the cell write the following code to import the ArcGIS API to access the GIS class:
You are also going to work with the datetime module as you are going to calculate how
many days have elapsed since a particular point in time.
The above line of code imports the datetime module and references it through an alias
called dt.
Think of this variable as a global variable in that you can use throughout your script,
including any functions which you might create.
❑ Write a line of code which logs you into your GIS using your credentials and store the
resulting portal reference in the gis variable.
❑ Finally write a line of code that will confirm that you have logged into your portal
successfully.
Once your code has successfully run, and you have logged into the portal, you are now
ready to carry out the main body of your task.
You are going to obtain all users that are within the portal and then call the function to
calculate how many days have elapsed since the last login for each user. Placing that code in
a function makes it fairly simple to manage the code for each user and will keep the code
that processes each user compact.
❑ Locate the cell underneath the markdown comment “Step 2: Create a function to
calculate…….”.
The parameter user will be used to accept the current portal user whose number of days
since the last login occurred needs to be calculated within the function.
In the previous section we discussed how dates are stored as milliseconds in the portal, and
that the User object has a number of properties, such as availableCredits,
fullName, etc.
❑ In the API Reference locate the arcgis.gis module, and find the User class.
Question 1: What key on the User object will return the date in milliseconds of the last login
for that user?
___________________________________________________________________________
❑ Assign to the variable the result of code which obtains the last login for the user,
divides that by 1000 and casts the resulting number to an integer using the int()
built-in function.
The datetime module has a class called datetime which allows you to create dates and
manipulate timestamps. Remember that a timestamp represents the number of seconds
since the Unix epoch, i.e. 12:00 am on January 1, 1970 (UTC). The fromtimestamp() method
converts a timestamp, such as the portal value once it has been divided by 1000, to a
usable date.
last_login = dt.datetime.fromtimestamp(ts_of_last_login)
❑ Use the now() method on the dt.datetime class to identify the current date and
time and store that in the now variable.
Question 2: What object does the now() method on the datetime class return?
___________________________________________________________________________
Both of the objects contained in the now and last_login variables are datetime
objects. To work out the difference between the two you can take last_login away
from now. Use the days property on the result to display the difference between the two as
the number of days.
❑ Create a variable called days and write some code based upon the above
description which will calculate the number of elapsed days between the user’s last
login and the current date. Remember to use the days property.
❑ Finally pass the contents of the variable days back to the calling code by the
return keyword.
Your code to calculate the number of days since the user last logged in should look like the
following:
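A sketch of such a function is shown below. It assumes that the user object exposes the lastLogin key mentioned in the previous section and that the value is held in milliseconds; your own variable names may differ.

```python
import datetime as dt

def num_days_since_last_login(user):
    """Return the number of whole days since the user last logged in."""
    # The portal stores the last login in milliseconds; convert to seconds
    ts_of_last_login = int(user["lastLogin"] / 1000)
    last_login = dt.datetime.fromtimestamp(ts_of_last_login)
    now = dt.datetime.now()
    # Subtracting two datetime objects gives a timedelta with a days property
    days = (now - last_login).days
    return days
```

Because the function only reads the lastLogin key, it can be tried out with an ordinary dictionary before it is used with real portal users.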
The function should be placed before the code which calls the function otherwise the
function will not be recognised when it is called.
• Use the search() method on the UserManager resource manager helper class to
find all the users in the portal and override the default max_users argument.
To understand how to obtain all the users in the portal you will investigate the
documentation.
Question 3: Which property on the GIS class will provide you with access to the resource
manager for GIS users?
___________________________________________________________________________
❑ Click the UserManager hyperlink in the description for the users property.
Notice that you are taken to the documentation for the UserManager object.
___________________________________________________________________________
___________________________________________________________________________
This default value of 100 is not enough as there are more than 100 users in the Esri UK
Training Portal.
___________________________________________________________________________
❑ Based upon your investigations write a line of code underneath which will provide
access to all portal users (set a value of 1000 for the maximum number of users):
gis.users._________(__________)
❑ Using the code you have written as a guide on the line above, create a loop which
will obtain a list of the portal users and store the current user being processed in an
inline variable called user.
❑ Within the loop create a variable called num_days and call the
num_days_since_last_login() function, passing in the current user.
❑ Still within the loop, create a variable called user_name and use the
user[‘username’] notation to obtain the current user name of the user being
processed.
The user_name variable stores a string which will be useful in reporting the number of
days since the last portal login for that user.
The code you have written should look like the following:
Do you remember the last_logged_in dictionary object you created at the beginning
of step 1? You will now populate it so that each key is the unique user name and each
value is the number of days since the last login (i.e. the value returned by the function).
last_logged_in[user_name] = num_days
The population of the dictionary must be inside the loop, otherwise the
number of days since each user’s last login will not be recorded.
Finally, you will display the contents of the dictionary to identify the users and the number
of days that have elapsed since their last login.
❑ Outside of the loop use the pprint() function from the pprint module to display
the contents of the last_logged_in dictionary.
Your code for this step should look like the following:
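The sketch below shows one way in which the loop, the dictionary and the pprint() call fit together. To keep it runnable without a portal connection the loop is wrapped in a function (build_last_logged_in, an illustrative name) and is demonstrated with hypothetical stand-in users; in your notebook the loop would normally sit directly in the cell.

```python
import pprint

def build_last_logged_in(users, days_function):
    """Map each user name to the number of days since that user's last login."""
    last_logged_in = {}
    for user in users:
        num_days = days_function(user)
        user_name = user["username"]
        # Populate the dictionary inside the loop, once per user
        last_logged_in[user_name] = num_days
    return last_logged_in

# Hypothetical stand-ins so that the sketch runs without a portal connection
sample_users = [{"username": "user_a"}, {"username": "user_b"}]
last_logged_in = build_last_logged_in(sample_users, lambda user: 0)
pprint.pprint(last_logged_in)

# In the notebook the same steps would use the real users and function:
# users = gis.users.search(max_users=1000)
# last_logged_in = build_last_logged_in(users, num_days_since_last_login)
```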
The pprint() function displays the key : value pairs, which are the user name and the
number of days elapsed since that user last logged into the portal:
You can scroll through this to pick out some useful information, but it is fairly
unstructured. In the next step you will process the dictionary to identify those users who
have logged in today.
If you spend a moment studying the last_logged_in dictionary you will see that the
keys are the users, displayed pretty much in alphabetical order, and the values are
the number of days since each user last logged in. As you might expect, the number of days
differs from user to user; there is no pattern in how often the users log in to the portal.
To help us answer the question posed in the title of the step we need to group all those keys
(users) which have a value of 0. The collections module will help us answer this question.
For more information about the collections module please have a look at the
following page:
https://stackabuse.com/introduction-to-pythons-collections-module/
The collections module has been developed to provide additional support to existing types
of Python collections such as lists, dictionaries and tuples. You are going to use the
defaultdict class to help extract all the keys which have a value of 0.
❑ Locate the empty cell underneath the markdown cell “Step 4: Identify those users
who have logged into…”.
key_list = collections.defaultdict(list)
The defaultdict class creates a Python dictionary first and foremost, but its argument
specifies how the value for a missing key is created. The above line of code specifies that
each new key will start with an empty Python list as its value.
❑ Write the following lines of code to access the keys and values (items) of the
last_logged_in dictionary and then populate the key_list dictionary with
the number of days since the last login as the dictionary’s key, and the users which
share that number of days as the dictionary’s value:
key_list[days].append(user)
❑ Outside the loop, use the pprint() function from the pprint module to print out
the populated key_list dictionary.
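The grouping can be sketched as follows. The sample last_logged_in dictionary is hypothetical and simply stands in for the one you built in the previous step:

```python
import collections
import pprint

# Hypothetical sample of the last_logged_in dictionary built in step 3
last_logged_in = {"user_a": 0, "user_b": 12, "user_c": 0, "user_d": 45}

# Each new key starts with an empty list, so append() always works
key_list = collections.defaultdict(list)
for user, days in last_logged_in.items():
    key_list[days].append(user)

pprint.pprint(dict(key_list))

# Users who logged in today are those grouped under the key 0
logged_in_today = key_list[0]
print("{} users logged in today: {}".format(
    len(logged_in_today), ", ".join(logged_in_today)))
```

Grouping by the number of days means that all of the users who share a value can be read from a single key.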
As you can see the dictionary is populated with key and value pairs as you might expect.
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
❑ Construct a for loop which will create the following formatted output.
The number of users and the names of the users will more than likely
be different.
Hopefully you have identified those members of the Esri UK Training organisation who have
logged in today. Is your account one of them? It should be.
Before you start to write some code to create the bar chart you will quickly inspect a
function which collates all of the accounts governed by each Esri UK Trainer, counts the
number of days since the last login for those accounts and creates a summed total number
of days for each trainer.
❑ Spend a few moments understanding what it does and answer the following
questions:
Question 10: What is passed into the function?
___________________________________________________________________________
Question 11: What Python data type does the function return?
___________________________________________________________________________
Question 12: What does the key in the returned dictionary represent?
___________________________________________________________________________
Question 13: What does the value in the returned dictionary represent?
___________________________________________________________________________
The function returns a Python dictionary. The keys represent the user names managed by
the Esri UK Trainers and the values are the average number of days it takes for the accounts
managed by each trainer to log in to the online portal. This gives an indication as to which
trainers’ accounts are the most active and which trainers’ accounts are the least active.
%matplotlib inline
The matplotlib.pyplot module is part of the Matplotlib plotting library and contains basic functions for creating
charts, such as pie charts, bar charts (including horizontal bar charts), box plots and scatter
plots.
The %matplotlib inline line underneath the import statement allows output generated by any plotting
commands associated with matplotlib to be displayed in the Jupyter Notebook underneath
the cell which created the plot.
A ‘figure’ keeps track of how the chart is displayed and presented on the plotting canvas. It
includes things such as axes, legend, title, grid and the types of plot which display the data
on the chart.
❑ On the fig object call the add_axes() method and pass in the following list [0,
0, 1, 1] and store in a variable called ax.
The list represents the left, bottom, width and height of the axes, expressed as fractions of the
figure size. It will place axes that are exactly as large as the canvas itself.
The X-axis will be the trainer names and the Y-axis will be the average counts per trainer.
❑ Write some Python code to extract the names (keys) of the trainers and average
time since last login (values) from the user_counts dictionary. Store the results in
variables called trainers and ave_time_since, respectively.
Look at the answers for questions 12 and 13 in order to obtain the information.
❑ Finally on the plt alias call the show() method to display the finished plot in the
Notebook, when you run the cell.
❑ Run the cell containing the code you have just written.
A bar chart will be created and displayed which should look similar to the one displayed
below:
The reason a particular trainer’s accounts do not access ArcGIS Online might be because
that particular trainer does not teach courses with ArcGIS Online…….. possibly.
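The charting steps above can be drawn together as the following sketch. The user_counts values are invented, and the Agg backend is selected only so that the code also runs outside a notebook; in Jupyter the %matplotlib inline line plays that role and plt.show() displays the chart under the cell.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend (assumption: running outside Jupyter)
import matplotlib.pyplot as plt

# Invented trainer names and averages standing in for the function's return value.
user_counts = {"trainer_a": 12.5, "trainer_b": 3.0, "trainer_c": 40.25}

# Keys are the trainers, values are the average days since last login.
trainers = list(user_counts.keys())
ave_time_since = list(user_counts.values())

fig = plt.figure()
# [left, bottom, width, height] as fractions of the figure: axes fill the canvas.
ax = fig.add_axes([0, 0, 1, 1])
bars = ax.bar(trainers, ave_time_since)
ax.set_xlabel("Trainer")
ax.set_ylabel("Average days since last login")

plt.show()  # with the Agg backend this only emits a warning; in a notebook it draws the chart
```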
Matplotlib is a powerful Python library for creating charts and plots for inclusion within
reports. It is definitely worth spending some time to become familiar with this package (it is
a third-party package rather than part of Python’s ‘standard library’) if you are going to perform administrative / reporting
tasks, as it is easier to gain an understanding of performance within your portal in picture
form when tables of figures may hide underlying patterns.
In step 4 you created a variable called key_list. This contains the users (as values of the
key_list dictionary) and the days since that user last logged into the Esri UK Training portal
(as keys of the dictionary).
Finding those users who have never logged into the portal is actually quite straightforward
as they will be associated with the largest key in the key_list dictionary.
Write code which carries out the following (and create any necessary variables to store any
objects you might want to create):
2: Use the max() built-in function to find the largest number in the returned keys. This is
the number of days since the user(s) last logged in, as you know.
3: Obtain the value (list of users) from the key_list based upon the largest key.
Question 14: Which users have never logged into the Esri UK Training portal?
___________________________________________________________________________
Question 15: What is the rough date that the key represents?
___________________________________________________________________________
To answer the above question, you will need to take the value returned by the max()
function and divide it by 365. That will give you the number of years.
What do you think is significant about the years value you have worked out?
Well…. It is roughly the length of time which has elapsed since 1st Jan 1970 (not taking into
account leap years…), which is the beginning of Unix time.
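A minimal sketch of this calculation is shown below. The key_list contents, the user names and the reference date are all invented for illustration; only the use of max() and the division by 365 come from the steps above.

```python
from datetime import datetime, timedelta

# Invented dictionary: days since last login -> users sharing that number of days.
key_list = {0: ["user_a"], 10: ["user_b"], 19753: ["user_x", "user_y"]}

# The largest key belongs to the users who have never logged in.
largest_key = max(key_list.keys())
never_logged_in = key_list[largest_key]

# Rough number of years, ignoring leap years as the text suggests.
years = largest_key / 365

# Exact date the key points back to, counted from an assumed reference date.
reference_date = datetime(2024, 1, 31)
approx_date = reference_date - timedelta(days=largest_key)

print(never_logged_in, round(years, 1), approx_date.date())
```

Subtracting a timedelta accounts for leap years, so the date lands exactly on the start of Unix time, whereas the division by 365 only gives a rough figure.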
Exercise End
Answer: lastLogin is the name of the key (property) which will return the date in
milliseconds of the user’s last login date.
Question 2: What object does the now() method on the datetime class return?
Question 3: Which property on the GIS class will provide you with access to the resource
manager for GIS users?
Answer: The users property will provide access to the UserManager helper class.
Answer: The search() method will allow you to search for users in your portal.
Answer: The key represents the number of days since the last login.
Answer: The value represents the portal users associated with the key.
Question 11: What Python data type does the function return?
Question 12: What does the key in the returned dictionary represent?
Answer: The key in the dictionary represents the Esri UK Trainer who manages the
associated accounts
Question 13: What does the value in the returned dictionary represent?
Answer: The values are the average number of days since the accounts managed by the
Esri UK Trainer last logged in to the portal.
Question 14: Which users have never logged into the Esri UK Training portal?
Answer: You might find that at least 3 users have never logged into the portal
(cw_esriuk_st13, 14, and 15) and they are owned by Colin.
Question 15: What is the rough date that the key represents?
Answer: The date relates to approximately 50 years ago! This refers to 1st January
1970, the beginning of Unix time!
Exercise 5 solution
So how is this content uploaded into the portal? You can use tools within ArcGIS Pro,
tools within your portal, or a mixture of the ArcPy site package and the ArcGIS API for
Python.
Learning objectives
After completing this lesson, you will:
Layers are based on a variety of data sources. Some sources are native to the portal, such as
hosted services, while others are file based, such as CSV files, or open standards such as
KML or OGC web sources.
In order to add data to your portal you need an account which allows you to create content.
Once the data has been uploaded you can then publish the item as a hosted layer if the
account you are using has publisher-based privileges and the item allows it.
3: Using the ArcGIS API for Python and/or the ArcPy site package.
File geodatabases and shapefiles must be added to your portal as a zip file. They can then be
published as a hosted feature layer:
The “zipped up” shapefile is added as a Shapefile item and the hosted feature layer is then
published from it.
The New Item button also allows other items to be added, for example KML, documents
and OGC-based web services.
If you have data (CSV files, GeoJSON, Excel spreadsheets, zipped up content) stored in a
cloud drive (Dropbox / Microsoft OneDrive) then these repositories can be used as sources for
content items.
The Share tab provides the ability to share a number of dataset types in different ways:
Maps or selected layers can be shared as web layers, which can then be used for
visualisation, analysis and editing. Scenes and standalone tables can also be shared.
Packages (map, geoprocessing, deep learning, etc) can also be shared to your portal as a
compressed package file and they are a great way of sharing data, preparing data for apps
and for disseminating workflows to other audiences.
Before the dataset is shared it is analysed. This process checks for performance issues and
data errors. Data errors must be fixed before the dataset is shared (and published to your
portal).
At the 3.1 release of ArcGIS Pro, ArcPy provides access to over 1700 geoprocessing tools
along with a number of helper functions and classes.
The ArcPy site package complements the ArcGIS API for Python and can be used in both
desktop and web GIS workflows.
The workflow for publishing a layer in a project to your portal is displayed below:
The ArcGISProject class in the mapping (mp) module is used to obtain a reference to an
existing map project.
The project can contain many maps, which in turn can contain many layers. The
listMaps() method provides a Python list of map objects. The listLayers() method
on the Map class provides a Python list of Layer objects.
The particular layer you wish to publish can be obtained from this list.
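import arcpy
# Reference the existing ArcGIS Pro project
project = arcpy.mp.ArcGISProject(r"C:\MyProjects\Liverpool.aprx")
# Take the first map in the project, then the layer named "Grounds"
map = project.listMaps()[0]
layer = map.listLayers("Grounds")[0]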
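The getWebLayerSharingDraft() method on the Map class creates the sharing draft object.
It specifies the server type, the service type, the service name and the layers to be shared:
featSD = map.getWebLayerSharingDraft("HOSTING_SERVER",
"FEATURE",
"Football_Grounds",
[layer])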
The above line of code returns a Feature Sharing Draft object. Other sharing draft objects
are created if the service_type argument is TILE or MAP_IMAGE.
Once configured, the sharing draft object can then be exported to a Service Definition Draft
(.sddraft) file using the exportToSDDraft() method on the object.
featSD.exportToSDDraft(r"C:\MyData\Grounds.sddraft")
The SDDraft file is then passed as input into the Stage Service geoprocessing tool.
arcpy.server.StageService(r"C:\MyData\Grounds.sddraft",
r"C:\MyData\Grounds.sd")
If there are no errors then the service definition file can be uploaded and published to the
portal.
arcpy.server.UploadServiceDefinition(r"C:\MyData\Grounds.sd",
"My Hosted Services")
But what if you do not have access to ArcGIS Pro? The ArcGIS API for Python provides
you with the ability to upload and publish new content through the add() method on the
ContentManager helper class and the publish() method on the Item class.
In the arcgis.gis module locate the ContentManager class and find the add()
method:
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
Question 4: What keys are recommended when working with the item_properties
dictionary?
___________________________________________________________________________
___________________________________________________________________________
Question 5: What optional argument establishes a path (or URL) to your data that you wish
to add?
___________________________________________________________________________
___________________________________________________________________________
In the arcgis.gis module locate the Item class and answer the following questions:
___________________________________________________________________________
___________________________________________________________________________
Question 7: What are the types of services that a publisher can create?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
Question 10: When publishing a CSV file, why should you specify the optional
address_fields dictionary?
___________________________________________________________________________
___________________________________________________________________________
The newly added Item object can be inspected directly in your portal or through the rich
representation display of the Notebook.
Content which can be added to your portal can be in the form of zipped files, OGC compliant
web services, CSV and SD files, references to applications and packages.
The add() method allows you to specify item properties (an argument in the form
of a Python dictionary which helps define metadata for the new item), a pathway to the
source data, a thumbnail and a folder in the portal in which the item is to be created, to
name but a few arguments.
Code snippet:
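A minimal sketch of adding an item is shown below. It assumes that gis is an already connected arcgis.gis.GIS object; the helper names build_item_properties() and add_zipped_item(), the title, tags and file names are all hypothetical. Only the add() method and its item_properties, data and folder arguments are taken from the API.

```python
def build_item_properties(title, item_type, tags, snippet):
    """Assemble the metadata dictionary which is passed to add()."""
    return {
        "title": title,
        "type": item_type,
        "tags": tags,
        "snippet": snippet,
    }


def add_zipped_item(gis, data_path, item_properties, folder=None):
    """Upload a zipped dataset to the portal and return the new Item.

    `gis` is assumed to be a connected arcgis.gis.GIS object.
    """
    return gis.content.add(
        item_properties=item_properties,
        data=data_path,
        folder=folder,
    )


# Hypothetical values, for illustration only.
properties = build_item_properties(
    title="Football_Grounds_XX",
    item_type="Shapefile",
    tags=["football", "training"],
    snippet="Zipped shapefile uploaded through the ArcGIS API for Python",
)
print(properties)

# In a notebook with a live connection the call would then be:
# new_item = add_zipped_item(gis, r"C:\MyData\Grounds.zip", properties)
```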
By default, the item owner will be the logged in user who uploaded the item.
The user requires publishing credentials defined by a user type or a custom role.
A number of different service types can be created through the publish() method:
• Scene services.
The type of service is defined by the item source, for example, if vector tile packages are
added to the portal then a vector tile layer will be created once the package has been
published.
Creating a hosted feature service from a CSV file requires the optional address_fields
argument to be populated. Consider the following CSV file:
The file contains a LOCATION field which is used to position the data points, as there are no
X,Y columns. The address_fields argument is a dictionary object in which standardised
field names are used as the dictionary key and the field containing the address information
is the dictionary value so:
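A minimal sketch of such a dictionary follows. The "Address" key is an assumption based on the standardised field names used in the API reference, LOCATION is the column described above, and the helper name publish_csv_item() is hypothetical. csv_item is assumed to be an Item created from an uploaded CSV file.

```python
# Standardised field name (key) mapped to the CSV column holding the address (value).
# "Address" is assumed here; check the API reference for the keys your geocoder expects.
address_fields = {"Address": "LOCATION"}


def publish_csv_item(csv_item, address_fields):
    """Publish a CSV Item as a hosted feature layer, geocoding the address column.

    `csv_item` is assumed to be an arcgis.gis.Item of type CSV.
    """
    return csv_item.publish(address_fields=address_fields)


print(address_fields)
```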
Those from outside your organisation cannot access your item.
Groups: Sharing to a group restricts access to the item to members of the group.
Only the owner and others who have privileges to view content owned
by other members can view private items.
The methods to allow items to be shared are found on the Item class:
The share() method shares the item to groups of users, the organisation or to the wider
public.
The shared_with property reports with whom a particular item has been
shared.
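The two members can be used together, as in the sketch below. It assumes that item is an arcgis.gis.Item and that group_id is the ID string of an existing group; the helper name share_with_group() is hypothetical, and shared_with is read as a property, which is an assumption about the installed version of the API.

```python
def share_with_group(item, group_id):
    """Share an item to one group and report who it is now shared with.

    `item` is assumed to be an arcgis.gis.Item and `group_id` a group ID string.
    """
    # groups takes a list, so several group IDs could be supplied at once.
    item.share(groups=[group_id])
    return item.shared_with


# In a notebook with a live connection the call would be, for example:
# sharing = share_with_group(my_item, contractors_group_id)
```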
The zip file is an archive file format which supports lossless compression. Its advantages are:
• It’s a great way to keep related files together, such as the many files we see in a
shapefile and file geodatabase.
The zipfile module contains the ZipFile class. This is used to create a ZipFile object.
The Python with statement is used to manage an archive. The location and name of the
archive is specified along with the operation that is to be performed on the archive, for
example “r” for reading an existing archive, “w” to create a new archive and “a” to add
items into an existing archive.
Code Snippet:
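A minimal sketch of the three modes is shown below. The file names are invented and a temporary directory is used so the example does not touch any real data.

```python
import os
import tempfile
import zipfile

work_dir = tempfile.mkdtemp()

# Two small invented files to place in the archive.
notes_file = os.path.join(work_dir, "notes.txt")
extra_file = os.path.join(work_dir, "extra.txt")
for path, text in ((notes_file, "survey results"), (extra_file, "more results")):
    with open(path, "w") as handle:
        handle.write(text)

archive_path = os.path.join(work_dir, "example.zip")

# "w" creates a new archive (overwriting any existing file of that name).
with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    archive.write(notes_file, arcname="notes.txt")

# "a" adds items to the existing archive.
with zipfile.ZipFile(archive_path, "a") as archive:
    archive.write(extra_file, arcname="extra.txt")

# "r" opens the existing archive for reading.
with zipfile.ZipFile(archive_path, "r") as archive:
    names = archive.namelist()
    archive.printdir()
```

The with statement closes the archive automatically at the end of each block, even if an error occurs inside it.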
The ZipFile class contains the following methods which will be useful for managing an
archive:
• extractall()
• extract()
• printdir()
• read()
• write()
The extractall() method will extract all the contents of the archive (by default into the current
working directory) while the extract() method will extract a particular file into a
specified location.
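The difference between the two methods can be sketched as follows. The archive and its member names are invented, and writestr() is used only to build a small archive to extract from.

```python
import os
import tempfile
import zipfile

work_dir = tempfile.mkdtemp()
archive_path = os.path.join(work_dir, "parts.zip")

# Build a small invented archive with two members.
with zipfile.ZipFile(archive_path, "w") as archive:
    archive.writestr("roads.shp", "geometry")
    archive.writestr("roads.dbf", "attributes")

all_dir = os.path.join(work_dir, "all")
one_dir = os.path.join(work_dir, "one")

with zipfile.ZipFile(archive_path, "r") as archive:
    # Extract every member; without `path` the current working directory is used.
    archive.extractall(path=all_dir)
    # Extract a single named member into a chosen folder.
    extracted_path = archive.extract("roads.dbf", path=one_dir)
    # Read a member's bytes without writing it to disk.
    content = archive.read("roads.dbf")

print(sorted(os.listdir(all_dir)), os.listdir(one_dir), content)
```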
As you can imagine the zipfile module may be useful in your workflows for packaging files
into an archive for uploading the shapefile or file geodatabase into your portal, or for
extracting files once the archive has been downloaded from your portal.
Your workflow may require you to download content from your portal. For example, you
might be a data manager who needs to collate the results of surveys into a single file on
your local machine using ArcGIS Pro. The API can export and download these files into a
single file location and then post-process using ArcGIS Pro.
The Item class contains methods for exporting your data from one format to another and
then downloading the item to your host machine.
Code Snippet:
The new shapefile item can then be obtained and downloaded to your local machine:
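The export and download steps can be sketched as below. It assumes that item is an arcgis.gis.Item for a hosted feature layer; the helper name export_and_download(), the export title and the save path are hypothetical.

```python
def export_and_download(item, export_title, save_path):
    """Export a hosted feature layer to a shapefile item, then download it.

    `item` is assumed to be an arcgis.gis.Item; the path of the downloaded
    file is returned.
    """
    # export() creates a new item in the portal in the requested format.
    exported_item = item.export(title=export_title, export_format="Shapefile")
    # download() saves the exported item's data to the local machine.
    return exported_item.download(save_path=save_path)


# In a notebook with a live connection the call would be, for example:
# zip_path = export_and_download(survey_item, "Survey_Export", r"C:\MyData")
```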
Not only can the source data be downloaded but supplementary information such as an
associated thumbnail and metadata.
The functionality is available within your portal through the Download and Export buttons.
Downloading and accessing attachments has always been a problem if using the “out of the
box” functionality of the portal. The ArcGIS API for Python provides a workflow for accessing
attachments.
The FeatureLayer class is found in the features module. It has an attachments attribute
which provides access to a helper class called AttachmentManager which is in the
features.managers submodule. This manages the manipulation of attachments.
The oid argument can either be a single OID value (supplied as a string) or if a list of OIDs is
provided then all of the attachments for those object IDs will be downloaded.
Code snippet:
Sd (Service Definition) file; zipped up file geodatabases and shapefiles, CSV, layer package,
geoprocessing package and map package.
Question 4: What keys are recommended when working with the item_properties
dictionary?
Answer: Although optional, it is recommended that you include the following keys:
title, type, typeKeywords, tags, snippet and description.
Question 5: What optional argument establishes a path (or URL) to your data that you wish
to add?
Answer: The data argument allows you to specify a pathway to your data.
In the arcgis.gis module locate the Item class and answer the following
questions:
Question 7: What are the types of services that a “publisher” can create?
Answer: A publisher can create feature, tiled map, vector tile and scene services.
Answer: Some formats are not automatically detected so adding a file type confirms the
format that is to be published.
Question 10: When publishing a CSV file, why should you specify the optional address fields
dictionary?
Answer: The addition of the optional address fields dictionary allows the CSV file to be
spatially enabled.
Ultimately you would like to create an automated workflow for adding content to your
portal when new datasets arrive in a directory from various departments in your company
and publish datasets to a portal group called Contractors, so external contractors can view
the datasets.
In the exercise you will add zipped files to the portal. You must add your
initials to the name of the file geodatabase to provide a unique name to
the uploaded item.
You can not have a duplicate item of the same name in your portal as
this will cause an error!
C:\EsriTraining\PYAPI\Publishing\AylesburyProject
As you can see it is made up of 3 feature layers and a couple of layers which make up the
background basemap.
You can see that the 3 feature layers are a combination of point, polyline and polygon
feature types. But what formats are their data sources?
Question 1: What is the data source for both the CarParks and MajorRoads feature layers?
___________________________________________________________________________
Both feature layers have data sources which are made up of multiple files (shapefiles and
file geodatabases). This information is necessary as you will need to carry out some
additional processing in order to add your datasets to your portal.
❑ Save the ArcGIS Pro project and close ArcGIS Pro down.
You will carry out the next step in Jupyter Notebook. This is because you will
write some code to create the zip archives. If you are viewing a feature class in
the file geodatabase in ArcGIS Pro while you try and create the zip file of the
geodatabase, within ArcGIS Pro, then the process will fail as a lock is applied
against files relating to the open feature class.
❑ Open the Python Command Prompt window and change directory (cd) into the
following directory:
C:\EsriTraining\PYAPI\Publishing\Notebooks
The first cell is importing a number of useful modules and creating some variables specifying
directory locations for the datasets. Notice that the zipfile module is imported along with
the glob and os modules.
The glob module is useful for accessing files in a folder by specifying a wildcard and is useful
for finding all of the files whose name is the same but whose file extension is different, so
this would be handy for finding those files which make up a shapefile.
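A minimal sketch of this idea is shown below. The function name zip_shapefile() and the Boundary file names are invented for illustration; empty placeholder files stand in for a real shapefile.

```python
import glob
import os
import tempfile
import zipfile


def zip_shapefile(folder, base_name, output_zip):
    """Zip every file in `folder` named `base_name` with any extension."""
    pattern = os.path.join(folder, base_name + ".*")
    # Skip any earlier zip of the same name so an archive is never zipped into itself.
    parts = sorted(p for p in glob.glob(pattern) if not p.lower().endswith(".zip"))
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        for part in parts:
            archive.write(part, arcname=os.path.basename(part))
    return [os.path.basename(p) for p in parts]


# Empty placeholder files standing in for a shapefile (invented names).
demo_dir = tempfile.mkdtemp()
for name in ("Boundary.shp", "Boundary.shx", "Boundary.dbf", "Boundary.prj", "Other.shp"):
    open(os.path.join(demo_dir, name), "w").close()

output_zip = os.path.join(demo_dir, "Boundary.zip")
zipped = zip_shapefile(demo_dir, "Boundary", output_zip)
print(zipped)
```

The wildcard matches only the files sharing the Boundary name, so Other.shp is left out of the archive.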
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
The function is called zipFileGDB() and it is used to zip up a file geodatabase. Remember
that a file geodatabase is a directory and it contains many binary files.
This means that the function can now be called in future code.
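The notebook's own function is not reproduced here, but one way such a function could be written is sketched below. The argument order (output zip first, input geodatabase second) follows the exercise; the body, the skipping of .lock files and the demo file names are assumptions.

```python
import glob
import os
import tempfile
import zipfile


def zipFileGDB(outputZIP, inputGDB):
    """Zip the contents of a file geodatabase folder, keeping the .gdb folder name."""
    gdb_name = os.path.basename(inputGDB)
    with zipfile.ZipFile(outputZIP, "w", zipfile.ZIP_DEFLATED) as archive:
        for file_path in sorted(glob.glob(os.path.join(inputGDB, "*"))):
            # Lock files belong to running applications and are left out (assumption).
            if file_path.endswith(".lock"):
                continue
            archive.write(file_path,
                          arcname=os.path.join(gdb_name, os.path.basename(file_path)))


# A fake geodatabase folder with invented file names, for demonstration only.
demo_dir = tempfile.mkdtemp()
inputGDB = os.path.join(demo_dir, "Transport.gdb")
os.mkdir(inputGDB)
for name in ("a00000001.gdbtable", "gdb", "_gdb.host.sr.lock"):
    with open(os.path.join(inputGDB, name), "w") as handle:
        handle.write("demo")

outputZIP = os.path.join(demo_dir, "Transport_XX.gdb.zip")
zipFileGDB(outputZIP, inputGDB)

with zipfile.ZipFile(outputZIP) as archive:
    zipped_names = archive.namelist()
print("Geodatabase packaged:", zipped_names)
```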
This cell contains code which identifies the source file geodatabase to be zipped and the
output name of the archive. The os.path.join() function is used to concatenate file
names to directories which are going to be passed into the function.
❑ Underneath the comment write a line of code which calls the zipFileGDB()
function. Pass in the outputZIP variable as the first argument and inputGDB as
the second argument.
❑ Add a print() function indicating that the geodatabase has been successfully
packaged up.
Before you run the code in the cell you will just change the name of the zip file which will be
created.
❑ Locate the variable outputGdbZIPname and alter the name of the file to include
your initials at the end of the name of the file geodatabase, for example:
outputGdbZIPname = r"Transport_<Your_Initials_Here>.gdb.zip"
This will help the uploading process as a portal can not have duplicate
names of the same item type.
You are going to add a notebook into the project which contains some pre-written code.
❑ Locate the Insert tab and choose New Notebook > Add Notebook.
C:\EsriTraining\PYAPI\Publishing\Notebooks
Notice that the Catalog pane has a new node called Notebooks and the newly added
PublishItems notebook is referenced here.
After a few moments the notebook will open and you will see that there are existing mark-up
and comments, which are provided for you as a guide as to where you will write your code for
the rest of this exercise.
❑ In the first cell is a comment # Create a reference to the GIS. Add the following lines
of code:
Notice that you are using the pro authentication option. This creates a reference to the
active portal associated with the running ArcGIS Pro application without having to pass your
credentials into the script.
On successful login to the active portal your organisational account username will be
displayed.
You can now use the gis variable within the rest of your notebook.
Before you can publish the zip file archive you created in the previous step you need to add
it to your portal. You will write some code to achieve this.
https://developers.arcgis.com/python/api-reference/
❑ Locate the ContentManager class in the arcgis.gis module and then find the add()
method.
You will see that the item_properties argument is a required dictionary. The dictionary
contains information about the item you wish to upload and add to the portal. This is
normally the information that you would enter on the “Add an item from your computer”
dialog box.
# TODO: Add code for uploading the file geodatabase zip file
❑ Create a dictionary called dTransport and assign to it the following key : value
pairs:
KEY VALUE
“title” “AylesburyTransport_<Your-initials>”
Hopefully you have added your initials to the AylesburyTransport title key :
value pair.
❑ Create a variable called transportData and assign to it the name and location of
the zipped up geodatabase.
Remember to include your initials in the name of the zip file BEFORE the
.gdb.zip part of the file to maintain a unique naming convention.
gis.content.add(item_properties=dTransport,
data=transportData)
As long as there are no errors then the archive will have been uploaded into your portal!
❑ Check that the item is present within your portal, for example:
You are now ready to publish your newly added item as a hosted feature service.
You will now use the publish() method on the Item class to create a hosted feature service from your
uploaded portal item.
Notice that all of the arguments for the method are optional, including the
publish_parameters argument. This means that the metadata associated with the
source item will be used when publishing as a hosted feature service.
On successful completion of running the code cell you will see the message indicating that
the hosted feature layer has been created successfully.
In the final step you will inspect your newly created feature layer.
❑ Open a browser and log into ArcGIS Online using the credentials you were provided
with in the course.
You should see three items in your Content area of the portal: the File geodatabase, the
Feature Layer (Hosted) and the notebook you created in your first exercise:
❑ On the Content tab click the AylesburyTransport feature layer to display the
metadata for the service.
___________________________________________________________________________
___________________________________________________________________________
❑ Inspect each layer in turn in the Visualization tab to make sure the layers have been
published correctly. You might want to check that:
• The layers display when you zoom in / out and pan around the map.
• The pop-up attribute dialog box appears when you click on a feature.
❑ Click the Data tab and inspect the attribute table for each layer.
❑ Compare the attributes in each layer in your portal to the attributes of each layer in
ArcGIS Pro.
These are the sorts of things you should check when you are enacting a workflow. Can
you think of anything else?
Notice that the symbology is not the same as it was in ArcGIS Pro. This is because you have
published the layers via a notebook. If the layers were published from within ArcGIS Pro
using the Share As Web Layer dialog then the symbology would have been preserved in the
publishing process.
❑ Close ArcGIS Pro down and do not save any changes if you are asked to do so.
If time permits then perhaps you might wish to investigate the optional step of changing the
sharing options for your newly created hosted feature layer.
Otherwise…..
In this exercise you have experienced the publishing workflow, from viewing the original
source data, to using the zipfile and glob modules to help you accomplish particular tasks, to
then using the ArcGIS API for Python to add content and publish items. Please remember
that many types of data can be published, and the process of publishing is similar to that
described and performed in this exercise.
It is strongly recommended that you investigate the Sample Notebooks and the API
Reference for additional code samples in order to publish other types of file.
In this optional step you will write some code to share your newly created hosted feature
layer to the existing portal Contractors group. Only members of this group will be able to
see the contents of this group.
You will use the share() method on the Item class to share the content to the
Contractors group.
❑ Locate the share() method on the Item class in the API reference, and answer the
following questions:
Question 5: If you wish to share content to a group, what must you supply to the groups
argument?
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
___________________________________________________________________________
❑ In the PublishItems notebook, which is open in ArcGIS Pro, locate the code cell
which has the comment # TODO: Share your items…….
Notice that there is a line of code which already references the Contractors group ID.
Question 7: What is the name of the variable which references the Contractors group ID?
___________________________________________________________________________
❑ Write a line of code which will share the transport feature layer to the Contractors
group, using the answers to the above questions as your guide.
❑ Run the code cell which shares your content to the Contractors group.
On successfully running the code cell you should see a message indicating that the feature
layer was shared to the Contractors group.
Within the ArcGIS Online portal click the Content tab and choose All my content in the
Folders panel:
___________________________________________________________________________
___________________________________________________________________________
Let’s make sure your hosted feature layer is available in the Contractors group.
❑ In the portal click the Groups tab and locate the Contractors group.
Notice that under the Recently added content part of the page you should find your
Aylesbury_Transport feature layer which has been shared to the group.
You have successfully shared your hosted Aylesbury Transport feature layer to the
Contractors group so other group members can now access your item.
Exercise End
Question 1: What is the data source for both the CarParks and MajorRoads feature layers?
Answer: The data source for both layers is a file geodatabase feature class.
Answer: The output zip file name and the source file geodatabase to be zipped up, passed
in as a directory.
Answer: There are two layers within the AylesburyTransport service; CarParks and
MajorRoads.
Question 5: If you wish to share content to a group, what must you supply to the groups
argument?
Answer: You find the ID of the group as you would do any other item, by looking at its URL
and taking note of the alphanumeric characters which follow the “id” part of the URL.
Question 7: What is the name of the variable which references the Contractors group ID?
Answer: The hosted feature layer is shared to Owner (you) and the Contractors group.
Exercise 6 solution
CreateArchive.ipynb
PublishItems.ipynb
One of the things you will need to do is to validate and possibly correct your data, especially
if you are going to be working with table-based data, such as CSV files. To help you perform
these types of operations you can use the Pandas module and its DataFrame object or its
close cousin, the Spatially Enabled Dataframe which is found inside of the ArcGIS API for
Python.
The Matplotlib site package complements both types of DataFrame and is a great way to further
visualise your data.
Key to working with the Pandas module is an understanding of the DataFrame as this
contains a number of methods for inspecting and amending your data. Esri’s Spatial
DataFrame is based upon the Pandas DataFrame and it manipulates, manages and
translates data into new information.
This section will introduce you to both types of DataFrame object and the types of
operations that they support.
Learning objectives
After completing this lesson, you will be able to:
Notes
At the headline level, Pandas allows you to take data contained inside of a CSV file, an SQL table,
or pretty much any other type of table, and create a Python object called a DataFrame. The
DataFrame is made up of rows and columns; it is essentially a table.
The other major component of the Pandas module is the Series object, which represents a single
column in the DataFrame. It can be defined as a “one-dimensional labelled array”.
The following is an overview of just a few of the operations which can be performed, and
you are encouraged to further explore this module.
A Pandas DataFrame object can also be created from a Python dictionary. Consider the
following code:
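A minimal sketch of this is shown below; the surgery names and figures are invented. Each dictionary key becomes a column name and each list becomes that column's values.

```python
import pandas as pd

# Invented data: each key is a column name, each list holds that column's values.
surgeries = {
    "Name": ["Surgery A", "Surgery B", "Surgery C"],
    "Town": ["Aylesbury", "Oxford", "Reading"],
    "Patients": [5200, 8100, 6400],
}

df = pd.DataFrame(surgeries)

print(df)
print(df.shape)          # (rows, columns)
print(type(df["Town"]))  # a single column is a Series
```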
There are a number of reader functions available in the Pandas module which allow you to
access CSV files, JSON objects, HTML documents and Excel spreadsheets, to name but a few.
Each read function has a number of arguments and keyword arguments which specify how
the Pandas DataFrame is created.
The code below uses the Pandas read_excel() function which returns a DataFrame
object:
Notice, that by default the first five and the last five records are displayed, and that an
additional column has been added which is an auto-index column which starts at 0.
https://pandas.pydata.org/pandas-docs/stable/user_guide/io.html
• shape
• dropna()
• fillna()
• head()
• iloc[]
• info()
• isnull()
• loc[]
• rename()
• tail()
To see the last 5 records from the dataframe, the tail() method can be used; again
a number can be specified in the argument list.
The info() method is another exploratory tool which provides information about a
number of properties for your dataset, such as how many rows there are, the number of
columns, the datatype of each column, the number of NULL (NaN) values in the dataframe
and the memory footprint of the dataframe.
The shape attribute returns a tuple of two values: the number of rows and the number of
columns:
(1000, 11)
Values within the dataframe can be obtained by using the iloc and loc indexers.
For example, the indexer iloc[2] will get the third row in the dataframe object, as it is an
integer-based indexer. It returns a Series object if a single row is fetched:
If a list of rows, by index position, is specified then a DataFrame object is returned:
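A minimal sketch of the two cases, using a hypothetical four-row DataFrame:

```python
import pandas as pd

# Hypothetical DataFrame with four rows.
df = pd.DataFrame({"name": ["a", "b", "c", "d"], "value": [10, 20, 30, 40]})

# A single integer position returns a Series (the third row).
third_row = df.iloc[2]

# A list of integer positions returns a DataFrame.
some_rows = df.iloc[[0, 2]]

print(type(third_row).__name__)
print(type(some_rows).__name__)
```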
The loc[] indexer is label based; for example loc["UK"] will obtain the row associated with
the "UK" label value. The loc[] indexer can also be used to filter data.
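The sketch below assumes a hypothetical DataFrame whose index holds country labels, and shows both a label lookup and a filter through loc[]:

```python
import pandas as pd

# Hypothetical data; the index holds the row labels.
df = pd.DataFrame(
    {"capital": ["London", "Paris", "Madrid"], "visits": [120, 80, 45]},
    index=["UK", "France", "Spain"],
)

# Label-based lookup of a single row returns a Series.
uk_row = df.loc["UK"]

# Filtering: keep only the rows where the condition is True.
busy = df.loc[df["visits"] > 50]

print(uk_row)
print(busy)
```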
Pandas DataFrames can be sliced (just as strings and other sequence objects can be sliced).
Slicing allows you to specify a start position and a finish position for rows; so:
df[0:15]
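As with other Python sequences, the finish position is not included in the slice. A sketch on a hypothetical 100-row DataFrame:

```python
import pandas as pd

# Hypothetical DataFrame with 100 rows.
df = pd.DataFrame({"value": range(100)})

# Rows at positions 0 to 14; position 15 is excluded.
first_fifteen = df[0:15]
print(len(first_fifteen))
```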
The dropna() method will delete any row (or column) containing a null value. It either
returns a new dataframe or persists the edits in the original dataframe. It has optional
arguments which alter how records (or columns) are dealt with:
Argument   Description
inplace    A boolean: True will affect the original dataframe;
           False will create a new dataframe object.
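A sketch of dropna() on a hypothetical DataFrame, showing the axis, how and inplace keyword arguments:

```python
import pandas as pd

# Hypothetical data: one partially null column and one entirely null column.
df = pd.DataFrame({
    "name": ["a", "b", "c"],
    "score": [1.0, None, 3.0],
    "empty": [None, None, None],
})

# Drop only the columns made up entirely of nulls; a new DataFrame is returned.
without_empty = df.dropna(axis="columns", how="all")

# Default behaviour: drop any row which contains at least one null.
complete_rows = without_empty.dropna()

# inplace=True edits the original DataFrame; the method then returns None.
df.dropna(axis="columns", how="all", inplace=True)
```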
One final thing… You may have noticed that the column names have some issues: they have
spaces in them, and one or two column names are split across separate rows. The rename()
method will correct issues with the names of the columns.
In the example below, the suspect columns are renamed by passing in a dictionary which
maps each original column name to its new name:
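A sketch of rename() with a columns dictionary. The new column names used here are hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame with awkward column names.
df = pd.DataFrame({"GP Practice Code": ["E1"], "f2": ["Surgery A"]})

# Keys are the existing names; values are the new names (hypothetical).
renamed = df.rename(
    columns={"GP Practice Code": "practice_code", "f2": "practice_name"}
)

print(list(renamed.columns))
```

Without inplace=True the original DataFrame keeps its column names and a renamed copy is returned.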
Pandas is a powerful module for exploring your datasets. Having a rudimentary knowledge
of this module is a must for any aspiring data scientist.
You have been asked to create a new hosted feature service of GP Surgery locations for the
Covid-19 vaccination. The information about the surgeries is found in a spreadsheet and you
want to make sure that the data is in a suitable condition from which the hosted feature
layer can then be created.
1: _____________________________________________________________________
2: _____________________________________________________________________
3: _____________________________________________________________________
4: _____________________________________________________________________
5: _____________________________________________________________________
The SeDF reads from many different data sources, such as:
• Pandas DataFrames
• Feature classes
• GeoJSON
• Feature layers
• Shapefiles
The functionality of the SeDF is based upon the geometry engine available to the SeDF
object at creation. The SeDF will try to use ArcPy, if that is not available then the Shapely
engine will be used, and finally, if no engine is available then base geometry objects will be
created. This needs to be taken into account when using the to_featureclass()
method.
How is the Spatially Enabled DataFrame built upon the Pandas module?
Esri has injected two accessors (namespaces) into the Pandas object when it is imported.
The accessors are called spatial and geom.
The spatial accessor provides access to spatial operations on the DataFrame while the
geom accessor provides functions which work on the spatial Series object.
This means all the good data processing operations are available through Pandas plus the
additional spatial operations through the SeDF.
The following code provides access to the SeDF by injecting / registering the two Geo
namespaces with Pandas:
import pandas as pd
from arcgis.features import GeoAccessor, GeoSeriesAccessor
Once the accessors have been injected into Pandas, spatial properties and functions can
then be applied against the DataFrame object, for example consider the following code:
df['SHAPE'].geom.area
So area is a property on a geometry object which is accessed via the geom accessor. It
returns a Series of areas.
The spatial accessor provides the DataFrame with access to a number of spatial
functions, for example from_xy().
Even though a SeDF has been created it should be noted that its type is still a Pandas
DataFrame. But notice that the SeDF has an additional SHAPE column which allows for the
performing of spatial operations. In essence the DataFrame is now geo-enabled!
A word of warning….
Be aware that the API reference documentation at the 1.8.2 release of the API still makes
note of a class in the arcgis.features module called SpatialDataFrame.
This class has been deprecated since the 1.5x release of the API.
Annoyingly, there is a sample in the ArcGIS Tutorials which still uses the SpatialDataFrame
class.
The SeDF can be obtained from a number of different sources. It can also be exported to a
number of different file formats too, including hosted feature layers within your portal.
There are a number of methods which create a SeDF or export to other formats:
GeoAccessor
• from_df()
• from_featureclass()
• from_geodataframe()
• from_layer()
• from_xy()
• to_featureclass()
• to_featurelayer()
The from_df() method will take a Pandas DataFrame and create a SeDF if an address
column is present. Credits will be consumed to perform geocoding against the address
column. A GeoPandas GeoDataFrame can also be converted into a SeDF through the
from_geodataframe() method.
The from_featureclass() method allows you to access local geospatial datasets. The
types of datasets that can be accessed depends upon what Python modules and site
packages are installed.
If the ArcPy site package is installed, due to the API being available through the installation
of ArcGIS Pro, then the from_featureclass() method can read geodatabase feature
classes, shapefiles and OGC services, and ArcGIS Online web services, to name but a few.
Also, if ArcPy is not installed and you wish to read a file geodatabase feature class,
then the fiona Python package must be installed.
The method supports a number of optional keyword arguments including a where clause.
The from_layer() method will read feature layers from online and on-premises portals
into a SeDF.
A Pandas DataFrame can be converted into a SeDF if it contains X and Y columns via the
from_xy() method. A spatial reference should be applied to confirm the spatial reference
of the newly created SeDF.
The GeoAccessor class provides a number of methods for exporting the SeDF to a number of
dataset types, for example to a feature set (to_featureset()), or to a portal hosted
feature layer through the to_featurelayer() method.
The ArcPy site package is required if exporting to a file geodatabase feature class.
The to_featureclass() method requires a path and dataset name when exporting to a
shapefile or file geodatabase feature class, whereas the to_featurelayer() method requires
just the name and an optional folder when creating a hosted feature layer from the SeDF.
Once the SeDF object has been created, the traditional Pandas operations can be
applied, such as head(), tail(), iloc[], etc.
GeoAccessor
• bbox
• centroid
• full_extent
• geometry_type
• overlay()
• plot()
• sanitize_column_names()
The GeoAccessor class contains many properties which return metadata about the SeDF,
such as the minimum bounding box (bbox) and the SeDF’s centroid (centroid).
The overlay() method allows for a limited number of spatial operations to be performed
using two SeDF objects. The default operation is union but other operations are erase,
identity and intersection. A new SeDF is returned.
GeoSeriesAccessor
• as_arcpy
• as_shapely
• buffer()
• centroid()
• clip()
• contains()
• intersects()
• project_as()
• within()
The GeoSeriesAccessor class contains many methods which allow spatial analysis to be
performed on the SHAPE field, such as buffer(), clip() and contains(). There are also
members which convert a geometry for use with other geometry engines such as ArcPy and
Shapely. The project_as() method re-projects the geometries stored in the series from one
coordinate system to another, while applying an optional transformation.
The plot() method is based upon the Matplotlib library plot() function and performs
similar operations.
To access spatial members on the SeDF the spatial accessor still needs to be applied
because the SeDF is still a Pandas DataFrame.
The code snippet below shows how the plot() method is used to display the SeDF on the
map. The default renderer and symbology for point geometries is used:
A map widget object should be supplied on to which the SeDF’s features are plotted, while a
number of different renderers / colour palettes can be applied to symbolise the features.
Renderer types are:
• Single symbol
• Unique values
• Class breaks
• Heatmap
The documentation for the plot() method provides a description of the different symbol types.
An example of plotting the data within a SeDF using a single symbol, drawn as red
squares of a certain size, is presented below:
symbol_type: The type of symbol the user is creating. Options are simple, picture,
text and carto.
Different keyword arguments are available depending upon whether you are using the single
symbol, unique values, class breaks or heatmap renderer.
The most consistent way to apply colours to the SeDF is to use colour ramps. These are
matplotlib colour ramps; for example, “Reds_r” refers to the reversed Reds colour ramp,
and cstep refers to a particular colour in the colour ramp.
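To get a feel for what a colour ramp name and a step value refer to, the ramp can be inspected directly in matplotlib. This is only a sketch: it assumes that cstep behaves like an integer position along the ramp (0 to 255), which is not confirmed here, and it needs matplotlib 3.5 or later.

```python
import matplotlib

# Look up the reversed Reds colour ramp by name.
reds_r = matplotlib.colormaps["Reds_r"]

# Assumption: a step value is an integer position along the ramp (0 to 255).
dark_end = reds_r(0)      # start of the reversed ramp: dark red
step_50 = reds_r(50)      # the colour at position 50
light_end = reds_r(255)   # end of the reversed ramp: very pale

# Each colour is an (R, G, B, A) tuple of floats between 0 and 1.
print(step_50)
```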
If you are plotting your data through a SeDF then remember that
all you are doing is investigating the data, so keep the symbology
simple. Do not overcomplicate things or it will not work!
5: The column headings are not very descriptive. It might be a good idea to rename them.
In the section’s activity you inspected an Excel spreadsheet and you identified some issues
with it. For example, there are a number of columns which are composed entirely of NULL
values, some of the column names are not particularly descriptive and one or two of the
fields are duplicates of other existing columns.
You will use an existing Jupyter Notebook which contains some markup to provide you with
the necessary guidance.
The deciding factor is that because the data you are going to work with is in an Excel
spreadsheet, you will use the standard Jupyter Notebook. If you were to perform the
following actions in ArcGIS Online then the spreadsheet and the exercise’s Notebook would
have to be added as portal items.
To save a small amount of time you will use the Jupyter Notebook.
❑ Start Jupyter Notebook, if it has not been started already, and navigate to the
C:\EsriTraining\PYAPI folder.
As you might expect it contains some comments presented as markdown and a number of
empty code cells.
❑ Locate the cell underneath the Step 1: Accessing your Notebook markdown.
❑ Underneath the print() function write some code which will import the
pandas module and reference it as an alias called pd.
Now you will add a reference to two classes in the arcgis.features module which will allow
you to create a Spatially Enabled DataFrame (SeDF) and a series.
❑ Underneath the line of code which imports the Pandas module write the following
code:
from arcgis.features import GeoAccessor, GeoSeriesAccessor
Your code for the first step should look like the following:
In this first step, you have successfully made a connection to your portal, imported the
Pandas module and the two Accessor classes have been referenced as well.
❑ Locate the cell underneath the markdown which says: “Step 2: Create the Pandas
DataFrame”
In the previous step you created a reference to the imported Pandas module.
___________________________________________________________________________
The Pandas module contains a number of functions which allow you to create DataFrames
from a number of formats. You will use the read_excel() function to create a Pandas
DataFrame from the spreadsheet.
pd.read_excel()
The read_excel() function reads the Excel spreadsheet into a Pandas DataFrame.
Notice that the first and last 5 rows of the spreadsheet are displayed. Notice that the issues
you identified in the section activity are still present in the DataFrame:
The Pandas DataFrame is a good vehicle to correct these issues. Before you do that you will
obtain some information about the DataFrame.
First of all, let’s identify how many rows and columns are in the DataFrame.
❑ Underneath the cell which created the DataFrame, add a new cell.
❑ Write a line of code to use the df.shape attribute to return the number of rows
and columns.
Question 2: How many rows and columns does the DataFrame contain?
___________________________________________________________________________
Now that you have an idea of the number of rows and columns the DataFrame is composed
of, you will now get some information about the DataFrame columns and the amount of
memory the DataFrame occupies.
Question 3: What is the data type (Dtype) for the GP Practice Code field?
___________________________________________________________________________
___________________________________________________________________________
Question 5: For the f6 and f7 columns, how many records have non-null values?
___________________________________________________________________________
There is little point in keeping these columns and so you will remove them from the
DataFrame in the next step.
___________________________________________________________________________
___________________________________________________________________________
The DataFrame is three times smaller than the actual spreadsheet as it contains just the
data and basic column definition.
The info() method would have displayed something very similar to the following:
Finally, you will display just the first three records of the DataFrame.
df.head(3)
Notice that the rows are zero-indexed – the first row is referenced as index position 0.
Now that you have gained an understanding of the contents of the DataFrame you can now
correct the issues found in the spreadsheet through the methods found on the DataFrame
object.
You will first of all remove the columns composed entirely of null values.
Question 8: Which DataFrame method will delete columns if they contain null values?
___________________________________________________________________________
The dropna() method will delete any rows or columns containing null values. You can
specify whether it is rows or columns which are deleted, and the threshold for the number
of null values which determines when the deletion occurs. You can also specify if the
deletion occurs within the source DataFrame or if a new DataFrame is created.
In this step you will perform all operations on the same DataFrame. This avoids any
confusion over which DataFrame is being referenced.
❑ Locate the markdown which displays “Step 3: Correct your DataFrame issues”.
Notice that both the f6 and f7 columns have been removed as they were composed entirely
of null values. This confirms your answer to question 5.
There are a number of columns which are not really required, for example EXCEL_ID.
❑ Underneath the previous cell which you just ran, create a new cell.
There are some additional columns which should also be removed; namely f4, f5, f8 and f9
as they contain data which is duplicated in other columns or data that is just not useful.
❑ Create a new cell under the last line of code you just executed.
❑ Use the same notation as above to delete the f4, f5, f8, f9 and objectId columns.
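One way of deleting named columns is the drop() method with the columns keyword argument, sketched below on a hypothetical DataFrame. This is not necessarily the notation used in the exercise Notebook.

```python
import pandas as pd

# Hypothetical DataFrame with two unwanted columns.
df = pd.DataFrame({
    "EXCEL_ID": [1, 2],
    "f4": ["x", "y"],
    "name": ["Surgery A", "Surgery B"],
})

# Delete the listed columns from the same DataFrame.
df.drop(columns=["EXCEL_ID", "f4"], inplace=True)

print(list(df.columns))
```

By default drop() raises a KeyError if a listed column does not exist, so check the spelling and case of the column names first.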
The final thing you will do in this step is to rename some of the columns to something more
useful.
❑ Create a new cell underneath the code which you just executed.
You will use the rename() method to perform the renaming of the columns.
The rename() method has an argument called columns which requires a Python
dictionary; the key of the dictionary is the existing field to be renamed, and the value is the
new column name.
Your final view of the DataFrame should look like the one displayed below:
❑ Locate the cell underneath the markdown which displays “Step 4: Plot the
DataFrame”.
You will first of all write a line of code which will convert the Pandas DataFrame to a
Spatially Enabled DataFrame (SeDF). You will use the spatial accessor property on the
DataFrame object followed by the from_xy() method. This method is found on the
GeoAccessor class, which the spatial property provides access to.
The x_column argument references the X column of the DataFrame and the y_column
references the Y column in the DataFrame.
___________________________________________________________________________
___________________________________________________________________________
It contains information about the geometry type and the spatial reference (WKID).
The DataFrame is now spatially enabled, and is referenced in the variable called sed.
❑ Write a line of code which creates a map focusing on Ealing, London with a
zoomlevel of 12.
Remember that this was one of the first things you created when
you first started to access the API.
❑ Display the map using the map variable and run the code cell.
Now that you have spatially enabled the DataFrame, you will use the plot() method on
the spatial accessor property to display the data in the SeDF on the map.
sed.spatial.plot()
You will use the information in the following table to correctly populate the plot()
method:
Argument        Value
map_widget      map
renderer_type   's'
symbol_type     'simple'
symbol_style    'o'
colors          'Blues_r'
cstep           50
outline_color   'Blues_r'
marker_size     6
❑ Use the table to fill in the missing keyword arguments for the plot() method.
You can pan around the map, zoom in / out and click on the symbol to display information
associated with each feature:
Now you have inspected your DataFrame on a map by spatially enabling it, you are now
ready to publish it to your portal.
❑ Locate the markdown “Step 5: Create a Hosted Feature Layer from the SeDF”.
The GeoAccessor class has a number of methods which you can use to create new datasets,
for example - to_featureclass() will export a SeDF to a geodatabase feature class.
You will use the to_featurelayer() method to create a new hosted feature layer.
___________________________________________________________________________
There is only one required argument (title) – the other arguments are optional.
___________________________________________________________________________
❑ In the cell underneath the markdown create a new variable called fl and assign to it
the following:
sed.spatial.to_featurelayer()
❑ Use the following table as a guide to add the necessary arguments into the
to_featurelayer() method:
Argument   Value
title      'EalingsGPs_<YourInitials>'
gis        gis
❑ On the next line type fl as this will display properties of the feature layer using the
Notebook’s rich display capabilities.
Your code should look like the following (Your initials for the title argument will be
different…):
The result will be the rich display of the newly created feature layer properties in the
Notebook:
You will quickly investigate the newly created feature layer in your portal.
❑ Click on the title of the hosted feature layer in the rich representation display.
Notice that the field headings are as expected; the unrequired columns have been deleted,
while other columns have been renamed.
Notice that the default symbology has been provided for the surgery locations.
❑ Click the Overview tab to take you back to the item’s metadata.
Perhaps you might like to change the name of the actual layer; possibly add a description
and a summary? That might be something to do for another time…..
In this exercise you have taken an Excel spreadsheet, cleaned it through methods found on
the Pandas DataFrame and then inspected the data using the SeDF’s plot() method.
Finally, you created a new hosted feature layer and validated your work!
Exercise End
Question 2: How many rows and columns does the DataFrame contain?
Question 3: What is the data type (Dtype) for the GP Practice Code field?
Answer: The data type for the x and y fields is float64.
Question 5: For the f6 and f7 columns, how many records have non-null values?
Answer: Both fields contain zero non-null records, i.e. the field values are composed
entirely of NULL values.
Question 8: Which DataFrame method will delete columns if they contain null values?
Exercise 7 Solution
DataCleansing.ipynb