Building Footprints Tutorial
Building Footprints Extraction
Tutorial Outline
Overview
Skills Acquired
Step 1: Set up the project
Step 2: Choose a pre-trained model
Step 3: Select imagery bands
Step 4: Create training samples
Step 5: Export training data
Step 6: Train a Deep Learning Model
Step 7: Building Detection
Requirements
• ArcGIS Pro (see options for software access)
• ArcGIS Image Analyst
• Deep Learning Libraries for ArcGIS Pro
• Recommended: NVIDIA GPU with a minimum of 8 GB of dedicated memory
Overview
Deep Learning (DL) has been around for over 50 years, yet its most significant
advancements have predominantly emerged in the past decade. These
advancements were driven by the availability of more training data, better computers,
and improved deep-learning frameworks. In Esri’s ArcGIS system, the workhorse
applications of DL are located mainly in ArcGIS Pro, ArcGIS Online, and the ArcGIS API
for Python, specifically the learn module.
In this tutorial, you are a GIS analyst for the City of Lethbridge in Alberta who has been
tasked with generating building footprint layers for various neighbourhoods to support
city planning activities. You want to use deep learning to extract building footprints from
aerial imagery in ArcGIS Pro. You’ll choose a pre-trained model provided by ArcGIS
Living Atlas and use the transfer learning approach to fine-tune the model’s
performance in the City of Lethbridge. You’ll provide a small number of new training
samples and further train the model. You’ll then test the fine-tuned model on a
Lethbridge neighbourhood and extract enhanced results of the building footprint.
highered@esri.ca https://hed.esri.ca Twitter/X: GIS4HEd 1
Public
Skills Acquired
By completing this tutorial, you will become comfortable with the following skills:
• Create training samples using ArcGIS Pro
• Train and fine-tune a deep learning model
• Execute your building detection model in ArcGIS Pro
Step 1: Set up the project
a) Download the Building_Footprints_Extraction.zip file and locate the downloaded
file on your computer.
b) Right-click the downloaded .zip file and extract it to an easy-to-find
location on your computer, such as a folder on your C: drive.
c) Open the “Lethbridge_Building_Detection” folder and double-click
“Lethbridge_Building_Detection.aprx” to open the project in ArcGIS Pro. This is
the project file containing the required contents for this tutorial. Upon opening the
file, you should see a spatially referenced TIFF file named “Lethbridge_1m.tif”.
d) In the Contents pane, right-click “Lethbridge_1m.tif” and choose Zoom To Layer. Pan
and zoom around the map to get an idea of the different buildings across the city.
This is an aerial image captured in April 2021 for the City of Lethbridge. The
resolution of the original imagery was changed to 1 m resolution for the purpose
of this tutorial. You can see there are thousands of buildings in the city. We
decided that instead of counting the buildings on the ground or digitizing them
manually in this image, we would use Deep Learning tools in ArcGIS Pro to
inventory buildings across the city.
Note:
If you do not have the deep learning libraries installed, press Ctrl+S to save your
project, close ArcGIS Pro, and follow the instructions in the “install deep learning
frameworks for ArcGIS” document in the downloaded zip file. When the installation is
complete, you can reopen your ArcGIS Pro project and continue with the tutorial.
Step 2: Choose a pre-trained model
You want to use deep learning to extract buildings from the aerial imagery. If you don't
already have a deep learning model available, this first requires training a model from
scratch, feeding it large numbers of examples to show the model what a building is.
High-performing models can require being exposed to tens of thousands of examples.
An alternative is to use a model that was already trained for you. You will retrieve such
a model and learn about its specifications.
a) Go to the ArcGIS Living Atlas of the World website.
b) In the search box, type “Pretrained model” and press Enter. Browse the list of
results to see the more than 60 pre-trained models available.
c) In the search box, type Building Footprint Extraction and press Enter. The list of
results contains pre-trained deep learning models for different regions of the
world. Since your area of interest is in Canada and there is no model on the
Living Atlas for building extraction in Canada, you will choose the closest model
geographically, which is trained on buildings from the USA.
d) Click Building Footprint Extraction – USA and take some time to read the content
of that page (e.g., input, output, applicable geography, model architecture, etc.).
Then, proceed to download the model.
e) Create a folder named “PretrainedModel” in your Lethbridge_Building_Detection
folder. Then, move the “usa_building_footprints.dlpk” model file from your
download location to the PretrainedModel folder.
Step 3: Select imagery bands
You will now check how well your imagery matches the expectations of the downloaded
pre-trained model.
a) In the Contents pane, right-click Lethbridge_1m.tif and choose Properties. In the
layer properties window, click “Source” and expand “Raster Information”. Find the
Number of Bands field. Its value is 4. This is multispectral imagery composed of
four spectral bands: red, green, blue, and near infrared. The near infrared band is
often used for visualizing vegetation health.
b) Find the Cell Size X and Cell Size Y fields. The value is 1 in both cases (see
image below). This means that each cell (or pixel) in the imagery measures 1 by
1 meter. However, the pretrained model expects three bands (red, green, and
blue) and higher resolution imagery to perform well. You will need to remedy this
mismatch between your imagery and the pretrained model’s expectations.
c) To fix the band mismatch, you need to produce a new layer that contains only the
first three bands of the imagery so that it better matches the model's
expectations. This is an important step; if skipped, the model might not run or will
underperform.
d) On the ribbon, on the “Imagery” tab, in the Analysis group, click the “Raster
Functions” button.
e) In the Raster Functions pane, in the search box, type Extract Bands. Set the
following Extract Bands parameter values:
• For Raster, choose Lethbridge_1m.tif
• For Combination, verify that the value is 1 2 3, which refers to bands 1
(red), 2 (green), and 3 (blue).
• For Missing Band Action, choose Fail.
• Click Create new layer.
A new layer, named Extract Bands_ Lethbridge_1m.tif, appears in the
Contents pane. Layers created by raster functions are computed
dynamically and not saved on disk. In this case, you want to persist the
resulting layer as a TIFF file on your computer. You will do that with Export
Raster.
• Right-click Extract Bands_ Lethbridge_1m.tif, and choose Data and
“Export Raster”. In the Output Location window, browse to Folders >
Lethbridge_Building_Detection > Imagery. For Name, type
Lethbridge_RGB.tif, and click Save. Accept all other default values and
click “Export”.
• In the Contents pane, right-click Lethbridge_RGB.tif, and choose
“Properties”. Click Source and expand Raster Information. Find the Number
of Bands field. The field value is 3, confirming that the layer now has three
bands, just as the pretrained model expects.
• Remove all imagery from the Contents pane except
Lethbridge_RGB.tif, and save your project.
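To make the band selection concrete, the following is a minimal Python sketch of the idea behind the Extract Bands settings used above. It is not the ArcGIS implementation: the function name extract_bands and the sample pixel values are hypothetical, and the sketch works on a single pixel rather than a raster. It only illustrates that the combination 1 2 3 uses 1-based band numbers, and that a "Fail" style policy raises an error when a requested band does not exist.

```python
def extract_bands(pixel, combination):
    """Return the values of the requested 1-based bands from one pixel.

    Mirrors a "Fail" policy: a missing band raises an error instead of
    being substituted or skipped. (Hypothetical helper, for illustration.)
    """
    selected = []
    for band in combination:
        if not 1 <= band <= len(pixel):
            raise ValueError(
                f"Band {band} does not exist in a {len(pixel)}-band pixel"
            )
        selected.append(pixel[band - 1])  # band numbers start at 1
    return tuple(selected)


# Hypothetical 4-band pixel: (red, green, blue, near infrared)
pixel = (120, 135, 90, 200)
print(extract_bands(pixel, [1, 2, 3]))  # keeps red, green, blue only
```

Requesting a band that is not present, for example extract_bands(pixel, [1, 2, 5]), raises a ValueError in this sketch, which is the behaviour you want here: a silent substitution could hide a mismatch between the imagery and the model.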
Understand the need for Transfer Learning
You now must fix the resolution mismatch: the model expects higher-resolution imagery
(10-40 cm), and your imagery is at 1-meter resolution. If you were to apply the Building
Footprint Extraction – USA pretrained model to the Lethbridge_RGB.tif layer directly,
you would get poor results, as you can see in the following example image:
In that image, the buildings detected are shown in red. Because of the resolution
mismatch, the model could detect the larger buildings but struggled to identify any of the
smaller ones. One approach to remedy this issue is to use transfer learning.
Transfer learning is a technique in machine learning in which knowledge learned from a
task is reused to boost performance on a related task. Here, the original task was to
detect buildings in 10-40 cm resolution imagery, and the new task is to detect buildings
in 1-meter resolution imagery. A major advantage of transfer learning is that it requires a
relatively small amount of training data and short training time compared to what would
be needed to train a model from scratch. However, there is a limit to what transfer
learning can do if the mismatch between your imagery and the expected input is too
extreme. Next, you will learn how to perform transfer learning to fine-tune the pre-
trained model to perform better on your data.
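The effect of the resolution mismatch can be estimated with simple arithmetic. The sketch below is an illustration only: the function name pixels_covered and the 10 m by 12 m building are assumptions, not values from the tutorial data. It computes roughly how many pixels a rectangular footprint covers at the resolutions the pretrained model expects (10 cm and 40 cm) and at the 1 m resolution of the Lethbridge imagery.

```python
def pixels_covered(width_m, length_m, cell_size_m):
    """Approximate number of pixels covered by a rectangular footprint."""
    return round((width_m / cell_size_m) * (length_m / cell_size_m))


# Hypothetical 10 m x 12 m house at three cell sizes (in meters)
for cell_size in (0.1, 0.4, 1.0):
    print(f"{cell_size} m cells: {pixels_covered(10, 12, cell_size)} pixels")
```

In this example the same house covers 12,000 pixels at 10 cm, 750 pixels at 40 cm, and only 120 pixels at 1 m. Small buildings therefore carry far less detail at 1 m than the model saw during its original training, which is consistent with the model missing the smaller buildings.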
Step 4: Create training samples
To perform transfer learning, you first need to produce training examples to show the
model what a building looks like in your data. If you were training a model from scratch,
you would need tens of thousands of building samples. Thankfully, with transfer
learning, you only need a few hundred.
a) Click “Lethbridge_RGB.tif” in the Contents pane, and proceed to the “Imagery”
ribbon tab. Under “Classification Tools”, click “Training Samples Manager”
and import the classification schema created for this tutorial by clicking
“Browse to existing schema” as shown in the image below. This schema is called
“Buildings Schema.ecs” and is located in the “Output_provided” folder.
b) In the “Training Samples Manager”, click “Building” and the sketch tools will
become active, ready to create training samples. You can select a suitable sketch
tool (e.g., polygon), zoom to a small extent, and start sketching areas that visibly
contain buildings. A few bookmarks have been saved for you to help with
the training samples process.
c) On the ribbon, on the Map tab, in the Navigate group, click Bookmarks, and
select Training Location 1.
d) Create your own training samples using all four Training Location
bookmarks (1, 2, 3, and 4), then save them as a feature class called “TrainingSamples”
in the default project geodatabase, Lethbridge_Building_Detection.gdb. In
general, the more training samples you have, the better the training for the DL
model. Alternatively, you can load the training samples that were pre-generated
for this tutorial (located in Results.gdb in the Output_provided folder) by
clicking the load training samples folder icon as shown below. If you do, create several
more training samples, and don’t forget to save the feature class.
Load training samples
Example of training samples from the Training Location 1 bookmark.
Step 5: Export training data
After you’ve saved your training samples, the next step is to use the “Export Training
Data for Deep Learning” tool through Geoprocessing (see Tools in the Analysis ribbon
tab). Fill in the parameters as instructed below.
o Input Raster – Lethbridge_RGB.tif
o Output Folder – type ImageChips
o Input Feature Class or Classified Raster – TrainingSamples
o Class Value Field – Classvalue
o Image Format – TIFF format
o Tile Size X – type 256
o Tile Size Y – type 256
o Stride X – type 64
o Stride Y – type 64
o For Metadata Format, ensure that RCNN Masks is chosen.
o Click the Environments tab. Under Raster Analysis, for Cell Size, choose Same
as layer Lethbridge_RGB.tif.
o Click on Run
The tool will generate image chips in TIFF format, along with an XML file associated
with each image chip. Depending on the size of the imagery, the tile size, the stride,
and the rotation angle specified, as well as the power of your computer, the process
can take quite a bit of time to complete, so please be patient.
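To see why tile size and stride affect the run time, the sketch below counts how many tile positions fit on a regular grid over an image. It is a simplified illustration, not the tool's actual logic: the function names and the 1024 by 768 pixel area are assumptions, edge handling is ignored, and the real tool may export fewer chips (for example, only tiles that contain training samples).

```python
def chips_along_axis(image_size, tile_size, stride):
    """Number of full tiles that fit along one axis for a given stride."""
    if image_size < tile_size:
        return 0
    return (image_size - tile_size) // stride + 1


def chip_grid(width, height, tile_size=256, stride=64):
    """Return (columns, rows) of tile positions on a regular grid."""
    return (
        chips_along_axis(width, tile_size, stride),
        chips_along_axis(height, tile_size, stride),
    )


# Hypothetical 1024 x 768 pixel area, tutorial settings (tile 256, stride 64)
cols, rows = chip_grid(1024, 768)
print(cols, rows, cols * rows)

# Same area with no overlap between tiles (stride equal to the tile size)
cols, rows = chip_grid(1024, 768, stride=256)
print(cols, rows, cols * rows)

# Fraction of each tile shared with its neighbour along one axis
print(1 - 64 / 256)
```

With a stride of 64 and a tile size of 256, neighbouring tiles overlap by 75 percent along each axis, so the grid in this example has 13 by 9 = 117 positions instead of the 4 by 3 = 12 positions of a non-overlapping grid. A smaller stride produces more chips from the same samples, at the cost of a longer export.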
Step 6: Train a Deep Learning Model
You will now conduct transfer learning. You’ll use the chips you generated to further
train the usa_building_footprints.dlpk pretrained model. In the Geoprocessing pane,
search for and open the Train Deep Learning Model tool and set the following
parameter values:
• For Input Training Data, click the Browse button. Browse to Folders >
Lethbridge_Building_Detection > select ImageChips and click OK.
• For Output Model, click the Browse button. Browse to Folders >
Lethbridge_Building_Detection > Results, type
Lethbridge_1m_Building_Footprint_Extraction, and click OK.
• For Pre-trained Model, click the Browse button. Browse to the folder where you
saved the usa_building_footprints.dlpk pretrained model, select it, and click OK.
• Change the Batch Size to 8. Change this to 4 or 2 if your computer runs out of
memory.
• Expand the Advanced section, verify that the Freeze Model box is checked, and
uncheck the Stop when model stops improving box.
• Click the Environments tab. For Processor Type, choose GPU. Accept all other
default values and click Run. The process might take one hour or more to run.
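The batch size controls how many chips are processed together, which trades GPU memory for the number of steps per epoch. The sketch below illustrates that relationship under stated assumptions: the function name iterations_per_epoch, the figure of 1,000 chips, and the 10 percent validation split are all hypothetical values chosen for the example, so check the tool's own parameters for the values used in your run.

```python
import math


def iterations_per_epoch(num_chips, batch_size, validation_percent=10):
    """Training steps needed to pass once over the training chips.

    validation_percent is the share of chips held out for validation
    (assumed to be 10 here for illustration).
    """
    validation_chips = num_chips * validation_percent // 100
    training_chips = num_chips - validation_chips
    return math.ceil(training_chips / batch_size)


# Hypothetical export of 1,000 image chips
for batch_size in (8, 4, 2):
    print(batch_size, iterations_per_epoch(1000, batch_size))
```

In this example, halving the batch size from 8 to 4 doubles the number of steps per epoch (113 versus 225), but each step needs less memory. That is why reducing the batch size is the usual first response to an out-of-memory error.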
Below is a snapshot of the tool's output after training the model for 20 epochs. As
shown, both the training and validation losses consistently decrease, while the model’s
average precision improves with each epoch. This trend is a strong indication that the
model is effectively learning.
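The trend described above can also be checked programmatically if you copy the per-epoch losses out of the tool's messages. The sketch below is an illustration with made-up loss values, and the function names are hypothetical. It tests whether a loss series keeps decreasing and counts how many epochs have passed since the best validation loss, which is the kind of signal an early-stopping rule relies on.

```python
def is_strictly_decreasing(values):
    """True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


def epochs_since_best(validation_losses):
    """Number of epochs elapsed since the lowest validation loss."""
    best_epoch = min(
        range(len(validation_losses)), key=validation_losses.__getitem__
    )
    return len(validation_losses) - 1 - best_epoch


# Hypothetical losses for five epochs (not taken from the tutorial run)
training_loss = [1.20, 0.95, 0.80, 0.71, 0.65]
validation_loss = [1.25, 1.02, 0.90, 0.84, 0.80]

print(is_strictly_decreasing(training_loss))    # healthy: keeps falling
print(is_strictly_decreasing(validation_loss))  # healthy: keeps falling
print(epochs_since_best(validation_loss))       # 0 means latest epoch is best
```

If the training loss keeps falling while the validation loss has not improved for several epochs, the model may be starting to overfit the training chips. In that case, more or more varied training samples usually help more than additional epochs.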
Step 7: Building Detection
Since the object detection process is hardware intensive, it is best to run the tool on a
smaller area to test your parameters before running it on a full imagery dataset.
a) On the ribbon, click the Map tab. In the Navigate group, click Bookmarks.
Choose Detection Area.
b) On the Analysis ribbon tab, click Tools to open the Geoprocessing pane. Search
for and select “Detect Objects Using Deep Learning”. Fill in each of the required
parameters in the tool:
o Input Raster: Lethbridge_RGB.tif
o Output Detected Objects: type Lethbridge_Buildings
o Model Definition: browse to and select
Lethbridge_1m_Building_Footprint_Extraction.dlpk
o Padding: 64
o Threshold: 0.2
o For Batch Size, type 8.
Change this to 4 or 2 if your computer runs out of memory.
o For tile_size, verify that the value is 256.
o Check the Non Maximum Suppression box (this removes duplicate
features)
o In Max Overlap Ratio, type 0
o Click Environments tab
o Under Processing Extent, set Extent to Current Display Extent
o For Cell Size, choose Same as layer Lethbridge_RGB.tif
o Set Processor Type to GPU
o Click Run
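The Threshold, Non Maximum Suppression, and Max Overlap Ratio settings work together to decide which detections are kept. The sketch below shows the general technique on simple bounding boxes. It is not the tool's implementation: the function names and sample detections are hypothetical, boxes stand in for building polygons, and the overlap ratio is assumed to be intersection area divided by union area.

```python
def box_area(box):
    """Area of a box given as (xmin, ymin, xmax, ymax)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def overlap_ratio(a, b):
    """Intersection area divided by union area of two boxes."""
    width = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    height = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    intersection = width * height
    union = box_area(a) + box_area(b) - intersection
    return intersection / union if union > 0 else 0.0


def filter_detections(detections, threshold=0.2, max_overlap_ratio=0.0):
    """Drop low-confidence detections, then apply non maximum suppression."""
    confident = [d for d in detections if d[1] >= threshold]
    confident.sort(key=lambda d: d[1], reverse=True)  # best first
    kept = []
    for box, confidence in confident:
        # Keep a box only if it does not overlap a better box too much
        if all(overlap_ratio(box, k[0]) <= max_overlap_ratio for k in kept):
            kept.append((box, confidence))
    return kept


# Hypothetical detections: (box, confidence)
detections = [
    ((0, 0, 10, 10), 0.90),
    ((1, 1, 11, 11), 0.60),    # duplicate of the first building
    ((20, 20, 30, 30), 0.15),  # below the 0.2 threshold
    ((40, 40, 50, 50), 0.50),
]
for box, confidence in filter_detections(detections):
    print(box, confidence)
```

In this example the third detection is removed by the threshold, and the second is suppressed because it overlaps a higher-confidence detection. With a Max Overlap Ratio of 0, any overlap at all is treated as a duplicate, so only the first and fourth detections remain.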
It is crucial to understand that the results of your model may not be ideal on the first
attempt. Training and deploying a deep learning model is a process that often requires
multiple iterations to achieve the best outcomes.
Better results can be achieved by doing the following:
• Increasing your initial sample size of features
• Increasing the maximum number of epochs to 50 or 100
• Ensuring that your training samples accurately capture the features you
want to detect
• Ensuring that your training samples include features of different sizes
• Adjusting the geoprocessing tools' parameters
• Retraining an existing model using the Train Deep Learning Model tool's
advanced parameters
Compare the two examples below, before and after transfer learning, to inspect the
results. Notice that the fine-tuned model did a fairly good job of detecting buildings,
although some are missing.
Before Transfer Learning
After Transfer Learning
The use of GIS does not stop here, but this tutorial does. If you are interested in the
next steps, you may consider the following options:
➢ Publishing the extracted buildings layer to ArcGIS Online as a feature service
➢ Using a dashboard to monitor and communicate the results with stakeholders
In this tutorial, you used deep learning to extract building footprints from aerial imagery
in ArcGIS Pro. You chose a pre-trained model from ArcGIS Living Atlas and learned the
importance of matching your input data to the model’s expectations. Then, you applied
transfer learning to fine-tune the model’s performance on your imagery. Finally, you
applied the model to a Lethbridge neighbourhood and obtained enhanced results.
Acknowledgements
• The Imagery for this tutorial was downloaded from the City of Lethbridge
OpenData Catalogue
• World Topographic Map sources: Esri, HERE, Garmin, FAO, NOAA, USGS,
OpenStreetMap contributors, and the GIS User Community
• World Hillshade layer sources: Esri, Airbus DS, USGS, NGA, NASA, CGIAR, N
Robinson, NCEAS, NLS, OS, NMA, Geodatastyrelsen, Rijkswaterstaat, GSA,
Geoland, FEMA, Intermap, and the GIS user community.
© 2024 Esri Canada. All rights reserved. Trademarks provided under license from Environmental Systems Research Institute Inc. Other
product and company names mentioned herein may be trademarks or registered trademarks of their respective owners. Errors and omissions
excepted. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. The Education
and Research Group at Esri Canada makes every effort to present accurate and reliable information. The Web sites and URLs used in this
tutorial are from sources that were current at the time of production but are subject to change without notice to Esri Canada.