Tuesday, October 28, 2014

Canopy Height Models - An Object-Based Approach

Canopy Height Models (CHMs) derived from airborne LiDAR are nearly as old as LiDAR itself.  CHMs are typically raster representations of the tree canopy, although in some cases the term has been used to describe models that represent all features above ground, whether or not those features are actually canopy.  A true CHM is one in which other above-ground features, such as buildings and utility lines, have been removed.

Even if a CHM is accurate in the sense that it represents only tree canopy LiDAR returns, there are two primary limitations with most CHMs.  The first is that the CHM is stored in raster format.  Raster cells don't represent actual features, and thus the data are less accessible to decision makers who may have questions such as "Where are the tallest trees in our community located?" and "How many trees over 80 feet do we have in our parks?"  The second limitation stems from the fact that LiDAR are often acquired leaf-off, and thus a CHM derived from leaf-off LiDAR does not represent the canopy, but rather the occasional branch and stem that generated a return from the LiDAR signal.

As part of our tree canopy assessment for Boone, Campbell, and Kenton Counties in northern Kentucky, carried out in collaboration with Mike Galvin (SavATree) for the Northern Kentucky Urban and Community Forestry Council, we developed an object-based approach to canopy height mapping that overcomes the limitations of traditional CHMs.  Our object-based approach to land cover mapping integrates leaf-on imagery (A) and leaf-off LiDAR (B) to map tree canopy (C).  This process overcomes the limitations that are inherent in the imagery (no clear spectral signature for trees) and the LiDAR (leaf-off returns resulting in tree canopy gaps) to create a highly accurate tree canopy map.  In this project the accuracy of the tree canopy class was 99%.  We then feed the LiDAR (B) and the tree canopy (C) into a second object-based system that creates polygons approximating tree crowns and returns the maximum (D) and average (E) canopy height using only those LiDAR returns that are actually trees.  The result is a vector polygon database that can be easily queried and merged with other vector datasets for subsequent analysis.
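For those curious what that final per-crown summary step looks like outside of our production environment, here is a minimal sketch using geopandas and rasterstats.  It is an illustration, not our actual workflow: the file names and the 80-foot query are hypothetical, and it simply assumes a canopy-only height raster (in feet) and crown-approximating polygons already exist.

    # Hypothetical sketch: summarize canopy height for each crown polygon.
    # Assumes a canopy-only height raster and crown polygons already exist.
    import geopandas as gpd
    from rasterstats import zonal_stats

    crowns = gpd.read_file("crown_polygons.shp")      # polygons approximating tree crowns
    stats = zonal_stats(crowns["geometry"], "canopy_height.tif",
                        stats=["max", "mean"], nodata=-9999)

    crowns["max_ht"] = [s["max"] for s in stats]
    crowns["avg_ht"] = [s["mean"] for s in stats]

    # The vector database can then be queried directly, e.g. crowns taller than 80 ft
    tall_crowns = crowns[crowns["max_ht"] > 80]
    crowns.to_file("crown_heights.shp")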

This project would not have been possible without the LiDAR data distributed through the Kentucky from Above program.  If you would like to reference the graphic we have posted it to FigShare.


Tuesday, October 7, 2014

The Role of LiDAR Attributes in Feature Extraction

Over the past few weeks I have noticed a number of questions in online discussion forums around the topic of how LiDAR point cloud attributes, such as classification and return number, can be used to help identify or automatically extract features.  We have numerous other posts detailing our automated feature extraction workflow, specifically how we use an object-based approach to extract information from LiDAR, imagery, and other data sources.  In this post I would like to turn the focus to LiDAR, specifically how the point cloud attributes can be used to highlight above-ground features such as buildings and tree canopy.

Most of the LiDAR data that we work with is acquired using the USGS specification, with an average of 1-4 points per square meter.  As LiDAR datasets are typically acquired to support topographic mapping of the earth's surface, the acquisitions are done during leaf-off conditions.  Because leaves will reflect a LiDAR signal, acquiring during leaf-off increases the chance that the laser signal will reach the ground.

As long as your LiDAR data is in LAS format, each point contains a wealth of information beyond the elevation.  The LiDAR point attributes we will be most concerned with in this post are the class and the number of returns.  You can find out more about both of these attributes by reading up on the ASPRS LAS specification.  The class is assigned to each point, typically by the contractor who processed the data, using a semi-automated approach.  The most basic LAS classification will split the points into ground (class 2) and unclassified points (everything else, class 0 or 1).  The graphics below show an example of LiDAR data in LAS format first symbolized by elevation and then symbolized by classification.
LiDAR point cloud.  Each point is colored by its absolute elevation.  Blue represents the low elevations and red the highest elevations.
LiDAR point cloud symbolized by class.  Green is ground, magenta is overlap, cyan is water, and red is unclassified.  Black areas are water that contain no LiDAR points as water absorbs the LiDAR signal.
The return information comes from the LiDAR sensor.  Discrete return LiDAR data typically have up to four returns.  The graphic below shows the same point cloud in which the points have been symbolized based on the number of returns.  Dense surfaces, such as buildings and ground, have a single return (red), but trees generally produce multiple returns (green, cyan, and blue).  The less dense structure of trees (particularly deciduous trees that lack leaves) creates a return at the top of the tree, then other returns off subsequent branches, and finally one from the ground.

LiDAR point cloud symbolized by return number.  Red indicates a single return, green - two returns, cyan - three returns, and blue - four returns.
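If you want to poke at these attributes yourself, the snippet below is a minimal sketch using the laspy library.  The file name is a placeholder for any classified LAS tile.

    # Hypothetical sketch: inspect class codes and return counts in a LAS file.
    import laspy
    import numpy as np

    las = laspy.read("tile.las")                    # any classified LAS tile

    classes = np.asarray(las.classification)        # ASPRS class codes (2 = ground)
    n_returns = np.asarray(las.number_of_returns)   # returns generated by each pulse

    print("points per class:", dict(zip(*np.unique(classes, return_counts=True))))
    print("single-return points (dense surfaces):", int((n_returns == 1).sum()))
    print("multi-return points (mostly vegetation):", int((n_returns > 1).sum()))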
These representations illustrate how point cloud information can provide insight into the type of feature.  For example, trees and buildings are both tall and assigned to a class other than ground or water.  When it comes to the number of returns we see that buildings have a single return whereas trees typically have more than one return.  The process of using a combination of class and return number to differentiate between trees and buildings becomes more clear when we generate raster surface models from the LiDAR point cloud.  A Normalized Digital Surface Model (nDSM) is a gridded dataset in which each pixel represents the height of features relative to the ground.  It is created by using the ground points (LAS class 2) to create a raster Digital Elevation Model (DEM), using the first returns to create a raster Digital Surface Model (DSM), then subtracting the DEM from the DSM.  The example below shows the nDSM for the same area as the point cloud examples from above.  Buildings and trees show up as tall (red and yellow), whereas non-tall features on the landscape such as roads and grass show up as short (blue).

Normalized Digital Surface Model (nDSM).
A similar approach is used to create a Normalized Digital Terrain Model (nDTM).  A DTM is generated from the last returns.  The DEM is then subtracted from the DTM to create the nDTM.  The nDTM is very effective at highlighting buildings and suppressing trees.  This is because the height of the last returns for buildings (dense surfaces) is much greater than that of the ground, as the LiDAR signal does not penetrate buildings.  Because the LiDAR signal penetrates tree canopy in most cases, the height difference between the DTM and DEM is often low for trees.

Normalized Digital Terrain Model (nDTM).
Subtracting the nDTM from the nDSM highlights trees.  This is because the first and last returns for buildings are often at nearly identical heights, whereas for trees the difference is typically much greater.
nDTM subtracted from the nDSM.
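To make the surface-model recipe above concrete, here is a rough sketch that grids a classified LAS tile into a DEM, DSM, and DTM and then differences them.  It is a simplification (a production DEM would be interpolated rather than binned), and the file name and cell size are assumptions.

    # Hypothetical sketch: grid a classified LAS tile into simple surfaces and
    # difference them into an nDSM and nDTM. A production DEM would be
    # interpolated; here each cell just keeps a summary of the points in it.
    import laspy
    import numpy as np

    CELL = 1.0                                       # cell size in the data's horizontal units
    las = laspy.read("tile.las")
    x, y, z = np.asarray(las.x), np.asarray(las.y), np.asarray(las.z)
    cls = np.asarray(las.classification)
    rn, nr = np.asarray(las.return_number), np.asarray(las.number_of_returns)

    cols = ((x - x.min()) / CELL).astype(int)
    rows = ((y.max() - y) / CELL).astype(int)
    shape = (rows.max() + 1, cols.max() + 1)

    def surface(mask, initial, reducer):
        """Keep one elevation per cell for the selected points using the reducer."""
        grid = np.full(shape, initial)
        reducer.at(grid, (rows[mask], cols[mask]), z[mask])
        grid[grid == initial] = np.nan               # cells with no points
        return grid

    dem = surface(cls == 2, np.inf, np.minimum)      # ground returns: lowest point per cell
    dsm = surface(rn == 1, -np.inf, np.maximum)      # first returns: highest point per cell
    dtm = surface(rn == nr, -np.inf, np.maximum)     # last returns

    ndsm = dsm - dem     # height of features above ground
    ndtm = dtm - dem     # high for buildings, low where the signal reached the ground
    trees = ndsm - ndtm  # large where first and last returns differ, i.e. tree canopy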
Although these LiDAR datasets are excellent sources by themselves for mapping features, they are imperfect for tree canopy extraction due to their leaf-off nature.  To overcome this limitation we take an object-based approach in which we integrate the spectral information in imagery and use iterative expert systems that take context into account to reconstruct the tree canopy, filling in the gaps in the leaf-off LiDAR.  The result is a highly accurate and realistic representation of tree canopy.  In general LiDAR gets us 80%-90% of the way there, and imagery the rest of the way.
Tree canopy extracted using an object-based approach overlaid on a hillshade layer derived from the nDSM.
Leaf-on imagery.
For more information on how to create the surface models mentioned in this post check out the Quick Terrain Modeler video tutorials.  If you want to generate raster surface models in ArcGIS this video will show you how.

Saturday, August 23, 2014

New Urban Tree Canopy (UTC) Assessment Project Map Portal

We have a new Urban Tree Canopy (UTC) Assessment Projects web mapping portal up.  The web site lists all the UTC projects completed by the USDA Forest Service's UTC assessment team (hopefully down the road we can add others) and key information about each project, along with the ability to download the project report and high-resolution land cover data.  Credit for the web map goes to the brilliant Matt Bansak, with database support from the SAL's Tayler Engel.

Monday, August 18, 2014

Generating road polygons from breaklines and centerlines

A number of years ago LiDAR was acquired for the entire state of Pennsylvania through the PAMAP program.  The LiDAR data are currently available from PASDA, and are a great resource.  In addition to point clouds and raster surface models, the deliverables also included breaklines.  Breaklines are great, but they are just lines representing the edges of the roads.  What if you want to calculate the actual road surface?  Using the road breaklines in combination with existing county road centerline data, we developed an automated routine within eCognition to turn the breakline and centerline data into road polygons so that the actual paved road area can be computed.  This is another example of how the term "Object-Based Image Analysis" or "OBIA" no longer fits the type of work that we are doing with eCognition.

Here is how we went about it (a rough open-source sketch of the same idea follows the steps).
1) Turn the breaklines and centerlines into image objects.

2) Compute the Euclidean distance to the road centerlines.

3) Classify objects based on their relative border to the centerlines and breaklines, and the distance to centerlines.

4) Clean up the classification based on the spatial arrangement of the road polygons.

5) Vectorize the road objects and simplify the borders (yellow lines are the vector polygon edges, pink polygons are the original image objects).
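For anyone without eCognition, the same general idea can be sketched with open-source tools: form candidate polygons from the breaklines and keep the ones a road centerline passes through.  This is a rough illustration, not our production rule set, and the file names are placeholders.

    # Hypothetical open-source sketch: polygonize road breaklines and keep the
    # polygons that a road centerline passes through.
    import geopandas as gpd
    from shapely.ops import polygonize, unary_union

    breaklines = gpd.read_file("road_breaklines.shp")     # edge-of-pavement lines
    centerlines = gpd.read_file("road_centerlines.shp")   # county centerline data

    # Node the breaklines together, then form candidate polygons from the line work
    noded = unary_union(list(breaklines.geometry))
    candidates = gpd.GeoDataFrame(geometry=list(polygonize(noded)), crs=breaklines.crs)

    # Keep candidates that a centerline runs through (a distance threshold could be
    # used instead, mirroring the Euclidean distance step above)
    center_union = unary_union(list(centerlines.geometry))
    roads = candidates[candidates.intersects(center_union)].copy()

    roads["area"] = roads.geometry.area                   # paved area per polygon
    roads.to_file("road_polygons.shp")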


Ecology of Prestige, exploring the evidence in NYC

We have a new paper in Environmental Management that uses the SAL's signature high-resolution, 7-class, LiDAR-derived land cover data (3ft version available here for free). The current study replicates and extends a previous paper that used the SAL's land cover data in Baltimore. Between our work on mapping, assessment, estimating carbon abatement potential, and examining the effects of tree canopy on asthma and air quality, this dataset is getting quite a bit of use. We hope that because the data are freely available, others will continue to use them.

After using some fancy spatial statistics we are able to show that even after controlling for population density (available space for trees) and socioeconomic status (available resources for trees), there is still quite a bit of variation in tree canopy, much of which is explained by lifestyle characteristics.

We conclude “To conserve and enhance tree canopy cover on private residential lands, municipal agencies, non-profit organizations, and private businesses may need to craft different approaches to residents in different market segments instead of a “one-size-fits-all” approach.  Different urban forestry practices may be more appealing to members of different market segments, and policy makers can use that knowledge to their advantage.  In this case, advocates may consider policies and plans that address differences among residential markets and their motivations, preferences, and capacities to conserve existing trees and/or plant new trees.  Targeting a more locally appealing message about the values of trees with a more appropriate messenger tailored to different lifestyle segments may improve program effectiveness for tree giveaways.  Ultimately, this coupling of theory and action may be essential to providing a critical basis for achieving urban sustainability associated with land management.”


The paper was co-written with Northern Research Station scientist J. Morgan Grove and Clark University doctoral student Dexter Locke.

Monday, July 21, 2014

Comparing tree canopy percentages

NeighborhoodNewspapers.com is citing a study in which the tree canopy in Sandy Springs increased by 3% since 2010.  This is a big jump in a 3-year period.  According to the Wikipedia entry Sandy Springs has 24,142 acres of land area.  The report from the study states that the tree canopy in 2010 was 59% and the tree canopy in 2013 was 62%.  Using these numbers we would conclude that Sandy Springs had 14,244 acres of tree canopy in 2010 and 14,968 acres of tree canopy in 2013, an increase of approximately 724 acres.  This equates to over 547 football fields of new tree canopy!  That is a lot of new growth in only three years.  Furthermore, these calculations assume that there was no loss.  A quick look at some of the imagery from the 2010-2013 time period (thanks Google!) shows that there was tree loss in the area due to development.  In order to make a net gain of 3% in spite of that loss, there would certainly have to have been more than 724 acres of new tree canopy added over this time period.
Imagery from 2010.  Trees in the red circle were removed over the 3-year period.
Imagery from 2012 in which tree loss is clearly visible.
Unfortunately no good numbers exist for how much new tree canopy an urban forest can produce on an annual basis, but it is highly unlikely that there was enough new tree canopy to offset the loss in this 3-year period.  Having completed dozens of tree canopy studies, we are of the opinion that mapping tree canopy from imagery alone gets one to within ~2% of the actual area of tree canopy.  Registration errors between the imagery from each time period, along with mapping errors, limit the achievable accuracy.  If the 2010 and 2013 estimates were each only accurate to +/- 2%, then a statement of a 3% gain simply cannot be supported by the data.
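For reference, the arithmetic behind those figures works out as below.  This is only a back-of-the-envelope check, not part of the original study, and it treats the ~2% mapping accuracy as percentage points of land area.

    # Back-of-the-envelope check of the Sandy Springs figures cited above.
    land_acres = 24_142                       # land area per the Wikipedia entry
    canopy_2010 = 0.59 * land_acres           # ~14,244 acres
    canopy_2013 = 0.62 * land_acres           # ~14,968 acres
    gain = canopy_2013 - canopy_2010          # ~724 acres of apparent new canopy

    # If each estimate is only good to about +/- 2 percentage points, the worst-case
    # uncertainty on the difference is roughly 4 points (~966 acres), larger than
    # the reported gain itself.
    uncertainty = 0.04 * land_acres
    print(round(gain), round(uncertainty))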

Based on my quick assessment of the imagery I am inclined to believe that it is more likely that Sandy Springs lost tree canopy over the 2010-2013 time period.  In the short term it is far easier to document tree canopy loss than gain, as trees are easy to remove but hard to grow.  The fact is that based on the available data we don't know if Sandy Springs lost or gained tree canopy over these 3 years, and it is highly unlikely that 724 new acres of tree canopy appeared in such a short time.  Although we have published methods that provide increased precision for mapping tree canopy change, it is very difficult to accurately quantify changes in tree canopy over a short time period, as was done in this study, without LiDAR.  My recommendation is that Sandy Springs wait 5 to 10 years between future tree canopy assessments and use the i-Tree protocols to analyze historical changes.

Monday, July 7, 2014

Data Fusion in eCognition Webinar


Today I gave a webinar showing some of the data fusion techniques we employ within eCognition to update features in vector layers.  eCognition is often seen as an OBIA (object-based image analysis) platform, but more recently it has morphed into a data fusion platform in which one can work with LiDAR point clouds, raster imagery, and vector points, lines, and polygons within a single environment.  The object data model breaks down the barriers that exist in working with these differing data types, and the rule-based approach within eCognition allows for streamlined workflows in which everything from point cloud rasterization to vector overlay operations can be done within a single software package.

Although I am still working on the webinar recording, I have posted the eCognition project, rule set, and associated data for download.  This will allow you to walk step-by-step through the example.  Please note that this project was designed to illustrate some techniques, not be a perfect feature extraction workflow.  Also note that although the LiDAR and imagery are for the Wicomico County, Maryland area, the vector data are fictional.

The data for this project consist of the following:

  • a LiDAR point cloud (LAS)
  • 4-band high-resolution imagery (raster)
  • Building polygons (vector shapefile)
  • Water polygons (vector shapefile)
  • Road centerlines (vector shapefile)
The scenario is one in which the water polygons and building polygons are out of date when compared to the imagery and LiDAR.  A rule-based expert system was developed in eCognition to identify missed water polygons, find false buildings that were either incorrectly digitized or removed, and add in new buildings that are present in the LiDAR.


The eCognition project contains a single LiDAR point cloud dataset in LAS format.
Point cloud tile used in the project
eCognition project settings

The rule set consists of three main processes:

  1. Reset - clears the image object levels and removes all loaded and generated data.
  2. Load data - adds the imagery and vector layers to the project.
  3. Feature extraction - classifies the missing water polygons, false buildings, and new buildings.
The reset process simply clears the project so that the end user can start from scratch.  In the Load Data portion of the rule set I use the create/modify project algorithm to add in the imagery and vector layers.  You can do this when you create the project, but I like to keep my project setup simple.  Plus, having the rule set load the data makes it easier for me to reuse the rule set.
Imagery is displayed after being loaded as part of the rule set

The first step of the feature extraction process focuses on classifying water that was missing from the vector layer.  Water clearly stands out in the imagery, so all it took was some basic segmentation algorithms followed by a threshold classification using the NIR values.  Once the missing water features were classified, they were converted to a vector layer within eCognition and smoothed to look more realistic using the new vector algorithms introduced in version 9.

Existing water in the vector layer (blue) and new water features classified from the imagery (yellow)
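As a rough illustration of the same idea outside eCognition, the sketch below thresholds the NIR band directly (the actual rule set segments first).  The band order, threshold value, and file names are assumptions.

    # Hypothetical sketch: water absorbs near-infrared light, so a low-NIR
    # threshold picks up candidate water pixels.
    import numpy as np
    import rasterio

    with rasterio.open("imagery_4band.tif") as src:
        nir = src.read(4).astype("float32")      # assuming band 4 is near-infrared
        profile = src.profile

    water = (nir < 40).astype("uint8")           # dark in NIR = candidate water

    profile.update(count=1, dtype="uint8")
    with rasterio.open("new_water_mask.tif", "w", **profile) as dst:
        dst.write(water, 1)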
To aid with the building extraction I used some of eCognition's LiDAR processing tools to take the point cloud and turn it into a Normalized Digital Surface Model (nDSM) in which each pixel represents the height above ground.  
LiDAR nDSM
I also used the roads layer to create a new raster in which each pixel represents the distance to the roads.  This distance to roads layer is used later in the rule set to remove features that are similar to buildings, such as large overhanging signs.
Roads overlaid on the new distance to roads raster
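Outside of eCognition, an equivalent distance-to-roads raster can be sketched with a Euclidean distance transform.  The grid is borrowed from an existing raster, and the file names are placeholders.

    # Hypothetical sketch: rasterize the centerlines, then build a distance-to-roads
    # surface with a Euclidean distance transform.
    import geopandas as gpd
    import rasterio
    from rasterio.features import rasterize
    from scipy.ndimage import distance_transform_edt

    roads = gpd.read_file("road_centerlines.shp")

    with rasterio.open("ndsm.tif") as src:         # borrow grid and extent from an existing raster
        transform, shape, profile = src.transform, (src.height, src.width), src.profile

    road_pixels = rasterize(roads.geometry, out_shape=shape, transform=transform,
                            fill=0, default_value=1)

    cell = transform.a                             # pixel size
    dist_to_roads = distance_transform_edt(road_pixels == 0) * cell

    profile.update(count=1, dtype="float32")
    with rasterio.open("dist_to_roads.tif", "w", **profile) as dst:
        dst.write(dist_to_roads.astype("float32"), 1)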
By converting the existing building vectors to objects we could use information in the LiDAR point cloud, specifically the average elevation of all points minus the average elevation of the ground points, to identify false buildings.  These locations are exported as a point file, along with the unique building ID, for use in GIS software.
Buildings in the vector layer (red) and false buildings (green)
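A rough equivalent of the false-building check with open-source tools might look like the sketch below: join the LAS points to each building polygon and compare the mean elevation of all points to the mean elevation of the ground points.  The 2-meter cutoff and the file names are assumptions, and it assumes the LAS tile and vector data share a coordinate system.

    # Hypothetical sketch: flag building polygons whose LiDAR points sit at roughly
    # ground height (mean of all points minus mean of ground points).
    import geopandas as gpd
    import laspy
    import numpy as np

    buildings = gpd.read_file("buildings.shp")
    buildings["bldg_id"] = buildings.index

    las = laspy.read("tile.las")                  # assumed to be in the same CRS as the vectors
    pts = gpd.GeoDataFrame(
        {"z": np.asarray(las.z), "ground": np.asarray(las.classification) == 2},
        geometry=gpd.points_from_xy(np.asarray(las.x), np.asarray(las.y)),
        crs=buildings.crs)

    joined = gpd.sjoin(pts, buildings[["bldg_id", "geometry"]], predicate="within")
    height_above_ground = joined.groupby("bldg_id").apply(
        lambda g: g["z"].mean() - g.loc[g["ground"], "z"].mean())

    false_ids = height_above_ground[height_above_ground < 2.0].index   # assumed ~2 m cutoff
    buildings[buildings["bldg_id"].isin(false_ids)].to_file("false_buildings.shp")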
Getting at the missed buildings was a bit more complex.  First I did a simple threshold to separate out the tall features.  Within the tall features I re-segmented based on the imagery, then classified those objects with little difference in elevation between the first and last returns and low NDVI as new buildings.  Some refinement was done using the road distance layer along with area thresholding.  Finally, a series of vector processing algorithms were used to clean up the appearance of the new buildings.
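A raster-level sketch of that new-building logic, again with assumed thresholds, band order, and file names, might combine the surfaces like this.

    # Hypothetical sketch: candidate new buildings are tall in the nDSM, show little
    # difference between first- and last-return surfaces, and have low NDVI.
    import numpy as np
    import rasterio

    def read_band(path, band=1):
        with rasterio.open(path) as src:
            return src.read(band).astype("float32")

    ndsm = read_band("ndsm.tif")                   # height above ground (first returns)
    ndtm = read_band("ndtm.tif")                   # height above ground (last returns)
    red = read_band("imagery_4band.tif", 3)
    nir = read_band("imagery_4band.tif", 4)

    ndvi = (nir - red) / (nir + red + 1e-6)

    tall = ndsm > 2.5                              # above-ground features
    solid = (ndsm - ndtm) < 0.5                    # first and last returns nearly identical
    not_vegetation = ndvi < 0.2

    new_buildings = tall & solid & not_vegetation  # boolean candidate mask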