Here we demonstrate a multi-scale LiDAR based approach to determine urban tree AGB for the London Borough of Camden, UK (Fig. 1). A new ALS ITD method is presented to identify and attribute individual trees with structure metrics. TLS is used to derive new allometry at four locations across the Borough, transferable tree structure metrics are identified and used to model tree volume. The new allometry is subsequently applied to the ALS segmented tree crowns to generate a Borough-wide map of AGB. To the best of our knowledge, LiDAR based ITD, to derive structural information for use in allometry, has not been previously applied in an urban context.

Fig. 1 A map of the London Borough of Camden and location in UK (right). Field locations are identified in italics. Contains OS data ©Crown copyright and database right (2018) Full size image

Table 1 TLS scanning location and description Full size table

Location

The London Borough of Camden is located in inner north west London and comprises an area of 21.8 km2 (Fig. 1). The area was once forested but was extensively developed during the nineteenth and twentieth centuries to a mix of residential and industrial land use. Camden was chosen as it is typical of inner London Boroughs, containing a range of urban land cover types (“unmanaged” urban forest, large managed parks, tree-lined streets, private gardens, industrial areas and transport infrastructure e.g. train lines) encompassing a broad range of tree and forest management strategies, age structures, species composition and municipal functions. Camden also has good coverage of recent UK Environment Agency (UK EA) ALS. The Borough contains the suburbs of Camden Town and Hampstead, large areas of park land, including Hampstead Heath, and a number of smaller public squares and private gardens.

The Borough is home to ~ 28,000 street trees with an additional 10–15 K trees in parks and nature reserves [46]; however, this does not include trees located in City of London managed parks as well as other private land. For example, there are an estimated 30 K additional trees on Hampstead Heath in the north of the Borough (pers. comm. David Humphries, Trees Management Officer, City of London). Street tree species are dominated by Platanus x acerifolia (London Plane) 15% and Tilia europaea (Common Lime) 7%; all other species (\(N=242\)) comprise ≤ 4% each.

To derive new allometry for the Borough, four locations were scanned with TLS (Fig. 1 and Table 1). The locations were chosen for their representativeness of park and street trees in Camden, Highgate Cemetery was chosen after preliminary analysis suggested the area contained very high AGB.

TLS acquisition and processing

TLS was captured with a RIEGL VZ-400 laser scanner (RIEGL Laser Measurement Systems GmbH) which has a beam divergence of 0.35 mrad, a pulse repetition rate of 300 KHz, a maximum range of 600 m and can record multiple returns. For all locations, the scanning resolution was set to an angular step of 0.04° as this has previously proved sufficient for tree extraction and QSM modelling [47]. As the RIEGL VZ-400 captures data in a panoramic field of view (100° in zenith when the scanner is upright), it is necessary to tilt the scanner by 90° to capture the full hemisphere. To capture data from multiple viewing positions and reduce the effects of occlusion, a number of scan positions were captured at each location (Table 2). To co-register scan positions it is necessary to have tie-points between scans that are easily identified in post-processing, here this was achieved using cylindrical retro-reflective targets mounted on poles [47]. Survey pattern was different for each location based upon tree density, leaf status, access and time constraints; mean distance between scan locations are presented in Table 2.

Table 2 Details of TLS scanning Full size table

Point clouds from each scan were co-registered using RIEGL RiSCAN Pro software. Individual trees were then identified and extracted using the treeseg software library [48]. V was estimated using the QSM approach of Raumonen et al. [45], where the patch size variable \(d_{min}\), which controls the size of cover sets used to generate cylinders (and ultimately the topological detail captured), was iterated over [48]. As the initialisation of each of QSM reconstruction is stochastic, 10 reconstructions for each tree point cloud and for each \(d_{min}\) value were generated [26], this resulted in up to 160 reconstructions per tree. The set of reconstructions with the largest value of \(d_{min}\) that produced satisfactory results [48] were chosen, from these the reconstruction with a volume closest to the mean was retained.

To reduce uncertainty in tree volume and subsequent allometry, point clouds and QSMs had to meet certain quality criteria to be considered for use in allometry development. These criteria were; (i) the mean nearest neighbour distance (computed as the mean Euclidean distance between a point and its four closest neighbours [47]) computed for each 1 m slice through a tree point cloud had to be ≤ 5 cm (excluding the uppermost slice), (ii) the 95% confidence level for the 10 QSM reconstructions for each tree point cloud had to be ≤ 10% of volume, and (iii) the point cloud had to be unaffected by wind i.e. no shadowing of branches visible in the point cloud. The set of trees that fulfilled this criteria, referred to as QSM trees, were used to construct allometric equations (see below).

TLS extracted trees could not be reliably mapped to a tree species, instead a mean wood density value for the dominant species on a per location basis (Table 1) were taken from the Global Wood Density Database [49].

ALS acquisition and processing

The UK EA capture ALS data over England primarily for flood risk mapping, this is distributed through an Open Government Licence by the UK Environment Agency as 1 km2 .las tiles [50]. Data for the area covering Camden were acquired on 6th February, 2015, at a pulse density of 2 pulses m–2 (calculated as the density of first returns in an open area) where for each outgoing pulse a maximum of 4 returns were recorded. Environment agency LiDAR data are captured to a vertical accuracy of ± 5 cm and a horizontal accuracy of ± 40 cm [51].

Data for the area intersecting the Camden Borough boundary were extracted from the global dataset. 5% of the Borough extent fell outside of the LiDAR footprint, previous UK EA acquisitions have been preprocessed to remove the majority of vegetation returns (Alastair Duncan, UK EA, pers comm) and were therefore unsuitable for filling gaps. Data were ground-normalised using the LAStools lasheight tool [52] so that z values were relative to the ground plane. A filter to remove points where \(z \le 1\) m was then applied to remove ground and other low returns.

Segmenting trees from Airborne LiDAR

Clustering techniques group individual data points into features sets that share some commonality. With regard to LiDAR data, features are often identified as groups of points connected in 3D space, such as street furniture [53] or tree crowns as discussed here. Some techniques require the number of features a priori e.g. k-means clustering, local maxima identified in the CSM are used to prime the algorithms as well as seed points from which clustering is initiated [29, 54]. Examples of cluster approaches that rely solely on the 3D point data included the Mean Shift algorithm [55] which uses a variable kernel to determine the search window size for which points are clustered and PTrees [56] which uses a multi-scale segmentation selecting the most likely segments as crown clusters. However, both of these approaches have only been applied to small forest plots and may not scale to large city-wide datasets owing to their complexity. Here we demonstrate a LiDAR point cloud based clustering approach that identifies individual tree crowns without additional imagery and that is scalable to large urban areas (Fig. 2).

Fig. 2 Individual tree detection work flow (i–vi) for segmenting ALS data into tree crowns, the bottom panel shows a TLS derived crown map as a comparison. Letters in panels 4 and 5 refer to common issues with the ITD crown segmentation where; A a small crown subsumed into a larger one, B remaining building points increasing crown area, C over segmentation of crowns, D commission errors, E under segmentation of crowns, and F omission errors (particularly of suppressed trees). Presented data is of Malet Street (Table 1) Full size image

A point cloud D contains points p where \(D = \{p^N\}\) and \(N = |D|\). Each \(p \in D\) is a set of coordinates and other metadata associated with the .las format, for simplicity we need only consider \(\{\mathbf {a}, rn\}\) where \(\mathbf {a}\) = (x, y, z) coordinate vector and rn refers to the “Number of Returns” metafield [57]. The aim is to compute a set of clusters \(C = \{c^N\}\) where cluster c corresponds to an individual tree crown. Each cluster \(c =\{P, H, Ar, r\}\), where P is the point cloud that corresponds to the tree crown, H is the maximum \(p_z \in P\), Ar is the projected crown area calculated as a 2D convex hull \(\forall p \in P\) [58] and \(r=\root \of {\dfrac{Ar}{\pi }}\), r was derived to simplify regression of crown dimensions with H (see below).

As urban areas are a patchwork of buildings, roads, trees, other green spaces etc., not all non-ground LiDAR returns are backscattered from tree crowns; therefore, \(D = C + \epsilon\) where \(\epsilon\) needs to be filtered before clustering can commence. This was achieved by firstly filtering D so that \(\forall p \in D: p_{rn} > 1\) [59, 60]. This step removes the majority of buildings and other hard surfaces, which tend to backscatter a single return i.e. \(p_{rn} = 1\) (Fig. 2ii). The majority of remaining points were resultant from vegetation backscatter, as well as from building edges, roof mounted air conditioning units and aerials, cranes etc [60]. This step also vastly reduces data volume, decreasing processing time in subsequent steps.

D was segmented into C using a two-step cluster approach. Here we use Density-Based Spatial Clustering of Applications with Noise (DBSCAN) [61] as a low pass filter to identify discrete tree crowns and canopies (Fig. 2iii) followed by Balanced Iterative Reducing and Clustering using Hierarchies (BIRCH) [62] to extract individual trees from canopy segments (Fig. 2iv). DBSCAN and BIRCH were both implemented using Python Scikit-Learn [63].

DBSCAN is suited to ITD from LiDAR point data as (i) |C| is not required as an a priori input, (ii) features can be of an arbitrary shape and size, (iii) outliers \(\epsilon\) are removed, examples here include linear features e.g. building edges, where points do not fulfil the criteria (i.e. density) to form a cluster, and (iv) efficient scaling to large datasets. Ayrey et al. [64] used DBSCAN to identify and remove understorey shrubs from an ALS dataset captured over a conifer forest. DBSCAN requires two parameters, a neighbourhood radius eps and a minimum number of points min_sample so that c is considered a cluster when \(|c_P| > min\_sample\) and \(p \in c_P\) if \(\Vert p - q\Vert < eps\). Values for eps and \(min\_sample\) are a function of crown morphology and the ALS point density, \(min\_sample\) increases monotonically with eps. If eps is too small, crowns tend to be split into sub-crown components (both horizontally and vertically) as well as an increase in false positive. If eps is too large then features of interest are ignored. Here, eps and \(min\_sample\) were set to 3.5 m and 20 points respectively, this allows for smaller features to be identified (\(\root \of {\pi 3.5}\approx 38\) m2) where point density ~ 2 points m–2.

DBSCAN will concatenate adjacent, or density-connected, points into larger clusters that have a radius \(>eps\) [61]. This is desirable as it allows c to have an arbitrary shape and size capturing the idiosyncrasies of a tree crown. However, this behaviour also leads to the merging of c into canopies, where points from adjacent crowns are in close enough proximity (Fig. 2). This is further exacerbated by low LiDAR point density that require lower values of \(min\_sample\). BIRCH is therefore applied to further segment the output of DBSCAN into its constituent crowns if:

$$\begin{aligned} \beta + \alpha (c_{H}) < c_{r} \end{aligned}$$ (1)

where \(\alpha\) and \(\beta\) were determined empirically from a regression of TLS derived maximum canopy height with the 95\({\mathrm {th}}\) percentile prediction interval of crown radius (Fig. 3). Prediction interval was chosen as the dependent variable to avoid segmenting larger crowns.

Fig. 3 Local and Borough-wide thresholds for initiating BIRCH as well as the Borough-wide \(B_t\) regression. Crowns that fall within the shaded area were further segmented with BIRCH Full size image

BIRCH is a hierarchical clustering algorithm that has two parameters; maximum radius of a cluster \(B_t\) (if \(c_r > B_t\) the cluster is split) and the total number of clusters \(B_N\). \(B_t\) was calculated in a similar way to the left hand side of Eq. 1 where instead crown radius was the dependent variable in the regression.

$$\begin{aligned} B_t = \beta + \alpha (c_{H}) \end{aligned}$$ (2)

Once BIRCH was initiated, it ran as a loop iteratively dividing c into smaller clusters for which \(B_t\) was recalculated. Division of clusters ceased when \(c_r \ge \beta + \alpha (c_H)\) for all new clusters. For each iteration of BIRCH was run twice; for the first run \(B_N\) was not set allowing BIRCH to return a non-optimal set of clusters constrained only by \(B_t\). For the second run \(B_N\) is set to the number of crowns identified in the first iteration, this producing an optimal segmentation [63].

ALS ITD models were developed using the set of QSM trees from each location (‘local’) and using all QSM trees (‘Borough-wide’). For each model, the functions that were used to split large c and determine \(B_t\) were computed as illustrated in Fig. 3.

Upscaling TLS volume estimates to ALS

Individual tree volume can not be directly measured with low pulse density ALS in a similar way to the TLS methods described above. Instead, ALS derived tree structure metrics are often used to infer volume and AGB. However, regression models computed using a suite of ALS variables can be idiosyncratic and only suitable for the domain in which they were derived [30]. In an urban context, there are a number of different forest types and scenarios which may preclude empirical modelling with multiple parameters. Further, as the aim is to extract and measure individual trees from both TLS and ALS instruments, metrics need to have an analogue for both measurement techniques. Considering these factors, maximum crown height H and projected crown area Ar were used as independent variables in the development of allometric equations [31, 33].

C was computed using the Borough-wide ALS model and exported as polygon vector layer of 2D crown envelopes attributed with Ar and H. Some cleaning was required (\(<3\%\) of polygons) to remove duplicate trees (usually vertically offset) as well as false positives e.g building edges, cranes etc., these were easily identified as having maximum crown heights greater than expected. Polygons with an area < 10 m2 were also removed as the tended to coincide with building edges. TLS derived allometric equations were then applied to estimate V for each polygon. To convert V to AGB, an estimate of mean wood density was derived by mapping the trees in the Camden Council street tree database to a wood density value in the Global Wood Density Database [49]. Trees were first mapped at the species level (\(N=9526\)) and then, if no match was found, at the genus level (\(N=10,973\)); 287 trees could not be matched at either level and were disregarded. A mean wood density of 537 kg m–3 (s.d. 0.08 kg m–3) was used to convert V to AGB.

Allometry uncertainty analysis

A Monte Carlo (MC) approach was used to identify and quantify uncertainties in allometry-derived AGB estimates [65, 66]. MC methods allow for complex and non-linear uncertainty to propagate to estimates of AGB. Estimates of uncertainty are computed by running the model N times where for each iteration the model input parameters are drawn from a probability density function (PDF) that characterises the uncertainty. Individual inputs can be also be isolated by freezing the other inputs, allowing for an estimate of their contribution to overall uncertainty.

Three potential sources of error were identified in the derivation and application of the allometry: (1) QSM estimates of V, (2) ALS-derived H and Ar, and (3) wood density values. Variability in TLS-derived tree structure parameters (H and Ar) were tested by random subsampling of TLS points clouds (\(N=100,\) \(\sigma =0.75\)); RMSE for H was < 0.05 and < 1.8 m for Ar; therefore, TLS-derived structure was not considered in the MC analysis. QSM uncertainty was estimated on a per tree basis using the 10 reconstructions, the mean and standard deviation of V were used to parametrise a Gaussian PDF. A sample of \(c \subset C\) (\(N=250\)) was used to estimate uncertainty in ALS derived crown structure. \(c_P\) were randomly subsampled (\(N=100\), \(\sigma =0.75\)) where H and Ar were calculated for each iteration. The standard deviation of H and Ar were then used to generate PDFs of measurement uncertainty for each extracted crown in C. Finally, a non-parametric PDF of wood density was constructed using wood density values mapped to each tree in the Camden street tree database.

For different scenarios, different sources of uncertainty were considered. When computing TLS AGB, wood density values were set to that of the dominant species, therefore, only QSM uncertainty was considered. When calculating ALS derived AGB at each of the TLS locations wood density was again assumed known and uncertainty in QSM and ALS measurements were computed. When computing AGB estimates for the entire Borough all sources of uncertainty were considered. For all scenarios, 100 MC simulations were run.