Gabor transform
The Gabor transform has rarely been used as a feature in a landscape classification OBIA approach, though it has been used in other object-based processes such as fingerprint enhancement, human iris detection, and data dimensionality reduction24,29,30,31,32,33,34,35. A Gabor filter is a bandpass filter applied to an image to identify texture. The different Gabor bandpass filters mathematically model the visual cortical cells of mammalian brains and are thus expected to improve segmentation and classification accuracy when compared to a human-delineated and classified image26,27.
Samiappan et al.36 compared Gabor filters to other texture features (grey-level co-occurrence matrix, segmentation-based fractal texture analysis, and wavelet texture analysis) within the GEOBIA process of a wetland, using sub-meter resolution multispectral imagery. The Gabor filters produced overall classification accuracies and Kappa coefficients comparable to those of the other texture features, though they were nevertheless outperformed by every other texture feature. That study did not pair the texture features with other data sources, such as spectral bands, NDVI, or LiDAR, so the performance of Gabor filters in combination with such data was not assessed36,37. Wang et al.38 paired a Gabor transformation with a fast Fourier transformation for edge detection on an urban landscape image that contained uniform textures, with promising results. Su30 used the textural attributes derived from Gabor filters for classification but obtained results similar to Samiappan et al.36, finding that Gabor features were among the least useful/influential contributors to the classification of a mostly agricultural landscape.
Gabor filters are a Fourier-influenced wavelet transformation, or bandpass filter, that identifies texture as intervals in a 2-D Gaussian-modulated sinusoidal wave. This modulation differentiates the Gabor transform from the Fourier transform23,26. These Gabor wavelets are parameterized by the angle at which they alter the image and the frequency of the wavelet. Rather than smoothing an image at the cost of losing detail, as Fourier transforms or median filters do, Gabor-transformed images identify repeated patterns of localized pixels and give them similar values if they are part of the same repeated sequence. Gabor features can closely emulate the visual cortex of mammalian brains, which utilizes texture to identify objects26,27. This is based on the evaluation of neurons associated with the visual cortex that respond to different images or light profiles39. Marcelja27 identified that cortical cells respond to signals that are localized frequencies of light, like those represented by the Gabor transformations. Within the frequency domain, the Gabor transform can be defined by Eq. (1):
$$G\left(u,v;f,\theta\right)= e^{-\frac{\pi^{2}}{f^{2}}\left(\gamma^{2}{\left(u^{\prime}-f\right)}^{2}+n^{2}{v^{\prime}}^{2}\right)}$$
(1)
where $f$ is the user-determined frequency (or wavelength); $\theta$ is the user-determined orientation at which the wavelet is applied to the image; and $\gamma$ and $n$ are the standard deviations of the Gaussian function in the two directions23,38. These parameters define the shape of the bandpass filter and determine its effect on one-dimensional signals. Daugman26 created a 2-D application of this filter in Eq. (2):
$$g\left(u,v\right)= e^{-\frac{\pi^{2}}{f^{2}}\left[\gamma^{2}{\left(u^{\prime}-f\right)}^{2}+n^{2}{v^{\prime}}^{2}\right]}$$
(2)
where $u^{\prime} = u\cos\theta - v\sin\theta$ and $v^{\prime} = u\sin\theta + v\cos\theta$.
To implement Gabor filters on multi-band spectral images, we used MATLAB's Gabor filtering functions on the University of Iowa's Neon high-performance computer (HPC)40, which provides up to 512 GB of RAM, a capacity necessary for processing these images. The first implementation of Gabor filters was performed on a 1610 × 687 single-band pixel array (a small subset of the study area), with a filter bank of 4 orientations and 8 wavelengths, on a 32 GB RAM computer, and took approximately 8 h to complete. Filter banks are sets of Gabor filters with different parameters that are applied to the spectral image; they are required to identify textures with different orientations and frequencies. By lowering the number of wavelengths from 8 to 4 on an 8128 × 8128 single-band pixel array on the same 32 GB RAM machine, the processing time was reduced to an hour. Using the HPC, this was further reduced to approximately 90 s with the same filter bank. Before implementation on the HPC, the original spectral image was divided into manageable subsets with overlap in order to prevent 'edge effect.' These images were converted to greyscale by averaging values across all three bands33. When wavelengths become too long, they no longer capture the desired textural information from the image and therefore add unnecessary computing time. The candidate wavelengths for the filter bank were increasing powers of two starting from 2.82842712475 ($4/\sqrt{2}$) up to the pixel length of the hypotenuse of the input image. From these, we used only 2.82842712475, 7.0710678, 17.6776695, and 44.19417382. The directional orientations were selected at 45° intervals from 0° to 180°: 0°, 45°, 90°, and 135°. These parameters were based on the reasoning outlined in Jain and Farrokhnia25. More directional orientations could have been included, but four were used for computational efficiency.
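As a rough illustration of this filtering step (not the study's actual MATLAB code), the sketch below builds a frequency-domain Gabor filter of the form in Eq. (2) and applies a 4-orientation by 4-wavelength filter bank to a greyscale array with NumPy; the image and all function names are hypothetical:

```python
import numpy as np

def gabor_frequency_response(shape, f, theta, gamma=1.0, n=1.0):
    """Frequency-domain Gabor filter of the form in Eq. (2):
    g(u, v) = exp(-(pi^2 / f^2) * (gamma^2 (u' - f)^2 + n^2 v'^2)),
    with (u', v') the frequency coordinates rotated by theta."""
    rows, cols = shape
    U, V = np.meshgrid(np.fft.fftfreq(cols), np.fft.fftfreq(rows))
    u_rot = U * np.cos(theta) - V * np.sin(theta)
    v_rot = U * np.sin(theta) + V * np.cos(theta)
    return np.exp(-(np.pi**2 / f**2) * (gamma**2 * (u_rot - f)**2 + n**2 * v_rot**2))

def gabor_magnitude(image, f, theta):
    """Magnitude response: multiply the image spectrum by the filter."""
    G = gabor_frequency_response(image.shape, f, theta)
    return np.abs(np.fft.ifft2(np.fft.fft2(image) * G))

# Hypothetical filter bank: 4 orientations x 4 wavelengths (frequency = 1/wavelength)
wavelengths = [2.82842712475, 7.0710678, 17.6776695, 44.19417382]
thetas = np.deg2rad([0, 45, 90, 135])
image = np.random.rand(64, 64)  # stand-in for a greyscale image subset
responses = [gabor_magnitude(image, 1.0 / w, t) for w in wavelengths for t in thetas]
```

Each of the 16 responses corresponds to one magnitude-response band of the filter bank.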
The radial frequencies were selected so that they could capture the different textures in the landscape, represented by consistent changes in pixel values within each landcover class. When frequencies are too coarse or too fine, they no longer represent the textures of the different landcover classes and were therefore not included. This selection of filter-bank parameters is similar or identical to that of other studies examining the use of Gabor features for OBIA25,30,31.
From the different combinations of parameters (four orientations and four frequencies) in the Gabor transform filter bank, sixteen magnitude response images were created from the converted greyscale three-band average image. To limit high local variance within the output Gabor texture images, a Gaussian filter was applied. The magnitude response values were normalized across the 16 different bands so that a Principal Component Analysis (PCA) could be applied. The first principal component of the PCA of these Gabor-transformed images was used for this study, since it avoids the computation time of processing 16 separate Gabor features in addition to the other data sources while still retaining most of the information from the different Gabor response features. The Gabor band that was used for this study can be viewed in Fig. 2.
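A minimal sketch of this normalize-then-PCA reduction (NumPy only; the band stack and function name are hypothetical stand-ins for the 16 Gabor magnitude bands):

```python
import numpy as np

def first_principal_component(bands):
    """Collapse a (n_bands, rows, cols) stack to its first principal component image."""
    n_bands, rows, cols = bands.shape
    X = bands.reshape(n_bands, -1).T              # pixels x bands
    X = (X - X.mean(axis=0)) / X.std(axis=0)      # normalize each band
    cov = np.cov(X, rowvar=False)                 # band-to-band covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    pc1 = X @ eigvecs[:, -1]                      # project onto the top component
    return pc1.reshape(rows, cols)

bands = np.random.rand(16, 32, 32)                # stand-in for 16 Gabor bands
pc1_image = first_principal_component(bands)
```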
Segmentation
For this study, we used the watershed algorithm for the segmentation step of GEOBIA, implemented in the ENVI version 5.0 Feature Extraction tool, due to its ubiquitous use within GEOBIA, its ability to create a hierarchy of segmented objects, and its support within the literature as a reliable algorithm37,41,39,43. The watershed algorithm can use either a gradient image or an intensity image for segmentation. Based on the observed results, this study used the intensity method, which averages the value of pixels across bands. Scale, a user-defined parameter, identifies the threshold that decides whether a given intensity value can form a boundary; this allows the user to control the size of the objects created. A secondary user-defined parameter defines how similar adjacent objects need to be before they are merged. The user selects the parameter value based on how well it reduces both under- and over-segmentation. The parameters selected for this study were chosen visually as a compromise between over- and under-segmentation relative to the hand-demarcated objects.
The merging of two separate objects was based on the full lambda schedule, where the user selects a merging threshold $t_{i,j}$, which is defined by Eq. (3):
$$t_{i,j}= \frac{\frac{\left|O_{i}\right|\cdot \left|O_{j}\right|}{\left|O_{i}\right|+\left|O_{j}\right|}\cdot {\Vert u_{i}-u_{j}\Vert }^{2}}{\mathrm{length}\left(\partial\left(O_{i},O_{j}\right)\right)}$$
(3)
where $O_{i}$ is object $i$ of the image, $\left|O_{i}\right|$ is the area of object $i$, $u_{i}$ and $u_{j}$ are the mean pixel values of objects $i$ and $j$, $\Vert u_{i}-u_{j}\Vert$ is the Euclidean distance between the mean pixel values of regions $i$ and $j$, and $\mathrm{length}\left(\partial\left(O_{i},O_{j}\right)\right)$ is the length of the shared boundary of $O_{i}$ and $O_{j}$.
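For concreteness, Eq. (3) reduces to a few lines of arithmetic; the sketch below (hypothetical helper name and example values) computes the merge cost for two objects:

```python
import numpy as np

def merge_cost(area_i, area_j, mean_i, mean_j, shared_boundary_len):
    """Full lambda schedule merge cost t_ij from Eq. (3)."""
    mean_i = np.asarray(mean_i, dtype=float)
    mean_j = np.asarray(mean_j, dtype=float)
    weight = (area_i * area_j) / (area_i + area_j)     # combined-area weight
    dist_sq = float(np.sum((mean_i - mean_j) ** 2))    # squared Euclidean distance
    return weight * dist_sq / shared_boundary_len

# Two hypothetical objects: 100 and 50 pixels, mean intensities 0.8 and 0.5,
# sharing a 20-pixel boundary
t = merge_cost(100, 50, [0.8], [0.5], shared_boundary_len=20)  # ~0.15
```

Pairs whose cost falls below the user-selected threshold are merged first.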
To compare the segmentation of a riparian landscape with and without Gabor features, we conducted segmentation on two separate datasets. One dataset was a normalized stacked layer of NDVI and CHM (see Fig. 3) with the original multispectral image used as ancillary data; the other dataset differed only by the inclusion of the Gabor feature. For both, the bands were converted to an intensity image by averaging across bands rather than being converted into a gradient image for segmentation. The dataset that included the Gabor features had a scale parameter of 30 with merge settings of 95 and 95.7 for the sub- and super-objects, respectively. The dataset without the Gabor features had a scale parameter of 10 with merge settings of 95.6 and 98.5 for the sub- and super-objects, respectively. This resulted in 87,198 sub-object and 62,905 super-object segments when the Gabor feature was included, and 191,050 sub-object and 51,664 super-object segments when it was not. As described in the next section, these segments also represent the training data included in the supervised classification.
To create a hierarchy of land cover classes, two sets of segmentation parameters needed to be selected for each dataset: one for the sub-objects within the hierarchy and one for the super-objects. All parameters used the intensity and full lambda schedule algorithms for the watershed method. The only setting that changed between the sub- and super-objects, for either dataset, was the merge parameter, which helped maintain similar boundaries as much as possible. Despite this, boundaries could change moderately because merging objects alters the Euclidean distance between the mean pixel values of $i$ and $j$, which can push $t_{i,j}$ across the threshold and result in a new boundary being drawn. A representation of these results can be viewed and visually compared to the hand-demarcated objects in Fig. 4.
Training data
The training data used for this study are class attributes transferred from hand-demarcated and classified segments to the automatically segmented objects, based on the majority overlap with the hand-demarcated segments. Experts identified the segments using two different classification schemes referenced from the General Wetland Vegetation Classification System44. The 7-class scheme identified objects as forest, marsh, agriculture, developed, open water, grass/forbs, or sand/mud. The 13-class scheme identified objects as agriculture, developed, grass/forbs, open water, road/levee, sand/mud, scrub-shrub, shallow marsh, submerged aquatic vegetation, upland forest, wet forest, wet meadow, or wet shrub. Not every class from the 7-class scheme has a sub-class (e.g., developed, open water), but some do; for example, wet forest and upland forest are sub-classes of the forest class, and wet meadow and shallow marsh are sub-classes of marsh. Figure 5 visually illustrates both classification schemes across the study area.
ENVI's Feature Extraction tool calculates several landscape, spectral, and textural metrics. These attributes were used for each random forest classifier. The Gabor and hierarchical features were included selectively so that their contributions to the out-of-bag (OOB) classification errors could be compared. When Gabor features are included within the classification, they are computed the same way as the other image bands.
Random forest
The random forest classifier was implemented in R using the randomForest package45. The number of randomly generated trees (n = 250) was large enough for the strong law of large numbers to take effect, as indicated by the diminishing change in accuracy. The default number of variables randomly sampled as candidates at each split (the mtry parameter) was the total number of variables divided by 3 for each dataset. R also generates two separate variable importance indices: mean decrease in accuracy and mean decrease in Gini. Mean decrease in accuracy refers to the change in the random forest's accuracy when the values of a single variable are permuted; it is a practical metric for determining the usefulness of a variable. The Gini index measures the change in purity within a dataset when it is split on a given variable within a decision tree.
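To make the Gini-based importance concrete, the short sketch below (hypothetical example counts) computes the impurity decrease for a single split that separates a 50/50 node into two pure children:

```python
def gini_impurity(class_counts):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    total = sum(class_counts)
    return 1.0 - sum((c / total) ** 2 for c in class_counts)

# A node with 5 samples of each of two classes, split into two pure children
parent = gini_impurity([5, 5])                              # 0.5
left, right = gini_impurity([5, 0]), gini_impurity([0, 5])  # both 0.0
# Children weighted by their share of the samples (half each here)
decrease = parent - 0.5 * left - 0.5 * right                # 0.5, a maximal gain
```

Summing such decreases over all splits on a variable, across all trees, yields its mean decrease in Gini.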
The random forest classification accuracy is based on the OOB error. The random forest algorithm trains numerous decision trees on random bootstrap subsets of the training set, leaving a number of training samples out of each tree. Each left-out sample is then classified by the decision trees in whose training it was not included. The OOB error is the average error of these predictions across the ensemble of decision trees within the random forest algorithm.
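The bootstrap mechanics behind the OOB error can be illustrated numerically: with sampling with replacement, each sample is left out of roughly e⁻¹ ≈ 37% of the trees. A small simulation with hypothetical sizes (NumPy only):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_trees = 1000, 250

# For each tree, draw a bootstrap sample and record which samples are out-of-bag
oob_counts = np.zeros(n_samples, dtype=int)
for _ in range(n_trees):
    in_bag = np.zeros(n_samples, dtype=bool)
    in_bag[rng.integers(0, n_samples, size=n_samples)] = True
    oob_counts += ~in_bag

# Average fraction of trees for which a sample is out-of-bag (close to 1/e)
oob_fraction = oob_counts.mean() / n_trees
```

Because every sample is OOB for a sizeable share of the trees, averaging their held-out predictions gives an internal error estimate without a separate validation set.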
Figure 6 illustrates how the Gabor and hierarchical features were included within the classification of the super- and sub-objects.
Hierarchical scheme
To attribute the hierarchical structure to the sub-objects, we first classified the larger segments that were created with and without the Gabor features using the broader 7-class scheme. These classified super-objects were then converted to raster to calculate the majority overlap with the smaller sub-objects. This gave each sub-object an attribute, its broader 7-class label, that could contribute to the classification of the sub-objects with the finer 13-class scheme, building the hierarchical relationship between the two class schemes into the supervised classification of the sub-objects. Figure 6 illustrates how the hierarchical structure was included within two of the four sub-object feature lists used within classification. This methodological approach aligns with O'Neill et al.'s21 landscape ecology principle that a super-object's class can be a useful property in defining or predicting a sub-object. It also differs from the more common rule-based approach of iteratively classifying the landscape into smaller and smaller sub-classes22.
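The majority-overlap transfer can be sketched as follows (hypothetical rasters and function name): each sub-object receives the super-object class covering most of its pixels.

```python
import numpy as np

def majority_overlap_class(super_class_raster, sub_object_ids):
    """Assign each sub-object the super-object class covering most of its pixels."""
    labels = {}
    for obj_id in np.unique(sub_object_ids):
        classes = super_class_raster[sub_object_ids == obj_id]
        values, counts = np.unique(classes, return_counts=True)
        labels[int(obj_id)] = int(values[np.argmax(counts)])
    return labels

# Toy 2 x 3 rasters: super-object class labels and sub-object ids
super_raster = np.array([[1, 1, 2],
                         [1, 2, 2]])
sub_ids = np.array([[10, 10, 10],
                    [11, 11, 11]])
labels = majority_overlap_class(super_raster, sub_ids)  # {10: 1, 11: 2}
```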
Segmentation assessment
Most studies rely upon the accuracy assessment of their classifiers to support their analysis results. However, this does not provide evidence of whether a new data fusion technique improves the ability to delineate objects of interest within an image. To assess the performance of our segmented polygons, this study evaluated the segments created with and without the Gabor feature using a method highlighted in Xiao et al.37.
Our segmentation results were evaluated using an empirical discrepancy measure, used frequently in image segmentation evaluation37,46,47. Discrepancy measures use ground truth images that represent the "correct" delineated/classified image against which the semi-automated results are compared. In our study, the objects delineated and classified by experts from the U.S. Fish and Wildlife Service were used both as training data for our random forest classifier and as ground truth for the discrepancy measure. The discrepancy measure used was the percentage of right segmented pixels (PR) in the whole image. To calculate PR, we converted the classified segments and ground truth polygons to raster and measured the ratio of correctly labeled pixels to the total number of pixels, expressed as a percentage.
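The PR computation is a straightforward pixel-wise comparison; a minimal sketch on two toy label rasters (hypothetical function name and values):

```python
import numpy as np

def percent_right_segmented(classified, ground_truth):
    """PR: percentage of pixels whose class matches the ground truth raster."""
    classified = np.asarray(classified)
    ground_truth = np.asarray(ground_truth)
    return 100.0 * np.mean(classified == ground_truth)

pred  = np.array([[1, 1, 2],
                  [2, 3, 3]])
truth = np.array([[1, 1, 2],
                  [2, 2, 3]])
pr = percent_right_segmented(pred, truth)  # 5 of 6 pixels agree -> ~83.3%
```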
Additionally, landscape metrics were calculated using FRAGSTATS48, an open-source program commonly used for calculating landscape metrics. FRAGSTATS computed these metrics from thematic raster maps representing the land cover types of interest; the thematic classes used for analysis were the classified objects at both the super- and sub-object levels. Since we are not attempting to compare the segmentation results for any specific class or area, we calculated metrics at the landscape level, representing the segmentation patterns of the entire study area.
FRAGSTATS can calculate various metrics representing different aspects of the landscape. The metrics chosen for analysis characterize object geometry: the average and standard deviation of the area (AREA), the fractal dimension index (FRAC), and the perimeter-area ratio (PARA). The number of patches (NP) was also included in each result. To take a more landscape-centric approach, the area-weighted mean was chosen over a simple average.
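The area-weighted mean simply weights each patch's metric by its share of total landscape area; a small sketch with hypothetical PARA values:

```python
def area_weighted_mean(values, areas):
    """Area-weighted mean of a patch metric (e.g. AREA, FRAC, PARA)."""
    total_area = sum(areas)
    return sum(v * a for v, a in zip(values, areas)) / total_area

# Three hypothetical patches: perimeter-area ratios weighted by patch area
para  = [0.4, 0.2, 0.1]
areas = [10.0, 30.0, 60.0]
awm = area_weighted_mean(para, areas)  # (4 + 6 + 6) / 100 = 0.16
```

Large patches thus dominate the summary, which is the landscape-centric behavior intended here.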
Source: Ecology - nature.com