Machine Learning Strategies for Enhancing Bathymetry Extraction from Imbalanced Lidar Point Clouds
Title | Machine Learning Strategies for Enhancing Bathymetry Extraction from Imbalanced Lidar Point Clouds |
Publication Type | Conference Proceedings |
Year | 2019 |
Authors | Lowell, K, Calder, BR |
Conference Name | Oceans 2019 |
Conference Dates | Oct 27-31 |
Publisher | IEEE |
Conference Location | Seattle, WA |
Keywords | bathymetric lidar, confusion matrix decomposition, Extreme Gradient Boosting, imbalanced samples, probability decision threshold |
Abstract—Density-based approaches to extract bathymetry from airborne lidar point clouds generally rely on histogram/frequency-based disambiguation rules to separate noise from signal. The present work targets the improvement of such disambiguation rules by enhancing each pulse with a machine learning-based estimate of its p(Bathy) – i.e., its probability of truly being bathymetry. Extreme gradient boosting (XGB) is used to assess the strength of bathymetric signal in pulse return metadata. Because lidar point clouds can be highly imbalanced between Bathymetry and NotBathymetry, three strategies for mitigating the effects of imbalanced samples were examined. Impacts of an imbalanced lidar point cloud were successfully mitigated by:
However, decomposing a confusion matrix by iteratively discarding misclassified points and re-fitting an XGB model was not successful in improving the strength or detectability of the bathymetric signal in the metadata. The same was true for iteratively discarding correctly classified points. The bathymetric signal in the metadata was found to be sufficiently strong to explore the operational incorporation of results into the disambiguation rules of density-based bathymetric extraction methods. |
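The keywords list a "probability decision threshold" among the topics covered. As a rough illustration only, and not the authors' procedure, the sketch below shows how a threshold on p(Bathy) might be tuned when Bathymetry is the minority class. The toy scores stand in for classifier output (for example from an XGB model), and the choice of balanced accuracy as the selection metric, the candidate grid, and all function names are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def confusion_counts(
    probs: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn), labelling a pulse Bathy when p(Bathy) >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def balanced_accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Mean of the true-positive and true-negative rates (assumed metric)."""
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    tnr = tn / (tn + fp) if (tn + fp) else 0.0
    return 0.5 * (tpr + tnr)


def best_threshold(
    probs: Sequence[float], labels: Sequence[int], candidates: Sequence[float]
) -> Tuple[float, float]:
    """Scan candidate thresholds and keep the first one with the highest score."""
    best_t, best_score = candidates[0], -1.0
    for t in candidates:
        score = balanced_accuracy(*confusion_counts(probs, labels, t))
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score


# Hypothetical imbalanced sample: 2 Bathy pulses (label 1), 8 NotBathy (label 0).
# The scores are made up; in practice they would come from a fitted classifier.
probs: List[float] = [0.42, 0.37, 0.04, 0.09, 0.11, 0.16, 0.19, 0.22, 0.26, 0.31]
labels: List[int] = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Candidate thresholds 0.05, 0.10, ..., 0.95.
candidates = [i / 20 for i in range(1, 20)]
best_t, best_score = best_threshold(probs, labels, candidates)

# With the conventional 0.5 cut-off, every pulse in this toy sample is
# labelled NotBathy, which is the failure mode threshold tuning addresses.
default_counts = confusion_counts(probs, labels, 0.5)
```

In this toy sample the minority-class scores all fall below 0.5, so the default cut-off never predicts Bathy, while a lower threshold separates the two classes. Any threshold chosen this way would need to be validated on held-out data.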