Application of random forest algorithm based on feature optimization in wetland information extraction: a case study of Honghu Wetland Nature Reserve in Hubei Province
XIA Ying1,2, LI Enhua1, WANG Xuelei1, ZHANG Yingying3, YANG Jiao1,2, ZHOU Rui1,2
(1. Key Laboratory for Environment and Disaster Monitoring and Evaluation of Hubei Province, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences, Wuhan 430077, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China; 3. School of Geography and Tourism, Zhengzhou Normal University, Zhengzhou 450044, China)
Abstract: Taking the Honghu Wetland Nature Reserve as the study area, the random forest algorithm was used to extract wetland information. With Sentinel-2A remote sensing imagery as the data source, several kinds of feature variables were extracted, including spectral features, vegetation indices, water indices, red-edge indices and texture features. Within the random forest framework, the importance of the feature variables was evaluated, and the feature selection was optimized by comparing classification accuracy. The classification accuracy of random forest wetland information extraction based on feature optimization was compared with that of the traditional pixel-based support vector machine and maximum likelihood classifiers. The proportions of correctly classified pixels for each algorithm were compared using the two-proportion Z test to assess the differences between the classification algorithms. The results are as follows. 1) The classification accuracy reaches its maximum when the number of features is 13. As the number of features increases further, the running time of the model increases and the classification accuracy tends to decrease. 2) The blue band has the highest importance score (2.85), and the visible bands (B2, B3) and red-edge indices (IRECI, MCARI) rank among the top five features, indicating that they are of great significance for accurately extracting Honghu wetland information. 3) The classification accuracy of the random forest method based on feature optimization is better than that of the support vector machine and maximum likelihood methods, with an overall accuracy 6.02% and 7.57% higher, respectively. The test of classification accuracy differences gives χ2 values of 25.891 and 38.895, respectively, indicating significant differences.
Random forest classification based on feature optimization plays an important role in the intelligent extraction of wetland information.
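The accuracy comparison described in the abstract can be illustrated with a minimal sketch of a pooled two-proportion Z test. The function name `two_proportion_z_test` and the pixel counts in the usage lines are hypothetical and are not taken from the paper; the sketch also assumes the two validation samples can be treated as independent. For a comparison of two proportions, the square of the Z statistic equals the Pearson χ2 statistic (1 degree of freedom, no continuity correction), which may be how the χ2 values above relate to the Z test, although the abstract does not state this.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Pooled two-proportion Z test on the share of correctly classified pixels.

    Returns (z, chi2, p_value), where chi2 = z**2 and p_value is two-sided.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    # Pooled proportion under the null hypothesis of equal accuracy
    pooled = (correct_a + correct_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("standard error is zero; the test is undefined")
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, z * z, p_value


# Hypothetical counts for illustration only (not the paper's data):
# classifier A labels 1840 of 2000 validation pixels correctly, classifier B 1720.
z, chi2, p_value = two_proportion_z_test(1840, 2000, 1720, 2000)
print(f"z = {z:.3f}, chi2 = {chi2:.3f}, p = {p_value:.3g}")
```

A small p-value here would indicate that the difference in the proportion of correctly classified pixels is unlikely to be due to sampling variation alone.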
XIA Ying, LI Enhua, WANG Xuelei, ZHANG Yingying, YANG Jiao, ZHOU Rui. Application of random forest algorithm based on feature optimization in wetland information extraction: a case study of Honghu Wetland Nature Reserve in Hubei Province[J]. Journal of Central China Normal University (Natural Sciences), 2021, 55(4): 639-648.