1. Software School, East China Jiaotong University, Nanchang 330013, China; 2. School of Information Engineering, East China Jiaotong University, Nanchang 330013, China; 3. School of Cyber Science and Engineering, Wuhan University, Wuhan 430072, China
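Abstract (translated from the Chinese original): To address insufficient medical resources and a steadily growing number of consultations, a computer-based breast cancer image recognition model is needed to assist pathologists in clinical diagnosis more efficiently. However, most existing algorithms rely on a single category of features and do not fully exploit the complementarity among features. This paper proposes an improved adaptive boosting algorithm: building on SIFT, Gist, HOG, and VGG16 feature extraction, the Effective Range Based Gene Selection (ERGS) algorithm is modified to compute feature weights dynamically; adaptive boosting then integrates weak classifiers into a strong classifier, and the estimated probabilities it outputs are weighted by ERGS to achieve multi-feature fusion. Experiments show that: 1) the algorithm reaches a recognition accuracy of 86.24%, which is 3.82% higher than the strongest baseline; 2) the SIFT, Gist, and HOG features are strongly complementary and help characterize breast cancer images accurately; 3) positive images are easier to recognize.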
Abstract: To cope with insufficient medical resources and a growing number of consultations, a computer-aided breast cancer image recognition model is needed to help pathologists carry out clinical diagnosis more efficiently. However, existing works use only a single feature type for recognition and therefore do not exploit the implicit complementarity among different features. To address this problem, a novel modified Adaboost algorithm is proposed. After several image features (SIFT, Gist, HOG, and VGG16) are extracted, the traditional effective range based gene selection (ERGS) algorithm is modified to dynamically calculate the ERGS weight of each feature. The Adaboost algorithm is then used to build a strong classifier, and its estimated probabilities are weighted by the corresponding ERGS weights to complete multi-feature fusion. The experimental results show the following. 1) The accuracy of the proposed algorithm is about 86.24%, which is 3.82% higher than the most competitive baseline. 2) There is strong complementarity among the SIFT, Gist, and HOG features, which helps describe breast cancer images more accurately. 3) Positive images are easier for the proposed algorithm to recognize.
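The fusion step described in the abstract, in which per-feature estimated probabilities are combined using per-feature weights, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `fuse_probabilities` and `predict_label`, the normalization of the weights to sum to one, and the example numbers are all assumptions made for illustration. The ERGS weights themselves are taken as given inputs, since the abstract does not state how the modified ERGS computes them.

```python
from typing import Dict, List


def fuse_probabilities(
    probs: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted late fusion of per-feature class probabilities.

    probs   maps a feature name (e.g. "SIFT") to the class-probability
            vector estimated by the classifier trained on that feature.
    weights maps the same feature names to non-negative weights
            (hypothetical stand-ins for the ERGS weights).
    """
    total = sum(weights[name] for name in probs)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_classes = len(next(iter(probs.values())))
    fused = [0.0] * n_classes
    for name, p in probs.items():
        if len(p) != n_classes:
            raise ValueError(f"probability vector for {name!r} has wrong length")
        w = weights[name] / total  # assumption: weights are normalized to sum to 1
        for k in range(n_classes):
            fused[k] += w * p[k]
    return fused


def predict_label(fused: List[float]) -> int:
    """Return the index of the class with the highest fused probability."""
    return max(range(len(fused)), key=fused.__getitem__)


# Illustrative numbers only; index 0 = negative, index 1 = positive.
example_probs = {"SIFT": [0.4, 0.6], "HOG": [0.7, 0.3]}
example_weights = {"SIFT": 3.0, "HOG": 1.0}
example_fused = fuse_probabilities(example_probs, example_weights)
example_label = predict_label(example_fused)
```

In this toy case the more heavily weighted SIFT-based classifier outvotes the HOG-based one, which is the intended effect of weighting: features judged more discriminative contribute more to the final decision.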