王延章

Personal Information

Professor

Doctoral supervisor

Master's supervisor

Position: Director, National-Local Joint Engineering Research Center for E-Government Simulation (电子政务模拟仿真国家地方联合工程研究中心)

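Gender: Male

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: Institute of Information and Decision Technology

Email: yzwang@dlut.edu.cn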

Publications

An improved random forest-based rule extraction method for breast cancer diagnosis

Paper type: Journal article

Publication date: 2020-01-01

Journal: APPLIED SOFT COMPUTING

Indexed in: EI, SCIE

Volume: 86

ISSN: 1568-4946

Keywords: Breast cancer diagnosis; Rule extraction; Random forest; Interpretability; MOEAs

Abstract: Breast cancer has become a leading cause of death among women worldwide. An accurate and interpretable diagnostic method is needed so that patients can receive effective treatment. Ensemble methods such as Random Forest are now widely applied to breast cancer diagnosis and achieve high accuracy. However, they are black-box methods that cannot explain the reasoning behind a diagnosis. To overcome this limitation, an improved Random Forest (RF)-based rule extraction (IRFRE) method is developed to derive accurate and interpretable classification rules from a decision tree ensemble for breast cancer diagnosis. First, a large number of decision tree models are constructed using Random Forest to generate an abundant pool of decision rules. Then, a rule extraction approach is devised to separate the decision rules from the trained trees. Finally, an improved multi-objective evolutionary algorithm (MOEA) is employed to search for an optimal rule predictor whose constituent rule set offers the best trade-off between accuracy and interpretability. The developed method is evaluated on three breast cancer datasets: the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, the Wisconsin Original Breast Cancer (WOBC) dataset, and the Surveillance, Epidemiology and End Results (SEER) breast cancer dataset. The experimental results demonstrate that the developed method can effectively explain the black-box methods and outperforms several popular single algorithms, ensemble learning methods, and rule extraction methods in terms of accuracy and interpretability. Moreover, the proposed method can be extended to the diagnosis of other cancers in practice, providing an option for a more interpretable and more accurate cancer diagnosis process. © 2019 Elsevier B.V. All rights reserved.
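
The three steps named in the abstract (grow trees, detach their rules, then search for a rule set that balances accuracy against size) can be illustrated with a toy sketch. This is not the IRFRE implementation: the two trees, the feature names, the thresholds, and the four samples are all invented for illustration, and an exhaustive search over rule subsets stands in for the MOEA, which the abstract does not specify in detail.

```python
from itertools import combinations

# Hypothetical toy "forest". Internal node = (feature, threshold, left, right);
# a leaf is a class label. All names and numbers are invented for illustration.
TREES = [
    ("radius", 15.0, "benign", ("texture", 20.0, "benign", "malignant")),
    ("concavity", 0.1, "benign", "malignant"),
]

# Hypothetical labelled samples (not taken from WDBC, WOBC, or SEER).
DATA = [
    ({"radius": 12.0, "texture": 15.0, "concavity": 0.05}, "benign"),
    ({"radius": 18.0, "texture": 25.0, "concavity": 0.20}, "malignant"),
    ({"radius": 16.0, "texture": 18.0, "concavity": 0.08}, "benign"),
    ({"radius": 14.0, "texture": 22.0, "concavity": 0.15}, "malignant"),
]


def extract_rules(node, path=()):
    """Turn every root-to-leaf path into a rule: (conditions, label)."""
    if isinstance(node, str):  # leaf reached
        return [(path, node)]
    feature, threshold, left, right = node
    return (extract_rules(left, path + ((feature, "<=", threshold),))
            + extract_rules(right, path + ((feature, ">", threshold),)))


def matches(conditions, sample):
    """True if the sample satisfies every condition of a rule."""
    return all(sample[f] <= t if op == "<=" else sample[f] > t
               for f, op, t in conditions)


def predict(rule_set, sample, default="benign"):
    """First matching rule wins; otherwise fall back to a default class."""
    for conditions, label in rule_set:
        if matches(conditions, sample):
            return label
    return default


def accuracy(rule_set):
    return sum(predict(rule_set, x) == y for x, y in DATA) / len(DATA)


def complexity(rule_set):
    """Interpretability proxy: total number of conditions in the rule set."""
    return sum(len(conditions) for conditions, _ in rule_set)


def pareto_front(candidates):
    """Keep candidates that no other candidate beats on both objectives."""
    front = []
    for acc, size, rule_set in candidates:
        dominated = any(a >= acc and s <= size and (a > acc or s < size)
                        for a, s, _ in candidates)
        if not dominated:
            front.append((acc, size, rule_set))
    return front


# Step 1 and 2: pool the rules detached from every tree.
rules = [rule for tree in TREES for rule in extract_rules(tree)]

# Step 3 (stand-in for the MOEA): score every non-empty subset of rules.
candidates = [(accuracy(subset), complexity(subset), subset)
              for k in range(1, len(rules) + 1)
              for subset in combinations(rules, k)]
front = pareto_front(candidates)

for acc, size, rule_set in front:
    print(acc, size, rule_set)
```

Exhaustive enumeration is only feasible here because the pool holds five rules; with the rule pool of a real forest the number of subsets grows exponentially, which is presumably why a heuristic search such as an evolutionary algorithm is used instead.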