王智慧

Personal Information

Professor

Doctoral Supervisor

Master's Supervisor

Gender: Female

Alma mater: Dalian University of Technology

Degree: Doctorate

Affiliation: School of Software; International School of Information and Software

Discipline: Software Engineering

Office: Room 317, Information Building, Development Zone Campus, Dalian University of Technology

Contact:

Email:

Publications

Weakly Supervised Fine-grained Image Classification via Correlation-guided Discriminative Learning

Paper type: Conference paper

Publication date: 2019-01-01

Indexed in: CPCI-S, EI

Pages: 1851-1860

Keywords: Fine-grained Image Classification; Discriminative Region Grouping; Discriminative Feature Strengthening

Abstract: Weakly supervised fine-grained image classification (WFGIC) aims at learning to recognize hundreds of subcategories in each basic-level category with only image-level labels available. It is extremely challenging, and existing methods mainly focus on localizing discriminative semantic parts or regions, as the key differences among subcategories are subtle and local. However, they localize these regions independently while neglecting the fact that regions are mutually correlated and that region groups can be more discriminative. Meanwhile, most current work tends to derive features directly from the output of a CNN and rarely considers the correlation within the feature vector. To address these issues, we propose an end-to-end Correlation-guided Discriminative Learning (CDL) model to fully mine and exploit the discriminative potential of correlations for WFGIC, both globally and locally. From the global perspective, a discriminative region grouping (DRG) sub-network is proposed, which first establishes correlations between regions and then enhances each region by a weighted aggregation of all the correlations from other regions to it. In this way, each region's representation encodes the global image-level context and is thus more robust; meanwhile, by learning the correlation between discriminative regions, the network is guided to implicitly discover the discriminative region groups, which are more powerful for WFGIC. From the local perspective, a discriminative feature strengthening (DFS) sub-network is proposed to mine and learn the internal spatial correlation among the elements of each patch's feature vector, improving its discriminative power locally by jointly emphasizing informative elements while suppressing useless ones. Extensive experiments demonstrate the effectiveness of the proposed DRG and DFS sub-networks and show that the CDL model achieves state-of-the-art performance in both accuracy and efficiency.
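
The two ideas in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `group_regions` and `strengthen_features`, the dot-product correlation, the softmax weighting, the residual sum, and the sigmoid gating are all assumptions chosen only to show the general shape of "correlation-weighted aggregation across regions" and "emphasizing or suppressing elements within one feature vector".

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def group_regions(regions):
    """Global step (assumed form): enhance each region feature with a
    correlation-weighted sum over all region features.

    For simplicity the sum includes the region itself; the paper may
    define the correlation and the aggregation differently.
    """
    enhanced = []
    for r in regions:
        # Correlation of region r with every region: a plain dot product here.
        weights = softmax([dot(r, other) for other in regions])
        # Image-level context: weighted average of all region features.
        context = [
            sum(w * other[k] for w, other in zip(weights, regions))
            for k in range(len(r))
        ]
        # Residual sum keeps the original feature and adds the context.
        enhanced.append([a + c for a, c in zip(r, context)])
    return enhanced


def strengthen_features(feature, gate_weights):
    """Local step (assumed form): compute one sigmoid gate per element
    from the whole feature vector, then rescale each element by its gate.

    gate_weights is a hypothetical square matrix (list of rows) that a
    real model would learn; here the caller supplies it.
    """
    gates = [1.0 / (1.0 + math.exp(-dot(row, feature))) for row in gate_weights]
    return [g * f for g, f in zip(gates, feature)]
```

Calling `group_regions` on a list of per-region feature vectors returns vectors of the same shape in which each region has absorbed context from the regions it correlates with most. `strengthen_features` leaves the vector length unchanged and only rescales its elements, so gates near 1 emphasize an element and gates near 0 suppress it.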
