
Audio-Visual Speaker Tracking Based on Dynamic Bayesian Network

Date of Publication:2008-01-01

Journal: Acta Automatica Sinica (自动化学报)

Affiliation of Author(s): School of Software

Issue:9

Page Number:1083-1089

ISSN No.:0254-4156

Abstract: Multi-sensor data fusion is applied to the speaker tracking problem, and a novel audio-visual speaker tracking approach based on a dynamic Bayesian network is proposed. Exploiting the complementarity and redundancy between a speaker's speech and image, three perception methods are employed to acquire tracking cues: sound source localization based on a microphone array, face detection based on skin-color information, and audio-visual synchrony detection based on mutual information maximization. Within the dynamic Bayesian network framework, particle filtering is used to fuse these cues, and perception management based on information entropy is applied to improve tracking efficiency. Experiments on real-world data demonstrate that the proposed method tracks the speaker robustly even under perturbing factors such as high room reverberation and video occlusions.
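The fusion step described in the abstract can be illustrated with a minimal sketch: a bootstrap particle filter that fuses two noisy per-frame position cues (an "audio" cue and a "visual" cue) and tolerates missing observations, e.g. during a video occlusion. This is a simplified 1-D illustration under assumed Gaussian noise models, not the paper's actual implementation (which operates in a dynamic Bayesian network with microphone-array localization, skin-color face detection, mutual-information synchrony cues, and entropy-based perception management); all function and variable names here are hypothetical.

```python
import math
import random

def particle_filter_fusion(audio_obs, visual_obs, n_particles=500,
                           audio_std=0.3, visual_std=0.1,
                           process_std=0.1, seed=0):
    """Fuse per-frame audio and visual position cues (1-D toy model)
    with a bootstrap particle filter. None marks a missing cue, e.g.
    a silent frame or a video occlusion."""
    rng = random.Random(seed)
    # Initialize particles around the first available observation.
    init = next(o for o in (audio_obs[0], visual_obs[0]) if o is not None)
    particles = [init + rng.gauss(0, visual_std) for _ in range(n_particles)]
    estimates = []
    for a, v in zip(audio_obs, visual_obs):
        # Predict: random-walk motion model.
        particles = [p + rng.gauss(0, process_std) for p in particles]
        # Update: weight each particle by the likelihood of every
        # available cue; a missing cue contributes no factor.
        weights = []
        for p in particles:
            w = 1.0
            if a is not None:
                w *= math.exp(-0.5 * ((p - a) / audio_std) ** 2)
            if v is not None:
                w *= math.exp(-0.5 * ((p - v) / visual_std) ** 2)
            weights.append(w)
        total = sum(weights) or 1.0
        weights = [w / total for w in weights]
        # Estimate: posterior mean of the particle set.
        estimates.append(sum(p * w for p, w in zip(particles, weights)))
        # Systematic resampling to avoid weight degeneracy.
        cum, c = [], 0.0
        for w in weights:
            c += w
            cum.append(c)
        step = 1.0 / n_particles
        u = rng.random() * step
        new, i = [], 0
        for k in range(n_particles):
            while i < n_particles - 1 and cum[i] < u + k * step:
                i += 1
            new.append(particles[i])
        particles = new
    return estimates

# Toy sequence: the speaker drifts from 0.0 to 0.5; the visual cue
# drops out mid-sequence (occlusion), leaving only the noisier audio cue.
audio = [0.05, 0.12, 0.22, 0.33, 0.48]
visual = [0.00, 0.10, None, None, 0.50]
track = particle_filter_fusion(audio, visual)
```

Because the two cues carry independent noise, the fused estimate keeps following the speaker through the occlusion using audio alone, then snaps back toward the more precise visual cue when it returns — the complementarity/redundancy argument the abstract makes.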

Note: Retrospectively added record
