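Personal Information
Associate Professor
Doctoral Supervisor
Master's Supervisor
Gender: Female
Alma Mater: Dalian University of Technology
Degree: Doctorate
Affiliation: School of Mathematical Sciences
Discipline: Computational Mathematics
Office: Room 505, School of Mathematical Sciences, Dalian University of Technology
Contact: 0411-84708351-8205
Email: yangjiee@dlut.edu.cn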
An Algorithm for Motif Discovery with Iteration on Lengths of Motifs
Paper Type: Journal Article
Publication Date: 2015-01-01
Journal: IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
Indexed In: SCIE, EI, Scopus
Volume: 12
Issue: 1
Pages: 136-141
ISSN: 1545-5963
Keywords: Motif discovery; motif's length; DNA sequences
Abstract: Analysis of DNA sequence motifs is becoming increasingly important in the study of gene regulation, and the identification of motifs in DNA sequences is a complex problem in computational biology. Motif discovery has attracted the attention of a growing number of researchers, and a variety of algorithms have been proposed. Most existing motif discovery algorithms take the motif's length as a fixed input parameter. In this paper, a novel method is proposed to identify the optimal motif length and the optimal motif of that length through an iteration over increasing lengths. For each fixed length, a modified genetic algorithm (GA) is used to find the optimal motif of that length. The modified GA uses three operators: Mutation, which is similar to the one in a standard GA but is modified to avoid local optima in our case, and Addition and Deletion, which we propose for this problem. A criterion is given for singling out the optimal length among the increasing motif lengths. We call this method AMDILM (an algorithm for motif discovery with iteration on lengths of motifs). Experiments on simulated data and real biological data show that AMDILM can accurately identify the optimal motif length. Moreover, the optimal motifs discovered by AMDILM are consistent with the real ones and are similar to the motifs obtained by three well-known methods: Gibbs Sampler, MEME and Weeder.
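The overall shape of the approach in the abstract (an outer loop over increasing motif lengths, with an evolutionary search at each fixed length) can be illustrated with a minimal Python sketch. This is not the published AMDILM implementation. The fitness function, the mutation-only search, the population settings, and the threshold rule in `pick_length` are all assumptions made for illustration; the paper's Addition and Deletion operators and its actual length criterion are not described in the abstract and are not reproduced here.

```python
import random

ALPHABET = "ACGT"


def best_match_distance(motif, seq):
    """Smallest Hamming distance between motif and any window of seq.

    Assumes len(seq) >= len(motif).
    """
    k = len(motif)
    return min(
        sum(a != b for a, b in zip(motif, seq[i:i + k]))
        for i in range(len(seq) - k + 1)
    )


def fitness(motif, seqs):
    """Hypothetical fitness: matched positions summed over all sequences."""
    return sum(len(motif) - best_match_distance(motif, s) for s in seqs)


def mutate(motif, rng):
    """Replace one randomly chosen position with a different base."""
    pos = rng.randrange(len(motif))
    base = rng.choice([b for b in ALPHABET if b != motif[pos]])
    return motif[:pos] + base + motif[pos + 1:]


def search_fixed_length(seqs, k, rng, pop_size=30, generations=60):
    """Mutation-only evolutionary search for a motif of length k.

    Stand-in for the modified GA; Addition and Deletion are omitted.
    """
    population = [
        "".join(rng.choice(ALPHABET) for _ in range(k))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        children = [mutate(m, rng) for m in population]
        ranked = sorted(population + children,
                        key=lambda m: fitness(m, seqs), reverse=True)
        population = ranked[:pop_size]  # keep the fittest candidates
    return population[0]


def iterate_lengths(seqs, min_len, max_len, seed=0):
    """Run the fixed-length search for each length in increasing order.

    Returns {length: (best_motif, fraction_of_positions_matched)}.
    """
    rng = random.Random(seed)
    results = {}
    for k in range(min_len, max_len + 1):
        best = search_fixed_length(seqs, k, rng)
        results[k] = (best, fitness(best, seqs) / (k * len(seqs)))
    return results


def pick_length(results, threshold=0.9):
    """Assumed criterion: longest length whose match fraction >= threshold."""
    ok = [k for k, (_, score) in results.items() if score >= threshold]
    return max(ok) if ok else None
```

A call such as `pick_length(iterate_lengths(seqs, 4, 12))` would then return a candidate length under these assumptions. The normalized score is used so that lengths can be compared on the same scale; a real criterion would need to account for the fact that short motifs match well by chance.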