To understand the regulation of the gene expression, the identification of transcription start sites (TSSs) is a primary and important step. With the aim to improve the computational prediction accuracy, we focus on the most challenging task, i.e., to identify the TSSs within 50 bp in non-CpG related promoter regions. Due to the diversity of non-CpG related promoters, a large number of features are extracted. Effective feature selection can minimize the noise, improve the prediction accuracy, and also to discover biologically meaningful intrinsic properties. In this paper, a newly proposed multi-objective simulated annealing based optimization method, Archive Multi-Objective Simulated Annealing (AMOSA), is integrated with Linear Discriminant Analysis (LDA) to yield a combined feature selection and classification system. This system is found to be comparable to, often better than, several existing methods in terms of different quantitative performance measures.
|Original language||English (US)|
|Number of pages||11|
|Journal||Computational systems bioinformatics / Life Sciences Society. Computational Systems Bioinformatics Conference|
|State||Published - 2007|
ASJC Scopus subject areas