Abstract
Regions of a genome that are rich in GC nucleotides, often referred to as CpG islands, have been used in methylation analysis and in the prediction of promoter regions. In this study, we propose PSOEO (Particle Swarm Optimization with Extremal Optimization), a method for predicting CpG islands in mammalian genomes. The method adopts the GGF criteria (GC content ≥ 50%, observed/expected (O/E) ratio ≥ 0.6 and length ≥ 200 bp) to search for CpG islands. In the first stage, we used the PSO algorithm to predict CpG islands. In the second stage, we used EO, a heuristic local search method, to explore various output states in order to find a better result. Finally, we used five evaluation criteria, namely sensitivity (SN), specificity (SP), accuracy (ACC), correlation coefficient (CC) and performance coefficient (PC), to compare PSOEO with other methods in the literature. PSOEO provided better SN and CC values for the predicted locations of CpG islands than the methods it was compared to.
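
As an illustration of the three GGF thresholds, the following minimal Python sketch tests whether a candidate sequence window qualifies as a CpG island. It is not the PSOEO implementation. The function name `meets_ggf_criteria` and its parameters are hypothetical, and the O/E ratio is assumed to follow the commonly used definition (number of CpG dinucleotides × window length) / (number of C × number of G), which the abstract itself does not spell out.

```python
def meets_ggf_criteria(seq, min_gc=0.5, min_oe=0.6, min_len=200):
    """Return True if `seq` satisfies all three thresholds.

    Assumption: O/E = (#CpG * N) / (#C * #G), where N is the window length.
    """
    seq = seq.upper()
    n = len(seq)
    if n < min_len:
        return False  # window is shorter than the minimum island length

    c_count = seq.count("C")
    g_count = seq.count("G")
    if c_count == 0 or g_count == 0:
        return False  # O/E is undefined without both C and G

    cpg_count = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c_count + g_count) / n
    oe_ratio = cpg_count * n / (c_count * g_count)

    return gc_content >= min_gc and oe_ratio >= min_oe


if __name__ == "__main__":
    print(meets_ggf_criteria("CG" * 100))             # CpG-dense 200 bp window
    print(meets_ggf_criteria("C" * 100 + "G" * 100))  # GC-rich but almost no CpG
```

In a search such as the one described above, a check of this kind would typically be applied to each candidate window (start position and length) proposed by the optimizer; how PSOEO encodes and scores candidates is not specified in the abstract.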