2012 IEEE 12th International Conference on Data Mining Workshops

Abstract

In recent years, many biclustering algorithms have been developed for the analysis of gene expression data to complement and expand the capabilities of traditional clustering methods. With biclustering, genes with similar expression profiles can be identified not only over the whole data set but also across subsets of experimental conditions, which allows a gene to belong to several expression patterns simultaneously. This property makes biclustering a powerful approach, especially when it is applied to data with a large number of conditions. Despite this clear theoretical benefit, the full potential of biclustering has not been realized within the gene expression research community, and it has never truly become part of standard gene expression data analysis. Possible reasons include limited awareness of the various complementary ways in which biclustering can be applied to microarray or next-generation sequencing based gene expression data sets, and the lack of reliable and fast algorithms. In this paper, we first illustrate the various opportunities for applying biclustering within a typical gene expression data analysis pipeline. We then present a new biclustering method (BiclusterMiner) that can be applied to all of the presented cases. The developed method is the first discrete biclustering algorithm that is able to handle both up- and down-regulated genes simultaneously by taking the direction of regulation into account while still discovering all possible maximal biclusters. The efficiency of the proposed algorithm is demonstrated on real and synthetic datasets.
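To make the idea of a discrete, direction-aware bicluster concrete, the following is a minimal brute-force sketch. It is not the BiclusterMiner algorithm, whose details are not given in the abstract. It assumes that expression values are discretized into up (+1), down (-1), and unchanged (0) with a hypothetical threshold, and that a bicluster is a set of genes sharing the same signed, nonzero pattern over a set of conditions. The function names `discretize` and `maximal_biclusters`, the threshold, and the toy matrix are all illustrative assumptions.

```python
from itertools import combinations

def discretize(matrix, threshold=1.0):
    """Map each value to +1 (up), -1 (down), or 0 (unchanged).
    The threshold is an illustrative assumption, not taken from the paper."""
    return [[1 if v >= threshold else -1 if v <= -threshold else 0 for v in row]
            for row in matrix]

def maximal_biclusters(disc, min_genes=2, min_conds=2):
    """Exhaustively list maximal biclusters of a discretized matrix.
    Exponential in the number of conditions, so suitable for toy data only."""
    n_conds = len(disc[0]) if disc else 0
    found = []
    for k in range(min_conds, n_conds + 1):
        for conds in combinations(range(n_conds), k):
            # Group genes by their signed pattern over the chosen conditions,
            # skipping genes that are unchanged in any of them.
            groups = {}
            for g, row in enumerate(disc):
                pattern = tuple(row[c] for c in conds)
                if 0 not in pattern:
                    groups.setdefault(pattern, []).append(g)
            for pattern, genes in groups.items():
                if len(genes) < min_genes:
                    continue
                # Not maximal if some further condition has the same nonzero
                # value for every gene in the group.
                first = genes[0]
                extendable = any(
                    c not in conds
                    and disc[first][c] != 0
                    and all(disc[g][c] == disc[first][c] for g in genes)
                    for c in range(n_conds)
                )
                if not extendable:
                    found.append((tuple(genes), conds, pattern))
    return found

# Toy log-ratio matrix: rows are genes, columns are conditions (made-up values).
expr = [
    [ 2.0, -1.5, 0.1, 1.2],
    [ 1.8, -2.0, 1.4, 0.2],
    [ 0.0, -1.1, 1.3, 0.3],
    [-1.6,  1.9, 0.2, 1.5],
]
result = maximal_biclusters(discretize(expr))
```

Each entry of `result` is a tuple of gene indices, condition indices, and the shared signed pattern. In this toy matrix, gene 1 appears in two biclusters, which illustrates how a gene can belong to several expression patterns at once.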