Abstract
Because tumor tissue is highly heterogeneous, methylation profiles of tumor samples obtained in clinical experiments are mixtures of signals from different cellular components, including cancer, normal, and stromal cells. Among these, the admixture of normal cells is regarded as a major confounding factor in many downstream analyses. Decomposing the mixed signal into the profiles of its constituent components is therefore vital for accurate differential methylation calling and patient grouping. However, methods for purifying tumor methylomes are still lacking, even when a reliable estimate of tumor purity is available. In this work, we present ResDec, a residual-decomposition linear regression model for tumor methylome purification. We systematically evaluated its performance against existing methods on both simulated data and TCGA methylation samples. ResDec achieves consistently better performance across scenarios, including different numbers of matched normal samples and perturbations of the input tumor purities and matched normal methylomes.
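As a rough illustration of the purification problem described above, the sketch below assumes the simplest two-component linear mixing model, in which an observed beta value is a purity-weighted average of a pure-tumor value and a normal value. This is an illustrative assumption only and is not the ResDec model; the function names `mix` and `naive_purify` and the numeric values are hypothetical.

```python
def mix(tumor_beta, normal_beta, purity):
    """Simulate an observed beta value under an ASSUMED two-component linear mixture."""
    return purity * tumor_beta + (1.0 - purity) * normal_beta


def naive_purify(observed_beta, normal_beta, purity):
    """Invert the assumed mixture for one CpG site (a baseline, not ResDec)."""
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    estimate = (observed_beta - (1.0 - purity) * normal_beta) / purity
    # Beta values are proportions, so clip the estimate to [0, 1].
    return min(1.0, max(0.0, estimate))


# Hypothetical single-site example: 60% tumor purity.
observed = mix(tumor_beta=0.9, normal_beta=0.2, purity=0.6)
recovered = naive_purify(observed, normal_beta=0.2, purity=0.6)
```

Dividing by purity amplifies measurement noise when purity is low, so this per-site inversion is useful only as a baseline for intuition.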