Abstract
We consider the problem of Semi-supervised Learning (SSL) from general unlabeled data, which may contain irrelevant samples. Within the binary setting, our model manages to better utilize the information from unlabeled data by formulating them as a three-class () mixture, where class represents the irrelevant data. This distinguishes our work from the traditional SSL problem where unlabeled data are assumed to contain relevant samples only, either or , which are forced to be the same