2017 IEEE/ACIS 16th International Conference on Computer and Information Science (ICIS)

Abstract

Malicious methods for giving a website more popularity than it deserves fall mainly into two types, one of which is link-based spam. Mainstream link-based anti-spam algorithms, including ranking algorithms and spam-detection algorithms, count only the number and quality of a page's links to identify a spam page, and some of them use a whitelist or a blacklist. This paper proposes an improved PageRank algorithm that combines a whitelist and a blacklist and uses two effect factors, cheating similarity and cheating relevance, to compute a cheating tendency value. The cheating tendency value is used to modify the original PageRank value and rerank sites. Experimental results show that the new algorithm achieves better link-based anti-spam performance.
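The reranking step described above can be illustrated with a minimal sketch. The abstract does not give the formulas for cheating similarity, cheating relevance, or how the cheating tendency value modifies PageRank, so everything specific here is an assumption: the function names `pagerank` and `rerank` are hypothetical, the tendency value is taken as a precomputed input in [0, 1], and the multiplicative penalty `(1 - tendency)` is a stand-in rather than the paper's actual rule.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over a dict {page: [outlinked pages]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            outs = links.get(p, [])
            if outs:
                share = damping * rank[p] / len(outs)
                for q in outs:
                    new[q] += share
            else:
                # Dangling page: spread its rank evenly over all pages.
                share = damping * rank[p] / n
                for q in pages:
                    new[q] += share
        rank = new
    return rank


def rerank(rank, tendency):
    """Rerank pages after penalising each score by its cheating tendency.

    ASSUMPTION: `tendency` maps page -> value in [0, 1] (0 = trusted,
    1 = certain spam), and the penalty is multiplicative. The paper's real
    modification rule is not stated in the abstract.
    """
    adjusted = {p: r * (1.0 - tendency.get(p, 0.0)) for p, r in rank.items()}
    return sorted(adjusted, key=adjusted.get, reverse=True)


if __name__ == "__main__":
    # Toy graph: "farm1" and "farm2" link to each other and boost "target".
    links = {
        "home": ["news"],
        "news": ["home"],
        "farm1": ["farm2", "target"],
        "farm2": ["farm1", "target"],
        "target": ["farm1"],
    }
    scores = pagerank(links)
    # Hypothetical tendency values, e.g. derived from a blacklist.
    tendency = {"farm1": 0.9, "farm2": 0.9, "target": 0.8}
    print(rerank(scores, tendency))
```

In this sketch, pages absent from `tendency` keep their original PageRank, which loosely mirrors how whitelisted or unknown pages might be left unpenalised.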
