Abstract
As the number of Web users grows exponentially, discovering and analyzing the valuable information hidden in usage data becomes increasingly important for offering personalized information access. Because traditional methods cannot effectively mine semi-structured and/or unstructured data on a single platform, in this paper we propose three methods, implemented with MapReduce on a Hadoop cluster, for mining user browsing preferences, visiting frequency, and participation characteristics, respectively. We apply our methods to Web server logs and developer mailing lists, and analyze visualizations of the mining results to gain a deeper understanding of user access patterns and interactive behaviors. The experimental results show that our methods extract useful information from usage data to support decision making, with good speedup and scalability.