Abstract
Web directories group Web pages into categories and usually organize those categories into hierarchies. Many users rely on them to browse for interesting Web pages in a coarse-to-fine manner. Most Web directories today index monolingual Web pages and provide only a monolingual interface, which may limit the coverage and accessibility of Web pages for users who are familiar only with their native language. Bilingual or multilingual Web directories may relieve these limitations. In this work, we develop an automated process that creates multilingual (specifically, bilingual) Web directories from a set of parallel corpora. We adopt the self-organizing map model to cluster the Web pages and construct a Web directory for each language. A hierarchy alignment process is then applied to these monolingual hierarchies to obtain the relationships between the hierarchies of different languages. A multilingual Web directory is then created from these relationships. We conducted experiments on a set of parallel corpora, and the results suggest that our method is feasible.
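To make the clustering step concrete, the following is a minimal sketch of how a self-organizing map can group pages of one language into map units. It is an illustration under stated assumptions, not the implementation used in this work: the pages are assumed to be already encoded as fixed-length numeric vectors (for example, term weights), and the function names, map size, learning rate, and neighborhood schedule are all hypothetical choices.

```python
import math
import random


def best_matching_unit(weights, v):
    """Return the grid coordinate whose weight vector is closest to v."""
    return min(
        weights,
        key=lambda node: sum((a - b) ** 2 for a, b in zip(weights[node], v)),
    )


def train_som(vectors, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM (assumed settings, for illustration).

    Returns a dict mapping (row, col) to that unit's weight vector.
    """
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                  # linearly decaying learning rate
        sigma = max(sigma0 * (1.0 - frac), 0.5)  # shrinking neighborhood radius
        for v in vectors:
            bmu = best_matching_unit(weights, v)
            for node, w in weights.items():
                grid_d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma * sigma))
                for i in range(dim):
                    # Pull the unit toward v, scaled by grid distance to the BMU.
                    w[i] += lr * h * (v[i] - w[i])
    return weights


def cluster_pages(weights, vectors):
    """Group page indices by the map unit that best matches each page."""
    clusters = {}
    for idx, v in enumerate(vectors):
        clusters.setdefault(best_matching_unit(weights, v), []).append(idx)
    return clusters


# Toy usage: four hypothetical page vectors over a three-term vocabulary.
pages = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(pages, rows=1, cols=2)
clusters = cluster_pages(som, pages)
```

Pages that share a best-matching unit form one candidate category. Running the same procedure once per language would give one set of clusters per language, which is the kind of input a subsequent hierarchy alignment step would need.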