Abstract
In this study, we propose a method that uses a Web search engine to extract inaccurate example sentences from multilingual parallel texts. We developed a multilingual parallel-text sharing system named Tack Pad to support multilingual communication in the medical field. However, parallel texts created by people can be inaccurate, so they cannot be used as they are in fields that require a high level of accuracy. Moreover, manually checking all of the parallel texts is impractical because there are so many of them. Therefore, we propose and evaluate a method for extracting inaccurate example sentences. The method treats content on the Web as a form of the wisdom of crowds: it splits an example sentence into n-grams and submits the resulting word sequences to a Web search engine. It also uses two thresholds to detect several kinds of mistakes, such as typographical and grammatical errors. The contributions of this paper are as follows: (1) we propose an extraction method that uses a Web search engine to improve the accuracy of example sentences, and (2) we show that using two thresholds improves the accuracy of the example sentences.
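The idea of splitting a sentence into n-grams and checking them against a Web search engine can be illustrated with a minimal sketch. The abstract does not state what the two thresholds are, so the sketch assumes one possible reading: a minimum hit count per n-gram (`min_hits`) and a maximum tolerated ratio of rare n-grams per sentence (`max_rare_ratio`). These names, their default values, the function names, and the stand-in `fake_hit_count` lookup (used here in place of a real search engine query) are all hypothetical and are not taken from the proposed method.

```python
from typing import Callable, Dict, List


def split_ngrams(sentence: str, n: int) -> List[str]:
    """Split a whitespace-tokenized sentence into word n-grams."""
    words = sentence.split()
    if not words:
        return []
    if len(words) < n:
        # Sentence shorter than n words: treat it as a single unit.
        return [" ".join(words)]
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def is_suspect(
    sentence: str,
    hit_count: Callable[[str], int],
    n: int = 3,
    min_hits: int = 10,           # assumed threshold 1: hits needed per n-gram
    max_rare_ratio: float = 0.2,  # assumed threshold 2: tolerated share of rare n-grams
) -> bool:
    """Flag a sentence when too many of its n-grams are rare on the Web."""
    grams = split_ngrams(sentence, n)
    if not grams:
        return False
    rare = [g for g in grams if hit_count(g) < min_hits]
    return len(rare) / len(grams) > max_rare_ratio


# Hypothetical hit counts standing in for a real Web search engine.
FAKE_HITS: Dict[str, int] = {
    "please take this": 5000,
    "take this medicine": 4200,
}


def fake_hit_count(phrase: str) -> int:
    """Return the stored hit count, or 0 for an unseen phrase."""
    return FAKE_HITS.get(phrase, 0)


if __name__ == "__main__":
    for s in ("please take this medicine", "please take thiss medicine"):
        print(s, "->", "suspect" if is_suspect(s, fake_hit_count) else "ok")
```

Passing the hit-count lookup in as a function keeps the sketch independent of any particular search engine; a real lookup would replace `fake_hit_count`, and both thresholds would need to be tuned on evaluated data.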