Query Translation using Wikipedia

« Query Translation using Wikipedia-based resources for analysis and disambiguation »

Benoît Gaillard, Malek Boualem, Olivier Collin

Orange Labs

EAMT 2010


Abstract

This work investigates query translation using only Wikipedia-based resources in a two-step approach: analysis and disambiguation. After arguing that data mined from Wikipedia is particularly relevant to query translation, from both a lexical and a semantic perspective, we detail the implementation of the approach. In the analysis phase, lexical units are extracted from queries and associated with several possible translations using a Wikipedia-based bilingual dictionary. In the disambiguation phase, one translation is chosen among the candidates based on topic homogeneity, assessed with the help of the semantic information carried by the categories of Wikipedia articles. We report promising results regarding translation accuracy.
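
To make the two phases concrete, here is a minimal Python sketch under stated assumptions. It is not the authors' implementation. The dictionary and category tables are hypothetical toy data standing in for resources mined from Wikipedia, the greedy longest-match segmentation is one plausible way to extract lexical units, and counting shared categories between candidate articles is only a simple stand-in for the topic homogeneity measure. The function names `analyse`, `homogeneity` and `translate` are illustrative.

```python
from itertools import product

# Hypothetical toy data (assumption): source-language term -> candidate
# target-language article titles, as a Wikipedia-based dictionary might give.
DICTIONARY = {
    "java": ["Java (island)", "Java (programming language)"],
    "python": ["Python (snake)", "Python (programming language)"],
    "machine virtuelle": ["Virtual machine"],
}

# Hypothetical toy data (assumption): article title -> its categories.
CATEGORIES = {
    "Java (island)": {"Islands", "Indonesia"},
    "Java (programming language)": {"Programming languages", "Computing"},
    "Python (snake)": {"Snakes", "Reptiles"},
    "Python (programming language)": {"Programming languages", "Computing"},
    "Virtual machine": {"Computing", "Virtualization"},
}


def analyse(query, dictionary, max_len=3):
    """Analysis phase: greedy longest-match split into lexical units."""
    words = query.lower().split()
    units = []
    i = 0
    while i < len(words):
        # Try the longest word span first, then shorter ones.
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            if candidate in dictionary:
                units.append(candidate)
                i += n
                break
        else:
            # Unknown word: keep it as its own unit.
            units.append(words[i])
            i += 1
    return units


def homogeneity(titles, categories):
    """Count categories shared by each pair of candidate articles."""
    score = 0
    for a in range(len(titles)):
        for b in range(a + 1, len(titles)):
            cats_a = categories.get(titles[a], set())
            cats_b = categories.get(titles[b], set())
            score += len(cats_a & cats_b)
    return score


def translate(query, dictionary=DICTIONARY, categories=CATEGORIES):
    """Disambiguation phase: keep the most homogeneous combination."""
    units = analyse(query, dictionary)
    # Units missing from the dictionary are passed through untranslated.
    candidate_lists = [dictionary.get(u, [u]) for u in units]
    best = max(product(*candidate_lists),
               key=lambda combo: homogeneity(combo, categories))
    return list(best)


print(translate("java machine virtuelle"))
```

With this toy data, the query "java machine virtuelle" is split into the units "java" and "machine virtuelle", and the programming-language sense of "java" is selected because it shares a category with "Virtual machine". Enumerating every combination is exponential in the number of ambiguous units, which is acceptable for short queries but would need pruning for longer input.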