BootCatting Comparable Corpora

Year: 2011
Type of Publication: In Proceedings
Keywords: web corpora
Book title: Proceedings of the 9th International Conference on Terminology and Artificial Intelligence
Pages: 123-126
Address: Paris, France
Organization: Multilingual Engineering Research Centre CRIM/ERTIM (EA2520) of INALCO, Institut National des Langues et Civilisations Orientales
Month: November 8-10
The BootCaT method (Baroni and Bernardini, 2004) has proved a fast, effective and versatile approach to corpus building. The method has been applied to small specialist corpora for finding terminology and translations (as originally envisaged by Baroni and Bernardini), and to large, general corpora, for large numbers of languages. To date it has not been applied multilingually. This is our topic. We describe an implemented tool, Comparable Corpora BootCaT, and a pilot evaluation.
Full text: KilgarriffEtAl_TIA2011.pdf