The PRESEMT Project

Year: 2012
Type of Publication: In Proceedings
Keywords: hybrid machine translation, machine learning, corpus distance measures, comparing corpora
Pages: 27-28
Address: Istanbul, Turkey
Organization: 5th Workshop on Building and Using Comparable Corpora (BUCC2012) [held in conjunction with LREC2012]
Month: May 26
Within the PRESEMT project, we have explored a hybrid approach to machine translation in which a small parallel corpus is used to learn mapping rules between grammatical constructions in the two languages, and large target-language corpora are used for refining translations. We have also taken forward methods for ‘corpus measurement’, including an implemented framework for measuring the distance between any two corpora of the same language. We briefly describe developments in both these areas.