PRESEMT

Data

1. Evaluation data sets for development purposes [Link]

These data sets were manually developed based on content drawn from the web. Each set consists of ca. 200 sentences.

Language pairs:

{Czech, German, Greek, Norwegian} to English
{Czech, English, Greek, Norwegian} to German

Evaluation data sets (development) are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

2. Evaluation data sets for test purposes [Link]

These data sets, each of which contains 200 sentences, were manually developed based on content drawn from the web. They were used for the evaluation of the PRESEMT system by human evaluators.

Language pairs:

{Czech, German, Greek, Norwegian} to English
{Czech, English, Greek, Norwegian} to German

Evaluation data sets (testing) are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.

3. Bilingual corpora [txt] [xml]

These corpora were manually developed based on content drawn from the web. Each corpus consists of 200-300 sentences.

Language pairs:

{Czech, German, Greek, Norwegian} to English
{Czech, English, Greek, Norwegian} to German

Bilingual corpora are licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.


Web design, realisation, maintenance and administration by Marina Vassiliou
Logo design and realisation by Zacharias Detorakis
The research leading to these results has received funding from the European Community's Seventh Framework Programme (FP7/2007-2013) under grant agreement No 248307.
