Results
Task 1: Recognition and normalization of temporal expressions
Best system: Alium_01
Prize winner: none (no submission released its source code)
| System name | F1 | P | R |
|---|---|---|---|
| Alium_01 (Strict Match) | 58.81 | 58.91 | 58.72 |
| Alium_01 (Relaxed Match) | 86.49 | 86.63 | 86.35 |
Attribute F1
| System name | Value F1 | Type F1 |
|---|---|---|
| Alium_01 (Strict Match) | 68.70 | 80.23 |
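Strict match requires a predicted temporal expression to coincide exactly with the gold span, whereas relaxed match credits any overlap between a predicted and a gold span. The sketch below shows how span-level precision, recall and F1 can be computed under the two regimes; it is illustrative only (function and variable names are ours), not the official scorer.

```python
def span_prf(gold, pred, relaxed=False):
    """Span-level precision/recall/F1 for temporal expression recognition.

    gold, pred: lists of (start, end) character offsets.
    relaxed=False -> strict match (spans must be identical);
    relaxed=True  -> relaxed match (any overlap counts).
    Illustrative sketch only, not the official PolEval scorer.
    """
    def overlaps(a, b):
        return a[0] < b[1] and b[0] < a[1]

    if relaxed:
        matched_pred = sum(any(overlaps(p, g) for g in gold) for p in pred)
        matched_gold = sum(any(overlaps(g, p) for p in pred) for g in gold)
    else:
        matched_pred = sum(p in gold for p in pred)
        matched_gold = sum(g in pred for g in gold)

    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return 100 * precision, 100 * recall, 100 * f1
```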
Task 2: Lemmatization of proper names and multi-word phrases
Best system: bronk
Prize winner: out_second_model_new1
| System name | AccCS | AccCI | Score |
|---|---|---|---|
| bronk | 84.78 | 88.13 | 87.46 |
| out_second_model_new1 | 72.46 | 75.46 | 74.86 |
| out_second_model3 | 68.85 | 71.71 | 71.14 |
| System name | AccCS | AccCI | Score |
|---|---|---|---|
| zbronk.nlp.studio | 95.11 | 95.72 | 95.60 |
| PolEval2019-lemmatization-out-3 | 58.95 | 61.14 | 60.70 |
| PolEval2019-lemmatization-out-2 | 56.42 | 58.25 | 57.89 |
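AccCS and AccCI denote case-sensitive and case-insensitive accuracy of the returned lemmas; the published Score values are consistent with the weighted average 0.2 · AccCS + 0.8 · AccCI (e.g. 0.2 · 84.78 + 0.8 · 88.13 = 87.46). A minimal sketch of that computation, assuming gold and predicted lemmas are given as parallel lists of strings (not the official evaluation script):

```python
def lemmatization_scores(gold, pred):
    """Case-sensitive / case-insensitive accuracy for lemmatization.

    gold, pred: parallel lists of lemma strings, one per input phrase.
    The combined score uses the 0.2/0.8 weighting inferred from the
    published tables; this is not the official evaluation script.
    """
    n = len(gold)
    acc_cs = 100.0 * sum(g == p for g, p in zip(gold, pred)) / n
    acc_ci = 100.0 * sum(g.lower() == p.lower() for g, p in zip(gold, pred)) / n
    score = 0.2 * acc_cs + 0.8 * acc_ci
    return acc_cs, acc_ci, score
```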
Task 3: Entity linking
Best system: zbronk.nlp.studio
Prize winner: Cheeky Mouse
| System name | Precision |
|---|---|
| zbronk.nlp.studio | 91.9 (withdrawn) |
| model-1 | 77.2 |
| Cheeky Mouse | 26.7 |
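Only precision of the submitted links is reported for this task. A minimal sketch of that figure, assuming system output and gold standard are given as sets of (mention, knowledge-base id) pairs (an assumption on our part; the official Task 3 scorer may match mentions differently):

```python
def linking_precision(gold_links, pred_links):
    """Entity-linking precision: share of predicted links that are correct.

    gold_links, pred_links: sets of (mention_id, kb_id) pairs.
    Illustrative only; the exact matching protocol of the task may differ.
    """
    if not pred_links:
        return 0.0
    return 100.0 * len(pred_links & gold_links) / len(pred_links)
```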
Task 4: Machine translation
Best system: SRPOL
Prize winner: DeepIf
EN-PL
| System name | BLEU | NIST | TER | METEOR |
|---|---|---|---|---|
| SRPOL | 28.23 | 6.60 | 62.13 | 47.53 |
| Google Translate | 16.83 | | | |
| ModernMT | 16.29 | | | |
| ModernMT (in-domain) | 14.42 | | | |
| DeepIf (in-domain) | 4.92 | 2.27 | 86.56 | 21.74 |
| SIMPLE_SYSTEMS | 0.94 | 1.12 | 97.94 | 9.81 |
PL-RU
| System name | BLEU | NIST | TER | METEOR |
|---|---|---|---|---|
| Google Translate | 15.78 | | | |
| ModernMT | 12.71 | | | |
| DeepIf (in-domain) | 5.38 | 2.53 | 83.02 | 53.54 |
| SIMPLE_SYSTEMS | 0.69 | 0.85 | 102.75 | 41.06 |
RU-PL
| System name | BLEU | NIST | TER | METEOR |
|---|---|---|---|---|
| Google Translate | 13.54 | | | |
| ModernMT | 11.45 | | | |
| ModernMT (in-domain) | 5.73 | | | |
| DeepIf (in-domain) | 5.51 | 2.97 | 85.27 | 24.08 |
| SIMPLE_SYSTEMS | 0.57 | 1.29 | 109.43 | 8.35 |
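BLEU, NIST and METEOR are higher-is-better scores, while TER is an edit-based error rate (lower is better) that can exceed 100 when the required edits outnumber the reference words, as for SIMPLE_SYSTEMS above. A minimal scoring sketch with sacrebleu, which covers BLEU and TER but not NIST or METEOR (file paths are placeholders):

```python
from sacrebleu.metrics import BLEU, TER  # pip install sacrebleu

def score_submission(hyp_path, ref_path):
    """Corpus-level BLEU and TER for one translation direction.

    hyp_path / ref_path: plain-text files with one segment per line
    (placeholder paths, not the shared-task file names).
    """
    with open(hyp_path, encoding="utf-8") as f:
        hyps = [line.strip() for line in f]
    with open(ref_path, encoding="utf-8") as f:
        refs = [line.strip() for line in f]

    bleu = BLEU().corpus_score(hyps, [refs]).score  # higher is better
    ter = TER().corpus_score(hyps, [refs]).score    # lower is better
    return bleu, ter
```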
Task 5: Automatic speech recognition
Best system: GOLEM
Prize winner: GOLEM
| System name | WER% | CORR% | SUB% | DEL% | INS% | Per-file WER mean | Per-file WER std. dev. | Per-file WER median | Type |
|---|---|---|---|---|---|---|---|---|---|
| GOLEM | 12.8 | 90.1 | 6.9 | 3.0 | 2.9 | 13.3 | 8.8 | 11.9 | closed |
| ARM-1 | 26.4 | 77.0 | 16.5 | 6.5 | 3.4 | 27.2 | 13.5 | 24.7 | open |
| SGMM2 | 41.3 | 65.2 | 27.1 | 7.7 | 6.5 | 41.3 | 18.1 | 38.8 | open |
| tri2a | 41.8 | 62.9 | 26.8 | 10.3 | 4.7 | 41.4 | 16.9 | 38.5 | open |
| clarin-pl/sejm | 11.8 | 89.7 | 5.4 | 5.0 | 1.4 | 12.0 | 7.9 | 9.8 | closed |
| clarin-pl/studio | 30.9 | 71.4 | 16.0 | 12.6 | 2.4 | 30.4 | 13.6 | 25.9 | open |
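CORR%, SUB% and DEL% partition the reference words (they sum to 100%), INS% counts inserted words relative to the reference length, and WER% = SUB% + DEL% + INS%, which matches the rows above. A sketch of the per-utterance alignment behind these counts (illustrative only, not the scoring tool used in the evaluation):

```python
def wer_counts(ref_words, hyp_words):
    """Word error rate components via Levenshtein alignment.

    Returns (substitutions, deletions, insertions, correct) so that
    WER = (S + D + I) / len(ref_words).  Illustrative sketch only.
    """
    R, H = len(ref_words), len(hyp_words)
    # dp[i][j] = minimal edit cost aligning ref[:i] with hyp[:j]
    dp = [[0] * (H + 1) for _ in range(R + 1)]
    for i in range(1, R + 1):
        dp[i][0] = i
    for j in range(1, H + 1):
        dp[0][j] = j
    for i in range(1, R + 1):
        for j in range(1, H + 1):
            sub = dp[i - 1][j - 1] + (ref_words[i - 1] != hyp_words[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)

    # Backtrace the optimal alignment, counting hits and error types.
    i, j, S, D, I, C = R, H, 0, 0, 0, 0
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1]
                and ref_words[i - 1] == hyp_words[j - 1]):
            C += 1; i -= 1; j -= 1
        elif i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + 1:
            S += 1; i -= 1; j -= 1
        elif i > 0 and dp[i][j] == dp[i - 1][j] + 1:
            D += 1; i -= 1
        else:
            I += 1; j -= 1
    return S, D, I, C
```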
Task 6: Automatic cyberbullying detection
Subtask 6.1
Best system: n-waves ULMFiT
Prize winner: n-waves ULMFiT
| System name | Precision | Recall | F1 | Accuracy |
|---|---|---|---|---|
| n-waves ULMFiT | 66.67 | 52.24 | 58.58 | 90.10 |
| Przetak | 66.35 | 51.49 | 57.98 | 90.00 |
| ULMFiT + SentencePiece + BranchingAttention | 52.90 | 54.48 | 53.68 | 87.40 |
| ensamble spacy + tpot + BERT | 52.71 | 50.75 | 51.71 | 87.30 |
| ensamble + fastai | 52.71 | 50.75 | 51.71 | 87.30 |
| ensenble spacy + tpot | 43.09 | 58.21 | 49.52 | 84.10 |
| Rafal-1 | 41.08 | 56.72 | 47.65 | 83.30 |
| Rafal-2 | 41.38 | 53.73 | 46.75 | 83.60 |
| model1-svm | 60.49 | 36.57 | 45.58 | 88.30 |
| fasttext | 58.11 | 32.09 | 41.35 | 87.80 |
| SCWAD-CB | 51.90 | 30.60 | 38.50 | 86.90 |
| model2-gru | 63.83 | 22.39 | 33.15 | 87.90 |
| model3-flair | 81.82 | 13.43 | 23.08 | 88.00 |
| Task 6: Automatic cyberbullying detection (J.K.) | 17.41 | 32.09 | 22.57 | 70.50 |
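The reported F1 values are consistent with precision and recall computed on the harmful class (e.g. F1 of 58.58 from P = 66.67 and R = 52.24), with accuracy taken over all tweets. A scikit-learn sketch under that reading (not the official evaluation script), assuming 0/1 label lists with 1 = harmful:

```python
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

def subtask_6_1_scores(y_true, y_pred):
    """Binary detection metrics, with 1 = harmful and 0 = non-harmful.

    P/R/F1 are computed for the positive (harmful) class, accuracy over
    all tweets; standard definitions, not the official scorer.
    """
    return {
        "precision": 100 * precision_score(y_true, y_pred),
        "recall": 100 * recall_score(y_true, y_pred),
        "f1": 100 * f1_score(y_true, y_pred),
        "accuracy": 100 * accuracy_score(y_true, y_pred),
    }
```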
Subtask 6.2
Best system: model1-svm
Prize winner: model1-svm
| System name | Micro-Average F1 | Macro-Average F1 |
|---|---|---|
| model1-svm | 87.60 | 51.75 |
| ensamble spacy + tpot + BERT | 87.10 | 46.45 |
| fasttext | 86.80 | 47.22 |
| model3-flair | 86.80 | 45.05 |
| SCWAD-CB | 83.70 | 49.47 |
| model2-gru | 78.80 | 49.15 |
| Task 6: Automatic cyberbullying detection (J.K.) | 70.40 | 37.59 |
| ensamble + fastai | 61.60 | 39.64 |
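Subtask 6.2 distinguishes types of harmfulness, so two aggregates are reported: micro-averaged F1 pools all per-tweet decisions (and is therefore dominated by the most frequent class), while macro-averaged F1 averages the per-class F1 values with equal weight. A scikit-learn sketch of the standard definitions (not the official scorer):

```python
from sklearn.metrics import f1_score

def subtask_6_2_scores(y_true, y_pred):
    """Micro- and macro-averaged F1 over multi-class labels."""
    return {
        "micro_f1": 100 * f1_score(y_true, y_pred, average="micro"),
        "macro_f1": 100 * f1_score(y_true, y_pred, average="macro"),
    }
```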