Task 1: Recognition and normalization of temporal expressions
The aim of this task is to advance research on the processing of temporal expressions, which are used in other NLP applications such as question answering, summarisation, textual entailment and document classification. This task follows on from previous TempEval events organised for evaluating time expressions for English and Spanish (SemEval-2013). This time we provide a corpus of Polish documents fully annotated with temporal expressions. The annotation consists of boundaries, classes and normalised values of temporal expressions.
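To make the three annotation layers concrete, here is a minimal Python sketch of recognising and normalising a single kind of expression. It is not the task's annotation format or baseline: the `Timex` record, the `find_dates` helper and the `DD.MM.YYYY` pattern are assumptions made for illustration, and the normalised value is written as an ISO 8601 date.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Timex:
    """One annotated temporal expression (hypothetical record layout)."""
    start: int   # boundary: character offset where the expression begins
    end: int     # boundary: character offset just past the expression
    type: str    # class of the expression, e.g. "DATE"
    value: str   # normalised value, here an ISO 8601 date (YYYY-MM-DD)


# Assumed pattern: numeric dates written as DD.MM.YYYY.
# It does not check that the day and month are valid calendar values.
DATE_RE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def find_dates(text: str) -> List[Timex]:
    """Recognise numeric dates and normalise them to YYYY-MM-DD."""
    results = []
    for m in DATE_RE.finditer(text):
        day, month, year = (int(g) for g in m.groups())
        value = f"{year:04d}-{month:02d}-{day:02d}"
        results.append(Timex(m.start(), m.end(), "DATE", value))
    return results


text = "Spotkanie odbyło się 5.03.2019 w Warszawie."
for t in find_dates(text):
    print(text[t.start:t.end], t.type, t.value)
```

A real system would also have to handle expressions written in words, relative expressions and other classes, which a single regular expression cannot cover.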
Task 2: Lemmatization of proper names and multi-word phrases
The task consists in developing a tool for lemmatization of proper names and multi-word phrases.
Task 3: Entity linking
The task covers the identification of mentions of entities from a knowledge base (KB) in Polish texts. The reference KB for this task is WikiData (WD), an offspring of Wikipedia: a knowledge base that unifies structured data available in various editions of Wikipedia.
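As a rough illustration of what linking a mention to a KB item means, the sketch below matches surface forms against a toy dictionary. The `KB` dictionary and the `link_entities` function are hypothetical and far simpler than a real entity linker: the dictionary holds two hand-picked entries, and exact string matching ignores both ambiguity and Polish inflection.

```python
# Toy knowledge base: surface form -> WikiData item identifier.
# Two hand-picked entries, used only for illustration.
KB = {
    "Polska": "Q36",
    "Warszawa": "Q270",
}


def link_entities(text, kb):
    """Return (start, end, mention, item_id) for every exact match of a KB surface form."""
    links = []
    for mention, item_id in kb.items():
        start = text.find(mention)
        while start != -1:
            end = start + len(mention)
            links.append((start, end, mention, item_id))
            # Continue searching after the current match.
            start = text.find(mention, end)
    # Sort by position in the text.
    return sorted(links)


text = "Warszawa jest stolicą kraju, a Polska leży w Europie."
for start, end, mention, item_id in link_entities(text, KB):
    print(start, end, mention, item_id)
```

An inflected form such as "Warszawie" would not be matched by this lookup, which is one reason entity linking in Polish needs more than a dictionary.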
Task 4: Machine translation
Machine translation is the translation of text by a computer, with no human involvement. Pioneered in the 1950s, machine translation is also referred to as automated, automatic or instant translation.
Task 5: Automatic speech recognition
This task deals with automatically transcribing an audio recording containing speech in a noisy environment. Automatic speech recognition is commonly defined as the process of automatically determining the most likely sequence of words given some sequence of observations available in the form of audio.
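The definition above is commonly written as a maximisation problem. The symbols are notation introduced here rather than taken from the task description: $O$ stands for the sequence of acoustic observations and $W$ for a candidate word sequence.

```latex
\hat{W} = \arg\max_{W} P(W \mid O)
        = \arg\max_{W} P(O \mid W)\, P(W)
```

The second form follows from Bayes' rule, since $P(O)$ does not depend on $W$ and can be dropped from the maximisation. In the classical formulation, $P(O \mid W)$ is supplied by an acoustic model and $P(W)$ by a language model.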
Task 6: Automatic cyberbullying detection
Although the problem of humiliating and slandering people through the Internet has existed almost as long as Internet communication itself, the appearance of new devices such as smartphones and tablet computers, which allow this medium to be used not only at home, work or school but also on the move, has further exacerbated the problem. The recent decade in particular, during which Social Networking Services (SNS) such as Facebook and Twitter rapidly grew in popularity, has brought to light the problem of unethical behavior in Internet environments, which has been greatly impairing the mental health of adults and, most of all, of younger users and children. This problem is cyberbullying (CB), defined as the exploitation of open online means of communication, such as Internet forum boards or SNS, to convey harmful and disturbing information about private individuals, often children and students.