For this task, participants are asked to develop systems that automatically identify whether an English Tweet related to the novel coronavirus (COVID-19) is informative or not. Such informative Tweets provide information about recovered, suspected, confirmed and death cases as well as location or travel history of the cases.
Data is released on June 21, 2020!
Official evaluation will take place between August 17, 2020 and August 21, 2020.
(Please register HERE to participate).
There is a mailing list for future announcements.
The goals of our shared task are: (1) To develop a language processing task that potentially impacts research and downstream applications, and (2) To provide the community with a new dataset for identifying informative COVID-19 English Tweets.
As of mid-June 2020, the COVID-19 outbreak has led to about 445K deaths and 8.2M+ infected patients across 215 countries and regions, creating fear and panic for people all around the world. Recently, much attention has been paid to building monitoring systems (e.g. the Johns Hopkins Coronavirus Dashboard) to track the development of the outbreak and to provide users with information related to the virus, e.g. any new suspected/confirmed cases near or in the users' regions. Note that most of the "official" sources used in these tracking tools are not kept up to date with the current outbreak situation; e.g. WHO updates the outbreak information only once a day. Such monitoring systems therefore use social network data, e.g. from Twitter, as a real-time alternative source for updating the outbreak information, generally by crowdsourcing or by searching for related information manually. However, as the outbreak has been spreading rapidly, we observe a massive amount of data on social networks, e.g. about 4 million COVID-19 English Tweets daily on the Twitter platform, the majority of which are uninformative. It is thus important to be able to select the informative ones (e.g. COVID-19 Tweets related to new or suspected cases) for downstream applications. However, manually identifying the informative Tweets requires significant human effort and is therefore costly.
To help address this problem, this shared task is to automatically identify whether a COVID-19 English Tweet is informative or not. Such informative Tweets provide information about recovered, suspected, confirmed and death cases as well as location or travel history of the cases. To achieve the goals of this shared task, we construct a dataset of 10K COVID-19 English Tweets. We believe our dataset and the systems developed for this shared task will be beneficial for the development of COVID-19 related monitoring systems.
The dataset consists of 10K COVID-19 English Tweets: 4719 Tweets labeled as INFORMATIVE and 5281 Tweets labeled as UNINFORMATIVE. Each Tweet is annotated by 3 independent annotators, and we obtain an inter-annotator agreement (Fleiss' Kappa) of 0.818. We split the dataset into training/validation/test sets at a 70/10/20 ratio.
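For reference, a Fleiss' Kappa score such as the one above can be computed from per-item annotation counts as in the following minimal sketch (the count matrix here is a toy illustration, not the task's actual annotation data):

```python
def fleiss_kappa(counts):
    """Fleiss' Kappa for a list of per-item category counts.

    counts[i][j] = number of annotators who assigned item i to category j;
    every item must be rated by the same number of annotators.
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Mean per-item observed agreement (P-bar)
    p_bar = sum(
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ) / n_items
    # Chance agreement (P_e) from the marginal category proportions
    p_e = sum(
        (sum(row[j] for row in counts) / (n_items * n_raters)) ** 2
        for j in range(n_cats)
    )
    return (p_bar - p_e) / (1 - p_e)

# Toy example: 4 Tweets, 3 annotators, 2 labels (INFORMATIVE, UNINFORMATIVE)
print(fleiss_kappa([[3, 0], [0, 3], [2, 1], [3, 0]]))  # ≈ 0.625
```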
Although we provide a default training and validation split of the released data, participants are free to use this data in any way they find useful when training and tuning their systems, e.g. use a different split, perform cross-validation, and the like.
The raw test set will be released when the evaluation phase starts. To ensure fairness among participants, the test set will be a relatively large set of Tweets, and the actual test Tweets on which your model will be evaluated are hidden within it. Gold test labels will be released after the evaluation period.
Shared task data is available at: https://github.com/VinAIResearch/COVID19Tweet
Note that the data is released for you to use without downloading from Twitter (here, each data line consists of a Tweet ID, the Tweet text and a label). Part of our agreement in sharing this data directly with you is that it is your responsibility to comply with Twitter's Terms of Service and to delete Tweets that are no longer publicly available.
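Since each data line pairs a Tweet ID, the Tweet text and a label, a reader for the files might look like the sketch below. This assumes tab-separated fields; check the released files in the repository for the exact format, and note that the example line is purely illustrative:

```python
def parse_line(line):
    """Split one data line into (tweet_id, text, label).

    Assumes tab-separated fields, i.e. 'id<TAB>text<TAB>label' --
    verify against the actual released files.
    """
    tweet_id, text, label = line.rstrip("\n").split("\t")
    return tweet_id, text, label

# Illustrative line only (made-up ID and text):
example = "123456789\t12 new confirmed cases reported today.\tINFORMATIVE"
print(parse_line(example))
```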
Systems are evaluated using standard evaluation metrics: accuracy, precision, recall and F1-score. Note that the latter three metrics are calculated for the INFORMATIVE class only. Submissions will be ranked by F1-score.
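Concretely, scoring the INFORMATIVE class only means treating it as the positive class in a binary evaluation, as in this sketch (the function and label lists are illustrative, not the official scorer):

```python
def evaluate(gold, pred, positive="INFORMATIVE"):
    """Accuracy, plus precision/recall/F1 computed for the positive class only."""
    pairs = list(zip(gold, pred))
    tp = sum(g == positive and p == positive for g, p in pairs)  # true positives
    fp = sum(g != positive and p == positive for g, p in pairs)  # false positives
    fn = sum(g == positive and p != positive for g, p in pairs)  # false negatives
    accuracy = sum(g == p for g, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1

gold = ["INFORMATIVE", "INFORMATIVE", "UNINFORMATIVE", "UNINFORMATIVE"]
pred = ["INFORMATIVE", "UNINFORMATIVE", "INFORMATIVE", "UNINFORMATIVE"]
print(evaluate(gold, pred))  # -> (0.5, 0.5, 0.5, 0.5)
```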
By submitting results to this shared task, you consent to the public release of your scores at the W-NUT 2020 workshop and in the associated proceedings, at the task organizers' discretion. Scores may include, but are not limited to, automatic and manual quantitative judgments, qualitative judgments, and such other metrics as the task organizers see fit. You accept that the ultimate decision of metric choice and score value is that of the task organizers.
You further agree that the task organizers are under no obligation to release scores and that scores may be withheld if it is the task organizers' judgment that the submission was incomplete, erroneous, deceptive, or violated the letter or spirit of the shared task's rules. The inclusion of a submission's scores is not an endorsement of a team or individual's submission, system, or science.
You further agree that your system may be named according to the team name provided at the time of submission, or to a suitable shorthand as determined by the task organizers.
You can only join one participating team.
Result submission: https://competitions.codalab.org/competitions/25845
We seek submissions of systems and system descriptions, which will be published in Proceedings of the W-NUT 2020 workshop in the ACL Anthology.
Creators of systems with valid results submitted to this shared task are invited to submit a short paper (4 pages plus references) to W-NUT 2020 describing the system. There is no need to give a detailed description of the shared task in a system description paper.
All submissions should conform to EMNLP 2020 style guidelines. Please submit your papers at the SoftConf link.
Contact Us: If you have any questions, please feel free to contact us at email@example.com.