2020 The 6th Workshop on Noisy User-generated Text (W-NUT)
Nov 11, 2020, Punta Cana, Dominican Republic (at EMNLP 2020)
The WNUT workshop focuses on Natural Language Processing applied to noisy user-generated text, such as that found in social media, online reviews, crowdsourced data, web forums, clinical records and language learner essays. The workshop hashtag is #wnut.
We are organizing a shared task on entity and relation recognition over wet-lab protocols. Data will be released in April and evaluation will take place in June 2020 (more information here; sign up for the mailing list for future announcements).
We have best paper awards sponsored by Twitter this year.
Call for Papers
We seek submissions of
long and short papers on original and unpublished work (same page limits as the EMNLP main conference).
1-page abstracts on work-in-progress or work published elsewhere are also welcome and will *not* be included in the conference proceedings.
All accepted submissions will be presented as posters. Additionally, selected submissions will be presented orally.
Topics of interest include but are not limited to:
- NLP Preprocessing of Noisy Text
- Part of speech tagging
- Named entity tagging, including a wide range of categories, e.g. product names
- Chunking of user-generated text
- Text Normalization and Error Correction
- Normalizing noisy text for downstream tasks and for human readability
- Error detection and correction
- Robustness to Noise, both Natural and Adversarial
- Multilingual NLP in noisy text
- Machine Translation of Noisy Text
- Sentiment analysis
- Crowdsourcing of text data
- User prediction, e.g. gender, age
- Stylistics, e.g. formality, politeness
- Colloquial language, e.g. code-switching, idiom detection
- Bilingual translation of noisy text
- Paraphrase identification and semantic similarity of short text or noisy text
- Information extraction from noisy text
- Domain adaptation to user-generated text
- Geolocation prediction
- Global and regional trend detection and event extraction
- Detecting rumors, contradictory information, sarcasm and humor on social media
- Extracting user demographics, profiles, and major life events
- Temporal aspects of user-generated content (resolving time expressions, concept drift, diachronic analyses, etc.)
All submissions should conform to EMNLP 2020 style guidelines. Long and short paper submissions must be anonymized. Abstract submissions should include author information (and, if applicable, a footnote on the front page stating where the work was published). Please submit your papers at the SoftConf link.
Double Submission Policy:
Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of a paper accepted for presentation must notify the workshop organizers by the camera-ready deadline whether the paper will be presented or withdrawn. (Exception: 1-page abstracts can be work-in-progress or work published elsewhere.)
- Submission Deadline: TBD (sometime around July 15 - Aug 10)
- Reviews Due: TBD
- Acceptance Notification: TBD
- Camera-Ready: TBD
- Workshop day: November 11
Shared task: Lab Protocols
Lab protocols specify the steps in performing a lab procedure. They are noisy, dense, and domain-specific. Automatic or semi-automatic conversion of protocols into a machine-readable format benefits biological research. In this task, system entries are invited for event recognition and relation extraction over these lab protocols. Note that these protocols are written by researchers and lab technicians worldwide, and some may contain non-standard language or spelling errors. Here's a sample of the input data:
Details on the shared task are here.