September 7th, Copenhagen (at EMNLP 2017)
NEW! WNUT 2018 will be co-located again with EMNLP! (Brussels, Belgium on Oct 31 or Nov 1, 2018)
The WNUT workshop focuses on Natural Language Processing applied to noisy user-generated text, such as that found in social media, online reviews, crowdsourced data, web forums, clinical records and language learner essays. This year, there will be one shared task on Entity Recognition - details below.
The workshop hashtag is #wnut.
We're excited about our two joint best paper winners! Thank you to Snap Inc. for the prize donation. In alphabetical order:
9:00–9:05 | Opening |
9:05–9:50 | Invited Talk: Common Sense Knowledge as an Emergent Property of Neural Conversational Models (Bill Dolan) |
9:50–10:35 | Oral Session I |
9:50–10:05 | Boundary-based MWE segmentation with text partitioning Jake Williams |
10:05–10:20 | Towards the Understanding of Gaming Audiences by Modeling Twitch Emotes Francesco Barbieri, Luis Espinosa Anke, Miguel Ballesteros, Juan Soler and Horacio Saggion |
10:20–10:35 | Churn Identification in Microblogs using Convolutional Neural Networks with Structured Logical Knowledge Mourad Gridach, Hatem Haddad and Hala Mulki |
10:35–11:00 | Coffee Break |
11:00–12:30 | Oral Session II |
11:00–11:15 | To normalize, or not to normalize: The impact of normalization on Part-of-Speech tagging Rob van der Goot, Barbara Plank and Malvina Nissim |
11:15–11:30 | Constructing an Alias List for Named Entities during an Event Anietie Andy, Mark Dredze, Mugizi Rwebangira and Chris Callison-Burch |
11:30–11:45 | Incorporating Metadata into Content-Based User Embeddings Linzi Xing and Michael J. Paul |
11:45–12:00 | Simple Queries as Distant Labels for Predicting Gender on Twitter Chris Emmery, Grzegorz Chrupała and Walter Daelemans |
12:00–12:15 | A Dataset and Classifier for Recognizing Social Media English Su Lin Blodgett, Johnny Wei and Brendan O’Connor |
12:15–12:30 | Evaluating hypotheses in geolocation on a very large sample of Twitter Bahar Salehi and Anders Søgaard |
12:30–14:00 | Lunch |
14:00–14:45 | Invited Talk: Tweets in Finance (Miles Osborne) |
14:45–14:55 | Lightning Talks |
| The Effect of Error Rate in Artificially Generated Data for Automatic Preposition and Determiner Correction Fraser Bowen, Jon Dehdari and Josef Van Genabith |
| An Entity Resolution Approach to Isolate Instances of Human Trafficking Online Chirag Nagpal, Kyle Miller, Benedikt Boecking and Artur Dubrawski |
| Noisy Uyghur Text Normalization Osman Tursun and Ruket Cakici |
| Crowdsourcing Multiple Choice Science Questions Johannes Welbl, Nelson F. Liu and Matt Gardner |
| A Text Normalisation System for Non-Standard English Words Emma Flint, Elliot Ford, Olivia Thomas, Andrew Caines and Paula Buttery |
| Huntsville, hospitals, and hockey teams: Names can reveal your location Bahar Salehi, Dirk Hovy, Eduard Hovy and Anders Søgaard |
| Improving Document Clustering by Removing Unnatural Language Myungha Jang, Jinho D. Choi and James Allan |
| Lithium NLP: A System for Rich Information Extraction from Noisy User Generated Text on Social Media Preeti Bhargava, Nemanja Spasojevic and Guoning Hu |
14:55–15:30 | Shared Task Session |
14:55–15:10 | Results of the WNUT2017 Shared Task on Novel and Emerging Entity Recognition Leon Derczynski, Eric Nichols, Marieke van Erp and Nut Limsopatham |
15:10–15:20 | A Multi-task Approach for Named Entity Recognition in Social Media Data Gustavo Aguilar, Suraj Maharjan, Adrian Pastor López Monroy and Thamar Solorio |
15:20–15:30 | Distributed Representation, LDA Topic Modelling and Deep Learning for Emerging Named Entity Recognition from Social Media Patrick Jansson and Shuhua Liu |
15:30–15:35 | Shared Task Lightning Talks |
| Multi-channel BiLSTM-CRF Model for Emerging Named Entity Recognition in Social Media Bill Y. Lin, Frank Xu, Zhiyi Luo and Kenny Zhu |
| Transfer Learning and Sentence Level Features for Named Entity Recognition on Tweets Pius von Däniken and Mark Cieliebak |
| Context-Sensitive Recognition for Emerging and Rare Entities Jake Williams and Giovanni Santia |
| A Feature-based Ensemble Approach to Recognition of Emerging and Rare Named Entities Utpal Kumar Sikdar and Björn Gambäck |
15:35–16:30 | Poster Session |
16:30–17:15 | Invited Talk: Modeling Language as a Social Construct (Dirk Hovy) |
17:15–17:30 | Closing and Best Paper Awards |
We seek submissions of regular papers on original and unpublished work (same page limit as the EMNLP main conference). One-page abstracts on work in progress or work published elsewhere are also welcome and will *not* be included in the conference proceedings.
All accepted submissions will be presented as posters. Additionally, selected submissions will be presented orally. Shared task participants are also encouraged (but not required) to submit system description papers and present posters; the top systems will be invited (but not required) to present orally.
Topics of interest include but are not limited to:
This shared task focuses on identifying unusual, previously unseen entities in the context of emerging discussions. Named entities form the basis of many modern approaches to other tasks (like event clustering and summarisation), but recall on them is a real problem in noisy text - even among annotators. This drop in recall tends to be due to novel entities and surface forms. Take for example the tweet “so.. kktny in 30 mins?” - even human experts find the entity kktny hard to detect and resolve. This task will evaluate the ability to detect and classify novel, emerging, singleton named entities in noisy text.
Organisers: Leon Derczynski (University of Sheffield), Marieke van Erp (VU University Amsterdam), Nut Limsopatham (University of Cambridge), Eric Nichols (Honda Research Institute, Japan)
Full details, dates, data, etc. are on the Emerging and Rare Entity Recognition task page.