2018 The 4th Workshop on Noisy User-generated Text (W-NUT)
Nov 1, 2018, Brussels, Belgium (at EMNLP 2018)
The WNUT workshop focuses on Natural Language Processing applied to noisy user-generated text, such as that found in social media, online reviews, crowdsourced data, web forums, clinical records and language learner essays. The workshop hashtag is #wnut.
We again have best paper award(s) sponsored by Microsoft Research this year.
NEW! WNUT 2019 will be co-located again with EMNLP! (Hong Kong, Nov 2-7)
NEW! We received 44 paper submissions this year.
NEW! Best paper awards:
Thursday, November 1, 2018 |
9:00–9:05 | Opening |
9:05–9:50 | Invited Talk: Leon Derczynski
Dimensions of Variation in User-generated Text |
9:50–10:35 | Oral Session I |
9:50–10:05 | Inducing a lexicon of sociolinguistic variables from code-mixed text
Philippa Shoemark, James Kirby and Sharon Goldwater |
10:05–10:20 | Twitter Geolocation using Knowledge-Based Methods
Taro Miyazaki, Afshin Rahimi, Trevor Cohn and Timothy Baldwin |
10:20–10:35 | Content Extraction and Lexical Analysis from Customer-Agent Interactions
Sergiu Nisioi, Anca Bucur and Liviu P. Dinu |
10:35–11:00 | Tea Break |
11:00–12:30 | Oral Session II |
11:00–11:15 | Assigning people to tasks identified in email: The EPA dataset for addressee tagging for detected task intent
Revanth Rameshkumar, Peter Bailey, Abhishek Jha and Chris Quirk |
11:15–11:30 | How do you correct run-on sentences it’s not as easy as it seems
Junchao Zheng, Courtney Napoles and Joel Tetreault |
11:30–11:45 | A POS Tagging Model Designed for Learner English
Ryo Nagata, Tomoya Mizumoto, Yuta Kikuchi, Yoshifumi Kawasaki and Kotaro Funakoshi |
11:45–12:00 | Normalization of Transliterated Words in Code-Mixed Data Using Seq2Seq Model & Levenshtein Distance
Soumil Mandal and Karthick Nanmaran |
12:00–12:15 | Robust Word Vectors: Context-Informed Embeddings for Noisy Texts
Valentin Malykh, Taras Khakhulin and Varvara Logacheva |
12:15–12:30 | Paraphrase Detection on Noisy Subtitles in Six Languages
Eetu Sjöblom, Mathias Creutz and Mikko Aulamo |
12:30–14:00 | Lunch |
14:00–14:45 | Invited Talk: Diyi Yang
Modeling Members' Social Roles and their Conversational Acts in Online Communities |
14:45–15:15 | Lightning Talks |
| Geocoding Without Geotags: A Text-based Approach for reddit
Keith Harrigian |
| Distantly Supervised Attribute Detection from Reviews
Lisheng Fu and Pablo Barrio |
| Using Wikipedia Edits in Low Resource Grammatical Error Correction
Adriane Boyd |
| Empirical Evaluation of Character-Based Model on Neural Named-Entity Recognition in Indonesian Conversational Texts
Kemal Kurniawan and Samuel Louvan |
| Orthogonal Matching Pursuit for Text Classification
Konstantinos Skianis, Nikolaos Tziortziotis and Michalis Vazirgiannis |
| Training and Prediction Data Discrepancies: Challenges of Text Classification with Noisy, Historical Data
R. Andrew Kreek and Emilia Apostolova |
| Detecting Code-Switching between Turkish-English Language Pair
Zeynep Yirmibeşoğlu and Gülşen Eryiğit |
| Language Identification in Code-Mixed Data using Multichannel Neural Networks and Context Capture
Soumil Mandal and Anil Kumar Singh |
| Modeling Student Response Times: Towards Efficient One-on-one Tutoring Dialogues
Luciana Benotti, Jayadev Bhaskaran, Sigtryggur Kjartansson and David Lang |
| Preferred Answer Selection in Stack Overflow: Better Text Representations ... and Metadata, Metadata, Metadata
Steven Xu, Andrew Bennett, Doris Hoogeveen, Jey Han Lau and Timothy Baldwin |
| Word-like character n-gram embedding
Geewook Kim, Kazuki Fukui and Hidetoshi Shimodaira |
| Classification of Tweets about Reported Events using Neural Networks
Kiminobu Makino, Yuka Takei, Taro Miyazaki and Jun Goto |
| Learning to Define Terms in the Software Domain
Vidhisha Balachandran, Dheeraj Rajagopal, Rose Catherine Kanjirathinkal and William Cohen |
| FrameIt: Ontology Discovery for Noisy User-Generated Text
Dan Iter, Alon Halevy and Wang-Chiew Tan |
| Using Author Embeddings to Improve Tweet Stance Classification
Adrian Benton and Mark Dredze |
| Low-resource named entity recognition via multi-source projection: Not quite there yet?
Jan Vium Enghoff, Søren Harrison and Željko Agić |
| A Case Study on Learning a Unified Encoder of Relations
Lisheng Fu, Bonan Min, Thien Huu Nguyen and Ralph Grishman |
| Convolutions Are All You Need (For Classifying Character Sequences)
Zach Wood-Doughty, Nicholas Andrews and Mark Dredze |
| MTNT: A Testbed for Machine Translation of Noisy Text
Paul Michel and Graham Neubig |
| A Robust Adversarial Adaptation for Unsupervised Word Translation
Kazuma Hashimoto, Ehsan Hosseini-Asl, Caiming Xiong and Richard Socher |
| A Comparative Study of Embeddings Methods for Hate Speech Detection from Tweets
Shashank Gupta and Zeerak Waseem |
| Step or Not: Discriminator for The Real Instructions in User-generated Recipes
Shintaro Inuzuka, Takahiko Ito and Jun Harashima |
| Named Entity Recognition on Noisy Data using Images and Text
Diego Esteves |
| Handling Noise in Distributional Semantic Models for Large Scale Text Analytics and Media Monitoring
Peter Sumbler, Nina Viereckel, Nazanin Afsarmanesh and Jussi Karlgren |
| Combining Human and Machine Transcriptions on the Zooniverse Platform
Daniel Hanson and Andrea Simenstad |
| Predicting Good Twitter Conversations
Zach Wood-Doughty, Prabhanjan Kambadur and Gideon Mann |
| Automated opinion detection analysis of online conversations
Yuki M Asano, Niccolo Pescetelli and Jonas Haslbeck |
| Classification of Written Customer Requests: Dealing with Noisy Text and Labels
Viljami Laurmaa and Mostafa Ajallooeian |
15:15–16:30 | Poster Session |
16:30–17:15 | Invited Talk: Daniel Preoţiuc-Pietro
User Trait Expression and Portrayal through Social Media |
17:15–17:30 | Closing and Best Paper Awards |
Long and short paper submissions must be anonymized. Abstract submissions should include author information (and where the work was published in a footnote on the front page, if applicable). Please submit your papers at the
Papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of a paper accepted for presentation must notify the workshop organizers by the camera-ready deadline as to whether the paper will be presented or withdrawn. (Exception: 1-page abstracts can be work-in-progress or work published elsewhere.)