W-NUT 2026: Workshop on Natural User-generated Text (at EMNLP 2026)

From Noisy Text to Real-World Applications of LLMs

WNUT is a workshop on language produced by people. We focus on language as it occurs in the real world, from noisy text to real-world applications of LLMs.

Shared Task

This year, we host MultiLexNorm2026: a shared task on multilingual lexical normalization with a focus on non-Indo-European languages. Following the success of our first MultiLexNorm shared task in 2021, we have extended our benchmark to a more varied set of languages. More information about MultiLexNorm2026.

Important Dates

Date	Event
July 25th	Submission Deadline (Anywhere on Earth; dual submission allowed)
August 15th	ARR Commitment Date
August 25th	Acceptance Notification
September 6th	Camera-Ready Deadline
TBA	EMNLP 2026 Findings Deadline
October 28th	Workshop Day

Call for Papers

We seek submissions of long and short papers on original and unpublished work (same page limit as the EMNLP 2026 main conference). All accepted submissions will be presented as talks and/or posters at the workshop, following the EMNLP 2026 main conference.

We welcome submissions addressing (but not limited to) the following areas:

Classical NLP Tasks on Noisy Text

NLP of noisy text, e.g., POS and NER tagging, parsing
Text normalization and error correction
Paraphrase identification and semantic similarity of short text or noisy text
Extracting user demographics, profiles, and major life events
Machine translation and multilingual NLP over noisy text
Information extraction from noisy text and event extraction
Colloquial language, e.g., idiom detection
Domain adaptation to user-generated text
Detecting rumors, contradictory information, sarcasm, and humor on social media
Sentiment analysis
Temporal aspects of user-generated content (resolving time expressions, concept drift, etc.)
Representing and mining language variation in user-generated content

LLMs and Noisy Text

Robustness of LLMs to noisy, ungrammatical, or informal input
Training and fine-tuning of LLMs on user-generated text
Evaluation methodologies for LLMs on noisy text
Domain adaptation of LLMs to specific user-generated text domains
LLM-generated noisy text: detection, characteristics, and implications
Instruction-following in LLMs when faced with noisy or ambiguous prompts
Retrieval-augmented generation (RAG) with noisy documents

Real-World LLM Applications

LLM performance on text from social media, forums, and messaging platforms
Handling code-switching, slang, and emerging language in LLMs
LLMs for content moderation and safety in user-generated contexts
Multilingual and cross-lingual LLM applications on informal text
LLMs for assisting language learners and processing learner text
Bias, fairness, and representation in LLMs trained on user-generated text

We particularly encourage submissions that address multilingual challenges, low-resource languages, cross-platform variations, and the unique characteristics of user-generated text across different communities.

Submissions should conform to the ACL style guidelines. Long and short paper submissions must be anonymized. Please submit your papers via:

Submit via OpenReview
Commit via ARR

Double Submission Policy: Authors of papers that have been or will be submitted to other meetings or publications must indicate this at submission time. Authors of a paper accepted for presentation must notify the workshop organizers by the camera-ready deadline whether the paper will be presented or withdrawn.