|  |   TOPICS: Browse articles of the conference sorted by topic 
 
 | C |  
 | Cognitive Methods | Semi-Supervised Methods for Expanding Psycholinguistics Norms by Integrating Distributional Similarity with the Structure of WordNet
 #mygoal: Finding Motivations on Twitter
 A Graph-based Approach for Computing Free Word Associations
 Design and Development of an Online Computational Framework to Facilitate Language Comprehension Research on Indian Languages
 Mining a Multimodal Corpus for Non-Verbal Behavior Sequences Conveying Attitudes
 Turkish Resources for Visual Word Recognition
 
 |  
 | Collaborative Resource Construction | The DWAN Framework: Application of a Web Annotation Framework for the General Humanities to the Domain of Language Resources
 Collaboratively Annotating Multilingual Parallel Corpora in the Biomedical Domain―Some Mantras
 Mapping Between English Strings and Reentrant Semantic Graphs
 Developing Text Resources for Ten South African Languages
 Zmorge: a German Morphological Lexicon Extracted from Wiktionary
 Evaluating Lemmatization Models for Machine-Assisted Corpus-Dictionary Linkage
 Digital Library 2.0: Source of Knowledge and Research Collaboration Platform
 Linguistic Landscaping of South Asia Using Digital Language Resources: Genetic Vs. Areal Linguistics
 SAVAS: Collecting, Annotating and Sharing Audiovisual Language Resources for Automatic Subtitling
 CFT13: a Resource for Research into the Post-editing Process
 Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners
 A Colloquial Corpus of Japanese Sign Language: Linguistic Resources for Observing Sign Language Conversations
 Can Numerical Expressions Be Simpler? Implementation and Demonstration of a Numerical Simplification System for Spanish
 The eIdentity Text Exploration Workbench
 Rhapsodie: a Prosodic-Syntactic Treebank for Spoken French
 CLARA: A New Generation of Researchers in Common Language Resources and Their Applications
 Can Crowdsourcing Be Used for Effective Annotation of Arabic?
 TweetNorm_es: an Annotated Corpus for Spanish Microtext Normalization
 Corpus Annotation Through Crowdsourcing: Towards Best Practice Guidelines
 Towards an Environment for the Production and the Validation of Lexical Semantic Resources
 Towards an Encyclopedia of Compositional Semantics: Documenting the Interface of the English Resource Grammar
 MUHIT: a Multilingual Harmonized Dictionary
 Pivot-based Multilingual Dictionary Building Using Wiktionary
 The AMARA Corpus: Building Parallel Language Resources for the Educational Domain
 Exploiting Networks in Law
 Terminology Resources and Terminology Work Benefit from Cloud Services
 
 |  
 | Computer-Assisted Language Learning (CALL) | FLELex: a Graded Lexical Resource for French Foreign Learners
 MAT: a Tool for L2 Pronunciation Errors Annotation
 Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners
 Reusing Swedish FrameNet for Training Semantic Roles
 A Database of Freely Written Texts of German School Students for the Purpose of Automatic Spelling Error Classification
 Automatic Error Detection Concerning the Definite and Indefinite Conjugation in the HunLearner Corpus
 Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
 An Innovative World Language Centre: Challenges for the Use of Language Technology
 Open Philology at the University of Leipzig
 
 |  
 | Controlled Languages | Presenting a System of Human-Machine Interaction for Performing Map Tasks. 
 |  
 | Corpus (Creation, Annotation, etc.) | Statistical Analysis of Multilingual Text Corpus and Development of Language Models
 Smile and Laughter in Human-Machine Interaction: a Study of Engagement
 A Conventional Orthography for Tunisian Arabic
 The AMARA Corpus: Building Parallel Language Resources for the Educational Domain
 A Multimodal Dataset for Deception Detection
 Human Annotation of ASR Error Regions: is "gravity" a Sharable Concept for Human Annotators?
 Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie
 Erlangen-CLP: A Large Annotated Corpus of Speech from Children with Cleft Lip and Palate
 Semi-Automatic Annotation of the UCU Accents Speech Corpus
 Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch
 The CLE Urdu POS Tagset
 Automatic Detection of Other-Repetition Occurrences: Application to French Conversational Speech
 EMOVO Corpus: an Italian Emotional Speech Database
 A Tagged Corpus and a Tagger for Urdu
 A Multidialectal Parallel Corpus of Arabic
 Identification of Multiword Expressions in the brWaC
 Phone Boundary Annotation in Conversational Speech
 NoSta-D Named Entity Annotation for German: Guidelines and Dataset
 Mörkum Njálu. an Annotated Corpus to Analyse and Explain Grammatical Divergences Between 14th-Century Manuscripts of Njál's Saga.
 Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
 The Polish Summaries Corpus
 Variations on Quantitative Comparability Measures and Their Evaluations on Synthetic French-English Comparable Corpora
 Teenage and Adult Speech in School Context: Building and Processing a Corpus of European Portuguese
 On the Importance of Text Analysis for Stock Price Prediction
 A Corpus of Comparisons in Product Reviews
 The IULA Spanish LSP Treebank
 A System for Experiments with Dependency Parsers
 Sockpuppet Detection in Wikipedia: a Corpus of Real-World Deceptive Writing for Linking Identities
 ALICO: a Multimodal Corpus for the Study of Active Listening
 Corpus and Method for Identifying Citations in Non-Academic Text
 A Cross-Language Corpus for Studying the Phonetics and Phonology of Prominence
 Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages
 Collaboratively Annotating Multilingual Parallel Corpora in the Biomedical Domain―Some Mantras
 On the Use of a Fuzzy Classifier to Speed Up the Sp_ToBI Labeling of the Glissando Spanish Corpus
 Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing
 Praaline: Integrating Tools for Speech Corpus Research
 Interoperability and Customisation of Annotation Schemata in Argo
 Polish Coreference Corpus in Numbers
 A Gold Standard Dependency Corpus for English
 A Corpus of Machine Translation Errors Extracted from Translation Students Exercises
 Co-Training for Classification of Live Or Studio Music Recordings
 Creating and Using Large Monolingual Parallel Corpora for Sentential Paraphrase Generation
 A New Framework for Sign Language Recognition Based on 3D Handshape Identification and Linguistic Modeling
 Crowdsourcing for the Identification of Event Nominals: an Experiment
 Semantic Technologies for Querying Linguistic Annotations: an Experiment Focusing on Graph-Structured Data
 A Hierarchical Taxonomy for Classifying Hardness of Inference Tasks
 The Sweet-Home Speech and Multimodal Corpus for Home Automation Interaction
 Tools for Arabic Natural Language Processing: a Case Study in Qalqalah Prosody
 Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction
 Automatic Creation of WordNets from Parallel Corpora
 Pre-Ordering of Phrase-based Machine Translation Input in Translation Workflow
 A Wikipedia-based Corpus for Contextualized Machine Translation
 Motàmot Project: Conversion of a French-Khmer Published Dictionary for Building a Multilingual Lexical System
 Building a Corpus of Manually Revised Texts from Discourse Perspective
 Single-Person and Multi-Party 3D Visualizations for Nonverbal Communication Analysis
 Interoperability of Dialogue Corpora Through ISO 24617-2-based Querying
 The Database for Spoken German ― DGD2
 Simple Effective Microblog Named Entity Recognition: Arabic as an Example
 Priberam Compressive Summarization Corpus: a New Multi-Document Summarization Corpus for European Portuguese
 The MMASCS Multi-Modal Annotated Synchronous Corpus of Audio, Video, Facial Motion and Tongue Motion Data of Normal, Fast and Slow Speech
 Constructing a Chinese―Japanese Parallel Corpus from Wikipedia
 Modelling Irony in Twitter: Feature Analysis and Evaluation
 Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
 Computational Narratology: Extracting Tense Clusters from Narrative Texts
 Designing the Latvian Speech Recognition Corpus
 Aligning Parallel Texts with Intertext
 From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
 A Corpus of Spontaneous Speech in Lectures: the KIT Lecture Corpus for Spoken Language Processing and Translation
 The Pragmatic Annotation of a Corpus of Academic Lectures
 Comparative Analysis of Verbal Alignment in Human-Human and Human-Agent Interactions
 The eIdentity Text Exploration Workbench
 Emilya: Emotional Body Expression in Daily Actions Database
 The LIMA Multilingual Analyzer Made Free: FLOSS Resources Adaptation and Correction
 Exploring Factors That Contribute to Successful Fingerspelling Comprehension
 On the Annotation of TMX Translation Memories for Advanced Leveraging in Computer-Aided Translation
 Named Entity Recognition on Turkish Tweets
 On Complex Word Alignment Configurations
 Linguistic Resources and Cats: How to Use ISOcat, RELcat and SCHEMAcat
 Cross-Linguistic Annotation of Narrativity for English / French Verb Tense Disambiguation
 Evaluating Corpora Documentation with Regards to the Ethics and Big Data Charter
 Introducing a Web Application for Labeling, Visualizing Speech and Correcting Derived Speech Signals
 Vocabulary-based Language Similarity Using Web Corpora
 S-Pot - a Benchmark in Spotting Signs Within Continuous Signing
 TweetNorm_es: an Annotated Corpus for Spanish Microtext Normalization
 The Procedure of Lexico-Semantic Annotation of Składnica Treebank
 A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition
 Crowdsourcing as a Preprocessing for Complex Semantic Annotation Tasks
 Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements
 Learning from Domain Complexity
 Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
 Deep Syntax Annotation of the Sequoia French Treebank
 Developing a French FrameNet: Methodology and First Results
 Innovations in Parallel Corpus Search Tools
 Representing Multimodal Linguistic Annotated Data
 A Corpus of European Portuguese Child and Child-Directed Speech
 'interHist' - an Interactive Visual Interface for Corpus Exploration
 Hashtag Occurrences, Layout and Translation: a Corpus-Driven Analysis of Tweets Published by the Canadian Government
 Presenting a System of Human-Machine Interaction for Performing Map Tasks.
 Hesita(Te) in Portuguese
 MUHIT: a Multilingual Harmonized Dictionary
 The Munich Biovoice Corpus: Effects of Physical Exercising, Heart Rate, and Skin Conductance on Human Speech Production
 Conceptual Transfer: Using Local Classifiers for Transfer Selection
 Annotating Arguments: the NOMAD Collaborative Annotation Tool
 Correcting Errors in a New Gold Standard for Tagging Icelandic Text
 Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard
 Named Entity Corpus Construction Using Wikipedia and DBpedia Ontology
 Euronews: a Multilingual Speech Corpus for ASR
 Thomas Aquinas in the Tündra: Integrating the Index Thomisticus Treebank Into CLARIN-D
 Towards Linked Hypernyms Dataset 2.0: Complementing DBpedia with Hypernym Discovery
 The Slovene BNSI Broadcast News Database and Reference Speech Corpus GOS: Towards the Uniform Guidelines for Future Work
 Language Editing Dataset of Academic Texts
 Japanese Conversation Corpus for Training and Evaluation of Backchannel Prediction Model.
 Aix Map Task Corpus: the French Multimodal Corpus of Task-Oriented Dialogue
 Multiword Expressions in Machine Translation
 CROMER: a Tool for Cross-Document Event and Entity Coreference
 Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
 CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis
 Classifying Inconsistencies in DBpedia Language Specific Chapters
 The Halliday Centre Tagger: an Online Platform for Semi-Automatic Text Annotation and Analysis
 NIF4OGGD - NLP Interchange Format for Open German Governmental Data
 Verbs of Saying with a Textual Connecting Function in the Prague Discourse Treebank
 A Language-Independent and Fully Unsupervised Approach to Lexicon Induction and Part-Of-Speech Tagging for Closely Related Languages
 New Bilingual Speech Databases for Audio Diarization
 Less is More? Towards a Reduced Inventory of Categories for Training a Parser for the Italian Stanford Dependencies
 UnixMan Corpus: A Resource for Language Learning in the Unix Domain
 GraPAT: a Tool for Graph Annotations
 The Tutorbot Corpus ― a Corpus for Studying Tutoring Behaviour in Multiparty Face-To-Face Spoken Dialogue
 TweetCaT: a Tool for Building Twitter Corpora of Smaller Languages
 Re-Using an Argument Corpus to Aid in the Curation of Social Media Collections
 Rapid Deployment of Phrase Structure Parsing for Related Languages: a Case Study of Insular Scandinavian
 Assessment of Non-Native Prosody for Spanish as L2 Using Quantitative Scores and Perceptual Evaluation
 Exploiting the Large-Scale German Broadcast Corpus to Boost the Fraunhofer IAIS Speech Recognition System
 Exploring the Utility of Coreference Chains for Improved Identification of Personal Names
 Co-Clustering of Bilingual Datasets as a Mean for Assisting the Construction of Thematic Bilingual Comparable Corpora
 The Extended DIRNDL Corpus as a Resource for Coreference and Bridging Resolution
 A Flexible Language Learning Platform Based on Language Resources and Web Services
 Extracting Semantic Relations from Portuguese Corpora Using Lexical-Syntactic Patterns
 An Analysis of Ambiguity in Word Sense Annotations
 Disclose Models, Hide the Data - How to Make Use of Confidential Corpora Without Seeing Sensitive Raw Data
 Exploiting Networks in Law
 Parsing Chinese Synthetic Words with a Character-based Dependency Model
 Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain
 Building a Dataset for Summarization and Keyword Extraction from Emails
 
 |  
 | Crowdsourcing | Modern Chinese Helps Archaic Chinese Processing: Finding and Exploiting the Shared Properties
 A Crowdsourcing Smartphone Application for Swiss German: Putting Language Documentation in the Hands of the Users
 A Study on Expert Sourcing Enterprise Question Collection and Classification
 Collaboration in the Production of a Massively Multilingual Lexicon
 The Newsome Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages
 A SICK Cure for the Evaluation of Compositional Distributional Semantic Models
 Morpho-Syntactic Study of Errors from Speech Recognition System
 Crowdsourcing and Annotating NER for Twitter #drift
 Designing and Evaluating a Reliable Corpus of Web Genres Via Crowd-Sourcing
 Crowdsourcing as a Preprocessing for Complex Semantic Annotation Tasks
 A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic
 Crowd-Sourcing Evaluation of Automatically Acquired, Morphologically Related Word Groupings
 Propa-L: a Semantic Filtering Service from a Lexical Network Created Using Games with a Purpose
 When Transliteration Met Crowdsourcing: an Empirical Study of Transliteration Via Crowdsourcing Using Efficient, Non-Redundant and Fair Quality Control
 
 |      
 
 | E |  
 | Emotion Recognition/Generation | Toward a Unifying Model for Opinion, Sentiment and Emotion Information Extraction
 Eliciting and Annotating Uncertainty in Spoken Language
 Annotating Events in an Emotion Corpus
 Speech-based Emotion Recognition: Feature Selection by Self-Adaptive Multi-Criteria Genetic Algorithm
 The AV-LASYN Database: a Synchronous Corpus of Audio and 3D Facial Marker Data for Audio-Visual Laughter Synthesis
 Emilya: Emotional Body Expression in Daily Actions Database
 The D-ANS Corpus: the Dublin-Autonomous Nervous System Corpus of Biosignal and Multimodal Recordings of Conversational Speech
 Texafon 2.0: a Text Processing Tool for the Generation of Expressive Speech in TTS Applications
 Media Monitoring and Information Extraction for the Highly Inflected Agglutinative Language Hungarian
 The SSPNet-Mobile Corpus: Social Signal Processing Over Mobile Phones.
 EMOVO Corpus: an Italian Emotional Speech Database
 The Munich Biovoice Corpus: Effects of Physical Exercising, Heart Rate, and Skin Conductance on Human Speech Production
 Voce Corpus: Ecologically Collected Speech Annotated with Physiological and Psychological Stress Assessments
 Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis.
 Modeling, Managing, Exposing, and Linking Ontologies with a Wiki-based Tool
 Smile and Laughter in Human-Machine Interaction: a Study of Engagement
 
 |  
 | Endangered Languages | PanLex: Building a Resource for Panlingual Lexical Translation
 Enriching ODIN
 TLAXCALA: a Multilingual Corpus of Independent News
 Untrained Forced Alignment of Transcriptions and Audio for Language Documentation Corpora Using WebMAUS
 Finite-State Morphological Transducers for Three Kypchak Languages
 A Finite-State Morphological Analyzer for a Lakota HPSG Grammar
 Open-Domain Interaction and Online Content in the Sami Language
 Using Transfer Learning to Assist Exploratory Corpus Annotation
 Linguistic Evaluation of Support Verb Constructions by OpenLogos and Google Translate
 First Approach Toward Semantic Role Labeling for Basque
 The Gulf of Guinea Creole Corpora
 An Innovative World Language Centre: Challenges for the Use of Language Technology
 
 |  
 | Evaluation Methodologies | VERTa: Facing a Multilingual Experience of a Linguistically-based MT Evaluation
 Combining Elicited Imitation and Fluency Features for Oral Proficiency Measurement
 ETER: a New Metric for the Evaluation of Hierarchical Named Entity Recognition
 Measuring Readability of Polish Texts: Baseline Experiments
 Bridging the Gap Between Speech Technology and Natural Language Processing: an Evaluation Toolbox for Term Discovery Systems
 Building a Database of Japanese Adjective Examples from Special Purpose Web Corpora
 A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization
 Comparing the Quality of Focused Crawlers and of the Translation Resources Obtained from Them
 Creating and Using Large Monolingual Parallel Corpora for Sentential Paraphrase Generation
 A Comparative Evaluation Methodology for NLG in Interactive Systems
 An Evaluation of the Role of Statistical Measures and Frequency for MWE Identification
 Using a Machine Learning Model to Assess the Complexity of Stress Systems
 Translation Errors from English to Portuguese: an Annotated Corpus
 Discosuite - a Parser Test Suite for German Discontinuous Structures
 Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
 PACE Corpus: a Multilingual Corpus of Polarity-Annotated Textual Data from the Domains Automotive and CEllphone
 A Benchmark Database of Phonetic Alignments in Historical Linguistics and Dialectology
 Introducing a Framework for the Evaluation of Music Detection Tools
 The Taraxü Corpus of Human-Annotated Machine Translations
 Detecting Document Structure in a Very Large Corpus of UK Financial Reports
 S-Pot - a Benchmark in Spotting Signs Within Continuous Signing
 Machine Translation for Subtitling: a Large-Scale Evaluation
 Extrinsic Corpus Evaluation with a Collocation Dictionary Task
 HuRIC: a Human Robot Interaction Corpus
 On the Origin of Errors: a Fine-Grained Analysis of MT and PE Errors and their Relationship
 Dense Components in the Structure of WordNet
 MADAMIRA: a Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic
 A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions
 Building a Crisis Management Term Resource for Social Media: the Case of Floods and Protests
 The Use of a FileMaker Pro Database in Evaluating Sign Language Notation Systems
 A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
 A Large-Scale Evaluation of Pre-Editing Strategies for Improving User-Generated Content Translation
 Activ-Es: a Comparable, Cross-Dialect Corpus of everyday Spanish from Argentina, Mexico, and Spain
 Overview of Todai Robot Project and Evaluation Framework of Its NLP-based Problem Solving
 Crowdsourcing for Evaluating Machine Translation Quality
 Student Achievement and French Sentence Repetition Test Scores
 Fuzzy V-Measure - an Evaluation Method for Cluster Analyses of Ambiguous Data
 Why Chinese Web-as-Corpus is Wacky? Or: How Big Data is Killing Chinese Corpus Linguistics
 KoKo: an L1 Learner Corpus for German
 An Efficient and User-Friendly Tool for Machine Translation Quality Estimation
 LexTerm Manager: Design for an Integrated Lexicography and Terminology System
 The ETAPE Speech Processing Evaluation
 
 |              
 
 | M |  
 | Machine Translation, SpeechToSpeech Translation | Bilingual Dictionary Construction with Transliteration Filtering
 Large SMT Data-Sets Extracted from Wikipedia
 Two-Step Machine Translation with Lattices
 MTWatch: A Tool for the Analysis of Noisy Parallel Data
 Collecting Natural SMS and Chat Conversations in Multiple Languages: the BOLT Phase 2 Corpus
 Comparing the Quality of Focused Crawlers and of the Translation Resources Obtained from Them
 OpenLogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries
 Incorporating Alternate Translations Into English Translation Treebank
 Multival - Towards a Multilingual Valence Lexicon
 A Unified Annotation Scheme for the Semantic / Pragmatic Components of Definiteness
 On the Reliability and Inter-Annotator Agreement of Human Semantic MT Evaluation Via HMEANT
 Dual Subtitles as Parallel Corpora
 Bootstrapping Open-Source English-Bulgarian Computational Dictionary
 Collection of a Simultaneous Translation Corpus for Comparative Analysis
 Translation Errors from English to Portuguese: an Annotated Corpus
 English-French Verb Phrase Alignment in Europarl for Tense Translation Modeling
 CFT13: a Resource for Research into the Post-editing Process
 Creating a Massively Parallel Bible Corpus
 Evaluating the Effects of Interactivity in a Post-Editing Workbench
 ParCor 1.0: a Parallel Pronoun-Coreference Corpus to Support Statistical MT
 An Efficient Language Independent Toolkit for Complete Morphological Disambiguation
 A Corpus of Spontaneous Speech in Lectures: the KIT Lecture Corpus for Spoken Language Processing and Translation
 On the Annotation of TMX Translation Memories for Advanced Leveraging in Computer-Aided Translation
 The Taraxü Corpus of Human-Annotated Machine Translations
 The Strategic Impact of META-NET on the Regional, National and International Level
 An Iterative Approach for Mining Parallel Sentences in a Comparable Corpus
 Collocation Or Free Combination? ― Applying Machine Translation Techniques to Identify Collocations in Japanese
 Multiword Expressions in Machine Translation
 Crowdsourcing for Evaluating Machine Translation Quality
 HindEnCorp - Hindi-English and Hindi-Only Corpus for Machine Translation
 caWaC - a Web Corpus of Catalan and Its Application to Language Modeling and Machine Translation
 Billions of Parallel Words for Free: Building and Using the EU Bookshop Corpus
 LinkedHealthAnswers: Towards Linked Data-driven Question Answering for the Health Care Domain
 Chasing the Perfect Splitter: a Comparison of Different Compound Splitting Tools
 A Comparison of MT Errors and ESL Errors
 Improving Evaluation of English-Czech MT Through Paraphrasing
 DCEP - Digital Corpus of the European Parliament
 An Efficient and User-Friendly Tool for Machine Translation Quality Estimation
 
 |  
 | Metadata | Revising the Annotation of a Broadcast News Corpus: a Linguistic Approach
 Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction
 Developing a Framework for Describing Relations Among Language Resources
 Global Intelligent Content: Active Curation of Language Resources Using Linked Data
 Experiences with the ISOcat Data Category Registry
 The Dutch LESLLA Corpus
 The EASR Corpora of European Portuguese, French, Hungarian and Polish Elderly Speech
 Three Dimensions of the So-Called "interoperability" of Annotation Schemes
 TagNText: a Parallel Corpus for the Induction of Resource-Specific Non-Taxonomical Relations from Tagged Images
 META-SHARE: One Year After
 Meta-Classifiers Easily Improve Commercial Sentiment Detection Tools
 HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation
 Recent Developments in DeReKo
 Vulnerability in Acquisition, Language Impairments in Dutch: Creating a Valid Data Archive
 Improving Entity Linking Using Surface Form Refinement
 Facing the Identification Problem in Language-Related Scientific Data Analysis.
 
 |  
 | Morphology | DerivBase.Hr: a High-Coverage Derivational Morphology Resource for Croatian
 Generating and Using Probabilistic Morphological Resources for the Biomedical Domain
 Computer-Aided Morphology Expansion for Old Swedish
 DeLex, a Freely-Available, Large-Scale and Linguistically Grounded Morphological Lexicon for German
 Automatic Refinement of Syntactic Categories in Chinese Word Structures
 Bootstrapping Open-Source English-Bulgarian Computational Dictionary
 Amazigh Verb Conjugator
 Szeged Corpus 2.5: Morphological Modifications in a Manually POS-Tagged Hungarian Corpus
 The Syn-Series Corpora of Written Czech
 Corpus of 19th-Century Czech Texts: Problems and Solutions
 Automatic Error Detection Concerning the Definite and Indefinite Conjugation in the HunLearner Corpus
 A Language-Independent Approach to Extracting Derivational Relations from an Inflectional Lexicon
 Morpho-Syntactic Study of Errors from Speech Recognition System
 Can Crowdsourcing Be Used for Effective Annotation of Arabic?
 Word-Formation Network for Czech
 Glàff, a Large Versatile French Lexicon
 The CMU METAL Farsi NLP Approach
 Language Resource Addition: Dictionary Or Corpus?
 The Development of Dutch and Afrikaans Language Resources for Compound Boundary Analysis.
 Correcting Errors in a New Gold Standard for Tagging Icelandic Text
 The Hungarian Gigaword Corpus
 Measuring the Impact of Spelling Errors on the Quality of Machine Translation
 Automatic Acquisition of Urdu Nouns (along with Gender and Irregular Plurals)
 Chasing the Perfect Splitter: a Comparison of Different Compound Splitting Tools
 
 |  
 | MultiWord Expressions & Collocations | PropBank: Semantics of New Predicate Types
 4FX: Light Verb Constructions in a Multilingual Parallel Corpus
 Semi-Compositional Method for Synonym Extraction of Multi-Word Terms
 Linguistic Resources and Cats: How to Use ISOcat, RELcat and SCHEMAcat
 Identifying Idioms in Chinese Translations
 Identification of Multiword Expressions in the brWaC
 Extrinsic Corpus Evaluation with a Collocation Dictionary Task
 Comprehensive Annotation of Multiword Expressions in a Social Web Corpus
 T2K²: a System for Automatically Extracting and Organizing Knowledge from Texts
 Reconstructing the Semantic Landscape of Natural Language Processing
 ISLEX ― a Multilingual Web Dictionary
 TermWise: A CAT-tool with Context-Sensitive Terminological Support.
 Compounds and Distributional Thesauri
 SwissAdmin: a Multilingual Tagged Parallel Corpus of Press Releases
 Summarizing News Clusters on the Basis of Thematic Chains
 Named Entity Tagging a Very Large Unbalanced Corpus: Training and Evaluating NE Classifiers
 LexTerm Manager: Design for an Integrated Lexicography and Terminology System
 
 |  
 | Multilinguality | Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages
 Universal Stanford Dependencies: a Cross-Linguistic Typology
 Pivot-based Multilingual Dictionary Building Using Wiktionary
 Production of Phrase Tables in 11 European Languages Using an Improved Sub-Sentential Aligner
 The Making of Ancient Greek WordNet
 Extracting a Bilingual Semantic Grammar from FrameNet-Annotated Corpora
 Etymological WordNet: Tracing the History of Words
 TLAXCALA: a Multilingual Corpus of Independent News
 Relating Frames and Constructions in Japanese FrameNet
 Tharwa: a Large Scale Dialectal Arabic - Standard Arabic - English Lexicon
 Automatic Methods for the Extension of a Bilingual Dictionary Using Comparable Corpora
 Aggregation Methods for Efficient Collocation Detection
 GlobalPhone: Pronunciation Dictionaries in 20 Languages
 Linguistic Evaluation of Support Verb Constructions by OpenLogos and Google Translate
 Building a Dataset of Multilingual Cognates for the Romanian Lexicon
 Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings
 English-French Verb Phrase Alignment in Europarl for Tense Translation Modeling
 Constructing a Chinese―Japanese Parallel Corpus from Wikipedia
 xLiD-Lexica: Cross-lingual Linked Data Lexica
 An Efficient Language Independent Toolkit for Complete Morphological Disambiguation
 4FX: Light Verb Constructions in a Multilingual Parallel Corpus
 Resources in Conflict: a Bilingual Valency Lexicon Vs. a Bilingual Treebank Vs. a Linguistic Theory
 Buy One Get One Free: Distant Annotation of Chinese Tense, Event Type and Modality
 Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese
 Not an Interlingua, But Close: Comparison of English AMRs to Chinese and Czech
 On Complex Word Alignment Configurations
 Bring vs. MTRoget: Evaluating Automatic Thesaurus Translation
 The Strategic Impact of META-NET on the Regional, National and International Level
 Bilingual Dictionary Induction as an Optimization Problem
 Bootstrapping Term Extractors for Multiple Languages
 Clustering of Multi-Word Named Entity Variants: Multilingual Evaluation
 A Multidialectal Parallel Corpus of Arabic
 Transfer Learning of Feedback Head Expressions in Danish and Polish Comparable Multimodal Corpora
 Comparing Two Acquisition Systems for Automatically Building an English―Croatian Parallel Corpus from Multilingual Websites
 Hashtag Occurrences, Layout and Translation: a Corpus-Driven Analysis of Tweets Published by the Canadian Government
 On the Origin of Errors: a Fine-Grained Analysis of MT and PE Errors and their Relationship
 YouDACC: the Youtube Dialectal Arabic Comment Corpus
 Improving the Exploitation of Linguistic Annotations in ELAN
 Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature
 NASTIA: Negotiating Appointment Setting Interface
 Applying Accessibility-Oriented Controlled Language (CL) Rules to Improve Appropriateness of Text Alternatives for Images: an Exploratory Study
 The DIRHA simulated corpus
 High Quality Word Lists as a Resource for Multiple Purposes
 ISLEX ― a Multilingual Web Dictionary
 Exploiting Catenae in a Parallel Treebank Alignment
 Multiple Choice Question Corpus Analysis for Distractor Characterization
 Euronews: a Multilingual Speech Corpus for ASR
 Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and a Network-based ASR System
 How to Construct a Multi-Lingual Domain Ontology
 Mining Online Discussion Forums for Metaphors
 TALC-Sef a Manually-revised POS-Tagged Literary Corpus in Serbian, English and French
 The Development of the Multilingual LUNA Corpus for Spoken Language System Porting
 An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora
 Quality Estimation for Synthetic Parallel Data Generation
 Representing Multilingual Data as Linked Data: the Case of Babelnet 2.0
 A Framework for Compiling High Quality Knowledge Resources from Raw Corpora
 Extending Heideltime for Temporal Expressions Referring to Historic Dates
 Enabling Language Resources to Expose Translations as Linked Data on the Web
 Multilingual Extended WordNet Knowledge Base: Semantic Parsing and Translation of Glosses
 A Comparison of MT Errors and ESL Errors
 HamleDT 2.0: Thirty Dependency Treebanks Stanfordized
 Building the Sense-Tagged Multilingual Parallel Corpus
 A Hindi-English Code-Switching Corpus
 Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts
 RECSA: Resource for Evaluating Cross-Lingual Semantic Annotation
 
 |  
 | Multimedia Document Processing | Multimodal Corpora for Silent Speech Interaction
 Extending Standoff Annotation
 Expanding N-Gram Analytics in ELAN and a Case Study for Sign Synthesis
 TVD: a Reproducible and Multiply Aligned TV Series Dataset
 New Functions for a Multipurpose Multimodal Tool for Phonetic and Linguistic Analysis of Very Large Speech Corpora
 
 |        
 
 | P |  
 | Parsing | A Gold Standard Dependency Corpus for English
 Boosting the Creation of a Treebank
 A System for Experiments with Dependency Parsers
 Improving Open Relation Extraction Via Sentence Re-Structuring
 Universal Stanford Dependencies: a Cross-Linguistic Typology
 Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing
 Incorporating Alternate Translations Into English Translation Treebank
 Pre-Ordering of Phrase-based Machine Translation Input in Translation Workflow
 Towards Building a Kashmiri Treebank: Setting Up the Annotation Pipeline
 Information Extraction from German Patient Records Via Hybrid Parsing and Relation Extraction Strategies
 Parsing Heterogeneous Corpora with a Rich Dependency Grammar
 Mapping Diatopic and Diachronic Variation in Spoken Czech: the Ortofon and Dialekt Corpora
 The Norwegian Dependency Treebank
 All Fragments Count in Parser Evaluation
 A Persian Treebank with Stanford Typed Dependencies
 A Japanese Word Dependency Corpus
 Converting an HPSG-based Treebank Into Its Parallel Dependency-based Treebank
 Legal Aspects of Text Mining
 Treelet Probabilities for HPSG Parsing and Error Correction
 Swift Aligner, a Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer
 Projection-based Annotation of a Polish Dependency Treebank
 Towards an Encyclopedia of Compositional Semantics: Documenting the Interface of the English Resource Grammar
 The CMU Metal Farsi NLP Approach
 The SETimes.HR Linguistically Annotated Corpus of Croatian
 Croatian Dependency Treebank 2.0: New Annotation Guidelines for Improved Parsing
 Constituency Parsing of Bulgarian: Word- Vs Class-based Parsing
 An Out-Of-Domain Test Suite for Dependency Parsing of German
 Automatically Enriching Spoken Corpora with Syntactic Information for Linguistic Studies
 Because Size Does Matter: the Hamburg Dependency Treebank
 Dependency Parsing Representation Effects on the Accuracy of Semantic Applications ― an Example of an Inflective Language
 HamleDT 2.0: Thirty Dependency Treebanks Stanfordized
 Validation Issues Induced by an Automatic Pre-Annotation Mechanism in the Building of Non-Projective Dependency Treebanks
 Bidirectionnal Converter Between Syntactic Annotations: from French Treebank Dependencies to Passage Annotations, and Back
 
 |  
 | Part-of-Speech Tagging | PoliTa: a Multitagger for Polish
 DeLex, a Freely-Avaible, Large-Scale and Linguistically Grounded Morphological Lexicon for German
 The Kiezdeutsch Korpus (KiDKo) Release 1.0
 ColLex.EN: Automatically Generating and Evaluating a Full-Form Lexicon for English
 Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development
 Finite-State Morphological Transducers for Three Kypchak Languages
 Using Transfer Learning to Assist Exploratory Corpus Annotation
 Szeged Corpus 2.5: Morphological Modifications in a Manually POS-Tagged Hungarian Corpus
 The CLE Urdu POS Tagset
 Adapting Freely Available Resources to Build an Opinion Mining Pipeline in Portuguese
 Using Stem-Templates to Improve Arabic POS and Gender / Number Tagging
 CoRoLa ― The Reference Corpus of Contemporary Romanian Language
 The LIMA Multilingual Analyzer Made Free: FLOSS Resources Adaptation and Correction
 Bootstrapping Term Extractors for Multiple Languages
 The Gulf of Guinea Creole Corpora
 A Corpus of European Portuguese Child and Child-Directed Speech
 A Tagged Corpus and a Tagger for Urdu
 Talapi ― a Thai Linguistically Annotated Corpus for Language Processing
 Language Resource Addition: Dictionary Or Corpus?
 The SETimes.HR Linguistically Annotated Corpus of Croatian
 Activ-Es: a Comparable, Cross-Dialect Corpus of everyday Spanish from Argentina, Mexico, and Spain
 TALC-Sef a Manually-revised POS-Tagged Literary Corpus in Serbian, English and French
 Morfeusz Reloaded
 SwissAdmin: a Multilingual Tagged Parallel Corpus of Press Releases
 Standardisation and Interoperation of Morphosyntactic and Syntactic Annotation Tools for Spanish and Their Annotations
 A 500 Million Word POS-Tagged Icelandic Corpus
 Macrosyntactic Segmenters of a French Spoken Corpus
 KoKo: an L1 Learner Corpus for German
 
 |  
 | Person Identification | Sockpuppet Detection in Wikipedia: a Corpus of Real-World Deceptive Writing for Linking Identities
 An Effortless Way to Create Large-Scale Datasets for Famous Speakers
 Comparison of Gender- and Speaker-Adaptive Emotion Recognition
 German Alcohol Language Corpus - the Question of Dialect
 
 |  
 | Phonetic Databases, Phonology | On the Use of a Fuzzy Classifier to Speed Up the Sp_ToBI Labeling of the Glissando Spanish Corpus
 Using a Machine Learning Model to Assess the Complexity of Stress Systems
 The Nijmegen Corpus of Casual Czech
 Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary
 Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling
 GRASS: the Graz Corpus of Read and Spontaneous Speech
 Design and Development of an RDB Version of the Corpus of Spontaneous Japanese
 Glàff, a Large Versatile French Lexicon
 C-Phonogenre: a 7-Hours Corpus of 7 Speaking Styles in French: Relations Between Situational Features and Prosodic Properties
 
 |  
 | Profiling | CLiPS Stylometry Investigation (CSI) Corpus: a Dutch Corpus for the Detection of Age, Gender, Personality, Sentiment and Deception in Text
 How to Use Less Features and Reach Better Performance in Author Gender Identification
 Modeling and Evaluating Dialog Success in the Last Minute Corpus
 Recognising Suicidal Messages in Dutch Social Media
 
 |  
 | Prosody | ALICO: a Multimodal Corpus for the Study of Active Listening
 A Cross-Language Corpus for Studying the Phonetics and Phonology of Prominence
 Praaline: Integrating Tools for Speech Corpus Research
 Evaluating Improvised Hip Hop Lyrics - Challenges and Observations
 Eliciting and Annotating Uncertainty in Spoken Language
 Teenage and Adult Speech in School Context: Building and Processing a Corpus of European Portuguese
 Prosodic, Syntactic, Semantic Guidelines for Topic Structures Across Domains and Corpora
 Annotation Pro + Tga: Automation of Speech Timing Analysis
 New Spanish Speech Corpus Database for the Analysis of People Suffering from Parkinson's Disease
 Towards Automatic Transformation Between Different Transcription Conventions: Prediction of Intonation Markers from Linguistic and Acoustic Features
 RSS-TOBI - a Prosodically Enhanced Romanian Speech Corpus
 Using Audio Books for Training a Text-To-Speech System
 Assessment of Non-Native Prosody for Spanish as L2 Using Quantitative Scores and Perceptual Evaluation
 C-Phonogenre: a 7-Hours Corpus of 7 Speaking Styles in French: Relations Between Situational Features and Prosodic Properties
 The Extended Dirndl Corpus as a Resource for Coreference and Bridging Resolution
 New Functions for a Multipurpose Multimodal Tool for Phonetic and Linguistic Analysis of Very Large Speech Corpora
 DisMo: a Morphosyntactic, Disfluency and Multi-Word Unit Annotator. an Evaluation on a Corpus of French Spontaneous and Read Speech
 Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units
 
 |      
 
 | S |  
 | Semantic Web | Accommodations in Tuscany as Linked Data
 The DWAN Framework: Application of a Web Annotation Framework for the General Humanities to the Domain of Language Resources
 A Meta-Data Driven Platform for Semi-Automatic Configuration of Ontology Mediators
 N-Gram Counts and Language Models from the Common Crawl
 TMO ― the Federated Ontology of the TRENDMINER Project
 A SKOS-based Schema for TEI encoded Dictionaries at ICLTT
 Efficient Reuse of Structured and Unstructured Resources for Ontology Population
 Linked Open Data and Web Corpus Data for Noun Compound Bracketing
 Newsreader: Recording History from Daily News Streams
 Discovering and Visualising Stories in News
 From Natural Language to Ontology Population in the Cultural Heritage Domain. a Computational Linguistics-based Approach.
 NIF4OGGD - NLP Interchange Format for Open German Governmental Data
 The LRE Map Disclosed
 Representing Multilingual Data as Linked Data: the Case of Babelnet 2.0
 VOAR: A Visual and Integrated Ontology Alignment Environment
 
 |  
 | Semantics | PropBank: Semantics of New Predicate Types
 Reusing Swedish Framenet for Training Semantic Roles
 A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions
 Image Annotation with ISO-Space: Distinguishing Content from Structure
 Semantic Approaches to Software Component Retrieval with English Queries
 Definition Patterns for Predicative Terms in Specialized Lexical Resources
 The Making of Ancient Greek WordNet
 Augmenting English Adjective Senses with Supersenses
 Evaluation of Simple Distributional Compositional Operations on Longer Texts
 Relating Frames and Constructions in Japanese Framenet
 Crowdsourcing for the Identification of Event Nominals: an Experiment
 Tharwa: a Large Scale Dialectal Arabic - Standard Arabic - English Lexicon
 Semantic Technologies for Querying Linguistic Annotations: an Experiment Focusing on Graph-Structured Data
 A Unified Annotation Scheme for the Semantic / Pragmatic Components of Definiteness
 Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction
 On the Reliability and Inter-Annotator Agreement of Human Semantic MT Evaluation Via HMEANT
 Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
 Adapting VerbNet to French Using Existing Resources
 Corpus-based Computation of Reverse Associations
 Annotating Relation Mentions in Tabloid Press
 Mapping Diatopic and Diachronic Variation in Spoken Czech: the Ortofon and Dialekt Corpora
 Constructing a Corpus of Japanese Predicate Phrases for Synonym / Antonym Relations
 Distributed Distributional Similarities of Google Books Over the Centuries
 How to Tell a Schneemann from a Milchmann: an Annotation Scheme for Compound-Internal Relations
 Construction of Diachronic Ontologies from People's Daily of Fifty Years
 Resources in Conflict: a Bilingual Valency Lexicon Vs. a Bilingual Treebank Vs. a Linguistic Theory
 Buy One Get One Free: Distant Annotation of Chinese Tense, Event Type and Modality
 Building a Reference Lexicon for Countability in English
 Not an Interlingua, But Close: Comparison of English AMRs to Chinese and Czech
 WordNet―Wikipedia―Wiktionary: Construction of a Three-Way Alignment
 Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch
 Discovering Frames in Specialized Domains
 Resources for the Detection of Conventionalized Metaphors in Four Languages
 Annotation of Computer Science Papers for Semantic Relation Extraction
 Using C5.0 and Exhaustive Search for Boosting Frame-Semantic Parsing Accuracy
 Automatic Semantic Relation Extraction from Portuguese Texts
 Lexical Substitution Dataset for German
 Polysemy Index for Nouns: an Experiment on Italian Using the Parole Simple CLiPS Lexical Database
 Manual Analysis of Structurally Informed Reordering in German-English Machine Translation
 Criteria for Identifying and Annotating Caused Motion Constructions in Corpus Data
 Web-Imageability of the Behavioral Features of Basic-Level Concepts
 Semi-Compositional Method for Synonym Extraction of Multi-Word Terms
 From Synsets to Videos: Enriching Italwordnet Multimodally
 Mining Online Discussion Forums for Metaphors
 Classifying Inconsistencies in DBpedia Language Specific Chapters
 Flow Graph Corpus from Recipe Texts
 To Pay Or to Get Paid: Enriching a Valency Lexicon with Diatheses
 Annotating the Focus of Negation in Japanese Text
 Less is More? Towards a Reduced Inventory of Categories for Training a Parser for the Italian Stanford Dependencies
 Combining Dependency Information and Generalization in a Pattern-based Approach to the Classification of Lexical-Semantic Relation Instances
 Dependency Parsing Representation Effects on the Accuracy of Semantic Applications ― an Example of an Inflective Language
 Extending the Coverage of a MWE Database for Persian CPs Exploiting Valency Alternations
 Single Classifier Approach for Verb Sense Disambiguation Based on Generalized Features
 An Analysis of Ambiguity in Word Sense Annotations
 SANA: a Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis
 Word Semantic Similarity for Morphologically Rich Languages
 Focusing Annotation for Semantic Role Labeling
 
 |  
 | Sign Language Recognition/Generation | SLMotion - an Extensible Sign Language Oriented Video Analysis Tool
 Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather
 Expanding N-Gram Analytics in ELAN and a Case Study for Sign Synthesis
 LinkedHealthAnswers: Towards Linked Data-driven Question Answering for the Health Care Domain
 
 |  
 | Social Media Processing | A Corpus of Comparisons in Product Reviews
 A Corpus of Participant Roles in Contentious Discussions
 Modelling Irony in Twitter: Feature Analysis and Evaluation
 Getting Reliable Annotations for Sarcasm in Online Dialogues
 Finding Romanized Arabic Dialect in Code-Mixed Tweets
 Votter Corpus: a Corpus of Social Polling Language
 A German Twitter Snapshot
 SenTube: a Corpus for Sentiment Analysis on Youtube Social Media
 Simple Effective Microblog Named Entity Recognition: Arabic as an Example
 An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
 The Dangerous Myth of the Star System
 Crowdsourcing and Annotating NER for Twitter #drift
 When POS Data Sets Don't Add Up: Combatting Sample Bias
 Benchmarking Twitter Sentiment Analysis Tools
 Comprehensive Annotation of Multiword Expressions in a Social Web Corpus
 Building a Crisis Management Term Resource for Social Media: the Case of Floods and Protests
 Who Cares About Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.
 Named Entity Corpus Construction Using Wikipedia and DBpedia Ontology
 Towards Shared Datasets for Normalization Research
 Nomad: Linguistic Resources and Tools Aimed at Policy Formulation and Validation
 TweetCaT: a Tool for Building Twitter Corpora of Smaller Languages
 A Framework for Public Health Surveillance
 
 |  
 | Speech Recognition/Understanding | The ETAPE Speech Processing Evaluation
 Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks
 Automatically Enriching Spoken Corpora with Syntactic Information for Linguistic Studies
 ASR-based CALL Systems and Learner Speech Data: New Resources and Opportunities for Research and Development in Second Language Learning
 Ciempiess: a New Open-Sourced Mexican Spanish Radio Corpus
 Speech Recognition Web Services for Dutch
 A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition
 Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License
 The DIRHA simulated corpus
 TUKE-BNews-SK: Slovak Broadcast News Corpus Construction and Evaluation
 Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and a Network-based ASR System
 The Slovene BNSI Broadcast News Database and Reference Speech Corpus GOS: Towards the Uniform Guidelines for Future Work
 A Toolkit for Efficient Learning of Lexical Units for Speech Recognition
 Basque Speecon-Like and Basque Speechdat Mdb-600: Speech Databases for the Development of ASR Technology for Basque
 Using a Serious Game to Collect a Child Learner Speech Corpus
 A LDA-based Topic Classification Approach from highly Imperfect Automatic Transcriptions
 Exploiting the Large-Scale German Broadcast Corpus to Boost the Fraunhofer IAIS Speech Recognition System
 El-Woz: a Client-Server Wizard-Of-Oz Interface
 
 |  
 | Speech Resource/Database | Phoneme Set Design Using English Speech Database by Japanese for Dialogue-based English CALL Systems
 Croatian Memories
 Designing the Latvian Speech Recognition Corpus
 The Kiezdeutsch Korpus (KiDKo) Release 1.0
 The RATS Collection: Supporting HLT Research with Degraded Audio Data
 Untrained Forced Alignment of Transcriptions and Audio for Language Documentation Corpora Using WebMAUS
 The Sweet-Home Speech and Multimodal Corpus for Home Automation Interaction
 Collection of a Simultaneous Translation Corpus for Comparative Analysis
 The Database for Spoken German ― DGD2
 SAVAS: Collecting, Annotating and Sharing Audiovisual Language Resources for Automatic Subtitling
 Speech Recognition Web Services for Dutch
 ML-Optimization of Ported Constraint Grammars
 Phone Boundary Annotation in Conversational Speech
 The Research and Teaching Corpus of Spoken German ― Folk
 Free Acoustic and Language Models for Large Vocabulary Continuous Speech Recognition in Swedish
 An Effortless Way to Create Large-Scale Datasets for Famous Speakers
 Rhapsodie: a Prosodic-Syntactic Treebank for Spoken French
 GRASS: the Graz Corpus of Read and Spontaneous Speech
 German Alcohol Language Corpus - the Question of Dialect
 Development of a TV Broadcasts Speech Recognition System for Qatari Arabic
 Design and Development of an RDB Version of the Corpus of Spontaneous Japanese
 Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech
 Semi-Automatic Annotation of the Ucu Accents Speech Corpus
 AusTalk: an Audio-Visual Corpus of Australian English
 The Sspnet-Mobile Corpus: Social Signal Processing Over Mobile Phones.
 Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather
 Mapping CPA Patterns onto OntoNotes Senses
 Voce Corpus: Ecologically Collected Speech Annotated with Physiological and Psychological Stress Assessments
 A Multimodal Corpus of Rapid Dialogue Games
 Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis.
 CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis
 Basque Speecon-Like and Basque Speechdat Mdb-600: Speech Databases for the Development of ASR Technology for Basque
 Erlangen-CLP: A Large Annotated Corpus of Speech from Children with Cleft Lip and Palate
 Using a Serious Game to Collect a Child Learner Speech Corpus
 Using Audio Books for Training a Text-To-Speech System
 Discovering the Italian Literature: Interactive Access to Audio Indexed Text Resources
 VOLIP: a Corpus of Spoken Italian and a Virtuous Example of Reuse of Linguistic Resources
 A Hindi-English Code-Switching Corpus
 El-Woz: a Client-Server Wizard-Of-Oz Interface
 Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain
 
 |  
 | Speech Synthesis | The MMASCS Multi-Modal Annotated Synchronous Corpus of Audio, Video, Facial Motion and Tongue Motion Data of Normal, Fast and Slow Speech
 Texafon 2.0: a Text Processing Tool for the Generation of Expressive Speech in TTS Applications
 RSS-TOBI - a Prosodically Enhanced Romanian Speech Corpus
 A Flexible Language Learning Platform Based on Language Resources and Web Services
 
 |  
 | Standards for LRs | On Paraphrase Identification Corpora
 Image Annotation with ISO-Space: Distinguishing Content from Structure
 N-Gram Counts and Language Models from the Common Crawl
 Benchmarking of English-Hindi Parallel Corpora
 RELISH LMF: Unlocking the Full Power of the Lexical Markup Framework
 The CMD Cloud
 Interoperability of Dialogue Corpora Through ISO 24617-2-based Querying
 A Benchmark Database of Phonetic Alignments in Historical Linguistics and Dialectology
 Using TEI, CMDI and ISOcat in CLARIN-DK
 Legal Aspects of Text Mining
 Towards an Integration of Syntactic and Temporal Annotations in Estonian
 Adapting a Part-Of-Speech Tagset to Non-Standard Text: the Case of STTS
 An Open Source Part-Of-Speech Tagger for Norwegian: Building on Existing Language Resources
 Vulnerability in Acquisition, Language Impairments in Dutch: Creating a Valid Data Archive
 Facing the Identification Problem in Language-Related Scientific Data Analysis.
 Off-Road LAF: Encoding and Processing Annotations in NLP Workflows
 
 |  
 | Statistical and Machine Learning Methods | Gold-Standard for Topic-Specific Sentiment Analysis of Economic Texts
 Semantic Approaches to Software Component Retrieval with English Queries
 Missed Opportunities in Translation Memory Matching
 Use of Unsupervised Word Classes for Entity Recognition: Application to the Detection of Disorders in Clinical Reports
 ColLex.EN: Automatically Generating and Evaluating a Full-Form Lexicon for English
 Event Extraction Using Distant Supervision
 A Vector Space Model for Syntactic Distances Between Dialects
 The Av-Lasyn Database: a Synchronous Corpus of Audio and 3D Facial Marker Data for Audio-Visual Laughter Synthesis
 Exploring and Visualizing Variation in Language Resources
 SLMotion - an Extensible Sign Language Oriented Video Analysis Tool
 Boosting the Creation of a Treebank
 Improvements to Dependency Parsing Using Automatic Simplification of Data
 From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
 Comparison of Gender- and Speaker-Adaptive Emotion Recognition
 Disambiguating Verbs by Collocation: Corpus Lexicography Meets Natural Language Processing
 GenitivDB ― a Corpus-Generated Database for German Genitive Classification
 3D Face Tracking and Multi-Scale, Spatio-Temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL
 All Fragments Count in Parser Evaluation
 A Language-Independent Approach to Extracting Derivational Relations from an Inflectional Lexicon
 Bring vs. MTRoget: Evaluating Automatic Thesaurus Translation
 Latent Semantic Analysis Models on Wikipedia and TASA
 Shata-Anuvadak: Tackling Multiway Translation of Indian Languages
 Narrowing the Gap Between Termbases and Corpora in Commercial Environments
 Machine Translationness: Machine-Likeness in Machine Translation Evaluation
 Using C5.0 and Exhaustive Search for Boosting Frame-Semantic Parsing Accuracy
 Projection-based Annotation of a Polish Dependency Treebank
 A Deep Context Grammatical Model for Authorship Attribution
 DINASTI: Dialogues with a Negotiating Appointment Setting Interface
 LQVSumm: a Corpus of Linguistic Quality Violations in Multi-Document Summarization
 Choosing Which to Use? A Study of Distributional Models for Nominal Lexical Semantic Classification
 Estimation of Speaking Style in Speech Corpora Focusing on Speech Transcriptions
 A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
 Metadata as Linked Open Data: Mapping Disparate XML Metadata Registries Into One RDF / OWL Registry.
 Hindi to English Machine Translation: Using Effective Selection in Multi-Model SMT
 New Spanish Speech Corpus Database for the Analysis of People Suffering from Parkinson's Disease
 Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
 Crowd-Sourcing Evaluation of Automatically Acquired, Morphologically Related Word Groupings
 A Language-Independent and Fully Unsupervised Approach to Lexicon Induction and Part-Of-Speech Tagging for Closely Related Languages
 Quality Estimation for Synthetic Parallel Data Generation
 Online Optimisation of Log-Linear Weights in Interactive Machine Translation
 Finding a Tradeoff Between Accuracy and Rater's Workload in Grading Clustered Short Answers
 Evaluation of Technology Term Recognition with Random Indexing
 Utilizing Constituent Structure for Compound Analysis
 
 |  
 | Summarisation | Building a Dataset for Summarization and Keyword Extraction from Emails
 The Polish Summaries Corpus
 The Impact of Cohesion Errors in Extraction Based Summaries
 Out in the Open: Finding and Categorising Errors in the Lexical Simplification Pipeline
 Locating Requests Among Open Source Software Communication Messages
 How Could Veins Speed Up the Process of Discourse Parsing
 A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization
 
 |    
 
 | T |  
 | Text Mining | Gold-Standard for Topic-Specific Sentiment Analysis of Economic Texts
 HiEve: A Corpus for Extracting Event Hierarchies from News Stories
 Co-Clustering of Bilingual Datasets as a Mean for Assisting the Construction of Thematic Bilingual Comparable Corpora
 Enrichment of Bilingual Dictionary Through News Stream Data
 Event Extraction Using Distant Supervision
 SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
 Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction
 Annotating Inter-Sentence Temporal Relations in Clinical Notes
 Tools for Arabic Natural Language Processing: a Case Study in Qalqalah Prosody
 Dual Subtitles as Parallel Corpora
 Variations on Quantitative Comparability Measures and Their Evaluations on Synthetic French-English Comparable Corpora
 Linking Pictographs to Synsets: Sclera2Cornetto
 Information Extraction from German Patient Records Via Hybrid Parsing and Relation Extraction Strategies
 Constructing a Corpus of Japanese Predicate Phrases for Synonym / Antonym Relations
 On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter
 Automatic Semantic Relation Extraction from Portuguese Texts
 Creating a Gold Standard Corpus for the Extraction of Chemistry-Disease Relations from Patent Texts
 Estimation of Speaking Style in Speech Corpora Focusing on Speech Transcriptions
 AraNLP: a Java-based Library for the Processing of Arabic Text
 Who Cares About Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.
 Coreference Resolution for Latvian
 Ranking Job Offers for Candidates: Learning Hidden Knowledge from Big Data
 Clustering Tweets Using Wikipedia Concepts
 Hot Topics and Schisms in NLP: Community and Trend Analysis with Saffron on ACL and LREC Proceedings
 The American Local News Corpus
 When Transliteration Met Crowdsourcing: an Empirical Study of Transliteration Via Crowdsourcing Using Efficient, Non-Redundant and Fair Quality Control
 
 |  
 | Textual Entailment and Paraphrasing | On Paraphrase Identification Corpora
 Multimodal Dialogue Segmentation with Gesture Post-Processing
 A SICK Cure for the Evaluation of Compositional Distributional Semantic Models
 Semantic Clustering of Pivot Paraphrases
 The Multilingual Paraphrase Database
 Annotating the Focus of Negation in Japanese Text
 Improving Evaluation of English-Czech MT Through Paraphrasing
 
 |  
 | Tools, Systems, Applications | VERTa: Facing a Multilingual Experience of a Linguistically-based MT Evaluation
 Accommodations in Tuscany as Linked Data
 Discovering the Italian Literature: Interactive Access to Audio Indexed Text Resources
 Using Stem-Templates to Improve Arabic POS and Gender / Number Tagging
 A Meta-Data Driven Platform for Semi-Automatic Configuration of Ontology Mediators
 The Ellogon Pattern Engine: Context-Free Grammars Over Annotations
 Missed Opportunities in Translation Memory Matching
 Native Language Identification Using Large, Longitudinal Data
 Enriching ODIN
 Creating Summarization Systems with SUMMA
 Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development
 MomResp: a Bayesian Model for Multi-Annotator Document Labeling
 Towards Automatic Detection of Narrative Structure
 A Method for Building Burst-Annotated Co-Occurrence Networks for Analysing Trends in Textual Data
 Annotating Inter-Sentence Temporal Relations in Clinical Notes
 Refractive: an Open Source Tool to Extract Knowledge from Syntactic and Semantic Relations
 A Finite-State Morphological Analyzer for a Lakota HPSG Grammar
 Motàmot Project: Conversion of a French-Khmer Published Dictionary for Building a Multilingual Lexical System
 Just.Ask, a QASystem That Learns to Answer New Questions from Previous Interactions
 Open-Domain Interaction and Online Content in the Sami Language
 Guampa: a Toolkit for Collaborative Translation
 RELISH LMF: Unlocking the Full Power of the Lexical Markup Framework
 Ciempiess: a New Open-Sourced Mexican Spanish Radio Corpus
 Exploring and Visualizing Variation in Language Resources
 A New Form of Humor ― Mapping Constraint-based Computational Morphologies to a Finite-State Representation
 A Multi-Cultural Repository of Automatically Discovered Linguistic and Conceptual Metaphors
 First Approach Toward Semantic Role Labeling for Basque
 Aligning Parallel Texts with Intertext
 Extending Standoff Annotation
 Turkish Resources for Visual Word Recognition
 ROOTS: a Toolkit for Easy, Fast and Consistent Processing of Large Sequential Annotated Data Collections
 Constructing and Exploiting an Automatically Annotated Resource of Legislative Texts
 The EASR Corpora of European Portuguese, French, Hungarian and Polish Elderly Speech
 Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling
 HFST-SweNER ― a New NER Resource for Swedish
 Introducing a Framework for the Evaluation of Music Detection Tools
 Detecting Document Structure in a Very Large Corpus of UK Financial Reports
 Latent Semantic Analysis Models on Wikipedia and TASA
 Sprinter: Language Technologies for Interactive and Multimedia Language Learning
 Bilingual Dictionary Induction as an Optimization Problem
 A Set of Open Source Tools for Turkish Natural Language Processing
 The Procedure of Lexico-Semantic Annotation of Składnica Treebank
 French Resources for Extraction and Normalization of Temporal Expressions with Heideltime
 CLARIN-NL: Major Results
 Machine Translation for Subtitling: a Large-Scale Evaluation
 The N2 Corpus: a Semantically Annotated Collection of Islamist Extremist Stories
 Benchmarking Twitter Sentiment Analysis Tools
 Corpus Annotation Through Crowdsourcing: Towards Best Practice Guidelines
 Machine Translationness: Machine-Likeness in Machine Translation Evaluation
 Towards an Environment for the Production and the Validation of Lexical Semantic Resources
 The Distress Analysis Interview Corpus of Human and Computer Interviews
 Representing Multimodal Linguistic Annotated Data
 Comparative Analysis of Portuguese Named Entities Recognition Tools
 Collocation Or Free Combination? ― Applying Machine Translation Techniques to Identify Collocations in Japanese
 The Wavesurfer Automatic Speech Recognition Plugin
 A Cascade Approach for Complex-Type Classification
 Online Experiments with the Percy Software Framework - Experiences and Some Early Results
 Improving the Exploitation of Linguistic Annotations in ELAN
 Sentence Rephrasing for Parsing Sentences with OOV Words
 Clinical Data-Driven Probabilistic Graph Processing
 A Compact Interactive Visualization of Dependency Treebank Query Results
 ILLINOISCLOUDNLP: Text Analytics Services in the Cloud
 Reconstructing the Semantic Landscape of Natural Language Processing
 High Quality Word Lists as a Resource for Multiple Purposes
 Language Processing Infrastructure in the XLike Project
 Sharing Resources Between Free / Open-Source Rule-based Machine Translation Systems: Grammatical Framework and Apertium
 A Stream Computing Approach Towards Scalable NLP
 Hindi to English Machine Translation: Using Effective Selection in Multi-Model SMT
 Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard
 A Model to Generate Adaptive Multimodal Job Interviews with a Virtual Recruiter
 Identification of Technology Terms in Patents
 A Toolkit for Efficient Learning of Lexical Units for Speech Recognition
 Rule-based Reordering Space in Statistical Machine Translation
 TVD: a Reproducible and Multiply Aligned TV Series Dataset
 The Halliday Centre Tagger: an Online Platform for Semi-Automatic Text Annotation and Analysis
 Heuristic Hyper-Minimization of Finite State Lexicons
 Nomad: Linguistic Resources and Tools Aimed at Policy Formulation and Validation
 Standardisation and Interoperation of Morphosyntactic and Syntactic Annotation Tools for Spanish and Their Annotations
 The Tutorbot Corpus ― a Corpus for Studying Tutoring Behaviour in Multiparty Face-To-Face Spoken Dialogue
 Combining Dependency Information and Generalization in a Pattern-based Approach to the Classification of Lexical-Semantic Relation Instances
 An Exercise in Reuse of Resources: Adapting General Discourse Coreference Resolution for Detecting Lexical Chains in Patent Documentation
 VOAR: A Visual and Integrated Ontology Alignment Environment
 DisMo: a Morphosyntactic, Disfluency and Multi-Word Unit Annotator. An Evaluation on a Corpus of French Spontaneous and Read Speech
 Integration of Workflow and Pipeline for Language Service Composition
 Large Scale Arabic Error Annotation: Guidelines and Framework
 MAT: a Tool for L2 Pronunciation Errors Annotation
 Taalportaal: an Online Grammar of Dutch and Frisian
 A Framework for Public Health Surveillance
 Languagesindanger.eu - Including Multimedia Language Resources to Disseminate Knowledge and Create Educational Material on Less-Resourced Languages
 
 |  
 | Topic Detection & Tracking | Extracting Information for Context-Aware Meeting Preparation
 NewsReader: Recording History from Daily News Streams
 The Slovak Categorized News Corpus
 Clustering Tweets Using Wikipedia Concepts
 Hot Topics and Schisms in NLP: Community and Trend Analysis with Saffron on ACL and LREC Proceedings
 A Modular System for Rule-based Text Categorisation
 
 |  
 | Typological Databases | Etymological WordNet: Tracing the History of Words
 Language Collage: Grammatical Description with the LinGO Grammar Matrix
 
 |        
 |  |