TOPICS: Browse articles of the conference sorted by topic

A - C - D - E - G - H - I - K - L - M - N - O - P - Q - S - T - U - V - W

A
Acquisition
Combining Elicited Imitation and Fluency Features for Oral Proficiency Measurement
FLELex: a Graded Lexical Resource for French Foreign Learners
Automatic Methods for the Extension of a Bilingual Dictionary Using Comparable Corpora
ASR-based CALL Systems and Learner Speech Data: New Resources and Opportunities for Research and Development in Second Language Learning
The Dutch LESLLA Corpus
ML-Optimization of Ported Constraint Grammars
Modeling and Evaluating Dialog Success in the LAST MINUTE Corpus
Transliteration and Alignment of Parallel Texts from Cyrillic to Latin
Exploring Factors That Contribute to Successful Fingerspelling Comprehension
Comparing Two Acquisition Systems for Automatically Building an English―Croatian Parallel Corpus from Multilingual Websites
YouDACC: the Youtube Dialectal Arabic Comment Corpus
Enriching the "Senso Comune" Platform with Automatically Acquired Data
NASTIA: Negotiating Appointment Setting Interface
The MERLIN Corpus: Learner Language and the CEFR
Comparing Similarity Measures for Distributional Thesauri
The Use of a Filemaker Pro Database in Evaluating Sign Language Notation Systems
Towards Electronic SMS Dictionary Construction: an Alignment-based Approach
Student Achievement and French Sentence Repetition Test Scores
Bilingual Dictionaries for All EU Languages
Recent Developments in DeReKo
Generating Polarity Lexicons with WordNet Propagation in 5 Languages
Anaphora, Coreference
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures
Genres in the Prague Discourse Treebank
Corpus for Coreference Resolution on Scientific Papers
The DARE Corpus: a Resource for Anaphora Resolution in Dialogue Based Intelligent Tutoring Systems
Language Resources and Annotation Tools for Cross-Sentence Relation Extraction
How Could Veins Speed Up the Process of Discourse Parsing
IXA Pipeline: Efficient and Ready to Use Multilingual NLP Tools
Using a Sledgehammer to Crack a Nut? Lexical Diversity and Event Coreference Resolution
Exploring the Utility of Coreference Chains for Improved Identification of Personal Names
Polish Coreference Corpus in Numbers
Authoring Tools
Constructing and Exploiting an Automatically Annotated Resource of Legislative Texts
Applying Accessibility-Oriented Controlled Language (CL) Rules to Improve Appropriateness of Text Alternatives for Images: an Exploratory Study
A Large-Scale Evaluation of Pre-Editing Strategies for Improving User-Generated Content Translation
TermWise: A CAT-tool with Context-Sensitive Terminological Support.
UM-Corpus: a Large English-Chinese Parallel Corpus for Statistical Machine Translation

 

C
Cognitive Methods
Semi-Supervised Methods for Expanding Psycholinguistics Norms by Integrating Distributional Similarity with the Structure of WordNet
#mygoal: Finding Motivations on Twitter
A Graph-based Approach for Computing Free Word Associations
Design and Development of an Online Computational Framework to Facilitate Language Comprehension Research on Indian Languages
Mining a Multimodal Corpus for Non-Verbal Behavior Sequences Conveying Attitudes
Turkish Resources for Visual Word Recognition
Collaborative Resource Construction
The DWAN Framework: Application of a Web Annotation Framework for the General Humanities to the Domain of Language Resources
Collaboratively Annotating Multilingual Parallel Corpora in the Biomedical Domain―Some Mantras
Mapping Between English Strings and Reentrant Semantic Graphs
Developing Text Resources for Ten South African Languages
Zmorge: a German Morphological Lexicon Extracted from Wiktionary
Evaluating Lemmatization Models for Machine-Assisted Corpus-Dictionary Linkage
Digital Library 2.0: Source of Knowledge and Research Collaboration Platform
Linguistic Landscaping of South Asia Using Digital Language Resources: Genetic Vs. Areal Linguistics
SAVAS: Collecting, Annotating and Sharing Audiovisual Language Resources for Automatic Subtitling
CFT13: a Resource for Research into the Post-editing Process
Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners
A Colloquial Corpus of Japanese Sign Language: Linguistic Resources for Observing Sign Language Conversations
Can Numerical Expressions Be Simpler? Implementation and Demonstration of a Numerical Simplification System for Spanish
The eIdentity Text Exploration Workbench
Rhapsodie: a Prosodic-Syntactic Treebank for Spoken French
CLARA: A New Generation of Researchers in Common Language Resources and Their Applications
Can Crowdsourcing Be Used for Effective Annotation of Arabic?
TweetNorm_es: an Annotated Corpus for Spanish Microtext Normalization
Corpus Annotation Through Crowdsourcing: Towards Best Practice Guidelines
Towards an Environment for the Production and the Validation of Lexical Semantic Resources
Towards an Encyclopedia of Compositional Semantics: Documenting the Interface of the English Resource Grammar
MUHIT: a Multilingual Harmonized Dictionary
Pivot-based Multilingual Dictionary Building Using Wiktionary
The AMARA Corpus: Building Parallel Language Resources for the Educational Domain
Exploiting Networks in Law
Terminology Resources and Terminology Work Benefit from Cloud Services
Computer-Assisted Language Learning (CALL)
FLELex: a Graded Lexical Resource for French Foreign Learners
MAT: a Tool for L2 Pronunciation Errors Annotation
Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners
Reusing Swedish Framenet for Training Semantic Roles
A Database of Freely Written Texts of German School Students for the Purpose of Automatic Spelling Error Classification
Automatic Error Detection Concerning the Definite and Indefinite Conjugation in the Hunlearner Corpus
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
An Innovative World Language Centre: Challenges for the Use of Language Technology
Open Philology at the University of Leipzig
Controlled Languages
Presenting a System of Human-Machine Interaction for Performing Map Tasks.
Corpus (Creation, Annotation, etc.)
Statistical Analysis of Multilingual Text Corpus and Development of Language Models
Smile and Laughter in Human-Machine Interaction: a Study of Engagement
A Conventional Orthography for Tunisian Arabic
The AMARA Corpus: Building Parallel Language Resources for the Educational Domain
A Multimodal Dataset for Deception Detection
Human Annotation of ASR Error Regions: is "gravity" a Sharable Concept for Human Annotators?
Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie
Erlangen-CLP: A Large Annotated Corpus of Speech from Children with Cleft Lip and Palate
Semi-Automatic Annotation of the UCU Accents Speech Corpus
Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch
The CLE Urdu POS Tagset
Automatic Detection of Other-Repetition Occurrences: Application to French Conversational Speech
EMOVO Corpus: an Italian Emotional Speech Database
A Tagged Corpus and a Tagger for Urdu
A Multidialectal Parallel Corpus of Arabic
Identification of Multiword Expressions in the brWaC
Phone Boundary Annotation in Conversational Speech
NoSta-D Named Entity Annotation for German: Guidelines and Dataset
Mörkum Njálu. an Annotated Corpus to Analyse and Explain Grammatical Divergences Between 14th-Century Manuscripts of Njál's Saga.
Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
The Polish Summaries Corpus
Variations on Quantitative Comparability Measures and Their Evaluations on Synthetic French-English Comparable Corpora
Teenage and Adult Speech in School Context: Building and Processing a Corpus of European Portuguese
On the Importance of Text Analysis for Stock Price Prediction
A Corpus of Comparisons in Product Reviews
The IULA Spanish LSP Treebank
A System for Experiments with Dependency Parsers
Sockpuppet Detection in Wikipedia: a Corpus of Real-World Deceptive Writing for Linking Identities
ALICO: a Multimodal Corpus for the Study of Active Listening
Corpus and Method for Identifying Citations in Non-Academic Text
A Cross-Language Corpus for Studying the Phonetics and Phonology of Prominence
Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages
Collaboratively Annotating Multilingual Parallel Corpora in the Biomedical Domain―Some Mantras
On the Use of a Fuzzy Classifier to Speed Up the Sp_ToBI Labeling of the Glissando Spanish Corpus
Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing
Praaline: Integrating Tools for Speech Corpus Research
Interoperability and Customisation of Annotation Schemata in Argo
Polish Coreference Corpus in Numbers
A Gold Standard Dependency Corpus for English
A Corpus of Machine Translation Errors Extracted from Translation Students Exercises
Co-Training for Classification of Live Or Studio Music Recordings
Creating and Using Large Monolingual Parallel Corpora for Sentential Paraphrase Generation
A New Framework for Sign Language Recognition Based on 3D Handshape Identification and Linguistic Modeling
Crowdsourcing for the Identification of Event Nominals: an Experiment
Semantic Technologies for Querying Linguistic Annotations: an Experiment Focusing on Graph-Structured Data
A Hierarchical Taxonomy for Classifying Hardness of Inference Tasks
The Sweet-Home Speech and Multimodal Corpus for Home Automation Interaction
Tools for Arabic Natural Language Processing: a Case Study in Qalqalah Prosody
Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction
Automatic Creation of WordNets from Parallel Corpora
Pre-Ordering of Phrase-based Machine Translation Input in Translation Workflow
A Wikipedia-based Corpus for Contextualized Machine Translation
Motàmot Project: Conversion of a French-Khmer Published Dictionary for Building a Multilingual Lexical System
Building a Corpus of Manually Revised Texts from Discourse Perspective
Single-Person and Multi-Party 3D Visualizations for Nonverbal Communication Analysis
Interoperability of Dialogue Corpora Through ISO 24617-2-based Querying
The Database for Spoken German ― DGD2
Simple Effective Microblog Named Entity Recognition: Arabic as an Example
Priberam Compressive Summarization Corpus: a New Multi-Document Summarization Corpus for European Portuguese
The MMASCS Multi-Modal Annotated Synchronous Corpus of Audio, Video, Facial Motion and Tongue Motion Data of Normal, Fast and Slow Speech
Constructing a Chinese―Japanese Parallel Corpus from Wikipedia
Modelling Irony in Twitter: Feature Analysis and Evaluation
Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
Computational Narratology: Extracting Tense Clusters from Narrative Texts
Designing the Latvian Speech Recognition Corpus
Aligning Parallel Texts with Intertext
From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
A Corpus of Spontaneous Speech in Lectures: the KIT Lecture Corpus for Spoken Language Processing and Translation
The Pragmatic Annotation of a Corpus of Academic Lectures
Comparative Analysis of Verbal Alignment in Human-Human and Human-Agent Interactions
The eIdentity Text Exploration Workbench
Emilya: Emotional Body Expression in Daily Actions Database
The LIMA Multilingual Analyzer Made Free: FLOSS Resources Adaptation and Correction
Exploring Factors That Contribute to Successful Fingerspelling Comprehension
On the Annotation of TMX Translation Memories for Advanced Leveraging in Computer-Aided Translation
Named Entity Recognition on Turkish Tweets
On Complex Word Alignment Configurations
Linguistic Resources and Cats: How to Use ISOcat, RELcat and SCHEMAcat
Cross-Linguistic Annotation of Narrativity for English / French Verb Tense Disambiguation
Evaluating Corpora Documentation with Regards to the Ethics and Big Data Charter
Introducing a Web Application for Labeling, Visualizing Speech and Correcting Derived Speech Signals
Vocabulary-based Language Similarity Using Web Corpora
S-Pot - a Benchmark in Spotting Signs Within Continuous Signing
TweetNorm_es: an Annotated Corpus for Spanish Microtext Normalization
The Procedure of Lexico-Semantic Annotation of Składnica Treebank
A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition
Crowdsourcing as a Preprocessing for Complex Semantic Annotation Tasks
Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements
Learning from Domain Complexity
Designing a Bilingual Speech Corpus for French and German Language Learners: a Two-Step Process
Deep Syntax Annotation of the Sequoia French Treebank
Developing a French Framenet: Methodology and First Results
Innovations in Parallel Corpus Search Tools
Representing Multimodal Linguistic Annotated Data
A Corpus of European Portuguese Child and Child-Directed Speech
'interHist' - an Interactive Visual Interface for Corpus Exploration
Hashtag Occurrences, Layout and Translation: a Corpus-Driven Analysis of Tweets Published by the Canadian Government
Presenting a System of Human-Machine Interaction for Performing Map Tasks.
Hesita(Te) in Portuguese
MUHIT: a Multilingual Harmonized Dictionary
The Munich Biovoice Corpus: Effects of Physical Exercising, Heart Rate, and Skin Conductance on Human Speech Production
Conceptual Transfer: Using Local Classifiers for Transfer Selection
Annotating Arguments: the Nomad Collaborative Annotation Tool
Correcting Errors in a New Gold Standard for Tagging Icelandic Text
Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard
Named Entity Corpus Construction Using Wikipedia and DBpedia Ontology
Euronews: a Multilingual Speech Corpus for ASR
Thomas Aquinas in the Tündra: Integrating the Index Thomisticus Treebank Into Clarin-D
Towards Linked Hypernyms Dataset 2.0: Complementing DBpedia with Hypernym Discovery
The Slovene BNSI Broadcast News Database and Reference Speech Corpus GOS: Towards the Uniform Guidelines for Future Work
Language Editing Dataset of Academic Texts
Japanese Conversation Corpus for Training and Evaluation of Backchannel Prediction Model.
Aix Map Task Corpus: the French Multimodal Corpus of Task-Oriented Dialogue
Multiword Expressions in Machine Translation
CROMER: a Tool for Cross-Document Event and Entity Coreference
Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis
Classifying Inconsistencies in DBpedia Language Specific Chapters
The Halliday Centre Tagger: an Online Platform for Semi-Automatic Text Annotation and Analysis
NIF4OGGD - NLP Interchange Format for Open German Governmental Data
Verbs of Saying with a Textual Connecting Function in the Prague Discourse Treebank
A Language-Independent and Fully Unsupervised Approach to Lexicon Induction and Part-Of-Speech Tagging for Closely Related Languages
New Bilingual Speech Databases for Audio Diarization
Less is More? Towards a Reduced Inventory of Categories for Training a Parser for the Italian Stanford Dependencies
UnixMan Corpus: A Resource for Language Learning in the Unix Domain
GraPAT: a Tool for Graph Annotations
The Tutorbot Corpus ― a Corpus for Studying Tutoring Behaviour in Multiparty Face-To-Face Spoken Dialogue
TweetCaT: a Tool for Building Twitter Corpora of Smaller Languages
Re-Using an Argument Corpus to Aid in the Curation of Social Media Collections
Rapid Deployment of Phrase Structure Parsing for Related Languages: a Case Study of Insular Scandinavian
Assessment of Non-Native Prosody for Spanish as L2 Using Quantitative Scores and Perceptual Evaluation
Exploiting the Large-Scale German Broadcast Corpus to Boost the Fraunhofer IAIS Speech Recognition System
Exploring the Utility of Coreference Chains for Improved Identification of Personal Names
Co-Clustering of Bilingual Datasets as a Mean for Assisting the Construction of Thematic Bilingual Comparable Corpora
The Extended Dirndl Corpus as a Resource for Coreference and Bridging Resolution
A Flexible Language Learning Platform Based on Language Resources and Web Services
Extracting Semantic Relations from Portuguese Corpora Using Lexical-Syntactic Patterns
An Analysis of Ambiguity in Word Sense Annotations
Disclose Models, Hide the Data - How to Make Use of Confidential Corpora Without Seeing Sensitive Raw Data
Exploiting Networks in Law
Parsing Chinese Synthetic Words with a Character-based Dependency Model
Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain
Building a Dataset for Summarization and Keyword Extraction from Emails
Crowdsourcing
Modern Chinese Helps Archaic Chinese Processing: Finding and Exploiting the Shared Properties
A Crowdsourcing Smartphone Application for Swiss German: Putting Language Documentation in the Hands of the Users
A Study on Expert Sourcing Enterprise Question Collection and Classification
Collaboration in the Production of a Massively Multilingual Lexicon
The Newsome Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages
A SICK Cure for the Evaluation of Compositional Distributional Semantic Models
Morpho-Syntactic Study of Errors from Speech Recognition System
Crowdsourcing and Annotating NER for Twitter #drift
Designing and Evaluating a Reliable Corpus of Web Genres Via Crowd-Sourcing
Crowdsourcing as a Preprocessing for Complex Semantic Annotation Tasks
A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic
Crowd-Sourcing Evaluation of Automatically Acquired, Morphologically Related Word Groupings
Propa-L: a Semantic Filtering Service from a Lexical Network Created Using Games with a Purpose
When Transliteration Met Crowdsourcing: an Empirical Study of Transliteration Via Crowdsourcing Using Efficient, Non-Redundant and Fair Quality Control

 

D
Dialogue
Phoneme Set Design Using English Speech Database by Japanese for Dialogue-based English CALL Systems
A Comparative Evaluation Methodology for NLG in Interactive Systems
The Nijmegen Corpus of Casual Czech
ANCOR_Centre, a Large Free Spoken French Coreference Corpus: Description of the Resource and Reliability Measures
The Dbox Corpus Collection of Spoken Human-Human and Human-Machine Dialogues
A Colloquial Corpus of Japanese Sign Language: Linguistic Resources for Observing Sign Language Conversations
The Research and Teaching Corpus of Spoken German ― Folk
Twente Debate Corpus ― a Multimodal Corpus for Head Movement Analysis
The DARE Corpus: a Resource for Anaphora Resolution in Dialogue Based Intelligent Tutoring Systems
Narrowing the Gap Between Termbases and Corpora in Commercial Environments
An Analysis of Older Users' Interactions with Spoken Dialogue Systems
Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License
DINASTI: Dialogues with a Negotiating Appointment Setting Interface
Mapping CPA Patterns onto OntoNotes Senses
A Multimodal Corpus of Rapid Dialogue Games
Automatic Detection of Other-Repetition Occurrences: Application to French Conversational Speech
Towards Automatic Transformation Between Different Transcription Conventions: Prediction of Intonation Markers from Linguistic and Acoustic Features
Aix Map Task Corpus: the French Multimodal Corpus of Task-Oriented Dialogue
Dive-Arabic: Gulf Arabic Dialogue in a Virtual Environment
A Model for Processing Illocutionary Structures and Argumentation in Debates
The Development of the Multilingual Luna Corpus for Spoken Language System Porting
Digital Libraries
Towards Automatic Quality Assessment of Component Metadata
Construction and Annotation of a French Folkstale Corpus
UM-Corpus: a Large English-Chinese Parallel Corpus for Statistical Machine Translation
Digital Library 2.0: Source of Knowledge and Research Collaboration Platform
Discourse Annotation, Representation and Processing
A Corpus of Participant Roles in Contentious Discussions
Towards Automatic Detection of Narrative Structure
The CUHK Discourse Treebank for Chinese: Annotating Explicit Discourse Connectives for the Chinese Treebank
Building a Corpus of Manually Revised Texts from Discourse Perspective
Prosodic, Syntactic, Semantic Guidelines for Topic Structures Across Domains and Corpora
Single-Person and Multi-Party 3D Visualizations for Nonverbal Communication Analysis
The Dbox Corpus Collection of Spoken Human-Human and Human-Machine Dialogues
Genres in the Prague Discourse Treebank
The Meta-Knowledge of Causality in Biomedical Scientific Discourse
Mining a Multimodal Corpus for Non-Verbal Behavior Sequences Conveying Attitudes
Multimodal Corpora for Silent Speech Interaction
Computational Narratology: Extracting Tense Clusters from Narrative Texts
ParCor 1.0: a Parallel Pronoun-Coreference Corpus to Support Statistical MT
Comparative Analysis of Verbal Alignment in Human-Human and Human-Agent Interactions
Twente Debate Corpus ― a Multimodal Corpus for Head Movement Analysis
Annotating Relations in Scientific Articles
Locating Requests Among Open Source Software Communication Messages
Potsdam Commentary Corpus 2.0: Annotation for Discourse Research
Hesita(Te) in Portuguese
Developing Politeness Annotated Corpus of Hindi Blogs
Japanese Conversation Corpus for Training and Evaluation of Backchannel Prediction Model.
Dive-Arabic: Gulf Arabic Dialogue in a Virtual Environment
A Multimodal Interpreter for 3D Visualization and Animation of Verbal Concepts
Using a Sledgehammer to Crack a Nut? Lexical Diversity and Event Coreference Resolution
Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units
Document Classification, Text Categorisation
CLiPS Stylometry Investigation (CSI) Corpus: a Dutch Corpus for the Detection of Age, Gender, Personality, Sentiment and Deception in Text
Cross-Language Authorship Attribution
How to Use Less Features and Reach Better Performance in Author Gender Identification
LDC Activities
Annotating Question Decomposition on Complex Medical Questions
Modern Chinese Helps Archaic Chinese Processing: Finding and Exploiting the Shared Properties
Linking Pictographs to Synsets: Sclera2Cornetto
Characterizing and Predicting Bursty Events: the Buzz Case Study on Twitter
Can the Crowd Be Controlled?: a Case Study on Crowd Sourcing and Automatic Validation of Completed Tasks Based on User Modeling
The Pragmatic Annotation of a Corpus of Academic Lectures
Annotating Clinical Events in Text Snippets for Phenotype Detection
An Iterative Approach for Mining Parallel Sentences in a Comparable Corpus
Designing and Evaluating a Reliable Corpus of Web Genres Via Crowd-Sourcing
Using Word Familiarities and Word Associations to Measure Corpus Representativeness
Developing Politeness Annotated Corpus of Hindi Blogs
Recognising Suicidal Messages in Dutch Social Media
The Slovak Categorized News Corpus
Sublanguage Corpus Analysis Toolkit: a Tool for Assessing the Representativeness and Sublanguage Characteristics of Corpora
Ranking Job Offers for Candidates: Learning Hidden Knowledge from Big Data
Re-Using an Argument Corpus to Aid in the Curation of Social Media Collections
A Collection of Scholarly Book Reviews from the Platforms of Electronic Sources in Humanities and Social Sciences Openedition.Org
Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts

 

E
Emotion Recognition/Generation
Toward a Unifying Model for Opinion, Sentiment and Emotion Information Extraction
Eliciting and Annotating Uncertainty in Spoken Language
Annotating Events in an Emotion Corpus
Speech-based Emotion Recognition: Feature Selection by Self-Adaptive Multi-Criteria Genetic Algorithm
The AV-LASYN Database: a Synchronous Corpus of Audio and 3D Facial Marker Data for Audio-Visual Laughter Synthesis
Emilya: Emotional Body Expression in Daily Actions Database
The D-Ans Corpus: the Dublin-Autonomous Nervous System Corpus of Biosignal and Multimodal Recordings of Conversational Speech
Texafon 2.0: a Text Processing Tool for the Generation of Expressive Speech in TTS Applications
Media Monitoring and Information Extraction for the Highly Inflected Agglutinative Language Hungarian
The SSPNet-Mobile Corpus: Social Signal Processing Over Mobile Phones.
EMOVO Corpus: an Italian Emotional Speech Database
The Munich Biovoice Corpus: Effects of Physical Exercising, Heart Rate, and Skin Conductance on Human Speech Production
Voce Corpus: Ecologically Collected Speech Annotated with Physiological and Psychological Stress Assessments
Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis.
Modeling, Managing, Exposing, and Linking Ontologies with a Wiki-based Tool
Smile and Laughter in Human-Machine Interaction: a Study of Engagement
Endangered Languages
PanLex: Building a Resource for Panlingual Lexical Translation
Enriching ODIN
TLAXCALA: a Multilingual Corpus of Independent News
Untrained Forced Alignment of Transcriptions and Audio for Language Documentation Corpora Using Webmaus
Finite-State Morphological Transducers for Three Kypchak Languages
A Finite-State Morphological Analyzer for a Lakota HPSG Grammar
Open-Domain Interaction and Online Content in the Sami Language
Using Transfer Learning to Assist Exploratory Corpus Annotation
Linguistic Evaluation of Support Verb Constructions by Openlogos and Google Translate
First Approach Toward Semantic Role Labeling for Basque
The Gulf of Guinea Creole Corpora
An Innovative World Language Centre: Challenges for the Use of Language Technology
Evaluation Methodologies
VERTa: Facing a Multilingual Experience of a Linguistically-based MT Evaluation
Combining Elicited Imitation and Fluency Features for Oral Proficiency Measurement
ETER: a New Metric for the Evaluation of Hierarchical Named Entity Recognition
Measuring Readability of Polish Texts: Baseline Experiments
Bridging the Gap Between Speech Technology and Natural Language Processing: an Evaluation Toolbox for Term Discovery Systems
Building a Database of Japanese Adjective Examples from Special Purpose Web Corpora
A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization
Comparing the Quality of Focused Crawlers and of the Translation Resources Obtained from Them
Creating and Using Large Monolingual Parallel Corpora for Sentential Paraphrase Generation
A Comparative Evaluation Methodology for NLG in Interactive Systems
An Evaluation of the Role of Statistical Measures and Frequency for MWE Identification
Using a Machine Learning Model to Assess the Complexity of Stress Systems
Translation Errors from English to Portuguese: an Annotated Corpus
Discosuite - a Parser Test Suite for German Discontinuous Structures
Corpus and Evaluation of Handwriting Recognition of Historical Genealogical Records
PACE Corpus: a Multilingual Corpus of Polarity-Annotated Textual Data from the Domains Automotive and CEllphone
A Benchmark Database of Phonetic Alignments in Historical Linguistics and Dialectology
Introducing a Framework for the Evaluation of Music Detection Tools
The Taraxü Corpus of Human-Annotated Machine Translations
Detecting Document Structure in a Very Large Corpus of UK Financial Reports
S-Pot - a Benchmark in Spotting Signs Within Continuous Signing
Machine Translation for Subtitling: a Large-Scale Evaluation
Extrinsic Corpus Evaluation with a Collocation Dictionary Task
HuRIC: a Human Robot Interaction Corpus
On the Origin of Errors: a Fine-Grained Analysis of MT and PE Errors and their Relationship
Dense Components in the Structure of WordNet
MADAMIRA: a Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic
A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions
Building a Crisis Management Term Resource for Social Media: the Case of Floods and Protests
The Use of a Filemaker Pro Database in Evaluating Sign Language Notation Systems
A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
A Large-Scale Evaluation of Pre-Editing Strategies for Improving User-Generated Content Translation
Activ-Es: a Comparable, Cross-Dialect Corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain
Overview of Todai Robot Project and Evaluation Framework of Its NLP-based Problem Solving
Crowdsourcing for Evaluating Machine Translation Quality
Student Achievement and French Sentence Repetition Test Scores
Fuzzy V-Measure - an Evaluation Method for Cluster Analyses of Ambiguous Data
Why Chinese Web-as-Corpus is Wacky? Or: How Big Data is Killing Chinese Corpus Linguistics
KoKo: an L1 Learner Corpus for German
An Efficient and User-Friendly Tool for Machine Translation Quality Estimation
LexTerm Manager: Design for an Integrated Lexicography and Terminology System
The Etape Speech Processing Evaluation

 

G
Grammar and Syntax
The Ellogon Pattern Engine: Context-Free Grammars Over Annotations
The Interplay Between Lexical and Syntactic Resources in Incremental Parsebanking
Multival - Towards a Multilingual Valence Lexicon
Amazigh Verb Conjugator
Towards Building a Kashmiri Treebank: Setting Up the Annotation Pipeline
Adapting VerbNet to French Using Existing Resources
Discosuite - a Parser Test Suite for German Discontinuous Structures
Parsing Heterogeneous Corpora with a Rich Dependency Grammar
Walenty: Towards a Comprehensive Valence Dictionary of Polish
The Norwegian Dependency Treebank
Can Numerical Expressions Be Simpler? Implementation and Demonstration of a Numerical Simplification System for Spanish
GenitivDB ― a Corpus-Generated Database for German Genitive Classification
Building a Reference Lexicon for Countability in English
A Japanese Word Dependency Corpus
Converting an HPSG-based Treebank Into Its Parallel Dependency-based Treebank
Treelet Probabilities for HPSG Parsing and Error Correction
A Database for Measuring Linguistic Information Content
Bootstrapping an Italian VerbNet: Data-Driven Analysis of Verb Alternations
Self-Training a Constituency Parser Using N-Gram Trees
Manual Analysis of Structurally Informed Reordering in German-English Machine Translation
MADAMIRA: a Fast, Comprehensive Tool for Morphological Analysis and Disambiguation of Arabic
Pruning the Search Space of the Wolof LFG Grammar Using a Probabilistic and a Constraint Grammar Parser
Language Collage: Grammatical Description with the Lingo Grammar Matrix
Croatian Dependency Treebank 2.0: New Annotation Guidelines for Improved Parsing
Constituency Parsing of Bulgarian: Word- Vs Class-based Parsing
Correcting and Validating Syntactic Dependency in the Spoken French Treebank Rhapsodie
Morfeusz Reloaded
To Pay Or to Get Paid: Enriching a Valency Lexicon with Diatheses
Verbs of Saying with a Textual Connecting Function in the Prague Discourse Treebank
A 500 Million Word POS-Tagged Icelandic Corpus
Rapid Deployment of Phrase Structure Parsing for Related Languages: a Case Study of Insular Scandinavian
Because Size Does Matter: the Hamburg Dependency Treebank
Extending the Coverage of a MWE Database for Persian CPs Exploiting Valency Alternations
Large Scale Arabic Error Annotation: Guidelines and Framework
Validation Issues Induced by an Automatic Pre-Annotation Mechanism in the Building of Non-Projective Dependency Treebanks
Bidirectionnal Converter Between Syntactic Annotations : from French Treebank Dependencies to Passage Annotations, and Back

 

H
Handwritten, Typewritten Document Recognition
DysList: an Annotated Resource of Dyslexic Errors

 

I
Information Extraction, Information Retrieval
Boosting Open Information Extraction with Noun-based Relations
Textual Emigration Analysis (TEA)
Evaluating Web-As-Corpus Topical Document Retrieval with an Index of the Opendirectory
Extracting News Web Page Creation Time with Dctfinder
Biomedical Entity Extraction Using Machine-Learning Based Approaches
HiEve: A Corpus for Extracting Event Hierarchies from News Stories
Corpus and Method for Identifying Citations in Non-Academic Text
Native Language Identification Using Large, Longitudinal Data
Construction and Annotation of a French Folkstale Corpus
Enrichment of Bilingual Dictionary Through News Stream Data
A Large Scale Database of Strongly-Related Events in Japanese
Building Domain Specific Bilingual Dictionaries
Co-Training for Classification of Live Or Studio Music Recordings
SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
Refractive: an Open Source Tool to Extract Knowledge from Syntactic and Semantic Relations
A Vector Space Model for Syntactic Distances Between Dialects
Annotating Question Decomposition on Complex Medical Questions
Just.Ask, a QA System That Learns to Answer New Questions from Previous Interactions
Priberam Compressive Summarization Corpus: a New Multi-Document Summarization Corpus for European Portuguese
ClearTK 2.0: Design Patterns for Machine Learning in UIMA
The Meta-Knowledge of Causality in Biomedical Scientific Discourse
A Multi-Cultural Repository of Automatically Discovered Linguistic and Conceptual Metaphors
Annotating Relation Mentions in Tabloid Press
Efficient Reuse of Structured and Unstructured Resources for Ontology Population
Extraction of Daily Changing Words for Question Answering
Data Mining with Shallow Vs. Linguistic Features to Study Diversification of Scientific Registers
On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter
The Dangerous Myth of the Star System
Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese
Named Entity Recognition on Turkish Tweets
Annotating Relations in Scientific Articles
Use of Unsupervised Word Classes for Entity Recognition: Application to the Detection of Disorders in Clinical Reports
The Weltmodell: a Data-Driven Commonsense Knowledge Base
Language Resources and Annotation Tools for Cross-Sentence Relation Extraction
French Resources for Extraction and Normalization of Temporal Expressions with Heideltime
Annotation of Computer Science Papers for Semantic Relation Extraction
Author-Specific Sentiment Aggregation for Polarity Prediction of Reviews
Creating a Gold Standard Corpus for the Extraction of Chemistry-Disease Relations from Patent Texts
Generating a Resource for Products and Brandnames Recognition. Application to the Cosmetic Domain.
Annotation of Specialized Corpora Using a Comprehensive Entity and Relation Scheme
A Deep Context Grammatical Model for Authorship Attribution
T2K²: a System for Automatically Extracting and Organizing Knowledge from Texts
AraNLP: a Java-based Library for the Processing of Arabic Text
Web-Imageability of the Behavioral Features of Basic-Level Concepts
Supervised Within-Document Event Coreference Using Information Propagation
Metadata as Linked Open Data: Mapping Disparate Xml Metadata Registries Into One Rdf / Owl Registry.
From Natural Language to Ontology Population in the Cultural Heritage Domain. a Computational Linguistics-based Approach.
Identification of Technology Terms in Patents
Towards Linked Hypernyms Dataset 2.0: Complementing DBpedia with Hypernym Discovery
Access Control by Query Rewriting: the Case of Korap
Towards Electronic Sms Dictionary Construction: an Alignment-based Approach
Freepal: a Large Collection of Deep Lexico-Syntactic Patterns for Relation Extraction
A Gold Standard for Clir Evaluation in the Organic Agriculture Domain
A LDA-based Topic Classification Approach from Highly Imperfect Automatic Transcriptions
The Usage Review Corpus for Fine Grained Multi Lingual Opinion Analysis
Multilingual Extended WordNet Knowledge Base: Semantic Parsing and Translation of Glosses
Parsing Chinese Synthetic Words with a Character-based Dependency Model
Detecting Subevent Structure for Event Coreference Resolution
Thematic Cohesion: Measuring Terms Discriminatory Power Toward Themes
Terminology Resources and Terminology Work Benefit from Cloud Services

 

K
Knowledge Discovery/Representation
Extracting Information for Context-Aware Meeting Preparation
A Large Scale Database of Strongly-Related Events in Japanese
Modeling Language Proficiency Using Implicit Feedback
A Method for Building Burst-Annotated Co-Occurrence Networks for Analysing Trends in Textual Data
Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings
DBpedia Domains: Augmenting DBpedia with Domain Information
Data Mining with Shallow Vs. Linguistic Features to Study Diversification of Scientific Registers
Investigating the Image of Entities in Social Media: Dataset Design and First Results
Construction of Diachronic Ontologies from People's Daily of Fifty Years
Annotating Clinical Events in Text Snippets for Phenotype Detection
The Weltmodell: a Data-Driven Commonsense Knowledge Base
Author-Specific Sentiment Aggregation for Polarity Prediction of Reviews
Clinical Data-Driven Probabilistic Graph Processing
TagNText: a Parallel Corpus for the Induction of Resource-Specific Non-Taxonomical Relations from Tagged Images
UnixMan Corpus: A Resource for Language Learning in the Unix Domain
A Framework for Compiling High Quality Knowledge Resources from Raw Corpora
Multilingual Corpora with Coreferential Annotation of Person Entities
RECSA: Resource for Evaluating Cross-Lingual Semantic Annotation

 

L
LR Infrastructures and Architectures
Textual Emigration Analysis (TEA)
ELRA's Consolidated Services for the HLT Community
Encompassing a Spectrum of LT Users in the Clarin-Dk Infrastructure
The Evolving Infrastructure for Language Resources and the Role for Data Scientists
Towards Automatic Quality Assessment of Component Metadata
Restful Annotation and Efficient Collaboration
The Interplay Between Lexical and Syntactic Resources in Incremental Parsebanking
Interoperability and Customisation of Annotation Schemata in Argo
Collecting Natural Sms and Chat Conversations in Multiple Languages: the Bolt Phase 2 Corpus
Global Intelligent Content: Active Curation of Language Resources Using Linked Data
Design and Development of an Online Computational Framework to Facilitate Language Comprehension Research on Indian Languages
The CMD Cloud
Linguistic Landscaping of South Asia Using Digital Language Resources: Genetic Vs. Areal Linguistics
The Liability of Service Providers in E-Research Infrastructures: Killing the Messenger?
ClearTK 2.0: Design Patterns for Machine Learning in UIMA
LexTec - a Rich Language Resource for Technical Domains in Portuguese
A Database of Freely Written Texts of German School Students for the Purpose of Automatic Spelling Error Classification
Using TEI, CMDI and ISOcat in CLARIN-DK
ROOTS: a Toolkit for Easy, Fast and Consistent Processing of Large Sequential Annotated Data Collections
Three Dimensions of the So-Called "interoperability" of Annotation Schemes
HFST-SweNER ― a New NER Resource for Swedish
CLARA: A New Generation of Researchers in Common Language Resources and Their Applications
CLARIN-NL: Major Results
The Wavesurfer Automatic Speech Recognition Plugin
Potsdam Commentary Corpus 2.0: Annotation for Discourse Research
Language Processing Infrastructure in the Xlike Project
Annotating Arguments: the Nomad Collaborative Annotation Tool
TUKE-BNews-SK: Slovak Broadcast News Corpus Construction and Evaluation
The Hungarian Gigaword Corpus
Freepal: a Large Collection of Deep Lexico-Syntactic Patterns for Relation Extraction
New Bilingual Speech Databases for Audio Diarization
An Exercise in Reuse of Resources: Adapting General Discourse Coreference Resolution for Detecting Lexical Chains in Patent Documentation
Morphological Parsing of Swahili Using Crowdsourced Lexical Resources
Disclose Models, Hide the Data - How to Make Use of Confidential Corpora Without Seeing Sensitive Raw Data
Off-Road LAF: Encoding and Processing Annotations in NLP Workflows
Developing a Framework for Describing Relations Among Language Resources
LR National/International Projects, Infrastructural/Policy issues
Benchmarking of English-Hindi Parallel Corpora
Evaluating Corpora Documentation with Regards to the Ethics and Big Data Charter
Casa De La Lhéngua: a Set of Language Resources and Natural Language Processing Tools for Mirandese
Experiences with the Isocat Data Category Registry
The Evolving Infrastructure for Language Resources and the Role for Data Scientists
The Syn-Series Corpora of Written Czech
Corpus of 19th-Century Czech Texts: Problems and Solutions
A Decade of HLT Agency Activities in the Low Countries: from Resource Maintenance (BLARK) to Service Offerings (BLAISE)
CoRoLa ― The Reference Corpus of Contemporary Romanian Language
The Clarin Research Infrastructure: Resources and Tools for Ehumanities Scholars
AusTalk: an Audio-Visual Corpus of Australian English
Enriching the "Senso Comune" Platform with Automatically Acquired Data
Language Resources for French in the Biomedical Domain
The Alveo Virtual Laboratory: a Web Based Repository Api
Adapting a Part-Of-Speech Tagset to Non-Standard Text: the Case of Stts
Access Control by Query Rewriting: the Case of Korap
Synergy of Nederlab and @PhilosTEI: Diachronic and Multilingual Text-Induced Corpus Clean-Up
Meta-Classifiers Easily Improve Commercial Sentiment Detection Tools
ELRA's Consolidated Services for the HLT Community
The Language Application Grid
Taalportaal: an Online Grammar of Dutch and Frisian
Language Identification
A Corpus of Machine Translation Errors Extracted from Translation Students Exercises
Finding Romanized Arabic Dialect in Code-Mixed Tweets
Varclass: an Open-Source Language Identification Tool for Language Varieties
The RATS Collection: Supporting HLT Research with Degraded Audio Data
On the Romance Languages Mutual Intelligibility
Statistical Analysis of Multilingual Text Corpus and Development of Language Models
A Crowdsourcing Smartphone Application for Swiss German: Putting Language Documentation in the Hands of the Users
Vocabulary-based Language Similarity Using Web Corpora
A Multi-Dialect, Multi-Genre Corpus of Informal Written Arabic
KALAKA-3: a Database for the Recognition of Spoken European Languages on Youtube Audios
Language Modelling
Revising the Annotation of a Broadcast News Corpus: a Linguistic Approach
Semi-Supervised Methods for Expanding Psycholinguistics Norms by Integrating Distributional Similarity with the Structure of WordNet
Enhancing the Ted-Lium Corpus with Selected Data for Language Modeling and More Ted Talks
#mygoal: Finding Motivations on Twitter
A New Framework for Sign Language Recognition Based on 3D Handshape Identification and Linguistic Modeling
A Character-based Approach to Distributional Semantic Models: Exploiting Kanji Characters for Constructing Japanese Word Vectors
Building a Dataset of Multilingual Cognates for the Romanian Lexicon
Improvements to Dependency Parsing Using Automatic Simplification of Data
Free Acoustic and Language Models for Large Vocabulary Continuous Speech Recognition in Swedish
3D Face Tracking and Multi-Scale, Spatio-Temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in Asl
Building and Modelling Multilingual Subjective Corpora
Discovering Frames in Specialized Domains
Creative Language Explorations Through a High-Expressivity N-Grams Query Language
Valency and Word Order in Czech ― a Corpus Probe
Transfer Learning of Feedback Head Expressions in Danish and Polish Comparable Multimodal Corpora
Pruning the Search Space of the Wolof Lfg Grammar Using a Probabilistic and a Constraint Grammar Parser
A Model for Processing Illocutionary Structures and Argumentation in Debates
caWaC - a Web Corpus of Catalan and Its Application to Language Modeling and Machine Translation
Enabling Language Resources to Expose Translations as Linked Data on the Web
Word Semantic Similarity for Morphologically Rich Languages
Focusing Annotation for Semantic Role Labeling
Lexicon, Lexical Database
Toward a Unifying Model for Opinion, Sentiment and Emotion Information Extraction
PanLex: Building a Resource for Panlingual Lexical Translation
Thematic Cohesion: Measuring Terms Discriminatory Power Toward Themes
Mapping the Lexique Des Verbes Du Français (Lexicon of French Verbs) to a NLP Lexicon Using Examples
Production of Phrase Tables in 11 European Languages Using an Improved Sub-Sentential Aligner
CroDeriV: a New Resource for Processing Croatian Morphology
Building a Database of Japanese Adjective Examples from Special Purpose Web Corpora
Extracting a Bilingual Semantic Grammar from Framenet-Annotated Corpora
DerivBase.Hr: a High-Coverage Derivational Morphology Resource for Croatian
sloWCrowd: a Crowdsourcing Tool for Lexicographic Tasks
Building Domain Specific Bilingual Dictionaries
A Graph-based Approach for Computing Free Word Associations
MomResp: a Bayesian Model for Multi-Annotator Document Labeling
Automatic Refinement of Syntactic Categories in Chinese Word Structures
Zmorge: a German Morphological Lexicon Extracted from Wiktionary
Casa De La Lhéngua: a Set of Language Resources and Natural Language Processing Tools for Mirandese
On the Romance Languages Mutual Intelligibility
Terminology Localization Guidelines for the National Scenario
An Evaluation of the Role of Statistical Measures and Frequency for Mwe Identification
Globalphone: Pronunciation Dictionaries in 20 Languages
Annotating Events in an Emotion Corpus
A Character-based Approach to Distributional Semantic Models: Exploiting Kanji Characters for Constructing Japanese Word Vectors
A New Form of Humor ― Mapping Constraint-based Computational Morphologies to a Finite-State Representation
Corpus-based Computation of Reverse Associations
LexTec - a Rich Language Resource for Technical Domains in Portuguese
Extraction of Daily Changing Words for Question Answering
Distributed Distributional Similarities of Google Books Over the Centuries
The Imagact Visual Ontology. an Extendable Multilingual Infrastructure for the Representation of Lexical Encoding of Action
Collaboration in the Production of a Massively Multilingual Lexicon
How to Tell a Schneemann from a Milchmann: an Annotation Scheme for Compound-Internal Relations
Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary
A Large Corpus of Product Reviews in Portuguese: Tackling Out-Of-Vocabulary Words
A Set of Open Source Tools for Turkish Natural Language Processing
Identifying Idioms in Chinese Translations
T-PAS; A resource of Typed Predicate Argument Structures for linguistic analysis and semantic processing
Creative Language Explorations Through a High-Expressivity N-Grams Query Language
Developing a French Framenet: Methodology and First Results
Innovations in Parallel Corpus Search Tools
Bootstrapping an Italian VerbNet: Data-Driven Analysis of Verb Alternations
Self-Training a Constituency Parser Using N-Gram Trees
A Cascade Approach for Complex-Type Classification
Generating a Resource for Products and Brandnames Recognition. Application to the Cosmetic Domain.
Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature
Choosing Which to Use? A Study of Distributional Models for Nominal Lexical Semantic Classification
Predicate Matrix: Extending Semlink Through WordNet Mappings
Computer-Aided Morphology Expansion for Old Swedish
DysList: an Annotated Resource of Dyslexic Errors
Comparing Similarity Measures for Distributional Thesauri
Criteria for Identifying and Annotating Caused Motion Constructions in Corpus Data
Text Readability and Word Distribution in Japanese
The Multilingual Paraphrase Database
The Development of Dutch and Afrikaans Language Resources for Compound Boundary Analysis.
Conceptual Transfer: Using Local Classifiers for Transfer Selection
Sharing Resources Between Free / Open-Source Rule-based Machine Translation Systems: Grammatical Framework and Apertium
Heuristic Hyper-Minimization of Finite State Lexicons
An Open Source Part-Of-Speech Tagger for Norwegian: Building on Existing Language Resources
Bilingual Dictionaries for All Eu Languages
Propa-L: a Semantic Filtering Service from a Lexical Network Created Using Games with a Purpose
Automatic Acquisition of Urdu Nouns (along with Gender and Irregular Plurals)
Generating Polarity Lexicons with WordNet Propagation in 5 Languages
Summarizing News Clusters on the Basis of Thematic Chains
Relation Inference in Lexical Networks ... with Refinements
Single Classifier Approach for Verb Sense Disambiguation Based on Generalized Features
Extracting Semantic Relations from Portuguese Corpora Using Lexical-Syntactic Patterns
Optimizing a Distributional Semantic Model for the Prediction of German Particle Verb Compositionality
DCEP - Digital Corpus of the European Parliament
Utilizing Constituent Structure for Compound Analysis
Linked Data Visualization of Language Relations and Families: Multitree
NomLex-PT: a Lexicon of Portuguese Nominalizations
Semantic Search in Documents Enriched by Lod-based Annotations
CroDeriV: a New Resource for Processing Croatian Morphology
BiographyNet: Methodological Issues when NLP Supports Historical Research
TMO ― the Federated Ontology of the TRENDMINER Project
A SKOS-based Schema for TEI encoded Dictionaries at ICLTT
Evaluating Lemmatization Models for Machine-Assisted Corpus-Dictionary Linkage
A Wikipedia-based Corpus for Contextualized Machine Translation
Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web
xLiD-Lexica: Cross-lingual Linked Data Lexica
Linked Open Data and Web Corpus Data for Noun Compound Bracketing
Sharing Cultural Heritage: the Clavius on the Web Project
Harmonization of German Lexical Resources for Opinion Mining
Mapping the Lexique Des Verbes Du Français (Lexicon of French Verbs) to a NLP Lexicon Using Examples
Automatic Mapping Lexical Resources: a Lexical Unit as the Keystone
CROMER: a Tool for Cross-Document Event and Entity Coreference
A Collection of Scholarly Book Reviews from the Platforms of Electronic Sources in Humanities and Social Sciences Openedition.Org
N³ - a Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format
Ruled-based, Interlingual Motivated Mapping of Plwordnet onto Sumo Ontology
Towards Interoperable Discourse Annotation. Discourse Features in the Ontologies of Linguistic Annotation
Building the Sense-Tagged Multilingual Parallel Corpus
Open Philology at the University of Leipzig
A Tool Suite for Creating Question Answering Benchmarks

 

M
Machine Translation, SpeechToSpeech Translation
Bilingual Dictionary Construction with Transliteration Filtering
Large SMT Data-Sets Extracted from Wikipedia
Two-Step Machine Translation with Lattices
MTWatch: A Tool for the Analysis of Noisy Parallel Data
Collecting Natural Sms and Chat Conversations in Multiple Languages: the Bolt Phase 2 Corpus
Comparing the Quality of Focused Crawlers and of the Translation Resources Obtained from Them
Openlogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries
Incorporating Alternate Translations Into English Translation Treebank
Multival - Towards a Multilingual Valence Lexicon
A Unified Annotation Scheme for the Semantic / Pragmatic Components of Definiteness
On the Reliability and Inter-Annotator Agreement of Human Semantic MT Evaluation Via Hmeant
Dual Subtitles as Parallel Corpora
Bootstrapping Open-Source English-Bulgarian Computational Dictionary
Collection of a Simultaneous Translation Corpus for Comparative Analysis
Translation Errors from English to Portuguese: an Annotated Corpus
English-French Verb Phrase Alignment in Europarl for Tense Translation Modeling
CFT13: a Resource for Research into the Post-editing Process
Creating a Massively Parallel Bible Corpus
Evaluating the Effects of Interactivity in a Post-Editing Workbench
ParCor 1.0: a Parallel Pronoun-Coreference Corpus to Support Statistical Mt
An Efficient Language Independent Toolkit for Complete Morphological Disambiguation
A Corpus of Spontaneous Speech in Lectures: the Kit Lecture Corpus for Spoken Language Processing and Translation
On the Annotation of TMX Translation Memories for Advanced Leveraging in Computer-Aided Translation
The Taraxü Corpus of Human-Annotated Machine Translations
The Strategic Impact of Meta-Net on the Regional, National and International Level
An Iterative Approach for Mining Parallel Sentences in a Comparable Corpus
Collocation Or Free Combination? ― Applying Machine Translation Techniques to Identify Collocations in Japanese
Multiword Expressions in Machine Translation
Crowdsourcing for Evaluating Machine Translation Quality
HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation
caWaC - a Web Corpus of Catalan and Its Application to Language Modeling and Machine Translation
Billions of Parallel Words for Free: Building and Using the Eu Bookshop Corpus
LinkedHealthAnswers: Towards Linked Data-driven Question Answering for the Health Care Domain
Chasing the Perfect Splitter: a Comparison of Different Compound Splitting Tools
A Comparison of Mt Errors and Esl Errors
Improving Evaluation of English-Czech Mt Through Paraphrasing
DCEP - Digital Corpus of the European Parliament
An Efficient and User-Friendly Tool for Machine Translation Quality Estimation
Metadata
Revising the Annotation of a Broadcast News Corpus: a Linguistic Approach
Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction
Developing a Framework for Describing Relations Among Language Resources
Global Intelligent Content: Active Curation of Language Resources Using Linked Data
Experiences with the Isocat Data Category Registry
The Dutch LESLLA Corpus
The EASR Corpora of European Portuguese, French, Hungarian and Polish Elderly Speech
Three Dimensions of the So-Called "interoperability" of Annotation Schemes
TagNText: a Parallel Corpus for the Induction of Resource-Specific Non-Taxonomical Relations from Tagged Images
Meta-Share: One Year After
Meta-Classifiers Easily Improve Commercial Sentiment Detection Tools
HindEnCorp - Hindi-English and Hindi-only Corpus for Machine Translation
Recent Developments in DeReKo
Vulnerability in Acquisition, Language Impairments in Dutch: Creating a Valid Data Archive
Improving Entity Linking Using Surface Form Refinement
Facing the Identification Problem in Language-Related Scientific Data Analysis.
Morphology
DerivBase.Hr: a High-Coverage Derivational Morphology Resource for Croatian
Generating and Using Probabilistic Morphological Resources for the Biomedical Domain
Computer-Aided Morphology Expansion for Old Swedish
DeLex, a Freely-Avaible, Large-Scale and Linguistically Grounded Morphological Lexicon for German
Automatic Refinement of Syntactic Categories in Chinese Word Structures
Bootstrapping Open-Source English-Bulgarian Computational Dictionary
Amazigh Verb Conjugator
Szeged Corpus 2.5: Morphological Modifications in a Manually Pos-Tagged Hungarian Corpus
The Syn-Series Corpora of Written Czech
Corpus of 19th-Century Czech Texts: Problems and Solutions
Automatic Error Detection Concerning the Definite and Indefinite Conjugation in the Hunlearner Corpus
A Language-Independent Approach to Extracting Derivational Relations from an Inflectional Lexicon
Morpho-Syntactic Study of Errors from Speech Recognition System
Can Crowdsourcing Be Used for Effective Annotation of Arabic?
Word-Formation Network for Czech
Glàff, a Large Versatile French Lexicon
The CMU Metal Farsi NLP Approach
Language Resource Addition: Dictionary Or Corpus?
The Development of Dutch and Afrikaans Language Resources for Compound Boundary Analysis.
Correcting Errors in a New Gold Standard for Tagging Icelandic Text
The Hungarian Gigaword Corpus
Measuring the Impact of Spelling Errors on the Quality of Machine Translation
Automatic Acquisition of Urdu Nouns (along with Gender and Irregular Plurals)
Chasing the Perfect Splitter: a Comparison of Different Compound Splitting Tools
MultiWord Expressions & Collocations
PropBank: Semantics of New Predicate Types
4FX: Light Verb Constructions in a Multilingual Parallel Corpus
Semi-Compositional Method for Synonym Extraction of Multi-Word Terms
Linguistic Resources and Cats: How to Use Isocat, Relcat and Schemacat
Identifying Idioms in Chinese Translations
Identification of Multiword Expressions in the Brwac
Extrinsic Corpus Evaluation with a Collocation Dictionary Task
Comprehensive Annotation of Multiword Expressions in a Social Web Corpus
T2K²: a System for Automatically Extracting and Organizing Knowledge from Texts
Reconstructing the Semantic Landscape of Natural Language Processing
ISLEX ― a Multilingual Web Dictionary
TermWise: A CAT-tool with Context-Sensitive Terminological Support.
Compounds and Distributional Thesauri
SwissAdmin: a Multilingual Tagged Parallel Corpus of Press Releases
Summarizing News Clusters on the Basis of Thematic Chains
Named Entity Tagging a Very Large Unbalanced Corpus: Training and Evaluating Ne Classifiers
LexTerm Manager: Design for an Integrated Lexicography and Terminology System
Multilinguality
Using Resource-Rich Languages to Improve Morphological Analysis of Under-Resourced Languages
Universal Stanford Dependencies: a Cross-Linguistic Typology
Pivot-based Multilingual Dictionary Building Using Wiktionary
Production of Phrase Tables in 11 European Languages Using an Improved Sub-Sentential Aligner
The Making of Ancient Greek WordNet
Extracting a Bilingual Semantic Grammar from Framenet-Annotated Corpora
Etymological WordNet: Tracing the History of Words
TLAXCALA: a Multilingual Corpus of Independent News
Relating Frames and Constructions in Japanese Framenet
Tharwa: a Large Scale Dialectal Arabic - Standard Arabic - English Lexicon
Automatic Methods for the Extension of a Bilingual Dictionary Using Comparable Corpora
Aggregation Methods for Efficient Collocation Detection
Globalphone: Pronunciation Dictionaries in 20 Languages
Linguistic Evaluation of Support Verb Constructions by Openlogos and Google Translate
Building a Dataset of Multilingual Cognates for the Romanian Lexicon
Automatic Expansion of the MRC Psycholinguistic Database Imageability Ratings
English-French Verb Phrase Alignment in Europarl for Tense Translation Modeling
Constructing a Chinese―Japanese Parallel Corpus from Wikipedia
xLiD-Lexica: Cross-lingual Linked Data Lexica
An Efficient Language Independent Toolkit for Complete Morphological Disambiguation
4FX: Light Verb Constructions in a Multilingual Parallel Corpus
Resources in Conflict: a Bilingual Valency Lexicon Vs. a Bilingual Treebank Vs. a Linguistic Theory
Buy One Get One Free: Distant Annotation of Chinese Tense, Event Type and Modality
Comparison of the Impact of Word Segmentation on Name Tagging for Chinese and Japanese
Not an Interlingua, But Close: Comparison of English Amrs to Chinese and Czech
On Complex Word Alignment Configurations
Bring vs. MTRoget: Evaluating Automatic Thesaurus Translation
The Strategic Impact of Meta-Net on the Regional, National and International Level
Bilingual Dictionary Induction as an Optimization Problem
Bootstrapping Term Extractors for Multiple Languages
Clustering of Multi-Word Named Entity Variants: Multilingual Evaluation
A Multidialectal Parallel Corpus of Arabic
Transfer Learning of Feedback Head Expressions in Danish and Polish Comparable Multimodal Corpora
Comparing Two Acquisition Systems for Automatically Building an English―Croatian Parallel Corpus from Multilingual Websites
Hashtag Occurrences, Layout and Translation: a Corpus-Driven Analysis of Tweets Published by the Canadian Government
On the Origin of Errors: a Fine-Grained Analysis of MT and PE Errors and their Relationship
YouDACC: the Youtube Dialectal Arabic Comment Corpus
Improving the Exploitation of Linguistic Annotations in Elan
Automatic Extraction of Synonyms for German Particle Verbs from Parallel Data with Distributional Similarity as a Re-Ranking Feature
NASTIA: Negotiating Appointment Setting Interface
Applying Accessibility-Oriented Controlled Language (CL) Rules to Improve Appropriateness of Text Alternatives for Images: an Exploratory Study
The DIRHA simulated corpus
High Quality Word Lists as a Resource for Multiple Purposes
ISLEX ― a Multilingual Web Dictionary
Exploiting Catenae in a Parallel Treebank Alignment
Multiple Choice Question Corpus Analysis for Distractor Characterization
Euronews: a Multilingual Speech Corpus for ASR
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and a Network-based ASR System
How to Construct a Multi-Lingual Domain Ontology
Mining Online Discussion Forums for Metaphors
TALC-Sef a Manually-revised POS-Tagged Literary Corpus in Serbian, English and French
The Development of the Multilingual Luna Corpus for Spoken Language System Porting
An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora
Quality Estimation for Synthetic Parallel Data Generation
Representing Multilingual Data as Linked Data: the Case of Babelnet 2.0
A Framework for Compiling High Quality Knowledge Resources from Raw Corpora
Extending Heideltime for Temporal Expressions Referring to Historic Dates
Enabling Language Resources to Expose Translations as Linked Data on the Web
Multilingual Extended WordNet Knowledge Base: Semantic Parsing and Translation of Glosses
A Comparison of Mt Errors and Esl Errors
HamleDT 2.0: Thirty Dependency Treebanks Stanfordized
Building the Sense-Tagged Multilingual Parallel Corpus
A Hindi-English Code-Switching Corpus
Resource Creation and Evaluation for Multilingual Sentiment Analysis in Social Media Texts
RECSA: Resource for Evaluating Cross-Lingual Semantic Annotation
Multimedia Document Processing
Multimodal Corpora for Silent Speech Interaction
Extending Standoff Annotation
Expanding N-Gram Analytics in Elan and a Case Study for Sign Synthesis
TVD: a Reproducible and Multiply Aligned Tv Series Dataset
New Functions for a Multipurpose Multimodal Tool for Phonetic and Linguistic Analysis of Very Large Speech Corpora

 

N
Named Entity Recognition
Semantic Search in Documents Enriched by Lod-based Annotations
NoSta-D Named Entity Annotation for German: Guidelines and Dataset
Annotating the Masc Corpus with Babelnet
Comparative Analysis of Portuguese Named Entities Recognition Tools
Clustering of Multi-Word Named Entity Variants: Multilingual Evaluation
Annotation of Specialized Corpora Using a Comprehensive Entity and Relation Scheme
Talapi - a Thai Linguistically Annotated Corpus for Language Processing
Measuring the Impact of Spelling Errors on the Quality of Machine Translation
Coreference Resolution for Latvian
Flow Graph Corpus from Recipe Texts
IXA Pipeline: Efficient and Ready to Use Multilingual NLP Tools
Extending Heideltime for Temporal Expressions Referring to Historic Dates
N³ - a Collection of Datasets for Named Entity Recognition and Disambiguation in the NLP Interchange Format
Improving Entity Linking Using Surface Form Refinement
Evaluation of Technology Term Recognition with Random Indexing
ETER: a New Metric for the Evaluation of Hierarchical Named Entity Recognition
Natural Language Generation
Openlogos Semantico-Syntactic Knowledge-Rich Bilingual Dictionaries
A Database for Measuring Linguistic Information Content
Out in the Open: Finding and Categorising Errors in the Lexical Simplification Pipeline
Valency and Word Order in Czech ― a Corpus Probe

 

O
Ontologies
Modeling Language Proficiency Using Implicit Feedback
DBpedia Domains: Augmenting DBpedia with Domain Information
How to Construct a Multi-Lingual Domain Ontology
The Imagact Visual Ontology. an Extendable Multilingual Infrastructure for the Representation of Lexical Encoding of Action
Sharing Cultural Heritage: the Clavius on the Web Project
Two Approaches to Metaphor Detection
T-PAS; A resource of Typed Predicate Argument Structures for linguistic analysis and semantic processing
Lexical Substitution Dataset for German
Language Resources for French in the Biomedical Domain
From Synsets to Videos: Enriching ItalWordNet Multimodally
A Multimodal Interpreter for 3D Visualization and Animation of Verbal Concepts
A Gold Standard for CLIR Evaluation in the Organic Agriculture Domain
META-SHARE: One Year After
The LRE Map Disclosed
Ruled-based, Interlingual Motivated Mapping of plWordNet onto SUMO Ontology
Towards Interoperable Discourse Annotation. Discourse Features in the Ontologies of Linguistic Annotation
Multilingual Corpora with Coreferential Annotation of Person Entities
A Modular System for Rule-based Text Categorisation
Opinion Mining / Sentiment Analysis Getting Reliable Annotations for Sarcasm in Online Dialogues
On the Importance of Text Analysis for Stock Price Prediction
LDC Activities
SenTube: a Corpus for Sentiment Analysis on Youtube Social Media
Hope and Fear: How Opinions Influence Factuality
PACE Corpus: a Multilingual Corpus of Polarity-Annotated Textual Data from the Domains Automotive and CEllphone
Can the Crowd Be Controlled?: a Case Study on Crowd Sourcing and Automatic Validation of Completed Tasks Based on User Modeling
Adapting Freely Available Resources to Build an Opinion Mining Pipeline in Portuguese
Investigating the Image of Entities in Social Media: Dataset Design and First Results
An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
The NewSoMe Corpus: a Unifying Opinion Annotation Framework Across Genres and in Multiple Languages
Building and Modelling Multilingual Subjective Corpora
A Large Corpus of Product Reviews in Portuguese: Tackling Out-Of-Vocabulary Words
Media Monitoring and Information Extraction for the Highly Inflected Agglutinative Language Hungarian
Harmonization of German Lexical Resources for Opinion Mining
Modeling, Managing, Exposing, and Linking Ontologies with a Wiki-based Tool
GraPAT: a Tool for Graph Annotations
The Usage Review Corpus for Fine Grained Multi Lingual Opinion Analysis
SANA: a Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis
Optical Character Recognition Synergy of Nederlab and @PhilosTEI: Diachronic and Multilingual Text-Induced Corpus Clean-Up
Other Bilingual Dictionary Construction with Transliteration Filtering
Improving Open Relation Extraction Via Sentence Re-Structuring
Languagesindanger.eu - Including Multimedia Language Resources to Disseminate Knowledge and Create Educational Material on Less-Resourced Languages
Croatian Memories
Definition Patterns for Predicative Terms in Specialized Lexical Resources
Creating Summarization Systems with SUMMA
First Insight Into Quality-Adaptive Dialogue
Evaluating Improvised Hip Hop Lyrics - Challenges and Observations
A German Twitter Snapshot
Developing Text Resources for Ten South African Languages
Aggregation Methods for Efficient Collocation Detection
Hope and Fear: How Opinions Influence Factuality
Characterizing and Predicting Bursty Events: the Buzz Case Study on Twitter
Creating a Massively Parallel Bible Corpus
Walenty: Towards a Comprehensive Valence Dictionary of Polish
Corpus for Coreference Resolution on Scientific Papers
A Decade of HLT Agency Activities in the Low Countries: from Resource Maintenance (BLARK) to Service Offerings (BLAISE)
The D-ANS Corpus: the Dublin-Autonomous Nervous System Corpus of Biosignal and Multimodal Recordings of Conversational Speech
A Persian Treebank with Stanford Typed Dependencies
Shata-Anuvadak: Tackling Multiway Translation of Indian Languages
Sprinter: Language Technologies for Interactive and Multimedia Language Learning
Development of a TV Broadcasts Speech Recognition System for Qatari Arabic
Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech
Resources for the Detection of Conventionalized Metaphors in Four Languages
Automatic Annotation of Machine Translation Datasets with Binary Quality Judgements
The N2 Corpus: a Semantically Annotated Collection of Islamist Extremist Stories
Learning from Domain Complexity
Deep Syntax Annotation of the Sequoia French Treebank
Word-Formation Network for Czech
The Distress Analysis Interview Corpus of Human and Computer Interviews
Swift Aligner, a Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer
HuRIC: a Human Robot Interaction Corpus
Annotation Pro + TGA: Automation of Speech Timing Analysis
Sentence Rephrasing for Parsing Sentences with OOV Words
The MERLIN Corpus: Learner Language and the CEFR
Evaluation of Different Strategies for Domain Adaptation in Opinion Mining
A Compact Interactive Visualization of Dependency Treebank Query Results
Text Readability and Word Distribution in Japanese
Discovering and Visualising Stories in News
Supervised Within-Document Event Coreference Using Information Propagation
A Stream Computing Approach Towards Scalable NLP
Exploiting Catenae in a Parallel Treebank Alignment
Language Editing Dataset of Academic Texts
Towards Shared Datasets for Normalization Research
Rule-based Reordering Space in Statistical Machine Translation
A Database of Full Body Virtual Interactions Annotated with Expressivity Scores
Compounds and Distributional Thesauri
An Open-Source Heavily Multilingual Translation Graph Extracted from Wiktionaries and Parallel Corpora
An Out-Of-Domain Test Suite for Dependency Parsing of German
Fuzzy V-Measure - an Evaluation Method for Cluster Analyses of Ambiguous Data
Why Chinese Web-as-Corpus is Wacky? Or: How Big Data is Killing Chinese Corpus Linguistics
Online Optimisation of Log-Linear Weights in Interactive Machine Translation
Relation Inference in Lexical Networks ... with Refinements
Finding a Tradeoff Between Accuracy and Rater's Workload in Grading Clustered Short Answers
The American Local News Corpus
Optimizing a Distributional Semantic Model for the Prediction of German Particle Verb Compositionality
Detecting Subevent Structure for Event Coreference Resolution
Word Alignment-based Reordering of Source Chunks in PB-SMT

 

P
Parsing A Gold Standard Dependency Corpus for English
Boosting the Creation of a Treebank
A System for Experiments with Dependency Parsers
Improving Open Relation Extraction Via Sentence Re-Structuring
Universal Stanford Dependencies: a Cross-Linguistic Typology
Turkish Treebank as a Gold Standard for Morphological Disambiguation and Its Influence on Parsing
Incorporating Alternate Translations Into English Translation Treebank
Pre-Ordering of Phrase-based Machine Translation Input in Translation Workflow
Towards Building a Kashmiri Treebank: Setting Up the Annotation Pipeline
Information Extraction from German Patient Records Via Hybrid Parsing and Relation Extraction Strategies
Parsing Heterogeneous Corpora with a Rich Dependency Grammar
Mapping Diatopic and Diachronic Variation in Spoken Czech: the Ortofon and Dialekt Corpora
The Norwegian Dependency Treebank
All Fragments Count in Parser Evaluation
A Persian Treebank with Stanford Typed Dependencies
A Japanese Word Dependency Corpus
Converting an HPSG-based Treebank Into Its Parallel Dependency-based Treebank
Legal Aspects of Text Mining
Treelet Probabilities for HPSG Parsing and Error Correction
Swift Aligner, a Multifunctional Tool for Parallel Corpora: Visualization, Word Alignment, and (Morpho)-Syntactic Cross-Language Transfer
Projection-based Annotation of a Polish Dependency Treebank
Towards an Encyclopedia of Compositional Semantics: Documenting the Interface of the English Resource Grammar
The CMU Metal Farsi NLP Approach
The SETimes.HR Linguistically Annotated Corpus of Croatian
Croatian Dependency Treebank 2.0: New Annotation Guidelines for Improved Parsing
Constituency Parsing of Bulgarian: Word- Vs Class-based Parsing
An Out-Of-Domain Test Suite for Dependency Parsing of German
Automatically Enriching Spoken Corpora with Syntactic Information for Linguistic Studies
Because Size Does Matter: the Hamburg Dependency Treebank
Dependency Parsing Representation Effects on the Accuracy of Semantic Applications ― an Example of an Inflective Language
HamleDT 2.0: Thirty Dependency Treebanks Stanfordized
Validation Issues Induced by an Automatic Pre-Annotation Mechanism in the Building of Non-Projective Dependency Treebanks
Bidirectional Converter Between Syntactic Annotations: from French Treebank Dependencies to Passage Annotations, and Back
Part-of-Speech Tagging PoliTa: a Multitagger for Polish
DeLex, a Freely-Available, Large-Scale and Linguistically Grounded Morphological Lexicon for German
The Kiezdeutsch Korpus (KiDKo) Release 1.0
ColLex.EN: Automatically Generating and Evaluating a Full-Form Lexicon for English
Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development
Finite-State Morphological Transducers for Three Kypchak Languages
Using Transfer Learning to Assist Exploratory Corpus Annotation
Szeged Corpus 2.5: Morphological Modifications in a Manually POS-Tagged Hungarian Corpus
The CLE Urdu POS Tagset
Adapting Freely Available Resources to Build an Opinion Mining Pipeline in Portuguese
Using Stem-Templates to Improve Arabic POS and Gender / Number Tagging
CoRoLa ― The Reference Corpus of Contemporary Romanian Language
The LIMA Multilingual Analyzer Made Free: FLOSS Resources Adaptation and Correction
Bootstrapping Term Extractors for Multiple Languages
The Gulf of Guinea Creole Corpora
A Corpus of European Portuguese Child and Child-Directed Speech
A Tagged Corpus and a Tagger for Urdu
Talapi ― a Thai Linguistically Annotated Corpus for Language Processing
Language Resource Addition: Dictionary Or Corpus?
The SETimes.HR Linguistically Annotated Corpus of Croatian
Activ-Es: a Comparable, Cross-Dialect Corpus of ‘everyday’ Spanish from Argentina, Mexico, and Spain
TALC-Sef a Manually-revised POS-Tagged Literary Corpus in Serbian, English and French
Morfeusz Reloaded
SwissAdmin: a Multilingual Tagged Parallel Corpus of Press Releases
Standardisation and Interoperation of Morphosyntactic and Syntactic Annotation Tools for Spanish and Their Annotations
A 500 Million Word POS-Tagged Icelandic Corpus
Macrosyntactic Segmenters of a French Spoken Corpus
KoKo: an L1 Learner Corpus for German
Person Identification Sockpuppet Detection in Wikipedia: a Corpus of Real-World Deceptive Writing for Linking Identities
An Effortless Way to Create Large-Scale Datasets for Famous Speakers
Comparison of Gender- and Speaker-Adaptive Emotion Recognition
German Alcohol Language Corpus - the Question of Dialect
Phonetic Databases, Phonology On the Use of a Fuzzy Classifier to Speed Up the Sp_ToBI Labeling of the Glissando Spanish Corpus
Using a Machine Learning Model to Assess the Complexity of Stress Systems
The Nijmegen Corpus of Casual Czech
Computer-Aided Quality Assurance of an Icelandic Pronunciation Dictionary
Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling
GRASS: the Graz Corpus of Read and Spontaneous Speech
Design and Development of an RDB Version of the Corpus of Spontaneous Japanese
Glàff, a Large Versatile French Lexicon
C-Phonogenre: a 7-Hours Corpus of 7 Speaking Styles in French: Relations Between Situational Features and Prosodic Properties
Profiling CLiPS Stylometry Investigation (CSI) Corpus: a Dutch Corpus for the Detection of Age, Gender, Personality, Sentiment and Deception in Text
How to Use Less Features and Reach Better Performance in Author Gender Identification
Modeling and Evaluating Dialog Success in the LAST MINUTE Corpus
Recognising Suicidal Messages in Dutch Social Media
Prosody ALICO: a Multimodal Corpus for the Study of Active Listening
A Cross-Language Corpus for Studying the Phonetics and Phonology of Prominence
Praaline: Integrating Tools for Speech Corpus Research
Evaluating Improvised Hip Hop Lyrics - Challenges and Observations
Eliciting and Annotating Uncertainty in Spoken Language
Teenage and Adult Speech in School Context: Building and Processing a Corpus of European Portuguese
Prosodic, Syntactic, Semantic Guidelines for Topic Structures Across Domains and Corpora
Annotation Pro + TGA: Automation of Speech Timing Analysis
New Spanish Speech Corpus Database for the Analysis of People Suffering from Parkinson's Disease
Towards Automatic Transformation Between Different Transcription Conventions: Prediction of Intonation Markers from Linguistic and Acoustic Features
RSS-TOBI - a Prosodically Enhanced Romanian Speech Corpus
Using Audio Books for Training a Text-To-Speech System
Assessment of Non-Native Prosody for Spanish as L2 Using Quantitative Scores and Perceptual Evaluation
C-Phonogenre: a 7-Hours Corpus of 7 Speaking Styles in French: Relations Between Situational Features and Prosodic Properties
The Extended Dirndl Corpus as a Resource for Coreference and Bridging Resolution
New Functions for a Multipurpose Multimodal Tool for Phonetic and Linguistic Analysis of Very Large Speech Corpora
DisMo: a Morphosyntactic, Disfluency and Multi-Word Unit Annotator. An Evaluation on a Corpus of French Spontaneous and Read Speech
Segmentation Evaluation Metrics, a Comparison Grounded on Prosodic and Discourse Units

 

Q
Question Answering Votter Corpus: a Corpus of Social Polling Language
A Hierarchical Taxonomy for Classifying Hardness of Inference Tasks
A Study on Expert Sourcing Enterprise Question Collection and Classification
Multimodal Dialogue Segmentation with Gesture Post-Processing
Overview of Todai Robot Project and Evaluation Framework of Its NLP-based Problem Solving
A Tool Suite for Creating Question Answering Benchmarks

 

S
Semantic Web Accommodations in Tuscany as Linked Data
The DWAN Framework: Application of a Web Annotation Framework for the General Humanities to the Domain of Language Resources
A Meta-Data Driven Platform for Semi-Automatic Configuration of Ontology Mediators
N-Gram Counts and Language Models from the Common Crawl
TMO ― the Federated Ontology of the TRENDMINER Project
A SKOS-based Schema for TEI encoded Dictionaries at ICLTT
Efficient Reuse of Structured and Unstructured Resources for Ontology Population
Linked Open Data and Web Corpus Data for Noun Compound Bracketing
Newsreader: Recording History from Daily News Streams
Discovering and Visualising Stories in News
From Natural Language to Ontology Population in the Cultural Heritage Domain. A Computational Linguistics-based Approach.
NIF4OGGD - NLP Interchange Format for Open German Governmental Data
The LRE Map Disclosed
Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0
VOAR: A Visual and Integrated Ontology Alignment Environment
Semantics PropBank: Semantics of New Predicate Types
Reusing Swedish FrameNet for Training Semantic Roles
A Rank-based Distance Measure to Detect Polysemy and to Determine Salient Vector-Space Features for German Prepositions
Image Annotation with ISO-Space: Distinguishing Content from Structure
Semantic Approaches to Software Component Retrieval with English Queries
Definition Patterns for Predicative Terms in Specialized Lexical Resources
The Making of Ancient Greek WordNet
Augmenting English Adjective Senses with Supersenses
Evaluation of Simple Distributional Compositional Operations on Longer Texts
Relating Frames and Constructions in Japanese FrameNet
Crowdsourcing for the Identification of Event Nominals: an Experiment
Tharwa: a Large Scale Dialectal Arabic - Standard Arabic - English Lexicon
Semantic Technologies for Querying Linguistic Annotations: an Experiment Focusing on Graph-Structured Data
A Unified Annotation Scheme for the Semantic / Pragmatic Components of Definiteness
Aligning Predicate-Argument Structures for Paraphrase Fragment Extraction
On the Reliability and Inter-Annotator Agreement of Human Semantic MT Evaluation Via HMEANT
Mapping WordNet Domains, WordNet Topics and Wikipedia Categories to Generate Multilingual Domain Specific Resources
Adapting VerbNet to French Using Existing Resources
Corpus-based Computation of Reverse Associations
Annotating Relation Mentions in Tabloid Press
Mapping Diatopic and Diachronic Variation in Spoken Czech: the Ortofon and Dialekt Corpora
Constructing a Corpus of Japanese Predicate Phrases for Synonym / Antonym Relations
Distributed Distributional Similarities of Google Books Over the Centuries
How to Tell a Schneemann from a Milchmann: an Annotation Scheme for Compound-Internal Relations
Construction of Diachronic Ontologies from People's Daily of Fifty Years
Resources in Conflict: a Bilingual Valency Lexicon Vs. a Bilingual Treebank Vs. a Linguistic Theory
Buy One Get One Free: Distant Annotation of Chinese Tense, Event Type and Modality
Building a Reference Lexicon for Countability in English
Not an Interlingua, But Close: Comparison of English AMRs to Chinese and Czech
WordNet―Wikipedia―Wiktionary: Construction of a Three-Way Alignment
Evaluation of Automatic Hypernym Extraction from Technical Corpora in English and Dutch
Discovering Frames in Specialized Domains
Resources for the Detection of Conventionalized Metaphors in Four Languages
Annotation of Computer Science Papers for Semantic Relation Extraction
Using C5.0 and Exhaustive Search for Boosting Frame-Semantic Parsing Accuracy
Automatic Semantic Relation Extraction from Portuguese Texts
Lexical Substitution Dataset for German
Polysemy Index for Nouns: an Experiment on Italian Using the Parole Simple CLiPS Lexical Database
Manual Analysis of Structurally Informed Reordering in German-English Machine Translation
Criteria for Identifying and Annotating Caused Motion Constructions in Corpus Data
Web-Imageability of the Behavioral Features of Basic-Level Concepts
Semi-Compositional Method for Synonym Extraction of Multi-Word Terms
From Synsets to Videos: Enriching ItalWordNet Multimodally
Mining Online Discussion Forums for Metaphors
Classifying Inconsistencies in DBpedia Language Specific Chapters
Flow Graph Corpus from Recipe Texts
To Pay Or to Get Paid: Enriching a Valency Lexicon with Diatheses
Annotating the Focus of Negation in Japanese Text
Less is More? Towards a Reduced Inventory of Categories for Training a Parser for the Italian Stanford Dependencies
Combining Dependency Information and Generalization in a Pattern-based Approach to the Classification of Lexical-Semantic Relation Instances
Dependency Parsing Representation Effects on the Accuracy of Semantic Applications ― an Example of an Inflective Language
Extending the Coverage of a MWE Database for Persian CPs Exploiting Valency Alternations
Single Classifier Approach for Verb Sense Disambiguation Based on Generalized Features
An Analysis of Ambiguity in Word Sense Annotations
SANA: a Large Scale Multi-Genre, Multi-Dialect Lexicon for Arabic Subjectivity and Sentiment Analysis
Word Semantic Similarity for Morphologically Rich Languages
Focusing Annotation for Semantic Role Labeling
Sign Language Recognition/Generation SLMotion - an Extensible Sign Language Oriented Video Analysis Tool
Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather
Expanding N-Gram Analytics in Elan and a Case Study for Sign Synthesis
LinkedHealthAnswers: Towards Linked Data-driven Question Answering for the Health Care Domain
Social Media Processing A Corpus of Comparisons in Product Reviews
A Corpus of Participant Roles in Contentious Discussions
Modelling Irony in Twitter: Feature Analysis and Evaluation
Getting Reliable Annotations for Sarcasm in Online Dialogues
Finding Romanized Arabic Dialect in Code-Mixed Tweets
Votter Corpus: a Corpus of Social Polling Language
A German Twitter Snapshot
SenTube: a Corpus for Sentiment Analysis on Youtube Social Media
Simple Effective Microblog Named Entity Recognition: Arabic as an Example
An Arabic Twitter Corpus for Subjectivity and Sentiment Analysis
The Dangerous Myth of the Star System
Crowdsourcing and Annotating NER for Twitter #drift
When POS Data Sets Don't Add Up: Combatting Sample Bias
Benchmarking Twitter Sentiment Analysis Tools
Comprehensive Annotation of Multiword Expressions in a Social Web Corpus
Building a Crisis Management Term Resource for Social Media: the Case of Floods and Protests
Who Cares About Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.
Named Entity Corpus Construction Using Wikipedia and DBpedia Ontology
Towards Shared Datasets for Normalization Research
Nomad: Linguistic Resources and Tools Aimed at Policy Formulation and Validation
TweetCaT: a Tool for Building Twitter Corpora of Smaller Languages
A Framework for Public Health Surveillance
Speech Recognition/Understanding The Etape Speech Processing Evaluation
Enhancing the TED-LIUM Corpus with Selected Data for Language Modeling and More TED Talks
Automatically Enriching Spoken Corpora with Syntactic Information for Linguistic Studies
ASR-based CALL Systems and Learner Speech Data: New Resources and Opportunities for Research and Development in Second Language Learning
CIEMPIESS: a New Open-Sourced Mexican Spanish Radio Corpus
Speech Recognition Web Services for Dutch
A Corpus and Phonetic Dictionary for Tunisian Arabic Speech Recognition
Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License
The DIRHA Simulated Corpus
TUKE-BNews-SK: Slovak Broadcast News Corpus Construction and Evaluation
Towards Multilingual Conversations in the Medical Domain: Development of Multilingual Medical Data and a Network-based ASR System
The Slovene Bnsi Broadcast News Database and Reference Speech Corpus Gos: Towards the Uniform Guidelines for Future Work
A Toolkit for Efficient Learning of Lexical Units for Speech Recognition
Basque Speecon-Like and Basque Speechdat Mdb-600: Speech Databases for the Development of ASR Technology for Basque
Using a Serious Game to Collect a Child Learner Speech Corpus
A LDA-based Topic Classification Approach from Highly Imperfect Automatic Transcriptions
Exploiting the Large-Scale German Broadcast Corpus to Boost the Fraunhofer IAIS Speech Recognition System
El-Woz: a Client-Server Wizard-Of-Oz Interface
Speech Resource/Database Phoneme Set Design Using English Speech Database by Japanese for Dialogue-based English CALL Systems
Croatian Memories
Designing the Latvian Speech Recognition Corpus
The Kiezdeutsch Korpus (KiDKo) Release 1.0
The RATS Collection: Supporting HLT Research with Degraded Audio Data
Untrained Forced Alignment of Transcriptions and Audio for Language Documentation Corpora Using WebMAUS
The Sweet-Home Speech and Multimodal Corpus for Home Automation Interaction
Collection of a Simultaneous Translation Corpus for Comparative Analysis
The Database for Spoken German ― DGD2
SAVAS: Collecting, Annotating and Sharing Audiovisual Language Resources for Automatic Subtitling
Speech Recognition Web Services for Dutch
ML-Optimization of Ported Constraint Grammars
Phone Boundary Annotation in Conversational Speech
The Research and Teaching Corpus of Spoken German ― Folk
Free Acoustic and Language Models for Large Vocabulary Continuous Speech Recognition in Swedish
An Effortless Way to Create Large-Scale Datasets for Famous Speakers
Rhapsodie: a Prosodic-Syntactic Treebank for Spoken French
GRASS: the Graz Corpus of Read and Spontaneous Speech
German Alcohol Language Corpus - the Question of Dialect
Development of a TV Broadcasts Speech Recognition System for Qatari Arabic
Design and Development of an RDB Version of the Corpus of Spontaneous Japanese
Automatic Long Audio Alignment and Confidence Scoring for Conversational Arabic Speech
Semi-Automatic Annotation of the UCU Accents Speech Corpus
AusTalk: an Audio-Visual Corpus of Australian English
The Sspnet-Mobile Corpus: Social Signal Processing Over Mobile Phones.
Extensions of the Sign Language Recognition and Translation Corpus RWTH-PHOENIX-Weather
Mapping CPA Patterns onto OntoNotes Senses
Voce Corpus: Ecologically Collected Speech Annotated with Physiological and Psychological Stress Assessments
A Multimodal Corpus of Rapid Dialogue Games
Alert!... Calm Down, There is Nothing to Worry About. Warning and Soothing Speech Synthesis.
CORILGA: a Galician Multilevel Annotated Speech Corpus for Linguistic Analysis
Basque Speecon-Like and Basque Speechdat Mdb-600: Speech Databases for the Development of ASR Technology for Basque
Erlangen-CLP: A Large Annotated Corpus of Speech from Children with Cleft Lip and Palate
Using a Serious Game to Collect a Child Learner Speech Corpus
Using Audio Books for Training a Text-To-Speech System
Discovering the Italian Literature: Interactive Access to Audio Indexed Text Resources
VOLIP: a Corpus of Spoken Italian and a Virtuous Example of Reuse of Linguistic Resources
A Hindi-English Code-Switching Corpus
El-Woz: a Client-Server Wizard-Of-Oz Interface
Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain
Speech Synthesis The MMASCS Multi-Modal Annotated Synchronous Corpus of Audio, Video, Facial Motion and Tongue Motion Data of Normal, Fast and Slow Speech
Texafon 2.0: a Text Processing Tool for the Generation of Expressive Speech in TTS Applications
RSS-TOBI - a Prosodically Enhanced Romanian Speech Corpus
A Flexible Language Learning Platform Based on Language Resources and Web Services
Standards for LRs On Paraphrase Identification Corpora
Image Annotation with ISO-Space: Distinguishing Content from Structure
N-Gram Counts and Language Models from the Common Crawl
Benchmarking of English-Hindi Parallel Corpora
RELISH LMF: Unlocking the Full Power of the Lexical Markup Framework
The CMD Cloud
Interoperability of Dialogue Corpora Through ISO 24617-2-based Querying
A Benchmark Database of Phonetic Alignments in Historical Linguistics and Dialectology
Using TEI, CMDI and ISOcat in CLARIN-DK
Legal Aspects of Text Mining
Towards an Integration of Syntactic and Temporal Annotations in Estonian
Adapting a Part-Of-Speech Tagset to Non-Standard Text: the Case of STTS
An Open Source Part-Of-Speech Tagger for Norwegian: Building on Existing Language Resources
Vulnerability in Acquisition, Language Impairments in Dutch: Creating a Valid Data Archive
Facing the Identification Problem in Language-Related Scientific Data Analysis.
Off-Road LAF: Encoding and Processing Annotations in NLP Workflows
Statistical and Machine Learning Methods Gold-Standard for Topic-Specific Sentiment Analysis of Economic Texts
Semantic Approaches to Software Component Retrieval with English Queries
Missed Opportunities in Translation Memory Matching
Use of Unsupervised Word Classes for Entity Recognition: Application to the Detection of Disorders in Clinical Reports
ColLex.EN: Automatically Generating and Evaluating a Full-Form Lexicon for English
Event Extraction Using Distant Supervision
A Vector Space Model for Syntactic Distances Between Dialects
The AV-LASYN Database: a Synchronous Corpus of Audio and 3D Facial Marker Data for Audio-Visual Laughter Synthesis
Exploring and Visualizing Variation in Language Resources
SLMotion - an Extensible Sign Language Oriented Video Analysis Tool
Boosting the Creation of a Treebank
Improvements to Dependency Parsing Using Automatic Simplification of Data
From Non Word to New Word: Automatically Identifying Neologisms in French Newspapers
Comparison of Gender- and Speaker-Adaptive Emotion Recognition
Disambiguating Verbs by Collocation: Corpus Lexicography Meets Natural Language Processing
GenitivDB ― a Corpus-Generated Database for German Genitive Classification
3D Face Tracking and Multi-Scale, Spatio-Temporal Analysis of Linguistically Significant Facial Expressions and Head Positions in ASL
All Fragments Count in Parser Evaluation
A Language-Independent Approach to Extracting Derivational Relations from an Inflectional Lexicon
Bring vs. MTRoget: Evaluating Automatic Thesaurus Translation
Latent Semantic Analysis Models on Wikipedia and TASA
Shata-Anuvadak: Tackling Multiway Translation of Indian Languages
Narrowing the Gap Between Termbases and Corpora in Commercial Environments
Machine Translationness: Machine-Likeness in Machine Translation Evaluation
Using C5.0 and Exhaustive Search for Boosting Frame-Semantic Parsing Accuracy
Projection-based Annotation of a Polish Dependency Treebank
A Deep Context Grammatical Model for Authorship Attribution
DINASTI: Dialogues with a Negotiating Appointment Setting Interface
LQVSumm: a Corpus of Linguistic Quality Violations in Multi-Document Summarization
Choosing Which to Use? A Study of Distributional Models for Nominal Lexical Semantic Classification
Estimation of Speaking Style in Speech Corpora Focusing on Speech Transcriptions
A Quality-based Active Sample Selection Strategy for Statistical Machine Translation
Metadata as Linked Open Data: Mapping Disparate XML Metadata Registries Into One RDF / OWL Registry.
Hindi to English Machine Translation: Using Effective Selection in Multi-Model SMT
New Spanish Speech Corpus Database for the Analysis of People Suffering from Parkinson's Disease
Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
Crowd-Sourcing Evaluation of Automatically Acquired, Morphologically Related Word Groupings
A Language-Independent and Fully Unsupervised Approach to Lexicon Induction and Part-Of-Speech Tagging for Closely Related Languages
Quality Estimation for Synthetic Parallel Data Generation
Online Optimisation of Log-Linear Weights in Interactive Machine Translation
Finding a Tradeoff Between Accuracy and Rater's Workload in Grading Clustered Short Answers
Evaluation of Technology Term Recognition with Random Indexing
Utilizing Constituent Structure for Compound Analysis
Summarisation Building a Dataset for Summarization and Keyword Extraction from Emails
The Polish Summaries Corpus
The Impact of Cohesion Errors in Extraction Based Summaries
Out in the Open: Finding and Categorising Errors in the Lexical Simplification Pipeline
Locating Requests Among Open Source Software Communication Messages
How Could Veins Speed Up the Process of Discourse Parsing
A Repository of State of the Art and Competitive Baseline Summaries for Generic News Summarization

 

T
Text Mining Gold-Standard for Topic-Specific Sentiment Analysis of Economic Texts
HiEve: A Corpus for Extracting Event Hierarchies from News Stories
Co-Clustering of Bilingual Datasets as a Mean for Assisting the Construction of Thematic Bilingual Comparable Corpora
Enrichment of Bilingual Dictionary Through News Stream Data
Event Extraction Using Distant Supervision
SinoCoreferencer: An End-to-End Chinese Event Coreference Resolver
Using Large Biomedical Databases as Gold Annotations for Automatic Relation Extraction
Annotating Inter-Sentence Temporal Relations in Clinical Notes
Tools for Arabic Natural Language Processing: a Case Study in Qalqalah Prosody
Dual Subtitles as Parallel Corpora
Variations on Quantitative Comparability Measures and Their Evaluations on Synthetic French-English Comparable Corpora
Linking Pictographs to Synsets: Sclera2Cornetto
Information Extraction from German Patient Records Via Hybrid Parsing and Relation Extraction Strategies
Constructing a Corpus of Japanese Predicate Phrases for Synonym / Antonym Relations
On Stopwords, Filtering and Data Sparsity for Sentiment Analysis of Twitter
Automatic Semantic Relation Extraction from Portuguese Texts
Creating a Gold Standard Corpus for the Extraction of Chemistry-Disease Relations from Patent Texts
Estimation of Speaking Style in Speech Corpora Focusing on Speech Transcriptions
AraNLP: a Java-based Library for the Processing of Arabic Text
Who Cares About Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis.
Coreference Resolution for Latvian
Ranking Job Offers for Candidates: Learning Hidden Knowledge from Big Data
Clustering Tweets Using Wikipedia Concepts
Hot Topics and Schisms in NLP: Community and Trend Analysis with Saffron on ACL and LREC Proceedings
The American Local News Corpus
When Transliteration Met Crowdsourcing: an Empirical Study of Transliteration Via Crowdsourcing Using Efficient, Non-Redundant and Fair Quality Control
Textual Entailment and Paraphrasing
On Paraphrase Identification Corpora
Multimodal Dialogue Segmentation with Gesture Post-Processing
A SICK Cure for the Evaluation of Compositional Distributional Semantic Models
Semantic Clustering of Pivot Paraphrases
The Multilingual Paraphrase Database
Annotating the Focus of Negation in Japanese Text
Improving Evaluation of English-Czech MT Through Paraphrasing
Tools, Systems, Applications
VERTa: Facing a Multilingual Experience of a Linguistically-based MT Evaluation
Accommodations in Tuscany as Linked Data
Discovering the Italian Literature: Interactive Access to Audio Indexed Text Resources
Using Stem-Templates to Improve Arabic POS and Gender / Number Tagging
A Meta-Data Driven Platform for Semi-Automatic Configuration of Ontology Mediators
The Ellogon Pattern Engine: Context-Free Grammars Over Annotations
Missed Opportunities in Translation Memory Matching
Native Language Identification Using Large, Longitudinal Data
Enriching ODIN
Creating Summarization Systems with SUMMA
Developing an Egyptian Arabic Treebank: Impact of Dialectal Morphology on Annotation and Tool Development
MomResp: a Bayesian Model for Multi-Annotator Document Labeling
Towards Automatic Detection of Narrative Structure
A Method for Building Burst-Annotated Co-Occurrence Networks for Analysing Trends in Textual Data
Annotating Inter-Sentence Temporal Relations in Clinical Notes
Refractive: an Open Source Tool to Extract Knowledge from Syntactic and Semantic Relations
A Finite-State Morphological Analyzer for a Lakota HPSG Grammar
Motàmot Project: Conversion of a French-Khmer Published Dictionary for Building a Multilingual Lexical System
Just.Ask, a QA System That Learns to Answer New Questions from Previous Interactions
Open-Domain Interaction and Online Content in the Sami Language
Guampa: a Toolkit for Collaborative Translation
RELISH LMF: Unlocking the Full Power of the Lexical Markup Framework
CIEMPIESS: a New Open-Sourced Mexican Spanish Radio Corpus
Exploring and Visualizing Variation in Language Resources
A New Form of Humor ― Mapping Constraint-based Computational Morphologies to a Finite-State Representation
A Multi-Cultural Repository of Automatically Discovered Linguistic and Conceptual Metaphors
First Approach Toward Semantic Role Labeling for Basque
Aligning Parallel Texts with Intertext
Extending Standoff Annotation
Turkish Resources for Visual Word Recognition
ROOTS: a Toolkit for Easy, Fast and Consistent Processing of Large Sequential Annotated Data Collections
Constructing and Exploiting an Automatically Annotated Resource of Legislative Texts
The EASR Corpora of European Portuguese, French, Hungarian and Polish Elderly Speech
Phoneme Similarity Matrices to Improve Long Audio Alignment for Automatic Subtitling
HFST-SweNER ― a New NER Resource for Swedish
Introducing a Framework for the Evaluation of Music Detection Tools
Detecting Document Structure in a Very Large Corpus of UK Financial Reports
Latent Semantic Analysis Models on Wikipedia and TASA
Sprinter: Language Technologies for Interactive and Multimedia Language Learning
Bilingual Dictionary Induction as an Optimization Problem
A Set of Open Source Tools for Turkish Natural Language Processing
The Procedure of Lexico-Semantic Annotation of Składnica Treebank
French Resources for Extraction and Normalization of Temporal Expressions with HeidelTime
CLARIN-NL: Major Results
Machine Translation for Subtitling: a Large-Scale Evaluation
The N2 Corpus: a Semantically Annotated Collection of Islamist Extremist Stories
Benchmarking Twitter Sentiment Analysis Tools
Corpus Annotation Through Crowdsourcing: Towards Best Practice Guidelines
Machine Translationness: Machine-Likeness in Machine Translation Evaluation
Towards an Environment for the Production and the Validation of Lexical Semantic Resources
The Distress Analysis Interview Corpus of Human and Computer Interviews
Representing Multimodal Linguistic Annotated Data
Comparative Analysis of Portuguese Named Entities Recognition Tools
Collocation Or Free Combination? ― Applying Machine Translation Techniques to Identify Collocations in Japanese
The Wavesurfer Automatic Speech Recognition Plugin
A Cascade Approach for Complex-Type Classification
Online Experiments with the Percy Software Framework - Experiences and Some Early Results
Improving the Exploitation of Linguistic Annotations in ELAN
Sentence Rephrasing for Parsing Sentences with OOV Words
Clinical Data-Driven Probabilistic Graph Processing
A Compact Interactive Visualization of Dependency Treebank Query Results
ILLINOISCLOUDNLP: Text Analytics Services in the Cloud
Reconstructing the Semantic Landscape of Natural Language Processing
High Quality Word Lists as a Resource for Multiple Purposes
Language Processing Infrastructure in the XLike Project
Sharing Resources Between Free / Open-Source Rule-based Machine Translation Systems: Grammatical Framework and Apertium
A Stream Computing Approach Towards Scalable NLP
Hindi to English Machine Translation: Using Effective Selection in Multi-Model SMT
Experiences with Parallelisation of an Existing NLP Pipeline: Tagging Hansard
A Model to Generate Adaptive Multimodal Job Interviews with a Virtual Recruiter
Identification of Technology Terms in Patents
A Toolkit for Efficient Learning of Lexical Units for Speech Recognition
Rule-based Reordering Space in Statistical Machine Translation
TVD: a Reproducible and Multiply Aligned TV Series Dataset
The Halliday Centre Tagger: an Online Platform for Semi-Automatic Text Annotation and Analysis
Heuristic Hyper-Minimization of Finite State Lexicons
Nomad: Linguistic Resources and Tools Aimed at Policy Formulation and Validation
Standardisation and Interoperation of Morphosyntactic and Syntactic Annotation Tools for Spanish and Their Annotations
The Tutorbot Corpus ― a Corpus for Studying Tutoring Behaviour in Multiparty Face-To-Face Spoken Dialogue
Combining Dependency Information and Generalization in a Pattern-based Approach to the Classification of Lexical-Semantic Relation Instances
An Exercise in Reuse of Resources: Adapting General Discourse Coreference Resolution for Detecting Lexical Chains in Patent Documentation
VOAR: A Visual and Integrated Ontology Alignment Environment
DisMo: a Morphosyntactic, Disfluency and Multi-Word Unit Annotator. An Evaluation on a Corpus of French Spontaneous and Read Speech
Integration of Workflow and Pipeline for Language Service Composition
Large Scale Arabic Error Annotation: Guidelines and Framework
MAT: a Tool for L2 Pronunciation Errors Annotation
Taalportaal: an Online Grammar of Dutch and Frisian
A Framework for Public Health Surveillance
Languagesindanger.eu - Including Multimedia Language Resources to Disseminate Knowledge and Create Educational Material on Less-Resourced Languages
Topic Detection & Tracking
Extracting Information for Context-Aware Meeting Preparation
Newsreader: Recording History from Daily News Streams
The Slovak Categorized News Corpus
Clustering Tweets Using Wikipedia Concepts
Hot Topics and Schisms in NLP: Community and Trend Analysis with Saffron on ACL and LREC Proceedings
A Modular System for Rule-based Text Categorisation
Typological Databases
Etymological WordNet: Tracing the History of Words
Language Collage: Grammatical Description with the Lingo Grammar Matrix

 

U
Usability, User Satisfaction
BiographyNet: Methodological Issues when NLP Supports Historical Research
First Insight Into Quality-Adaptive Dialogue
Mörkum Njálu. An Annotated Corpus to Analyse and Explain Grammatical Divergences Between 14th-Century Manuscripts of Njál's Saga
Evaluating the Effects of Interactivity in a Post-Editing Workbench
An Analysis of Older Users' Interactions with Spoken Dialogue Systems
'interHist' - an Interactive Visual Interface for Corpus Exploration
LQVSumm: a Corpus of Linguistic Quality Violations in Multi-Document Summarization
A Model to Generate Adaptive Multimodal Job Interviews with a Virtual Recruiter
Encompassing a Spectrum of LT Users in the CLARIN-DK Infrastructure
Macrosyntactic Segmenters of a French Spoken Corpus

 

V
Validation of LRs
Generating and Using Probabilistic Morphological Resources for the Biomedical Domain
Large SMT Data-Sets Extracted from Wikipedia
Terminology Localization Guidelines for the National Scenario
Transliteration and Alignment of Parallel Texts from Cyrillic to Latin
Exploiting Portuguese Lexical Knowledge Bases for Answering Open Domain Cloze Questions Automatically
Semantic Clustering of Pivot Paraphrases
When POS Data Sets Don't Add Up: Combatting Sample Bias
Using Word Familiarities and Word Associations to Measure Corpus Representativeness
Towards an Integration of Syntactic and Temporal Annotations in Estonian
Polysemy Index for Nouns: an Experiment on Italian Using the Parole Simple CLiPS Lexical Database
Dense Components in the Structure of WordNet
Sublanguage Corpus Analysis Toolkit: a Tool for Assessing the Representativeness and Sublanguage Characteristics of Corpora
KALAKA-3: a Database for the Recognition of Spoken European Languages on YouTube Audios
Billions of Parallel Words for Free: Building and Using the EU Bookshop Corpus
Named Entity Tagging a Very Large Unbalanced Corpus: Training and Evaluating NE Classifiers

 

W
Web Services
RESTful Annotation and Efficient Collaboration
PoliTa: a Multitagger for Polish
Visualization of Language Relations and Families: Multitree
Guampa: a Toolkit for Collaborative Translation
Benchmarking the Extraction and Disambiguation of Named Entities on the Semantic Web
The Liability of Service Providers in E-Research Infrastructures: Killing the Messenger?
The CLARIN Research Infrastructure: Resources and Tools for eHumanities Scholars
Introducing a Web Application for Labeling, Visualizing Speech and Correcting Derived Speech Signals
Online Experiments with the Percy Software Framework - Experiences and Some Early Results
The Alveo Virtual Laboratory: a Web Based Repository API
ILLINOISCLOUDNLP: Text Analytics Services in the Cloud
Multiple Choice Question Corpus Analysis for Distractor Characterization
Thomas Aquinas in the TüNDRA: Integrating the Index Thomisticus Treebank Into CLARIN-D
Morphological Parsing of Swahili Using Crowdsourced Lexical Resources
VOLIP: a Corpus of Spoken Italian and a Virtuous Example of Reuse of Linguistic Resources
The Language Application Grid
Integration of Workflow and Pipeline for Language Service Composition
Word Sense Disambiguation
NomLex-PT: a Lexicon of Portuguese Nominalizations
Mapping Between English Strings and Reentrant Semantic Graphs
Augmenting English Adjective Senses with Supersenses
sloWCrowd: a Crowdsourcing Tool for Lexicographic Tasks
Automatic Creation of WordNets from Parallel Corpora
Disambiguating Verbs by Collocation: Corpus Lexicography Meets Natural Language Processing
Annotating the MASC Corpus with BabelNet
WordNet―Wikipedia―Wiktionary: Construction of a Three-Way Alignment
Cross-Linguistic Annotation of Narrativity for English / French Verb Tense Disambiguation
Two Approaches to Metaphor Detection
Exploiting Portuguese Lexical Knowledge Bases for Answering Open Domain Cloze Questions Automatically
Predicate Matrix: Extending Semlink Through WordNet Mappings
Automatic Mapping Lexical Resources: a Lexical Unit as the Keystone

Powered by ELDA © 2014 ELDA/ELRA