LREC 2016 Proceedings

Summary of the paper

Title	Domain Adaptation for Named Entity Recognition Using CRFs
Authors	Tian Tian, Marco Dinarelli, Isabelle Tellier and Pedro Dias Cardoso
Abstract	In this paper we explain how we created a labelled English corpus for a Named Entity Recognition (NER) task from multi-source and multi-domain data, for an industrial partner. We illustrate the specificities of this corpus with examples and describe some baseline experiments. We present domain adaptation results on this corpus using a labelled Twitter corpus (Ritter et al., 2011). We tested a semi-supervised method from Garcia-Fernandez et al. (2014) combined with a supervised domain adaptation approach proposed by Raymond and Fayolle (2010), in machine learning experiments with CRFs (Conditional Random Fields). We use the same technique to improve the NER results on the Twitter corpus (Ritter et al., 2011). Our contributions thus consist of the creation of an industrial corpus and improvements in NER performance.
Topics	Named Entity Recognition, Corpus (Creation, Annotation, etc.), Information Extraction, Information Retrieval
Full paper	Domain Adaptation for Named Entity Recognition Using CRFs
Bibtex	@InProceedings{TIAN16.1102,
  author = {Tian Tian and Marco Dinarelli and Isabelle Tellier and Pedro Dias Cardoso},
  title = {Domain Adaptation for Named Entity Recognition Using CRFs},
  booktitle = {Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)},
  year = {2016},
  month = {may},
  date = {23-28},
  location = {Portorož, Slovenia},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Sara Goggi and Marko Grobelnik and Bente Maegaard and Joseph Mariani and Helene Mazo and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {978-2-9517408-9-1},
  language = {english}
}