| Title | 
  Hybrid Citation Extraction from Patents | 
  
  
  | Authors | 
  Olivier Galibert, Sophie Rosset, Xavier Tannier and Fanny Grandry | 
  
  
  | Abstract | 
  The Quaero project organized a set of evaluations of Named Entity recognition systems in 2009. One of the sub-tasks consisted in extracting citations from English-language patents, i.e. references to other documents, either other patents or general literature. We present in this paper the participation of LIMSI in this evaluation, with a complete system description and the evaluation results. The corpus showed that patent and non-patent citations are very different in nature. We therefore handled references to other patents and references to general literature separately and built a hybrid system. For patent citations, the system used rule-based expert knowledge in the form of regular expressions. The system for detecting non-patent citations, on the other hand, is purely stochastic (machine learning with CRF++). We then merged both approaches to provide a single output. Four teams participated in this task and our system obtained the best results of this evaluation campaign, although the difference between the first two systems is only weakly significant. | 
  
  
  | Topics | 
  Named Entity recognition, Information Extraction, Information Retrieval, Tools, systems, applications   | 
  
  
  Full paper  | 
  Hybrid Citation Extraction from Patents | 
  
  
  Slides  | 
  - | 
  
  
  | Bibtex | 
  @InProceedings{GALIBERT10.81, 
   author =  {Olivier Galibert and Sophie Rosset and Xavier Tannier and Fanny Grandry},
   title =  {Hybrid Citation Extraction from Patents},
   booktitle =  {Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC'10)},
   year =  {2010},
   month =  {may},
   date =  {19-21},
   address =  {Valletta, Malta},
   editor =  {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Mike Rosner and Daniel Tapias},
   publisher =  {European Language Resources Association (ELRA)},
   isbn =  {2-9517408-6-7},
   language =  {english}
  }   |