Title |
ParsCit: an Open-source CRF Reference String Parsing Package |
Authors |
Isaac Councill, C Lee Giles and Min-Yen Kan |
Abstract |
We describe ParsCit, a freely available, open-source implementation of a reference string parsing package. At the core of ParsCit is a trained conditional random field (CRF) model used to label the token sequences in the reference string. A heuristic model wraps this core with added functionality to identify reference strings from a plain text file, and to retrieve the citation contexts. The package comes with utilities to run it as a web service or as a standalone utility. We compare ParsCit on three distinct reference string datasets and show that it compares well with other previously published work. |
Language |
|
Topics |
LR web services, Information Extraction, Information Retrieval, Tools, systems, applications |
Full paper |
ParsCit: an Open-source CRF Reference String Parsing Package |
Slides |
- |
Bibtex |
@InProceedings{COUNCILL08.166,
author = {Isaac Councill and C Lee Giles and Min-Yen Kan},
title = {ParsCit: an Open-source CRF Reference String Parsing Package},
booktitle = {Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)},
year = {2008},
month = may,
date = {28-30},
address = {Marrakech, Morocco},
editor = {{Nicoletta Calzolari (Conference Chair)} and Khalid Choukri and Bente Maegaard and Joseph Mariani and Jan Odijk and Stelios Piperidis and Daniel Tapias},
publisher = {European Language Resources Association (ELRA)},
isbn = {2-9517408-4-0},
note = {http://www.lrec-conf.org/proceedings/lrec2008/},
language = {english}
} |