Title |
Boosting statistical tagger accuracy with simple rule-based grammars |
Authors |
Mans Hulden and Jerid Francom |
Abstract |
We report on several experiments on combining a rule-based tagger and a trigram tagger for Spanish. The results show that one can boost the accuracy of the best performing n-gram taggers by quickly developing a rough rule-based grammar to complement the statistically induced one and then combining the output of the two. The specific method of combination is crucial for achieving good results. The method provides particularly large gains in accuracy when only a small amount of tagged data is available for training an HMM, as may be the case for lesser-resourced and minority languages. |
Topics |
Part of speech tagging, Language modelling, Corpus (creation, annotation, etc.) |
Full paper |
Boosting statistical tagger accuracy with simple rule-based grammars |
Bibtex |
@InProceedings{HULDEN12.1075,
  author = {Mans Hulden and Jerid Francom},
  title = {Boosting statistical tagger accuracy with simple rule-based grammars},
  booktitle = {Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)},
  year = {2012},
  month = {may},
  date = {23-25},
  address = {Istanbul, Turkey},
  editor = {Nicoletta Calzolari (Conference Chair) and Khalid Choukri and Thierry Declerck and Mehmet Uğur Doğan and Bente Maegaard and Joseph Mariani and Asuncion Moreno and Jan Odijk and Stelios Piperidis},
  publisher = {European Language Resources Association (ELRA)},
  isbn = {978-2-9517408-7-7},
  language = {english}
} |