Summary of the paper

Title: Hard Numbers: Language Exclusion in Computational Linguistics and Natural Language Processing
Authors: Martin Benjamin
Abstract: The intersection between computer science and human language occurs largely for English and a few dozen other languages with strong economic or political support. The supermajority of the world's languages have extremely little digital presence, and little activity that can be forecast to change that status. However, such an assertion has remained impressionistic in the absence of data comparing the attention lavished on elite languages with that given to the rest of the world. This study seeks to give some numbers to the extent to which non-lucrative languages sit at the margins of language technology and computational research. Three datasets are explored that reveal current hiring and research activity at universities and corporations concerned with computational linguistics and natural language processing. The data supports the conclusion that most research activity and career opportunities focus on a few languages, while most languages have little or no current research and little possibility for the professional pursuit of their development.
Full paper: Hard Numbers: Language Exclusion in Computational Linguistics and Natural Language Processing
Bibtex:
@InProceedings{BENJAMIN18.23,
  author = {Martin Benjamin},
  title = {Hard Numbers: Language Exclusion in Computational Linguistics and Natural Language Processing},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {Claudia Soria and Laurent Besacier and Laurette Pretorius},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-22-1},
  language = {english}
  }