Summary of the paper

Title Extending the Loughran and McDonald Financial Sentiment Words List from 10-K Corporate Fillings using Social Media Texts
Authors Marcelo Sardelich and Dimitar Kazakov
Abstract This article describes a novel text corpus and sentiment lexicon for financial text mining. The corpus comprises social media messages, specifically comments on stocks by users of the Yahoo Message Board service. Each message contains the user's opinion and is labelled by the user with an overall sentiment label. This novel dataset, with 74,641 messages covering 492 stocks over a period of two years, is made publicly available. State-of-the-art methods are used to extract terms that convey positive and negative connotations from each message of the corpus. Then, each message is represented as a vector of these terms and sentiment classifiers are trained. The best combination of text representation weights and classifier model achieves 91.4% accuracy on the test set. We then use this sentiment classifier to build a sentiment lexicon, which contains words associated with positive and negative sentiments. We show that this lexicon is useful for extending previously proposed word lists, which were manually crafted from 10-K or 10-Q financial documents, and that it captures the sentiment of terms from both the formal and informal language of financial stock markets. Our novel financial-domain text corpus and sentiment lexicon constitute valuable language resources to help advance the work on financial narrative processing.
Full paper Extending the Loughran and McDonald Financial Sentiment Words List from 10-K Corporate Fillings using Social Media Texts
Bibtex @InProceedings{SARDELICH18.1,
  author = {Marcelo Sardelich and Dimitar Kazakov},
  title = {Extending the Loughran and McDonald Financial Sentiment Words List from 10-K Corporate Fillings using Social Media Texts},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year = {2018},
  month = {may},
  date = {7-12},
  location = {Miyazaki, Japan},
  editor = {Octavian Popescu and Carlo Strapparava},
  publisher = {European Language Resources Association (ELRA)},
  address = {Paris, France},
  isbn = {979-10-95546-11-5},
  language = {english}
  }