This paper presents DECO-MWE, a linguistic resource of Korean multiword expressions (MWEs) for Feature-Based Sentiment Analysis (FBSA). Handling MWEs is a critical issue in FBSA because their composition is lexically idiosyncratic. To construct linguistic resources of sentiment MWEs efficiently, we use the Local Grammar Graph (LGG) methodology: DECO-MWE is formalized as a Finite-State Transducer that represents the lexical-syntactic restrictions of MWEs. We build a corpus of cosmetics review texts, a domain in which MWEs occur particularly frequently. An empirical examination of the corpus reveals four types of MWEs, so DECO-MWE consists of four categories: Standard Polarity MWEs (SMWEs), Domain-Dependent Polarity MWEs (DMWEs), Compound Named Entity MWEs (EMWEs) and Compound Feature MWEs (FMWEs). In retrieval, DECO-MWE achieves an f-measure of 0.806 on the test corpus. The study has a two-fold outcome: first, it proposes a sizable general-purpose polarity MWE lexicon that may be broadly used in FBSA; second, the finite-state methodology adapted here to treat domain-dependent MWEs, such as idiosyncratic polarity expressions, named entity expressions or feature expressions, may be reused to describe the linguistic properties of other corpus domains.
@InProceedings{HAN18.5,
  author    = {Jaeho Han and Changhoe Hwang and Seongyong Choi and Gwanghoon Yoo and Eric Laporte and Jeesun Nam},
  title     = {DECO-MWE: Building a Linguistic Resource of Korean Multiword Expressions for Feature-Based Sentiment Analysis},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year      = {2018},
  month     = {may},
  date      = {7-12},
  location  = {Miyazaki, Japan},
  editor    = {Kiyoaki Shirai},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Paris, France},
  isbn      = {979-10-95546-24-5},
  language  = {english}
}