A Distributional Semantics Perspective of Lexical Resources for Affect Analysis: An application to Extremist Narratives

Oscar Araque. (2020). A Distributional Semantics Perspective of Lexical Resources for Affect Analysis: An application to Extremist Narratives. PhD thesis. Universidad Politécnica de Madrid, ETSI Telecomunicación.

Abstract:

Sentiment and Emotion Analysis are prominent fields in Natural Language Processing (NLP) and have contributed to its progress. Modeling emotions requires a number of techniques and methods that, in most cases, are shared with other NLP areas. In this context, this thesis addresses the development of novel machine learning methods that combine surface and deep features for Sentiment and Emotion Analysis. Building on the results obtained, the developed methods have been adapted to other NLP areas that can largely benefit from such techniques.

In this way, we have designed a taxonomy that classifies different approaches in Sentiment Analysis according to the features and combinations used. This taxonomy has allowed us to develop several machine learning models that combine surface and deep features, as well as a variety of learning models, with a special focus on deep learning approaches. These models have been evaluated in document-level and aspect-level Sentiment Analysis frameworks. Next, we contribute a novel technique for generating domain-specific sentiment lexicons through the backpropagation algorithm. In a similar line of work, this thesis also extends a method for generating emotion lexicons, obtaining a bilingual resource with emotion annotations in both English and Italian.

The thesis' core contribution is a similarity-based perspective on lexicons that extracts features by exploiting both a word embedding model and a sentiment lexicon: the SIMON model. SIMON has shown positive results in the experimental evaluation, offering a compelling alternative to classical lexicon usage techniques.

In light of this last contribution, this thesis studies how methods developed in the context of Sentiment and Emotion Analysis can be adapted to other fields. More concretely, this thesis has contributed to radicalization detection and moral value estimation. In the area of radicalization detection, we have adapted the SIMON model, combining it with an emotion-driven feature extraction method. Similarly, for moral value estimation, SIMON has been used to exploit a novel lexicon we have generated.

BibTeX:

@phdthesis{araque2020thesis-gsi-phdthesis-2020,
author = "Araque, Oscar",
abstract = "Sentiment and Emotion Analysis are prominent fields in Natural Language Processing (NLP) and have contributed to its progress. Modeling emotions requires a number of techniques and methods that, in most cases, are shared with other NLP areas. In this context, this thesis addresses the development of novel machine learning methods that combine surface and deep features for Sentiment and Emotion Analysis. Building on the results obtained, the developed methods have been adapted to other NLP areas that can largely benefit from such techniques.

In this way, we have designed a taxonomy that classifies different approaches in Sentiment Analysis according to the features and combinations used. This taxonomy has allowed us to develop several machine learning models that combine surface and deep features, as well as a variety of learning models, with a special focus on deep learning approaches. These models have been evaluated in document-level and aspect-level Sentiment Analysis frameworks. Next, we contribute a novel technique for generating domain-specific sentiment lexicons through the backpropagation algorithm. In a similar line of work, this thesis also extends a method for generating emotion lexicons, obtaining a bilingual resource with emotion annotations in both English and Italian.

The thesis' core contribution is a similarity-based perspective on lexicons that extracts features exploiting both a word embedding model and a sentiment lexicon: the SIMON model. We have observed that SIMON has shown positive results in the experimental evaluation, offering a compelling alternative to classical lexicon usage techniques.

In light of this last contribution, this thesis studies the adaptability of methods that are developed in the context of Sentiment and Emotion Analysis to other fields. More concretely, this thesis has contributed to radicalization detection and moral value estimation. In the area of radicalization detection, we have adapted the SIMON model, combining it with an emotion-driven feature extraction method. Similarly, for moral value estimation, SIMON has been used to exploit a novel lexicon we have generated.
",
address = "ETSI Telecomunicaci{\'o}n",
school = "Universidad Polit{\'e}cnica de Madrid",
keywords = "machine learning;Sentiment analysis;emotion analysis;radicalism",
month = "October",
title = "{A} {D}istributional {S}emantics {P}erspective of {L}exical {R}esources for {A}ffect {A}nalysis: {A}n application to {E}xtremist {N}arratives",
year = "2020",
}