Abstract:
This graduate thesis presents the results of a project whose objectives are to develop a
sentiment analysis system and to test it with two different corpora. The first is a Twitter
corpus that we collected ourselves. The second is the corpus provided by the TASS 2015
evaluation workshop.
The Twitter corpus was collected following several strategies. A collector was designed
around these strategies and captured tweets for several months. The corpus was divided
into two sets, training and test, and the training set was annotated manually.
The sentiment analysis system relies on a supervised machine learning technique. It
includes a preprocessor and a feature extractor specialized for Twitter messages.
The TASS 2015 workshop consists of two different tasks. For each task, three experiments
were submitted, selecting the best combinations of features extracted by the sentiment
analysis system from the training set. Task-1 consists of evaluating the system with two
differently tagged corpora, one with six sentiment labels and the other with four. Task-2
consists of detecting different aspects in a message and classifying their sentiment
polarity.
The sentiment analysis system was also evaluated with the Twitter training set. Once
trained, it was used to classify the test set. The classified set was then used to develop
an interactive dashboard based on the Sefarad 2.0 platform.
Finally, we present the conclusions drawn from this project, the technologies learned
during its development, and possible lines of future work.