Publication - Development of a Classification System of Deceptive Behaviours based on Machine Learning Techniques. Application to Fake Reviews and Radicalist Recruiters Detection

Rodrigo Barbado. (2016). Development of a Classification System of Deceptive Behaviours based on Machine Learning Techniques. Application to Fake Reviews and Radicalist Recruiters Detection. Final Degree Project. ETSI Telecomunicación, Universidad Politécnica de Madrid.

Abstract:

In recent years, the Internet has progressively become part of the lives of people all over the world. Its reach is enormous, and it brings multiple benefits, improving communication between people and democratizing access to information. However, the Internet also carries risks, related not only to security but also to the veracity of published content. One of these risks concerns the intentions of authors, which are not always benign. The main objective of this final degree project is to develop a classification system for users with malicious intentions. First, the enabling technologies that made the project possible are described; the project requires obtaining and storing data extracted from the Internet for later processing. The classification problem requires storing data instances together with the class they belong to, which makes it a case of supervised machine learning. The system is applied to two case studies: detection of fake reviews on social websites and detection of jihadist recruiters. For the first case, the project works with a labeled corpus of restaurant reviews from the social network Yelp. For the second case, it works with a labeled corpus from forums where the Dark Web community has detected activity from recruiters who promote jihadist radicalism. In both cases, the objective is to achieve the highest possible classification accuracy on new input examples. Finally, the conclusions of the project and lines of future work are presented.

BibTeX:

@mastersthesis{Barbado2016,
author = "Barbado, Rodrigo",
abstract = "In recent years, the Internet has progressively become part of the lives of people all over the world. Its reach is enormous, and it brings multiple benefits, improving communication between people and democratizing access to information.
However, the Internet also carries risks, related not only to security but also to the veracity of published content. One of these risks concerns the intentions of authors, which are not always benign. The main objective of this final degree project is to develop a classification system for users with malicious intentions.
First, the enabling technologies that made the project possible are described; the project requires obtaining and storing data extracted from the Internet for later processing. The classification problem requires storing data instances together with the class they belong to, which makes it a case of supervised machine learning.
The system is applied to two case studies: detection of fake reviews on social websites and detection of jihadist recruiters. For the first case, the project works with a labeled corpus of restaurant reviews from the social network Yelp. For the second case, it works with a labeled corpus from forums where the Dark Web community has detected activity from recruiters who promote jihadist radicalism. In both cases, the objective is to achieve the highest possible classification accuracy on new input examples.
Finally, the conclusions of the project and lines of future work are presented.",
address = "Universidad Polit{\'e}cnica de Madrid",
school = "ETSI Telecomunicaci{\'o}n",
keywords = "recruiter;jihadist;deceived;fake review",
month = "June",
title = "{D}evelopment of a {C}lassification {S}ystem of {D}eceptive {B}ehaviours based on {M}achine {L}earning {T}echniques. {A}pplication to {F}ake {R}eviews and {R}adicalist {R}ecruiters {D}etection",
year = "2016",
}