News

The article "A text classification approach to detect psychological stress combining a lexicon-based feature framework with distributional representations", by Sergio Muñoz and Carlos A. Iglesias, has been published in the Information Processing & Management journal (7.466 impact factor, JCR Q1 2021). This work is a product of the COGNOS Project.

The full paper can be found at this URL.

Abstract:

Nowadays, stress has become a growing problem for society due to its high impact on individuals but also on health care systems and companies. In order to overcome this problem, early detection of stress is a key factor. Previous studies have shown the effectiveness of text analysis in the detection of sentiment, emotion, and mental illness. However, existing solutions for stress detection from text are focused on a specific corpus. There is still a lack of well-validated methods that provide good results in different datasets. We aim to advance state of the art by proposing a method to detect stress in textual data and evaluating it using multiple public English datasets. The proposed approach combines lexicon-based features with distributional representations to enhance classification performance. To help organize features for stress detection in text, we propose a lexicon-based feature framework that exploits affective, syntactic, social, and topic-related features. Also, three different word embedding techniques are studied for exploiting distributional representation. Our approach has been implemented with three machine learning models that have been evaluated in terms of performance through several experiments. This evaluation has been conducted using three public English datasets and provides a baseline for other researchers. The obtained results identify the combination of FastText embeddings with a selection of lexicon-based features as the best-performing model, achieving F-scores above 80%.
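As a rough illustration of what combining lexicon-based features with distributional representations can look like, the sketch below concatenates two simple lexicon features with an averaged word vector. The lexicon, the two-dimensional vectors, and all function names are hypothetical toy values chosen for this example; they are not taken from the paper, which uses richer feature groups and trained embeddings such as FastText.

```python
from typing import Dict, List

# Hypothetical toy resources, for illustration only (not from the paper).
STRESS_LEXICON = {"overwhelmed", "anxious", "deadline", "exhausted"}
TOY_EMBEDDINGS: Dict[str, List[float]] = {
    "deadline": [0.9, 0.1],
    "anxious": [0.8, 0.3],
    "holiday": [0.1, 0.9],
}
EMBEDDING_DIM = 2


def lexicon_features(tokens: List[str]) -> List[float]:
    """Return [number of lexicon hits, fraction of tokens that are hits]."""
    hits = sum(1 for token in tokens if token in STRESS_LEXICON)
    ratio = hits / len(tokens) if tokens else 0.0
    return [float(hits), ratio]


def mean_embedding(tokens: List[str]) -> List[float]:
    """Average the vectors of known tokens; return zeros if none are known."""
    vectors = [TOY_EMBEDDINGS[t] for t in tokens if t in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * EMBEDDING_DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def combined_features(text: str) -> List[float]:
    """Concatenate lexicon features with the averaged embedding."""
    tokens = text.lower().split()
    return lexicon_features(tokens) + mean_embedding(tokens)


features = combined_features("Anxious about the deadline")
print(len(features))  # 2 lexicon features + 2 embedding dimensions
```

The resulting vector could then be passed to any standard classifier; which features and which model work best is an empirical question that the paper addresses through its experiments.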

The article "Semantic Modeling of a VLC-Enabled Task Automation Platform for Smart Offices", by Sergio Muñoz, Carlos A. Iglesias, Andrei Scheianu and George Suciu, has been published in the Electronics journal (2.397 impact factor, JCR Q2 2020). This work is a product of the TETRAMAX VLP Project (Cátedra Cabify-UPM).

The full paper can be found at this URL.

Abstract:

The evolution of ambient intelligence has introduced a range of new opportunities to improve people’s well-being. One of these opportunities is the use of these technologies to enhance workplaces and improve employees’ comfort and productivity. However, these technologies often entail two major challenges: the requirement for fast and reliable data transmission between the vast number of devices connected simultaneously, and the interoperability between these devices. Conventional communication technologies present some drawbacks in these kinds of systems, such as lower data rates and electromagnetic interference, which have prompted research into new wireless communication technologies. One of these technologies is visible light communication (VLC), which uses existing light in an environment to transmit data. Its characteristics make it an up-and-coming technology for IoT services but also aggravate the interoperability challenge. To facilitate the continuous communication of the enormous amount of heterogeneous data generated, highly agile data models are required. The semantic approach tackles this problem by switching from ad hoc application-centric representation models and formats to a formal definition of concepts and relationships. This paper aims to advance the state of the art by proposing a semantic vocabulary for an intelligent automation platform with VLC enabled, which benefits from the advantages of VLC while ensuring the scalability and interoperability of all system components. Thus, the main contributions of this work are threefold: (i) the design and definition of a semantic model for an automation platform; (ii) the development of a prototype automation platform based on a VLC-based communication system; and (iii) the integration and validation of the proposed semantic model in the VLC-based automation platform.

The article "GSITK: A sentiment analysis framework for agile replication and development", by Oscar Araque, J. Fernando Sánchez-Rada, and Carlos A. Iglesias, has been published in the SoftwareX journal (1.959 impact factor, JCR Q3 2020). The paper describes the GSITK software, a framework for a wide variety of sentiment analysis tasks, including dataset acquisition, text preprocessing, model design, and performance evaluation.

The full paper can be openly accessed at this URL.

Abstract:

GSITK is a framework to perform a wide variety of sentiment analysis tasks, including dataset acquisition, text preprocessing, model design, and performance evaluation. The framework is oriented to both researchers and practitioners, easing the replication of previous sentiment models, as well as offering implementations of common tasks. This is achieved by building several abstractions on top of popular libraries such as scikit-learn and NLTK. In this way, GSITK allows users to implement complex sentiment pipelines using comprehensible Python code. The framework is Open Source and has been used successfully in several research projects and competitions.

The application for an Educational Innovation Project on challenge-based learning has been approved in the 2021-2022 call for “Ayudas a la Innovación Educativa y a la Mejora de la Calidad de la Enseñanza” (Grants for Educational Innovation and Improvement of Teaching Quality) of the Universidad Politécnica de Madrid.

The project will focus on improving the experience of both students and teachers in practical, programming-related courses.

Learning to program requires a combination of general concepts, concepts specific to the language or platform in use, and a great deal of practice by the student. However, linking lectures with practical work is a challenge for teachers. The traditional approach complements lectures with exercises for the students. These exercises are graded either manually or with tools such as self-graded assignments in Moodle.

This approach has two problems. On the one hand, there is a sharp separation between lectures and practical activities. On the other hand, relying on tools such as Moodle for assessment slows down the feedback loop with the student. In addition, designing and configuring such assignments can be costly, so in practice very few of them are created.

An alternative is to use material that combines theory with interactive, hands-on parts. One of the most widespread environments for this is Jupyter Notebooks. In the “Aplicación de Tecnologías Inteligentes a la Educación en Ingeniería” (Application of Intelligent Technologies to Engineering Education) group, we have gradually migrated several courses from traditional materials (slides, documents) to Jupyter Notebook format. The resulting notebooks interleave documentation, guided elements that let students try things out for themselves, and self-assessment code. That code lets students detect their mistakes and iterate quickly without leaving the Jupyter environment. Once the tasks are completed, students save the notebook and submit it through Moodle. Teachers can then grade the submissions manually or with tests that are more exhaustive than the self-assessment ones.
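To give an idea of what a self-assessment cell in such a notebook might look like, here is a minimal sketch. The helper `check_exercise` and the sample exercise `add` are hypothetical names invented for this example; they are not the code used in the group's courses.

```python
from typing import Any, Callable, List, Tuple


def check_exercise(func: Callable[..., Any],
                   cases: List[Tuple[tuple, Any]]) -> List[str]:
    """Run func on each (args, expected) pair and collect feedback messages."""
    feedback = []
    for args, expected in cases:
        try:
            result = func(*args)
        except Exception as exc:  # report the error instead of stopping
            feedback.append(
                f"{func.__name__}{args} raised {type(exc).__name__}: {exc}")
            continue
        if result != expected:
            feedback.append(
                f"{func.__name__}{args} returned {result!r}, "
                f"expected {expected!r}")
    return feedback


# Hypothetical exercise: the student completes this function in the notebook.
def add(a, b):
    return a + b


# Self-assessment cell: gives immediate feedback inside Jupyter.
problems = check_exercise(add, [((1, 2), 3), ((0, 0), 0)])
print("All checks passed" if not problems else "\n".join(problems))
```

Collecting messages rather than stopping at the first failed assertion lets the student see every failing case in a single run.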

However, two limitations remain. On the one hand, separating the complete content (with official solutions) from the version shared with students can be tedious, and a mistake can leak the solution to the students. On the other hand, exhaustively grading the submitted notebooks can be costly. Tools exist for generating graded assignments, but they are not properly integrated with the learning platform used at the university (Moodle). In this project, we propose integrating the Jupyter Notebook-based teaching approach with tools that enable automated grading of student submissions. The goal is to make it easier to create self-contained teaching material that combines theoretical concepts with autonomous experimentation by the student.
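One way to reduce the risk of leaking solutions is to generate the student version automatically from the complete notebook. The sketch below works on the notebook's JSON structure (a dictionary with a `cells` list) using only the standard library. The marker comments, the placeholder text, and the function names are assumptions made for this example, not the tooling the project will adopt.

```python
import copy
from typing import Any, Dict, List

# Hypothetical markers that the teacher places around the official solution.
BEGIN, END = "### BEGIN SOLUTION", "### END SOLUTION"
PLACEHOLDER = "# YOUR CODE HERE"


def strip_solution_lines(lines: List[str]) -> List[str]:
    """Replace every marked solution region with a single placeholder line."""
    kept, inside = [], False
    for line in lines:
        stripped = line.strip()
        if stripped == BEGIN:
            inside = True
            indent = line[: len(line) - len(line.lstrip())]
            kept.append(f"{indent}{PLACEHOLDER}\n")
        elif stripped == END:
            inside = False
        elif not inside:
            kept.append(line)
    return kept


def student_version(notebook: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the notebook without solutions or stored outputs."""
    nb = copy.deepcopy(notebook)  # never modify the teacher's copy
    for cell in nb.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", [])
        if isinstance(source, str):  # source may be a string or a list
            source = source.splitlines(keepends=True)
        cell["source"] = strip_solution_lines(source)
        cell["outputs"] = []  # outputs could also reveal the answer
        cell["execution_count"] = None
    return nb
```

In practice the complete notebook would be read with `json.load`, passed through `student_version`, and written to a separate file with `json.dump`, so that only the stripped file is ever shared with students.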

We expect that providing an environment for creating hybrid content that is both self-assessable and externally gradable will promote the use of this kind of material in the classroom. In our experience with similar courses, students receive this type of content very well, and making it easier to create should increase both the amount of such content and its quality.

The article "Transfer Learning with Social Media Content in the Ride-Hailing Domain by Using a Hybrid Machine Learning Architecture", by Álvaro de Pablo, Oscar Araque, and Carlos A. Iglesias, has been published in the Electronics journal (2.397 impact factor, JCR Q3 2020). This work is a product of the Cabify-UPM Chair (Cátedra Cabify-UPM).

The full paper can be found at this URL.

Abstract:

The analysis of the content of posts written on social media has established an important line of research in recent years. The study of these texts, as well as their relationship with each other and their dependence on the platform on which they are written, enables the behavior analysis of users and their opinions with respect to different domains. In this work, a hybrid machine learning-based system has been developed to classify texts using topic modeling techniques and different word-vector representations, as well as traditional text representations. The system has been trained with ride-hailing posts extracted from Reddit, showing promising performance. Then, the generated models have been tested with data extracted from other sources such as Twitter and Google Play, classifying these texts without retraining any models and thus performing Transfer Learning. The obtained results show that our proposed architecture is effective when performing Transfer Learning from data-rich domains and applying them to other sources.