Development and Evaluation of a Mental Health Detection System on Social Media Posts using Natural Language Processing and Machine Learning Techniques

Andrea Laguna Liang. (2023). Development and Evaluation of a Mental Health Detection System on Social Media Posts using Natural Language Processing and Machine Learning Techniques. Final Career Project (TFG). Universidad Politécnica de Madrid, ETSI Telecomunicación.

Abstract:
Depression is a common mental disorder that affects all areas of life and can lead to suicide. However, treatments exist and prevention programs are effective [67]. The aim of this work is to design and implement an automatic system for detecting depression on social media, using Machine Learning (ML) and Natural Language Processing (NLP) techniques. There were two main objectives. The first was to further determine whether contextualization through emotions and topics could be used for depression detection. The second was to profile model alternatives in terms of the trade-off between performance and energy efficiency. Both objectives were achieved. Regarding the first objective, this work shows that contextualization through emotion and topic information was informative for the detection of depression, despite lowering the F-score in some cases. The results were particularly informative for the DepSign dataset, where concrete examples of behaviours indicating depression were found. For example, severe depression was indicated when the topic of medications was detected, or when certain words such as "depression", "suicide" and "depressed" were present. This shows that relevant knowledge can be extracted from ML and NLP algorithms. Regarding the second objective, a clear candidate for the trade-off between computational cost and performance was identified. This shows that some applications may not need state-of-the-art techniques, and that pre-existing techniques with a better balance between computational cost and performance can be used instead. The concrete example found in this work was also in the DepSign dataset, where using SIMON, an algorithm based on word embeddings, rather than a Transformer model reduced the F-score by only 2%, while the cost was more than a hundred times lower.
Finally, interpretability was found to be a key component of the analysis in this work, especially given that it concerns a medical field.