Design and Development of a Reinforcement Learning-based Intelligent Agent for Solving Text Games using TextWorld

Bruno González López. (2020). Design and Development of a Reinforcement Learning-based Intelligent Agent for Solving Text Games using TextWorld. Final Degree Project (TFG). Universidad Politécnica de Madrid, ETSI Telecomunicación.

Abstract:
The objective of this project is to train two intelligent agents to solve text games, in particular a simple game inspired by ETSIT. Their performance will be measured and compared against that of an agent that chooses its actions at random. Text-based games, very popular in the 1980s, are complex interactive simulations in which text describes the state of the game and players progress by entering textual commands. To solve this type of game, an intelligent agent must be able to explore the environment, learn its mechanics, identify the game's objective, understand text and acquire some perception of time. To achieve this, the project draws on machine learning, specifically deep reinforcement learning, which has proved highly effective in virtual environments where agents must learn to choose the best actions to achieve given objectives, and which has become one of the most promising branches of artificial intelligence. The deep reinforcement learning algorithms DQN and A2C have demonstrated strong performance in this kind of environment, so our agents are based on these architectures. Because the games are purely textual environments, the project also uses tools from the field of natural language processing. Understanding language requires skills such as long-term memory, planning and common sense, qualities that our agents will need. TextWorld will be used to create and manage the text games. TextWorld is a Python library developed by Microsoft that runs these games and provides functions for tracking game state and controlling rewards. It also allows us to create new games by hand or to generate them automatically. The generative mechanisms it offers give precise control over the difficulty, scope and language of the games built.
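
The comparison between a learning agent and a random agent can be illustrated with a minimal, self-contained sketch. It is not the thesis code: the three-room game, the class ToyTextGame and all function names are hypothetical, TextWorld is not used, and tabular Q-learning stands in for the deep DQN and A2C agents so that the example runs with the Python standard library alone. As in a text game, observations are room descriptions and actions are textual commands.

```python
import random
from collections import defaultdict

# Hypothetical map: room -> (description, {command: next room}).
# Illustrative stand-in only; not the ETSIT game and not a TextWorld game.
WORLD = {
    "hall": ("You are in the entrance hall. A corridor lies to the east.",
             {"go east": "corridor"}),
    "corridor": ("You are in a long corridor. The lab is to the east.",
                 {"go west": "hall", "go east": "lab"}),
    "lab": ("You are in the lab. You found it!", {}),
}
COMMANDS = ["look", "go west", "go east"]


class ToyTextGame:
    """Observations are room descriptions; actions are textual commands."""

    def __init__(self, max_steps=20):
        self.max_steps = max_steps
        self.reset()

    def reset(self):
        self.room, self.steps, self.won = "hall", 0, False
        return WORLD[self.room][0]

    def step(self, command):
        self.steps += 1
        # A command with no effect in the current room leaves the player in place.
        self.room = WORLD[self.room][1].get(command, self.room)
        self.won = self.room == "lab"
        reward = 1.0 if self.won else 0.0
        done = self.won or self.steps >= self.max_steps
        return WORLD[self.room][0], reward, done


def random_policy(rng):
    """Baseline: ignore the text and pick any command."""
    return lambda observation: rng.choice(COMMANDS)


def greedy_policy(q):
    """Pick the command with the highest learned value for this description."""
    return lambda observation: max(COMMANDS, key=lambda c: q[(observation, c)])


def run_episode(env, policy):
    """Play one game and return (total reward, number of commands issued)."""
    observation, done, score = env.reset(), False, 0.0
    while not done:
        observation, reward, done = env.step(policy(observation))
        score += reward
    return score, env.steps


def train_q_learning(env, episodes=300, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration (a stand-in for DQN)."""
    rng = random.Random(seed)
    q = defaultdict(float)
    greedy = greedy_policy(q)
    for _ in range(episodes):
        observation, done = env.reset(), False
        while not done:
            explore = rng.random() < epsilon
            command = rng.choice(COMMANDS) if explore else greedy(observation)
            next_observation, reward, done = env.step(command)
            # Bootstrap from the next state unless the game was actually won.
            future = 0.0 if env.won else max(q[(next_observation, c)] for c in COMMANDS)
            target = reward + gamma * future
            q[(observation, command)] += alpha * (target - q[(observation, command)])
            observation = next_observation
    return q


def average_steps(env, policy, episodes=100):
    """Metric for comparing agents: mean number of commands per game."""
    return sum(run_episode(env, policy)[1] for _ in range(episodes)) / episodes
```

Under these assumptions, calling average_steps(env, greedy_policy(train_q_learning(env))) and average_steps(env, random_policy(random.Random(1))) on the same ToyTextGame instance gives the kind of metric the abstract describes. The shortest solution is two commands, so a trained agent should approach that value while the random agent needs more commands on average. A real text game has far too many distinct descriptions for a lookup table, which is one reason for replacing the table with a neural network as DQN and A2C do.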