Today, the web is the most widely used resource for information. Consequently, public administrations regularly publish regulations, public announcements, and administrative procedures online. This digital transition can widen the digital divide if the language used is unclear to large segments of the population. The obligation for administrations to use clear, accessible, and comprehensible language is supported by various national and international regulations. Producing clear texts is nevertheless laborious, because guidelines contain hundreds of recommendations whose application is not always straightforward. New support tools for writing and editing clear texts can streamline and improve their production.
To facilitate information access, diverse recommendations have been proposed, often based on heuristics, to produce comprehensible texts. These recommendations emphasize using clear, simple, and accessible language with straightforward syntax, as well as coherent and well-structured writing. Readability metrics have also been proposed to estimate text difficulty, based on word and sentence length. These metrics are used as thresholds to assess readability levels. However, these approaches are often simplistic and deterministic, and sometimes lack robust scientific support. They do not consider the readers' education, reading or linguistic abilities, or other sociodemographic characteristics that can affect these thresholds.
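To make the idea of a length-based readability metric concrete, the following is a minimal sketch of one such metric, the LIX index, which adds the mean sentence length to the percentage of words longer than six letters. It is an illustration only and not the metric used by the project. The function names `lix` and `exceeds_threshold` and the cut-off value of 40 are assumptions chosen for the example.

```python
import re


def lix(text: str) -> float:
    """LIX index: mean words per sentence plus percentage of long words."""
    # Split on sentence-final punctuation and drop empty fragments.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    # Words are runs of letters (Unicode-aware, so accented letters count).
    words = re.findall(r"[^\W\d_]+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    # LIX treats words of more than six letters as long.
    long_words = [w for w in words if len(w) > 6]
    return len(words) / len(sentences) + 100 * len(long_words) / len(words)


def exceeds_threshold(text: str, threshold: float = 40.0) -> bool:
    """Flag a text as difficult using a fixed cut-off (illustrative value)."""
    return lix(text) > threshold


if __name__ == "__main__":
    sample = "Administrations publish regulations."
    print(lix(sample), exceeds_threshold(sample))
```

The sketch also shows the limitation described above: the score depends only on surface length, and the single fixed threshold is applied identically to every reader, regardless of education or reading ability.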
The project aims to analyze text difficulty in order to identify strategies that facilitate the creation of texts in clear language, and to study the possibility of inclusive adaptations for different groups and individuals. These strategies will initially be based on the existing literature on comprehension studies and on recommendation guidelines. From these, prompts will be defined for text generation by means of generative Artificial Intelligence (AI). The results will be analyzed using natural language processing and pattern identification techniques. Finally, they will be validated with real users and tracking hardware, such as eye trackers, to identify possible problems. A feedback system will also be enabled. The process will be repeated based on the results, refining the AI output with the observed features.
The project will result in a web service that evaluates text difficulty, detects challenges, and offers recommendations for adaptation to clear language, with the added possibility of personalization for population groups and individuals. The expected outcomes include:
This action has been funded through the R&D activities programme with reference PHS-2024/PH-HUM-313 and acronym CLINFO-CM, granted by the Comunidad de Madrid through the Dirección General de Investigación e Innovación Tecnológica under the Order CONVOCATORIA DE PROCESOS HUMANOS Y SOCIALES 2024 DE LA COMUNIDAD DE MADRID.
Budget: €70,500.00