Browsing by Author "Hynynen, Jussi-Veikka"
Now showing items 1-1 of 1
Hynynen, Jussi-Veikka (2023) Using language that is easy to understand when presenting information in written form is critical for effective communication. Yet language that is too complex or technical for its intended audience is a common pitfall in many domains, such as legal and medical text. Automatic text simplification (ATS) aims to automate the conversion of complex text into a simpler, more easily comprehensible form. This study explores ATS models for English that can be controlled in terms of the readability of the output text. Readability is measured with an automatically calculated readability level that corresponds to a school grade level. The readability-controlled models take a readability level as a parameter and simplify input text to match the reading level of the intended audience corresponding to the parameter value. In total, six readability-controlled sentence simplification models with different control attribute configurations are trained in this study. The models use a pretrained sequence-to-sequence model architecture that is fine-tuned on a dataset of sentence pairs in regular and simple English. The trained models are evaluated using automatic evaluation metrics and compared with each other and with ATS systems from previous research. Additionally, the simplified sentences produced by the best-performing model are evaluated manually to identify errors and the types of text transformations that the model employs to simplify sentences. When the readability level input value is optimized to maximise model performance on validation data, the readability-controlled models surpass systems from previous works in terms of automatic evaluation metrics, suggesting that adding readability level as a control attribute improves simplification quality. Manual evaluation shows that readability-controlled models are capable of splitting long sentences into multiple shorter sentences, which reduces the syntactic complexity of the text.
This finding suggests that readability level metrics can be used to effectively control syntactic complexity in ATS models as a lightweight alternative to previously applied, more computationally demanding methods that rely on dependency parsing. Finally, this study discusses the different types of errors produced by the models, their potential causes, and ways to reduce errors in future ATS systems.
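The abstract does not name the readability metric or describe how the readability level is passed to the model. As a rough illustration only, the sketch below assumes the Flesch-Kincaid Grade Level (one common grade-level formula) with a crude vowel-group syllable heuristic, and a hypothetical control-token format `<GRADE_n>` prepended to the input sentence. None of these names or choices are taken from the thesis.

```python
import re


def count_syllables(word: str) -> int:
    """Crude heuristic (an assumption): one syllable per group of vowels."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def fkgl(text: str) -> float:
    """Flesch-Kincaid Grade Level, assumed here as the readability metric."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


def add_control_token(sentence: str, target_grade: int) -> str:
    """Prepend a hypothetical control token carrying the target grade level."""
    return f"<GRADE_{target_grade}> {sentence}"


if __name__ == "__main__":
    long_version = "The cat sat and the dog ran."
    split_version = "The cat sat. The dog ran."
    # Splitting lowers words per sentence, so the grade level drops.
    print(fkgl(long_version), fkgl(split_version))
    print(add_control_token(long_version, 3))
```

Because this formula depends on the average number of words per sentence, splitting one long sentence into several shorter ones lowers the score. That is consistent with the observation above that a readability level can act as a lightweight handle on syntactic complexity.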