Browsing by Subject "data-to-text generation; text generation"

Now showing items 1-1 of 1

A Review of Proposals for Improvements in Evaluation of Natural Language Generation

Moilanen, Jouni Petteri (2023)

In recent years, concern has grown within the NLG community about the comparability of systems and the reproducibility of research results. This concern has mainly focused on the evaluation of NLG systems. Problems with automated metrics, crowd-sourced human evaluations, sloppy experimental design and error reporting, etc. have been widely discussed in the literature. Many proposals for best practices, metrics, frameworks and benchmarks for NLG evaluation have recently been put forward to address these problems. In this thesis we examine the current state of NLG evaluation – focusing on data-to-text evaluation – in terms of proposed best practices, benchmarks, etc., and their adoption in practice. Academic publications concerning NLG evaluation that were indexed in the Scopus database and published in 2018-2022 were examined. After manual inspection, 141 of them were deemed to contain some kind of concrete proposal for improvements in evaluation practices. The adoption (use in practice) of these proposals was then examined by inspecting the papers citing them. There seems to be a willingness in the academic community to adopt these proposals, especially "best practices" and metrics. As for datasets, benchmarks, evaluation platforms, etc., the results are inconclusive.
