Conditional Neural Headline Generation for Finnish

Title: Conditional Neural Headline Generation for Finnish
Author(s): Koppatz, Maximilian
Contributor: University of Helsinki, Faculty of Science
Degree program: Master's Programme in Data Science
Specialisation: no specialization
Language: English
Acceptance year: 2022
Automatic headline generation has the potential to significantly assist editors charged with headlining articles. Approaches to automation in the headlining process range from tools that serve as creative aids to complete end-to-end automation. The latter is difficult to achieve because the journalistic requirements imposed on headlines must be met with little room for error, and those requirements depend on the news brand in question. This thesis investigates automatic headline generation in the context of the Finnish newsroom. The primary question I seek to answer is how well the current state of text generation using deep neural language models can be applied to the headlining process in Finnish news media. To answer this, I have implemented and pre-trained a Finnish generative language model based on the Transformer architecture. I have fine-tuned this language model for headline generation, framed as autoregressive generation of headlines conditioned on the article text. To generate a diverse set of headlines for a given text, I have designed and implemented a variation of the Diverse Beam Search algorithm with additional parameters. The generative capabilities of this system were evaluated with real-world usage in mind. I asked domain experts in headlining to evaluate a generated set of text-headline pairs. The task was to accept or reject the individual headlines on key criteria. The survey responses were then analyzed quantitatively and qualitatively. Based on the analysis and feedback, this model can already be useful as a creative aid in the newsroom, despite being far from ready for full automation. I have identified concrete directions for improvement based on the most common types of errors, which provide interesting future work.
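To make the decoding step mentioned in the abstract more concrete, the following is a minimal sketch of Diverse Beam Search in its commonly described Hamming-diversity form. It is not the thesis's variation, whose additional parameters are not described in this record. The function `toy_log_probs`, the four-token vocabulary, and all parameter names are hypothetical stand-ins for a real language model conditioned on an article.

```python
import math
from collections import Counter

VOCAB = ["a", "b", "c", "<eos>"]  # hypothetical toy vocabulary


def toy_log_probs(prefix):
    """Hypothetical stand-in for a conditional language model.

    Returns log p(token | prefix) for every token in VOCAB. A real system
    would also condition on the article text.
    """
    weights = [4.0, 3.0, 2.0, 1.0 + len(prefix)]
    total = sum(weights)
    return {tok: math.log(w / total) for tok, w in zip(VOCAB, weights)}


def diverse_beam_search(log_probs, num_groups=2, beams_per_group=2,
                        diversity_penalty=1.0, max_len=3):
    # Each group holds (tokens, cumulative log-probability) hypotheses.
    groups = [[((), 0.0)] for _ in range(num_groups)]
    for step in range(max_len):
        chosen = Counter()  # tokens picked by earlier groups at this step
        for g in range(num_groups):
            candidates = []
            for tokens, score in groups[g]:
                if tokens and tokens[-1] == "<eos>":
                    # Finished hypothesis: carry it over unchanged.
                    candidates.append((score, score, tokens))
                    continue
                for tok, lp in log_probs(tokens).items():
                    new_score = score + lp
                    # Hamming diversity: penalise tokens that earlier
                    # groups already selected at this time step.
                    ranked = new_score - diversity_penalty * chosen[tok]
                    candidates.append((ranked, new_score, tokens + (tok,)))
            candidates.sort(key=lambda c: c[0], reverse=True)
            kept = candidates[:beams_per_group]
            # Store the unpenalised score; the penalty only affects ranking.
            groups[g] = [(tokens, score) for _, score, tokens in kept]
            for tokens, _ in groups[g]:
                if len(tokens) == step + 1:  # extended during this step
                    chosen[tokens[-1]] += 1
    return groups


if __name__ == "__main__":
    for i, group in enumerate(diverse_beam_search(toy_log_probs)):
        for tokens, score in group:
            print(i, " ".join(tokens), round(score, 3))
```

With `diversity_penalty=0.0` every group reduces to ordinary beam search and returns the same hypotheses. Raising the penalty pushes later groups toward tokens that earlier groups did not pick, which is the mechanism that yields a varied set of candidates.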
Keyword(s): text generation natural language processing deep learning algorithms headline generation

Files in this item

Files                              Size      Format
maximilian_koppatz_msc_2022.pdf    881.1Kb   PDF