Towards text-based prediction of phrasal prominence

dc.date.accessioned	2015-05-25T11:30:03Z	und
dc.date.accessioned	2017-10-24T12:23:59Z
dc.date.available	2015-05-25T11:30:03Z	und
dc.date.available	2017-10-24T12:23:59Z
dc.date.issued	2015-05-25T11:30:03Z
dc.identifier.uri	http://radr.hulib.helsinki.fi/handle/10138.1/4730	und
dc.identifier.uri	http://hdl.handle.net/10138.1/4730
dc.title	Towards text-based prediction of phrasal prominence	en
ethesis.department.URI	http://data.hulib.helsinki.fi/id/225405e8-3362-4197-a7fd-6e7b79e52d14
ethesis.department	Institutionen för datavetenskap	sv
ethesis.department	Department of Computer Science	en
ethesis.department	Tietojenkäsittelytieteen laitos	fi
ethesis.faculty	Matematisk-naturvetenskapliga fakulteten	sv
ethesis.faculty	Matemaattis-luonnontieteellinen tiedekunta	fi
ethesis.faculty	Faculty of Science	en
ethesis.faculty.URI	http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca
ethesis.university.URI	http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97
ethesis.university	Helsingfors universitet	sv
ethesis.university	University of Helsinki	en
ethesis.university	Helsingin yliopisto	fi
dct.creator	Kuusisto, Teemu
dct.issued	2015
dct.language.ISO639-2	eng
dct.abstract	The objective of this thesis was text-based prediction of phrasal prominence. The task was motivated by the goal of more natural-sounding speech synthesis, because phrasal prominence, which depicts the relative saliency of words within a phrase, is a natural part of spoken language. Following the majority of previous research, prominence is predicted at a binary level derived from a symbolic representation of pitch movements. In practice, new classifiers and new models from different fields of natural language processing were explored. The applicability of spatial and graph-based language models was tested by proposing features such as word vectors, a high-dimensional vector-space representation, and DegExt, a keyword weighting method. Support vector machines (SVMs) were used because they are widely suited to supervised classification tasks with high-dimensional continuous-valued input. A linear inner product and a non-linear radial basis function (RBF) were used as kernels. Furthermore, hidden Markov support vector machines (HM-SVMs) were evaluated to investigate the benefits of sequential classification. The experiments on the widely used Boston University Radio News Corpus (BURNC) were successful in two major ways. Firstly, the non-linear support vector machine with the best-performing features achieved performance similar to that of the previous state-of-the-art approach reported by Rangarajan et al. [RNB06]. Secondly, the newly proposed features based on word vectors moderately outperformed part-of-speech tags, which have invariably been the best-performing feature throughout the research on text-based prominence prediction.	en
dct.language	en
ethesis.language.URI	http://data.hulib.helsinki.fi/id/languages/eng
ethesis.language	English	en
ethesis.language	englanti	fi
ethesis.language	engelska	sv
ethesis.thesistype	pro gradu-avhandlingar	sv
ethesis.thesistype	pro gradu -tutkielmat	fi
ethesis.thesistype	master's thesis	en
ethesis.thesistype.URI	http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis
ethesis.degreeprogram	Algorithms and Machine Learning	en
dct.identifier.urn	URN:NBN:fi-fe2017112252492
dc.type.dcmitype	Text

Files in this item

Files	Size	Format
towards_text-ba ... hrasal_prominence_2015.pdf	506.2Kb	PDF

This item appears in the following Collection(s)

Faculty of Science [4203]
