dc.date.accessioned |
2015-05-25T11:30:03Z |
und |
dc.date.accessioned |
2017-10-24T12:23:59Z |
|
dc.date.available |
2015-05-25T11:30:03Z |
und |
dc.date.available |
2017-10-24T12:23:59Z |
|
dc.date.issued |
2015-05-25T11:30:03Z |
|
dc.identifier.uri |
http://radr.hulib.helsinki.fi/handle/10138.1/4730 |
und |
dc.identifier.uri |
http://hdl.handle.net/10138.1/4730 |
|
dc.title |
Towards text-based prediction of phrasal prominence |
en |
ethesis.department.URI |
http://data.hulib.helsinki.fi/id/225405e8-3362-4197-a7fd-6e7b79e52d14 |
|
ethesis.department |
Institutionen för datavetenskap |
sv |
ethesis.department |
Department of Computer Science |
en |
ethesis.department |
Tietojenkäsittelytieteen laitos |
fi |
ethesis.faculty |
Matematisk-naturvetenskapliga fakulteten |
sv |
ethesis.faculty |
Matemaattis-luonnontieteellinen tiedekunta |
fi |
ethesis.faculty |
Faculty of Science |
en |
ethesis.faculty.URI |
http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca |
|
ethesis.university.URI |
http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97 |
|
ethesis.university |
Helsingfors universitet |
sv |
ethesis.university |
University of Helsinki |
en |
ethesis.university |
Helsingin yliopisto |
fi |
dct.creator |
Kuusisto, Teemu |
|
dct.issued |
2015 |
|
dct.language.ISO639-2 |
eng |
|
dct.abstract |
The objective of this thesis was text-based prediction of phrasal prominence. Improving natural sounding speech synthesis motivated the task, because phrasal prominence, which depicts the relative saliency of words within a phrase, is a natural part of spoken language. Following the majority of previous research, prominence is predicted on binary level derived from a symbolic representation of pitch movements. In practice, new classifiers and new models from different fields of natural language processing were explored. Applicability of spatial and graph-based language models was tested by proposing such features as word vectors, a high-dimensional vector-space representation, and DegExt, a keyword weighting method. Support vector machines (SVMs) were used due to their widespread suitability to supervised classification tasks with high-dimensional continuous-valued input. Linear inner product and non-linear radial basis function (RBF) were used as kernels. Furthermore, hidden Markov support vector machines (HM-SVMs) were evaluated to investigate benefits of sequential classification. The experiments on the widely used Boston University Radio News Corpus (BURNC) were successful in two major ways: Firstly, the non-linear support vector machine along with the best performing features achieved similar performance than the previous state-of-the-art approach reported by Rangarajan et al. [RNB06]. Secondly, newly proposed features based on word vectors moderately outperformed part-of-speech tags, which has been inevitably the best performing feature throughout the research of text-based prominence prediction. |
en |
dct.language |
en |
|
ethesis.language.URI |
http://data.hulib.helsinki.fi/id/languages/eng |
|
ethesis.language |
English |
en |
ethesis.language |
englanti |
fi |
ethesis.language |
engelska |
sv |
ethesis.thesistype |
pro gradu-avhandlingar |
sv |
ethesis.thesistype |
pro gradu -tutkielmat |
fi |
ethesis.thesistype |
master's thesis |
en |
ethesis.thesistype.URI |
http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis |
|
ethesis.degreeprogram |
Algorithms and Machine Learning |
en |
dct.identifier.urn |
URN:NBN:fi-fe2017112252492 |
|
dc.type.dcmitype |
Text |
|