dc.date.accessioned |
2016-08-17T10:00:30Z |
und |
dc.date.accessioned |
2017-10-24T12:24:12Z |
|
dc.date.available |
2016-08-17T10:00:30Z |
und |
dc.date.available |
2017-10-24T12:24:12Z |
|
dc.date.issued |
2016-08-17T10:00:30Z |
|
dc.identifier.uri |
http://radr.hulib.helsinki.fi/handle/10138.1/5708 |
und |
dc.identifier.uri |
http://hdl.handle.net/10138.1/5708 |
|
dc.title |
Training Algorithms for Multilingual Latent Dirichlet Allocation |
en |
ethesis.department.URI |
http://data.hulib.helsinki.fi/id/225405e8-3362-4197-a7fd-6e7b79e52d14 |
|
ethesis.department |
Institutionen för datavetenskap |
sv |
ethesis.department |
Department of Computer Science |
en |
ethesis.department |
Tietojenkäsittelytieteen laitos |
fi |
ethesis.faculty |
Matematisk-naturvetenskapliga fakulteten |
sv |
ethesis.faculty |
Matemaattis-luonnontieteellinen tiedekunta |
fi |
ethesis.faculty |
Faculty of Science |
en |
ethesis.faculty.URI |
http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca |
|
ethesis.university.URI |
http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97 |
|
ethesis.university |
Helsingfors universitet |
sv |
ethesis.university |
University of Helsinki |
en |
ethesis.university |
Helsingin yliopisto |
fi |
dct.creator |
Jin, Haibo |
|
dct.issued |
2016 |
|
dct.language.ISO639-2 |
eng |
|
dct.abstract |
Multilingual Latent Dirichlet Allocation (MLDA) is an extension of Latent Dirichlet Allocation (LDA) to a multilingual setting, which aims to discover aligned latent topic structures in a parallel corpus. Although the two popular training algorithms for LDA, collapsed Gibbs sampling and variational inference, can be naturally adapted to MLDA, both become time-inefficient with MLDA due to its special structure. To address this problem, we propose an approximate training framework for MLDA that works with both collapsed Gibbs sampling and variational inference. Through experiments, we show that the proposed framework reduces the training time of MLDA considerably, especially when there are many languages. We also summarize the scenarios in which the approximate framework gives model accuracy comparable to that of the standard framework. Finally, we discuss several possible directions for future work. |
en |
dct.language |
en |
|
ethesis.language.URI |
http://data.hulib.helsinki.fi/id/languages/eng |
|
ethesis.language |
English |
en |
ethesis.language |
englanti |
fi |
ethesis.language |
engelska |
sv |
ethesis.thesistype |
pro gradu-avhandlingar |
sv |
ethesis.thesistype |
pro gradu -tutkielmat |
fi |
ethesis.thesistype |
master's thesis |
en |
ethesis.thesistype.URI |
http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis |
|
ethesis.degreeprogram |
Algorithms and Machine Learning |
en |
dct.identifier.urn |
URN:NBN:fi-fe2017112251760 |
|
dc.type.dcmitype |
Text |
|