Skip to main content
Login | Suomeksi | På svenska | In English

A Variant of the Multiple Random Projection Trees Algorithm for Nearest Neighbors Search

Show simple item record

dc.date.accessioned 2016-08-17T10:00:23Z und
dc.date.accessioned 2017-10-24T12:24:11Z
dc.date.available 2016-08-17T10:00:23Z und
dc.date.available 2017-10-24T12:24:11Z
dc.date.issued 2016-08-17T10:00:23Z
dc.identifier.uri http://radr.hulib.helsinki.fi/handle/10138.1/5706 und
dc.identifier.uri http://hdl.handle.net/10138.1/5706
dc.title A Variant of the Multiple Random Projection Trees Algorithm for Nearest Neighbors Search en
ethesis.department.URI http://data.hulib.helsinki.fi/id/225405e8-3362-4197-a7fd-6e7b79e52d14
ethesis.department Institutionen för datavetenskap sv
ethesis.department Department of Computer Science en
ethesis.department Tietojenkäsittelytieteen laitos fi
ethesis.faculty Matematisk-naturvetenskapliga fakulteten sv
ethesis.faculty Matemaattis-luonnontieteellinen tiedekunta fi
ethesis.faculty Faculty of Science en
ethesis.faculty.URI http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca
ethesis.university.URI http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97
ethesis.university Helsingfors universitet sv
ethesis.university University of Helsinki en
ethesis.university Helsingin yliopisto fi
dct.creator Tuomainen, Risto Olli Oskari
dct.issued 2016
dct.language.ISO639-2 eng
dct.abstract In nearest neighbors search the task is to find points from a data set that lie close in space to a given query point. To improve on brute force search, that computes distances between the query point and all data points, numerous data structures have been developed. These however perform poorly in high dimensional spaces. To tackle nearest neighbors search in high dimensions it is commonplace to use approximate methods that only return nearest neighbors with high probability. In practice an approximate solution is often as good as an exact one, among other reasons because approximations can be of such a high quality that they are practically indistinguishable from exact solutions. Approximate nearest neighbors search has found applications in many different fields, and can for example be used in the context of recommendation systems. One class of approximate nearest neighbors algorithms is space partitioning methods. These algorithms recursively partition the data set to smaller subsets in order to construct a search structure. Queries can then be performed very efficiently by using this structure to prune data points without needing to evaluate their distances to the query point. A recent proposal belonging to this class of algorithms is multiple random projections trees (MRPT). MRPT uses random projection trees (RP-trees) to prune the set from which nearest neighbors are searched. This thesis proposes a voting algorithm for using multiple RP-trees in nearest neighbors search. We also discuss a further improvement, called mixture method. The performance of these algorithms was evaluated against the previous MRPT algorithm using two moderately high dimensional data sets. Mixture method was found to improve considerably on MRPT in terms of accuracy attained. The results presented in this thesis suggest that the mixture method may potentially be a strong algorithm for nearest neighbors search, especially in very high dimensional spaces. en
dct.language en
ethesis.language.URI http://data.hulib.helsinki.fi/id/languages/eng
ethesis.language English en
ethesis.language englanti fi
ethesis.language engelska sv
ethesis.thesistype pro gradu-avhandlingar sv
ethesis.thesistype pro gradu -tutkielmat fi
ethesis.thesistype master's thesis en
ethesis.thesistype.URI http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis
ethesis.degreeprogram Algorithms and Machine Learning en
dct.identifier.urn URN:NBN:fi-fe2017112251758
dc.type.dcmitype Text

Files in this item

Files Size Format View
tuomainen_gradu_final.pdf 670.7Kb PDF

This item appears in the following Collection(s)

Show simple item record