A Variant of the Multiple Random Projection Trees Algorithm for Nearest Neighbors Search

A Variant of the Multiple Random Projection Trees Algorithm for Nearest Neighbors Search

dc.date.accessioned	2016-08-17T10:00:23Z	und
dc.date.accessioned	2017-10-24T12:24:11Z
dc.date.available	2016-08-17T10:00:23Z	und
dc.date.available	2017-10-24T12:24:11Z
dc.date.issued	2016-08-17T10:00:23Z
dc.identifier.uri	http://radr.hulib.helsinki.fi/handle/10138.1/5706	und
dc.identifier.uri	http://hdl.handle.net/10138.1/5706
dc.title	A Variant of the Multiple Random Projection Trees Algorithm for Nearest Neighbors Search	en
ethesis.department.URI	http://data.hulib.helsinki.fi/id/225405e8-3362-4197-a7fd-6e7b79e52d14
ethesis.department	Institutionen för datavetenskap	sv
ethesis.department	Department of Computer Science	en
ethesis.department	Tietojenkäsittelytieteen laitos	fi
ethesis.faculty	Matematisk-naturvetenskapliga fakulteten	sv
ethesis.faculty	Matemaattis-luonnontieteellinen tiedekunta	fi
ethesis.faculty	Faculty of Science	en
ethesis.faculty.URI	http://data.hulib.helsinki.fi/id/8d59209f-6614-4edd-9744-1ebdaf1d13ca
ethesis.university.URI	http://data.hulib.helsinki.fi/id/50ae46d8-7ba9-4821-877c-c994c78b0d97
ethesis.university	Helsingfors universitet	sv
ethesis.university	University of Helsinki	en
ethesis.university	Helsingin yliopisto	fi
dct.creator	Tuomainen, Risto Olli Oskari
dct.issued	2016
dct.language.ISO639-2	eng
dct.abstract	In nearest neighbors search the task is to find points from a data set that lie close in space to a given query point. To improve on brute force search, that computes distances between the query point and all data points, numerous data structures have been developed. These however perform poorly in high dimensional spaces. To tackle nearest neighbors search in high dimensions it is commonplace to use approximate methods that only return nearest neighbors with high probability. In practice an approximate solution is often as good as an exact one, among other reasons because approximations can be of such a high quality that they are practically indistinguishable from exact solutions. Approximate nearest neighbors search has found applications in many different fields, and can for example be used in the context of recommendation systems. One class of approximate nearest neighbors algorithms is space partitioning methods. These algorithms recursively partition the data set to smaller subsets in order to construct a search structure. Queries can then be performed very efficiently by using this structure to prune data points without needing to evaluate their distances to the query point. A recent proposal belonging to this class of algorithms is multiple random projections trees (MRPT). MRPT uses random projection trees (RP-trees) to prune the set from which nearest neighbors are searched. This thesis proposes a voting algorithm for using multiple RP-trees in nearest neighbors search. We also discuss a further improvement, called mixture method. The performance of these algorithms was evaluated against the previous MRPT algorithm using two moderately high dimensional data sets. Mixture method was found to improve considerably on MRPT in terms of accuracy attained. The results presented in this thesis suggest that the mixture method may potentially be a strong algorithm for nearest neighbors search, especially in very high dimensional spaces.	en
dct.language	en
ethesis.language.URI	http://data.hulib.helsinki.fi/id/languages/eng
ethesis.language	English	en
ethesis.language	englanti	fi
ethesis.language	engelska	sv
ethesis.thesistype	pro gradu-avhandlingar	sv
ethesis.thesistype	pro gradu -tutkielmat	fi
ethesis.thesistype	master's thesis	en
ethesis.thesistype.URI	http://data.hulib.helsinki.fi/id/thesistypes/mastersthesis
ethesis.degreeprogram	Algorithms and Machine Learning	en
dct.identifier.urn	URN:NBN:fi-fe2017112251758
dc.type.dcmitype	Text

Files in this item

Files	Size	Format	View
tuomainen_gradu_final.pdf	670.7Kb	PDF

This item appears in the following Collection(s)

Faculty of Science [4203]

Show simple item record

A Variant of the Multiple Random Projection Trees Algorithm for Nearest Neighbors Search

Files in this item

This item appears in the following Collection(s)

Yhteystiedot

HELSINGIN YLIOPISTO