I have a 20-dimensional dataset, with a large amount of data points. I would like to have each dimension discretized into bins. Per bin, I would like to be able to access two neighbours per dimension (i.e. +1 and -1 per dimension). Basically I want to be able to easily access 2*d (where d is the dimensionality) neighbours. For lower dimensions, this would be quite easy to do using a multidimensional array (i.e. for point data[0][1][2]
I would access its neighbours data[0][1][3]
and data[0][1][1]
for the third dimension). However, when this approach is scaled up to higher dimensions, memory becomes an issue.
What kind of data structures would be suitable to use, where the most important criterium is the quick and easy access to its +1 and -1 neighbours?
Asked By : danielvdende
Answered By : Tolga Birdal
I would suggest you to use a hierarchical K-means, or a hierarchical vocabulary tree approach. For example:
http://gecco.org.chemie.uni-frankfurt.de/hkmeans/H-k-means.pdf http://www.vlfeat.org/overview/hikm.html
CVPR Paper with application: http://www.vis.uky.edu/~stewe/publications/nister_stewenius_cvpr2006.pdf
FLANN also has some kind of hierarchical capabilities.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/26435
0 comments:
Post a Comment
Let us know your responses and feedback