
[Solved]: What does this performance formula mean?

Problem Detail: 

I have to make a quick clustering program but the following formula is gibberish to me:

$\operatorname{Perf}(X,C) = \sum\limits_{i=1}^n\min\{||X_i-C_l||^2 \mid l = 1,...,K\}$

where $X$ is a set of multi-dimensional data and $C$ is a set of centroids for each data cluster.

This formula is the fitness function for an artificial bee colony clustering algorithm, used as a substitute for the k-means clustering algorithm. It is described as the total within-cluster variance, or the total mean-square quantization error (MSE).

Can anyone translate it to pseudo-code, normal human English, or at least enlighten me?

Asked By : helix

Answered By : edA-qa mort-ora-y

Just break it down into parts:

$ \{ f(l) \mid l = 1,...,K \} $

This is a simple set construction: it builds the set of the values $f(l)$ for every $l$ from $1$ to $K$. In your case $f(l)$ is the function:

$ ||X_i-C_l||^2 $

Since $||\cdot||$ denotes the norm, $X_i$ and $C_l$ are vectors (rows of the $X$ and $C$ matrices). Subtract the vectors, take the norm, and square it. Doing that for every $l$ from $1$ to $K$ produces the set, of which you want to take the minimum.
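
As a small worked example, assuming the Euclidean norm and made-up two-dimensional values $X_i = (1, 2)$, $C_1 = (0, 0)$, $C_2 = (2, 2)$ with $K = 2$:

$ ||X_i-C_1||^2 = 1^2 + 2^2 = 5, \quad ||X_i-C_2||^2 = (-1)^2 + 0^2 = 1 $

The set is then $\{5, 1\}$ and its minimum is $1$, i.e. the squared distance from $X_i$ to its nearest centroid.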

$ \sum\limits_{i=1}^n $

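This part is then just the sum of the above min calculation for every index $i$ from $1$ to $n$. In plain English: for each data point, take the squared distance to its nearest centroid, then add those values up.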
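
Since the question asked for pseudo-code, here is a minimal Python sketch of the whole formula. It assumes $||\cdot||$ is the Euclidean norm and that $X$ and $C$ are given as lists of equal-length coordinate tuples; the function name `perf` and the sample data are made up for illustration.

```python
def perf(X, C):
    """Sum, over all points, of the squared distance to the nearest centroid.

    Assumes the Euclidean norm; X and C are sequences of equal-length
    coordinate tuples (hypothetical layout chosen for this sketch).
    """
    total = 0.0
    for x in X:  # outer sum: i = 1..n
        # the set { ||X_i - C_l||^2 | l = 1..K }
        squared_dists = [
            sum((xj - cj) ** 2 for xj, cj in zip(x, c))
            for c in C
        ]
        total += min(squared_dists)  # keep only the nearest centroid
    return total


# Made-up sample data: three 2-D points and two centroids.
X = [(1.0, 2.0), (0.0, 0.0), (3.0, 2.0)]
C = [(0.0, 0.0), (2.0, 2.0)]
print(perf(X, C))  # prints 2.0
```

The per-point minima here are 1, 0 and 1, which sum to 2. A lower value means the points sit closer to their nearest centroids, which is why the formula can serve as a fitness function to minimize.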

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/2067
