World's most popular travel blog for travel bloggers.

# [Solved]: Average case algorithm analysis using Kolmogorov Incompressibility Method

, ,
Problem Detail:

The Incompressibility Method is said to simplify the analysis of algorithms for the average case. From what I understand, this is because there is no need to compute all of the possible combinations of input for that algorithm and then derive an average complexity. Instead, a single incompressible string is taken as the input. As an incompressible string is typical, we can assume that this input can act as an accurate approximation of the average case.

I am lost in regard to actually applying the Incompressibility Method to an algorithm. As an aside, I am not a mathematician, but think that this theory has practical applications in everyday programming.

Ultimately, I would like to learn how I can deduce the average case of any given algorithm, be it trivial or complex. Could somebody please demonstrate to me how the method can be applied to a simple algorithm? For instance, given an input string S, store all of the unique characters in S, then print each one individually:

void uniqueChars(String s) {     char[] chars = chars[ s.length() ]     int free_idx = 0;      for (int i = 0; i < s.length(); i++) {        if (! s[i] in chars) {           chars[free_idx] = s[i]           free_idx++;        }     }      for (int i = 0; i < chars.length(); i++) {         print (chars[i])     } } 

Assume a linear search for checking whether the array contains an element.

The above code snippet is only for the sake of argument. Better algorithms by which the theory can be demonstrated are acceptable, of course.

I asked this question on StackOverflow (http://stackoverflow.com/q/24619383/3813812) back in July, 2014, and have received a few helpful comments but not a definite answer. As one of the commenters pointed out, this question is better suited for Computer Science StackExchange, so I ask here today.

Some literature that I have reviewed includes:

1. An Introduction to Kolmogorov Complexity and Its Applications, by Ming Li and Paul M.B. Vitányi

2. https://www.cs.duke.edu/~reif/courses/complectures/Li/KC-Lecture1.pdf

3. http://www.detectingdesign.com/PDF%20Files/Kolmogorov%20Complexity%202.pdf

Among a few other resources that I do not have links to on hand.

If my understanding of the applicability of Kolmogorov complexity is inaccurate or what I ask is impractical, I would appreciate a statement with regard to the fact.

#### Answered By : Yuval Filmus

The idea of the incompressibility method is that an incompressible input satisfies certain properties that can be helpful in the analysis. In your case, the complexity of the algorithm depends on how many characters appear in the string. When processing the $k$th character, the "running time" (or rather its proxy, the number of comparisons when checking the list) is $\approx \alpha_k/2$ comparisons on average, where $\alpha_k$ is the number of different characters that have been seen so far. In order to estimate $\alpha_k$, note that we can encode the first $k$ characters using roughly $8\alpha_k + k\log \alpha_k$ bits, and we deduce that $8\alpha_k + k\log_2 \alpha_k \geq 8k$. Hence unless $k$ is very small, $\alpha_k$ has to be very close to $256$, and we can deduce that there are on average $128$ comparisons per character. We can use the inequality to determine how large $k$ needs to be, and also what happens when $k$ is small.

The reason we are trying to count the exact complexity in the case of your algorithm is that its worst-case and best-case running times are both $\Theta(n)$. We used incompressibility to estimate $\alpha_k$, which can also be estimated directly using probabilistic methods, but the incompressibility calculation is probably simpler. Still, incompressibility doesn't obviate the need to analyze the algorithm probabilistically – it just makes the analysis more tractable in some cases.