World's most popular travel blog for travel bloggers.

[Solved]: compressing a set of binary strings with fixed length

, , No Comments
Problem Detail: 

I'm looking for a data structure / algorithm to store an unordered set S of binary strings of a fixed length n (i.e. all matching the following regular expression: [01]{n}).

Insertion and lookup ("Is element x in the S?") should maximally take polynomial time to |S|.

The space complexity should be logarithmic to |S| for large sets. In other words, the space should not be exponential to n if for example 2^n / 2 random and unique strings are inserted, but polynomial to n.

Is such a thing known to exist?

Asked By : Tobias Hermann

Answered By : Yuval Filmus

Suppose $S$ consists of $m$ strings in $\{0,1\}^n$ and we don't allow query arrows. Any two different sets $S_1,S_2 \subseteq \{0,1\}^n$ of size $m$ respond differently to some lookup query: there must be some $x \in S_1\setminus S_2$, for example, and the lookup of $x$ should succeed in $S_1$ and fail in $S_2$. For this reason, the contents of your data structure should be different for $S_1,S_2$. Since there are $\binom{2^n}{m}$ different choices for $S$, any data structure supporting all of them should have at least $\binom{2^n}{m}$ different settings. In particular, if you use $M$ bits to store it then $2^M \geq \binom{2^n}{m}$. When $m = 2^n/2$, for example, this forces $M \geq 2^n - O(n) = 2m - O(\log m)$.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/32466

0 comments:

Post a Comment

Let us know your responses and feedback