What we know is that π is infinite and quite likely it contains every possible finite string of digits (disjunctive sequence).
I've seen recently some prototype of πfs which assume that every file you've created (or anybody else) or you will create, it's already there so it's a matter of extracting it. There is also piFile which can convert your files to pi metadata.
There is already BBP type formula (as part of experimental mathematics) which allows us to compute nth binary digit of pi. So storing position of our start and length of the data, we can theoretically extract the data of our interest. There are some arguments against it that our metadata (e.g. the offset to our data) could be larger than the extracted data. The matrix symbols and π can be encoded in base-256 to make it more efficient (see the joke).
Based on above, my main question is:
- Are there any compression algorithms based on PI?
If not, does it make sense? Or were there any researches in that area?
Or maybe π is not the right one, so what about Euler's constant or Tau (τ)? Would it make any difference?
Image credits: Dinosaur Comics
See also:
Asked By : kenorb
Answered By : Yuval Filmus
Your suggestion doesn't make much sense, for many reasons. First of all, when trying to compress a large file, say a file of size $16$ bytes, you will have to find a place in the binary expansion of $\pi$ which agrees with your file. Since the file is $128$ bits long, one would expect this place to be around the $2^{128}$th bit. So it would be rather hard to find. This is not only because we have to go far into the expansion, but also since we expect to try $2^{128}$ different locations before finding a hit.
Second, while in some cases your scheme will result in a major compression, this will only happen when a certain string appears comparatively early in the expansion of $\pi$. There is no reason you would ever want to compress that sort of string. In contrast, other compression algorithms try to find structure in the data, and have guarantees that show that if such a structure exists, then they can always exploit it.
Changing $\pi$ with any other number wouldn't change the picture. The algorithm is too specific, compressing only strings we aren't really interested in; and very inefficient in the compression phase.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/42464
0 comments:
Post a Comment
Let us know your responses and feedback