World's most popular travel blog for travel bloggers.

Are there any compression algorithms based on PI?

, , No Comments
Problem Detail: 

What we know is that π is infinite and quite likely it contains every possible finite string of digits (disjunctive sequence).

I've seen recently some prototype of πfs which assume that every file you've created (or anybody else) or you will create, it's already there so it's a matter of extracting it. There is also piFile which can convert your files to pi metadata.

There is already BBP type formula (as part of experimental mathematics) which allows us to compute nth binary digit of pi. So storing position of our start and length of the data, we can theoretically extract the data of our interest. There are some arguments against it that our metadata (e.g. the offset to our data) could be larger than the extracted data. The matrix symbols and π can be encoded in base-256 to make it more efficient (see the joke).

Based on above, my main question is:

  • Are there any compression algorithms based on PI?

If not, does it make sense? Or were there any researches in that area?

Or maybe π is not the right one, so what about Euler's constant or Tau (τ)? Would it make any difference?


looking up dirty words in numbers is WAY more fun than looking them up in the dictionary!  ASS: pi position 590,725 (ascii encoding).  BUTT: position 177,031,174.  BOOB: position 32,355,500.  8==D is at position 158,907,339.  MAY I JUST SAY: HOW EROTIC

Image credits: Dinosaur Comics


See also:

Asked By : kenorb

Answered By : Yuval Filmus

Your suggestion doesn't make much sense, for many reasons. First of all, when trying to compress a large file, say a file of size $16$ bytes, you will have to find a place in the binary expansion of $\pi$ which agrees with your file. Since the file is $128$ bits long, one would expect this place to be around the $2^{128}$th bit. So it would be rather hard to find. This is not only because we have to go far into the expansion, but also since we expect to try $2^{128}$ different locations before finding a hit.

Second, while in some cases your scheme will result in a major compression, this will only happen when a certain string appears comparatively early in the expansion of $\pi$. There is no reason you would ever want to compress that sort of string. In contrast, other compression algorithms try to find structure in the data, and have guarantees that show that if such a structure exists, then they can always exploit it.

Changing $\pi$ with any other number wouldn't change the picture. The algorithm is too specific, compressing only strings we aren't really interested in; and very inefficient in the compression phase.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/42464

0 comments:

Post a Comment

Let us know your responses and feedback