World's most popular travel blog for travel bloggers.

[Solved]: Compression type that can be searched

, , No Comments
Problem Detail: 

Is there ANY compression type that can compress a file, and then that compressed file can be searched without uncompressing the file?

Asked By : Albert Renshaw

Answered By : KWillets

Compressed self-indexes such as the FM Index allow arbitrary substring searches in near entropy-compressed space. These are essentially compressed suffix arrays or suffix trees, which have a lot of literature.

Basic substring search can be o(k) or o(k log n) in time for length k, depending on what data structures are chosen (different types of rank/select data structures). There are a range of issues that arise depending on whether one wants simple boolean containment predicates, the offset of each occurrence, or more complicated suffix tree operations; the former can be done in less space and time than the latter.

There's also an entire book about searching and selective decompression of strings: "Compressed Data Structures for Strings: On Searching and Extracting Strings" by Rossano Venturini, published 2014 Springer Science & Business Media.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/45218

0 comments:

Post a Comment

Let us know your responses and feedback