Is there ANY compression type that can compress a file, and then that compressed file can be searched without uncompressing the file?
Asked By : Albert Renshaw
Answered By : KWillets
Compressed self-indexes such as the FM Index allow arbitrary substring searches in near entropy-compressed space. These are essentially compressed suffix arrays or suffix trees, which have a lot of literature.
Basic substring search can be o(k) or o(k log n) in time for length k, depending on what data structures are chosen (different types of rank/select data structures). There are a range of issues that arise depending on whether one wants simple boolean containment predicates, the offset of each occurrence, or more complicated suffix tree operations; the former can be done in less space and time than the latter.
There's also an entire book about searching and selective decompression of strings: "Compressed Data Structures for Strings: On Searching and Extracting Strings" by Rossano Venturini, published 2014 Springer Science & Business Media.
Best Answer from StackOverflow
Question Source : http://cs.stackexchange.com/questions/45218
0 comments:
Post a Comment
Let us know your responses and feedback