World's most popular travel blog for travel bloggers.

[Solved]: Is there any study or theory behind combining binary search and interpolation search?

, , No Comments
Problem Detail: 

I just read Can this algorithm still be considered a Binary Search algorithm? and recalled that a few years back I wrote an indexer/search for log files to find log entries in large plain text files by date/time window.

Whilst doing this, I decided to try interpolation search (I didn't know that was what it was called, I kind of stumbled across the idea by myself). Then for some reason I continued to the idea of alternating interpolation steps with binary split steps: On step 0 I would interpolate to decide test point, then step 1 I would take the exact midpoint etc.

I then benchmarked the system using pure interpolation search, pure binary search and my combination attempt. The alternating approach was a clear winner, both in time and number of tests required before finding a set of randomly chosen times.

Inspired by the linked question, I just made a quick search for "alternating interpolation search and binary search" and found nothing. I also tried "hedged interpolation search" as suggested on my comment on one of the answers.

Have I stumbled across a known thing? Is there any theoretical justification for it being faster for certain types of data? The log files were typically large for the time (e.g. 1-2 GB of text with maybe 10 million rows to search), and the spread of dates/times in them was complex with heavy bursts of activity, general peak times and quiet times. My benchmark tests sampled from an even distribution of target times to find.

Asked By : Neil Slater

Answered By : manlio

Have I stumbled across a known thing?

There are various methods, based on a mix of interpolation-search and binary search, with a $O(log\ log\ n)$ average case access time (uniform distribution) and $O(log\ n)$ worst case time (values unevenly distributed):

  • Introspective search is your method (iterating between an interpolation search and a binary search). I haven't further details.
  • Interpolation-binary search (IBS) by N. Santoro, J. B. Sidney (1985).

    The general idea is that interpolation search is useful only when the array searched is larger than a given threshold. When the considered search segment is smaller than a user-defined threshold, binary search is applied unconditionally. Vice-versa, over that threshold, an interpolation search step is applied, followed eventually by a binary search step.

    This has many common points with your approach.

  • Adaptive search (AS) by Biagio Bonasera, Emilio Ferrara, Giacomo Fiumara, Francesco Pagano, Alessandro Provetti

    Using the authors' words:

    [Interpolation-binary search] devised a similar solution that combines (but does not blend) together interpolation and binary search. Although the asymptotic complexity is the same, there are some marked differences.

    [CUT]

    Hence, it is possible to show that for any input AS will not take more elementary operations than IBS.

    The algorithm may spend up to double number of operations than "simple" interpolation search in carefully finding out the best halving of the search segment, which in turn will mean that less iterations shall be needed to complete (but you have an even greater overhead).

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/59750

0 comments:

Post a Comment

Let us know your responses and feedback