Tuesday, April 5, 2011

What are the pitfalls in implementing binary search?

Binary search is harder to implement than it looks. "Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky…" — Donald Knuth.

Which bugs are most likely to be introduced into a new binary search implementation?

From stackoverflow
  • If you have Programming Pearls book nearby, you should check chapter 4.

    edit2: As pointed out in the comments, you can download the draft chapter I mentioned the Author's website: http://www.cs.bell-labs.com/cm/cs/pearls/sketch04.html

    warren : it's referenced in the wikipedia article, too - http://en.wikipedia.org/wiki/Binary_search#The_algorithm
    Bill the Lizard : The whole book isn't available on Bentley's website, but the information from that chapter seems complete enough. The site has "sketches" of some chapters, and some chapters are just missing. Buy the book, though. It's worth it.
    Tiago : @Bill: I do have the book, it's great.
  • Here are some I can think of:

    • Off-by-one errors, when determining the boundary of the next interval
    • Handling of duplicate items, if you are suppose to return the first equal item in the array but instead returned a subsequent equal item
    • Numerical underflows/overflows when computing indices, with huge arrays
    • Recursive vs non-recursive implementation, a design choice you should consider

    Are these what you have in mind?

  • Read this. Java's binary search implementation hid a bug for almost a decade before anybody found it.

    The bug is integer overflow. It didn't cause people problems because hardly anyone was searching big enough data structures.

    joeforker : That bug is awesome.
  • Failing to consider that when calculating the midpoint between two indices summing the high and low values may result in integer overflow.

    Reference

    joeforker : Humbling. The programmer was trying to implement binary search and cannot correctly do something as apparently simple as taking the average two numbers.

0 comments:

Post a Comment