Data StructuresMedium

Binary Search

Binary Search

Binary Search

Binary Search is an exceptionally efficient algorithm for finding a target element in a sorted array. By leveraging the sorted nature of the data, it cuts the search space in half with each step, achieving logarithmic time complexity.

Requirements

  • Sorted Data: The array must be sorted in ascending or descending order before searching.
  • Random Access: The data structure must support constant-time O(1) random access by index (which is a signature strength of arrays).

How Binary Search Works

The algorithm uses a divide-and-conquer approach with two pointers (left and right) tracking the active search boundaries:

  1. Initialize: Set left = 0 and right = N - 1.
  2. Calculate Middle: Compute mid = left + (right - left) / 2 (this formula prevents integer overflow).
  3. Compare:
    • If arr[mid] == target, the element is found! Return mid.
    • If arr[mid] < target, discard the left half by setting left = mid + 1.
    • If arr[mid] > target, discard the right half by setting right = mid - 1.
  4. Repeat: Repeat steps 2 and 3 as long as left <= right. If the pointers cross and the target is not found, return -1.

Complexity Analysis

  • Time Complexity:
    • Best Case: O(1) (when the target is exactly at the middle element on the very first comparison)
    • Average Case: O(log N) (since the search space decreases exponentially by half each time)
    • Worst Case: O(log N) (when the target is at one of the outer boundaries or is not present in the array)
  • Space Complexity: O(1) (constant space since the iterative approach only uses a few pointers/variables)

Why use left + (right - left) / 2 instead of (left + right) / 2?

In languages like C, C++, and Java, using (left + right) / 2 can cause an integer overflow if the sum of left and right exceeds the maximum value that an integer data type can hold (2^31 - 1 for a standard signed 32-bit integer). The expression left + (right - left) / 2 is mathematically identical but completely safe from overflow because it calculates the offset difference first, keeping all intermediate values strictly within boundaries.