In this tutorial we’ll look at one of the fundamental algorithms of computer science, binary search. We’ll also look at a practical application of binary search: implementing fast autocompletion.

Introduction

Consider the problem of finding an item in an array. The obvious solution is to go through all the items in the array and check if any of them is equal to our item. This approach is called linear search and an implementation can be seen below.

func linearSearch (array: [ Int ], key: Int ) -> Bool { for item in array { if item == key { return true } } return false }

Linear search is often fast enough for most practical purposes. However linear search slows down significantly when the number of elements in an array is large or comparing elements is expensive.

Another much faster approach is binary search. Binary search only works on sorted arrays so we have to sort our array or maintain it in sorted order for binary search to work.

Binary search works by dividing the array into 2 halves around the middle element. The search only continues in one of the halves depending on the found element.

If the found element is less than our target we continue searching in the upper half of the array.

If the element is greater than our target we continue searching in the lower half of the array.

If the found element is equal to our target we are done.

We continue this process until we have either found the element or we’re splitting an array of length 1 .

To get some intuition on how binary search works, consider a phone book that contains an alphabetic list of people. Let’s say you want to find the phone number of Nash John , a good idea would be to open the book in the middle and search for the number from there. Now you only have to search half of the book because Nash John will either be after the name you find in the middle or before it. Binary search works on a similar idea but repeats the process of halving the search space until the element is found.

The below diagrams illustrate the steps binary search takes when finding the element 36 in the sorted array below.

let array = [ 1 , 2 , 3 , 5 , 5 , 7 , 10 , 36 , 50 ]

Binary search is much more effective than linear search because it halves the search space at each step. This is not significant for our array of length 9 , here linear search takes at most 9 steps and binary search takes at most 4 steps. But consider an array with 1000 elements, here linear search takes at most 1000 steps while binary search takes at most 10 steps (subsequently considering arrays of size 1000, 500, 250, 75, 37, 18, 9, 4, 2, 1).

Even if we had 1 billion elements binary search will still be able to find our key in at most 30 steps.

Mind == Blown Consider an array with a Googol(10^100) entries. That’s way more items than the number of particles in the observable universe. Binary search can find an item in that array in at most 333steps, while linear search would take longer than the life of the universe. [collapse]

To go a little further if our array has K elements, linear search will take K steps while binary search will take [log₂(K)] steps.

log₂(K) is the base 2 logarithm of K, that is the power we have to raise 2 too so that we get K . In other words how many times do we have to repeatedly multiply 1 by 2 so that we get K .

If log₂(K) = n then 2ⁿ = K .

Swift implementation

Until now we only talked about binary search from a theoretical perspective, let’s see how we can implement the algorithm in swift. The algorithm will be implemented as a function that takes an array of ints and a target and returns true / false depending on whether the target is in our array.

We’ll maintain two indices ( left and right ) that indicate what slice of the array we’re considering for search. Initially this correspond to the first and last index of the array.

Next we’ll repeat the process of getting the middle element from our slice and comparing it to our target . We want to repeat these steps in a while loop while the length of our slice is at least 1 . That is left <= right .

At each step of the loop we compute the middle index as the average of left and right and retrieve the value from the middle.

Now we can encounter 3 cases: 1. We find our target. We’re done! We return true . 2. The found value is less than our target. We set left to mid + 1 and continue our search in the upper slice of our array. 3. The found value is greater than our target. We set right to mid - 1 and continue our search in the lower slice of our array.

If we exit the loop then our target is definitely not in the array and we return false .

func binarySearch (array: [ Int ], target: Int ) -> Bool { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if (value == target) { return true } if (value < target) { left = mid + 1 } if (value > target) { right = mid - 1 } } return false }

Our implementation currently only works on integers but we can easily expand it to work on generic types that are Comparable .

func binarySearch <T : Comparable >(array: [T], target: T) -> Bool { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if (value == target) { return true } if (value < target) { left = mid + 1 } if (value > target) { right = mid - 1 } } return false }

Searching for prefixes

Our generic version of binary search can be used to search for a string in an array of strings. We can further expand binary search on strings such that instead of finding an exact match it looks if a string in our collection starts with our target strings. If a string starts with our target string we say that the target string is a prefix of that string.

To implement this we only have to replace our equality check with a call to the hasPrefix method.

func binarySearchPrefix (array: [ String ], target: String ) -> Bool { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if (value . hasPrefix (target)) { return true } if (value < target) { left = mid + 1 } if (value > target) { right = mid - 1 } } return false }

There can potentially be many items in our array that start with the same prefix. We can implement a function that instead of true/false returns the index of the first string in our array that starts with a given prefix. Similarly we can implement a function that returns the index of the last string in our array that starts with a given prefix. We’ll use these methods to implement autocomplete.

The trick is that we only return an index in one of 2 cases:

1. Our left and right indices are equal and the target is a prefix of the current value

2. The target is a prefix of the current value and the previous element in our array is not a prefix of the current value. This works for the binarySearchFirst method. We also have to be careful not to access an element outside our array.

3. For binarySearchLast we use a similar approach but we check if the next element is not a prefix of the current value.

If no string matches the given target the value -1 is returned.

binarySearchFirst

func binarySearchFirst (array: [ String ], target: String ) -> Int { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if ( left == right && value . hasPrefix (target)) { return left } if value . hasPrefix (target) { if mid > 0 { if ! array[mid - 1 ] . hasPrefix (target) { return mid } } right = mid - 1 } else if (value < target) { left = mid + 1 } else if (value > target) { right = mid - 1 } } return -1 }

binarySearchLast

func binarySearchLast (array: [ String ], target: String ) -> Int { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if ( left == right && value . hasPrefix (target)) { return left } if value . hasPrefix (target) { if mid < array . count - 1 { if ! array[mid + 1 ] . hasPrefix (target) { return mid } } left = mid + 1 } else if (value < target) { left = mid + 1 } else if (value > target) { right = mid - 1 } } return -1 }

Autocompletion

Consider the following problem, you’re building a browser and want to provide autocomplete when a user types in the url to a page. A user can potentially have thousands of URLs in their search history, doing a linear search through them for autocomplete just won’t cut it after a while.

Below you can see how autocomplete behaves with ~90K entries. Notice that linear search is extremely laggy at this point making for a bad user experience.

Linear Search

Binary Search

The UI code for this project is pretty straightforward: a textField where the user types the URL and a tableView below it that provides autocomplete suggestions. The data for the tableview is provided by the class SearchManager . This class contains the following methods:

func updateFilter(filter: String) – updates the string used for filtering the URLs

func filteredURLAtIndex(index: Int) -> String – returns the URL at an index for the list of filtered URLs

func filteredURLCount() -> Int – returns the number of URLs that match the current filter.

UI Code

class ViewController: UIViewController, UITableViewDataSource, UITableViewDelegate { @IBOutlet weak var tableView: UITableView ! @IBOutlet weak var textField: UITextField ! var searchManager = SearchManager() override func viewDidLoad () { super . viewDidLoad() } @IBAction func textFieldDidChange (sender: AnyObject ) { searchManager . updateFilter(textField . text) tableView . reloadData() } func tableView (tableView: UITableView, cellForRowAtIndexPath indexPath : NSIndexPath) -> UITableViewCell { let cell = UITableViewCell() cell . textLabel !. text = searchManager . filteredURLAtIndex(indexPath . row) return cell } func tableView (tableView: UITableView, numberOfRowsInSection section : Int ) -> Int { return searchManager . filteredURLCount() } func tableView (tableView: UITableView, didSelectRowAtIndexPath indexPath : NSIndexPath) { self . textField . text = searchManager . filteredURLAtIndex(indexPath . row) } }

Search manager

Search manager works by maintaining a list of all urls that it reads from a file at initialization. Two indices are maintained: filterStart – the index of the first URL that matches the given filter and filterEnd – the index of the last URL that matches the given filter.

Below you can see an implementation of SearchManager that uses linear search.

class SearchManager { private var urls: [ String ] private var filterStart: Int private var filterEnd: Int init () { let path = NSBundle . mainBundle() . pathForResource( " urls " , ofType: " txt " ) ! let content = String (contentsOfFile: path, encoding: NSUTF8StringEncoding, error: nil ) ! urls = content . componentsSeparatedByString( "

" ) filterStart = 0 filterEnd = urls . count - 1 } func filteredURLCount () -> Int { if filterStart == -1 { return 0 } return filterEnd - filterStart + 1 } func filteredURLAtIndex (index: Int ) -> String { return urls[filterStart + index] } func updateFilter (filter: String ) { if filter == " " { filterStart = 0 filterEnd = urls . count - 1 return } var foundFirst = false var index = 0 for url in urls { if url . hasPrefix (filter) { if ! foundFirst { filterStart = index foundFirst = true } filterEnd = index } index ++ } if ! foundFirst { filterStart = -1 } } }

A version of SearchManager that uses binary search is straightforward to write. Note that we don’t have to change our UI code in any way when changing the internals of SearchManager .

class SearchManager { private var urls: [ String ] private var filterStart: Int private var filterEnd: Int init () { let path = NSBundle . mainBundle() . pathForResource( " urls " , ofType: " txt " ) ! let content = String (contentsOfFile: path, encoding: NSUTF8StringEncoding, error: nil ) ! urls = content . componentsSeparatedByString( "

" ) filterStart = 0 filterEnd = urls . count - 1 } func filteredURLCount () -> Int { if filterStart == -1 { return 0 } return filterEnd - filterStart + 1 } func filteredURLAtIndex (index: Int ) -> String { return urls[filterStart + index] } func binarySearchLast (array: [ String ], target: String ) -> Int { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if ( left == right && value . hasPrefix (target)) { return left } if value . hasPrefix (target) { if mid < array . count - 1 { if ! array[mid + 1 ] . hasPrefix (target) { return mid } } left = mid + 1 } else if (value < target) { left = mid + 1 } else if (value > target) { right = mid - 1 } } return -1 } func binarySearchFirst (array: [ String ], target: String ) -> Int { var left = 0 var right = array . count - 1 while ( left <= right ) { let mid = ( left + right ) / 2 let value = array[mid] if ( left == right && value . hasPrefix (target)) { return left } if value . hasPrefix (target) { if mid > 0 { if ! array[mid - 1 ] . hasPrefix (target) { return mid } } right = mid - 1 } else if (value < target) { left = mid + 1 } else if (value > target) { right = mid - 1 } } return -1 } func updateFilter (filter: String ) { if filter == " " { filterStart = 0 filterEnd = urls . count - 1 return } filterStart = binarySearchFirst(urls, target: filter) filterEnd = binarySearchLast(urls, target: filter) } }

Download Project

Extra Credits

Binary search works great when we want to autocomplete strings based on prefixes. Sometimes you want to check if any of the strings in your collection contains the target string as a substring. For this purpose binary search doesn’t cut it anymore.

If you’re interested in more advanced string matching algorithms have a look at:

Tries

Suffix arrays

Suffix trees

and in case you want to do fuzzy autocomplete:

Levenshtein Distance

Levenshtein Automaton

BK Trees