Finding rectangles, part 2: borders

In the previous post, we looked at finding axis-aligned rectangles in a binary image. Today I am going to solve a variation of that problem:

Given a binary image, find the largest axis-aligned rectangle with a 1 pixel wide border that consists entirely of foreground pixels.

Here is an example:

[image: example binary image with the largest border rectangle outlined],

where white pixels are the background and blue is the foreground. The rectangle with the largest area is indicated in red.

Like the previous rectangle finding problem, this one also came up in my master's thesis. The application was to take a scan of a book and find the part that is a page, cutting away clutter:

[image: scan of a book page].

Specification

The types we are going to need are exactly the same as in my previous post:

    type Image = [[Bool]]

    data Rect = Rect { left, top, width, height :: Int }
        deriving (Eq, Ord, Show)

The difference compared to last time is the contains function, which tells whether an image contains a given rectangle. We are now looking only at the borders of rectangles, or 'border rectangles' for short.

    contains :: Image -> Rect -> Bool
    contains im rect = isBorder (cropRect im rect)

    cropRect :: Image -> Rect -> Image
    cropRect im (Rect x y w h) = map cols (rows im)
      where
        rows = take h . drop y . (++ repeat [])
        cols = take w . drop x . (++ repeat False)

    isBorder :: Image -> Bool
    isBorder im = and (head im)
               && and (last im)
               && and (map head im)
               && and (map last im)

Finding the largest border rectangle can again be done by enumerating all rectangles contained in the image, and picking the largest one:

    largestRect_spec :: Image -> Rect
    largestRect_spec = maximalRectBy area . allRects

    allRects :: Image -> [Rect]
    allRects im = filter (im `contains`) rects
      where -- rects: all candidate rectangles (definition elided)

Just as before, this specification has runtime O(n^6) for an n by n image, which is completely impractical.
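To try the specification on small inputs, it can be assembled into a self-contained program. The following is a minimal sketch, not the code from the previous post: `candidateRects`, `area`, and the use of `maximumBy` are assumptions standing in for definitions that are not shown here, and `largestRectSpec` returns a `Maybe` so that an image without any border rectangle is handled.

```haskell
import Data.List (maximumBy)
import Data.Ord (comparing)

type Image = [[Bool]]

data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

-- As in the post: a rectangle is contained if its border is all foreground.
contains :: Image -> Rect -> Bool
contains im rect = isBorder (cropRect im rect)

cropRect :: Image -> Rect -> Image
cropRect im (Rect x y w h) = map cols (rows im)
  where
    rows = take h . drop y . (++ repeat [])
    cols = take w . drop x . (++ repeat False)

isBorder :: Image -> Bool
isBorder im = and (head im) && and (last im)
           && and (map head im) && and (map last im)

-- Assumption: candidates are all rectangles of positive size inside the image.
candidateRects :: Image -> [Rect]
candidateRects im =
    [ Rect x y w h
    | y <- [0 .. ih - 1], x <- [0 .. iw - 1]
    , w <- [1 .. iw - x], h <- [1 .. ih - y] ]
  where
    ih = length im
    iw = if null im then 0 else length (head im)

-- Assumption: rectangles are ranked by width times height.
area :: Rect -> Int
area r = width r * height r

largestRectSpec :: Image -> Maybe Rect
largestRectSpec im =
    case filter (im `contains`) (candidateRects im) of
        [] -> Nothing
        rs -> Just (maximumBy (comparing area) rs)

-- A tiny test image: '#' is foreground, '.' is background.
example :: Image
example = map (map (== '#'))
    [ "#####"
    , "#...#"
    , "#####"
    , "..#.." ]
```

For this image, `largestRectSpec example` should pick the 5 by 3 rectangle in the top-left corner, `Rect 0 0 5 3`, even though its interior is background.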

An O(n^4) algorithm

Unfortunately, the nice properties of maximal rectangles will not help us out this time. For filled rectangles, whenever a rectangle is contained in an image, so are all of its subrectangles, so we could 'grow' filled rectangles one row or column at a time. This is no longer true for border rectangles.

We can, however, easily improve the above O(n^6) algorithm to an O(n^4) one by using the line endpoints. With those we can check if an image contains a rectangle in constant time. We just need to check all four of the sides:

    contains_fast im (Rect x y w h)
        =  r !! (x, y)         >= x + w
        && r !! (x, y + h - 1) >= x + w
        && b !! (x, y)         >= y + h
        && b !! (x + w - 1, y) >= y + h

Here r gives the right endpoint of the horizontal line through each pixel, and b gives the bottom endpoint of the vertical line through each pixel:

[images: r and b for the example image]
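The endpoint images themselves can be computed with one scan per row and per column. The sketch below is an illustration under assumptions rather than the code from the previous post: it takes the endpoints to be exclusive (one past the last pixel of the run, which is what the `>= x + w` comparisons suggest), gives a background pixel its own coordinate, and uses hypothetical names `runEnds`, `rightEnds`, `bottomEnds`, `at` and `containsFast`.

```haskell
import Data.List (transpose)

type Image = [[Bool]]

data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

-- For one row (or column): the index one past the end of the foreground
-- run through each pixel; a background pixel gets its own index.
runEnds :: [Bool] -> [Int]
runEnds line = init (scanr step (length line) (zip [0 ..] line))
  where
    step (i, a) next = if a then next else i

rightEnds, bottomEnds :: Image -> [[Int]]
rightEnds  = map runEnds
bottomEnds = transpose . map runEnds . transpose

-- Hypothetical 2D lookup. List indexing is not constant time, so an
-- array would be needed to actually get an O(1) check.
at :: [[Int]] -> (Int, Int) -> Int
at m (x, y) = m !! y !! x

-- Assumes the rectangle lies within the image bounds.
containsFast :: [[Int]] -> [[Int]] -> Rect -> Bool
containsFast r b (Rect x y w h) =
       r `at` (x, y)         >= x + w
    && r `at` (x, y + h - 1) >= x + w
    && b `at` (x, y)         >= y + h
    && b `at` (x + w - 1, y) >= y + h

-- A hollow 4 by 3 frame: '#' is foreground, '.' is background.
frame :: Image
frame = map (map (== '#'))
    [ "####"
    , "#..#"
    , "####" ]
```

With `r = rightEnds frame` and `b = bottomEnds frame`, the check accepts the full frame `Rect 0 0 4 3` and rejects `Rect 0 0 4 2`, whose bottom side runs through the hole.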

An O(n^3) algorithm

As the title of this section hints, a still more efficient algorithm is possible. The trick is to only look for rectangles with a specific height h. For any given height h, we will be able to find only maximal rectangles of that height.

For example, for h=6 we would expect to find these rectangles:

[image: the rectangles of height 6 in the example].

Notice how each of these rectangles consists of three parts: a left side, a middle and a right side:

[image: a rectangle split into its left side, middle and right side].

The left and right parts both consist of a vertical line at least h pixels high. We can find those vertical lines by looking at the top (or bottom) line endpoints. The top endpoint for pixel (x,y+h-1) should be at most y,

    let h = 6
    let av = zipWith2d (<=) (drop (h-1) t) y

[images: av = (t shifted up by h-1 rows) <= y, for the example]

Each True pixel in av corresponds to a point where there is an h pixel high vertical line, and so to a potential left or right side of a rectangle.
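The same test is easy to follow on a single column. The sketch below is a one-dimensional illustration with hypothetical names (`topEnds`, `hasVerticalLine`), assuming the same convention as t: a background pixel at row y gets the endpoint y + 1.

```haskell
-- Top endpoint of the vertical foreground run through each pixel of a
-- column; a background pixel at row y gets y + 1.
topEnds :: [Bool] -> [Int]
topEnds col = tail (scanl step 0 (zip col [0 ..]))
  where
    step t (a, y) = if a then t else y + 1

-- Element y is True when rows y .. y+h-1 of the column are all foreground,
-- that is, when the top endpoint at row y+h-1 is at most y.
hasVerticalLine :: Int -> [Bool] -> [Bool]
hasVerticalLine h col = zipWith (<=) (drop (h - 1) (topEnds col)) [0 ..]

-- A sample column, read top to bottom: '#' is foreground.
column :: [Bool]
column = map (== '#') "###.##"
```

For this column the top endpoints are `[0,0,0,4,4,4]`, and `hasVerticalLine 2 column` is `[True,True,False,False,True]`: lines of height 2 start at rows 0, 1 and 4. The result is shorter than the column because a line of height h cannot start in the last h-1 rows.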

The middle part of each rectangle has both pixel (x,y) and (x,y+h-1) set,

    let ah = zipWith2d (&&) a (drop (h-1) a)

[images: ah = a && (a shifted up by h-1 rows), for the example]

To find the rectangles of height h, we just need to find runs that start and end with a pixel in av, and where all pixels in between are in ah. First we find the left coordinates of the rectangles,

    let leStep (av, ah, x) le
          | av        = min le x
          | ah        = le
          | otherwise = maxBound
    let le = scanRightward leStep maxBound (zip2d3 av ah x)

[image: le for the example]
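To see what the scan does, it helps to run the same step function over a single row. This is a one-dimensional sketch with a hypothetical name (`leftEnds`), using a plain `scanl` in place of `scanRightward`.

```haskell
-- For each position in a row: the leftmost column from which a run of
-- 'ah' pixels, started at an 'av' pixel, reaches this position.
-- maxBound means that no such run is currently open.
leftEnds :: [Bool] -> [Bool] -> [Int]
leftEnds av ah = tail (scanl step maxBound (zip3 av ah [0 ..]))
  where
    step le (v, m, x)
        | v         = min le x   -- a vertical line: may start a new run
        | m         = le         -- top and bottom both set: run continues
        | otherwise = maxBound   -- run is broken

-- Sample row: vertical lines at columns 0, 3 and 5, a gap at column 4.
rowAv, rowAh :: [Bool]
rowAv = [True, False, False, True, False, True]
rowAh = [True, True,  True,  True, False, True]
```

Here `leftEnds rowAv rowAh` is `[0,0,0,0,maxBound,5]`. The vertical line at column 3 can close a rectangle that starts at column 0, while the one at column 5 sits behind a gap and only matches itself.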

Finally we need to look for right sides. These are again given by av. For each right side, le gives the leftmost left side, and h gives the height of the rectangles:

    let mkRect x y av le
          | av        = [Rect le y (x-le+1) h]
          | otherwise = []
    let rects = zipWith2d4 mkRect x y av le

[image: the rectangles found for h = 6]

Compare the resulting image to the one at the start of this section. We found the same rectangles.

Just like last time, all we need to do now is put the steps together in a function:

    rectsWithHeight :: Int -> Image -> [Rect]
    rectsWithHeight h a = concat . concat $ rects
      where
        x  = scanRightward (\_ x -> x + 1) (-1) a
        y  = scanDownward  (\_ y -> y + 1) (-1) a
        t  = scanDownward  (\(a, y) t -> if a then t else y + 1) 0 (zip2d a y)
        ah = zipWith2d (&&) (drop (h-1) a) a
        av = zipWith2d (<=) (drop (h-1) t) y
        leStep (av, ah, x) le
          | av        = min le x
          | ah        = le
          | otherwise = maxBound
        le = scanRightward leStep maxBound (zip2d3 av ah x)
        mkRect x y av le
          | av        = [Rect le y (x-le+1) h]
          | otherwise = []
        rects = zipWith2d4 mkRect x y av le
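The helper functions used here (`zipWith2d`, `zip2d`, `zip2d3`, `zipWith2d4`, `scanRightward`, `scanDownward`) are not defined in this post. The sketch below supplies guessed definitions on plain lists, chosen only to be consistent with how the helpers are called above, so that `rectsWithHeight` can be loaded and tried. They are assumptions, not necessarily the definitions from the previous post.

```haskell
import Data.List (zipWith4)

type Image = [[Bool]]

data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

-- Guessed helper definitions (assumptions, see the text above).
zipWith2d :: (a -> b -> c) -> [[a]] -> [[b]] -> [[c]]
zipWith2d = zipWith . zipWith

zip2d :: [[a]] -> [[b]] -> [[(a, b)]]
zip2d = zipWith2d (,)

zip2d3 :: [[a]] -> [[b]] -> [[c]] -> [[(a, b, c)]]
zip2d3 = zipWith3 (zipWith3 (,,))

zipWith2d4 :: (a -> b -> c -> d -> e)
           -> [[a]] -> [[b]] -> [[c]] -> [[d]] -> [[e]]
zipWith2d4 = zipWith4 . zipWith4

-- Scan each row left to right, feeding the previous result into f.
scanRightward :: (a -> b -> b) -> b -> [[a]] -> [[b]]
scanRightward f z = map (tail . scanl (flip f) z)

-- Scan each column top to bottom, one whole row at a time.
scanDownward :: (a -> b -> b) -> b -> [[a]] -> [[b]]
scanDownward f z = tail . scanl (\above row -> zipWith f row above) (repeat z)

-- The function from the post, unchanged.
rectsWithHeight :: Int -> Image -> [Rect]
rectsWithHeight h a = concat . concat $ rects
  where
    x  = scanRightward (\_ x -> x + 1) (-1) a
    y  = scanDownward  (\_ y -> y + 1) (-1) a
    t  = scanDownward  (\(a, y) t -> if a then t else y + 1) 0 (zip2d a y)
    ah = zipWith2d (&&) (drop (h-1) a) a
    av = zipWith2d (<=) (drop (h-1) t) y
    leStep (av, ah, x) le
      | av        = min le x
      | ah        = le
      | otherwise = maxBound
    le = scanRightward leStep maxBound (zip2d3 av ah x)
    mkRect x y av le
      | av        = [Rect le y (x-le+1) h]
      | otherwise = []
    rects = zipWith2d4 mkRect x y av le

-- A tiny test image: '#' is foreground, '.' is background.
example :: Image
example = map (map (== '#'))
    [ "#####"
    , "#...#"
    , "#####"
    , "..#.." ]
```

With these definitions, `rectsWithHeight 3 example` gives `[Rect 0 0 1 3, Rect 0 0 5 3]`: the left side on its own, and the full 5 by 3 frame. Both scans rely on `zipWith` truncating to the shorter list, which is what makes the `drop (h-1)` shifts line up.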

Of course, finding (a superset of) all maximal rectangles in an image is just a matter of calling rectsWithHeight for all possible heights.

    findRects_fast :: Image -> [Rect]
    findRects_fast im = concat [ rectsWithHeight h im | h <- [1 .. imHeight im] ]

    largestRect_fast :: Image -> Rect
    largestRect_fast = maximalRectBy area . findRects_fast

Let's quickly check that this function does the same as the specification,

    prop_fast_spec = forAll genImage $ \a -> largestRect_spec a == largestRect_fast a

    λ> quickCheck prop_fast_spec
    +++ OK, passed 100 tests.

Great.

Conclusions

The runtime of rectsWithHeight is linear in the number of pixels, and it is called n times for an n by n image. Therefore the total runtime of largestRect_fast is O(n^3). While this is much better than what we started with, it can still be quite slow. For example, the book page that motivated this problem is around 2000 by 2000 pixels. Finding the largest rectangle takes on the order of 2000^3 = 8*10^9, or 8 giga-operations, which is still a pretty large number.

To make this algorithm faster in practice, I used a couple of tricks. Most importantly, if we know what function we are maximizing, say area, then we can stop as soon as we know that we can't possibly find a better rectangle. The idea is to start with h = imHeight im, and work downwards. Keep track of the area a of the largest rectangle. Then as soon as h * imWidth im < a, we can stop, because any rectangle we can find from then on will be smaller.
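The early-exit loop could look roughly like the sketch below. It is an illustration, not the code used for the thesis. The per-height search is passed in as a parameter (`finder`, which would be `rectsWithHeight`), and `largestRectEarlyStop`, `imWidth`, `imHeight` and `area` are names and definitions assumed here.

```haskell
import Data.List (foldl')

type Image = [[Bool]]

data Rect = Rect { left, top, width, height :: Int }
    deriving (Eq, Ord, Show)

-- Assumed definitions of the image size and the function being maximized.
imWidth, imHeight :: Image -> Int
imHeight = length
imWidth im = if null im then 0 else length (head im)

area :: Rect -> Int
area r = width r * height r

-- Try heights from large to small, and stop once even a full-width
-- rectangle of the current height could not beat the best one so far.
largestRectEarlyStop :: (Int -> Image -> [Rect]) -> Image -> Maybe Rect
largestRectEarlyStop finder im = go (imHeight im) Nothing
  where
    w = imWidth im
    go h best
        | h < 1                = best
        | cannotImprove h best = best
        | otherwise            = go (h - 1) (foldl' keepLarger best (finder h im))
    -- True when h * imWidth im is smaller than the best area found so far.
    cannotImprove h = maybe False (\r -> h * w < area r)
    keepLarger Nothing  r = Just r
    keepLarger (Just b) r = Just (if area r > area b then r else b)
```

A call would be `largestRectEarlyStop rectsWithHeight im`. How much this saves depends on the image: when a large rectangle is found at one of the first heights, most of the remaining heights are never searched.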

Is this the best we can do? No. I know an algorithm for finding all maximal border rectangles in O(n^2*(log n)^2) time. But it is rather complicated, and this post is long enough already. So I will save it for another time. If anyone thinks they can come up with such an algorithm themselves, I would love to read about it in the comments.