Permutations By Sorting

Posted on March 24, 2019

A naive—and wrong—way to shuffle a list is to assign each element in the list a random number, and then sort it. It might not be immediately obvious why: Kiselyov (2002) has a good explanation as to the problem. One way to think about it is like this: choosing n n random numbers each in the range [ 0 , n ) [0,n) has n n n^n possible outcomes, whereas there are n ! n! permutations. Since these don’t necessarily divide evenly into each other, you’re going to have some bias.

Factorial Numbers

The first part of the fix is to figure out a way to get some random data that has only n ! n! possible values. The trick here will be to mimic the structure of a factorial itself: taking n = 5 n = 5 , the previous technique would have yielded:

5 × 5 × 5 × 5 × 5 = 5 5 5 \times 5 \times 5 \times 5 \times 5 = 5^5

possible values. But we want:

5 × 4 × 3 × 2 × 1 = 5 ! 5 \times 4 \times 3 \times 2 \times 1 = 5!

The solution is simple, then! Simply decrement the range by one for each position in the output list. In Haskell:

nums :: Int -> IO [ Int ] 0 = pure [] nums[] = ( : ) <$> randomR ( 0 ,n) <*> nums (n - 1 ) nums nrandomR (,n)nums (n

As an aside, what we’ve done here is constructed a list of digits in the factorial number system.

Sorts

Unfortunately, while we’ve figured out a way to get properly distributed random data, we can’t yet sort it to shuffle our list. If we look at the 6 factorial numbers generated for n = 5 n = 5 , we can see the problem:

000 010 100 110 200 210

Different values in the list will produce the same sort: 100 and 200 , for instance.

Lehmer Codes

We need a way to map the numbers above to a particular permutations: that’s precisely the problem solved by Lehmer codes. For the numbers 110 , we can think of each digit as the relative position to put that item from the string into. Some Haskell code might make it clear:

insert :: Int -> a -> [a] -> [a] [a][a] 0 x xs = x : xs insertx xsxs : ys) = y : insert (i - 1 ) x ys insert i x (yys)insert (i) x ys shuffle :: [a] -> [ Int ] -> [a] [a][a] = foldr ( uncurry insert) [] ( zip ys xs) shuffle xs ysinsert) [] (ys xs)

And we can step through its execution:

"abc" [ 1 , 1 , 0 ] shuffle foldr ( uncurry insert) [] [( 1 , 'a' ),( 1 , 'b' ),( 0 , 'c' )] insert) [] [(),(),()] 1 'a' (insert 1 'b' (insert 0 'c' [])) insert(insert(insert[])) 1 'a' (insert 1 'b' "c" ) insert(insert 1 'a' "cb" insert 'c' : insert 0 'a' "b" insert "cab"

Dualities of Sorts

Notice the similarity of the function above to a standard insertion sort:

insert :: Ord a => a -> [a] -> [a] [a][a] = x : [] insert x [][] : ys) insert x (yys) | x <= y = x : y : ys ys | otherwise = y : insert x ys insert x ys insertSort :: Ord a => [a] -> [a] [a][a] = foldr insert [] insertSortinsert []

The “comparison” is a little strange—we have to take into account relative position—but the shape is almost identical. Once I spot something like that, my first thought is to see if the relationship extends to a better 𝒪 ( n log n ) \mathcal{O}(n \log n) sort, but there’s something else I’d like to look at first.

“A Duality of Sorts” (Hinze, Magalhães, and Wu 2013) is a paper based on the interesting symmetry between insertion sort and selection sort (There’s also a video of Graham Hutton explaining the idea; Haran 2016).

With that paper in mind, can we rewrite shuffle as a selection-based algorithm? We can indeed!

pop :: [( Int ,a)] -> Maybe (a, [( Int ,a)]) [(,a)](a, [(,a)]) = Nothing pop [] 0 ,x) : xs) = Just (x, xs) pop ((,x)xs)(x, xs) : xs) = ( fmap . fmap ) ((i - 1 ,x) : ) (pop xs) pop ((i,x)xs)) ((i,x)) (pop xs) shuffle :: [a] -> [ Int ] -> [a] [a][a] = unfoldr pop ( zip ys xs) shuffle xs ysunfoldr pop (ys xs)

While the symmetry is pleasing, the paper details how to make the relationship explicit, using the same function for both selection and insertion sort:

Nil = Nil swop Cons a (x , Nil )) = Cons a ( Left x) swop (a (x ,))a (x) Cons a (x , Cons b x')) swop (a (x ,b x')) | fst a == 0 = Cons a ( Left x) a (x) | otherwise = Cons b ( Right ( Cons (first pred a) x')) b ((firsta) x')) ishuffle :: [( Int ,a)] -> [( Int ,a)] [(,a)][(,a)] = cata (apo (swop . fmap ( id &&& project))) ishufflecata (apo (swopproject))) sshuffle :: [( Int ,a)] -> [( Int ,a)] [(,a)][(,a)] = ana (para ( fmap ( id ||| embed) . swop)) sshuffleana (para (embed)swop))

Improved Efficiency

So now we have to upgrade our sorts: in the paper, merge sort is the more efficient sort chosen, similarly to what I chose previously.

= ys merge [] ysys = xs merge xs []xs : xs) ((y,j) : ys) merge ((x,i)xs) ((y,j)ys) | i <= j = (x,i) : merge xs ((y,j - i) : ys) (x,i)merge xs ((y,ji)ys) | otherwise = (y,j) : merge ((x,i - j - 1 ) : xs) ys (y,j)merge ((x,ixs) ys treeFold :: (a -> a -> a) -> a -> [a] -> a (aa)[a] = go treeFold fgo where = x go x [] : l) = go (f a b) (pairMap l) go a (bl)go (f a b) (pairMap l) : y : rest) = f x y : pairMap rest pairMap (xrest)f x ypairMap rest = xs pairMap xsxs = map fst $ treeFold merge [] $ map pure $ zip xs inds shuffle xs indstreeFold merge []xs inds

However, I feel like merge sort is an upgrade of insertion sort, not selection sort. Indeed, if you do the “split” step of merge sort badly, i.e. by splitting very unevenly, merge sort in fact becomes insertion sort!

So there’s a missing bit of this table:

Insertion Selection 𝒪 ( n 2 ) \mathcal{O}(n^2) Insertion sort Selection sort 𝒪 ( n log n ) \mathcal{O}(n \log n) Merge sort ???

I think it’s clear that quicksort is the algorithm that fits in there: again, done badly it degrades to selection sort (if you intentionally pick the pivot to be the worst element possible, i.e. the smallest element).

There are more symmetries: merge sort splits the lists using their structure, and merges them using the ordering of the elements. Quicksort is the opposite, merging by concatenation, but splitting using order. Finally, in merge sort adjacent elements are in the correct order after the recursive call, but the two sides of the split are not. Again, quicksort is precisely the opposite: adjacent elements have not been compared (before the recursive call), but the two sides of the split are correctly ordered.

Anyway, I haven’t yet formalised this duality (and I don’t know if I can), but we can use it to produce a quicksort-based shuffle algorithm:

= foldr f ( const ([],[])) partitionf (([],[])) where f (y,j) ys i | i <= j = fmap ((y,j - i) : ) (ys i) ((y,ji)) (ys i) | otherwise = first ((y,j) : ) (ys (i - 1 )) first ((y,j)) (ys (i)) shuffle :: [a] -> [ Int ] -> [a] [a][a] = go ( zip xs ys) shuffle xs ysgo (xs ys) where = [] go [][] : xs) = case partition xs i of go ((x,i)xs)partition xs i -> go ls ++ [x] ++ go rs (ls,rs)go ls[x]go rs

That’s all for this post! The algorithms can all be translated into Agda or Idris: I’m currently working on a way to represent permutations that isn’t 𝒪 ( n 2 ) \mathcal{O}(n^2) using them. If I figure out a way to properly dualise quicksort and merge sort I’ll do a small write up as well (I’m currently working my way through Hinze et al. 2012 for ideas). Finally, I’d like to explore some other sorting algorithms as permutation algorithms: sorting networks seem especially related to “permutations by swapping”.

References