Recommended way: text-icu

The recommended way for robustly processing strings in a locale-sensitive manner is via text and text-icu, as you have seen. The text library is provided in the standard library set, the Haskell Platform.

An example, sorting Turkish strings:

{-# LANGUAGE OverloadedStrings #-} import Data.Text.IO as T import Data.Text.ICU as T import Data.List (sortBy) main = do let trLocale = T.Locale "tr-TR" str = "ÇIİĞÖŞÜ" strs = take 10 (cycle $ T.toLower trLocale str : str : []) mapM_ T.putStrLn (sortBy (T.compare [T.FoldCaseExcludeSpecialI]) strs)

appears to correctly sort by lexicographic ordering based on locale, after correctly lower-casing the Turkish string:

*Main> main ÇIİĞÖŞÜ ÇIİĞÖŞÜ ÇIİĞÖŞÜ ÇIİĞÖŞÜ ÇIİĞÖŞÜ çıiğöşü çıiğöşü çıiğöşü çıiğöşü çıiğöşü

Not using the text-icu package

You've asked in your question to avoid solutions that use additional libraries, other than what Posix provides. While text-icu is easily installable from Hackage ( cabal install text-icu ), it does depend on the ICU C library, which isn't available everywhere. Additionally, there is no Posix alternative that is as robust or comprehensive. Finally, text-icu is the only package that correctly does conversions on multi-char characters.

Given this, though, the built in Char and String types in Haskell provide Data.Char, whose values represent Unicode, and with functions that will do Unicode case conversion, in a locale-insensitive way, using the wchar_t functions defined by the Open Group. Additionally, we can do IO on Handles in a (text) locale-sensitive way.

import System.IO import Data.Char import Data.List (sort) main = do t <- mkTextEncoding "UTF-8" hSetEncoding stdout t let str = "ÇIİĞÖŞÜ" strs = take 10 (cycle $ map toLower str : str : []) mapM_ putStrLn (sort strs)

In fact, GHC will use your text locale by default for IO (e.g. UTF8). For many problems, this will probably give the right answer. You just have to be aware it will also be wrong in many cases, since it isn't possible to be correct without bulk processing of text, and rich conversion and comparison support.