- http://unicode.org/
- http://www.unicode.org/versions/Unicode9.0.0/ch03.pdf The Unicode Standard 9.0.0, Chapter 3 (includes the normalization specification)
- http://www.unicode.org/reports/tr15/ Unicode Normalization Forms
- http://www.unicode.org/reports/tr44/ Unicode Character Database

The text package already provides proper Unicode case mapping and case folding operations.
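
As a minimal sketch of what the text package offers here, the snippet below uses `Data.Text.toCaseFold` for caseless comparison and `Data.Text.toUpper` for case mapping. The helper names `caselessEq` and `shout` are hypothetical and exist only for illustration.

```haskell
import qualified Data.Text as T

-- | Caseless equality (hypothetical helper): compare the case-folded
-- forms of both strings. Case folding, rather than lowercasing, is the
-- operation intended for case-insensitive comparison.
caselessEq :: T.Text -> T.Text -> Bool
caselessEq a b = T.toCaseFold a == T.toCaseFold b

-- | Upper-case a string (hypothetical helper). The result may be longer
-- than the input, because some characters map to several characters.
shout :: T.Text -> T.Text
shout = T.toUpper
```

In GHCi, `caselessEq (T.pack "Straße") (T.pack "STRASSE")` should evaluate to `True`, because case folding maps ß to ss. Likewise, `shout (T.pack "straße")` has length 7 rather than 6, because ß upper-cases to SS.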

The Haskell package text-icu is a full-featured implementation of Unicode operations via bindings to the ICU C/C++ libraries.

text-icu provides the following additional features:
- Normalization checks
- FCD normalization for collation
- String collation
- Iteration
- Regular expressions
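
A few of the features listed above can be sketched as follows. This assumes the API exported by `Data.Text.ICU` in the text-icu 0.7 series (`normalize`, `isNormalized`, `collate`, `uca`); other releases may expose normalization under different names. The helper names `ensureNFC`, `canonicallyEq` and `sortUCA` are hypothetical.

```haskell
import Data.List (sortBy)
import qualified Data.Text as T
import Data.Text.ICU
  ( NormalizationMode (NFC, NFD)
  , collate
  , isNormalized
  , normalize
  , uca
  )

-- | Bring text into NFC only when it is not already normalized,
-- using the normalization check to skip unnecessary work.
ensureNFC :: T.Text -> T.Text
ensureNFC t
  | isNormalized NFC t = t
  | otherwise          = normalize NFC t

-- | Canonical equivalence: two strings are treated as equivalent
-- when their NFD forms are identical.
canonicallyEq :: T.Text -> T.Text -> Bool
canonicallyEq a b = normalize NFD a == normalize NFD b

-- | Sort with the default Unicode Collation Algorithm collator
-- instead of by code point.
sortUCA :: [T.Text] -> [T.Text]
sortUCA = sortBy (collate uca)
```

Building this requires the ICU libraries to be installed on the system, since text-icu only provides bindings to them.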

Unicode functionality in Haskell is fragmented across various packages. The most comprehensive functionality is provided by text-icu, which is based on the ICU C/C++ libraries. All related packages are listed here; they may or may not be up to date or useful.

- unicode-properties: Unicode 3.2.0 character properties
- hxt-charproperties: Character properties and classes for XML and Unicode
- unicode-names: Unicode 3.2.0 character names
- unicode: Construct and transform Unicode characters
- utf8-string: Support for reading and writing UTF8 Strings
- utf8-light: Lightweight UTF8 handling
- hxt-unicode: Unicode en-/decoding functions for utf8, iso-latin-* and other encodings
- text: An efficient packed Unicode text type
- text-normal: Data types for Unicode-normalized text (depends on text-icu)