Diffstat (limited to 'doc/libunistring.info')
-rw-r--r-- doc/libunistring.info 6200
1 files changed, 6200 insertions, 0 deletions
diff --git a/doc/libunistring.info b/doc/libunistring.info
new file mode 100644
index 00000000..2fad8fec
--- /dev/null
+++ b/doc/libunistring.info
@@ -0,0 +1,6200 @@
+This is libunistring.info, produced by makeinfo version 4.13 from
+libunistring.texi.
+
+INFO-DIR-SECTION Software development
+START-INFO-DIR-ENTRY
+* GNU libunistring: (libunistring). Unicode string library.
+END-INFO-DIR-ENTRY
+
+ This manual is for GNU libunistring.
+
+
+File: libunistring.info, Node: Top, Next: Introduction, Up: (dir)
+
+GNU libunistring
+****************
+
+* Menu:
+
+* Introduction:: Who may need Unicode strings?
+* Conventions:: Conventions used in this manual
+* unitypes.h:: Elementary types
+* unistr.h:: Elementary Unicode string functions
+* uniconv.h:: Conversions between Unicode and encodings
+* unistdio.h:: Output with Unicode strings
+* uniname.h:: Names of Unicode characters
+* unictype.h:: Unicode character classification and properties
+* uniwidth.h:: Display width
+* uniwbrk.h:: Word breaks in strings
+* unilbrk.h:: Line breaking
+* uninorm.h:: Normalization forms
+* unicase.h:: Case mappings
+* uniregex.h:: Regular expressions
+* Using the library:: How to link with the library and use it?
+* More functionality:: More advanced functionality
+* Licenses:: Licenses
+
+* Index:: General Index
+
+ --- The Detailed Node Listing ---
+
+Introduction
+
+* Unicode:: What is Unicode?
+* Unicode and i18n:: Unicode and internationalization
+* Locale encodings:: What is a locale encoding?
+* In-memory representation:: How to represent strings in memory?
+* char * strings:: What to keep in mind with `char *' strings
+* The wchar_t mess:: Why `wchar_t *' strings are useless
+* Unicode strings:: How are Unicode strings represented?
+
+unistr.h
+
+* Elementary string checks::
+* Elementary string conversions::
+* Elementary string functions::
+* Elementary string functions with memory allocation::
+* Elementary string functions on NUL terminated strings::
+
+unictype.h
+
+* General category::
+* Canonical combining class::
+* Bidirectional category::
+* Decimal digit value::
+* Digit value::
+* Numeric value::
+* Mirrored character::
+* Properties::
+* Scripts::
+* Blocks::
+* ISO C and Java syntax::
+* Classifications like in ISO C::
+
+General category
+
+* Object oriented API::
+* Bit mask API::
+
+Properties
+
+* Properties as objects::
+* Properties as functions::
+
+uniwbrk.h
+
+* Word breaks in a string::
+* Word break property::
+
+uninorm.h
+
+* Decomposition of characters::
+* Composition of characters::
+* Normalization of strings::
+* Normalizing comparisons::
+* Normalization of streams::
+
+unicase.h
+
+* Case mappings of characters::
+* Case mappings of strings::
+* Case mappings of substrings::
+* Case insensitive comparison::
+* Case detection::
+
+Using the library
+
+* Installation::
+* Compiler options::
+* Include files::
+* Autoconf macro::
+* Reporting problems::
+
+Licenses
+
+* GNU GPL:: GNU General Public License
+* GNU LGPL:: GNU Lesser General Public License
+* GNU FDL:: GNU Free Documentation License
+
+
+File: libunistring.info, Node: Introduction, Next: Conventions, Prev: Top, Up: Top
+
+1 Introduction
+**************
+
+ This library provides functions for manipulating Unicode strings and
+for manipulating C strings according to the Unicode standard.
+
+ It consists of the following parts:
+
+`<unistr.h>'
+ elementary string functions
+
+`<uniconv.h>'
+ conversion from/to legacy encodings
+
+`<unistdio.h>'
+ formatted output to strings
+
+`<uniname.h>'
+ character names
+
+`<unictype.h>'
+ character classification and properties
+
+`<uniwidth.h>'
+ string width when using nonproportional fonts
+
+`<uniwbrk.h>'
+ word breaks
+
+`<unilbrk.h>'
+ line breaking algorithm
+
+`<uninorm.h>'
+ normalization (composition and decomposition)
+
+`<unicase.h>'
+ case folding
+
+`<uniregex.h>'
+ regular expressions (not yet implemented)
+
+ libunistring is for you if your application involves non-trivial text
+processing, such as upper/lower case conversions, line breaking,
+operations on words, or more advanced analysis of text. Text provided
+by the user can, in general, contain characters of all kinds of
+scripts. The text processing functions provided by this library handle
+all scripts and all languages.
+
+ libunistring is for you if your application already uses the ISO C /
+POSIX `<ctype.h>', `<wctype.h>' functions and the text it operates on is
+provided by the user and can be in any language.
+
+ libunistring is also for you if your application uses Unicode
+strings as internal in-memory representation.
+
+* Menu:
+
+* Unicode:: What is Unicode?
+* Unicode and i18n:: Unicode and internationalization
+* Locale encodings:: What is a locale encoding?
+* In-memory representation:: How to represent strings in memory?
+* char * strings:: What to keep in mind with `char *' strings
+* The wchar_t mess:: Why `wchar_t *' strings are useless
+* Unicode strings:: How are Unicode strings represented?
+
+
+File: libunistring.info, Node: Unicode, Next: Unicode and i18n, Up: Introduction
+
+1.1 Unicode
+===========
+
+ Unicode is a standardized repertoire of characters that contains
+characters from all scripts of the world, from Latin letters to Chinese
+ideographs and Babylonian cuneiform glyphs. It also specifies how
+these characters are to be rendered on a screen or on paper, and how
+common text processing (word selection, line breaking, uppercasing of
+page titles etc.) is supposed to behave on Unicode text.
+
+ Unicode also specifies three ways of storing sequences of Unicode
+characters in a computer whose basic unit of data is an 8-bit byte:
+UTF-8
+ Every character is represented as 1 to 4 bytes.
+
+UTF-16
+ Every character is represented as 1 to 2 units of 16 bits.
+
+UTF-32, a.k.a. UCS-4
+ Every character is represented as 1 unit of 32 bits.
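+
+ As a rough illustration of the unit counts listed above, the
+following C sketch computes how many units one character needs in
+UTF-8 and in UTF-16. The function names are made up for this example
+and are not part of libunistring.

```c
#include <stdint.h>

/* Number of 8-bit units needed to store code point CP in UTF-8.
   Returns 0 if CP is a surrogate or lies beyond U+10FFFF.  */
int
utf8_units (uint32_t cp)
{
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;
  if (cp < 0x80)
    return 1;
  if (cp < 0x800)
    return 2;
  if (cp < 0x10000)
    return 3;
  if (cp <= 0x10FFFF)
    return 4;
  return 0;
}

/* Number of 16-bit units needed to store code point CP in UTF-16.
   Returns 0 if CP is a surrogate or lies beyond U+10FFFF.  */
int
utf16_units (uint32_t cp)
{
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;
  if (cp < 0x10000)
    return 1;
  if (cp <= 0x10FFFF)
    return 2;
  return 0;
}
```

+ For instance, U+20AC (EURO SIGN) needs 3 units in UTF-8 and 1 unit
+in UTF-16, whereas U+1F600 needs 4 and 2 units, respectively. In
+UTF-32 every valid character occupies exactly 1 unit.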
+
+ For encoding Unicode text in a file, UTF-8 is usually used. For
+encoding Unicode strings in memory for a program, any of the three
+encoding forms can be reasonably used.
+
+ Unicode is widely used on the web. Prior to the use of Unicode, web
+pages were in many different encodings (ISO-8859-1 for English, French,
+Spanish, ISO-8859-2 for Polish, ISO-8859-7 for Greek, KOI8-R for
+Russian, GB2312 or BIG5 for Chinese, ISO-2022-JP-2 or EUC-JP or
+Shift_JIS for Japanese, and many many others). It was next to
+impossible to create a document that contained Chinese and Polish text
+in the same document. Due to the many encodings for Japanese, even the
+processing of pure Japanese text was error prone.
+
+ References:
+ * The Unicode standard: `http://www.unicode.org/'
+
+ * Definition of UTF-8: `http://www.rfc-editor.org/rfc/rfc3629.txt'
+
+ * Definition of UTF-16: `http://www.rfc-editor.org/rfc/rfc2781.txt'
+
+ * Markus Kuhn's UTF-8 and Unicode FAQ:
+ `http://www.cl.cam.ac.uk/~mgk25/unicode.html'
+
+
+File: libunistring.info, Node: Unicode and i18n, Next: Locale encodings, Prev: Unicode, Up: Introduction
+
+1.2 Unicode and Internationalization
+====================================
+
+ Internationalization is the process of changing the source code of a
+program so that it can meet the expectations of users in any culture,
+if culture specific data (translations, images etc.) are provided.
+
+ Use of Unicode is not strictly required for internationalization,
+but it makes internationalization much easier, because operations that
+need to look at specific characters (like hyphenation, spell checking,
+or the automatic conversion of double-quotes to opening and closing
+double-quote characters) don't need to consider multiple possible
+encodings of the text.
+
+ Use of Unicode also enables multilingualization: the ability of
+having text in multiple languages present in the same document or even
+in the same line of text.
+
+ But use of Unicode is not everything. Internationalization usually
+consists of three features:
+ * Use of Unicode where needed for text processing. This is what
+ this library is for.
+
+ * Use of message catalogs for messages shown to the user. This is
+ what GNU gettext is about.
+
+ * Use of locale specific conventions for date and time formats, for
+ numeric formatting, or for sorting of text. This can be done
+ adequately with the POSIX APIs and the implementation of locales
+ in the GNU C library.
+
+
+File: libunistring.info, Node: Locale encodings, Next: In-memory representation, Prev: Unicode and i18n, Up: Introduction
+
+1.3 Locale encodings
+====================
+
+ A locale is a set of cultural conventions. According to POSIX, for
+a program, at any moment, there is one locale being designated as the
+"current locale". (Actually, POSIX also supports one locale per
+thread, but this feature is not yet universally implemented and not
+widely used.) The locale is partitioned into several aspects, called
+the "categories" of the locale. The main aspects are:
+ * The character encoding and the character properties. This is the
+ `LC_CTYPE' category.
+
+ * The sorting rules for text. This is the `LC_COLLATE' category.
+
+ * The language specific translations of messages. This is the
+ `LC_MESSAGES' category.
+
+ * The formatting rules for numbers, such as the decimal separator.
+ This is the `LC_NUMERIC' category.
+
+ * The formatting rules for amounts of money. This is the
+ `LC_MONETARY' category.
+
+ * The formatting of date and time. This is the `LC_TIME' category.
+
+ In particular, the `LC_CTYPE' category of the current locale
+determines the character encoding. This is the encoding of `char *'
+strings. We also call it the "locale encoding". GNU libunistring has
+a function, `locale_charset', that returns a standardized (platform
+independent) name for this encoding.
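+
+ For comparison, on POSIX systems the platform's own name for the
+locale encoding can be queried with `nl_langinfo (CODESET)'. The
+sketch below assumes a POSIX system; `current_codeset' is a
+hypothetical helper, not a libunistring function. The names it
+returns differ between platforms, which is the gap that a
+standardized name closes.

```c
#include <langinfo.h>
#include <locale.h>

/* Returns the platform specific name of the encoding selected by the
   LC_CTYPE category, after adopting the locale from the environment.
   The returned string belongs to the C library; do not free it.  */
const char *
current_codeset (void)
{
  /* If the environment names an unknown locale, setlocale fails and
     the "C" locale stays in effect.  */
  setlocale (LC_CTYPE, "");
  return nl_langinfo (CODESET);
}
```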
+
+ All locale encodings used on glibc systems are essentially ASCII
+compatible: Most graphic ASCII characters have the same representation,
+as a single byte, in that encoding as in ASCII.
+
+ Among the possible locale encodings are UTF-8 and GB18030. Both
+can represent any Unicode character as a sequence of bytes. UTF-8
+is used in most of the world, whereas GB18030 is used in the People's
+Republic of China, because it is backward compatible with the GB2312
+encoding that was used in this country earlier.
+
+ The legacy locale encodings, ISO-8859-15 (which supplanted
+ISO-8859-1 in most of Europe), ISO-8859-2, KOI8-R, EUC-JP, etc., are
+still in use in many places, though.
+
+ UTF-16 and UTF-32 are not used as locale encodings, because they are
+not ASCII compatible.
+
+
+File: libunistring.info, Node: In-memory representation, Next: char * strings, Prev: Locale encodings, Up: Introduction
+
+1.4 Choice of in-memory representation of strings
+=================================================
+
+ There are three ways of representing strings in memory of a running
+program.
+ * As `char *' strings. Such strings are represented in locale
+ encoding. This approach is employed when not much text processing
+ is done by the program. When some Unicode aware processing is to
+ be done, a string is converted to Unicode on the fly and back to
+ locale encoding afterwards.
+
+ * As UTF-8 or UTF-16 or UTF-32 strings. This implies that
+ conversion from locale encoding to Unicode is performed on input,
+ and in the opposite direction on output. This approach is
+ employed when the program does a significant amount of text
+ processing, or when the program has multiple threads operating on
+ the same data but in different locales.
+
+ * As `wchar_t *', a.k.a. "wide strings". This approach is misguided,
+ see *note The wchar_t mess::.
+
+
+File: libunistring.info, Node: char * strings, Next: The wchar_t mess, Prev: In-memory representation, Up: Introduction
+
+1.5 `char *' strings
+====================
+
+ The classical C strings, with their C library support standardized by
+ISO C and POSIX, can be used in internationalized programs with some
+precautions. The problem with this API is that many of the C library
+functions for strings don't work correctly on strings in locale
+encodings, leading to bugs that only people in some cultures of the
+world will experience.
+
+ The first problem with the C library API is the support of multibyte
+locales. According to the locale encoding, in general, every character
+is represented by one or more bytes (up to 4 bytes in practice -- but
+use `MB_LEN_MAX' instead of the number 4 in the code). When every
+character is represented by only 1 byte, we speak of an "unibyte
+locale", otherwise of a "multibyte locale". It is important to realize
+that the majority of Unix installations nowadays use UTF-8 or GB18030
+as locale encoding; therefore, the majority of users are using
+multibyte locales.
+
+ The important fact to remember is: _A `char' is a byte, not a
+character._
+
+ As a consequence:
+ * The `<ctype.h>' API is useless in this context; it does not work in
+ multibyte locales.
+
+ * The `strlen' function does not return the number of characters in
+ a string. Nor does it return the number of screen columns occupied
+ by a string after it is output. It merely returns the number of
+ _bytes_ occupied by a string.
+
+ * Truncating a string, for example, with `strncpy', can have the
+ effect of truncating it in the middle of a multibyte character.
+ Such a string will, when output, have a garbled character at its
+ end, often represented by a hollow box.
+
+ * `strchr' and `strrchr' do not work with multibyte strings if the
+ locale encoding is GB18030 and the character to be searched is a
+ digit.
+
+ * `strstr' does not work with multibyte strings if the locale
+ encoding is different from UTF-8.
+
+ * `strcspn', `strpbrk', `strspn' cannot work correctly in multibyte
+ locales: they assume the second argument is a list of single-byte
+ characters. Even in this simple case, they do not work with
+ multibyte strings if the locale encoding is GB18030 and one of the
+ characters to be searched is a digit.
+
+ * `strsep' and `strtok_r' do not work with multibyte strings unless
+ all of the delimiter characters are ASCII characters < 0x30.
+
+ * The `strcasecmp', `strncasecmp', and `strcasestr' functions do not
+ work with multibyte strings.
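+
+ The difference between bytes and characters can be made concrete
+with a small sketch. It assumes the string is well-formed UTF-8, and
+`utf8_char_count' is a name invented for this example; it simply
+skips continuation bytes (those of the form 10xxxxxx).

```c
#include <stddef.h>
#include <string.h>

/* Counts the characters in a NUL terminated string that is assumed
   to be well-formed UTF-8, by counting every byte that is not a
   continuation byte.  */
size_t
utf8_char_count (const char *s)
{
  size_t count = 0;

  for (; *s != '\0'; s++)
    if (((unsigned char) *s & 0xC0) != 0x80)
      count++;
  return count;
}
```

+ For the French word "été", encoded in UTF-8 as the 5 bytes C3 A9
+74 C3 A9, `strlen' reports 5 while this function reports 3.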
+
+ The workarounds can be found in GNU gnulib
+`http://www.gnu.org/software/gnulib/'.
+ * gnulib has modules `mbchar', `mbiter', `mbuiter' that represent
+ multibyte characters and allow iterating across a multibyte
+ string with the same ease as through a unibyte string.
+
+ * gnulib has functions `mbslen' and `mbswidth' that can be used
+ instead of `strlen' when the number of characters or the number of
+ screen columns of a string is requested.
+
+ * gnulib has functions `mbschr' and `mbsrrchr' that are like
+ `strchr' and `strrchr', but work in multibyte locales.
+
+ * gnulib has a function `mbsstr', like `strstr', but works in
+ multibyte locales.
+
+ * gnulib has functions `mbscspn', `mbspbrk', `mbsspn' that are like
+ `strcspn', `strpbrk', `strspn', but work in multibyte locales.
+
+ * gnulib has functions `mbssep' and `mbstok_r' that are like
+ `strsep' and `strtok_r' but work in multibyte locales.
+
+ * gnulib has functions `mbscasecmp', `mbsncasecmp', `mbspcasecmp',
+ and `mbscasestr' that are like `strcasecmp', `strncasecmp', and
+ `strcasestr', but work in multibyte locales. Still, the function
+ `ulc_casecmp' is preferable to these functions; see below.
+
+ The second problem with the C library API is that it has some
+assumptions built-in that are not valid in some languages:
+ * It assumes that there are only two forms of every character:
+ uppercase and lowercase. This is not true for Croatian, where the
+ character LETTER DZ WITH CARON comes in three forms: LATIN CAPITAL
+ LETTER DZ WITH CARON (DZ), LATIN CAPITAL LETTER D WITH SMALL
+ LETTER Z WITH CARON (Dz), LATIN SMALL LETTER DZ WITH CARON (dz).
+
+ * It assumes that uppercasing of 1 character leads to 1 character.
+ This is not true for German, where the LATIN SMALL LETTER SHARP S,
+ when uppercased, becomes `SS'.
+
+ * It assumes that there is 1:1 mapping between uppercase and
+ lowercase forms. This is not true for the Greek sigma: GREEK
+ CAPITAL LETTER SIGMA is the uppercase of both GREEK SMALL LETTER
+ SIGMA and GREEK SMALL LETTER FINAL SIGMA.
+
+ * It assumes that the upper/lowercase mappings are position
+ independent. This is not true for the Greek sigma and the
+ Lithuanian i.
+
+ The correct way to deal with this problem is
+ 1. to provide functions for titlecasing, as well as for upper- and
+ lowercasing,
+
+ 2. to view case transformations as functions that operate on strings,
+ rather than on characters.
+
+ This is implemented in this library, through the functions declared
+in `<unicase.h>', see *note unicase.h::.
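+
+ To see why a string-to-string interface is needed, consider the
+following toy sketch. It is not how `<unicase.h>' works: it assumes
+UTF-8 input and knows only ASCII letters and the LATIN SMALL LETTER
+SHARP S (bytes C3 9F). The name `toy_upper' is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Toy uppercasing of the UTF-8 string SRC into DEST, which has room
   for DESTSIZE bytes including the terminating NUL.  Only ASCII
   letters and U+00DF are mapped; all other bytes are copied as is.
   Returns the number of bytes written, not counting the NUL, or
   (size_t) -1 if DEST is too small.  */
size_t
toy_upper (const char *src, char *dest, size_t destsize)
{
  size_t out = 0;

  while (*src != '\0')
    {
      unsigned char c = (unsigned char) *src;

      if (c == 0xC3 && (unsigned char) src[1] == 0x9F)
        {
          /* One input character becomes two output characters.  */
          if (out + 2 >= destsize)
            return (size_t) -1;
          dest[out++] = 'S';
          dest[out++] = 'S';
          src += 2;
        }
      else
        {
          if (out + 1 >= destsize)
            return (size_t) -1;
          dest[out++] = (c >= 'a' && c <= 'z'
                         ? (char) (c - 'a' + 'A') : (char) c);
          src++;
        }
    }
  if (out >= destsize)
    return (size_t) -1;
  dest[out] = '\0';
  return out;
}
```

+ Applied to the 6 characters of "straße", this yields the 7
+characters of "STRASSE"; no function that maps one character to one
+character can produce that result.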
+
+
+File: libunistring.info, Node: The wchar_t mess, Next: Unicode strings, Prev: char * strings, Up: Introduction
+
+1.6 The `wchar_t' mess
+======================
+
+ The ISO C and POSIX standard creators made an attempt to fix the
+first problem mentioned in the previous section. They introduced
+ * a type `wchar_t', designed to encapsulate an entire character,
+
+ * a "wide string" type `wchar_t *', and
+
+ * functions declared in `<wctype.h>' that were meant to supplant the
+ ones in `<ctype.h>'.
+
+ Unfortunately, this API and its implementation have numerous problems:
+
+ * On AIX and Windows platforms, `wchar_t' is a 16-bit type. This
+ means that it can never accommodate an entire Unicode character.
+ Either the `wchar_t *' strings are limited to characters in UCS-2
+ (the "Basic Multilingual Plane" of Unicode), or -- if `wchar_t *'
+ strings are encoded in UTF-16 -- a `wchar_t' represents only half
+ of a character in the worst case, making the `<wctype.h>' functions
+ pointless.
+
+ * On Solaris and FreeBSD, the `wchar_t' encoding is locale dependent
+ and undocumented. This means, if you want to know any property of
+ a `wchar_t' character, other than the properties defined by
+ `<wctype.h>' -- such as whether it's a dash, currency symbol,
+ paragraph separator, or similar --, you have to convert it to
+ `char *' encoding first, by use of the function `wctomb'.
+
+ * When you read a stream of wide characters, through the functions
+ `fgetwc' and `fgetws', and when the input stream/file is not in
+ the expected encoding, you have no way to determine the invalid
+ byte sequence and do some corrective action. If you use these
+ functions, your program becomes "garbage in - more garbage out" or
+ "garbage in - abort".
+
+ As a consequence, it is better to use multibyte strings, as
+explained in the previous section. Such multibyte strings can bypass
+limitations of the `wchar_t' type, if you use functions defined in
+gnulib and libunistring for text processing. They can also faithfully
+transport malformed characters that were present in the input, without
+requiring the program to produce garbage or abort.
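+
+ The point about 16-bit `wchar_t' can be illustrated by encoding a
+character outside the Basic Multilingual Plane in UTF-16. The sketch
+below uses plain `uint16_t' units; `utf16_encode' is a name invented
+for this example.

```c
#include <stdint.h>

/* Stores the UTF-16 encoding of code point CP in OUT.  Returns the
   number of units written (1 or 2), or 0 if CP is a surrogate or
   lies beyond U+10FFFF.  */
int
utf16_encode (uint32_t cp, uint16_t out[2])
{
  if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;
  if (cp < 0x10000)
    {
      out[0] = (uint16_t) cp;
      return 1;
    }
  /* Split the 20 bits above U+10000 into a high and a low surrogate.  */
  cp -= 0x10000;
  out[0] = (uint16_t) (0xD800 + (cp >> 10));
  out[1] = (uint16_t) (0xDC00 + (cp & 0x3FF));
  return 2;
}
```

+ U+1F600 becomes the two units D83D DE00. Each unit on its own is
+only half of a character, so a classification function that receives
+a single 16-bit unit has nothing meaningful to classify.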
+
+
+File: libunistring.info, Node: Unicode strings, Prev: The wchar_t mess, Up: Introduction
+
+1.7 Unicode strings
+===================
+
+ libunistring supports Unicode strings in three representations:
+ * UTF-8 strings, through the type `uint8_t *'. The units are bytes
+ (`uint8_t').
+
+ * UTF-16 strings, through the type `uint16_t *'. The units are
+ 16-bit memory words (`uint16_t').
+
+ * UTF-32 strings, through the type `uint32_t *'. The units are
+ 32-bit memory words (`uint32_t').
+
+ As with C strings, there are two variants:
+ * Unicode strings with a terminating NUL character are represented as
+ a pointer to the first unit of the string. There is a unit
+ containing a 0 value at the end. It is considered part of the
+ string for all memory allocation purposes, but is not considered
+ part of the string for all other logical purposes.
+
+ * Unicode strings where embedded NUL characters are allowed. These
+ are represented by a pointer to the first unit and the number of
+ units (not bytes!) of the string. In this setting, there is no
+ trailing zero-valued unit used as "end marker".
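+
+ The two variants can be pictured with a UTF-16 string. The helper
+below is a hypothetical name used only for this sketch; it measures
+a NUL terminated string in units.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns the number of units in the NUL terminated UTF-16 string S,
   not counting the terminating zero unit.  */
size_t
units_until_nul (const uint16_t *s)
{
  size_t n = 0;

  while (s[n] != 0)
    n++;
  return n;
}
```

+ Given `const uint16_t hi[] = { 0x0048, 0x0069, 0 };', the NUL
+terminated variant is just the pointer `hi', whose length is 2 units.
+The variant with explicit length is the pair `(hi, 2)'; it covers 4
+bytes, while the array including the end marker occupies 6 bytes.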
+
+
+File: libunistring.info, Node: Conventions, Next: unitypes.h, Prev: Introduction, Up: Top
+
+2 Conventions
+*************
+
+ This chapter explains conventions valid throughout the libunistring
+library.
+
+ Variables of type `char *' denote C strings in locale encoding. See
+*note Locale encodings::.
+
+ Variables of type `uint8_t *' denote UTF-8 strings. Their units are
+bytes.
+
+ Variables of type `uint16_t *' denote UTF-16 strings, without byte
+order mark. Their units are 2-byte words.
+
+ Variables of type `uint32_t *' denote UTF-32 strings, without byte
+order mark. Their units are 4-byte words.
+
+ Argument pairs `(S, N)' denote a string `S[0..N-1]' with exactly N
+units.
+
+ All functions with prefix `ulc_' operate on C strings in locale
+encoding.
+
+ All functions with prefix `u8_' operate on UTF-8 strings.
+
+ All functions with prefix `u16_' operate on UTF-16 strings.
+
+ All functions with prefix `u32_' operate on UTF-32 strings.
+
+ For every function with prefix `u8_', operating on UTF-8 strings,
+there is also a corresponding function with prefix `u16_', operating on
+UTF-16 strings, and a corresponding function with prefix `u32_',
+operating on UTF-32 strings. Their description is analogous; in this
+documentation we describe only the function that operates on UTF-8
+strings, for brevity.
+
+ A declaration with a variable N denotes the three concrete
+declarations with N = 8, N = 16, N = 32.
+
+ All parameters starting with `str' and the parameters of functions
+starting with `u8_str'/`u16_str'/`u32_str' denote a NUL terminated
+string.
+
+ Error values are always returned through the `errno' variable,
+usually with a return value that indicates the presence of an error
+(NULL for functions that return a pointer, or -1 for functions that
+return an `int').
+
+ Functions returning a string result take a `(RESULTBUF, LENGTHP)'
+argument pair. If RESULTBUF is not NULL and the result fits into
+`*LENGTHP' units, it is put in RESULTBUF, and RESULTBUF is returned.
+Otherwise, a freshly allocated string is returned. In both cases,
+`*LENGTHP' is set to the length (number of units) of the returned
+string. In case of error, NULL is returned and `errno' is set.
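+
+ The `(RESULTBUF, LENGTHP)' convention can be sketched with a made-up
+function that merely copies its input. `toy_copy' is hypothetical
+and not part of the library; the sketch also assumes that a freshly
+allocated result is released with `free'.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copies S[0..N-1].  The result goes into RESULTBUF if that is not
   NULL and *LENGTHP units are enough; otherwise into freshly
   allocated memory.  Returns NULL if the allocation fails.  */
uint8_t *
toy_copy (const uint8_t *s, size_t n, uint8_t *resultbuf, size_t *lengthp)
{
  uint8_t *result;

  if (resultbuf != NULL && n <= *lengthp)
    result = resultbuf;
  else
    {
      result = malloc (n > 0 ? n : 1);
      if (result == NULL)
        return NULL;
    }
  memcpy (result, s, n);
  *lengthp = n;
  return result;
}
```

+ A caller typically passes a buffer on the stack together with its
+size in `*LENGTHP', and afterwards frees the result only if it is
+different from that buffer.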
+
+
+File: libunistring.info, Node: unitypes.h, Next: unistr.h, Prev: Conventions, Up: Top
+
+3 Elementary types `<unitypes.h>'
+*********************************
+
+ The include file `<unitypes.h>' provides the following basic types.
+
+ -- Type: uint8_t
+ -- Type: uint16_t
+ -- Type: uint32_t
+ These are the storage units of UTF-8/16/32 strings, respectively.
+ The definitions are taken from `<stdint.h>', on platforms where
+ this include file is present.
+
+ -- Type: ucs4_t
+ This type represents a single Unicode character, outside of a
+ UTF-32 string.
+
+
+File: libunistring.info, Node: unistr.h, Next: uniconv.h, Prev: unitypes.h, Up: Top
+
+4 Elementary Unicode string functions `<unistr.h>'
+**************************************************
+
+ This include file declares elementary functions for Unicode strings.
+It is essentially the equivalent of what `<string.h>' is for C strings.
+
+* Menu:
+
+* Elementary string checks::
+* Elementary string conversions::
+* Elementary string functions::
+* Elementary string functions with memory allocation::
+* Elementary string functions on NUL terminated strings::
+
+
+File: libunistring.info, Node: Elementary string checks, Next: Elementary string conversions, Up: unistr.h
+
+4.1 Elementary string checks
+============================
+
+ The following function is available to verify the integrity of a
+Unicode string.
+
+ -- Function: const uint8_t * u8_check (const uint8_t *S, size_t N)
+ -- Function: const uint16_t * u16_check (const uint16_t *S, size_t N)
+ -- Function: const uint32_t * u32_check (const uint32_t *S, size_t N)
+ This function checks whether a Unicode string is well-formed. It
+ returns NULL if valid, or a pointer to the first invalid unit
+ otherwise.
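+
+ What "well-formed" means for UTF-8 can be sketched as follows. This
+is an illustration under the rules of RFC 3629, not the library's
+implementation; `toy_u8_check' is a hypothetical name, and it returns
+a pointer to the start of the offending sequence.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns NULL if S[0..N-1] is well-formed UTF-8, otherwise a pointer
   to the first unit of the first malformed sequence.  */
const uint8_t *
toy_u8_check (const uint8_t *s, size_t n)
{
  size_t i = 0;

  while (i < n)
    {
      uint8_t c = s[i];
      size_t len;
      uint32_t cp, min;

      if (c < 0x80)
        {
          i++;
          continue;
        }
      else if ((c & 0xE0) == 0xC0)
        { len = 2; cp = c & 0x1F; min = 0x80; }
      else if ((c & 0xF0) == 0xE0)
        { len = 3; cp = c & 0x0F; min = 0x800; }
      else if ((c & 0xF8) == 0xF0)
        { len = 4; cp = c & 0x07; min = 0x10000; }
      else
        return s + i;           /* stray continuation or invalid byte */

      if (n - i < len)
        return s + i;           /* sequence is cut off */
      for (size_t k = 1; k < len; k++)
        {
          if ((s[i + k] & 0xC0) != 0x80)
            return s + i;       /* missing continuation byte */
          cp = (cp << 6) | (s[i + k] & 0x3F);
        }
      /* Reject overlong forms, surrogates, and values past U+10FFFF.  */
      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return s + i;
      i += len;
    }
  return NULL;
}
```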
+
+
+File: libunistring.info, Node: Elementary string conversions, Next: Elementary string functions, Prev: Elementary string checks, Up: unistr.h
+
+4.2 Elementary string conversions
+=================================
+
+ The following functions perform conversions between the different
+forms of Unicode strings.
+
+ -- Function: uint16_t * u8_to_u16 (const uint8_t *S, size_t N,
+ uint16_t *RESULTBUF, size_t *LENGTHP)
+ Converts a UTF-8 string to a UTF-16 string.
+
+ -- Function: uint32_t * u8_to_u32 (const uint8_t *S, size_t N,
+ uint32_t *RESULTBUF, size_t *LENGTHP)
+ Converts a UTF-8 string to a UTF-32 string.
+
+ -- Function: uint8_t * u16_to_u8 (const uint16_t *S, size_t N, uint8_t
+ *RESULTBUF, size_t *LENGTHP)
+ Converts a UTF-16 string to a UTF-8 string.
+
+ -- Function: uint32_t * u16_to_u32 (const uint16_t *S, size_t N,
+ uint32_t *RESULTBUF, size_t *LENGTHP)
+ Converts a UTF-16 string to a UTF-32 string.
+
+ -- Function: uint8_t * u32_to_u8 (const uint32_t *S, size_t N, uint8_t
+ *RESULTBUF, size_t *LENGTHP)
+ Converts a UTF-32 string to a UTF-8 string.
+
+ -- Function: uint16_t * u32_to_u16 (const uint32_t *S, size_t N,
+ uint16_t *RESULTBUF, size_t *LENGTHP)
+ Converts a UTF-32 string to a UTF-16 string.
+
+
+File: libunistring.info, Node: Elementary string functions, Next: Elementary string functions with memory allocation, Prev: Elementary string conversions, Up: unistr.h
+
+4.3 Elementary string functions
+===============================
+
+ The following functions inspect and return details about the first
+character in a Unicode string.
+
+ -- Function: int u8_mblen (const uint8_t *S, size_t N)
+ -- Function: int u16_mblen (const uint16_t *S, size_t N)
+ -- Function: int u32_mblen (const uint32_t *S, size_t N)
+ Returns the length (number of units) of the first character in S,
+ which is no longer than N. Returns 0 if it is the NUL character.
+ Returns -1 upon failure.
+
+ This function is similar to `mblen', except that it operates on a
+ Unicode string and that S must not be NULL.
+
+ -- Function: int u8_mbtouc_unsafe (ucs4_t *PUC, const uint8_t *S,
+ size_t N)
+ -- Function: int u16_mbtouc_unsafe (ucs4_t *PUC, const uint16_t *S,
+ size_t N)
+ -- Function: int u32_mbtouc_unsafe (ucs4_t *PUC, const uint32_t *S,
+ size_t N)
+ Returns the length (number of units) of the first character in S,
+ putting its `ucs4_t' representation in `*PUC'. Upon failure,
+ `*PUC' is set to `0xfffd', and an appropriate number of units is
+ returned.
+
+ The number of available units, N, must be > 0.
+
+ This function is similar to `mbtowc', except that it operates on a
+ Unicode string, PUC and S must not be NULL, N must be > 0, and the
+ NUL character is not treated specially.
+
+ -- Function: int u8_mbtouc (ucs4_t *PUC, const uint8_t *S, size_t N)
+ -- Function: int u16_mbtouc (ucs4_t *PUC, const uint16_t *S, size_t N)
+ -- Function: int u32_mbtouc (ucs4_t *PUC, const uint32_t *S, size_t N)
+ This function is like `u8_mbtouc_unsafe', except that it will
+ detect an invalid UTF-8 character, even if the library is compiled
+ without `--enable-safety'.
+
+ -- Function: int u8_mbtoucr (ucs4_t *PUC, const uint8_t *S, size_t N)
+ -- Function: int u16_mbtoucr (ucs4_t *PUC, const uint16_t *S, size_t N)
+ -- Function: int u32_mbtoucr (ucs4_t *PUC, const uint32_t *S, size_t N)
+ Returns the length (number of units) of the first character in S,
+ putting its `ucs4_t' representation in `*PUC'. Upon failure,
+ `*PUC' is set to `0xfffd', and -1 is returned for an invalid
+ sequence of units, -2 is returned for an incomplete sequence of
+ units.
+
+ The number of available units, N, must be > 0.
+
+ This function is similar to `u8_mbtouc', except that the return
+ value gives more details about the failure, similar to `mbrtowc'.
+
+ The following function stores a Unicode character as a Unicode
+string in memory.
+
+ -- Function: int u8_uctomb (uint8_t *S, ucs4_t UC, int N)
+ -- Function: int u16_uctomb (uint16_t *S, ucs4_t UC, int N)
+ -- Function: int u32_uctomb (uint32_t *S, ucs4_t UC, int N)
+ Puts the multibyte character represented by UC in S, returning its
+ length. Returns -1 upon failure, -2 if the number of available
+ units, N, is too small. The latter case cannot occur if N >=
+ 6/2/1, respectively.
+
+ This function is similar to `wctomb', except that it operates on
+ Unicode strings, S must not be NULL, and the argument N must be
+ specified.
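+
+ The return value contract described above can be sketched for the
+UTF-8 case. This is an illustration, not the library's code:
+`toy_u8_uctomb' is a hypothetical name, `uint32_t' stands in for
+`ucs4_t', and the sketch never writes more than 4 units, following
+RFC 3629.

```c
#include <stdint.h>

/* Stores the UTF-8 encoding of UC in S, which has room for N units.
   Returns the number of units written, -1 if UC is not a valid
   Unicode character, or -2 if N is too small.  */
int
toy_u8_uctomb (uint8_t *s, uint32_t uc, int n)
{
  int len;

  if (uc < 0x80)
    len = 1;
  else if (uc < 0x800)
    len = 2;
  else if (uc < 0x10000)
    {
      if (uc >= 0xD800 && uc <= 0xDFFF)
        return -1;              /* surrogates are not characters */
      len = 3;
    }
  else if (uc <= 0x10FFFF)
    len = 4;
  else
    return -1;

  if (n < len)
    return -2;

  switch (len)
    {
    case 1:
      s[0] = (uint8_t) uc;
      break;
    case 2:
      s[0] = (uint8_t) (0xC0 | (uc >> 6));
      s[1] = (uint8_t) (0x80 | (uc & 0x3F));
      break;
    case 3:
      s[0] = (uint8_t) (0xE0 | (uc >> 12));
      s[1] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      s[2] = (uint8_t) (0x80 | (uc & 0x3F));
      break;
    default:
      s[0] = (uint8_t) (0xF0 | (uc >> 18));
      s[1] = (uint8_t) (0x80 | ((uc >> 12) & 0x3F));
      s[2] = (uint8_t) (0x80 | ((uc >> 6) & 0x3F));
      s[3] = (uint8_t) (0x80 | (uc & 0x3F));
      break;
    }
  return len;
}
```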
+
+ The following functions copy Unicode strings in memory.
+
+ -- Function: uint8_t * u8_cpy (uint8_t *DEST, const uint8_t *SRC,
+ size_t N)
+ -- Function: uint16_t * u16_cpy (uint16_t *DEST, const uint16_t *SRC,
+ size_t N)
+ -- Function: uint32_t * u32_cpy (uint32_t *DEST, const uint32_t *SRC,
+ size_t N)
+ Copies N units from SRC to DEST.
+
+ This function is similar to `memcpy', except that it operates on
+ Unicode strings.
+
+ -- Function: uint8_t * u8_move (uint8_t *DEST, const uint8_t *SRC,
+ size_t N)
+ -- Function: uint16_t * u16_move (uint16_t *DEST, const uint16_t *SRC,
+ size_t N)
+ -- Function: uint32_t * u32_move (uint32_t *DEST, const uint32_t *SRC,
+ size_t N)
+ Copies N units from SRC to DEST, guaranteeing correct behavior for
+ overlapping memory areas.
+
+ This function is similar to `memmove', except that it operates on
+ Unicode strings.
+
+ The following function fills a Unicode string.
+
+ -- Function: uint8_t * u8_set (uint8_t *S, ucs4_t UC, size_t N)
+ -- Function: uint16_t * u16_set (uint16_t *S, ucs4_t UC, size_t N)
+ -- Function: uint32_t * u32_set (uint32_t *S, ucs4_t UC, size_t N)
+ Sets the first N characters of S to UC. UC should be a character
+ that occupies only 1 unit.
+
+ This function is similar to `memset', except that it operates on
+ Unicode strings.
+
+ The following function compares two Unicode strings of the same
+length.
+
+ -- Function: int u8_cmp (const uint8_t *S1, const uint8_t *S2, size_t
+ N)
+ -- Function: int u16_cmp (const uint16_t *S1, const uint16_t *S2,
+ size_t N)
+ -- Function: int u32_cmp (const uint32_t *S1, const uint32_t *S2,
+ size_t N)
+ Compares S1 and S2, each of length N, lexicographically. Returns
+ a negative value if S1 compares smaller than S2, a positive value
+ if S1 compares larger than S2, or 0 if they compare equal.
+
+ This function is similar to `memcmp', except that it operates on
+ Unicode strings.
+
+ The following function compares two Unicode strings of possibly
+different lengths.
+
+ -- Function: int u8_cmp2 (const uint8_t *S1, size_t N1, const uint8_t
+ *S2, size_t N2)
+ -- Function: int u16_cmp2 (const uint16_t *S1, size_t N1, const
+ uint16_t *S2, size_t N2)
+ -- Function: int u32_cmp2 (const uint32_t *S1, size_t N1, const
+ uint32_t *S2, size_t N2)
+ Compares S1 and S2, lexicographically. Returns a negative value
+ if S1 compares smaller than S2, a positive value if S1 compares
+ larger than S2, or 0 if they compare equal.
+
+ This function is similar to the gnulib function `memcmp2', except
+ that it operates on Unicode strings.
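+
+ One plausible way to compare strings of different lengths is shown
+below for UTF-8. The sketch assumes that a string which is a proper
+prefix of the other compares smaller; `toy_u8_cmp2' is a hypothetical
+name, not the library's implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Compares S1[0..N1-1] and S2[0..N2-1] unit by unit.  If one string
   is a prefix of the other, the shorter one compares smaller.  */
int
toy_u8_cmp2 (const uint8_t *s1, size_t n1, const uint8_t *s2, size_t n2)
{
  size_t common = n1 < n2 ? n1 : n2;
  int cmp = common > 0 ? memcmp (s1, s2, common) : 0;

  if (cmp != 0)
    return cmp;
  return (n1 > n2) - (n1 < n2);
}
```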
+
+ The following function searches for a given Unicode character.
+
+ -- Function: uint8_t * u8_chr (const uint8_t *S, size_t N, ucs4_t UC)
+ -- Function: uint16_t * u16_chr (const uint16_t *S, size_t N, ucs4_t
+ UC)
+ -- Function: uint32_t * u32_chr (const uint32_t *S, size_t N, ucs4_t
+ UC)
+ Searches the string at S for UC. Returns a pointer to the first
+ occurrence of UC in S, or NULL if UC does not occur in S.
+
+ This function is similar to `memchr', except that it operates on
+ Unicode strings.
+
+ The following function counts the number of Unicode characters.
+
+ -- Function: size_t u8_mbsnlen (const uint8_t *S, size_t N)
+ -- Function: size_t u16_mbsnlen (const uint16_t *S, size_t N)
+ -- Function: size_t u32_mbsnlen (const uint32_t *S, size_t N)
+ Counts and returns the number of Unicode characters in the N units
+ from S.
+
+ This function is similar to the gnulib function `mbsnlen', except
+ that it operates on Unicode strings.
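+
+ The difference between units and characters can be sketched as
+follows. The helper `sketch_u8_count_chars' is hypothetical and
+assumes well-formed UTF-8 input; how the library itself treats
+ill-formed sequences is not covered by this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of libunistring: counts the characters
   in N units of well-formed UTF-8 by counting every unit that is not a
   continuation byte (bit pattern 10xxxxxx).  */
static size_t
sketch_u8_count_chars (const uint8_t *s, size_t n)
{
  size_t count = 0;

  for (size_t i = 0; i < n; i++)
    if ((s[i] & 0xC0) != 0x80)
      count++;
  return count;
}
```

+ For instance, the six units of `a', U+00E9 and U+20AC in UTF-8 make
+up three characters.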
+
+
+File: libunistring.info, Node: Elementary string functions with memory allocation, Next: Elementary string functions on NUL terminated strings, Prev: Elementary string functions, Up: unistr.h
+
+4.4 Elementary string functions with memory allocation
+======================================================
+
+ The following function copies a Unicode string.
+
+ -- Function: uint8_t * u8_cpy_alloc (const uint8_t *S, size_t N)
+ -- Function: uint16_t * u16_cpy_alloc (const uint16_t *S, size_t N)
+ -- Function: uint32_t * u32_cpy_alloc (const uint32_t *S, size_t N)
+ Makes a freshly allocated copy of S, of length N.
+
+
+File: libunistring.info, Node: Elementary string functions on NUL terminated strings, Prev: Elementary string functions with memory allocation, Up: unistr.h
+
+4.5 Elementary string functions on NUL terminated strings
+=========================================================
+
+ The following functions inspect and return details about the first
+character in a Unicode string.
+
+ -- Function: int u8_strmblen (const uint8_t *S)
+ -- Function: int u16_strmblen (const uint16_t *S)
+ -- Function: int u32_strmblen (const uint32_t *S)
+ Returns the length (number of units) of the first character in S.
+ Returns 0 if it is the NUL character. Returns -1 upon failure.
+
+ -- Function: int u8_strmbtouc (ucs4_t *PUC, const uint8_t *S)
+ -- Function: int u16_strmbtouc (ucs4_t *PUC, const uint16_t *S)
+ -- Function: int u32_strmbtouc (ucs4_t *PUC, const uint32_t *S)
+ Returns the length (number of units) of the first character in S,
+ putting its `ucs4_t' representation in `*PUC'. Returns 0 if it is
+ the NUL character. Returns -1 upon failure.
+
+ -- Function: const uint8_t * u8_next (ucs4_t *PUC, const uint8_t *S)
+ -- Function: const uint16_t * u16_next (ucs4_t *PUC, const uint16_t *S)
+ -- Function: const uint32_t * u32_next (ucs4_t *PUC, const uint32_t *S)
+ Forward iteration step. Advances the pointer past the next
+ character, or returns NULL if the end of the string has been
+ reached. Puts the character's `ucs4_t' representation in `*PUC'.
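+
+ A forward iteration step of this kind might look roughly like the
+following for UTF-8. The function `sketch_u8_next' is a hypothetical
+stand-in written for illustration; it assumes well-formed input and
+uses `uint32_t' in place of `ucs4_t'.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in, not the library function: decodes the
   character at S, stores it in *PUC, and returns a pointer past it,
   or NULL when S points at the terminating NUL.  */
static const uint8_t *
sketch_u8_next (uint32_t *puc, const uint8_t *s)
{
  uint8_t c = s[0];

  if (c == 0)
    {
      *puc = 0;
      return NULL;
    }
  if (c < 0x80)
    {
      *puc = c;
      return s + 1;
    }
  if ((c & 0xE0) == 0xC0)
    {
      *puc = ((uint32_t) (c & 0x1F) << 6) | (uint32_t) (s[1] & 0x3F);
      return s + 2;
    }
  if ((c & 0xF0) == 0xE0)
    {
      *puc = ((uint32_t) (c & 0x0F) << 12)
             | ((uint32_t) (s[1] & 0x3F) << 6)
             | (uint32_t) (s[2] & 0x3F);
      return s + 3;
    }
  *puc = ((uint32_t) (c & 0x07) << 18)
         | ((uint32_t) (s[1] & 0x3F) << 12)
         | ((uint32_t) (s[2] & 0x3F) << 6)
         | (uint32_t) (s[3] & 0x3F);
  return s + 4;
}
```

+ A caller would typically loop until the returned pointer is NULL,
+handling one character per iteration.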
+
+ The following function inspects and returns details about the
+previous character in a Unicode string.
+
+ -- Function: const uint8_t * u8_prev (ucs4_t *PUC, const uint8_t *S,
+ const uint8_t *START)
+ -- Function: const uint16_t * u16_prev (ucs4_t *PUC, const uint16_t
+ *S, const uint16_t *START)
+ -- Function: const uint32_t * u32_prev (ucs4_t *PUC, const uint32_t
+ *S, const uint32_t *START)
+ Backward iteration step. Moves the pointer back to the start of the
+ previous character, or returns NULL if the beginning of the string
+ has been reached. Puts the character's `ucs4_t' representation in
+ `*PUC'.
+
+ The following functions determine the length of a Unicode string.
+
+ -- Function: size_t u8_strlen (const uint8_t *S)
+ -- Function: size_t u16_strlen (const uint16_t *S)
+ -- Function: size_t u32_strlen (const uint32_t *S)
+ Returns the number of units in S.
+
+ This function is similar to `strlen' and `wcslen', except that it
+ operates on Unicode strings.
+
+ -- Function: size_t u8_strnlen (const uint8_t *S, size_t MAXLEN)
+ -- Function: size_t u16_strnlen (const uint16_t *S, size_t MAXLEN)
+ -- Function: size_t u32_strnlen (const uint32_t *S, size_t MAXLEN)
+ Returns the number of units in S, but at most MAXLEN.
+
+ This function is similar to `strnlen' and `wcsnlen', except that
+ it operates on Unicode strings.
+
+ The following functions copy portions of Unicode strings in memory.
+
+ -- Function: uint8_t * u8_strcpy (uint8_t *DEST, const uint8_t *SRC)
+ -- Function: uint16_t * u16_strcpy (uint16_t *DEST, const uint16_t
+ *SRC)
+ -- Function: uint32_t * u32_strcpy (uint32_t *DEST, const uint32_t
+ *SRC)
+ Copies SRC to DEST.
+
+ This function is similar to `strcpy' and `wcscpy', except that it
+ operates on Unicode strings.
+
+ -- Function: uint8_t * u8_stpcpy (uint8_t *DEST, const uint8_t *SRC)
+ -- Function: uint16_t * u16_stpcpy (uint16_t *DEST, const uint16_t
+ *SRC)
+ -- Function: uint32_t * u32_stpcpy (uint32_t *DEST, const uint32_t
+ *SRC)
+ Copies SRC to DEST, returning the address of the terminating NUL
+ in DEST.
+
+ This function is similar to `stpcpy', except that it operates on
+ Unicode strings.
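+
+ Returning the address of the terminating NUL makes it cheap to chain
+several copies. The following is a minimal sketch of that idea; the
+helper `sketch_u8_stpcpy' is hypothetical and only mirrors the
+behaviour described above.

```c
#include <stdint.h>

/* Hypothetical helper, not the library function: copies SRC, including
   its terminating NUL, to DEST and returns the address of that NUL in
   DEST.  DEST must be large enough.  */
static uint8_t *
sketch_u8_stpcpy (uint8_t *dest, const uint8_t *src)
{
  while ((*dest = *src) != 0)
    {
      dest++;
      src++;
    }
  return dest;
}
```

+ Each further piece can then be appended at the returned address,
+without rescanning the part of DEST that has already been written.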
+
+ -- Function: uint8_t * u8_strncpy (uint8_t *DEST, const uint8_t *SRC,
+ size_t N)
+ -- Function: uint16_t * u16_strncpy (uint16_t *DEST, const uint16_t
+ *SRC, size_t N)
+ -- Function: uint32_t * u32_strncpy (uint32_t *DEST, const uint32_t
+ *SRC, size_t N)
+ Copies no more than N units of SRC to DEST.
+
+ This function is similar to `strncpy' and `wcsncpy', except that
+ it operates on Unicode strings.
+
+ -- Function: uint8_t * u8_stpncpy (uint8_t *DEST, const uint8_t *SRC,
+ size_t N)
+ -- Function: uint16_t * u16_stpncpy (uint16_t *DEST, const uint16_t
+ *SRC, size_t N)
+ -- Function: uint32_t * u32_stpncpy (uint32_t *DEST, const uint32_t
+ *SRC, size_t N)
+ Copies no more than N units of SRC to DEST, returning the address
+ of the last unit written into DEST.
+
+ This function is similar to `stpncpy', except that it operates on
+ Unicode strings.
+
+ -- Function: uint8_t * u8_strcat (uint8_t *DEST, const uint8_t *SRC)
+ -- Function: uint16_t * u16_strcat (uint16_t *DEST, const uint16_t
+ *SRC)
+ -- Function: uint32_t * u32_strcat (uint32_t *DEST, const uint32_t
+ *SRC)
+ Appends SRC onto DEST.
+
+ This function is similar to `strcat' and `wcscat', except that it
+ operates on Unicode strings.
+
+ -- Function: uint8_t * u8_strncat (uint8_t *DEST, const uint8_t *SRC,
+ size_t N)
+ -- Function: uint16_t * u16_strncat (uint16_t *DEST, const uint16_t
+ *SRC, size_t N)
+ -- Function: uint32_t * u32_strncat (uint32_t *DEST, const uint32_t
+ *SRC, size_t N)
+ Appends no more than N units of SRC onto DEST.
+
+ This function is similar to `strncat' and `wcsncat', except that
+ it operates on Unicode strings.
+
+ The following functions compare two Unicode strings.
+
+ -- Function: int u8_strcmp (const uint8_t *S1, const uint8_t *S2)
+ -- Function: int u16_strcmp (const uint16_t *S1, const uint16_t *S2)
+ -- Function: int u32_strcmp (const uint32_t *S1, const uint32_t *S2)
+ Compares S1 and S2, lexicographically. Returns a negative value
+ if S1 compares smaller than S2, a positive value if S1 compares
+ larger than S2, or 0 if they compare equal.
+
+ This function is similar to `strcmp' and `wcscmp', except that it
+ operates on Unicode strings.
+
+ -- Function: int u8_strcoll (const uint8_t *S1, const uint8_t *S2)
+ -- Function: int u16_strcoll (const uint16_t *S1, const uint16_t *S2)
+ -- Function: int u32_strcoll (const uint32_t *S1, const uint32_t *S2)
+ Compares S1 and S2 using the collation rules of the current locale.
+ Returns -1 if S1 < S2, 0 if S1 = S2, 1 if S1 > S2. Upon failure,
+ sets `errno' and returns any value.
+
+ This function is similar to `strcoll' and `wcscoll', except that
+ it operates on Unicode strings.
+
+ Note that this function may consider different canonical
+ normalizations of the same string as having a large distance. It
+ is therefore better to use the function `u8_normcoll' instead of
+ this one; see *note uninorm.h::.
+
+ -- Function: int u8_strncmp (const uint8_t *S1, const uint8_t *S2,
+ size_t N)
+ -- Function: int u16_strncmp (const uint16_t *S1, const uint16_t *S2,
+ size_t N)
+ -- Function: int u32_strncmp (const uint32_t *S1, const uint32_t *S2,
+ size_t N)
+ Compares no more than N units of S1 and S2.
+
+ This function is similar to `strncmp' and `wcsncmp', except that
+ it operates on Unicode strings.
+
+ The following function allocates a duplicate of a Unicode string.
+
+ -- Function: uint8_t * u8_strdup (const uint8_t *S)
+ -- Function: uint16_t * u16_strdup (const uint16_t *S)
+ -- Function: uint32_t * u32_strdup (const uint32_t *S)
+ Duplicates S, returning an identical malloc'd string.
+
+ This function is similar to `strdup' and `wcsdup', except that it
+ operates on Unicode strings.
+
+ The following functions search for a given Unicode character.
+
+ -- Function: uint8_t * u8_strchr (const uint8_t *STR, ucs4_t UC)
+ -- Function: uint16_t * u16_strchr (const uint16_t *STR, ucs4_t UC)
+ -- Function: uint32_t * u32_strchr (const uint32_t *STR, ucs4_t UC)
+ Finds the first occurrence of UC in STR.
+
+ This function is similar to `strchr' and `wcschr', except that it
+ operates on Unicode strings.
+
+ -- Function: uint8_t * u8_strrchr (const uint8_t *STR, ucs4_t UC)
+ -- Function: uint16_t * u16_strrchr (const uint16_t *STR, ucs4_t UC)
+ -- Function: uint32_t * u32_strrchr (const uint32_t *STR, ucs4_t UC)
+ Finds the last occurrence of UC in STR.
+
+ This function is similar to `strrchr' and `wcsrchr', except that
+ it operates on Unicode strings.
+
+ The following functions search for the first occurrence of some
+Unicode character in or outside a given set of Unicode characters.
+
+ -- Function: size_t u8_strcspn (const uint8_t *STR, const uint8_t
+ *REJECT)
+ -- Function: size_t u16_strcspn (const uint16_t *STR, const uint16_t
+ *REJECT)
+ -- Function: size_t u32_strcspn (const uint32_t *STR, const uint32_t
+ *REJECT)
+ Returns the length of the initial segment of STR which consists
+ entirely of Unicode characters not in REJECT.
+
+ This function is similar to `strcspn' and `wcscspn', except that
+ it operates on Unicode strings.
+
+ -- Function: size_t u8_strspn (const uint8_t *STR, const uint8_t
+ *ACCEPT)
+ -- Function: size_t u16_strspn (const uint16_t *STR, const uint16_t
+ *ACCEPT)
+ -- Function: size_t u32_strspn (const uint32_t *STR, const uint32_t
+ *ACCEPT)
+ Returns the length of the initial segment of STR which consists
+ entirely of Unicode characters in ACCEPT.
+
+ This function is similar to `strspn' and `wcsspn', except that it
+ operates on Unicode strings.
+
+ -- Function: uint8_t * u8_strpbrk (const uint8_t *STR, const uint8_t
+ *ACCEPT)
+ -- Function: uint16_t * u16_strpbrk (const uint16_t *STR, const
+ uint16_t *ACCEPT)
+ -- Function: uint32_t * u32_strpbrk (const uint32_t *STR, const
+ uint32_t *ACCEPT)
+ Finds the first occurrence in STR of any character in ACCEPT.
+
+ This function is similar to `strpbrk' and `wcspbrk', except that
+ it operates on Unicode strings.
+
+ The following functions search whether a given Unicode string is a
+substring of another Unicode string.
+
+ -- Function: uint8_t * u8_strstr (const uint8_t *HAYSTACK, const
+ uint8_t *NEEDLE)
+ -- Function: uint16_t * u16_strstr (const uint16_t *HAYSTACK, const
+ uint16_t *NEEDLE)
+ -- Function: uint32_t * u32_strstr (const uint32_t *HAYSTACK, const
+ uint32_t *NEEDLE)
+ Finds the first occurrence of NEEDLE in HAYSTACK.
+
+ This function is similar to `strstr' and `wcsstr', except that it
+ operates on Unicode strings.
+
+ -- Function: bool u8_startswith (const uint8_t *STR, const uint8_t
+ *PREFIX)
+ -- Function: bool u16_startswith (const uint16_t *STR, const uint16_t
+ *PREFIX)
+ -- Function: bool u32_startswith (const uint32_t *STR, const uint32_t
+ *PREFIX)
+ Tests whether STR starts with PREFIX.
+
+ -- Function: bool u8_endswith (const uint8_t *STR, const uint8_t
+ *SUFFIX)
+ -- Function: bool u16_endswith (const uint16_t *STR, const uint16_t
+ *SUFFIX)
+ -- Function: bool u32_endswith (const uint32_t *STR, const uint32_t
+ *SUFFIX)
+ Tests whether STR ends with SUFFIX.
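+
+ For well-formed UTF-8, prefix and suffix tests can be done on units,
+because a continuation unit can never be mistaken for the start of a
+character. The helpers below are hypothetical and only sketch this
+idea; they are not the library's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: number of units before the terminating NUL.  */
static size_t
sketch_units (const uint8_t *s)
{
  size_t n = 0;

  while (s[n] != 0)
    n++;
  return n;
}

/* Hypothetical helper: true if STR begins with all units of PREFIX.  */
static bool
sketch_u8_startswith (const uint8_t *str, const uint8_t *prefix)
{
  for (; *prefix != 0; str++, prefix++)
    if (*str != *prefix)
      return false;
  return true;
}

/* Hypothetical helper: true if STR ends with all units of SUFFIX.  */
static bool
sketch_u8_endswith (const uint8_t *str, const uint8_t *suffix)
{
  size_t len = sketch_units (str);
  size_t slen = sketch_units (suffix);

  return slen <= len && memcmp (str + (len - slen), suffix, slen) == 0;
}
```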
+
+ The following function does one step in tokenizing a Unicode string.
+
+ -- Function: uint8_t * u8_strtok (uint8_t *STR, const uint8_t *DELIM,
+ uint8_t **PTR)
+ -- Function: uint16_t * u16_strtok (uint16_t *STR, const uint16_t
+ *DELIM, uint16_t **PTR)
+ -- Function: uint32_t * u32_strtok (uint32_t *STR, const uint32_t
+ *DELIM, uint32_t **PTR)
+ Divides STR into tokens separated by characters in DELIM.
+
+ This function is similar to `strtok_r' and `wcstok', except that
+ it operates on Unicode strings. Its interface is actually more
+ similar to `wcstok' than to `strtok'.
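+
+ Since the interface is modelled on `wcstok', the calling pattern can
+be sketched with the ISO C function itself. The example below uses
+`wcstok' from `<wchar.h>', not `u8_strtok'; that the Unicode variants
+are called the same way (the string on the first call, NULL on later
+calls) is an assumption based on the comparison above. The helper
+name `sketch_count_tokens' is hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper: counts the tokens in STR that are separated by
   characters from DELIM.  STR is modified in place, as with any
   strtok-style function.  */
static size_t
sketch_count_tokens (wchar_t *str, const wchar_t *delim)
{
  wchar_t *state;
  size_t count = 0;

  for (wchar_t *tok = wcstok (str, delim, &state);
       tok != NULL;
       tok = wcstok (NULL, delim, &state))
    count++;
  return count;
}
```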
+
+
+File: libunistring.info, Node: uniconv.h, Next: unistdio.h, Prev: unistr.h, Up: Top
+
+5 Conversions between Unicode and encodings `<uniconv.h>'
+*********************************************************
+
+ This include file declares functions for converting between Unicode
+strings and `char *' strings in locale encoding or in other specified
+encodings.
+
+ The following function returns the locale encoding.
+
+ -- Function: const char * locale_charset ()
+ Determines the current locale's character encoding, and
+ canonicalizes it into one of the canonical names listed in
+ `config.charset'. If the canonical name cannot be determined, the
+ result is a non-canonical name.
+
+ The result must not be freed; it is statically allocated.
+
+ The result of this function can be used as an argument to the
+ `iconv_open' function in GNU libc, in GNU libiconv, or in the
+ gnulib provided wrapper around the native `iconv_open' function.
+ It may not work as an argument to the native `iconv_open' function
+ directly.
+
+ The handling of unconvertible characters during the conversions can
+be parametrized through the following enumeration type:
+
+ -- Type: enum iconv_ilseq_handler
+ This type specifies how unconvertible characters in the input are
+ handled.
+
+ -- Constant: enum iconv_ilseq_handler iconveh_error
+ This handler causes the function to return with `errno' set to
+ `EILSEQ'.
+
+ -- Constant: enum iconv_ilseq_handler iconveh_question_mark
+ This handler produces one question mark `?' per unconvertible
+ character.
+
+ -- Constant: enum iconv_ilseq_handler iconveh_escape_sequence
+ This handler produces an escape sequence `\uXXXX' or `\UXXXXXXXX'
+ for each unconvertible character.
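+
+ The escape sequence format can be sketched as follows. The helper
+`sketch_escape' is hypothetical; choosing the short form for
+characters below U+10000 and using uppercase hexadecimal digits are
+assumptions of this sketch, not statements about the library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: writes the escape sequence for UC into BUF,
   which should hold at least 11 bytes.  Returns the number of
   characters in the escape sequence, as `snprintf' does.  */
static int
sketch_escape (char *buf, size_t bufsize, uint32_t uc)
{
  if (uc < 0x10000)
    return snprintf (buf, bufsize, "\\u%04X", (unsigned int) uc);
  return snprintf (buf, bufsize, "\\U%08X", (unsigned int) uc);
}
```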
+
+ The following functions convert between strings in a specified
+encoding and Unicode strings.
+
+ -- Function: uint8_t * u8_conv_from_encoding (const char *FROMCODE,
+ enum iconv_ilseq_handler HANDLER, const char *SRC, size_t
+ SRCLEN, size_t *OFFSETS, uint8_t *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint16_t * u16_conv_from_encoding (const char *FROMCODE,
+ enum iconv_ilseq_handler HANDLER, const char *SRC, size_t
+ SRCLEN, size_t *OFFSETS, uint16_t *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint32_t * u32_conv_from_encoding (const char *FROMCODE,
+ enum iconv_ilseq_handler HANDLER, const char *SRC, size_t
+ SRCLEN, size_t *OFFSETS, uint32_t *RESULTBUF, size_t *LENGTHP)
+ Converts an entire string, possibly including NUL bytes, from one
+ encoding to UTF-8, UTF-16, or UTF-32 encoding, respectively.
+
+ Converts a memory region given in encoding FROMCODE. FROMCODE is
+ as for the `iconv_open' function.
+
+ The input is in the memory region between SRC (inclusive) and `SRC
+ + SRCLEN' (exclusive).
+
+ If OFFSETS is not NULL, it should point to an array of SRCLEN
+ integers; this array is filled with offsets into the result, i.e.
+ the character starting at `SRC[i]' corresponds to the character
+ starting at `RESULT[OFFSETS[i]]', and other offsets are set to
+ `(size_t)(-1)'.
+
+ `RESULTBUF' and `*LENGTHP' should be a scratch buffer and its
+ size, or `RESULTBUF' can be NULL.
+
+ May erase the contents of the memory at `RESULTBUF'.
+
+ If successful: The resulting Unicode string (non-NULL) is returned
+ and its length stored in `*LENGTHP'. The resulting string is
+ `RESULTBUF' if no dynamic memory allocation was necessary, or a
+ freshly allocated memory block otherwise.
+
+ In case of error: NULL is returned and `errno' is set. Particular
+ `errno' values: `EINVAL', `EILSEQ', `ENOMEM'.
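+
+ The meaning of the OFFSETS array can be illustrated with a small
+stand-alone sketch. It decodes well-formed UTF-8 input into UTF-32
+by hand instead of calling the library, and the helper name
+`sketch_decode_with_offsets' is hypothetical. The caller provides
+SRCLEN entries for both OFFSETS and RESULT.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not the library function: decodes SRCLEN bytes
   of well-formed UTF-8 into RESULT and records, for every source
   offset, where the corresponding character starts in RESULT.
   Offsets that do not start a character get (size_t)(-1).
   Returns the number of units written to RESULT.  */
static size_t
sketch_decode_with_offsets (const char *src, size_t srclen,
                            size_t *offsets, uint32_t *result)
{
  size_t out = 0;
  size_t i = 0;

  while (i < srclen)
    {
      unsigned char lead = (unsigned char) src[i];
      size_t len = lead < 0x80 ? 1
                   : (lead & 0xE0) == 0xC0 ? 2
                   : (lead & 0xF0) == 0xE0 ? 3 : 4;
      uint32_t uc = len == 1 ? lead
                             : (uint32_t) (lead & (0xFF >> (len + 1)));

      offsets[i] = out;
      for (size_t k = 1; k < len && i + k < srclen; k++)
        {
          uc = (uc << 6) | (uint32_t) ((unsigned char) src[i + k] & 0x3F);
          offsets[i + k] = (size_t) (-1);
        }
      result[out++] = uc;
      i += len;
    }
  return out;
}
```

+ For the four input bytes of `a', U+00E9, `z', this yields the
+offsets 0, 1, `(size_t)(-1)', 2.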
+
+ -- Function: char * u8_conv_to_encoding (const char *TOCODE, enum
+ iconv_ilseq_handler HANDLER, const uint8_t *SRC, size_t
+ SRCLEN, size_t *OFFSETS, char *RESULTBUF, size_t *LENGTHP)
+ -- Function: char * u16_conv_to_encoding (const char *TOCODE, enum
+ iconv_ilseq_handler HANDLER, const uint16_t *SRC, size_t
+ SRCLEN, size_t *OFFSETS, char *RESULTBUF, size_t *LENGTHP)
+ -- Function: char * u32_conv_to_encoding (const char *TOCODE, enum
+ iconv_ilseq_handler HANDLER, const uint32_t *SRC, size_t
+ SRCLEN, size_t *OFFSETS, char *RESULTBUF, size_t *LENGTHP)
+ Converts an entire Unicode string, possibly including NUL units,
+ from UTF-8, UTF-16, or UTF-32 encoding, respectively, to a given
+ encoding.
+
+ Converts a memory region to encoding TOCODE. TOCODE is as for the
+ `iconv_open' function.
+
+ The input is in the memory region between SRC (inclusive) and `SRC
+ + SRCLEN' (exclusive).
+
+ If OFFSETS is not NULL, it should point to an array of SRCLEN
+ integers; this array is filled with offsets into the result, i.e.
+ the character starting at `SRC[i]' corresponds to the character
+ starting at `RESULT[OFFSETS[i]]', and other offsets are set to
+ `(size_t)(-1)'.
+
+ `RESULTBUF' and `*LENGTHP' should be a scratch buffer and its
+ size, or `RESULTBUF' can be NULL.
+
+ May erase the contents of the memory at `RESULTBUF'.
+
+ If successful: The resulting string (non-NULL) is returned
+ and its length stored in `*LENGTHP'. The resulting string is
+ `RESULTBUF' if no dynamic memory allocation was necessary, or a
+ freshly allocated memory block otherwise.
+
+ In case of error: NULL is returned and `errno' is set. Particular
+ `errno' values: `EINVAL', `EILSEQ', `ENOMEM'.
+
+ The following functions convert between NUL terminated strings in a
+specified encoding and NUL terminated Unicode strings.
+
+ -- Function: uint8_t * u8_strconv_from_encoding (const char *STRING,
+ const char *FROMCODE, enum iconv_ilseq_handler HANDLER)
+ -- Function: uint16_t * u16_strconv_from_encoding (const char *STRING,
+ const char *FROMCODE, enum iconv_ilseq_handler HANDLER)
+ -- Function: uint32_t * u32_strconv_from_encoding (const char *STRING,
+ const char *FROMCODE, enum iconv_ilseq_handler HANDLER)
+ Converts a NUL terminated string from a given encoding.
+
+ The result is `malloc' allocated, or NULL (with `errno' set) in
+ case of error.
+
+ Particular `errno' values: `EILSEQ', `ENOMEM'.
+
+ -- Function: char * u8_strconv_to_encoding (const uint8_t *STRING,
+ const char *TOCODE, enum iconv_ilseq_handler HANDLER)
+ -- Function: char * u16_strconv_to_encoding (const uint16_t *STRING,
+ const char *TOCODE, enum iconv_ilseq_handler HANDLER)
+ -- Function: char * u32_strconv_to_encoding (const uint32_t *STRING,
+ const char *TOCODE, enum iconv_ilseq_handler HANDLER)
+ Converts a NUL terminated string to a given encoding.
+
+ The result is `malloc' allocated, or NULL (with `errno' set) in
+ case of error.
+
+ Particular `errno' values: `EILSEQ', `ENOMEM'.
+
+ The following functions are shorthands that convert between NUL
+terminated strings in locale encoding and NUL terminated Unicode
+strings.
+
+ -- Function: uint8_t * u8_strconv_from_locale (const char *STRING)
+ -- Function: uint16_t * u16_strconv_from_locale (const char *STRING)
+ -- Function: uint32_t * u32_strconv_from_locale (const char *STRING)
+ Converts a NUL terminated string from the locale encoding.
+
+ The result is `malloc' allocated, or NULL (with `errno' set) in
+ case of error.
+
+ Particular `errno' values: `ENOMEM'.
+
+ -- Function: char * u8_strconv_to_locale (const uint8_t *STRING)
+ -- Function: char * u16_strconv_to_locale (const uint16_t *STRING)
+ -- Function: char * u32_strconv_to_locale (const uint32_t *STRING)
+ Converts a NUL terminated string to the locale encoding.
+
+ The result is `malloc' allocated, or NULL (with `errno' set) in
+ case of error.
+
+ Particular `errno' values: `ENOMEM'.
+
+
+File: libunistring.info, Node: unistdio.h, Next: uniname.h, Prev: uniconv.h, Up: Top
+
+6 Output with Unicode strings `<unistdio.h>'
+********************************************
+
+ This include file declares functions for doing formatted output with
+Unicode strings. It defines a set of functions similar to `fprintf' and
+`sprintf', which are declared in `<stdio.h>'.
+
+ These functions work like the `printf' function family. In the
+format string:
+ * The format directive `U' takes a UTF-8 string (`const uint8_t *').
+
+ * The format directive `lU' takes a UTF-16 string (`const uint16_t
+ *').
+
+ * The format directive `llU' takes a UTF-32 string (`const uint32_t
+ *').
+
+ A function name with an infix `v' indicates that a `va_list' is
+passed instead of multiple arguments.
+
+ The functions `*sprintf' have a BUF argument that is assumed to be
+large enough. (_DANGEROUS! Overflowing the buffer will crash the
+program._)
+
+ The functions `*snprintf' have a BUF argument that is assumed to be
+SIZE units large. (_DANGEROUS! The resulting string might be
+truncated in the middle of a multibyte character._)
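+
+ One way to cope with such a truncation is to drop an incomplete
+trailing character afterwards. The following is a minimal sketch for
+UTF-8; the helper `sketch_u8_trim_partial' is hypothetical and assumes
+that the string was well-formed before it was cut off.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: given the first N units of a UTF-8 string that
   was well-formed before truncation, returns the largest length <= N
   that ends on a character boundary.  */
static size_t
sketch_u8_trim_partial (const uint8_t *s, size_t n)
{
  size_t i = n;

  /* Step back over at most three trailing continuation units.  */
  while (i > 0 && n - i < 3 && (s[i - 1] & 0xC0) == 0x80)
    i--;
  if (i == 0)
    return 0;

  uint8_t lead = s[i - 1];
  size_t need = lead < 0x80 ? 1
                : (lead & 0xE0) == 0xC0 ? 2
                : (lead & 0xF0) == 0xE0 ? 3
                : (lead & 0xF8) == 0xF0 ? 4 : 1;
  size_t have = n - (i - 1);

  /* Keep the last character only if all of its units are present.  */
  return have >= need ? n : i - 1;
}
```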
+
+ The functions `*asprintf' have a RESULTP argument. The result will
+be freshly allocated and stored in `*RESULTP'.
+
+ The functions `*asnprintf' have a (RESULTBUF, LENGTHP) argument
+pair. If RESULTBUF is not NULL and the result fits into `*LENGTHP'
+units, it is put in RESULTBUF, and RESULTBUF is returned. Otherwise, a
+freshly allocated string is returned. In both cases, `*LENGTHP' is set
+to the length (number of units) of the returned string. In case of
+error, NULL is returned and `errno' is set.
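+
+ The (RESULTBUF, LENGTHP) convention can be sketched with the
+standard `vsnprintf' function. The function `sketch_asnprintf' below
+is hypothetical and handles only `char *' output; counting the
+terminating NUL when deciding whether the result fits is an
+assumption of this sketch.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper, not the library function: formats into
   RESULTBUF if the result fits into *LENGTHP bytes, and into a freshly
   allocated block otherwise.  Stores the result length, without the
   terminating NUL, in *LENGTHP.  Returns NULL on failure.  */
static char *
sketch_asnprintf (char *resultbuf, size_t *lengthp, const char *format, ...)
{
  va_list ap;
  va_list ap2;
  int needed;
  char *result = resultbuf;

  va_start (ap, format);
  va_copy (ap2, ap);
  /* First pass: determine the required length only.  */
  needed = vsnprintf (NULL, 0, format, ap);
  va_end (ap);
  if (needed < 0)
    {
      va_end (ap2);
      return NULL;
    }
  if (resultbuf == NULL || (size_t) needed + 1 > *lengthp)
    {
      result = malloc ((size_t) needed + 1);
      if (result == NULL)
        {
          va_end (ap2);
          return NULL;
        }
    }
  /* Second pass: produce the output.  */
  vsnprintf (result, (size_t) needed + 1, format, ap2);
  va_end (ap2);
  *lengthp = (size_t) needed;
  return result;
}
```

+ The caller frees the result only if it differs from RESULTBUF.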
+
+ The following functions take an ASCII format string and return a
+result that is a `char *' string in locale encoding.
+
+ -- Function: int ulc_sprintf (char *BUF, const char *FORMAT, ...)
+
+ -- Function: int ulc_snprintf (char *BUF, size_t SIZE, const char
+ *FORMAT, ...)
+
+ -- Function: int ulc_asprintf (char **RESULTP, const char *FORMAT, ...)
+
+ -- Function: char * ulc_asnprintf (char *RESULTBUF, size_t *LENGTHP,
+ const char *FORMAT, ...)
+
+ -- Function: int ulc_vsprintf (char *BUF, const char *FORMAT, va_list
+ AP)
+
+ -- Function: int ulc_vsnprintf (char *BUF, size_t SIZE, const char
+ *FORMAT, va_list AP)
+
+ -- Function: int ulc_vasprintf (char **RESULTP, const char *FORMAT,
+ va_list AP)
+
+ -- Function: char * ulc_vasnprintf (char *RESULTBUF, size_t *LENGTHP,
+ const char *FORMAT, va_list AP)
+
+ The following functions take an ASCII format string and return a
+result in UTF-8 format.
+
+ -- Function: int u8_sprintf (uint8_t *BUF, const char *FORMAT, ...)
+
+ -- Function: int u8_snprintf (uint8_t *BUF, size_t SIZE, const char
+ *FORMAT, ...)
+
+ -- Function: int u8_asprintf (uint8_t **RESULTP, const char *FORMAT,
+ ...)
+
+ -- Function: uint8_t * u8_asnprintf (uint8_t *RESULTBUF, size_t
+ *LENGTHP, const char *FORMAT, ...)
+
+ -- Function: int u8_vsprintf (uint8_t *BUF, const char *FORMAT,
+ va_list AP)
+
+ -- Function: int u8_vsnprintf (uint8_t *BUF, size_t SIZE, const char
+ *FORMAT, va_list AP)
+
+ -- Function: int u8_vasprintf (uint8_t **RESULTP, const char *FORMAT,
+ va_list AP)
+
+ -- Function: uint8_t * u8_vasnprintf (uint8_t *RESULTBUF, size_t
+ *LENGTHP, const char *FORMAT, va_list AP)
+
+ The following functions take a UTF-8 format string and return a
+result in UTF-8 format.
+
+ -- Function: int u8_u8_sprintf (uint8_t *BUF, const uint8_t *FORMAT,
+ ...)
+
+ -- Function: int u8_u8_snprintf (uint8_t *BUF, size_t SIZE, const
+ uint8_t *FORMAT, ...)
+
+ -- Function: int u8_u8_asprintf (uint8_t **RESULTP, const uint8_t
+ *FORMAT, ...)
+
+ -- Function: uint8_t * u8_u8_asnprintf (uint8_t *RESULTBUF, size_t
+ *LENGTHP, const uint8_t *FORMAT, ...)
+
+ -- Function: int u8_u8_vsprintf (uint8_t *BUF, const uint8_t *FORMAT,
+ va_list AP)
+
+ -- Function: int u8_u8_vsnprintf (uint8_t *BUF, size_t SIZE, const
+ uint8_t *FORMAT, va_list AP)
+
+ -- Function: int u8_u8_vasprintf (uint8_t **RESULTP, const uint8_t
+ *FORMAT, va_list AP)
+
+ -- Function: uint8_t * u8_u8_vasnprintf (uint8_t *RESULTBUF, size_t
+ *LENGTHP, const uint8_t *FORMAT, va_list AP)
+
+ The following functions take an ASCII format string and return a
+result in UTF-16 format.
+
+ -- Function: int u16_sprintf (uint16_t *BUF, const char *FORMAT, ...)
+
+ -- Function: int u16_snprintf (uint16_t *BUF, size_t SIZE, const char
+ *FORMAT, ...)
+
+ -- Function: int u16_asprintf (uint16_t **RESULTP, const char *FORMAT,
+ ...)
+
+ -- Function: uint16_t * u16_asnprintf (uint16_t *RESULTBUF, size_t
+ *LENGTHP, const char *FORMAT, ...)
+
+ -- Function: int u16_vsprintf (uint16_t *BUF, const char *FORMAT,
+ va_list AP)
+
+ -- Function: int u16_vsnprintf (uint16_t *BUF, size_t SIZE, const char
+ *FORMAT, va_list AP)
+
+ -- Function: int u16_vasprintf (uint16_t **RESULTP, const char
+ *FORMAT, va_list AP)
+
+ -- Function: uint16_t * u16_vasnprintf (uint16_t *RESULTBUF, size_t
+ *LENGTHP, const char *FORMAT, va_list AP)
+
+ The following functions take a UTF-16 format string and return a
+result in UTF-16 format.
+
+ -- Function: int u16_u16_sprintf (uint16_t *BUF, const uint16_t
+ *FORMAT, ...)
+
+ -- Function: int u16_u16_snprintf (uint16_t *BUF, size_t SIZE, const
+ uint16_t *FORMAT, ...)
+
+ -- Function: int u16_u16_asprintf (uint16_t **RESULTP, const uint16_t
+ *FORMAT, ...)
+
+ -- Function: uint16_t * u16_u16_asnprintf (uint16_t *RESULTBUF, size_t
+ *LENGTHP, const uint16_t *FORMAT, ...)
+
+ -- Function: int u16_u16_vsprintf (uint16_t *BUF, const uint16_t
+ *FORMAT, va_list AP)
+
+ -- Function: int u16_u16_vsnprintf (uint16_t *BUF, size_t SIZE, const
+ uint16_t *FORMAT, va_list AP)
+
+ -- Function: int u16_u16_vasprintf (uint16_t **RESULTP, const uint16_t
+ *FORMAT, va_list AP)
+
+ -- Function: uint16_t * u16_u16_vasnprintf (uint16_t *RESULTBUF,
+ size_t *LENGTHP, const uint16_t *FORMAT, va_list AP)
+
+ The following functions take an ASCII format string and return a
+result in UTF-32 format.
+
+ -- Function: int u32_sprintf (uint32_t *BUF, const char *FORMAT, ...)
+
+ -- Function: int u32_snprintf (uint32_t *BUF, size_t SIZE, const char
+ *FORMAT, ...)
+
+ -- Function: int u32_asprintf (uint32_t **RESULTP, const char *FORMAT,
+ ...)
+
+ -- Function: uint32_t * u32_asnprintf (uint32_t *RESULTBUF, size_t
+ *LENGTHP, const char *FORMAT, ...)
+
+ -- Function: int u32_vsprintf (uint32_t *BUF, const char *FORMAT,
+ va_list AP)
+
+ -- Function: int u32_vsnprintf (uint32_t *BUF, size_t SIZE, const char
+ *FORMAT, va_list AP)
+
+ -- Function: int u32_vasprintf (uint32_t **RESULTP, const char
+ *FORMAT, va_list AP)
+
+ -- Function: uint32_t * u32_vasnprintf (uint32_t *RESULTBUF, size_t
+ *LENGTHP, const char *FORMAT, va_list AP)
+
+ The following functions take a UTF-32 format string and return a
+result in UTF-32 format.
+
+ -- Function: int u32_u32_sprintf (uint32_t *BUF, const uint32_t
+ *FORMAT, ...)
+
+ -- Function: int u32_u32_snprintf (uint32_t *BUF, size_t SIZE, const
+ uint32_t *FORMAT, ...)
+
+ -- Function: int u32_u32_asprintf (uint32_t **RESULTP, const uint32_t
+ *FORMAT, ...)
+
+ -- Function: uint32_t * u32_u32_asnprintf (uint32_t *RESULTBUF, size_t
+ *LENGTHP, const uint32_t *FORMAT, ...)
+
+ -- Function: int u32_u32_vsprintf (uint32_t *BUF, const uint32_t
+ *FORMAT, va_list AP)
+
+ -- Function: int u32_u32_vsnprintf (uint32_t *BUF, size_t SIZE, const
+ uint32_t *FORMAT, va_list AP)
+
+ -- Function: int u32_u32_vasprintf (uint32_t **RESULTP, const uint32_t
+ *FORMAT, va_list AP)
+
+ -- Function: uint32_t * u32_u32_vasnprintf (uint32_t *RESULTBUF,
+ size_t *LENGTHP, const uint32_t *FORMAT, va_list AP)
+
+ The following functions take an ASCII format string and produce
+output in locale encoding to a `FILE' stream.
+
+ -- Function: int ulc_fprintf (FILE *STREAM, const char *FORMAT, ...)
+
+ -- Function: int ulc_vfprintf (FILE *STREAM, const char *FORMAT,
+ va_list AP)
+
+
+File: libunistring.info, Node: uniname.h, Next: unictype.h, Prev: unistdio.h, Up: Top
+
+7 Names of Unicode characters `<uniname.h>'
+*******************************************
+
+ This include file implements the association between a Unicode
+character and its name.
+
+ The name of a Unicode character makes it possible to distinguish it
+from other, similar-looking characters. For example, the character `x'
+has the name `"LATIN SMALL LETTER X"' and is therefore different from
+the character named `"MULTIPLICATION SIGN"'.
+
+ -- Macro: unsigned int UNINAME_MAX
+ This macro expands to a constant that is the required size of the
+ buffer for a Unicode character name.
+
+ -- Function: char * unicode_character_name (ucs4_t UC, char *BUF)
+ Looks up the name of a Unicode character, in uppercase ASCII. BUF
+ must point to a buffer, at least `UNINAME_MAX' bytes in size.
+ Returns the filled BUF, or NULL if the character does not have a
+ name.
+
+ -- Function: ucs4_t unicode_name_character (const char *NAME)
+ Looks up the Unicode character with a given name, in upper- or
+ lowercase ASCII. Returns the character if found, or
+ `UNINAME_INVALID' if not found.
+
+ -- Macro: ucs4_t UNINAME_INVALID
+ This macro expands to a constant that is a special return value of
+ the `unicode_name_character' function.
+
+
+File: libunistring.info, Node: unictype.h, Next: uniwidth.h, Prev: uniname.h, Up: Top
+
+8 Unicode character classification and properties `<unictype.h>'
+****************************************************************
+
+ This include file declares functions that classify Unicode characters
+and that test whether Unicode characters have specific properties.
+
+ The classification assigns a "general category" to every Unicode
+character. This is similar to the classification provided by ISO C in
+`<wctype.h>'.
+
+ Properties are the data that guides various text processing
+algorithms in the presence of specific Unicode characters.
+
+* Menu:
+
+* General category::
+* Canonical combining class::
+* Bidirectional category::
+* Decimal digit value::
+* Digit value::
+* Numeric value::
+* Mirrored character::
+* Properties::
+* Scripts::
+* Blocks::
+* ISO C and Java syntax::
+* Classifications like in ISO C::
+
+
+File: libunistring.info, Node: General category, Next: Canonical combining class, Up: unictype.h
+
+8.1 General category
+====================
+
+ Every Unicode character or code point has a _general category_
+assigned to it. This classification is important for most algorithms
+that work on Unicode text.
+
+ The GNU libunistring library provides two kinds of API for working
+with general categories. The object oriented API uses a variable to
+denote every predefined general category value or combinations thereof.
+The low-level API uses a bit mask instead. The advantage of the object
+oriented API is that if only a few predefined general category values
+are used, the data tables are relatively small. When you combine
+general category values (using `uc_general_category_or',
+`uc_general_category_and', or `uc_general_category_and_not'), or when
+you use the low-level bit masks, a big table is used that holds the
+complete general category information for all Unicode characters.
+
+* Menu:
+
+* Object oriented API::
+* Bit mask API::
+
+
+File: libunistring.info, Node: Object oriented API, Next: Bit mask API, Up: General category
+
+8.1.1 The object oriented API for general category
+--------------------------------------------------
+
+ -- Type: uc_general_category_t
+ This data type denotes a general category value. It is an
+ immediate type that can be copied by simple assignment, without
+ involving memory allocation. It is not an array type.
+
+ The following are the predefined general category values. Additional
+general categories may be added in the future.
+
+ -- Constant: uc_general_category_t UC_CATEGORY_L
+ -- Constant: uc_general_category_t UC_CATEGORY_Lu
+ -- Constant: uc_general_category_t UC_CATEGORY_Ll
+ -- Constant: uc_general_category_t UC_CATEGORY_Lt
+ -- Constant: uc_general_category_t UC_CATEGORY_Lm
+ -- Constant: uc_general_category_t UC_CATEGORY_Lo
+ -- Constant: uc_general_category_t UC_CATEGORY_M
+ -- Constant: uc_general_category_t UC_CATEGORY_Mn
+ -- Constant: uc_general_category_t UC_CATEGORY_Mc
+ -- Constant: uc_general_category_t UC_CATEGORY_Me
+ -- Constant: uc_general_category_t UC_CATEGORY_N
+ -- Constant: uc_general_category_t UC_CATEGORY_Nd
+ -- Constant: uc_general_category_t UC_CATEGORY_Nl
+ -- Constant: uc_general_category_t UC_CATEGORY_No
+ -- Constant: uc_general_category_t UC_CATEGORY_P
+ -- Constant: uc_general_category_t UC_CATEGORY_Pc
+ -- Constant: uc_general_category_t UC_CATEGORY_Pd
+ -- Constant: uc_general_category_t UC_CATEGORY_Ps
+ -- Constant: uc_general_category_t UC_CATEGORY_Pe
+ -- Constant: uc_general_category_t UC_CATEGORY_Pi
+ -- Constant: uc_general_category_t UC_CATEGORY_Pf
+ -- Constant: uc_general_category_t UC_CATEGORY_Po
+ -- Constant: uc_general_category_t UC_CATEGORY_S
+ -- Constant: uc_general_category_t UC_CATEGORY_Sm
+ -- Constant: uc_general_category_t UC_CATEGORY_Sc
+ -- Constant: uc_general_category_t UC_CATEGORY_Sk
+ -- Constant: uc_general_category_t UC_CATEGORY_So
+ -- Constant: uc_general_category_t UC_CATEGORY_Z
+ -- Constant: uc_general_category_t UC_CATEGORY_Zs
+ -- Constant: uc_general_category_t UC_CATEGORY_Zl
+ -- Constant: uc_general_category_t UC_CATEGORY_Zp
+ -- Constant: uc_general_category_t UC_CATEGORY_C
+ -- Constant: uc_general_category_t UC_CATEGORY_Cc
+ -- Constant: uc_general_category_t UC_CATEGORY_Cf
+ -- Constant: uc_general_category_t UC_CATEGORY_Cs
+ -- Constant: uc_general_category_t UC_CATEGORY_Co
+ -- Constant: uc_general_category_t UC_CATEGORY_Cn
+
+ The following are alias names for predefined general category values.
+
+ -- Macro: uc_general_category_t UC_LETTER
+ This is another name for `UC_CATEGORY_L'.
+
+ -- Macro: uc_general_category_t UC_UPPERCASE_LETTER
+ This is another name for `UC_CATEGORY_Lu'.
+
+ -- Macro: uc_general_category_t UC_LOWERCASE_LETTER
+ This is another name for `UC_CATEGORY_Ll'.
+
+ -- Macro: uc_general_category_t UC_TITLECASE_LETTER
+ This is another name for `UC_CATEGORY_Lt'.
+
+ -- Macro: uc_general_category_t UC_MODIFIER_LETTER
+ This is another name for `UC_CATEGORY_Lm'.
+
+ -- Macro: uc_general_category_t UC_OTHER_LETTER
+ This is another name for `UC_CATEGORY_Lo'.
+
+ -- Macro: uc_general_category_t UC_MARK
+ This is another name for `UC_CATEGORY_M'.
+
+ -- Macro: uc_general_category_t UC_NON_SPACING_MARK
+ This is another name for `UC_CATEGORY_Mn'.
+
+ -- Macro: uc_general_category_t UC_COMBINING_SPACING_MARK
+ This is another name for `UC_CATEGORY_Mc'.
+
+ -- Macro: uc_general_category_t UC_ENCLOSING_MARK
+ This is another name for `UC_CATEGORY_Me'.
+
+ -- Macro: uc_general_category_t UC_NUMBER
+ This is another name for `UC_CATEGORY_N'.
+
+ -- Macro: uc_general_category_t UC_DECIMAL_DIGIT_NUMBER
+ This is another name for `UC_CATEGORY_Nd'.
+
+ -- Macro: uc_general_category_t UC_LETTER_NUMBER
+ This is another name for `UC_CATEGORY_Nl'.
+
+ -- Macro: uc_general_category_t UC_OTHER_NUMBER
+ This is another name for `UC_CATEGORY_No'.
+
+ -- Macro: uc_general_category_t UC_PUNCTUATION
+ This is another name for `UC_CATEGORY_P'.
+
+ -- Macro: uc_general_category_t UC_CONNECTOR_PUNCTUATION
+ This is another name for `UC_CATEGORY_Pc'.
+
+ -- Macro: uc_general_category_t UC_DASH_PUNCTUATION
+ This is another name for `UC_CATEGORY_Pd'.
+
+ -- Macro: uc_general_category_t UC_OPEN_PUNCTUATION
+ This is another name for `UC_CATEGORY_Ps' ("start punctuation").
+
+ -- Macro: uc_general_category_t UC_CLOSE_PUNCTUATION
+ This is another name for `UC_CATEGORY_Pe' ("end punctuation").
+
+ -- Macro: uc_general_category_t UC_INITIAL_QUOTE_PUNCTUATION
+ This is another name for `UC_CATEGORY_Pi'.
+
+ -- Macro: uc_general_category_t UC_FINAL_QUOTE_PUNCTUATION
+ This is another name for `UC_CATEGORY_Pf'.
+
+ -- Macro: uc_general_category_t UC_OTHER_PUNCTUATION
+ This is another name for `UC_CATEGORY_Po'.
+
+ -- Macro: uc_general_category_t UC_SYMBOL
+ This is another name for `UC_CATEGORY_S'.
+
+ -- Macro: uc_general_category_t UC_MATH_SYMBOL
+ This is another name for `UC_CATEGORY_Sm'.
+
+ -- Macro: uc_general_category_t UC_CURRENCY_SYMBOL
+ This is another name for `UC_CATEGORY_Sc'.
+
+ -- Macro: uc_general_category_t UC_MODIFIER_SYMBOL
+ This is another name for `UC_CATEGORY_Sk'.
+
+ -- Macro: uc_general_category_t UC_OTHER_SYMBOL
+ This is another name for `UC_CATEGORY_So'.
+
+ -- Macro: uc_general_category_t UC_SEPARATOR
+ This is another name for `UC_CATEGORY_Z'.
+
+ -- Macro: uc_general_category_t UC_SPACE_SEPARATOR
+ This is another name for `UC_CATEGORY_Zs'.
+
+ -- Macro: uc_general_category_t UC_LINE_SEPARATOR
+ This is another name for `UC_CATEGORY_Zl'.
+
+ -- Macro: uc_general_category_t UC_PARAGRAPH_SEPARATOR
+ This is another name for `UC_CATEGORY_Zp'.
+
+ -- Macro: uc_general_category_t UC_OTHER
+ This is another name for `UC_CATEGORY_C'.
+
+ -- Macro: uc_general_category_t UC_CONTROL
+ This is another name for `UC_CATEGORY_Cc'.
+
+ -- Macro: uc_general_category_t UC_FORMAT
+ This is another name for `UC_CATEGORY_Cf'.
+
+ -- Macro: uc_general_category_t UC_SURROGATE
+ This is another name for `UC_CATEGORY_Cs'. All code points in this
+ category are invalid characters.
+
+ -- Macro: uc_general_category_t UC_PRIVATE_USE
+ This is another name for `UC_CATEGORY_Co'.
+
+ -- Macro: uc_general_category_t UC_UNASSIGNED
+ This is another name for `UC_CATEGORY_Cn'. Some code points in
+ this category are invalid characters.
+
+ The following functions combine general categories, as in a
+Boolean algebra, except that there is no `not' operation.
+
+ -- Function: uc_general_category_t uc_general_category_or
+ (uc_general_category_t CATEGORY1, uc_general_category_t
+ CATEGORY2)
+ Returns the union of two general categories. This corresponds to
+ the union of the two sets of characters.
+
+ -- Function: uc_general_category_t uc_general_category_and
+ (uc_general_category_t CATEGORY1, uc_general_category_t
+ CATEGORY2)
+ Returns the intersection of two general categories as bit masks.
+ This _does not_ correspond to the intersection of the two sets of
+ characters.
+
+ -- Function: uc_general_category_t uc_general_category_and_not
+ (uc_general_category_t CATEGORY1, uc_general_category_t
+ CATEGORY2)
+ Returns the intersection of a general category with the complement
+ of a second general category, as bit masks. This _does not_
+ correspond to the intersection with complement, when viewing the
+ categories as sets of characters.
+
+ The following functions associate general categories with their name.
+
+ -- Function: const char * uc_general_category_name
+ (uc_general_category_t CATEGORY)
+ Returns the name of a general category. Returns NULL if the
+ general category corresponds to a bit mask that does not have a
+ name.
+
+ -- Function: uc_general_category_t uc_general_category_byname (const
+ char *CATEGORY_NAME)
+ Returns the general category given by name, e.g. `"Lu"'.
+
+ The following functions view general categories as sets of Unicode
+characters.
+
+ -- Function: uc_general_category_t uc_general_category (ucs4_t UC)
+ Returns the general category of a Unicode character.
+
+ This function uses a big table.
+
+ -- Function: bool uc_is_general_category (ucs4_t UC,
+ uc_general_category_t CATEGORY)
+ Tests whether a Unicode character belongs to a given category.
+ The CATEGORY argument can be a predefined general category or the
+ combination of several predefined general categories.
+
+
+File: libunistring.info, Node: Bit mask API, Prev: Object oriented API, Up: General category
+
+8.1.2 The bit mask API for general category
+-------------------------------------------
+
+ The following are the predefined general category values as bit masks.
+Additional general categories may be added in the future.
+
+ -- Macro: uint32_t UC_CATEGORY_MASK_L
+ -- Macro: uint32_t UC_CATEGORY_MASK_Lu
+ -- Macro: uint32_t UC_CATEGORY_MASK_Ll
+ -- Macro: uint32_t UC_CATEGORY_MASK_Lt
+ -- Macro: uint32_t UC_CATEGORY_MASK_Lm
+ -- Macro: uint32_t UC_CATEGORY_MASK_Lo
+ -- Macro: uint32_t UC_CATEGORY_MASK_M
+ -- Macro: uint32_t UC_CATEGORY_MASK_Mn
+ -- Macro: uint32_t UC_CATEGORY_MASK_Mc
+ -- Macro: uint32_t UC_CATEGORY_MASK_Me
+ -- Macro: uint32_t UC_CATEGORY_MASK_N
+ -- Macro: uint32_t UC_CATEGORY_MASK_Nd
+ -- Macro: uint32_t UC_CATEGORY_MASK_Nl
+ -- Macro: uint32_t UC_CATEGORY_MASK_No
+ -- Macro: uint32_t UC_CATEGORY_MASK_P
+ -- Macro: uint32_t UC_CATEGORY_MASK_Pc
+ -- Macro: uint32_t UC_CATEGORY_MASK_Pd
+ -- Macro: uint32_t UC_CATEGORY_MASK_Ps
+ -- Macro: uint32_t UC_CATEGORY_MASK_Pe
+ -- Macro: uint32_t UC_CATEGORY_MASK_Pi
+ -- Macro: uint32_t UC_CATEGORY_MASK_Pf
+ -- Macro: uint32_t UC_CATEGORY_MASK_Po
+ -- Macro: uint32_t UC_CATEGORY_MASK_S
+ -- Macro: uint32_t UC_CATEGORY_MASK_Sm
+ -- Macro: uint32_t UC_CATEGORY_MASK_Sc
+ -- Macro: uint32_t UC_CATEGORY_MASK_Sk
+ -- Macro: uint32_t UC_CATEGORY_MASK_So
+ -- Macro: uint32_t UC_CATEGORY_MASK_Z
+ -- Macro: uint32_t UC_CATEGORY_MASK_Zs
+ -- Macro: uint32_t UC_CATEGORY_MASK_Zl
+ -- Macro: uint32_t UC_CATEGORY_MASK_Zp
+ -- Macro: uint32_t UC_CATEGORY_MASK_C
+ -- Macro: uint32_t UC_CATEGORY_MASK_Cc
+ -- Macro: uint32_t UC_CATEGORY_MASK_Cf
+ -- Macro: uint32_t UC_CATEGORY_MASK_Cs
+ -- Macro: uint32_t UC_CATEGORY_MASK_Co
+ -- Macro: uint32_t UC_CATEGORY_MASK_Cn
+
+ The following function views general categories as sets of Unicode
+characters.
+
+ -- Function: bool uc_is_general_category_withtable (ucs4_t UC,
+ uint32_t BITMASK)
+ Tests whether a Unicode character belongs to a given category.
+ The BITMASK argument can be a predefined general category bitmask
+ or the combination of several predefined general category bitmasks.
+
+ This function uses a big table comprising all general categories.
+
+
+File: libunistring.info, Node: Canonical combining class, Next: Bidirectional category, Prev: General category, Up: unictype.h
+
+8.2 Canonical combining class
+=============================
+
+ Every Unicode character or code point has a _canonical combining
+class_ assigned to it.
+
+ What is the meaning of the canonical combining class? Essentially,
+it indicates the priority with which a combining character is attached
+to its base character. The characters for which the canonical
+combining class is 0 are the base characters, and the characters for
+which it is greater than 0 are the combining characters. Combining
+characters are rendered near/attached/around their base character, and
+combining characters with small combining classes are attached "first"
+or "closer" to the base character.
+
+ The canonical combining class of a character is a number in the range
+0..255. The possible values are described in the Unicode Character
+Database `http://www.unicode.org/Public/UNIDATA/UCD.html'. The list
+here is not definitive; more values can be added in future versions.
+
+ -- Constant: int UC_CCC_NR
+ The canonical combining class value for "Not Reordered" characters.
+ The value is 0.
+
+ -- Constant: int UC_CCC_OV
+ The canonical combining class value for "Overlay" characters.
+
+ -- Constant: int UC_CCC_NK
+ The canonical combining class value for "Nukta" characters.
+
+ -- Constant: int UC_CCC_KV
+ The canonical combining class value for "Kana Voicing" characters.
+
+ -- Constant: int UC_CCC_VR
+ The canonical combining class value for "Virama" characters.
+
+ -- Constant: int UC_CCC_ATBL
+ The canonical combining class value for "Attached Below Left"
+ characters.
+
+ -- Constant: int UC_CCC_ATB
+ The canonical combining class value for "Attached Below"
+ characters.
+
+ -- Constant: int UC_CCC_ATAR
+ The canonical combining class value for "Attached Above Right"
+ characters.
+
+ -- Constant: int UC_CCC_BL
+ The canonical combining class value for "Below Left" characters.
+
+ -- Constant: int UC_CCC_B
+ The canonical combining class value for "Below" characters.
+
+ -- Constant: int UC_CCC_BR
+ The canonical combining class value for "Below Right" characters.
+
+ -- Constant: int UC_CCC_L
+ The canonical combining class value for "Left" characters.
+
+ -- Constant: int UC_CCC_R
+ The canonical combining class value for "Right" characters.
+
+ -- Constant: int UC_CCC_AL
+ The canonical combining class value for "Above Left" characters.
+
+ -- Constant: int UC_CCC_A
+ The canonical combining class value for "Above" characters.
+
+ -- Constant: int UC_CCC_AR
+ The canonical combining class value for "Above Right" characters.
+
+ -- Constant: int UC_CCC_DB
+ The canonical combining class value for "Double Below" characters.
+
+ -- Constant: int UC_CCC_DA
+ The canonical combining class value for "Double Above" characters.
+
+ -- Constant: int UC_CCC_IS
+ The canonical combining class value for "Iota Subscript"
+ characters.
+
+ The following function looks up the canonical combining class of a
+character.
+
+ -- Function: int uc_combining_class (ucs4_t UC)
+ Returns the canonical combining class of a Unicode character.
+
+
+File: libunistring.info, Node: Bidirectional category, Next: Decimal digit value, Prev: Canonical combining class, Up: unictype.h
+
+8.3 Bidirectional category
+==========================
+
+ Every Unicode character or code point has a _bidirectional category_
+assigned to it.
+
+ The bidirectional category guides the bidirectional algorithm
+(`http://www.unicode.org/reports/tr9/'). The possible values are the
+following.
+
+ -- Constant: int UC_BIDI_L
+ The bidirectional category for "Left-to-Right" characters.
+
+ -- Constant: int UC_BIDI_LRE
+ The bidirectional category for "Left-to-Right Embedding"
+ characters.
+
+ -- Constant: int UC_BIDI_LRO
+ The bidirectional category for "Left-to-Right Override" characters.
+
+ -- Constant: int UC_BIDI_R
+ The bidirectional category for "Right-to-Left" characters.
+
+ -- Constant: int UC_BIDI_AL
+ The bidirectional category for "Right-to-Left Arabic" characters.
+
+ -- Constant: int UC_BIDI_RLE
+ The bidirectional category for "Right-to-Left Embedding"
+ characters.
+
+ -- Constant: int UC_BIDI_RLO
+ The bidirectional category for "Right-to-Left Override" characters.
+
+ -- Constant: int UC_BIDI_PDF
+ The bidirectional category for "Pop Directional Format" characters.
+
+ -- Constant: int UC_BIDI_EN
+ The bidirectional category for "European Number" characters.
+
+ -- Constant: int UC_BIDI_ES
+ The bidirectional category for "European Number Separator"
+ characters.
+
+ -- Constant: int UC_BIDI_ET
+ The bidirectional category for "European Number Terminator"
+ characters.
+
+ -- Constant: int UC_BIDI_AN
+ The bidirectional category for "Arabic Number" characters.
+
+ -- Constant: int UC_BIDI_CS
+ The bidirectional category for "Common Number Separator"
+ characters.
+
+ -- Constant: int UC_BIDI_NSM
+ The bidirectional category for "Non-Spacing Mark" characters.
+
+ -- Constant: int UC_BIDI_BN
+ The bidirectional category for "Boundary Neutral" characters.
+
+ -- Constant: int UC_BIDI_B
+ The bidirectional category for "Paragraph Separator" characters.
+
+ -- Constant: int UC_BIDI_S
+ The bidirectional category for "Segment Separator" characters.
+
+ -- Constant: int UC_BIDI_WS
+ The bidirectional category for "Whitespace" characters.
+
+ -- Constant: int UC_BIDI_ON
+ The bidirectional category for "Other Neutral" characters.
+
+ The following functions implement the association between a
+bidirectional category and its name.
+
+ -- Function: const char * uc_bidi_category_name (int CATEGORY)
+ Returns the name of a bidirectional category.
+
+ -- Function: int uc_bidi_category_byname (const char *CATEGORY_NAME)
+ Returns the bidirectional category given by name, e.g. `"LRE"'.
+
+ The following functions view bidirectional categories as sets of
+Unicode characters.
+
+ -- Function: int uc_bidi_category (ucs4_t UC)
+ Returns the bidirectional category of a Unicode character.
+
+ -- Function: bool uc_is_bidi_category (ucs4_t UC, int CATEGORY)
+ Tests whether a Unicode character belongs to a given bidirectional
+ category.
+
+
+File: libunistring.info, Node: Decimal digit value, Next: Digit value, Prev: Bidirectional category, Up: unictype.h
+
+8.4 Decimal digit value
+=======================
+
+ Decimal digits (like the digits from `0' to `9') exist in many
+scripts. The following function converts a decimal digit character to
+its numerical value.
+
+ -- Function: int uc_decimal_value (ucs4_t UC)
+ Returns the decimal digit value of a Unicode character. The
+ return value is an integer in the range 0..9, or -1 for characters
+ that do not represent a decimal digit.
+
+
+File: libunistring.info, Node: Digit value, Next: Numeric value, Prev: Decimal digit value, Up: unictype.h
+
+8.5 Digit value
+===============
+
+ Digit characters are like decimal digit characters, possibly in
+special forms, such as superscript, subscript, or circled. The
+following function converts a digit character to its numerical value.
+
+ -- Function: int uc_digit_value (ucs4_t UC)
+ Returns the digit value of a Unicode character. The return value
+ is an integer in the range 0..9, or -1 for characters that do not
+ represent a digit.
+
+
+File: libunistring.info, Node: Numeric value, Next: Mirrored character, Prev: Digit value, Up: unictype.h
+
+8.6 Numeric value
+=================
+
+ There are also characters that represent numbers without a digit
+system, like the Roman numerals, and fractional numbers, like 1/4 or
+3/4.
+
+ The following type represents the numeric value of a Unicode
+character.
+
+ -- Type: uc_fraction_t
+ This is a structure type with the following fields:
+ int numerator;
+ int denominator;
+ An integer N is represented by `numerator = N', `denominator = 1'.
+
+ The following function converts a number character to its numerical
+value.
+
+ -- Function: uc_fraction_t uc_numeric_value (ucs4_t UC)
+ Returns the numeric value of a Unicode character. The return
+ value is a fraction, or the pseudo-fraction `{ 0, 0 }' for
+ characters that do not represent a number.
+
+
+File: libunistring.info, Node: Mirrored character, Next: Properties, Prev: Numeric value, Up: unictype.h
+
+8.7 Mirrored character
+======================
+
+ Character mirroring is used to associate the closing parenthesis
+character with the opening parenthesis character, the closing brace
+character with the opening brace character, and so on.
+
+ The following function looks up the mirrored character of a Unicode
+character.
+
+ -- Function: bool uc_mirror_char (ucs4_t UC, ucs4_t *PUC)
+ Stores the mirrored character of a Unicode character UC in `*PUC'
+ and returns `true', if it exists. Otherwise it stores UC
+ unmodified in `*PUC' and returns `false'.
+
+
+File: libunistring.info, Node: Properties, Next: Scripts, Prev: Mirrored character, Up: unictype.h
+
+8.8 Properties
+==============
+
+ This section defines boolean properties of Unicode characters: a
+character either has the given property or does not have it.
+In other words, the property can be viewed as a subset of the set of
+Unicode characters.
+
+ The GNU libunistring library provides two kinds of API for working
+with properties. The object oriented API uses a type `uc_property_t'
+to designate a property. In the function-based API, which is somewhat
+lower level, a property is merely a function.
+
+* Menu:
+
+* Properties as objects::
+* Properties as functions::
+
+
+File: libunistring.info, Node: Properties as objects, Next: Properties as functions, Up: Properties
+
+8.8.1 Properties as objects - the object oriented API
+-----------------------------------------------------
+
+ The following type designates a property on Unicode characters.
+
+ -- Type: uc_property_t
+ This data type denotes a boolean property on Unicode characters.
+ It is an immediate type that can be copied by simple assignment,
+ without involving memory allocation. It is not an array type.
+
+ Many Unicode properties are predefined.
+
+ The following are general properties.
+
+ -- Constant: uc_property_t UC_PROPERTY_WHITE_SPACE
+ -- Constant: uc_property_t UC_PROPERTY_ALPHABETIC
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_ALPHABETIC
+ -- Constant: uc_property_t UC_PROPERTY_NOT_A_CHARACTER
+ -- Constant: uc_property_t UC_PROPERTY_DEFAULT_IGNORABLE_CODE_POINT
+ -- Constant: uc_property_t
+UC_PROPERTY_OTHER_DEFAULT_IGNORABLE_CODE_POINT
+ -- Constant: uc_property_t UC_PROPERTY_DEPRECATED
+ -- Constant: uc_property_t UC_PROPERTY_LOGICAL_ORDER_EXCEPTION
+ -- Constant: uc_property_t UC_PROPERTY_VARIATION_SELECTOR
+ -- Constant: uc_property_t UC_PROPERTY_PRIVATE_USE
+ -- Constant: uc_property_t UC_PROPERTY_UNASSIGNED_CODE_VALUE
+
+ The following properties are related to case folding.
+
+ -- Constant: uc_property_t UC_PROPERTY_UPPERCASE
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_UPPERCASE
+ -- Constant: uc_property_t UC_PROPERTY_LOWERCASE
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_LOWERCASE
+ -- Constant: uc_property_t UC_PROPERTY_TITLECASE
+ -- Constant: uc_property_t UC_PROPERTY_SOFT_DOTTED
+
+ The following properties are related to identifiers.
+
+ -- Constant: uc_property_t UC_PROPERTY_ID_START
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_ID_START
+ -- Constant: uc_property_t UC_PROPERTY_ID_CONTINUE
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_ID_CONTINUE
+ -- Constant: uc_property_t UC_PROPERTY_XID_START
+ -- Constant: uc_property_t UC_PROPERTY_XID_CONTINUE
+ -- Constant: uc_property_t UC_PROPERTY_PATTERN_WHITE_SPACE
+ -- Constant: uc_property_t UC_PROPERTY_PATTERN_SYNTAX
+
+ The following properties have an influence on shaping and rendering.
+
+ -- Constant: uc_property_t UC_PROPERTY_JOIN_CONTROL
+ -- Constant: uc_property_t UC_PROPERTY_GRAPHEME_BASE
+ -- Constant: uc_property_t UC_PROPERTY_GRAPHEME_EXTEND
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_GRAPHEME_EXTEND
+ -- Constant: uc_property_t UC_PROPERTY_GRAPHEME_LINK
+
+ The following properties relate to bidirectional reordering.
+
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_CONTROL
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_LEFT_TO_RIGHT
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_HEBREW_RIGHT_TO_LEFT
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_ARABIC_RIGHT_TO_LEFT
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_EUROPEAN_DIGIT
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_EUR_NUM_SEPARATOR
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_EUR_NUM_TERMINATOR
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_ARABIC_DIGIT
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_COMMON_SEPARATOR
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_BLOCK_SEPARATOR
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_SEGMENT_SEPARATOR
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_WHITESPACE
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_NON_SPACING_MARK
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_BOUNDARY_NEUTRAL
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_PDF
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_EMBEDDING_OR_OVERRIDE
+ -- Constant: uc_property_t UC_PROPERTY_BIDI_OTHER_NEUTRAL
+
+ The following properties deal with number representations.
+
+ -- Constant: uc_property_t UC_PROPERTY_HEX_DIGIT
+ -- Constant: uc_property_t UC_PROPERTY_ASCII_HEX_DIGIT
+
+ The following properties deal with CJK.
+
+ -- Constant: uc_property_t UC_PROPERTY_IDEOGRAPHIC
+ -- Constant: uc_property_t UC_PROPERTY_UNIFIED_IDEOGRAPH
+ -- Constant: uc_property_t UC_PROPERTY_RADICAL
+ -- Constant: uc_property_t UC_PROPERTY_IDS_BINARY_OPERATOR
+ -- Constant: uc_property_t UC_PROPERTY_IDS_TRINARY_OPERATOR
+
+ Other miscellaneous properties are:
+
+ -- Constant: uc_property_t UC_PROPERTY_ZERO_WIDTH
+ -- Constant: uc_property_t UC_PROPERTY_SPACE
+ -- Constant: uc_property_t UC_PROPERTY_NON_BREAK
+ -- Constant: uc_property_t UC_PROPERTY_ISO_CONTROL
+ -- Constant: uc_property_t UC_PROPERTY_FORMAT_CONTROL
+ -- Constant: uc_property_t UC_PROPERTY_DASH
+ -- Constant: uc_property_t UC_PROPERTY_HYPHEN
+ -- Constant: uc_property_t UC_PROPERTY_PUNCTUATION
+ -- Constant: uc_property_t UC_PROPERTY_LINE_SEPARATOR
+ -- Constant: uc_property_t UC_PROPERTY_PARAGRAPH_SEPARATOR
+ -- Constant: uc_property_t UC_PROPERTY_QUOTATION_MARK
+ -- Constant: uc_property_t UC_PROPERTY_SENTENCE_TERMINAL
+ -- Constant: uc_property_t UC_PROPERTY_TERMINAL_PUNCTUATION
+ -- Constant: uc_property_t UC_PROPERTY_CURRENCY_SYMBOL
+ -- Constant: uc_property_t UC_PROPERTY_MATH
+ -- Constant: uc_property_t UC_PROPERTY_OTHER_MATH
+ -- Constant: uc_property_t UC_PROPERTY_PAIRED_PUNCTUATION
+ -- Constant: uc_property_t UC_PROPERTY_LEFT_OF_PAIR
+ -- Constant: uc_property_t UC_PROPERTY_COMBINING
+ -- Constant: uc_property_t UC_PROPERTY_COMPOSITE
+ -- Constant: uc_property_t UC_PROPERTY_DECIMAL_DIGIT
+ -- Constant: uc_property_t UC_PROPERTY_NUMERIC
+ -- Constant: uc_property_t UC_PROPERTY_DIACRITIC
+ -- Constant: uc_property_t UC_PROPERTY_EXTENDER
+ -- Constant: uc_property_t UC_PROPERTY_IGNORABLE_CONTROL
+
+ The following function looks up a property by its name.
+
+ -- Function: uc_property_t uc_property_byname (const char
+ *PROPERTY_NAME)
+ Returns the property given by name, e.g. `"White space"'. If a
+ property with the given name exists, the result will satisfy the
+ `uc_property_is_valid' predicate. Otherwise the result will not
+ satisfy this predicate and must not be passed to functions that
+ expect an `uc_property_t' argument.
+
+ This function references a big table of all predefined properties.
+ Its use can significantly increase the size of your application.
+
+ -- Function: bool uc_property_is_valid (uc_property_t property)
+ Returns `true' when the given property is valid, or `false'
+ otherwise.
+
+ The following function views a property as a set of Unicode
+characters.
+
+ -- Function: bool uc_is_property (ucs4_t UC, uc_property_t PROPERTY)
+ Tests whether the Unicode character UC has the given property.
+
+
+File: libunistring.info, Node: Properties as functions, Prev: Properties as objects, Up: Properties
+
+8.8.2 Properties as functions - the functional API
+--------------------------------------------------
+
+ The following are general properties.
+
+ -- Function: bool uc_is_property_white_space (ucs4_t UC)
+ -- Function: bool uc_is_property_alphabetic (ucs4_t UC)
+ -- Function: bool uc_is_property_other_alphabetic (ucs4_t UC)
+ -- Function: bool uc_is_property_not_a_character (ucs4_t UC)
+ -- Function: bool uc_is_property_default_ignorable_code_point (ucs4_t
+ UC)
+ -- Function: bool uc_is_property_other_default_ignorable_code_point
+ (ucs4_t UC)
+ -- Function: bool uc_is_property_deprecated (ucs4_t UC)
+ -- Function: bool uc_is_property_logical_order_exception (ucs4_t UC)
+ -- Function: bool uc_is_property_variation_selector (ucs4_t UC)
+ -- Function: bool uc_is_property_private_use (ucs4_t UC)
+ -- Function: bool uc_is_property_unassigned_code_value (ucs4_t UC)
+
+ The following properties are related to case folding.
+
+ -- Function: bool uc_is_property_uppercase (ucs4_t UC)
+ -- Function: bool uc_is_property_other_uppercase (ucs4_t UC)
+ -- Function: bool uc_is_property_lowercase (ucs4_t UC)
+ -- Function: bool uc_is_property_other_lowercase (ucs4_t UC)
+ -- Function: bool uc_is_property_titlecase (ucs4_t UC)
+ -- Function: bool uc_is_property_soft_dotted (ucs4_t UC)
+
+ The following properties are related to identifiers.
+
+ -- Function: bool uc_is_property_id_start (ucs4_t UC)
+ -- Function: bool uc_is_property_other_id_start (ucs4_t UC)
+ -- Function: bool uc_is_property_id_continue (ucs4_t UC)
+ -- Function: bool uc_is_property_other_id_continue (ucs4_t UC)
+ -- Function: bool uc_is_property_xid_start (ucs4_t UC)
+ -- Function: bool uc_is_property_xid_continue (ucs4_t UC)
+ -- Function: bool uc_is_property_pattern_white_space (ucs4_t UC)
+ -- Function: bool uc_is_property_pattern_syntax (ucs4_t UC)
+
+ The following properties have an influence on shaping and rendering.
+
+ -- Function: bool uc_is_property_join_control (ucs4_t UC)
+ -- Function: bool uc_is_property_grapheme_base (ucs4_t UC)
+ -- Function: bool uc_is_property_grapheme_extend (ucs4_t UC)
+ -- Function: bool uc_is_property_other_grapheme_extend (ucs4_t UC)
+ -- Function: bool uc_is_property_grapheme_link (ucs4_t UC)
+
+ The following properties relate to bidirectional reordering.
+
+ -- Function: bool uc_is_property_bidi_control (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_left_to_right (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_hebrew_right_to_left (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_arabic_right_to_left (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_european_digit (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_eur_num_separator (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_eur_num_terminator (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_arabic_digit (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_common_separator (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_block_separator (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_segment_separator (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_whitespace (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_non_spacing_mark (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_boundary_neutral (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_pdf (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_embedding_or_override (ucs4_t UC)
+ -- Function: bool uc_is_property_bidi_other_neutral (ucs4_t UC)
+
+ The following properties deal with number representations.
+
+ -- Function: bool uc_is_property_hex_digit (ucs4_t UC)
+ -- Function: bool uc_is_property_ascii_hex_digit (ucs4_t UC)
+
+ The following properties deal with CJK.
+
+ -- Function: bool uc_is_property_ideographic (ucs4_t UC)
+ -- Function: bool uc_is_property_unified_ideograph (ucs4_t UC)
+ -- Function: bool uc_is_property_radical (ucs4_t UC)
+ -- Function: bool uc_is_property_ids_binary_operator (ucs4_t UC)
+ -- Function: bool uc_is_property_ids_trinary_operator (ucs4_t UC)
+
+ Other miscellaneous properties are:
+
+ -- Function: bool uc_is_property_zero_width (ucs4_t UC)
+ -- Function: bool uc_is_property_space (ucs4_t UC)
+ -- Function: bool uc_is_property_non_break (ucs4_t UC)
+ -- Function: bool uc_is_property_iso_control (ucs4_t UC)
+ -- Function: bool uc_is_property_format_control (ucs4_t UC)
+ -- Function: bool uc_is_property_dash (ucs4_t UC)
+ -- Function: bool uc_is_property_hyphen (ucs4_t UC)
+ -- Function: bool uc_is_property_punctuation (ucs4_t UC)
+ -- Function: bool uc_is_property_line_separator (ucs4_t UC)
+ -- Function: bool uc_is_property_paragraph_separator (ucs4_t UC)
+ -- Function: bool uc_is_property_quotation_mark (ucs4_t UC)
+ -- Function: bool uc_is_property_sentence_terminal (ucs4_t UC)
+ -- Function: bool uc_is_property_terminal_punctuation (ucs4_t UC)
+ -- Function: bool uc_is_property_currency_symbol (ucs4_t UC)
+ -- Function: bool uc_is_property_math (ucs4_t UC)
+ -- Function: bool uc_is_property_other_math (ucs4_t UC)
+ -- Function: bool uc_is_property_paired_punctuation (ucs4_t UC)
+ -- Function: bool uc_is_property_left_of_pair (ucs4_t UC)
+ -- Function: bool uc_is_property_combining (ucs4_t UC)
+ -- Function: bool uc_is_property_composite (ucs4_t UC)
+ -- Function: bool uc_is_property_decimal_digit (ucs4_t UC)
+ -- Function: bool uc_is_property_numeric (ucs4_t UC)
+ -- Function: bool uc_is_property_diacritic (ucs4_t UC)
+ -- Function: bool uc_is_property_extender (ucs4_t UC)
+ -- Function: bool uc_is_property_ignorable_control (ucs4_t UC)
+
+
+File: libunistring.info, Node: Scripts, Next: Blocks, Prev: Properties, Up: unictype.h
+
+8.9 Scripts
+===========
+
+ The Unicode characters are subdivided into scripts.
+
+ The following type is used to represent a script:
+
+ -- Type: uc_script_t
+ This data type is a structure type that refers to statically
+ allocated read-only data. It contains the following fields:
+ const char *name;
+
+ The `name' field contains the name of the script.
+
+ The following functions look up a script.
+
+ -- Function: const uc_script_t * uc_script (ucs4_t UC)
+ Returns the script of a Unicode character. Returns NULL if UC
+ does not belong to any script.
+
+ -- Function: const uc_script_t * uc_script_byname (const char
+ *SCRIPT_NAME)
+ Returns the script given by its name, e.g. `"HAN"'. Returns NULL
+ if a script with the given name does not exist.
+
+ The following function views a script as a set of Unicode characters.
+
+ -- Function: bool uc_is_script (ucs4_t UC, const uc_script_t *SCRIPT)
+ Tests whether a Unicode character belongs to a given script.
+
+ The following gives a global picture of all scripts.
+
+ -- Function: void uc_all_scripts (const uc_script_t **SCRIPTS, size_t
+ *COUNT)
+ Get the list of all scripts. Stores a pointer to an array of all
+ scripts in `*SCRIPTS' and the length of this array in `*COUNT'.
+
+
+File: libunistring.info, Node: Blocks, Next: ISO C and Java syntax, Prev: Scripts, Up: unictype.h
+
+8.10 Blocks
+===========
+
+ The Unicode characters are subdivided into blocks. A block is an
+interval of Unicode code points.
+
+ The following type is used to represent a block.
+
+ -- Type: uc_block_t
+ This data type is a structure type that refers to statically
+ allocated data. It contains the following fields:
+ ucs4_t start;
+ ucs4_t end;
+ const char *name;
+
+ The `start' field is the first Unicode code point in the block.
+
+ The `end' field is the last Unicode code point in the block.
+
+ The `name' field is the name of the block.
+
+ The following function looks up a block.
+
+ -- Function: const uc_block_t * uc_block (ucs4_t UC)
+ Returns the block a character belongs to.
+
+ The following function views a block as a set of Unicode characters.
+
+ -- Function: bool uc_is_block (ucs4_t UC, const uc_block_t *BLOCK)
+ Tests whether a Unicode character belongs to a given block.
+
+ The following gives a global picture of all blocks.
+
+ -- Function: void uc_all_blocks (const uc_block_t **BLOCKS, size_t
+ *COUNT)
+ Get the list of all blocks. Stores a pointer to an array of all
+ blocks in `*BLOCKS' and the length of this array in `*COUNT'.
+
+
+File: libunistring.info, Node: ISO C and Java syntax, Next: Classifications like in ISO C, Prev: Blocks, Up: unictype.h
+
+8.11 ISO C and Java syntax
+==========================
+
+ The following properties are taken from language standards. The
+supported language standards are ISO C 99 and Java.
+
+ -- Function: bool uc_is_c_whitespace (ucs4_t UC)
+ Tests whether a Unicode character is considered whitespace in ISO
+ C 99.
+
+ -- Function: bool uc_is_java_whitespace (ucs4_t UC)
+ Tests whether a Unicode character is considered whitespace in Java.
+
+ The following enumerated values are the possible return values of
+the functions `uc_c_ident_category' and `uc_java_ident_category'.
+
+ -- Constant: int UC_IDENTIFIER_START
+ This return value means that the given character is valid as first
+ or subsequent character in an identifier.
+
+ -- Constant: int UC_IDENTIFIER_VALID
+ This return value means that the given character is valid as
+ subsequent character only.
+
+ -- Constant: int UC_IDENTIFIER_INVALID
+ This return value means that the given character is not valid in
+ an identifier.
+
+ -- Constant: int UC_IDENTIFIER_IGNORABLE
+ This return value (only for Java) means that the given character
+ is ignorable.
+
+ The following functions determine whether a given character can be a
+constituent of an identifier in the given programming language.
+
+ -- Function: int uc_c_ident_category (ucs4_t UC)
+ Returns the categorization of a Unicode character with respect to
+ the ISO C 99 identifier syntax.
+
+ -- Function: int uc_java_ident_category (ucs4_t UC)
+ Returns the categorization of a Unicode character with respect to
+ the Java identifier syntax.
+
+
+File: libunistring.info, Node: Classifications like in ISO C, Prev: ISO C and Java syntax, Up: unictype.h
+
+8.12 Classifications like in ISO C
+==================================
+
+ The following character classifications mimic those declared in the
+ISO C header files `<ctype.h>' and `<wctype.h>'. These functions are
+deprecated, because this set of functions was designed with ASCII in
+mind and cannot reflect the more diverse reality of the Unicode
+character set. But they can be a quick-and-dirty porting aid when
+migrating from `wchar_t' APIs to Unicode strings.
+
+ -- Function: bool uc_is_alnum (ucs4_t UC)
+ Tests for any character for which `uc_is_alpha' or `uc_is_digit' is
+ true.
+
+ -- Function: bool uc_is_alpha (ucs4_t UC)
+ Tests for any character for which `uc_is_upper' or `uc_is_lower' is
+ true, or any character that is one of a locale-specific set of
+ characters for which none of `uc_is_cntrl', `uc_is_digit',
+ `uc_is_punct', or `uc_is_space' is true.
+
+ -- Function: bool uc_is_cntrl (ucs4_t UC)
+ Tests for any control character.
+
+ -- Function: bool uc_is_digit (ucs4_t UC)
+ Tests for any character that corresponds to a decimal-digit
+ character.
+
+ -- Function: bool uc_is_graph (ucs4_t UC)
+ Tests for any character for which `uc_is_print' is true and
+ `uc_is_space' is false.
+
+ -- Function: bool uc_is_lower (ucs4_t UC)
+ Tests for any character that corresponds to a lowercase letter or
+ is one of a locale-specific set of characters for which none of
+ `uc_is_cntrl', `uc_is_digit', `uc_is_punct', or `uc_is_space' is
+ true.
+
+ -- Function: bool uc_is_print (ucs4_t UC)
+ Tests for any printing character.
+
+ -- Function: bool uc_is_punct (ucs4_t UC)
+ Tests for any printing character that is one of a locale-specific
+ set of characters for which neither `uc_is_space' nor
+ `uc_is_alnum' is true.
+
+ -- Function: bool uc_is_space (ucs4_t UC)
+ Tests for any character that corresponds to a locale-specific set
+ of characters for which none of `uc_is_alnum', `uc_is_graph', or
+ `uc_is_punct' is true.
+
+ -- Function: bool uc_is_upper (ucs4_t UC)
+ Tests for any character that corresponds to an uppercase letter or
+ is one of a locale-specific set of characters for which none of
+ `uc_is_cntrl', `uc_is_digit', `uc_is_punct', or `uc_is_space' is
+ true.
+
+ -- Function: bool uc_is_xdigit (ucs4_t UC)
+ Tests for any character that corresponds to a hexadecimal-digit
+ character.
+
+ -- Function: bool uc_is_blank (ucs4_t UC)
+ Tests for any character that corresponds to a standard blank
+ character or a locale-specific set of characters for which
+ `uc_is_alnum' is false.
+
+
+File: libunistring.info, Node: uniwidth.h, Next: uniwbrk.h, Prev: unictype.h, Up: Top
+
+9 Display width `<uniwidth.h>'
+******************************
+
+ This include file declares functions that return the display width,
+measured in columns, of characters or strings, when output to a device
+that uses non-proportional fonts.
+
+ Note that for some rarely used characters the actual fonts or
+terminal emulators can use a different width. There is no mechanism
+for communicating the display width of characters across a Unix
+pseudo-terminal (tty). Also, there are scripts with complex rendering,
+like the Indic scripts. For these scripts, there is no such concept as
+non-proportional fonts. Therefore the results of these functions
+usually work fine on most scripts and on most characters but can fail
+to represent the actual display width.
+
+ These functions are locale dependent. The ENCODING argument
+identifies the encoding (e.g. `"ISO-8859-2"' for Polish).
+
+ -- Function: int uc_width (ucs4_t UC, const char *ENCODING)
+ Determines and returns the number of column positions required for
+ UC. Returns -1 if UC is a control character that has an influence
+ on the column position when output.
+
+ -- Function: int u8_width (const uint8_t *S, size_t N, const char
+ *ENCODING)
+ -- Function: int u16_width (const uint16_t *S, size_t N, const char
+ *ENCODING)
+ -- Function: int u32_width (const uint32_t *S, size_t N, const char
+ *ENCODING)
+ Determines and returns the number of column positions required for
+ first N units (or fewer if S ends before this) in S. This
+ function ignores control characters in the string.
+
+ -- Function: int u8_strwidth (const uint8_t *S, const char *ENCODING)
+ -- Function: int u16_strwidth (const uint16_t *S, const char *ENCODING)
+ -- Function: int u32_strwidth (const uint32_t *S, const char *ENCODING)
+ Determines and returns the number of column positions required for
+ S. This function ignores control characters in the string.
+
+
+File: libunistring.info, Node: uniwbrk.h, Next: unilbrk.h, Prev: uniwidth.h, Up: Top
+
+10 Word breaks in strings `<uniwbrk.h>'
+***************************************
+
+ This include file declares functions for determining where in a
+string "words" start and end. Here "words" are not necessarily the
+same as entities that can be looked up in dictionaries, but rather
+groups of consecutive characters that should not be split by text
+processing operations.
+
+* Menu:
+
+* Word breaks in a string::
+* Word break property::
+
+
+File: libunistring.info, Node: Word breaks in a string, Next: Word break property, Up: uniwbrk.h
+
+10.1 Word breaks in a string
+============================
+
+ The following functions determine the word breaks in a string.
+
+ -- Function: void u8_wordbreaks (const uint8_t *S, size_t N, char *P)
+ -- Function: void u16_wordbreaks (const uint16_t *S, size_t N, char *P)
+ -- Function: void u32_wordbreaks (const uint32_t *S, size_t N, char *P)
+ -- Function: void ulc_wordbreaks (const char *S, size_t N, char *P)
+ Determines the word break points in S, an array of N units, and
+ stores the result at `P[0..N-1]'.
+ `P[i] = 1'
+ means that there is a word boundary between `S[i-1]' and
+ `S[i]'.
+
+ `P[i] = 0'
+ means that `S[i-1]' and `S[i]' must not be separated.
+ `P[0]' is always set to 0. If an application wants to consider a
+ word break to be present at the beginning of the string (before
+ `S[0]') or at the end of the string (after `S[0..N-1]'), it has to
+ treat these cases explicitly.
+
+
+File: libunistring.info, Node: Word break property, Prev: Word breaks in a string, Up: uniwbrk.h
+
+10.2 Word break property
+========================
+
+ This is a more low-level API. The word break property is a property
+defined in Unicode Standard Annex #29, section "Word Boundaries", see
+`http://www.unicode.org/reports/tr29/#Word_Boundaries'. It is used for
+determining the word breaks in a string.
+
+ The following are the possible values of the word break property.
+More values may be added in the future.
+
+ -- Constant: int WBP_OTHER
+ -- Constant: int WBP_CR
+ -- Constant: int WBP_LF
+ -- Constant: int WBP_NEWLINE
+ -- Constant: int WBP_EXTEND
+ -- Constant: int WBP_FORMAT
+ -- Constant: int WBP_KATAKANA
+ -- Constant: int WBP_ALETTER
+ -- Constant: int WBP_MIDNUMLET
+ -- Constant: int WBP_MIDLETTER
+ -- Constant: int WBP_MIDNUM
+ -- Constant: int WBP_NUMERIC
+ -- Constant: int WBP_EXTENDNUMLET
+
+ The following function looks up the word break property of a
+character.
+
+ -- Function: int uc_wordbreak_property (ucs4_t UC)
+ Returns the Word_Break property of a Unicode character.
+
+
+File: libunistring.info, Node: unilbrk.h, Next: uninorm.h, Prev: uniwbrk.h, Up: Top
+
+11 Line breaking `<unilbrk.h>'
+******************************
+
+ This include file declares functions for determining where in a
+string line breaks could or should be introduced, in order to make the
+displayed string fit into a column of given width.
+
+ These functions are locale dependent. The ENCODING argument
+identifies the encoding (e.g. `"ISO-8859-2"' for Polish).
+
+ The following enumerated values indicate whether, at a given
+position, a line break is possible or not. Given a string S as an
+array `S[0..N-1]' and a position I, the values have the following
+meanings:
+
+ -- Constant: int UC_BREAK_MANDATORY
+ This value indicates that `S[I]' is a line break character.
+
+ -- Constant: int UC_BREAK_POSSIBLE
+ This value indicates that a line break may be inserted between
+ `S[I-1]' and `S[I]'.
+
+ -- Constant: int UC_BREAK_HYPHENATION
+ This value indicates that a hyphen and a line break may be
+ inserted between `S[I-1]' and `S[I]'. But beware of language
+ dependent hyphenation rules.
+
+ -- Constant: int UC_BREAK_PROHIBITED
+ This value indicates that `S[I-1]' and `S[I]' must not be
+ separated.
+
+ -- Constant: int UC_BREAK_UNDEFINED
+ This value is not used as a return value; rather, in the
+ overriding argument of the `u*_width_linebreaks' functions, it
+ indicates the absence of an override.
+
+ The following functions determine the positions at which line breaks
+are possible.
+
+ -- Function: void u8_possible_linebreaks (const uint8_t *S, size_t N,
+ const char *ENCODING, char *P)
+ -- Function: void u16_possible_linebreaks (const uint16_t *S, size_t
+ N, const char *ENCODING, char *P)
+ -- Function: void u32_possible_linebreaks (const uint32_t *S, size_t
+ N, const char *ENCODING, char *P)
+ -- Function: void ulc_possible_linebreaks (const char *S, size_t N,
+ const char *ENCODING, char *P)
+ Determines the line break points in S, and stores the result at
+ `P[0..N-1]'. Every `P[I]' is assigned one of the values
+ `UC_BREAK_MANDATORY', `UC_BREAK_POSSIBLE', `UC_BREAK_HYPHENATION',
+ `UC_BREAK_PROHIBITED'.
+
+ The following functions determine where line breaks should be
+inserted so that each line fits in a given width, when output to a
+device that uses non-proportional fonts.
+
+ -- Function: int u8_width_linebreaks (const uint8_t *S, size_t N, int
+ WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
+ *OVERRIDE, const char *ENCODING, char *P)
+ -- Function: int u16_width_linebreaks (const uint16_t *S, size_t N,
+ int WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
+ *OVERRIDE, const char *ENCODING, char *P)
+ -- Function: int u32_width_linebreaks (const uint32_t *S, size_t N,
+ int WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
+ *OVERRIDE, const char *ENCODING, char *P)
+ -- Function: int ulc_width_linebreaks (const char *S, size_t N, int
+ WIDTH, int START_COLUMN, int AT_END_COLUMNS, const char
+ *OVERRIDE, const char *ENCODING, char *P)
+ Chooses the best line breaks, assuming that every character
+ occupies a width given by the `uc_width' function (see *note
+ uniwidth.h::).
+
+ The string is `S[0..N-1]'.
+
+ The maximum number of columns per line is given as WIDTH. The
+ starting column of the string is given as START_COLUMN. If the
+ algorithm shall keep room after the last piece, this amount of
+ room can be given as AT_END_COLUMNS.
+
+ OVERRIDE is an optional override; if `OVERRIDE[I] !=
+ UC_BREAK_UNDEFINED', `OVERRIDE[I]' takes precedence over `P[I]' as
+ returned by the `u*_possible_linebreaks' function.
+
+ The given ENCODING is used for disambiguating widths in `uc_width'.
+
+ Returns the column after the end of the string, and stores the
+ result at `P[0..N-1]'. Every `P[I]' is assigned one of the values
+ `UC_BREAK_MANDATORY', `UC_BREAK_POSSIBLE', `UC_BREAK_HYPHENATION',
+ `UC_BREAK_PROHIBITED'. Here the value `UC_BREAK_POSSIBLE'
+ indicates that a line break _should_ be inserted.
+
+
+File: libunistring.info, Node: uninorm.h, Next: unicase.h, Prev: unilbrk.h, Up: Top
+
+12 Normalization forms (composition and decomposition) `<uninorm.h>'
+********************************************************************
+
+ This include file defines functions for transforming Unicode strings
+to one of the four normal forms, known as NFC, NFD, NFKC, NFKD. These
+transformations involve decomposition and -- for NFC and NFKC --
+composition of Unicode characters.
+
+* Menu:
+
+* Decomposition of characters::
+* Composition of characters::
+* Normalization of strings::
+* Normalizing comparisons::
+* Normalization of streams::
+
+
+File: libunistring.info, Node: Decomposition of characters, Next: Composition of characters, Up: uninorm.h
+
+12.1 Decomposition of Unicode characters
+========================================
+
+ The following enumerated values are the possible types of
+decomposition of a Unicode character.
+
+ -- Constant: int UC_DECOMP_CANONICAL
+ Denotes canonical decomposition.
+
+ -- Constant: int UC_DECOMP_FONT
+ UCD marker: `<font>'. Denotes a font variant (e.g. a blackletter
+ form).
+
+ -- Constant: int UC_DECOMP_NOBREAK
+ UCD marker: `<noBreak>'. Denotes a no-break version of a space or
+ hyphen.
+
+ -- Constant: int UC_DECOMP_INITIAL
+ UCD marker: `<initial>'. Denotes an initial presentation form
+ (Arabic).
+
+ -- Constant: int UC_DECOMP_MEDIAL
+ UCD marker: `<medial>'. Denotes a medial presentation form
+ (Arabic).
+
+ -- Constant: int UC_DECOMP_FINAL
+ UCD marker: `<final>'. Denotes a final presentation form (Arabic).
+
+ -- Constant: int UC_DECOMP_ISOLATED
+ UCD marker: `<isolated>'. Denotes an isolated presentation form
+ (Arabic).
+
+ -- Constant: int UC_DECOMP_CIRCLE
+ UCD marker: `<circle>'. Denotes an encircled form.
+
+ -- Constant: int UC_DECOMP_SUPER
+ UCD marker: `<super>'. Denotes a superscript form.
+
+ -- Constant: int UC_DECOMP_SUB
+ UCD marker: `<sub>'. Denotes a subscript form.
+
+ -- Constant: int UC_DECOMP_VERTICAL
+ UCD marker: `<vertical>'. Denotes a vertical layout presentation
+ form.
+
+ -- Constant: int UC_DECOMP_WIDE
+ UCD marker: `<wide>'. Denotes a wide (or zenkaku) compatibility
+ character.
+
+ -- Constant: int UC_DECOMP_NARROW
+ UCD marker: `<narrow>'. Denotes a narrow (or hankaku)
+ compatibility character.
+
+ -- Constant: int UC_DECOMP_SMALL
+ UCD marker: `<small>'. Denotes a small variant form (CNS
+ compatibility).
+
+ -- Constant: int UC_DECOMP_SQUARE
+ UCD marker: `<square>'. Denotes a CJK squared font variant.
+
+ -- Constant: int UC_DECOMP_FRACTION
+ UCD marker: `<fraction>'. Denotes a vulgar fraction form.
+
+ -- Constant: int UC_DECOMP_COMPAT
+ UCD marker: `<compat>'. Denotes an otherwise unspecified
+ compatibility character.
+
+ The following constant denotes the maximum size of decomposition of
+a single Unicode character.
+
+ -- Macro: unsigned int UC_DECOMPOSITION_MAX_LENGTH
+ This macro expands to a constant that is the required size of
+ buffer passed to the `uc_decomposition' and
+ `uc_canonical_decomposition' functions.
+
+ The following functions decompose a Unicode character.
+
+ -- Function: int uc_decomposition (ucs4_t UC, int *DECOMP_TAG, ucs4_t
+ *DECOMPOSITION)
+ Returns the character decomposition mapping of the Unicode
+ character UC. DECOMPOSITION must point to an array of at least
+ `UC_DECOMPOSITION_MAX_LENGTH' `ucs4_t' elements.
+
+ When a decomposition exists, `DECOMPOSITION[0..N-1]' and
+ `*DECOMP_TAG' are filled and N is returned. Otherwise -1 is
+ returned.
+
+ -- Function: int uc_canonical_decomposition (ucs4_t UC, ucs4_t
+ *DECOMPOSITION)
+ Returns the canonical character decomposition mapping of the
+ Unicode character UC. DECOMPOSITION must point to an array of at
+ least `UC_DECOMPOSITION_MAX_LENGTH' `ucs4_t' elements.
+
+ When a decomposition exists, `DECOMPOSITION[0..N-1]' is filled and
+ N is returned. Otherwise -1 is returned.
+
+
+File: libunistring.info, Node: Composition of characters, Next: Normalization of strings, Prev: Decomposition of characters, Up: uninorm.h
+
+12.2 Composition of Unicode characters
+======================================
+
+ The following function composes a Unicode character from two Unicode
+characters.
+
+ -- Function: ucs4_t uc_composition (ucs4_t UC1, ucs4_t UC2)
+ Attempts to combine the Unicode characters UC1, UC2. UC1 is known
+ to have canonical combining class 0.
+
+ Returns the combination of UC1 and UC2, if it exists. Returns 0
+ otherwise.
+
+ Not all decompositions can be recombined using this function. See
+ the Unicode file `CompositionExclusions.txt' for details.
+
+
+File: libunistring.info, Node: Normalization of strings, Next: Normalizing comparisons, Prev: Composition of characters, Up: uninorm.h
+
+12.3 Normalization of strings
+=============================
+
+ The Unicode standard defines four normalization forms for Unicode
+strings. The following type is used to denote a normalization form.
+
+ -- Type: uninorm_t
+ An object of type `uninorm_t' denotes a Unicode normalization form.
+ This is a scalar type; its values can be compared with `=='.
+
+ The following constants denote the four normalization forms.
+
+ -- Macro: uninorm_t UNINORM_NFD
+ Denotes Normalization form D: canonical decomposition.
+
+ -- Macro: uninorm_t UNINORM_NFC
+ Normalization form C: canonical decomposition, then canonical
+ composition.
+
+ -- Macro: uninorm_t UNINORM_NFKD
+ Normalization form KD: compatibility decomposition.
+
+ -- Macro: uninorm_t UNINORM_NFKC
+ Normalization form KC: compatibility decomposition, then canonical
+ composition.
+
+ The following functions operate on `uninorm_t' objects.
+
+ -- Function: bool uninorm_is_compat_decomposing (uninorm_t NF)
+ Tests whether the normalization form NF does compatibility
+ decomposition.
+
+ -- Function: bool uninorm_is_composing (uninorm_t NF)
+ Tests whether the normalization form NF includes canonical
+ composition.
+
+ -- Function: uninorm_t uninorm_decomposing_form (uninorm_t NF)
+ Returns the decomposing variant of the normalization form NF.
+ This maps NFC,NFD -> NFD and NFKC,NFKD -> NFKD.
+
+ The following functions apply a Unicode normalization form to a
+Unicode string.
+
+ -- Function: uint8_t * u8_normalize (uninorm_t NF, const uint8_t *S,
+ size_t N, uint8_t *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint16_t * u16_normalize (uninorm_t NF, const uint16_t
+ *S, size_t N, uint16_t *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint32_t * u32_normalize (uninorm_t NF, const uint32_t
+ *S, size_t N, uint32_t *RESULTBUF, size_t *LENGTHP)
+ Returns the specified normalization form of a string.
+
+
+File: libunistring.info, Node: Normalizing comparisons, Next: Normalization of streams, Prev: Normalization of strings, Up: uninorm.h
+
+12.4 Normalizing comparisons
+============================
+
+ The following functions compare Unicode strings, ignoring differences
+in normalization.
+
+ -- Function: int u8_normcmp (const uint8_t *S1, size_t N1, const
+ uint8_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
+ -- Function: int u16_normcmp (const uint16_t *S1, size_t N1, const
+ uint16_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
+ -- Function: int u32_normcmp (const uint32_t *S1, size_t N1, const
+ uint32_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
+ Compares S1 and S2, ignoring differences in normalization.
+
+ NF must be either `UNINORM_NFD' or `UNINORM_NFKD'.
+
+ If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
+ if S1 > S2, and returns 0. Upon failure, returns -1 with `errno'
+ set.
+
+ -- Function: char * u8_normxfrm (const uint8_t *S, size_t N, uninorm_t
+ NF, char *RESULTBUF, size_t *LENGTHP)
+ -- Function: char * u16_normxfrm (const uint16_t *S, size_t N,
+ uninorm_t NF, char *RESULTBUF, size_t *LENGTHP)
+ -- Function: char * u32_normxfrm (const uint32_t *S, size_t N,
+ uninorm_t NF, char *RESULTBUF, size_t *LENGTHP)
+ Converts the string S of length N to a NUL-terminated byte
+ sequence, in such a way that comparing `u8_normxfrm (S1)' and
+ `u8_normxfrm (S2)' with the `u8_cmp2' function is equivalent to
+ comparing S1 and S2 with the `u8_normcoll' function.
+
+ NF must be either `UNINORM_NFC' or `UNINORM_NFKC'.
+
+ -- Function: int u8_normcoll (const uint8_t *S1, size_t N1, const
+ uint8_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
+ -- Function: int u16_normcoll (const uint16_t *S1, size_t N1, const
+ uint16_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
+ -- Function: int u32_normcoll (const uint32_t *S1, size_t N1, const
+ uint32_t *S2, size_t N2, uninorm_t NF, int *RESULTP)
+ Compares S1 and S2, ignoring differences in normalization, using
+ the collation rules of the current locale.
+
+ NF must be either `UNINORM_NFC' or `UNINORM_NFKC'.
+
+ If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
+ if S1 > S2, and returns 0. Upon failure, returns -1 with `errno'
+ set.
+
+
+File: libunistring.info, Node: Normalization of streams, Prev: Normalizing comparisons, Up: uninorm.h
+
+12.5 Normalization of streams of Unicode characters
+===================================================
+
+ A "stream of Unicode characters" is essentially a function that
+accepts an `ucs4_t' argument repeatedly, optionally combined with a
+function that "flushes" the stream.
+
+ -- Type: struct uninorm_filter
+ This is the data type of a stream of Unicode characters that
+ normalizes its input according to a given normalization form and
+ passes the normalized character sequence to the encapsulated
+ stream of Unicode characters.
+
+ -- Function: struct uninorm_filter * uninorm_filter_create (uninorm_t
+ NF, int (*STREAM_FUNC) (void *STREAM_DATA, ucs4_t UC), void
+ *STREAM_DATA)
+ Creates and returns a normalization filter for Unicode characters.
+
+ The pair (STREAM_FUNC, STREAM_DATA) is the encapsulated stream.
+ `STREAM_FUNC (STREAM_DATA, UC)' receives the Unicode character UC
+ and returns 0 if successful, or -1 with `errno' set upon failure.
+
+ Returns the new filter, or NULL with `errno' set upon failure.
+
+ -- Function: int uninorm_filter_write (struct uninorm_filter *FILTER,
+ ucs4_t UC)
+ Stuffs a Unicode character into a normalizing filter. Returns 0
+ if successful, or -1 with `errno' set upon failure.
+
+ -- Function: int uninorm_filter_flush (struct uninorm_filter *FILTER)
+ Brings data buffered in the filter to its destination, the
+ encapsulated stream.
+
+ Returns 0 if successful, or -1 with `errno' set upon failure.
+
+ Note! If after calling this function, additional characters are
+ written into the filter, the resulting character sequence in the
+ encapsulated stream will not necessarily be normalized.
+
+ -- Function: int uninorm_filter_free (struct uninorm_filter *FILTER)
+ Brings data buffered in the filter to its destination, the
+ encapsulated stream, then closes and frees the filter.
+
+ Returns 0 if successful, or -1 with `errno' set upon failure.
+
+
+File: libunistring.info, Node: unicase.h, Next: uniregex.h, Prev: uninorm.h, Up: Top
+
+13 Case mappings `<unicase.h>'
+******************************
+
+ This include file defines functions for case mapping for Unicode
+strings and case insensitive comparison of Unicode strings and C
+strings.
+
+ These string functions fix the problems that were mentioned in *note
+char * strings::, namely, they handle the Croatian LETTER DZ WITH
+CARON, the German LATIN SMALL LETTER SHARP S, the Greek sigma and the
+Lithuanian i correctly.
+
+* Menu:
+
+* Case mappings of characters::
+* Case mappings of strings::
+* Case mappings of substrings::
+* Case insensitive comparison::
+* Case detection::
+
+
+File: libunistring.info, Node: Case mappings of characters, Next: Case mappings of strings, Up: unicase.h
+
+13.1 Case mappings of characters
+================================
+
+ The following functions implement case mappings on Unicode
+characters -- for those cases only where the result of the mapping is
+again a single Unicode character.
+
+ These mappings are locale and context independent.
+
+ *WARNING!* These functions are not sufficient for languages such as
+German, Greek and Lithuanian. Better use the functions below that
+treat an entire string at once and are language aware.
+
+ -- Function: ucs4_t uc_toupper (ucs4_t UC)
+ Returns the uppercase mapping of the Unicode character UC.
+
+ -- Function: ucs4_t uc_tolower (ucs4_t UC)
+ Returns the lowercase mapping of the Unicode character UC.
+
+ -- Function: ucs4_t uc_totitle (ucs4_t UC)
+ Returns the titlecase mapping of the Unicode character UC.
+
+ The titlecase mapping of a character is to be used when the
+ character should look like upper case and the following characters
+ are lower cased.
+
+ For most characters, this is the same as the uppercase mapping.
+ There are only a few characters where the title case variant and the
+ upper case variant are different. These characters occur in the
+ Latin writing of the Croatian, Bosnian, and Serbian languages.
+
+ Lower case Title case Upper case
+ ------------------------------------------------------------------
+ LATIN SMALL LETTER LJ LATIN CAPITAL LETTER LATIN CAPITAL LETTER
+ L WITH SMALL LETTER J LJ
+ LATIN SMALL LETTER NJ LATIN CAPITAL LETTER LATIN CAPITAL LETTER
+ N WITH SMALL LETTER J NJ
+ LATIN SMALL LETTER DZ LATIN CAPITAL LETTER LATIN CAPITAL LETTER
+ D WITH SMALL LETTER Z DZ
+ LATIN SMALL LETTER LATIN CAPITAL LETTER LATIN CAPITAL LETTER
+ DZ WITH CARON D WITH SMALL LETTER DZ WITH CARON
+ Z WITH CARON
+
+
+File: libunistring.info, Node: Case mappings of strings, Next: Case mappings of substrings, Prev: Case mappings of characters, Up: unicase.h
+
+13.2 Case mappings of strings
+=============================
+
+ Case mapping should always be performed on entire strings, not on
+individual characters. The functions in this section do so.
+
+ These functions allow applying a normalization after the case
+mapping. The reason is that if you want to treat `ä' and `Ä' the
+same, you most often also want to treat the composed and decomposed
+forms of such a character, U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS
+and U+0041 LATIN CAPITAL LETTER A U+0308 COMBINING DIAERESIS the same.
+The NF argument designates the normalization.
+
+ These functions are locale dependent. The ISO639_LANGUAGE argument
+identifies the language (e.g. `"tr"' for Turkish). NULL means to use
+locale independent case mappings.
+
+ -- Function: const char * uc_locale_language ()
+ Returns the ISO 639 language code of the current locale. Returns
+ `""' if it is unknown, or in the "C" locale.
+
+ -- Function: uint8_t * u8_toupper (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
+ size_t *LENGTHP)
+ -- Function: uint16_t * u16_toupper (const uint16_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
+ *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint32_t * u32_toupper (const uint32_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
+ *RESULTBUF, size_t *LENGTHP)
+ Returns the uppercase mapping of a string.
+
+ The NF argument identifies the normalization form to apply after
+ the case-mapping. It can also be NULL, for no normalization.
+
+ -- Function: uint8_t * u8_tolower (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
+ size_t *LENGTHP)
+ -- Function: uint16_t * u16_tolower (const uint16_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
+ *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint32_t * u32_tolower (const uint32_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
+ *RESULTBUF, size_t *LENGTHP)
+ Returns the lowercase mapping of a string.
+
+ The NF argument identifies the normalization form to apply after
+ the case-mapping. It can also be NULL, for no normalization.
+
+ -- Function: uint8_t * u8_totitle (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
+ size_t *LENGTHP)
+ -- Function: uint16_t * u16_totitle (const uint16_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
+ *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint32_t * u32_totitle (const uint32_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
+ *RESULTBUF, size_t *LENGTHP)
+ Returns the titlecase mapping of a string.
+
+ Mapping to title case means that, in each word, the first cased
+ character is being mapped to title case and the remaining
+ characters of the word are being mapped to lower case.
+
+ The NF argument identifies the normalization form to apply after
+ the case-mapping. It can also be NULL, for no normalization.
+
+
+File: libunistring.info, Node: Case mappings of substrings, Next: Case insensitive comparison, Prev: Case mappings of strings, Up: unicase.h
+
+13.3 Case mappings of substrings
+================================
+
+ Case mapping of a substring cannot simply be performed by extracting
+the substring and then applying the case mapping function to it. This
+does not work because case mapping requires some information about the
+surrounding characters. The following functions allow applying case
+mappings to substrings of a given string, while taking into account the
+characters that precede it (the "prefix") and the characters that
+follow it (the "suffix").
+
+ -- Type: casing_prefix_context_t
+ This data type denotes the case-mapping context that is given by a
+ prefix string. It is an immediate type that can be copied by
+ simple assignment, without involving memory allocation. It is not
+ an array type.
+
+ -- Constant: casing_prefix_context_t unicase_empty_prefix_context
+ This constant is the case-mapping context that corresponds to an
+ empty prefix string.
+
+ The following functions return `casing_prefix_context_t' objects:
+
+ -- Function: casing_prefix_context_t u8_casing_prefix_context (const
+ uint8_t *S, size_t N)
+ -- Function: casing_prefix_context_t u16_casing_prefix_context (const
+ uint16_t *S, size_t N)
+ -- Function: casing_prefix_context_t u32_casing_prefix_context (const
+ uint32_t *S, size_t N)
+ Returns the case-mapping context of a given prefix string.
+
+ -- Function: casing_prefix_context_t u8_casing_prefixes_context (const
+ uint8_t *S, size_t N, casing_prefix_context_t A_CONTEXT)
+ -- Function: casing_prefix_context_t u16_casing_prefixes_context
+ (const uint16_t *S, size_t N, casing_prefix_context_t
+ A_CONTEXT)
+ -- Function: casing_prefix_context_t u32_casing_prefixes_context
+ (const uint32_t *S, size_t N, casing_prefix_context_t
+ A_CONTEXT)
+ Returns the case-mapping context of the prefix concat(A, S), given
+ the case-mapping context of the prefix A.
+
+ -- Type: casing_suffix_context_t
+ This data type denotes the case-mapping context that is given by a
+ suffix string. It is an immediate type that can be copied by
+ simple assignment, without involving memory allocation. It is not
+ an array type.
+
+ -- Constant: casing_suffix_context_t unicase_empty_suffix_context
+ This constant is the case-mapping context that corresponds to an
+ empty suffix string.
+
+ The following functions return `casing_suffix_context_t' objects:
+
+ -- Function: casing_suffix_context_t u8_casing_suffix_context (const
+ uint8_t *S, size_t N)
+ -- Function: casing_suffix_context_t u16_casing_suffix_context (const
+ uint16_t *S, size_t N)
+ -- Function: casing_suffix_context_t u32_casing_suffix_context (const
+ uint32_t *S, size_t N)
+ Returns the case-mapping context of a given suffix string.
+
+ -- Function: casing_suffix_context_t u8_casing_suffixes_context (const
+ uint8_t *S, size_t N, casing_suffix_context_t A_CONTEXT)
+ -- Function: casing_suffix_context_t u16_casing_suffixes_context
+ (const uint16_t *S, size_t N, casing_suffix_context_t
+ A_CONTEXT)
+ -- Function: casing_suffix_context_t u32_casing_suffixes_context
+ (const uint32_t *S, size_t N, casing_suffix_context_t
+ A_CONTEXT)
+ Returns the case-mapping context of the suffix concat(S, A), given
+ the case-mapping context of the suffix A.
+
+ The following functions perform a case mapping, considering the
+prefix context and the suffix context.
+
+ -- Function: uint8_t * u8_ct_toupper (const uint8_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint16_t * u16_ct_toupper (const uint16_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint32_t * u32_ct_toupper (const uint32_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
+ *LENGTHP)
+ Returns the uppercase mapping of a string that is surrounded by a
+ prefix and a suffix.
+
+ -- Function: uint8_t * u8_ct_tolower (const uint8_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint16_t * u16_ct_tolower (const uint16_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint32_t * u32_ct_tolower (const uint32_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
+ *LENGTHP)
+ Returns the lowercase mapping of a string that is surrounded by a
+ prefix and a suffix.
+
+ -- Function: uint8_t * u8_ct_totitle (const uint8_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint16_t * u16_ct_totitle (const uint16_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint32_t * u32_ct_totitle (const uint32_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
+ *LENGTHP)
+ Returns the titlecase mapping of a string that is surrounded by a
+ prefix and a suffix.
+
+ For example, to uppercase the UTF-8 substring between `s +
+start_index' and `s + end_index' of a string that extends from `s' to
+`s + u8_strlen (s)', you can use the statements
+
+ size_t result_length;
+ uint8_t *result =
+ u8_ct_toupper (s + start_index, end_index - start_index,
+ u8_casing_prefix_context (s, start_index),
+ u8_casing_suffix_context (s + end_index,
+ u8_strlen (s) - end_index),
+ iso639_language, NULL, NULL, &result_length);
+
+
+File: libunistring.info, Node: Case insensitive comparison, Next: Case detection, Prev: Case mappings of substrings, Up: unicase.h
+
+13.4 Case insensitive comparison
+================================
+
+ The following functions implement comparison that ignores
+differences in case and normalization.
+
+ -- Function: uint8_t * u8_casefold (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF,
+ size_t *LENGTHP)
+ -- Function: uint16_t * u16_casefold (const uint16_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint16_t
+ *RESULTBUF, size_t *LENGTHP)
+ -- Function: uint32_t * u32_casefold (const uint32_t *S, size_t N,
+ const char *ISO639_LANGUAGE, uninorm_t NF, uint32_t
+ *RESULTBUF, size_t *LENGTHP)
+ Returns the case folded string.
+
+ Comparing `u8_casefold (S1)' and `u8_casefold (S2)' with the
+ `u8_cmp2' function is equivalent to comparing S1 and S2 with
+ `u8_casecmp'.
+
+ The NF argument identifies the normalization form to apply after
+ the case-mapping. It can also be NULL, for no normalization.
+
+ -- Function: uint8_t * u8_ct_casefold (const uint8_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint8_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint16_t * u16_ct_casefold (const uint16_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint16_t *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: uint32_t * u32_ct_casefold (const uint32_t *S, size_t N,
+ casing_prefix_context_t PREFIX_CONTEXT,
+ casing_suffix_context_t SUFFIX_CONTEXT, const char
+ *ISO639_LANGUAGE, uninorm_t NF, uint32_t *RESULTBUF, size_t
+ *LENGTHP)
+ Returns the case folded string. The case folding takes into
+ account the case mapping contexts of the prefix and suffix strings.
+
+ -- Function: int u8_casecmp (const uint8_t *S1, size_t N1, const
+ uint8_t *S2, size_t N2, const char *ISO639_LANGUAGE,
+ uninorm_t NF, int *RESULTP)
+ -- Function: int u16_casecmp (const uint16_t *S1, size_t N1, const
+ uint16_t *S2, size_t N2, const char *ISO639_LANGUAGE,
+ uninorm_t NF, int *RESULTP)
+ -- Function: int u32_casecmp (const uint32_t *S1, size_t N1, const
+ uint32_t *S2, size_t N2, const char *ISO639_LANGUAGE,
+ uninorm_t NF, int *RESULTP)
+ -- Function: int ulc_casecmp (const char *S1, size_t N1, const char
+ *S2, size_t N2, const char *ISO639_LANGUAGE, uninorm_t NF,
+ int *RESULTP)
+ Compares S1 and S2, ignoring differences in case and normalization.
+
+ The NF argument identifies the normalization form to apply after
+ the case-mapping. It can also be NULL, for no normalization.
+
+ If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
+ if S1 > S2, and returns 0. Upon failure, returns -1 with `errno'
+ set.
+
+ The following functions additionally take into account the sorting
+rules of the current locale.
+
+ -- Function: char * u8_casexfrm (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: char * u16_casexfrm (const uint16_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: char * u32_casexfrm (const uint32_t *S, size_t N, const
+ char *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
+ *LENGTHP)
+ -- Function: char * ulc_casexfrm (const char *S, size_t N, const char
+ *ISO639_LANGUAGE, uninorm_t NF, char *RESULTBUF, size_t
+ *LENGTHP)
+ Converts the string S of length N to a NUL-terminated byte
+ sequence, in such a way that comparing `u8_casexfrm (S1)' and
+ `u8_casexfrm (S2)' with the gnulib function `memcmp2' is
+ equivalent to comparing S1 and S2 with `u8_casecoll'.
+
+ NF must be either `UNINORM_NFC', `UNINORM_NFKC', or NULL for no
+ normalization.
+
+ -- Function: int u8_casecoll (const uint8_t *S1, size_t N1, const
+ uint8_t *S2, size_t N2, const char *ISO639_LANGUAGE,
+ uninorm_t NF, int *RESULTP)
+ -- Function: int u16_casecoll (const uint16_t *S1, size_t N1, const
+ uint16_t *S2, size_t N2, const char *ISO639_LANGUAGE,
+ uninorm_t NF, int *RESULTP)
+ -- Function: int u32_casecoll (const uint32_t *S1, size_t N1, const
+ uint32_t *S2, size_t N2, const char *ISO639_LANGUAGE,
+ uninorm_t NF, int *RESULTP)
+ -- Function: int ulc_casecoll (const char *S1, size_t N1, const char
+ *S2, size_t N2, const char *ISO639_LANGUAGE, uninorm_t NF,
+ int *RESULTP)
+ Compares S1 and S2, ignoring differences in case and normalization,
+ using the collation rules of the current locale.
+
+ The NF argument identifies the normalization form to apply after
+ the case-mapping. It must be either `UNINORM_NFC', `UNINORM_NFKC',
+ or NULL for no normalization.
+
+ If successful, sets `*RESULTP' to -1 if S1 < S2, 0 if S1 = S2, 1
+ if S1 > S2, and returns 0. Upon failure, returns -1 with `errno'
+ set.
+
+
+File: libunistring.info, Node: Case detection, Prev: Case insensitive comparison, Up: unicase.h
+
+13.5 Case detection
+===================
+
+ The following functions determine whether a Unicode string is
+entirely in upper case, or entirely in lower case, or entirely in title
+case, or already case-folded.
+
+ -- Function: int u8_is_uppercase (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u16_is_uppercase (const uint16_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u32_is_uppercase (const uint32_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ Sets `*RESULTP' to true if mapping NFD(S) to upper case is a
+ no-op, or to false otherwise, and returns 0. Upon failure,
+ returns -1 with `errno' set.
+
+ -- Function: int u8_is_lowercase (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u16_is_lowercase (const uint16_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u32_is_lowercase (const uint32_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ Sets `*RESULTP' to true if mapping NFD(S) to lower case is a
+ no-op, or to false otherwise, and returns 0. Upon failure,
+ returns -1 with `errno' set.
+
+ -- Function: int u8_is_titlecase (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u16_is_titlecase (const uint16_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u32_is_titlecase (const uint32_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ Sets `*RESULTP' to true if mapping NFD(S) to title case is a
+ no-op, or to false otherwise, and returns 0. Upon failure,
+ returns -1 with `errno' set.
+
+ -- Function: int u8_is_casefolded (const uint8_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u16_is_casefolded (const uint16_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u32_is_casefolded (const uint32_t *S, size_t N, const
+ char *ISO639_LANGUAGE, bool *RESULTP)
+ Sets `*RESULTP' to true if applying case folding to NFD(S) is a
+ no-op, or to false otherwise, and returns 0. Upon failure,
+ returns -1 with `errno' set.
+
+ The following functions determine whether case mappings have any
+effect on a Unicode string.
+
+ -- Function: int u8_is_cased (const uint8_t *S, size_t N, const char
+ *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u16_is_cased (const uint16_t *S, size_t N, const char
+ *ISO639_LANGUAGE, bool *RESULTP)
+ -- Function: int u32_is_cased (const uint32_t *S, size_t N, const char
+ *ISO639_LANGUAGE, bool *RESULTP)
+ Sets `*RESULTP' to true if case matters for S, that is, if mapping
+ NFD(S) to either upper case or lower case or title case is not a
+ no-op. Sets `*RESULTP' to false if NFD(S) maps to itself under the
+ upper case mapping, under the lower case mapping, and under the
+ title case mapping; in other words, when NFD(S) consists entirely
+ of caseless characters. Returns 0 on success. Upon failure,
+ returns -1 with `errno' set.
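+
+ The "no-op" definitions used in this section can be pictured with
+ the following ASCII-only sketch. The `ascii_is_*' helpers are
+ hypothetical and not part of libunistring; they assume the "C"
+ locale and ignore normalization, language and title case. They
+ show that a caseless string such as `123' counts as both upper
+ case and lower case, yet is not cased.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical ASCII-only helpers, not part of libunistring.  */

/* True if mapping S to upper case would leave it unchanged.  */
bool
ascii_is_uppercase (const char *s, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (toupper ((unsigned char) s[i]) != (unsigned char) s[i])
      return false;
  return true;
}

/* True if mapping S to lower case would leave it unchanged.  */
bool
ascii_is_lowercase (const char *s, size_t n)
{
  size_t i;

  for (i = 0; i < n; i++)
    if (tolower ((unsigned char) s[i]) != (unsigned char) s[i])
      return false;
  return true;
}

/* True if case matters for S, that is, if at least one of the two
   mappings above would change it.  */
bool
ascii_is_cased (const char *s, size_t n)
{
  return !(ascii_is_uppercase (s, n) && ascii_is_lowercase (s, n));
}
```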
+
+
+File: libunistring.info, Node: uniregex.h, Next: Using the library, Prev: unicase.h, Up: Top
+
+14 Regular expressions `<uniregex.h>'
+*************************************
+
+ This include file is not yet implemented.
+
+
+File: libunistring.info, Node: Using the library, Next: More functionality, Prev: uniregex.h, Up: Top
+
+15 Using the library
+********************
+
+ This chapter explains some practical considerations, regarding the
+installation and compiler options that are needed in order to use this
+library.
+
+* Menu:
+
+* Installation::
+* Compiler options::
+* Include files::
+* Autoconf macro::
+* Reporting problems::
+
+
+File: libunistring.info, Node: Installation, Next: Compiler options, Up: Using the library
+
+15.1 Installation
+=================
+
+ Before you can use the library, it must be installed. First, you
+have to make sure all dependencies are installed. They are listed in
+the file `DEPENDENCIES'.
+
+ Then you can proceed to build and install the library, as described
+in the file `INSTALL'. For installation on Windows systems, please
+refer to the file `README.woe32'.
+
+
+File: libunistring.info, Node: Compiler options, Next: Include files, Prev: Installation, Up: Using the library
+
+15.2 Compiler options
+=====================
+
+ Let's denote as `LIBUNISTRING_PREFIX' the value of the `--prefix'
+option that you passed to `configure' while installing this package.
+If you didn't pass any `--prefix' option, then the package is installed
+in `/usr/local'.
+
+ Let's denote as `LIBUNISTRING_INCLUDEDIR' the directory where the
+include files were installed. This is usually the same as
+`${LIBUNISTRING_PREFIX}/include', unless you passed an `--includedir'
+option to `configure', in which case it is the value of that option.
+
+ Let's further denote as `LIBUNISTRING_LIBDIR' the directory where
+the library itself was installed. This is the value that you passed
+with the `--libdir' option to `configure', or otherwise the same as
+`${LIBUNISTRING_PREFIX}/lib'. Recall that when building in 64-bit mode
+on a 64-bit GNU/Linux system that supports executables in either 64-bit
+mode or 32-bit mode, you should have used the option
+`--libdir=${LIBUNISTRING_PREFIX}/lib64'.
+
+ So that the compiler finds the include files, you have to pass it the
+option `-I${LIBUNISTRING_INCLUDEDIR}'.
+
+ So that the compiler finds the library during its linking pass, you
+have to pass it the options `-L${LIBUNISTRING_LIBDIR} -lunistring'. On
+some systems, in some configurations, you also have to pass options
+needed for linking with `libiconv'. The autoconf macro
+`gl_LIBUNISTRING' (see *note Autoconf macro::) deals with this
+particularity.
+
+
+File: libunistring.info, Node: Include files, Next: Autoconf macro, Prev: Compiler options, Up: Using the library
+
+15.3 Include files
+==================
+
+ Most of the include files have been presented in the introduction,
+see *note Introduction::, and subsequent detailed chapters.
+
+ Another include file is `<unistring/version.h>'. It contains the
+version number of the libunistring library.
+
+ -- Macro: int _LIBUNISTRING_VERSION
+ This constant contains the version of libunistring that is being
+ used at compile time. It encodes the major and minor parts of the
+ version number only. These parts are encoded in the form
+ `(major<<8) + minor'.
+
+ -- Constant: int _libunistring_version
+ This constant contains the version of libunistring that is being
+ used at run time. It encodes the major and minor parts of the
+ version number only. These parts are encoded in the form
+ `(major<<8) + minor'.
+
+ It is possible that `_libunistring_version' is greater than
+`_LIBUNISTRING_VERSION'. This can happen when you use `libunistring'
+as a shared library, and a newer, binary backward-compatible version
+has been installed after your program that uses `libunistring' was
+installed.
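+
+ The `(major<<8) + minor' encoding can be taken apart as in the
+following sketch. The two helper names are hypothetical and not part of
+libunistring, and the sketch assumes that the minor part is smaller
+than 256, as the encoding implies.

```c
/* Hypothetical helpers, not part of libunistring.
   They split a version number encoded as (major<<8) + minor,
   assuming 0 <= minor < 256.  */

int
unistring_version_major (int version)
{
  return version >> 8;
}

int
unistring_version_minor (int version)
{
  return version & 0xff;
}
```

+ In a program that includes `<unistring/version.h>', either
+`_LIBUNISTRING_VERSION' or `_libunistring_version' could be passed to
+these helpers, for example to report both the compile time and the run
+time version in a bug report.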
+
+
+File: libunistring.info, Node: Autoconf macro, Next: Reporting problems, Prev: Include files, Up: Using the library
+
+15.4 Autoconf macro
+===================
+
+ GNU Gnulib provides an autoconf macro that tests for the availability
+of `libunistring'. It is contained in the Gnulib module
+`libunistring', see
+`http://www.gnu.org/software/gnulib/MODULES.html#module=libunistring'.
+
+ The macro is called `gl_LIBUNISTRING'. It searches for an installed
+libunistring. If found, it sets and AC_SUBSTs `HAVE_LIBUNISTRING=yes'
+and the `LIBUNISTRING' and `LTLIBUNISTRING' variables and augments the
+`CPPFLAGS' variable, and defines the C macro `HAVE_LIBUNISTRING' to 1.
+Otherwise, it sets and AC_SUBSTs `HAVE_LIBUNISTRING=no' and
+`LIBUNISTRING' and `LTLIBUNISTRING' to empty.
+
+ The complexities that `gl_LIBUNISTRING' deals with are the following:
+
+ * On some operating systems, in some configurations, libunistring
+ depends on `libiconv', and the options for linking with libiconv
+ must be mentioned explicitly on the link command line.
+
+ * GNU `libunistring', if installed, is not necessarily already in the
+ search path (`CPPFLAGS' for the include file search path,
+ `LDFLAGS' for the library search path).
+
+ * GNU `libunistring', if installed, is not necessarily already in the
+ run time library search path. To avoid the need for setting an
+ environment variable like `LD_LIBRARY_PATH', the macro adds the
+ appropriate run time search path options to the `LIBUNISTRING'
+ variable. This works on most systems.
+
+
+File: libunistring.info, Node: Reporting problems, Prev: Autoconf macro, Up: Using the library
+
+15.5 Reporting problems
+=======================
+
+ If you encounter any problem, please don't hesitate to send a
+detailed bug report to the `bug-libunistring@gnu.org' mailing list.
+Alternatively, you can use the bug tracker at the project page
+`https://savannah.gnu.org/projects/libunistring'.
+
+ Please always include the version number of this library, and a short
+description of your operating system and compilation environment with
+corresponding version numbers.
+
+ For problems that appear while building and installing
+`libunistring', for which you don't find the remedy in the `INSTALL'
+file, please include a description of the options that you passed to
+the `configure' script.
+
+
+File: libunistring.info, Node: More functionality, Next: Licenses, Prev: Using the library, Up: Top
+
+16 More advanced functionality
+******************************
+
+ For bidirectional reordering of strings, we recommend the GNU
+FriBidi library: `http://www.fribidi.org/'.
+
+ For the rendering of Unicode strings outside of the context of a
+given toolkit (KDE/Qt or GNOME/Gtk), we recommend the Pango library:
+`http://www.pango.org/'.
+
+
+File: libunistring.info, Node: Licenses, Next: Index, Prev: More functionality, Up: Top
+
+Appendix A Licenses
+*******************
+
+ The files of this package are covered by the licenses indicated in
+each particular file or directory. Here is a summary:
+
+ * The `libunistring' library is covered by the GNU Lesser General
+ Public License (LGPL). A copy of the license is included in *note
+ GNU LGPL::.
+
+ * This manual is free documentation. It is dually licensed under the
+ GNU FDL and the GNU GPL. This means that you can redistribute this
+ manual under either of these two licenses, at your choice.
+ This manual is covered by the GNU FDL. Permission is granted to
+ copy, distribute and/or modify this document under the terms of the
+ GNU Free Documentation License (FDL), either version 1.2 of the
+ License, or (at your option) any later version published by the
+ Free Software Foundation (FSF); with no Invariant Sections, with no
+ Front-Cover Text, and with no Back-Cover Texts. A copy of the
+ license is included in *note GNU FDL::.
+ This manual is covered by the GNU GPL. You can redistribute it
+ and/or modify it under the terms of the GNU General Public License
+ (GPL), either version 3 of the License, or (at your option) any
+ later version published by the Free Software Foundation (FSF). A
+ copy of the license is included in *note GNU GPL::.
+
+* Menu:
+
+* GNU GPL:: GNU General Public License
+* GNU LGPL:: GNU Lesser General Public License
+* GNU FDL:: GNU Free Documentation License
+
+
+File: libunistring.info, Node: GNU GPL, Next: GNU LGPL, Up: Licenses
+
+A.1 GNU GENERAL PUBLIC LICENSE
+==============================
+
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'
+
+ Everyone is permitted to copy and distribute verbatim copies of this
+ license document, but changing it is not allowed.
+
+Preamble
+========
+
+ The GNU General Public License is a free, copyleft license for
+software and other kinds of works.
+
+ The licenses for most software and other practical works are designed
+to take away your freedom to share and change the works. By contrast,
+the GNU General Public License is intended to guarantee your freedom to
+share and change all versions of a program--to make sure it remains
+free software for all its users. We, the Free Software Foundation, use
+the GNU General Public License for most of our software; it applies
+also to any other work released this way by its authors. You can apply
+it to your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+them if you wish), that you receive source code or can get it if you
+want it, that you can change the software or use pieces of it in new
+free programs, and that you know you can do these things.
+
+ To protect your rights, we need to prevent others from denying you
+these rights or asking you to surrender the rights. Therefore, you
+have certain responsibilities if you distribute copies of the software,
+or if you modify it: responsibilities to respect the freedom of others.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must pass on to the recipients the same
+freedoms that you received. You must make sure that they, too, receive
+or can get the source code. And you must show them these terms so they
+know their rights.
+
+ Developers that use the GNU GPL protect your rights with two steps:
+(1) assert copyright on the software, and (2) offer you this License
+giving you legal permission to copy, distribute and/or modify it.
+
+ For the developers' and authors' protection, the GPL clearly explains
+that there is no warranty for this free software. For both users' and
+authors' sake, the GPL requires that modified versions be marked as
+changed, so that their problems will not be attributed erroneously to
+authors of previous versions.
+
+ Some devices are designed to deny users access to install or run
+modified versions of the software inside them, although the
+manufacturer can do so. This is fundamentally incompatible with the
+aim of protecting users' freedom to change the software. The
+systematic pattern of such abuse occurs in the area of products for
+individuals to use, which is precisely where it is most unacceptable.
+Therefore, we have designed this version of the GPL to prohibit the
+practice for those products. If such problems arise substantially in
+other domains, we stand ready to extend this provision to those domains
+in future versions of the GPL, as needed to protect the freedom of
+users.
+
+ Finally, every program is threatened constantly by software patents.
+States should not allow patents to restrict development and use of
+software on general-purpose computers, but in those that do, we wish to
+avoid the special danger that patents applied to a free program could
+make it effectively proprietary. To prevent this, the GPL assures that
+patents cannot be used to render the program non-free.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+TERMS AND CONDITIONS
+====================
+
+ 0. Definitions.
+
+ "This License" refers to version 3 of the GNU General Public
+ License.
+
+ "Copyright" also means copyright-like laws that apply to other
+ kinds of works, such as semiconductor masks.
+
+ "The Program" refers to any copyrightable work licensed under this
+ License. Each licensee is addressed as "you". "Licensees" and
+ "recipients" may be individuals or organizations.
+
+ To "modify" a work means to copy from or adapt all or part of the
+ work in a fashion requiring copyright permission, other than the
+ making of an exact copy. The resulting work is called a "modified
+ version" of the earlier work or a work "based on" the earlier work.
+
+ A "covered work" means either the unmodified Program or a work
+ based on the Program.
+
+ To "propagate" a work means to do anything with it that, without
+ permission, would make you directly or secondarily liable for
+ infringement under applicable copyright law, except executing it
+ on a computer or modifying a private copy. Propagation includes
+ copying, distribution (with or without modification), making
+ available to the public, and in some countries other activities as
+ well.
+
+ To "convey" a work means any kind of propagation that enables other
+ parties to make or receive copies. Mere interaction with a user
+ through a computer network, with no transfer of a copy, is not
+ conveying.
+
+ An interactive user interface displays "Appropriate Legal Notices"
+ to the extent that it includes a convenient and prominently visible
+ feature that (1) displays an appropriate copyright notice, and (2)
+ tells the user that there is no warranty for the work (except to
+ the extent that warranties are provided), that licensees may
+ convey the work under this License, and how to view a copy of this
+ License. If the interface presents a list of user commands or
+ options, such as a menu, a prominent item in the list meets this
+ criterion.
+
+ 1. Source Code.
+
+ The "source code" for a work means the preferred form of the work
+ for making modifications to it. "Object code" means any
+ non-source form of a work.
+
+ A "Standard Interface" means an interface that either is an
+ official standard defined by a recognized standards body, or, in
+ the case of interfaces specified for a particular programming
+ language, one that is widely used among developers working in that
+ language.
+
+ The "System Libraries" of an executable work include anything,
+ other than the work as a whole, that (a) is included in the normal
+ form of packaging a Major Component, but which is not part of that
+ Major Component, and (b) serves only to enable use of the work
+ with that Major Component, or to implement a Standard Interface
+ for which an implementation is available to the public in source
+ code form. A "Major Component", in this context, means a major
+ essential component (kernel, window system, and so on) of the
+ specific operating system (if any) on which the executable work
+ runs, or a compiler used to produce the work, or an object code
+ interpreter used to run it.
+
+ The "Corresponding Source" for a work in object code form means all
+ the source code needed to generate, install, and (for an executable
+ work) run the object code and to modify the work, including
+ scripts to control those activities. However, it does not include
+ the work's System Libraries, or general-purpose tools or generally
+ available free programs which are used unmodified in performing
+ those activities but which are not part of the work. For example,
+ Corresponding Source includes interface definition files
+ associated with source files for the work, and the source code for
+ shared libraries and dynamically linked subprograms that the work
+ is specifically designed to require, such as by intimate data
+ communication or control flow between those subprograms and other
+ parts of the work.
+
+ The Corresponding Source need not include anything that users can
+ regenerate automatically from other parts of the Corresponding
+ Source.
+
+ The Corresponding Source for a work in source code form is that
+ same work.
+
+ 2. Basic Permissions.
+
+ All rights granted under this License are granted for the term of
+ copyright on the Program, and are irrevocable provided the stated
+ conditions are met. This License explicitly affirms your unlimited
+ permission to run the unmodified Program. The output from running
+ a covered work is covered by this License only if the output,
+ given its content, constitutes a covered work. This License
+ acknowledges your rights of fair use or other equivalent, as
+ provided by copyright law.
+
+ You may make, run and propagate covered works that you do not
+ convey, without conditions so long as your license otherwise
+ remains in force. You may convey covered works to others for the
+ sole purpose of having them make modifications exclusively for
+ you, or provide you with facilities for running those works,
+ provided that you comply with the terms of this License in
+ conveying all material for which you do not control copyright.
+ Those thus making or running the covered works for you must do so
+ exclusively on your behalf, under your direction and control, on
+ terms that prohibit them from making any copies of your
+ copyrighted material outside their relationship with you.
+
+ Conveying under any other circumstances is permitted solely under
+ the conditions stated below. Sublicensing is not allowed; section
+ 10 makes it unnecessary.
+
+ 3. Protecting Users' Legal Rights From Anti-Circumvention Law.
+
+ No covered work shall be deemed part of an effective technological
+ measure under any applicable law fulfilling obligations under
+ article 11 of the WIPO copyright treaty adopted on 20 December
+ 1996, or similar laws prohibiting or restricting circumvention of
+ such measures.
+
+ When you convey a covered work, you waive any legal power to forbid
+ circumvention of technological measures to the extent such
+ circumvention is effected by exercising rights under this License
+ with respect to the covered work, and you disclaim any intention
+ to limit operation or modification of the work as a means of
+ enforcing, against the work's users, your or third parties' legal
+ rights to forbid circumvention of technological measures.
+
+ 4. Conveying Verbatim Copies.
+
+ You may convey verbatim copies of the Program's source code as you
+ receive it, in any medium, provided that you conspicuously and
+ appropriately publish on each copy an appropriate copyright notice;
+ keep intact all notices stating that this License and any
+ non-permissive terms added in accord with section 7 apply to the
+ code; keep intact all notices of the absence of any warranty; and
+ give all recipients a copy of this License along with the Program.
+
+ You may charge any price or no price for each copy that you convey,
+ and you may offer support or warranty protection for a fee.
+
+ 5. Conveying Modified Source Versions.
+
+ You may convey a work based on the Program, or the modifications to
+ produce it from the Program, in the form of source code under the
+ terms of section 4, provided that you also meet all of these
+ conditions:
+
+ a. The work must carry prominent notices stating that you
+ modified it, and giving a relevant date.
+
+ b. The work must carry prominent notices stating that it is
+ released under this License and any conditions added under
+ section 7. This requirement modifies the requirement in
+ section 4 to "keep intact all notices".
+
+ c. You must license the entire work, as a whole, under this
+ License to anyone who comes into possession of a copy. This
+ License will therefore apply, along with any applicable
+ section 7 additional terms, to the whole of the work, and all
+ its parts, regardless of how they are packaged. This License
+ gives no permission to license the work in any other way, but
+ it does not invalidate such permission if you have separately
+ received it.
+
+ d. If the work has interactive user interfaces, each must display
+ Appropriate Legal Notices; however, if the Program has
+ interactive interfaces that do not display Appropriate Legal
+ Notices, your work need not make them do so.
+
+ A compilation of a covered work with other separate and independent
+ works, which are not by their nature extensions of the covered
+ work, and which are not combined with it such as to form a larger
+ program, in or on a volume of a storage or distribution medium, is
+ called an "aggregate" if the compilation and its resulting
+ copyright are not used to limit the access or legal rights of the
+ compilation's users beyond what the individual works permit.
+ Inclusion of a covered work in an aggregate does not cause this
+ License to apply to the other parts of the aggregate.
+
+ 6. Conveying Non-Source Forms.
+
+ You may convey a covered work in object code form under the terms
+ of sections 4 and 5, provided that you also convey the
+ machine-readable Corresponding Source under the terms of this
+ License, in one of these ways:
+
+ a. Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by the
+ Corresponding Source fixed on a durable physical medium
+ customarily used for software interchange.
+
+ b. Convey the object code in, or embodied in, a physical product
+ (including a physical distribution medium), accompanied by a
+ written offer, valid for at least three years and valid for
+ as long as you offer spare parts or customer support for that
+ product model, to give anyone who possesses the object code
+ either (1) a copy of the Corresponding Source for all the
+ software in the product that is covered by this License, on a
+ durable physical medium customarily used for software
+ interchange, for a price no more than your reasonable cost of
+ physically performing this conveying of source, or (2) access
+ to copy the Corresponding Source from a network server at no
+ charge.
+
+ c. Convey individual copies of the object code with a copy of
+ the written offer to provide the Corresponding Source. This
+ alternative is allowed only occasionally and noncommercially,
+ and only if you received the object code with such an offer,
+ in accord with subsection 6b.
+
+ d. Convey the object code by offering access from a designated
+ place (gratis or for a charge), and offer equivalent access
+ to the Corresponding Source in the same way through the same
+ place at no further charge. You need not require recipients
+ to copy the Corresponding Source along with the object code.
+ If the place to copy the object code is a network server, the
+ Corresponding Source may be on a different server (operated
+ by you or a third party) that supports equivalent copying
+ facilities, provided you maintain clear directions next to
+ the object code saying where to find the Corresponding Source.
+ Regardless of what server hosts the Corresponding Source, you
+ remain obligated to ensure that it is available for as long
+ as needed to satisfy these requirements.
+
+ e. Convey the object code using peer-to-peer transmission,
+ provided you inform other peers where the object code and
+ Corresponding Source of the work are being offered to the
+ general public at no charge under subsection 6d.
+
+
+ A separable portion of the object code, whose source code is
+ excluded from the Corresponding Source as a System Library, need
+ not be included in conveying the object code work.
+
+ A "User Product" is either (1) a "consumer product", which means
+ any tangible personal property which is normally used for personal,
+ family, or household purposes, or (2) anything designed or sold for
+ incorporation into a dwelling. In determining whether a product
+ is a consumer product, doubtful cases shall be resolved in favor of
+ coverage. For a particular product received by a particular user,
+ "normally used" refers to a typical or common use of that class of
+ product, regardless of the status of the particular user or of the
+ way in which the particular user actually uses, or expects or is
+ expected to use, the product. A product is a consumer product
+ regardless of whether the product has substantial commercial,
+ industrial or non-consumer uses, unless such uses represent the
+ only significant mode of use of the product.
+
+ "Installation Information" for a User Product means any methods,
+ procedures, authorization keys, or other information required to
+ install and execute modified versions of a covered work in that
+ User Product from a modified version of its Corresponding Source.
+ The information must suffice to ensure that the continued
+ functioning of the modified object code is in no case prevented or
+ interfered with solely because modification has been made.
+
+ If you convey an object code work under this section in, or with,
+ or specifically for use in, a User Product, and the conveying
+ occurs as part of a transaction in which the right of possession
+ and use of the User Product is transferred to the recipient in
+ perpetuity or for a fixed term (regardless of how the transaction
+ is characterized), the Corresponding Source conveyed under this
+ section must be accompanied by the Installation Information. But
+ this requirement does not apply if neither you nor any third party
+ retains the ability to install modified object code on the User
+ Product (for example, the work has been installed in ROM).
+
+ The requirement to provide Installation Information does not
+ include a requirement to continue to provide support service,
+ warranty, or updates for a work that has been modified or
+ installed by the recipient, or for the User Product in which it
+ has been modified or installed. Access to a network may be denied
+ when the modification itself materially and adversely affects the
+ operation of the network or violates the rules and protocols for
+ communication across the network.
+
+ Corresponding Source conveyed, and Installation Information
+ provided, in accord with this section must be in a format that is
+ publicly documented (and with an implementation available to the
+ public in source code form), and must require no special password
+ or key for unpacking, reading or copying.
+
+ 7. Additional Terms.
+
+ "Additional permissions" are terms that supplement the terms of
+ this License by making exceptions from one or more of its
+ conditions. Additional permissions that are applicable to the
+ entire Program shall be treated as though they were included in
+ this License, to the extent that they are valid under applicable
+ law. If additional permissions apply only to part of the Program,
+ that part may be used separately under those permissions, but the
+ entire Program remains governed by this License without regard to
+ the additional permissions.
+
+ When you convey a copy of a covered work, you may at your option
+ remove any additional permissions from that copy, or from any part
+ of it. (Additional permissions may be written to require their own
+ removal in certain cases when you modify the work.) You may place
+ additional permissions on material, added by you to a covered work,
+ for which you have or can give appropriate copyright permission.
+
+ Notwithstanding any other provision of this License, for material
+ you add to a covered work, you may (if authorized by the copyright
+ holders of that material) supplement the terms of this License
+ with terms:
+
+ a. Disclaiming warranty or limiting liability differently from
+ the terms of sections 15 and 16 of this License; or
+
+ b. Requiring preservation of specified reasonable legal notices
+ or author attributions in that material or in the Appropriate
+ Legal Notices displayed by works containing it; or
+
+ c. Prohibiting misrepresentation of the origin of that material,
+ or requiring that modified versions of such material be
+ marked in reasonable ways as different from the original
+ version; or
+
+ d. Limiting the use for publicity purposes of names of licensors
+ or authors of the material; or
+
+ e. Declining to grant rights under trademark law for use of some
+ trade names, trademarks, or service marks; or
+
+ f. Requiring indemnification of licensors and authors of that
+ material by anyone who conveys the material (or modified
+ versions of it) with contractual assumptions of liability to
+ the recipient, for any liability that these contractual
+ assumptions directly impose on those licensors and authors.
+
+ All other non-permissive additional terms are considered "further
+ restrictions" within the meaning of section 10. If the Program as
+ you received it, or any part of it, contains a notice stating that
+ it is governed by this License along with a term that is a further
+ restriction, you may remove that term. If a license document
+ contains a further restriction but permits relicensing or
+ conveying under this License, you may add to a covered work
+ material governed by the terms of that license document, provided
+ that the further restriction does not survive such relicensing or
+ conveying.
+
+ If you add terms to a covered work in accord with this section, you
+ must place, in the relevant source files, a statement of the
+ additional terms that apply to those files, or a notice indicating
+ where to find the applicable terms.
+
+ Additional terms, permissive or non-permissive, may be stated in
+ the form of a separately written license, or stated as exceptions;
+ the above requirements apply either way.
+
+ 8. Termination.
+
+ You may not propagate or modify a covered work except as expressly
+ provided under this License. Any attempt otherwise to propagate or
+ modify it is void, and will automatically terminate your rights
+ under this License (including any patent licenses granted under
+ the third paragraph of section 11).
+
+ However, if you cease all violation of this License, then your
+ license from a particular copyright holder is reinstated (a)
+ provisionally, unless and until the copyright holder explicitly
+ and finally terminates your license, and (b) permanently, if the
+ copyright holder fails to notify you of the violation by some
+ reasonable means prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+ reinstated permanently if the copyright holder notifies you of the
+ violation by some reasonable means, this is the first time you have
+ received notice of violation of this License (for any work) from
+ that copyright holder, and you cure the violation prior to 30 days
+ after your receipt of the notice.
+
+ Termination of your rights under this section does not terminate
+ the licenses of parties who have received copies or rights from
+ you under this License. If your rights have been terminated and
+ not permanently reinstated, you do not qualify to receive new
+ licenses for the same material under section 10.
+
+ 9. Acceptance Not Required for Having Copies.
+
+ You are not required to accept this License in order to receive or
+ run a copy of the Program. Ancillary propagation of a covered work
+ occurring solely as a consequence of using peer-to-peer
+ transmission to receive a copy likewise does not require
+ acceptance. However, nothing other than this License grants you
+ permission to propagate or modify any covered work. These actions
+ infringe copyright if you do not accept this License. Therefore,
+ by modifying or propagating a covered work, you indicate your
+ acceptance of this License to do so.
+
+ 10. Automatic Licensing of Downstream Recipients.
+
+ Each time you convey a covered work, the recipient automatically
+ receives a license from the original licensors, to run, modify and
+ propagate that work, subject to this License. You are not
+ responsible for enforcing compliance by third parties with this
+ License.
+
+ An "entity transaction" is a transaction transferring control of an
+ organization, or substantially all assets of one, or subdividing an
+ organization, or merging organizations. If propagation of a
+ covered work results from an entity transaction, each party to that
+ transaction who receives a copy of the work also receives whatever
+ licenses to the work the party's predecessor in interest had or
+ could give under the previous paragraph, plus a right to
+ possession of the Corresponding Source of the work from the
+ predecessor in interest, if the predecessor has it or can get it
+ with reasonable efforts.
+
+ You may not impose any further restrictions on the exercise of the
+ rights granted or affirmed under this License. For example, you
+ may not impose a license fee, royalty, or other charge for
+ exercise of rights granted under this License, and you may not
+ initiate litigation (including a cross-claim or counterclaim in a
+ lawsuit) alleging that any patent claim is infringed by making,
+ using, selling, offering for sale, or importing the Program or any
+ portion of it.
+
+ 11. Patents.
+
+ A "contributor" is a copyright holder who authorizes use under this
+ License of the Program or a work on which the Program is based.
+ The work thus licensed is called the contributor's "contributor
+ version".
+
+ A contributor's "essential patent claims" are all patent claims
+ owned or controlled by the contributor, whether already acquired or
+ hereafter acquired, that would be infringed by some manner,
+ permitted by this License, of making, using, or selling its
+ contributor version, but do not include claims that would be
+ infringed only as a consequence of further modification of the
+ contributor version. For purposes of this definition, "control"
+ includes the right to grant patent sublicenses in a manner
+ consistent with the requirements of this License.
+
+ Each contributor grants you a non-exclusive, worldwide,
+ royalty-free patent license under the contributor's essential
+ patent claims, to make, use, sell, offer for sale, import and
+ otherwise run, modify and propagate the contents of its
+ contributor version.
+
+ In the following three paragraphs, a "patent license" is any
+ express agreement or commitment, however denominated, not to
+ enforce a patent (such as an express permission to practice a
+ patent or covenant not to sue for patent infringement). To
+ "grant" such a patent license to a party means to make such an
+ agreement or commitment not to enforce a patent against the party.
+
+ If you convey a covered work, knowingly relying on a patent
+ license, and the Corresponding Source of the work is not available
+ for anyone to copy, free of charge and under the terms of this
+ License, through a publicly available network server or other
+ readily accessible means, then you must either (1) cause the
+ Corresponding Source to be so available, or (2) arrange to deprive
+ yourself of the benefit of the patent license for this particular
+ work, or (3) arrange, in a manner consistent with the requirements
+ of this License, to extend the patent license to downstream
+ recipients. "Knowingly relying" means you have actual knowledge
+ that, but for the patent license, your conveying the covered work
+ in a country, or your recipient's use of the covered work in a
+ country, would infringe one or more identifiable patents in that
+ country that you have reason to believe are valid.
+
+ If, pursuant to or in connection with a single transaction or
+ arrangement, you convey, or propagate by procuring conveyance of, a
+ covered work, and grant a patent license to some of the parties
+ receiving the covered work authorizing them to use, propagate,
+ modify or convey a specific copy of the covered work, then the
+ patent license you grant is automatically extended to all
+ recipients of the covered work and works based on it.
+
+ A patent license is "discriminatory" if it does not include within
+ the scope of its coverage, prohibits the exercise of, or is
+ conditioned on the non-exercise of one or more of the rights that
+ are specifically granted under this License. You may not convey a
+ covered work if you are a party to an arrangement with a third
+ party that is in the business of distributing software, under
+ which you make payment to the third party based on the extent of
+ your activity of conveying the work, and under which the third
+ party grants, to any of the parties who would receive the covered
+ work from you, a discriminatory patent license (a) in connection
+ with copies of the covered work conveyed by you (or copies made
+ from those copies), or (b) primarily for and in connection with
+ specific products or compilations that contain the covered work,
+ unless you entered into that arrangement, or that patent license
+ was granted, prior to 28 March 2007.
+
+ Nothing in this License shall be construed as excluding or limiting
+ any implied license or other defenses to infringement that may
+ otherwise be available to you under applicable patent law.
+
+ 12. No Surrender of Others' Freedom.
+
+ If conditions are imposed on you (whether by court order,
+ agreement or otherwise) that contradict the conditions of this
+ License, they do not excuse you from the conditions of this
+ License. If you cannot convey a covered work so as to satisfy
+ simultaneously your obligations under this License and any other
+ pertinent obligations, then as a consequence you may not convey it
+ at all. For example, if you agree to terms that obligate you to
+ collect a royalty for further conveying from those to whom you
+ convey the Program, the only way you could satisfy both those
+ terms and this License would be to refrain entirely from conveying
+ the Program.
+
+ 13. Use with the GNU Affero General Public License.
+
+ Notwithstanding any other provision of this License, you have
+ permission to link or combine any covered work with a work licensed
+ under version 3 of the GNU Affero General Public License into a
+ single combined work, and to convey the resulting work. The terms
+ of this License will continue to apply to the part which is the
+ covered work, but the special requirements of the GNU Affero
+ General Public License, section 13, concerning interaction through
+ a network will apply to the combination as such.
+
+ 14. Revised Versions of this License.
+
+ The Free Software Foundation may publish revised and/or new
+ versions of the GNU General Public License from time to time.
+ Such new versions will be similar in spirit to the present
+ version, but may differ in detail to address new problems or
+ concerns.
+
+ Each version is given a distinguishing version number. If the
+ Program specifies that a certain numbered version of the GNU
+ General Public License "or any later version" applies to it, you
+ have the option of following the terms and conditions either of
+ that numbered version or of any later version published by the
+ Free Software Foundation. If the Program does not specify a
+ version number of the GNU General Public License, you may choose
+ any version ever published by the Free Software Foundation.
+
+ If the Program specifies that a proxy can decide which future
+ versions of the GNU General Public License can be used, that
+ proxy's public statement of acceptance of a version permanently
+ authorizes you to choose that version for the Program.
+
+ Later license versions may give you additional or different
+ permissions. However, no additional obligations are imposed on any
+ author or copyright holder as a result of your choosing to follow a
+ later version.
+
+ 15. Disclaimer of Warranty.
+
+ THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
+ APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE
+ COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS"
+ WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED,
+ INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+ MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE
+ RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU.
+ SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL
+ NECESSARY SERVICING, REPAIR OR CORRECTION.
+
+ 16. Limitation of Liability.
+
+ IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN
+ WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES
+ AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU
+ FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR
+ CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE
+ THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA
+ BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
+ PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+ PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF
+ THE POSSIBILITY OF SUCH DAMAGES.
+
+ 17. Interpretation of Sections 15 and 16.
+
+ If the disclaimer of warranty and limitation of liability provided
+ above cannot be given local legal effect according to their terms,
+ reviewing courts shall apply local law that most closely
+ approximates an absolute waiver of all civil liability in
+ connection with the Program, unless a warranty or assumption of
+ liability accompanies a copy of the Program in return for a fee.
+
+
+END OF TERMS AND CONDITIONS
+===========================
+
+How to Apply These Terms to Your New Programs
+=============================================
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these
+terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+state the exclusion of warranty; and each file should have at least the
+"copyright" line and a pointer to where the full notice is found.
+
+ ONE LINE TO GIVE THE PROGRAM'S NAME AND A BRIEF IDEA OF WHAT IT DOES.
+ Copyright (C) YEAR NAME OF AUTHOR
+
+ This program is free software: you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation, either version 3 of the License, or (at
+ your option) any later version.
+
+ This program is distributed in the hope that it will be useful, but
+ WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program. If not, see `http://www.gnu.org/licenses/'.
+
+ Also add information on how to contact you by electronic and paper
+mail.
+
+ If the program does terminal interaction, make it output a short
+notice like this when it starts in an interactive mode:
+
+ PROGRAM Copyright (C) YEAR NAME OF AUTHOR
+ This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+ The hypothetical commands `show w' and `show c' should show the
+appropriate parts of the General Public License. Of course, your
+program's commands might be different; for a GUI interface, you would
+use an "about box".
+
+ You should also get your employer (if you work as a programmer) or
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. For more information on this, and how to apply and follow
+the GNU GPL, see `http://www.gnu.org/licenses/'.
+
+ The GNU General Public License does not permit incorporating your
+program into proprietary programs. If your program is a subroutine
+library, you may consider it more useful to permit linking proprietary
+applications with the library. If this is what you want to do, use the
+GNU Lesser General Public License instead of this License. But first,
+please read `http://www.gnu.org/philosophy/why-not-lgpl.html'.
+
+
+File: libunistring.info, Node: GNU LGPL, Next: GNU FDL, Prev: GNU GPL, Up: Licenses
+
+A.2 GNU LESSER GENERAL PUBLIC LICENSE
+=====================================
+
+ Version 3, 29 June 2007
+
+ Copyright (C) 2007 Free Software Foundation, Inc. `http://fsf.org/'
+
+ Everyone is permitted to copy and distribute verbatim copies of this
+ license document, but changing it is not allowed.
+
+ This version of the GNU Lesser General Public License incorporates
+the terms and conditions of version 3 of the GNU General Public
+License, supplemented by the additional permissions listed below.
+
+ 0. Additional Definitions.
+
+ As used herein, "this License" refers to version 3 of the GNU
+ Lesser General Public License, and the "GNU GPL" refers to version
+ 3 of the GNU General Public License.
+
+ "The Library" refers to a covered work governed by this License,
+ other than an Application or a Combined Work as defined below.
+
+ An "Application" is any work that makes use of an interface
+ provided by the Library, but which is not otherwise based on the
+ Library. Defining a subclass of a class defined by the Library is
+ deemed a mode of using an interface provided by the Library.
+
+ A "Combined Work" is a work produced by combining or linking an
+ Application with the Library. The particular version of the
+ Library with which the Combined Work was made is also called the
+ "Linked Version".
+
+ The "Minimal Corresponding Source" for a Combined Work means the
+ Corresponding Source for the Combined Work, excluding any source
+ code for portions of the Combined Work that, considered in
+ isolation, are based on the Application, and not on the Linked
+ Version.
+
+ The "Corresponding Application Code" for a Combined Work means the
+ object code and/or source code for the Application, including any
+ data and utility programs needed for reproducing the Combined Work
+ from the Application, but excluding the System Libraries of the
+ Combined Work.
+
+ 1. Exception to Section 3 of the GNU GPL.
+
+ You may convey a covered work under sections 3 and 4 of this
+ License without being bound by section 3 of the GNU GPL.
+
+ 2. Conveying Modified Versions.
+
+ If you modify a copy of the Library, and, in your modifications, a
+ facility refers to a function or data to be supplied by an
+ Application that uses the facility (other than as an argument
+ passed when the facility is invoked), then you may convey a copy
+ of the modified version:
+
+ a. under this License, provided that you make a good faith
+ effort to ensure that, in the event an Application does not
+ supply the function or data, the facility still operates, and
+ performs whatever part of its purpose remains meaningful, or
+
+ b. under the GNU GPL, with none of the additional permissions of
+ this License applicable to that copy.
+
+ 3. Object Code Incorporating Material from Library Header Files.
+
+ The object code form of an Application may incorporate material
+ from a header file that is part of the Library. You may convey
+ such object code under terms of your choice, provided that, if the
+ incorporated material is not limited to numerical parameters, data
+ structure layouts and accessors, or small macros, inline functions
+ and templates (ten or fewer lines in length), you do both of the
+ following:
+
+ a. Give prominent notice with each copy of the object code that
+ the Library is used in it and that the Library and its use are
+ covered by this License.
+
+ b. Accompany the object code with a copy of the GNU GPL and this
+ license document.
+
+ 4. Combined Works.
+
+ You may convey a Combined Work under terms of your choice that,
+ taken together, effectively do not restrict modification of the
+ portions of the Library contained in the Combined Work and reverse
+ engineering for debugging such modifications, if you also do each
+ of the following:
+
+ a. Give prominent notice with each copy of the Combined Work that
+ the Library is used in it and that the Library and its use are
+ covered by this License.
+
+ b. Accompany the Combined Work with a copy of the GNU GPL and
+ this license document.
+
+ c. For a Combined Work that displays copyright notices during
+ execution, include the copyright notice for the Library among
+ these notices, as well as a reference directing the user to
+ the copies of the GNU GPL and this license document.
+
+ d. Do one of the following:
+
+ 0. Convey the Minimal Corresponding Source under the terms
+ of this License, and the Corresponding Application Code
+ in a form suitable for, and under terms that permit, the
+ user to recombine or relink the Application with a
+ modified version of the Linked Version to produce a
+ modified Combined Work, in the manner specified by
+ section 6 of the GNU GPL for conveying Corresponding
+ Source.
+
+ 1. Use a suitable shared library mechanism for linking with
+ the Library. A suitable mechanism is one that (a) uses
+ at run time a copy of the Library already present on the
+ user's computer system, and (b) will operate properly
+ with a modified version of the Library that is
+ interface-compatible with the Linked Version.
+
+ e. Provide Installation Information, but only if you would
+ otherwise be required to provide such information under
+ section 6 of the GNU GPL, and only to the extent that such
+ information is necessary to install and execute a modified
+ version of the Combined Work produced by recombining or
+ relinking the Application with a modified version of the
+ Linked Version. (If you use option 4d0, the Installation
+ Information must accompany the Minimal Corresponding Source
+ and Corresponding Application Code. If you use option 4d1,
+ you must provide the Installation Information in the manner
+ specified by section 6 of the GNU GPL for conveying
+ Corresponding Source.)
+
+ 5. Combined Libraries.
+
+ You may place library facilities that are a work based on the
+ Library side by side in a single library together with other
+ library facilities that are not Applications and are not covered
+ by this License, and convey such a combined library under terms of
+ your choice, if you do both of the following:
+
+ a. Accompany the combined library with a copy of the same work
+ based on the Library, uncombined with any other library
+ facilities, conveyed under the terms of this License.
+
+ b. Give prominent notice with the combined library that part of
+ it is a work based on the Library, and explaining where to
+ find the accompanying uncombined form of the same work.
+
+ 6. Revised Versions of the GNU Lesser General Public License.
+
+ The Free Software Foundation may publish revised and/or new
+ versions of the GNU Lesser General Public License from time to
+ time. Such new versions will be similar in spirit to the present
+ version, but may differ in detail to address new problems or
+ concerns.
+
+ Each version is given a distinguishing version number. If the
+ Library as you received it specifies that a certain numbered
+ version of the GNU Lesser General Public License "or any later
+ version" applies to it, you have the option of following the terms
+ and conditions either of that published version or of any later
+ version published by the Free Software Foundation. If the Library
+ as you received it does not specify a version number of the GNU
+ Lesser General Public License, you may choose any version of the
+ GNU Lesser General Public License ever published by the Free
+ Software Foundation.
+
+ If the Library as you received it specifies that a proxy can decide
+ whether future versions of the GNU Lesser General Public License
+ shall apply, that proxy's public statement of acceptance of any
+ version is permanent authorization for you to choose that version
+ for the Library.
+
+
+
+File: libunistring.info, Node: GNU FDL, Prev: GNU LGPL, Up: Licenses
+
+A.3 GNU Free Documentation License
+==================================
+
+ Version 1.3, 3 November 2008
+
+ Copyright (C) 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
+ `http://fsf.org/'
+
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ 0. PREAMBLE
+
+ The purpose of this License is to make a manual, textbook, or other
+ functional and useful document "free" in the sense of freedom: to
+ assure everyone the effective freedom to copy and redistribute it,
+ with or without modifying it, either commercially or
+ noncommercially. Secondarily, this License preserves for the
+ author and publisher a way to get credit for their work, while not
+ being considered responsible for modifications made by others.
+
+ This License is a kind of "copyleft", which means that derivative
+ works of the document must themselves be free in the same sense.
+ It complements the GNU General Public License, which is a copyleft
+ license designed for free software.
+
+ We have designed this License in order to use it for manuals for
+ free software, because free software needs free documentation: a
+ free program should come with manuals providing the same freedoms
+ that the software does. But this License is not limited to
+ software manuals; it can be used for any textual work, regardless
+ of subject matter or whether it is published as a printed book.
+ We recommend this License principally for works whose purpose is
+ instruction or reference.
+
+ 1. APPLICABILITY AND DEFINITIONS
+
+ This License applies to any manual or other work, in any medium,
+ that contains a notice placed by the copyright holder saying it
+ can be distributed under the terms of this License. Such a notice
+ grants a world-wide, royalty-free license, unlimited in duration,
+ to use that work under the conditions stated herein. The
+ "Document", below, refers to any such manual or work. Any member
+ of the public is a licensee, and is addressed as "you". You
+ accept the license if you copy, modify or distribute the work in a
+ way requiring permission under copyright law.
+
+ A "Modified Version" of the Document means any work containing the
+ Document or a portion of it, either copied verbatim, or with
+ modifications and/or translated into another language.
+
+ A "Secondary Section" is a named appendix or a front-matter section
+ of the Document that deals exclusively with the relationship of the
+ publishers or authors of the Document to the Document's overall
+ subject (or to related matters) and contains nothing that could
+ fall directly within that overall subject. (Thus, if the Document
+ is in part a textbook of mathematics, a Secondary Section may not
+ explain any mathematics.) The relationship could be a matter of
+ historical connection with the subject or with related matters, or
+ of legal, commercial, philosophical, ethical or political position
+ regarding them.
+
+ The "Invariant Sections" are certain Secondary Sections whose
+ titles are designated, as being those of Invariant Sections, in
+ the notice that says that the Document is released under this
+ License. If a section does not fit the above definition of
+ Secondary then it is not allowed to be designated as Invariant.
+ The Document may contain zero Invariant Sections. If the Document
+ does not identify any Invariant Sections then there are none.
+
+ The "Cover Texts" are certain short passages of text that are
+ listed, as Front-Cover Texts or Back-Cover Texts, in the notice
+ that says that the Document is released under this License. A
+ Front-Cover Text may be at most 5 words, and a Back-Cover Text may
+ be at most 25 words.
+
+ A "Transparent" copy of the Document means a machine-readable copy,
+ represented in a format whose specification is available to the
+ general public, that is suitable for revising the document
+ straightforwardly with generic text editors or (for images
+ composed of pixels) generic paint programs or (for drawings) some
+ widely available drawing editor, and that is suitable for input to
+ text formatters or for automatic translation to a variety of
+ formats suitable for input to text formatters. A copy made in an
+ otherwise Transparent file format whose markup, or absence of
+ markup, has been arranged to thwart or discourage subsequent
+ modification by readers is not Transparent. An image format is
+ not Transparent if used for any substantial amount of text. A
+ copy that is not "Transparent" is called "Opaque".
+
+ Examples of suitable formats for Transparent copies include plain
+ ASCII without markup, Texinfo input format, LaTeX input format,
+ SGML or XML using a publicly available DTD, and
+ standard-conforming simple HTML, PostScript or PDF designed for
+ human modification. Examples of transparent image formats include
+ PNG, XCF and JPG. Opaque formats include proprietary formats that
+ can be read and edited only by proprietary word processors, SGML or
+ XML for which the DTD and/or processing tools are not generally
+ available, and the machine-generated HTML, PostScript or PDF
+ produced by some word processors for output purposes only.
+
+ The "Title Page" means, for a printed book, the title page itself,
+ plus such following pages as are needed to hold, legibly, the
+ material this License requires to appear in the title page. For
+ works in formats which do not have any title page as such, "Title
+ Page" means the text near the most prominent appearance of the
+ work's title, preceding the beginning of the body of the text.
+
+ The "publisher" means any person or entity that distributes copies
+ of the Document to the public.
+
+ A section "Entitled XYZ" means a named subunit of the Document
+ whose title either is precisely XYZ or contains XYZ in parentheses
+ following text that translates XYZ in another language. (Here XYZ
+ stands for a specific section name mentioned below, such as
+ "Acknowledgements", "Dedications", "Endorsements", or "History".)
+ To "Preserve the Title" of such a section when you modify the
+ Document means that it remains a section "Entitled XYZ" according
+ to this definition.
+
+ The Document may include Warranty Disclaimers next to the notice
+ which states that this License applies to the Document. These
+ Warranty Disclaimers are considered to be included by reference in
+ this License, but only as regards disclaiming warranties: any other
+ implication that these Warranty Disclaimers may have is void and
+ has no effect on the meaning of this License.
+
+ 2. VERBATIM COPYING
+
+ You may copy and distribute the Document in any medium, either
+ commercially or noncommercially, provided that this License, the
+ copyright notices, and the license notice saying this License
+ applies to the Document are reproduced in all copies, and that you
+ add no other conditions whatsoever to those of this License. You
+ may not use technical measures to obstruct or control the reading
+ or further copying of the copies you make or distribute. However,
+ you may accept compensation in exchange for copies. If you
+ distribute a large enough number of copies you must also follow
+ the conditions in section 3.
+
+ You may also lend copies, under the same conditions stated above,
+ and you may publicly display copies.
+
+ 3. COPYING IN QUANTITY
+
+ If you publish printed copies (or copies in media that commonly
+ have printed covers) of the Document, numbering more than 100, and
+ the Document's license notice requires Cover Texts, you must
+ enclose the copies in covers that carry, clearly and legibly, all
+ these Cover Texts: Front-Cover Texts on the front cover, and
+ Back-Cover Texts on the back cover. Both covers must also clearly
+ and legibly identify you as the publisher of these copies. The
+ front cover must present the full title with all words of the
+ title equally prominent and visible. You may add other material
+ on the covers in addition. Copying with changes limited to the
+ covers, as long as they preserve the title of the Document and
+ satisfy these conditions, can be treated as verbatim copying in
+ other respects.
+
+ If the required texts for either cover are too voluminous to fit
+ legibly, you should put the first ones listed (as many as fit
+ reasonably) on the actual cover, and continue the rest onto
+ adjacent pages.
+
+ If you publish or distribute Opaque copies of the Document
+ numbering more than 100, you must either include a
+ machine-readable Transparent copy along with each Opaque copy, or
+ state in or with each Opaque copy a computer-network location from
+ which the general network-using public has access to download
+ using public-standard network protocols a complete Transparent
+ copy of the Document, free of added material. If you use the
+ latter option, you must take reasonably prudent steps, when you
+ begin distribution of Opaque copies in quantity, to ensure that
+ this Transparent copy will remain thus accessible at the stated
+ location until at least one year after the last time you
+ distribute an Opaque copy (directly or through your agents or
+ retailers) of that edition to the public.
+
+ It is requested, but not required, that you contact the authors of
+ the Document well before redistributing any large number of
+ copies, to give them a chance to provide you with an updated
+ version of the Document.
+
+ 4. MODIFICATIONS
+
+ You may copy and distribute a Modified Version of the Document
+ under the conditions of sections 2 and 3 above, provided that you
+ release the Modified Version under precisely this License, with
+ the Modified Version filling the role of the Document, thus
+ licensing distribution and modification of the Modified Version to
+ whoever possesses a copy of it. In addition, you must do these
+ things in the Modified Version:
+
+ A. Use in the Title Page (and on the covers, if any) a title
+ distinct from that of the Document, and from those of
+ previous versions (which should, if there were any, be listed
+ in the History section of the Document). You may use the
+ same title as a previous version if the original publisher of
+ that version gives permission.
+
+ B. List on the Title Page, as authors, one or more persons or
+ entities responsible for authorship of the modifications in
+ the Modified Version, together with at least five of the
+ principal authors of the Document (all of its principal
+ authors, if it has fewer than five), unless they release you
+ from this requirement.
+
+ C. State on the Title page the name of the publisher of the
+ Modified Version, as the publisher.
+
+ D. Preserve all the copyright notices of the Document.
+
+ E. Add an appropriate copyright notice for your modifications
+ adjacent to the other copyright notices.
+
+ F. Include, immediately after the copyright notices, a license
+ notice giving the public permission to use the Modified
+ Version under the terms of this License, in the form shown in
+ the Addendum below.
+
+ G. Preserve in that license notice the full lists of Invariant
+ Sections and required Cover Texts given in the Document's
+ license notice.
+
+ H. Include an unaltered copy of this License.
+
+ I. Preserve the section Entitled "History", Preserve its Title,
+ and add to it an item stating at least the title, year, new
+ authors, and publisher of the Modified Version as given on
+ the Title Page. If there is no section Entitled "History" in
+ the Document, create one stating the title, year, authors,
+ and publisher of the Document as given on its Title Page,
+ then add an item describing the Modified Version as stated in
+ the previous sentence.
+
+ J. Preserve the network location, if any, given in the Document
+ for public access to a Transparent copy of the Document, and
+ likewise the network locations given in the Document for
+ previous versions it was based on. These may be placed in
+ the "History" section. You may omit a network location for a
+ work that was published at least four years before the
+ Document itself, or if the original publisher of the version
+ it refers to gives permission.
+
+ K. For any section Entitled "Acknowledgements" or "Dedications",
+ Preserve the Title of the section, and preserve in the
+ section all the substance and tone of each of the contributor
+ acknowledgements and/or dedications given therein.
+
+ L. Preserve all the Invariant Sections of the Document,
+ unaltered in their text and in their titles. Section numbers
+ or the equivalent are not considered part of the section
+ titles.
+
+ M. Delete any section Entitled "Endorsements". Such a section
+ may not be included in the Modified Version.
+
+ N. Do not retitle any existing section to be Entitled
+ "Endorsements" or to conflict in title with any Invariant
+ Section.
+
+ O. Preserve any Warranty Disclaimers.
+
+ If the Modified Version includes new front-matter sections or
+ appendices that qualify as Secondary Sections and contain no
+ material copied from the Document, you may at your option
+ designate some or all of these sections as invariant. To do this,
+ add their titles to the list of Invariant Sections in the Modified
+ Version's license notice. These titles must be distinct from any
+ other section titles.
+
+ You may add a section Entitled "Endorsements", provided it contains
+ nothing but endorsements of your Modified Version by various
+ parties--for example, statements of peer review or that the text
+ has been approved by an organization as the authoritative
+ definition of a standard.
+
+ You may add a passage of up to five words as a Front-Cover Text,
+ and a passage of up to 25 words as a Back-Cover Text, to the end
+ of the list of Cover Texts in the Modified Version. Only one
+ passage of Front-Cover Text and one of Back-Cover Text may be
+ added by (or through arrangements made by) any one entity. If the
+ Document already includes a cover text for the same cover,
+ previously added by you or by arrangement made by the same entity
+ you are acting on behalf of, you may not add another; but you may
+ replace the old one, on explicit permission from the previous
+ publisher that added the old one.
+
+ The author(s) and publisher(s) of the Document do not by this
+ License give permission to use their names for publicity for or to
+ assert or imply endorsement of any Modified Version.
+
+ 5. COMBINING DOCUMENTS
+
+ You may combine the Document with other documents released under
+ this License, under the terms defined in section 4 above for
+ modified versions, provided that you include in the combination
+ all of the Invariant Sections of all of the original documents,
+ unmodified, and list them all as Invariant Sections of your
+ combined work in its license notice, and that you preserve all
+ their Warranty Disclaimers.
+
+ The combined work need only contain one copy of this License, and
+ multiple identical Invariant Sections may be replaced with a single
+ copy. If there are multiple Invariant Sections with the same name
+ but different contents, make the title of each such section unique
+ by adding at the end of it, in parentheses, the name of the
+ original author or publisher of that section if known, or else a
+ unique number. Make the same adjustment to the section titles in
+ the list of Invariant Sections in the license notice of the
+ combined work.
+
+ In the combination, you must combine any sections Entitled
+ "History" in the various original documents, forming one section
+ Entitled "History"; likewise combine any sections Entitled
+ "Acknowledgements", and any sections Entitled "Dedications". You
+ must delete all sections Entitled "Endorsements."
+
+ 6. COLLECTIONS OF DOCUMENTS
+
+ You may make a collection consisting of the Document and other
+ documents released under this License, and replace the individual
+ copies of this License in the various documents with a single copy
+ that is included in the collection, provided that you follow the
+ rules of this License for verbatim copying of each of the
+ documents in all other respects.
+
+ You may extract a single document from such a collection, and
+ distribute it individually under this License, provided you insert
+ a copy of this License into the extracted document, and follow
+ this License in all other respects regarding verbatim copying of
+ that document.
+
+ 7. AGGREGATION WITH INDEPENDENT WORKS
+
+ A compilation of the Document or its derivatives with other
+ separate and independent documents or works, in or on a volume of
+ a storage or distribution medium, is called an "aggregate" if the
+ copyright resulting from the compilation is not used to limit the
+ legal rights of the compilation's users beyond what the individual
+ works permit. When the Document is included in an aggregate, this
+ License does not apply to the other works in the aggregate which
+ are not themselves derivative works of the Document.
+
+ If the Cover Text requirement of section 3 is applicable to these
+ copies of the Document, then if the Document is less than one half
+ of the entire aggregate, the Document's Cover Texts may be placed
+ on covers that bracket the Document within the aggregate, or the
+ electronic equivalent of covers if the Document is in electronic
+ form. Otherwise they must appear on printed covers that bracket
+ the whole aggregate.
+
+ 8. TRANSLATION
+
+ Translation is considered a kind of modification, so you may
+ distribute translations of the Document under the terms of section
+ 4. Replacing Invariant Sections with translations requires special
+ permission from their copyright holders, but you may include
+ translations of some or all Invariant Sections in addition to the
+ original versions of these Invariant Sections. You may include a
+ translation of this License, and all the license notices in the
+ Document, and any Warranty Disclaimers, provided that you also
+ include the original English version of this License and the
+ original versions of those notices and disclaimers. In case of a
+ disagreement between the translation and the original version of
+ this License or a notice or disclaimer, the original version will
+ prevail.
+
+ If a section in the Document is Entitled "Acknowledgements",
+ "Dedications", or "History", the requirement (section 4) to
+ Preserve its Title (section 1) will typically require changing the
+ actual title.
+
+ 9. TERMINATION
+
+ You may not copy, modify, sublicense, or distribute the Document
+ except as expressly provided under this License. Any attempt
+ otherwise to copy, modify, sublicense, or distribute it is void,
+ and will automatically terminate your rights under this License.
+
+ However, if you cease all violation of this License, then your
+ license from a particular copyright holder is reinstated (a)
+ provisionally, unless and until the copyright holder explicitly
+ and finally terminates your license, and (b) permanently, if the
+ copyright holder fails to notify you of the violation by some
+ reasonable means prior to 60 days after the cessation.
+
+ Moreover, your license from a particular copyright holder is
+ reinstated permanently if the copyright holder notifies you of the
+ violation by some reasonable means, this is the first time you have
+ received notice of violation of this License (for any work) from
+ that copyright holder, and you cure the violation prior to 30 days
+ after your receipt of the notice.
+
+ Termination of your rights under this section does not terminate
+ the licenses of parties who have received copies or rights from
+ you under this License. If your rights have been terminated and
+ not permanently reinstated, receipt of a copy of some or all of
+ the same material does not give you any rights to use it.
+
+ 10. FUTURE REVISIONS OF THIS LICENSE
+
+ The Free Software Foundation may publish new, revised versions of
+ the GNU Free Documentation License from time to time. Such new
+ versions will be similar in spirit to the present version, but may
+ differ in detail to address new problems or concerns. See
+ `http://www.gnu.org/copyleft/'.
+
+ Each version of the License is given a distinguishing version
+ number. If the Document specifies that a particular numbered
+ version of this License "or any later version" applies to it, you
+ have the option of following the terms and conditions either of
+ that specified version or of any later version that has been
+ published (not as a draft) by the Free Software Foundation. If
+ the Document does not specify a version number of this License,
+ you may choose any version ever published (not as a draft) by the
+ Free Software Foundation. If the Document specifies that a proxy
+ can decide which future versions of this License can be used, that
+ proxy's public statement of acceptance of a version permanently
+ authorizes you to choose that version for the Document.
+
+ 11. RELICENSING
+
+ "Massive Multiauthor Collaboration Site" (or "MMC Site") means any
+ World Wide Web server that publishes copyrightable works and also
+ provides prominent facilities for anybody to edit those works. A
+ public wiki that anybody can edit is an example of such a server.
+ A "Massive Multiauthor Collaboration" (or "MMC") contained in the
+ site means any set of copyrightable works thus published on the MMC
+ site.
+
+ "CC-BY-SA" means the Creative Commons Attribution-Share Alike 3.0
+ license published by Creative Commons Corporation, a not-for-profit
+ corporation with a principal place of business in San Francisco,
+ California, as well as future copyleft versions of that license
+ published by that same organization.
+
+ "Incorporate" means to publish or republish a Document, in whole or
+ in part, as part of another Document.
+
+ An MMC is "eligible for relicensing" if it is licensed under this
+ License, and if all works that were first published under this
+ License somewhere other than this MMC, and subsequently
+ incorporated in whole or in part into the MMC, (1) had no cover
+ texts or invariant sections, and (2) were thus incorporated prior
+ to November 1, 2008.
+
+ The operator of an MMC Site may republish an MMC contained in the
+ site under CC-BY-SA on the same site at any time before August 1,
+ 2009, provided the MMC is eligible for relicensing.
+
+
+ADDENDUM: How to use this License for your documents
+====================================================
+
+ To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and license
+notices just after the title page:
+
+ Copyright (C) YEAR YOUR NAME.
+ Permission is granted to copy, distribute and/or modify this document
+ under the terms of the GNU Free Documentation License, Version 1.3
+ or any later version published by the Free Software Foundation;
+ with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+ Texts. A copy of the license is included in the section entitled ``GNU
+ Free Documentation License''.
+
+ If you have Invariant Sections, Front-Cover Texts and Back-Cover
+Texts, replace the "with...Texts." line with this:
+
+ with the Invariant Sections being LIST THEIR TITLES, with
+ the Front-Cover Texts being LIST, and with the Back-Cover Texts
+ being LIST.
+
+ If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+
+ If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of
+free software license, such as the GNU General Public License, to
+permit their use in free software.
+
+
+File: libunistring.info, Node: Index, Prev: Licenses, Up: Top
+
+Index
+*****
+
+
+* Menu:
+
+* ambiguous width: uniwidth.h. (line 10)
+* argument conventions: Conventions. (line 9)
+* autoconf macro: Autoconf macro. (line 6)
+* bidirectional category: Bidirectional category.
+ (line 6)
+* bidirectional reordering: More functionality. (line 6)
+* block: Blocks. (line 6)
+* breaks, line: unilbrk.h. (line 6)
+* breaks, word: uniwbrk.h. (line 6)
+* bug reports: Reporting problems. (line 6)
+* bug tracker: Reporting problems. (line 6)
+* C string functions: char * strings. (line 6)
+* C, programming language: ISO C and Java syntax.
+ (line 6)
+* C-like API: Classifications like in ISO C.
+ (line 6)
+* canonical combining class: Canonical combining class.
+ (line 6)
+* case detection: Case detection. (line 6)
+* case mappings: Case mappings of strings.
+ (line 6)
+* casing_prefix_context_t: Case mappings of substrings.
+ (line 15)
+* casing_suffix_context_t: Case mappings of substrings.
+ (line 46)
+* char, type: char * strings. (line 23)
+* combining, Unicode characters: Composition of characters.
+ (line 6)
+* comparing <1>: Elementary string functions on NUL terminated strings.
+ (line 128)
+* comparing: Elementary string functions.
+ (line 108)
+* comparing, ignoring case: Case insensitive comparison.
+ (line 6)
+* comparing, ignoring case, with collation rules: Case insensitive comparison.
+ (line 66)
+* comparing, ignoring normalization: Normalizing comparisons.
+ (line 6)
+* comparing, ignoring normalization and case: Case insensitive comparison.
+ (line 6)
+* comparing, ignoring normalization and case, with collation rules: Case insensitive comparison.
+ (line 66)
+* comparing, ignoring normalization, with collation rules: Normalizing comparisons.
+ (line 23)
+* comparing, with collation rules: Elementary string functions on NUL terminated strings.
+ (line 140)
+* comparing, with collation rules, ignoring case: Case insensitive comparison.
+ (line 66)
+* comparing, with collation rules, ignoring normalization: Normalizing comparisons.
+ (line 23)
+* comparing, with collation rules, ignoring normalization and case: Case insensitive comparison.
+ (line 66)
+* compiler options: Compiler options. (line 24)
+* composing, Unicode characters: Composition of characters.
+ (line 6)
+* converting <1>: uniconv.h. (line 45)
+* converting: Elementary string conversions.
+ (line 6)
+* copying <1>: Elementary string functions on NUL terminated strings.
+ (line 61)
+* copying: Elementary string functions.
+ (line 72)
+* counting: Elementary string functions.
+ (line 153)
+* decomposing: Decomposition of characters.
+ (line 6)
+* dependencies: Installation. (line 6)
+* detecting case: Case detection. (line 6)
+* duplicating <1>: Elementary string functions on NUL terminated strings.
+ (line 166)
+* duplicating: Elementary string functions with memory allocation.
+ (line 6)
+* enum iconv_ilseq_handler: uniconv.h. (line 30)
+* FDL, GNU Free Documentation License: GNU FDL. (line 6)
+* formatted output: unistdio.h. (line 6)
+* fullwidth: uniwidth.h. (line 22)
+* general category: General category. (line 6)
+* gl_LIBUNISTRING: Autoconf macro. (line 11)
+* GPL, GNU General Public License: GNU GPL. (line 6)
+* halfwidth: uniwidth.h. (line 22)
+* identifiers: ISO C and Java syntax.
+ (line 6)
+* installation: Installation. (line 10)
+* internationalization: Unicode and i18n. (line 6)
+* iterating <1>: Elementary string functions on NUL terminated strings.
+ (line 15)
+* iterating: Elementary string functions.
+ (line 6)
+* Java, programming language: ISO C and Java syntax.
+ (line 6)
+* LGPL, GNU Lesser General Public License: GNU LGPL. (line 6)
+* License, GNU FDL: GNU FDL. (line 6)
+* License, GNU GPL: GNU GPL. (line 6)
+* License, GNU LGPL: GNU LGPL. (line 6)
+* Licenses: Licenses. (line 6)
+* line breaks: unilbrk.h. (line 6)
+* locale: Locale encodings. (line 6)
+* locale categories: Locale encodings. (line 10)
+* locale encoding <1>: uniconv.h. (line 10)
+* locale encoding: Locale encodings. (line 28)
+* locale language: Case mappings of strings.
+ (line 16)
+* locale, multibyte: char * strings. (line 13)
+* locale_charset: uniconv.h. (line 13)
+* lowercasing: Case mappings of strings.
+ (line 6)
+* mailing list: Reporting problems. (line 6)
+* mirroring, of Unicode character: Mirrored character. (line 6)
+* normal forms: uninorm.h. (line 6)
+* normalizing: uninorm.h. (line 6)
+* output, formatted: unistdio.h. (line 6)
+* properties, of Unicode character: Properties. (line 6)
+* regular expression: uniregex.h. (line 6)
+* rendering: More functionality. (line 9)
+* return value conventions: Conventions. (line 47)
+* scripts: Scripts. (line 6)
+* searching, for a character <1>: Elementary string functions on NUL terminated strings.
+ (line 176)
+* searching, for a character: Elementary string functions.
+ (line 140)
+* searching, for a substring: Elementary string functions on NUL terminated strings.
+ (line 232)
+* stream, normalizing a: Normalization of streams.
+ (line 6)
+* struct uninorm_filter: Normalization of streams.
+ (line 11)
+* titlecasing: Case mappings of strings.
+ (line 6)
+* u16_asnprintf: unistdio.h. (line 132)
+* u16_asprintf: unistdio.h. (line 129)
+* u16_casecmp: Case insensitive comparison.
+ (line 51)
+* u16_casecoll: Case insensitive comparison.
+ (line 95)
+* u16_casefold: Case insensitive comparison.
+ (line 15)
+* u16_casexfrm: Case insensitive comparison.
+ (line 75)
+* u16_casing_prefix_context: Case mappings of substrings.
+ (line 30)
+* u16_casing_prefixes_context: Case mappings of substrings.
+ (line 39)
+* u16_casing_suffix_context: Case mappings of substrings.
+ (line 61)
+* u16_casing_suffixes_context: Case mappings of substrings.
+ (line 70)
+* u16_check: Elementary string checks.
+ (line 11)
+* u16_chr: Elementary string functions.
+ (line 145)
+* u16_cmp: Elementary string functions.
+ (line 115)
+* u16_cmp2: Elementary string functions.
+ (line 131)
+* u16_conv_from_encoding: uniconv.h. (line 54)
+* u16_conv_to_encoding: uniconv.h. (line 91)
+* u16_cpy: Elementary string functions.
+ (line 78)
+* u16_cpy_alloc: Elementary string functions with memory allocation.
+ (line 10)
+* u16_ct_casefold: Case insensitive comparison.
+ (line 37)
+* u16_ct_tolower: Case mappings of substrings.
+ (line 107)
+* u16_ct_totitle: Case mappings of substrings.
+ (line 125)
+* u16_ct_toupper: Case mappings of substrings.
+ (line 89)
+* u16_endswith: Elementary string functions on NUL terminated strings.
+ (line 258)
+* u16_is_cased: Case detection. (line 57)
+* u16_is_casefolded: Case detection. (line 44)
+* u16_is_lowercase: Case detection. (line 24)
+* u16_is_titlecase: Case detection. (line 34)
+* u16_is_uppercase: Case detection. (line 14)
+* u16_mblen: Elementary string functions.
+ (line 11)
+* u16_mbsnlen: Elementary string functions.
+ (line 157)
+* u16_mbtouc: Elementary string functions.
+ (line 38)
+* u16_mbtouc_unsafe: Elementary string functions.
+ (line 23)
+* u16_mbtoucr: Elementary string functions.
+ (line 45)
+* u16_move: Elementary string functions.
+ (line 89)
+* u16_next: Elementary string functions on NUL terminated strings.
+ (line 24)
+* u16_normalize: Normalization of strings.
+ (line 50)
+* u16_normcmp: Normalizing comparisons.
+ (line 13)
+* u16_normcoll: Normalizing comparisons.
+ (line 40)
+* u16_normxfrm: Normalizing comparisons.
+ (line 27)
+* u16_possible_linebreaks: unilbrk.h. (line 46)
+* u16_prev: Elementary string functions on NUL terminated strings.
+ (line 36)
+* u16_set: Elementary string functions.
+ (line 101)
+* u16_snprintf: unistdio.h. (line 126)
+* u16_sprintf: unistdio.h. (line 123)
+* u16_startswith: Elementary string functions on NUL terminated strings.
+ (line 250)
+* u16_stpcpy: Elementary string functions on NUL terminated strings.
+ (line 76)
+* u16_stpncpy: Elementary string functions on NUL terminated strings.
+ (line 99)
+* u16_strcat: Elementary string functions on NUL terminated strings.
+ (line 110)
+* u16_strchr: Elementary string functions on NUL terminated strings.
+ (line 180)
+* u16_strcmp: Elementary string functions on NUL terminated strings.
+ (line 132)
+* u16_strcoll: Elementary string functions on NUL terminated strings.
+ (line 142)
+* u16_strconv_from_encoding: uniconv.h. (line 129)
+* u16_strconv_from_locale: uniconv.h. (line 157)
+* u16_strconv_to_encoding: uniconv.h. (line 142)
+* u16_strconv_to_locale: uniconv.h. (line 167)
+* u16_strcpy: Elementary string functions on NUL terminated strings.
+ (line 66)
+* u16_strcspn: Elementary string functions on NUL terminated strings.
+ (line 201)
+* u16_strdup: Elementary string functions on NUL terminated strings.
+ (line 170)
+* u16_strlen: Elementary string functions on NUL terminated strings.
+ (line 47)
+* u16_strmblen: Elementary string functions on NUL terminated strings.
+ (line 11)
+* u16_strmbtouc: Elementary string functions on NUL terminated strings.
+ (line 17)
+* u16_strncat: Elementary string functions on NUL terminated strings.
+ (line 121)
+* u16_strncmp: Elementary string functions on NUL terminated strings.
+ (line 159)
+* u16_strncpy: Elementary string functions on NUL terminated strings.
+ (line 88)
+* u16_strnlen: Elementary string functions on NUL terminated strings.
+ (line 55)
+* u16_strpbrk: Elementary string functions on NUL terminated strings.
+ (line 225)
+* u16_strrchr: Elementary string functions on NUL terminated strings.
+ (line 188)
+* u16_strspn: Elementary string functions on NUL terminated strings.
+ (line 213)
+* u16_strstr: Elementary string functions on NUL terminated strings.
+ (line 239)
+* u16_strtok: Elementary string functions on NUL terminated strings.
+ (line 268)
+* u16_strwidth: uniwidth.h. (line 39)
+* u16_to_u32: Elementary string conversions.
+ (line 23)
+* u16_to_u8: Elementary string conversions.
+ (line 19)
+* u16_tolower: Case mappings of strings.
+ (line 44)
+* u16_totitle: Case mappings of strings.
+ (line 58)
+* u16_toupper: Case mappings of strings.
+ (line 30)
+* u16_u16_asnprintf: unistdio.h. (line 159)
+* u16_u16_asprintf: unistdio.h. (line 156)
+* u16_u16_snprintf: unistdio.h. (line 153)
+* u16_u16_sprintf: unistdio.h. (line 150)
+* u16_u16_vasnprintf: unistdio.h. (line 171)
+* u16_u16_vasprintf: unistdio.h. (line 168)
+* u16_u16_vsnprintf: unistdio.h. (line 165)
+* u16_u16_vsprintf: unistdio.h. (line 162)
+* u16_uctomb: Elementary string functions.
+ (line 62)
+* u16_vasnprintf: unistdio.h. (line 144)
+* u16_vasprintf: unistdio.h. (line 141)
+* u16_vsnprintf: unistdio.h. (line 138)
+* u16_vsprintf: unistdio.h. (line 135)
+* u16_width: uniwidth.h. (line 31)
+* u16_width_linebreaks: unilbrk.h. (line 65)
+* u16_wordbreaks: Word breaks in a string.
+ (line 10)
+* u32_asnprintf: unistdio.h. (line 185)
+* u32_asprintf: unistdio.h. (line 182)
+* u32_casecmp: Case insensitive comparison.
+ (line 54)
+* u32_casecoll: Case insensitive comparison.
+ (line 98)
+* u32_casefold: Case insensitive comparison.
+ (line 18)
+* u32_casexfrm: Case insensitive comparison.
+ (line 78)
+* u32_casing_prefix_context: Case mappings of substrings.
+ (line 32)
+* u32_casing_prefixes_context: Case mappings of substrings.
+ (line 42)
+* u32_casing_suffix_context: Case mappings of substrings.
+ (line 63)
+* u32_casing_suffixes_context: Case mappings of substrings.
+ (line 73)
+* u32_check: Elementary string checks.
+ (line 12)
+* u32_chr: Elementary string functions.
+ (line 147)
+* u32_cmp: Elementary string functions.
+ (line 117)
+* u32_cmp2: Elementary string functions.
+ (line 133)
+* u32_conv_from_encoding: uniconv.h. (line 57)
+* u32_conv_to_encoding: uniconv.h. (line 94)
+* u32_cpy: Elementary string functions.
+ (line 80)
+* u32_cpy_alloc: Elementary string functions with memory allocation.
+ (line 11)
+* u32_ct_casefold: Case insensitive comparison.
+ (line 42)
+* u32_ct_tolower: Case mappings of substrings.
+ (line 112)
+* u32_ct_totitle: Case mappings of substrings.
+ (line 130)
+* u32_ct_toupper: Case mappings of substrings.
+ (line 94)
+* u32_endswith: Elementary string functions on NUL terminated strings.
+ (line 260)
+* u32_is_cased: Case detection. (line 59)
+* u32_is_casefolded: Case detection. (line 46)
+* u32_is_lowercase: Case detection. (line 26)
+* u32_is_titlecase: Case detection. (line 36)
+* u32_is_uppercase: Case detection. (line 16)
+* u32_mblen: Elementary string functions.
+ (line 12)
+* u32_mbsnlen: Elementary string functions.
+ (line 158)
+* u32_mbtouc: Elementary string functions.
+ (line 39)
+* u32_mbtouc_unsafe: Elementary string functions.
+ (line 25)
+* u32_mbtoucr: Elementary string functions.
+ (line 46)
+* u32_move: Elementary string functions.
+ (line 91)
+* u32_next: Elementary string functions on NUL terminated strings.
+ (line 25)
+* u32_normalize: Normalization of strings.
+ (line 52)
+* u32_normcmp: Normalizing comparisons.
+ (line 15)
+* u32_normcoll: Normalizing comparisons.
+ (line 42)
+* u32_normxfrm: Normalizing comparisons.
+ (line 29)
+* u32_possible_linebreaks: unilbrk.h. (line 48)
+* u32_prev: Elementary string functions on NUL terminated strings.
+ (line 38)
+* u32_set: Elementary string functions.
+ (line 102)
+* u32_snprintf: unistdio.h. (line 179)
+* u32_sprintf: unistdio.h. (line 176)
+* u32_startswith: Elementary string functions on NUL terminated strings.
+ (line 252)
+* u32_stpcpy: Elementary string functions on NUL terminated strings.
+ (line 78)
+* u32_stpncpy: Elementary string functions on NUL terminated strings.
+ (line 101)
+* u32_strcat: Elementary string functions on NUL terminated strings.
+ (line 112)
+* u32_strchr: Elementary string functions on NUL terminated strings.
+ (line 181)
+* u32_strcmp: Elementary string functions on NUL terminated strings.
+ (line 133)
+* u32_strcoll: Elementary string functions on NUL terminated strings.
+ (line 143)
+* u32_strconv_from_encoding: uniconv.h. (line 131)
+* u32_strconv_from_locale: uniconv.h. (line 158)
+* u32_strconv_to_encoding: uniconv.h. (line 144)
+* u32_strconv_to_locale: uniconv.h. (line 168)
+* u32_strcpy: Elementary string functions on NUL terminated strings.
+ (line 68)
+* u32_strcspn: Elementary string functions on NUL terminated strings.
+ (line 203)
+* u32_strdup: Elementary string functions on NUL terminated strings.
+ (line 171)
+* u32_strlen: Elementary string functions on NUL terminated strings.
+ (line 48)
+* u32_strmblen: Elementary string functions on NUL terminated strings.
+ (line 12)
+* u32_strmbtouc: Elementary string functions on NUL terminated strings.
+ (line 18)
+* u32_strncat: Elementary string functions on NUL terminated strings.
+ (line 123)
+* u32_strncmp: Elementary string functions on NUL terminated strings.
+ (line 161)
+* u32_strncpy: Elementary string functions on NUL terminated strings.
+ (line 90)
+* u32_strnlen: Elementary string functions on NUL terminated strings.
+ (line 56)
+* u32_strpbrk: Elementary string functions on NUL terminated strings.
+ (line 227)
+* u32_strrchr: Elementary string functions on NUL terminated strings.
+ (line 189)
+* u32_strspn: Elementary string functions on NUL terminated strings.
+ (line 215)
+* u32_strstr: Elementary string functions on NUL terminated strings.
+ (line 241)
+* u32_strtok: Elementary string functions on NUL terminated strings.
+ (line 270)
+* u32_strwidth: uniwidth.h. (line 40)
+* u32_to_u16: Elementary string conversions.
+ (line 31)
+* u32_to_u8: Elementary string conversions.
+ (line 27)
+* u32_tolower: Case mappings of strings.
+ (line 47)
+* u32_totitle: Case mappings of strings.
+ (line 61)
+* u32_toupper: Case mappings of strings.
+ (line 33)
+* u32_u32_asnprintf: unistdio.h. (line 212)
+* u32_u32_asprintf: unistdio.h. (line 209)
+* u32_u32_snprintf: unistdio.h. (line 206)
+* u32_u32_sprintf: unistdio.h. (line 203)
+* u32_u32_vasnprintf: unistdio.h. (line 224)
+* u32_u32_vasprintf: unistdio.h. (line 221)
+* u32_u32_vsnprintf: unistdio.h. (line 218)
+* u32_u32_vsprintf: unistdio.h. (line 215)
+* u32_uctomb: Elementary string functions.
+ (line 63)
+* u32_vasnprintf: unistdio.h. (line 197)
+* u32_vasprintf: unistdio.h. (line 194)
+* u32_vsnprintf: unistdio.h. (line 191)
+* u32_vsprintf: unistdio.h. (line 188)
+* u32_width: uniwidth.h. (line 33)
+* u32_width_linebreaks: unilbrk.h. (line 68)
+* u32_wordbreaks: Word breaks in a string.
+ (line 11)
+* u8_asnprintf: unistdio.h. (line 79)
+* u8_asprintf: unistdio.h. (line 76)
+* u8_casecmp: Case insensitive comparison.
+ (line 48)
+* u8_casecoll: Case insensitive comparison.
+ (line 92)
+* u8_casefold: Case insensitive comparison.
+ (line 12)
+* u8_casexfrm: Case insensitive comparison.
+ (line 72)
+* u8_casing_prefix_context: Case mappings of substrings.
+ (line 28)
+* u8_casing_prefixes_context: Case mappings of substrings.
+ (line 36)
+* u8_casing_suffix_context: Case mappings of substrings.
+ (line 59)
+* u8_casing_suffixes_context: Case mappings of substrings.
+ (line 67)
+* u8_check: Elementary string checks.
+ (line 10)
+* u8_chr: Elementary string functions.
+ (line 143)
+* u8_cmp: Elementary string functions.
+ (line 113)
+* u8_cmp2: Elementary string functions.
+ (line 129)
+* u8_conv_from_encoding: uniconv.h. (line 51)
+* u8_conv_to_encoding: uniconv.h. (line 88)
+* u8_cpy: Elementary string functions.
+ (line 76)
+* u8_cpy_alloc: Elementary string functions with memory allocation.
+ (line 9)
+* u8_ct_casefold: Case insensitive comparison.
+ (line 32)
+* u8_ct_tolower: Case mappings of substrings.
+ (line 102)
+* u8_ct_totitle: Case mappings of substrings.
+ (line 120)
+* u8_ct_toupper: Case mappings of substrings.
+ (line 84)
+* u8_endswith: Elementary string functions on NUL terminated strings.
+ (line 256)
+* u8_is_cased: Case detection. (line 55)
+* u8_is_casefolded: Case detection. (line 42)
+* u8_is_lowercase: Case detection. (line 22)
+* u8_is_titlecase: Case detection. (line 32)
+* u8_is_uppercase: Case detection. (line 12)
+* u8_mblen: Elementary string functions.
+ (line 10)
+* u8_mbsnlen: Elementary string functions.
+ (line 156)
+* u8_mbtouc: Elementary string functions.
+ (line 37)
+* u8_mbtouc_unsafe: Elementary string functions.
+ (line 21)
+* u8_mbtoucr: Elementary string functions.
+ (line 44)
+* u8_move: Elementary string functions.
+ (line 87)
+* u8_next: Elementary string functions on NUL terminated strings.
+ (line 23)
+* u8_normalize: Normalization of strings.
+ (line 48)
+* u8_normcmp: Normalizing comparisons.
+ (line 11)
+* u8_normcoll: Normalizing comparisons.
+ (line 38)
+* u8_normxfrm: Normalizing comparisons.
+ (line 25)
+* u8_possible_linebreaks: unilbrk.h. (line 44)
+* u8_prev: Elementary string functions on NUL terminated strings.
+ (line 34)
+* u8_set: Elementary string functions.
+ (line 100)
+* u8_snprintf: unistdio.h. (line 73)
+* u8_sprintf: unistdio.h. (line 70)
+* u8_startswith: Elementary string functions on NUL terminated strings.
+ (line 248)
+* u8_stpcpy: Elementary string functions on NUL terminated strings.
+ (line 74)
+* u8_stpncpy: Elementary string functions on NUL terminated strings.
+ (line 97)
+* u8_strcat: Elementary string functions on NUL terminated strings.
+ (line 108)
+* u8_strchr: Elementary string functions on NUL terminated strings.
+ (line 179)
+* u8_strcmp: Elementary string functions on NUL terminated strings.
+ (line 131)
+* u8_strcoll: Elementary string functions on NUL terminated strings.
+ (line 141)
+* u8_strconv_from_encoding: uniconv.h. (line 127)
+* u8_strconv_from_locale: uniconv.h. (line 156)
+* u8_strconv_to_encoding: uniconv.h. (line 140)
+* u8_strconv_to_locale: uniconv.h. (line 166)
+* u8_strcpy: Elementary string functions on NUL terminated strings.
+ (line 64)
+* u8_strcspn: Elementary string functions on NUL terminated strings.
+ (line 199)
+* u8_strdup: Elementary string functions on NUL terminated strings.
+ (line 169)
+* u8_strlen: Elementary string functions on NUL terminated strings.
+ (line 46)
+* u8_strmblen: Elementary string functions on NUL terminated strings.
+ (line 10)
+* u8_strmbtouc: Elementary string functions on NUL terminated strings.
+ (line 16)
+* u8_strncat: Elementary string functions on NUL terminated strings.
+ (line 119)
+* u8_strncmp: Elementary string functions on NUL terminated strings.
+ (line 157)
+* u8_strncpy: Elementary string functions on NUL terminated strings.
+ (line 86)
+* u8_strnlen: Elementary string functions on NUL terminated strings.
+ (line 54)
+* u8_strpbrk: Elementary string functions on NUL terminated strings.
+ (line 223)
+* u8_strrchr: Elementary string functions on NUL terminated strings.
+ (line 187)
+* u8_strspn: Elementary string functions on NUL terminated strings.
+ (line 211)
+* u8_strstr: Elementary string functions on NUL terminated strings.
+ (line 237)
+* u8_strtok: Elementary string functions on NUL terminated strings.
+ (line 266)
+* u8_strwidth: uniwidth.h. (line 38)
+* u8_to_u16: Elementary string conversions.
+ (line 11)
+* u8_to_u32: Elementary string conversions.
+ (line 15)
+* u8_tolower: Case mappings of strings.
+ (line 41)
+* u8_totitle: Case mappings of strings.
+ (line 55)
+* u8_toupper: Case mappings of strings.
+ (line 27)
+* u8_u8_asnprintf: unistdio.h. (line 106)
+* u8_u8_asprintf: unistdio.h. (line 103)
+* u8_u8_snprintf: unistdio.h. (line 100)
+* u8_u8_sprintf: unistdio.h. (line 97)
+* u8_u8_vasnprintf: unistdio.h. (line 118)
+* u8_u8_vasprintf: unistdio.h. (line 115)
+* u8_u8_vsnprintf: unistdio.h. (line 112)
+* u8_u8_vsprintf: unistdio.h. (line 109)
+* u8_uctomb: Elementary string functions.
+ (line 61)
+* u8_vasnprintf: unistdio.h. (line 91)
+* u8_vasprintf: unistdio.h. (line 88)
+* u8_vsnprintf: unistdio.h. (line 85)
+* u8_vsprintf: unistdio.h. (line 82)
+* u8_width: uniwidth.h. (line 29)
+* u8_width_linebreaks: unilbrk.h. (line 62)
+* u8_wordbreaks: Word breaks in a string.
+ (line 9)
+* uc_all_blocks: Blocks. (line 38)
+* uc_all_scripts: Scripts. (line 37)
+* uc_bidi_category: Bidirectional category.
+ (line 88)
+* uc_bidi_category_byname: Bidirectional category.
+ (line 82)
+* uc_bidi_category_name: Bidirectional category.
+ (line 79)
+* uc_block: Blocks. (line 27)
+* uc_block_t: Blocks. (line 12)
+* uc_c_ident_category: ISO C and Java syntax.
+ (line 39)
+* uc_canonical_decomposition: Decomposition of characters.
+ (line 92)
+* uc_combining_class: Canonical combining class.
+ (line 89)
+* uc_composition: Composition of characters.
+ (line 10)
+* uc_decimal_value: Decimal digit value. (line 11)
+* uc_decomposition: Decomposition of characters.
+ (line 82)
+* uc_digit_value: Digit value. (line 11)
+* uc_fraction_t: Numeric value. (line 14)
+* uc_general_category: Object oriented API. (line 207)
+* uc_general_category_and: Object oriented API. (line 179)
+* uc_general_category_and_not: Object oriented API. (line 186)
+* uc_general_category_byname: Object oriented API. (line 201)
+* uc_general_category_name: Object oriented API. (line 195)
+* uc_general_category_or: Object oriented API. (line 173)
+* uc_general_category_t: Object oriented API. (line 7)
+* uc_is_alnum: Classifications like in ISO C.
+ (line 14)
+* uc_is_alpha: Classifications like in ISO C.
+ (line 18)
+* uc_is_bidi_category: Bidirectional category.
+ (line 91)
+* uc_is_blank: Classifications like in ISO C.
+ (line 64)
+* uc_is_block: Blocks. (line 32)
+* uc_is_c_whitespace: ISO C and Java syntax.
+ (line 10)
+* uc_is_cntrl: Classifications like in ISO C.
+ (line 24)
+* uc_is_digit: Classifications like in ISO C.
+ (line 27)
+* uc_is_general_category: Object oriented API. (line 213)
+* uc_is_general_category_withtable: Bit mask API. (line 52)
+* uc_is_graph: Classifications like in ISO C.
+ (line 31)
+* uc_is_java_whitespace: ISO C and Java syntax.
+ (line 14)
+* uc_is_lower: Classifications like in ISO C.
+ (line 35)
+* uc_is_print: Classifications like in ISO C.
+ (line 41)
+* uc_is_property: Properties as objects.
+ (line 140)
+* uc_is_property_alphabetic: Properties as functions.
+ (line 10)
+* uc_is_property_ascii_hex_digit: Properties as functions.
+ (line 74)
+* uc_is_property_bidi_arabic_digit: Properties as functions.
+ (line 60)
+* uc_is_property_bidi_arabic_right_to_left: Properties as functions.
+ (line 56)
+* uc_is_property_bidi_block_separator: Properties as functions.
+ (line 62)
+* uc_is_property_bidi_boundary_neutral: Properties as functions.
+ (line 66)
+* uc_is_property_bidi_common_separator: Properties as functions.
+ (line 61)
+* uc_is_property_bidi_control: Properties as functions.
+ (line 53)
+* uc_is_property_bidi_embedding_or_override: Properties as functions.
+ (line 68)
+* uc_is_property_bidi_eur_num_separator: Properties as functions.
+ (line 58)
+* uc_is_property_bidi_eur_num_terminator: Properties as functions.
+ (line 59)
+* uc_is_property_bidi_european_digit: Properties as functions.
+ (line 57)
+* uc_is_property_bidi_hebrew_right_to_left: Properties as functions.
+ (line 55)
+* uc_is_property_bidi_left_to_right: Properties as functions.
+ (line 54)
+* uc_is_property_bidi_non_spacing_mark: Properties as functions.
+ (line 65)
+* uc_is_property_bidi_other_neutral: Properties as functions.
+ (line 69)
+* uc_is_property_bidi_pdf: Properties as functions.
+ (line 67)
+* uc_is_property_bidi_segment_separator: Properties as functions.
+ (line 63)
+* uc_is_property_bidi_whitespace: Properties as functions.
+ (line 64)
+* uc_is_property_combining: Properties as functions.
+ (line 104)
+* uc_is_property_composite: Properties as functions.
+ (line 105)
+* uc_is_property_currency_symbol: Properties as functions.
+ (line 99)
+* uc_is_property_dash: Properties as functions.
+ (line 91)
+* uc_is_property_decimal_digit: Properties as functions.
+ (line 106)
+* uc_is_property_default_ignorable_code_point: Properties as functions.
+ (line 14)
+* uc_is_property_deprecated: Properties as functions.
+ (line 17)
+* uc_is_property_diacritic: Properties as functions.
+ (line 108)
+* uc_is_property_extender: Properties as functions.
+ (line 109)
+* uc_is_property_format_control: Properties as functions.
+ (line 90)
+* uc_is_property_grapheme_base: Properties as functions.
+ (line 46)
+* uc_is_property_grapheme_extend: Properties as functions.
+ (line 47)
+* uc_is_property_grapheme_link: Properties as functions.
+ (line 49)
+* uc_is_property_hex_digit: Properties as functions.
+ (line 73)
+* uc_is_property_hyphen: Properties as functions.
+ (line 92)
+* uc_is_property_id_continue: Properties as functions.
+ (line 36)
+* uc_is_property_id_start: Properties as functions.
+ (line 34)
+* uc_is_property_ideographic: Properties as functions.
+ (line 78)
+* uc_is_property_ids_binary_operator: Properties as functions.
+ (line 81)
+* uc_is_property_ids_trinary_operator: Properties as functions.
+ (line 82)
+* uc_is_property_ignorable_control: Properties as functions.
+ (line 110)
+* uc_is_property_iso_control: Properties as functions.
+ (line 89)
+* uc_is_property_join_control: Properties as functions.
+ (line 45)
+* uc_is_property_left_of_pair: Properties as functions.
+ (line 103)
+* uc_is_property_line_separator: Properties as functions.
+ (line 94)
+* uc_is_property_logical_order_exception: Properties as functions.
+ (line 18)
+* uc_is_property_lowercase: Properties as functions.
+ (line 27)
+* uc_is_property_math: Properties as functions.
+ (line 100)
+* uc_is_property_non_break: Properties as functions.
+ (line 88)
+* uc_is_property_not_a_character: Properties as functions.
+ (line 12)
+* uc_is_property_numeric: Properties as functions.
+ (line 107)
+* uc_is_property_other_alphabetic: Properties as functions.
+ (line 11)
+* uc_is_property_other_default_ignorable_code_point: Properties as functions.
+ (line 16)
+* uc_is_property_other_grapheme_extend: Properties as functions.
+ (line 48)
+* uc_is_property_other_id_continue: Properties as functions.
+ (line 37)
+* uc_is_property_other_id_start: Properties as functions.
+ (line 35)
+* uc_is_property_other_lowercase: Properties as functions.
+ (line 28)
+* uc_is_property_other_math: Properties as functions.
+ (line 101)
+* uc_is_property_other_uppercase: Properties as functions.
+ (line 26)
+* uc_is_property_paired_punctuation: Properties as functions.
+ (line 102)
+* uc_is_property_paragraph_separator: Properties as functions.
+ (line 95)
+* uc_is_property_pattern_syntax: Properties as functions.
+ (line 41)
+* uc_is_property_pattern_white_space: Properties as functions.
+ (line 40)
+* uc_is_property_private_use: Properties as functions.
+ (line 20)
+* uc_is_property_punctuation: Properties as functions.
+ (line 93)
+* uc_is_property_quotation_mark: Properties as functions.
+ (line 96)
+* uc_is_property_radical: Properties as functions.
+ (line 80)
+* uc_is_property_sentence_terminal: Properties as functions.
+ (line 97)
+* uc_is_property_soft_dotted: Properties as functions.
+ (line 30)
+* uc_is_property_space: Properties as functions.
+ (line 87)
+* uc_is_property_terminal_punctuation: Properties as functions.
+ (line 98)
+* uc_is_property_titlecase: Properties as functions.
+ (line 29)
+* uc_is_property_unassigned_code_value: Properties as functions.
+ (line 21)
+* uc_is_property_unified_ideograph: Properties as functions.
+ (line 79)
+* uc_is_property_uppercase: Properties as functions.
+ (line 25)
+* uc_is_property_variation_selector: Properties as functions.
+ (line 19)
+* uc_is_property_white_space: Properties as functions.
+ (line 9)
+* uc_is_property_xid_continue: Properties as functions.
+ (line 39)
+* uc_is_property_xid_start: Properties as functions.
+ (line 38)
+* uc_is_property_zero_width: Properties as functions.
+ (line 86)
+* uc_is_punct: Classifications like in ISO C.
+ (line 44)
+* uc_is_script: Scripts. (line 31)
+* uc_is_space: Classifications like in ISO C.
+ (line 49)
+* uc_is_upper: Classifications like in ISO C.
+ (line 54)
+* uc_is_xdigit: Classifications like in ISO C.
+ (line 60)
+* uc_java_ident_category: ISO C and Java syntax.
+ (line 43)
+* uc_locale_language: Case mappings of strings.
+ (line 21)
+* uc_mirror_char: Mirrored character. (line 14)
+* uc_numeric_value: Numeric value. (line 23)
+* uc_property_byname: Properties as objects.
+ (line 123)
+* uc_property_is_valid: Properties as objects.
+ (line 133)
+* uc_property_t: Properties as objects.
+ (line 9)
+* uc_script: Scripts. (line 20)
+* uc_script_byname: Scripts. (line 25)
+* uc_script_t: Scripts. (line 11)
+* uc_tolower: Case mappings of characters.
+ (line 20)
+* uc_totitle: Case mappings of characters.
+ (line 23)
+* uc_toupper: Case mappings of characters.
+ (line 17)
+* uc_width: uniwidth.h. (line 23)
+* uc_wordbreak_property: Word break property. (line 32)
+* UCS-4: Unicode. (line 14)
+* ucs4_t: unitypes.h. (line 16)
+* uint16_t: unitypes.h. (line 10)
+* uint32_t: unitypes.h. (line 11)
+* uint8_t: unitypes.h. (line 9)
+* ulc_asnprintf: unistdio.h. (line 53)
+* ulc_asprintf: unistdio.h. (line 50)
+* ulc_casecmp: Case insensitive comparison.
+ (line 57)
+* ulc_casecoll: Case insensitive comparison.
+ (line 101)
+* ulc_casexfrm: Case insensitive comparison.
+ (line 81)
+* ulc_fprintf: unistdio.h. (line 229)
+* ulc_possible_linebreaks: unilbrk.h. (line 50)
+* ulc_snprintf: unistdio.h. (line 48)
+* ulc_sprintf: unistdio.h. (line 45)
+* ulc_vasnprintf: unistdio.h. (line 65)
+* ulc_vasprintf: unistdio.h. (line 62)
+* ulc_vfprintf: unistdio.h. (line 232)
+* ulc_vsnprintf: unistdio.h. (line 59)
+* ulc_vsprintf: unistdio.h. (line 56)
+* ulc_width_linebreaks: unilbrk.h. (line 71)
+* ulc_wordbreaks: Word breaks in a string.
+ (line 12)
+* Unicode: Unicode. (line 6)
+* Unicode character, bidirectional category: Bidirectional category.
+ (line 6)
+* Unicode character, block: Blocks. (line 24)
+* Unicode character, canonical combining class: Canonical combining class.
+ (line 6)
+* Unicode character, case mappings: Case mappings of characters.
+ (line 6)
+* Unicode character, classification: General category. (line 6)
+* Unicode character, classification like in C: Classifications like in ISO C.
+ (line 6)
+* Unicode character, general category: General category. (line 6)
+* Unicode character, mirroring: Mirrored character. (line 6)
+* Unicode character, name: uniname.h. (line 6)
+* Unicode character, properties: Properties. (line 6)
+* Unicode character, script: Scripts. (line 17)
+* Unicode character, validity in C identifiers: ISO C and Java syntax.
+ (line 38)
+* Unicode character, validity in Java identifiers: ISO C and Java syntax.
+ (line 42)
+* Unicode character, value <1>: Numeric value. (line 6)
+* Unicode character, value <2>: Digit value. (line 6)
+* Unicode character, value: Decimal digit value. (line 6)
+* Unicode character, width: uniwidth.h. (line 22)
+* unicode_character_name: uniname.h. (line 19)
+* unicode_name_character: uniname.h. (line 25)
+* uninorm_decomposing_form: Normalization of strings.
+ (line 40)
+* uninorm_filter_create: Normalization of streams.
+ (line 19)
+* uninorm_filter_flush: Normalization of streams.
+ (line 33)
+* uninorm_filter_free: Normalization of streams.
+ (line 43)
+* uninorm_filter_write: Normalization of streams.
+ (line 29)
+* uninorm_is_compat_decomposing: Normalization of strings.
+ (line 32)
+* uninorm_is_composing: Normalization of strings.
+ (line 36)
+* uninorm_t: Normalization of strings.
+ (line 10)
+* uppercasing: Case mappings of strings.
+ (line 6)
+* use cases: Introduction. (line 44)
+* UTF-16: Unicode. (line 14)
+* UTF-16, strings: Unicode strings. (line 6)
+* UTF-32: Unicode. (line 14)
+* UTF-32, strings: Unicode strings. (line 6)
+* UTF-8: Unicode. (line 14)
+* UTF-8, strings: Unicode strings. (line 6)
+* validity: Elementary string checks.
+ (line 6)
+* value, of libunistring: Introduction. (line 44)
+* value, of Unicode character <1>: Numeric value. (line 6)
+* value, of Unicode character <2>: Digit value. (line 6)
+* value, of Unicode character: Decimal digit value. (line 6)
+* verification: Elementary string checks.
+ (line 6)
+* wchar_t, type: The wchar_t mess. (line 6)
+* width: uniwidth.h. (line 6)
+* word breaks: uniwbrk.h. (line 6)
+* wrapping: unilbrk.h. (line 6)
+
+
+
+Tag Table:
+Node: Top270
+Node: Introduction3239
+Node: Unicode5236
+Node: Unicode and i18n7116
+Node: Locale encodings8579
+Node: In-memory representation10787
+Node: char * strings11896
+Node: The wchar_t mess17153
+Node: Unicode strings19357
+Node: Conventions20508
+Node: unitypes.h22708
+Node: unistr.h23280
+Node: Elementary string checks23837
+Node: Elementary string conversions24459
+Node: Elementary string functions25761
+Node: Elementary string functions with memory allocation32732
+Node: Elementary string functions on NUL terminated strings33354
+Node: uniconv.h45090
+Node: unistdio.h52801
+Node: uniname.h61004
+Node: unictype.h62337
+Node: General category63246
+Node: Object oriented API64289
+Node: Bit mask API72751
+Node: Canonical combining class75005
+Node: Bidirectional category78219
+Node: Decimal digit value81276
+Node: Digit value81837
+Node: Numeric value82398
+Node: Mirrored character83289
+Node: Properties83962
+Node: Properties as objects84653
+Node: Properties as functions91031
+Node: Scripts96582
+Node: Blocks97968
+Node: ISO C and Java syntax99291
+Node: Classifications like in ISO C101001
+Node: uniwidth.h103705
+Node: uniwbrk.h105742
+Node: Word breaks in a string106269
+Node: Word break property107320
+Node: unilbrk.h108416
+Node: uninorm.h112587
+Node: Decomposition of characters113219
+Node: Composition of characters116595
+Node: Normalization of strings117304
+Node: Normalizing comparisons119366
+Node: Normalization of streams121722
+Node: unicase.h123810
+Node: Case mappings of characters124495
+Node: Case mappings of strings126542
+Node: Case mappings of substrings129875
+Node: Case insensitive comparison136805
+Node: Case detection142156
+Node: uniregex.h145424
+Node: Using the library145647
+Node: Installation146058
+Node: Compiler options146531
+Node: Include files148090
+Node: Autoconf macro149314
+Node: Reporting problems150872
+Node: More functionality151669
+Node: Licenses152112
+Node: GNU GPL153747
+Node: GNU LGPL191292
+Node: GNU FDL199738
+Node: Index224863
+
+End Tag Table
+
+
+Local Variables:
+coding: utf-8
+End: