From fa095a4504cbe668e4244547e2c141597bea4ecf Mon Sep 17 00:00:00 2001 From: Andreas Rottmann Date: Mon, 14 Sep 2009 12:32:44 +0200 Subject: Imported Upstream version 0.9.1 --- doc/libunistring_5.html | 296 ++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 296 insertions(+) create mode 100644 doc/libunistring_5.html (limited to 'doc/libunistring_5.html') diff --git a/doc/libunistring_5.html b/doc/libunistring_5.html new file mode 100644 index 00000000..92e115f9 --- /dev/null +++ b/doc/libunistring_5.html @@ -0,0 +1,296 @@ + + + + + +GNU libunistring: 5. Conversions between Unicode and encodings <uniconv.h> + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ +
+ + +

5. Conversions between Unicode and encodings <uniconv.h>

+ +

This include file declares functions for converting between Unicode strings +and char * strings in locale encoding or in other specified encodings. +

+ +

The following function returns the locale encoding. +

+
+
Function: const char * locale_charset () + +
+

Determines the current locale's character encoding, and canonicalizes it +into one of the canonical names listed in ‘config.charset’. +If the canonical name cannot be determined, the result is a non-canonical +name. +

+

The result must not be freed; it is statically allocated. +

+

The result of this function can be used as an argument to the iconv_open +function in GNU libc, in GNU libiconv, or in the gnulib-provided wrapper +around the native iconv_open function. It may not work as an argument +to the native iconv_open function directly. +

+ +

The handling of unconvertible characters during the conversions can be +parametrized through the following enumeration type: +

+
+
Type: enum iconv_ilseq_handler + +
+

This type specifies how unconvertible characters in the input are handled. +

+ +
+
Constant: enum iconv_ilseq_handler iconveh_error + +
+

This handler causes the function to return with errno set to +EILSEQ. +

+ +
+
Constant: enum iconv_ilseq_handler iconveh_question_mark + +
+

This handler produces one question mark ‘?’ per unconvertible character. +

+ +
+
Constant: enum iconv_ilseq_handler iconveh_escape_sequence + +
+

This handler produces an escape sequence \uxxxx or +\Uxxxxxxxx for each unconvertible character. +

+ + +

The following functions convert between strings in a specified encoding and +Unicode strings. +

+
+
Function: uint8_t * u8_conv_from_encoding (const char *fromcode, enum iconv_ilseq_handler handler, const char *src, size_t srclen, size_t *offsets, uint8_t *resultbuf, size_t *lengthp) + +
+
Function: uint16_t * u16_conv_from_encoding (const char *fromcode, enum iconv_ilseq_handler handler, const char *src, size_t srclen, size_t *offsets, uint16_t *resultbuf, size_t *lengthp) + +
+
Function: uint32_t * u32_conv_from_encoding (const char *fromcode, enum iconv_ilseq_handler handler, const char *src, size_t srclen, size_t *offsets, uint32_t *resultbuf, size_t *lengthp) + +
+

Converts an entire string, possibly including NUL bytes, from one encoding +to UTF-8, UTF-16, or UTF-32 encoding, depending on the function. +

+

Converts a memory region given in encoding fromcode. fromcode is +as for the iconv_open function. +

+

The input is in the memory region between src (inclusive) and +src + srclen (exclusive). +

+

If offsets is not NULL, it should point to an array of srclen +integers; this array is filled with offsets into the result: the +character starting at src[i] corresponds to the character starting +at result[offsets[i]], and the offsets for positions that do not start +a character are set to (size_t)(-1). +

+

resultbuf and *lengthp should be a scratch +buffer and its size, or resultbuf can be NULL. +

+

May erase the contents of the memory at resultbuf. +

+

If successful: The resulting Unicode string (non-NULL) is returned and +its length stored in *lengthp. The resulting string is +resultbuf if no dynamic memory allocation was necessary, +or a freshly allocated memory block otherwise. +

+

In case of error: NULL is returned and errno is set. +Particular errno values: EINVAL, EILSEQ, ENOMEM. +

+ +
+
Function: char * u8_conv_to_encoding (const char *tocode, enum iconv_ilseq_handler handler, const uint8_t *src, size_t srclen, size_t *offsets, char *resultbuf, size_t *lengthp) + +
+
Function: char * u16_conv_to_encoding (const char *tocode, enum iconv_ilseq_handler handler, const uint16_t *src, size_t srclen, size_t *offsets, char *resultbuf, size_t *lengthp) + +
+
Function: char * u32_conv_to_encoding (const char *tocode, enum iconv_ilseq_handler handler, const uint32_t *src, size_t srclen, size_t *offsets, char *resultbuf, size_t *lengthp) + +
+

Converts an entire Unicode string, possibly including NUL units, from UTF-8, +UTF-16, or UTF-32 encoding (depending on the function) to a given encoding. +

+

Converts a memory region to encoding tocode. tocode is as for +the iconv_open function. +

+

The input is in the memory region between src (inclusive) and +src + srclen (exclusive). +

+

If offsets is not NULL, it should point to an array of srclen +integers; this array is filled with offsets into the result: the +character starting at src[i] corresponds to the character starting +at result[offsets[i]], and the offsets for positions that do not start +a character are set to (size_t)(-1). +

+

resultbuf and *lengthp should be a scratch +buffer and its size, or resultbuf can be NULL. +

+

May erase the contents of the memory at resultbuf. +

+

If successful: The resulting string in encoding tocode (non-NULL) is returned and +its length stored in *lengthp. The resulting string is +resultbuf if no dynamic memory allocation was necessary, +or a freshly allocated memory block otherwise. +

+

In case of error: NULL is returned and errno is set. +Particular errno values: EINVAL, EILSEQ, ENOMEM. +

+ +

The following functions convert between NUL terminated strings in a specified +encoding and NUL terminated Unicode strings. +

+
+
Function: uint8_t * u8_strconv_from_encoding (const char *string, const char *fromcode, enum iconv_ilseq_handler handler) + +
+
Function: uint16_t * u16_strconv_from_encoding (const char *string, const char *fromcode, enum iconv_ilseq_handler handler) + +
+
Function: uint32_t * u32_strconv_from_encoding (const char *string, const char *fromcode, enum iconv_ilseq_handler handler) + +
+

Converts a NUL terminated string from a given encoding. +

+

The result is malloc allocated, or NULL (with errno set) in case of error. +

+

Particular errno values: EILSEQ, ENOMEM. +

+ +
+
Function: char * u8_strconv_to_encoding (const uint8_t *string, const char *tocode, enum iconv_ilseq_handler handler) + +
+
Function: char * u16_strconv_to_encoding (const uint16_t *string, const char *tocode, enum iconv_ilseq_handler handler) + +
+
Function: char * u32_strconv_to_encoding (const uint32_t *string, const char *tocode, enum iconv_ilseq_handler handler) + +
+

Converts a NUL terminated string to a given encoding. +

+

The result is malloc allocated, or NULL (with errno set) in case of error. +

+

Particular errno values: EILSEQ, ENOMEM. +

+ +

The following functions are shorthands that convert between NUL terminated +strings in locale encoding and NUL terminated Unicode strings. +

+
+
Function: uint8_t * u8_strconv_from_locale (const char *string) + +
+
Function: uint16_t * u16_strconv_from_locale (const char *string) + +
+
Function: uint32_t * u32_strconv_from_locale (const char *string) + +
+

Converts a NUL terminated string from the locale encoding. +

+

The result is malloc allocated, or NULL (with errno set) in case of error. +

+

Particular errno values: ENOMEM. +

+ +
+
Function: char * u8_strconv_to_locale (const uint8_t *string) + +
+
Function: char * u16_strconv_to_locale (const uint16_t *string) + +
+
Function: char * u32_strconv_to_locale (const uint32_t *string) + +
+

Converts a NUL terminated string to the locale encoding. +

+

The result is malloc allocated, or NULL (with errno set) in case of error. +

+

Particular errno values: ENOMEM. +

+
+ + + + + + + + + + + + +
+

+ + This document was generated by Bruno Haible on July, 1 2009 using texi2html 1.78a. + +
+ +
