<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd">
<html>
<!-- Created on October, 16 2024 by texi2html 1.78a -->
<!--
Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author)
            Karl Berry  <karl@freefriends.org>
            Olaf Bachmann <obachman@mathematik.uni-kl.de>
            and many others.
Maintained by: Many creative people.
Send bugs and suggestions to <texi2html-bug@nongnu.org>

-->
<head>
<title>GNU libunistring: 2. Conventions</title>

<meta name="description" content="GNU libunistring: 2. Conventions">
<meta name="keywords" content="GNU libunistring: 2. Conventions">
<meta name="resource-type" content="document">
<meta name="distribution" content="global">
<meta name="Generator" content="texi2html 1.78a">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<style type="text/css">
<!--
a.summary-letter {text-decoration: none}
pre.display {font-family: serif}
pre.format {font-family: serif}
pre.menu-comment {font-family: serif}
pre.menu-preformatted {font-family: serif}
pre.smalldisplay {font-family: serif; font-size: smaller}
pre.smallexample {font-size: smaller}
pre.smallformat {font-family: serif; font-size: smaller}
pre.smalllisp {font-size: smaller}
span.roman {font-family:serif; font-weight:normal;}
span.sansserif {font-family:sans-serif; font-weight:normal;}
ul.toc {list-style: none}
-->
</style>


</head>

<body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000">

<table cellpadding="1" cellspacing="1" border="0">
<tr><td valign="middle" align="left">[<a href="libunistring_1.html#SEC1" title="Beginning of this chapter or previous chapter"> &lt;&lt; </a>]</td>
<td valign="middle" align="left">[<a href="libunistring_3.html#SEC9" title="Next chapter"> &gt;&gt; </a>]</td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td>
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td>
<td valign="middle" align="left">[<a href="libunistring_21.html#SEC94" title="Index">Index</a>]</td>
<td valign="middle" align="left">[<a href="libunistring_abt.html#SEC_About" title="About (help)"> ? </a>]</td>
</tr></table>

<hr size="2">
<a name="Conventions"></a>
<a name="SEC8"></a>
<h1 class="chapter"> <a href="libunistring_toc.html#TOC8">2. Conventions</a> </h1>

<p>This chapter explains conventions valid throughout the libunistring library.
</p>
<a name="IDX14"></a>
<p>Variables of type <code>char *</code> denote C strings in locale encoding.
See <a href="libunistring_1.html#SEC4">Locale encodings</a>.
</p>
<p>Variables of type <code>uint8_t *</code> denote UTF-8 strings.  Their units
are bytes.
</p>
<p>Variables of type <code>uint16_t *</code> denote UTF-16 strings, without byte
order mark.  Their units are 2-byte words.
</p>
<p>Variables of type <code>uint32_t *</code> denote UTF-32 strings, without byte
order mark.  Their units are 4-byte words.
</p>
<p>Argument pairs <code>(<var>s</var>, <var>n</var>)</code> denote a string
<code><var>s</var>[0..<var>n</var>-1]</code> with exactly <var>n</var> units.<a name="DOCF1" href="libunistring_fot.html#FOOT1">(1)</a>
</p>
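<p>As an illustration of these unit types, the following sketch stores the
same three characters (U+0061, U+00E9, U+1F600) in each encoding and computes
the <var>n</var> of the corresponding <code>(<var>s</var>, <var>n</var>)</code>
pair.  The array and variable names are chosen for this example only; they are
not part of libunistring.
</p>

```c
#include <stddef.h>
#include <stdint.h>

/* The code points U+0061, U+00E9, U+1F600 in the three encodings.
   There is no byte order mark and no terminating NUL: each array is
   meant to be passed together with its length as an (s, n) pair.
   All names below are illustrative, not libunistring identifiers.  */
static const uint8_t  text_u8[]  = { 0x61, 0xC3, 0xA9, 0xF0, 0x9F, 0x98, 0x80 };
static const uint16_t text_u16[] = { 0x0061, 0x00E9, 0xD83D, 0xDE00 };
static const uint32_t text_u32[] = { 0x00000061, 0x000000E9, 0x0001F600 };

/* n counts units of the array's element type, not bytes and not
   characters.  */
static const size_t text_u8_n  = sizeof text_u8  / sizeof text_u8[0];
static const size_t text_u16_n = sizeof text_u16 / sizeof text_u16[0];
static const size_t text_u32_n = sizeof text_u32 / sizeof text_u32[0];
```

<p>The three lengths differ (7, 4 and 3 units) although the text is the same,
which is why a length is only meaningful together with the unit type of the
string it belongs to.
</p>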
<p>All functions with prefix &lsquo;<samp>ulc_</samp>&rsquo; operate on C strings in locale
encoding.
</p>
<p>All functions with prefix &lsquo;<samp>u8_</samp>&rsquo; operate on UTF-8 strings.
</p>
<p>All functions with prefix &lsquo;<samp>u16_</samp>&rsquo; operate on UTF-16 strings.
</p>
<p>All functions with prefix &lsquo;<samp>u32_</samp>&rsquo; operate on UTF-32 strings.
</p>
<p>For every function with prefix &lsquo;<samp>u8_</samp>&rsquo;, operating on UTF-8 strings,
there is also a corresponding function with prefix &lsquo;<samp>u16_</samp>&rsquo;,
operating on UTF-16 strings, and a corresponding function with prefix
&lsquo;<samp>u32_</samp>&rsquo;, operating on UTF-32 strings.  Their description is
analogous; in this documentation we describe only the function that
operates on UTF-8 strings, for brevity.
</p>
<p>A declaration with a variable <var>n</var> denotes the three concrete
declarations with <var>n</var> = 8, <var>n</var> = 16, <var>n</var> = 32.
</p>
<p>All parameters starting with &lsquo;<samp>str</samp>&rsquo; and the parameters of
functions starting with <code>u8_str</code>/<code>u16_str</code>/<code>u32_str</code>
denote a NUL-terminated string.
</p>
<a name="IDX15"></a>
<p>Error values are always returned through the <code>errno</code> variable,
usually with a return value that indicates the presence of an error
(NULL for functions that return a pointer, or -1 for functions that
return an <code>int</code>).
</p>
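<p>The error convention can be sketched with a small stand-in function.
<code>demo_u8_lead_length</code> is hypothetical and not part of libunistring;
it looks only at the lead byte of a UTF-8 string and does not validate the
continuation bytes.  The choice of <code>EILSEQ</code> and <code>EINVAL</code>
here is an assumption made for the example.
</p>

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, NOT a libunistring function.
   Returns the number of units that the first character of (s, n)
   occupies, judged by its lead byte alone.  On error, returns -1 and
   sets errno: EINVAL if the string is empty or too short, EILSEQ if
   the first unit cannot start a UTF-8 character.  */
int demo_u8_lead_length(const uint8_t *s, size_t n)
{
    int len;

    if (n == 0) {
        errno = EINVAL;
        return -1;
    }
    if (s[0] < 0x80)
        len = 1;
    else if (s[0] >= 0xC2 && s[0] < 0xE0)
        len = 2;
    else if (s[0] >= 0xE0 && s[0] < 0xF0)
        len = 3;
    else if (s[0] >= 0xF0 && s[0] < 0xF5)
        len = 4;
    else {
        errno = EILSEQ;
        return -1;
    }
    if ((size_t) len > n) {
        errno = EINVAL;
        return -1;
    }
    return len;
}
```

<p>A caller tests the return value first and consults <code>errno</code> only
when that value is -1, since <code>errno</code> is not meaningful after a
successful call.
</p>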
<p>Functions returning a string result take a
<code>(<var>resultbuf</var>, <var>lengthp</var>)</code>
argument pair.  If <var>resultbuf</var> is not NULL and the result fits
into <code>*<var>lengthp</var></code> units, it is put in <var>resultbuf</var>, and
<var>resultbuf</var> is returned.  Otherwise, a freshly allocated string
is returned.  In both cases, <code>*<var>lengthp</var></code> is set to the
length (number of units) of the returned string.  In case of error,
NULL is returned and <code>errno</code> is set.
</p>
<p>To invoke such a function:
</p><ul>
<li>
First ask yourself whether you want to accept the overhead of a
<code>malloc</code> invocation even for a small-sized result.
If yes, pass NULL as <var>resultbuf</var>.
If no, allocate an array of units on the stack, typically between 50 and
4000 bytes large; pass this array as <var>resultbuf</var>; and initialize
<code>*<var>lengthp</var></code> to the number of units of this array.
</li><li>
Upon return from such a function, look at the return value:
NULL means an error; look at the value of <code>errno</code> in this case.
Otherwise, the return value is the result, with <code>*<var>lengthp</var></code>
units.  Note that the function has <em>not</em> added an extra NUL
character at the end.
</li><li>
Finally, do memory management.  The result was
<code>malloc</code>-allocated if and only if it is <code>!= NULL</code> and
<code>!= <var>resultbuf</var></code>; in that case, release it with
<code>free</code> once you no longer need it.
</li></ul>
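<p>The three steps above can be sketched as follows.  To keep the example
self-contained, <code>demo_u8_copy</code> is a hypothetical stand-in that
merely copies its input while following the
<code>(<var>resultbuf</var>, <var>lengthp</var>)</code> convention; it is not
a libunistring function, and the stack buffer size of 16 units is an arbitrary
choice for the example.
</p>

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in, NOT a libunistring function: copies (s, n)
   using the (resultbuf, lengthp) convention described above.  */
uint8_t *demo_u8_copy(const uint8_t *s, size_t n,
                      uint8_t *resultbuf, size_t *lengthp)
{
    uint8_t *result;

    if (resultbuf != NULL && n <= *lengthp) {
        result = resultbuf;             /* the result fits: no allocation */
    } else {
        result = malloc(n > 0 ? n : 1); /* otherwise allocate freshly */
        if (result == NULL) {
            errno = ENOMEM;
            return NULL;
        }
    }
    if (n > 0)
        memcpy(result, s, n);
    *lengthp = n;                       /* set in both cases */
    return result;
}

/* Caller side.  Returns the result length, or -1 on error.
   *was_allocated reports whether the malloc path was taken.  */
int demo_call(const uint8_t *s, size_t n, int *was_allocated)
{
    /* Step 1: provide a stack buffer and pass its size in units.  */
    uint8_t stackbuf[16];
    size_t length = sizeof stackbuf / sizeof stackbuf[0];
    uint8_t *result = demo_u8_copy(s, n, stackbuf, &length);

    /* Step 2: NULL means an error, with errno set by the callee.  */
    if (result == NULL)
        return -1;

    /* Here result[0..length-1] is valid; there is no NUL terminator.  */
    *was_allocated = (result != stackbuf);

    /* Step 3: free the result only if it is not our own buffer.  */
    if (result != stackbuf)
        free(result);
    return (int) length;
}
```

<p>With a 4-unit input the result lands in <code>stackbuf</code> and nothing is
freed; with a 40-unit input the stand-in allocates, and the caller releases
the memory.  The comparison against <code>stackbuf</code> is the only way the
caller can tell the two cases apart.
</p>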

<hr size="6">
<table cellpadding="1" cellspacing="1" border="0">
<tr><td valign="middle" align="left">[<a href="libunistring_1.html#SEC1" title="Beginning of this chapter or previous chapter"> &lt;&lt; </a>]</td>
<td valign="middle" align="left">[<a href="libunistring_3.html#SEC9" title="Next chapter"> &gt;&gt; </a>]</td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left"> &nbsp; </td>
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td>
<td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td>
<td valign="middle" align="left">[<a href="libunistring_21.html#SEC94" title="Index">Index</a>]</td>
<td valign="middle" align="left">[<a href="libunistring_abt.html#SEC_About" title="About (help)"> ? </a>]</td>
</tr></table>
<p>
 <font size="-1">
  This document was generated by <em>Bruno Haible</em> on <em>October, 16 2024</em> using <a href="https://www.nongnu.org/texi2html/"><em>texi2html 1.78a</em></a>.
 </font>
 <br>

</p>
</body>
</html>