author | Jörg Frings-Fürst <debian@jff.email> | 2024-03-03 19:11:58 +0100 |
---|---|---|
committer | Jörg Frings-Fürst <debian@jff.email> | 2024-03-03 19:11:58 +0100 |
commit | 9853b168f68cbb09b75a817343cedde2aca4c76c (patch) | |
tree | db628840acea83dbccaf5676b89579a80e02ef51 /doc/libunistring_10.html | |
parent | d83e85a2e6064c36f6ad3c848e39d8b8c101c4f7 (diff) | |
parent | 7cf710f6587e71a193a55d84dd6d8ae1a8a69ce0 (diff) |
Merge branch 'feature/upstream' into develop
Diffstat (limited to 'doc/libunistring_10.html')
-rw-r--r-- | doc/libunistring_10.html | 105 |
1 files changed, 54 insertions, 51 deletions
diff --git a/doc/libunistring_10.html b/doc/libunistring_10.html index 911fd297..75a7888a 100644 --- a/doc/libunistring_10.html +++ b/doc/libunistring_10.html @@ -1,6 +1,6 @@ <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html401/loose.dtd"> <html> -<!-- Created on October, 16 2022 by texi2html 1.78a --> +<!-- Created on February, 24 2024 by texi2html 1.78a --> <!-- Written by: Lionel Cons <Lionel.Cons@cern.ch> (original author) Karl Berry <karl@freefriends.org> @@ -42,8 +42,8 @@ ul.toc {list-style: none} <body lang="en" bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#800080" alink="#FF0000"> <table cellpadding="1" cellspacing="1" border="0"> -<tr><td valign="middle" align="left">[<a href="libunistring_9.html#SEC53" title="Beginning of this chapter or previous chapter"> << </a>]</td> -<td valign="middle" align="left">[<a href="libunistring_11.html#SEC57" title="Next chapter"> >> </a>]</td> +<tr><td valign="middle" align="left">[<a href="libunistring_9.html#SEC55" title="Beginning of this chapter or previous chapter"> << </a>]</td> +<td valign="middle" align="left">[<a href="libunistring_11.html#SEC59" title="Next chapter"> >> </a>]</td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> @@ -51,14 +51,14 @@ ul.toc {list-style: none} <td valign="middle" align="left"> </td> <td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td> <td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td> -<td valign="middle" align="left">[<a href="libunistring_21.html#SEC92" title="Index">Index</a>]</td> +<td valign="middle" align="left">[<a href="libunistring_21.html#SEC94" title="Index">Index</a>]</td> <td valign="middle" align="left">[<a href="libunistring_abt.html#SEC_About" title="About (help)"> ? 
</a>]</td> </tr></table> <hr size="2"> <a name="unigbrk_002eh"></a> -<a name="SEC54"></a> -<h1 class="chapter"> <a href="libunistring_toc.html#TOC54">10. Grapheme cluster breaks in strings <code><unigbrk.h></code></a> </h1> +<a name="SEC56"></a> +<h1 class="chapter"> <a href="libunistring_toc.html#TOC56">10. Grapheme cluster breaks in strings <code><unigbrk.h></code></a> </h1> <p>This include file declares functions for determining where in a string “grapheme clusters” start and end. A “grapheme cluster” is an @@ -85,21 +85,21 @@ clusters. <hr size="6"> <a name="Grapheme-cluster-breaks-in-a-string"></a> -<a name="SEC55"></a> -<h2 class="section"> <a href="libunistring_toc.html#TOC55">10.1 Grapheme cluster breaks in a string</a> </h2> +<a name="SEC57"></a> +<h2 class="section"> <a href="libunistring_toc.html#TOC57">10.1 Grapheme cluster breaks in a string</a> </h2> <p>The following functions find a single boundary between grapheme clusters in a string. </p> <dl> <dt><u>Function:</u> void <b>u8_grapheme_next</b><i> (const uint8_t *<var>s</var>, const uint8_t *<var>end</var>)</i> -<a name="IDX769"></a> +<a name="IDX787"></a> </dt> <dt><u>Function:</u> void <b>u16_grapheme_next</b><i> (const uint16_t *<var>s</var>, const uint16_t *<var>end</var>)</i> -<a name="IDX770"></a> +<a name="IDX788"></a> </dt> <dt><u>Function:</u> void <b>u32_grapheme_next</b><i> (const uint32_t *<var>s</var>, const uint32_t *<var>end</var>)</i> -<a name="IDX771"></a> +<a name="IDX789"></a> </dt> <dd><p>Returns the start of the next grapheme cluster following <var>s</var>, or <var>end</var> if no grapheme cluster break is encountered before it. @@ -107,19 +107,20 @@ Returns NULL if and only if <code><var>s</var> == <var>end</var></code>. </p> <p>Note that these functions do not handle the case when a character outside of the range between <var>s</var> and <var>end</var> is needed to -determine the boundary. Use <code>_grapheme_breaks</code> functions for such -cases. +determine the boundary. 
+This is the case in particular with syllables in Indic scripts or emojis. +Use <code>_grapheme_breaks</code> functions for such cases. </p></dd></dl> <dl> <dt><u>Function:</u> void <b>u8_grapheme_prev</b><i> (const uint8_t *<var>s</var>, const uint8_t *<var>start</var>)</i> -<a name="IDX772"></a> +<a name="IDX790"></a> </dt> <dt><u>Function:</u> void <b>u16_grapheme_prev</b><i> (const uint16_t *<var>s</var>, const uint16_t *<var>start</var>)</i> -<a name="IDX773"></a> +<a name="IDX791"></a> </dt> <dt><u>Function:</u> void <b>u32_grapheme_prev</b><i> (const uint32_t *<var>s</var>, const uint32_t *<var>start</var>)</i> -<a name="IDX774"></a> +<a name="IDX792"></a> </dt> <dd><p>Returns the start of the grapheme cluster preceding <var>s</var>, or <var>start</var> if no grapheme cluster break is encountered before it. @@ -127,8 +128,9 @@ Returns NULL if and only if <code><var>s</var> == <var>start</var></code>. </p> <p>Note that these functions do not handle the case when a character outside of the range between <var>start</var> and <var>s</var> is needed to -determine the boundary. Use <code>_grapheme_breaks</code> functions for such -cases. +determine the boundary. +This is the case in particular with syllables in Indic scripts or emojis. +Use <code>_grapheme_breaks</code> functions for such cases. </p> <p>Note also that these functions work only on well-formed Unicode strings. </p></dd></dl> @@ -138,19 +140,19 @@ boundaries in a string. 
</p> <dl> <dt><u>Function:</u> void <b>u8_grapheme_breaks</b><i> (const uint8_t *<var>s</var>, size_t <var>n</var>, char *<var>p</var>)</i> -<a name="IDX775"></a> +<a name="IDX793"></a> </dt> <dt><u>Function:</u> void <b>u16_grapheme_breaks</b><i> (const uint16_t *<var>s</var>, size_t <var>n</var>, char *<var>p</var>)</i> -<a name="IDX776"></a> +<a name="IDX794"></a> </dt> <dt><u>Function:</u> void <b>u32_grapheme_breaks</b><i> (const uint32_t *<var>s</var>, size_t <var>n</var>, char *<var>p</var>)</i> -<a name="IDX777"></a> +<a name="IDX795"></a> </dt> <dt><u>Function:</u> void <b>ulc_grapheme_breaks</b><i> (const char *<var>s</var>, size_t <var>n</var>, char *<var>p</var>)</i> -<a name="IDX778"></a> +<a name="IDX796"></a> </dt> <dt><u>Function:</u> void <b>uc_grapheme_breaks</b><i> (const ucs_t *<var>s</var>, size_t <var>n</var>, char *<var>p</var>)</i> -<a name="IDX779"></a> +<a name="IDX797"></a> </dt> <dd><p>Determines the grapheme cluster break points in <var>s</var>, an array of <var>n</var> units, and stores the result at <code><var>p</var>[0..<var>nx</var>-1]</code>. @@ -177,8 +179,8 @@ characters. <hr size="6"> <a name="Grapheme-cluster-break-property"></a> -<a name="SEC56"></a> -<h2 class="section"> <a href="libunistring_toc.html#TOC56">10.2 Grapheme cluster break property</a> </h2> +<a name="SEC58"></a> +<h2 class="section"> <a href="libunistring_toc.html#TOC58">10.2 Grapheme cluster break property</a> </h2> <p>This is a more low-level API. The grapheme cluster break property is a property defined in Unicode Standard Annex #29, section “Grapheme Cluster @@ -191,58 +193,58 @@ property. More values may be added in the future. 
</p> <dl> <dt><u>Constant:</u> int <b>GBP_OTHER</b> -<a name="IDX780"></a> +<a name="IDX798"></a> </dt> <dt><u>Constant:</u> int <b>GBP_CR</b> -<a name="IDX781"></a> +<a name="IDX799"></a> </dt> <dt><u>Constant:</u> int <b>GBP_LF</b> -<a name="IDX782"></a> +<a name="IDX800"></a> </dt> <dt><u>Constant:</u> int <b>GBP_CONTROL</b> -<a name="IDX783"></a> +<a name="IDX801"></a> </dt> <dt><u>Constant:</u> int <b>GBP_EXTEND</b> -<a name="IDX784"></a> +<a name="IDX802"></a> </dt> <dt><u>Constant:</u> int <b>GBP_PREPEND</b> -<a name="IDX785"></a> +<a name="IDX803"></a> </dt> <dt><u>Constant:</u> int <b>GBP_SPACINGMARK</b> -<a name="IDX786"></a> +<a name="IDX804"></a> </dt> <dt><u>Constant:</u> int <b>GBP_L</b> -<a name="IDX787"></a> +<a name="IDX805"></a> </dt> <dt><u>Constant:</u> int <b>GBP_V</b> -<a name="IDX788"></a> +<a name="IDX806"></a> </dt> <dt><u>Constant:</u> int <b>GBP_T</b> -<a name="IDX789"></a> +<a name="IDX807"></a> </dt> <dt><u>Constant:</u> int <b>GBP_LV</b> -<a name="IDX790"></a> +<a name="IDX808"></a> </dt> <dt><u>Constant:</u> int <b>GBP_LVT</b> -<a name="IDX791"></a> +<a name="IDX809"></a> </dt> <dt><u>Constant:</u> int <b>GBP_RI</b> -<a name="IDX792"></a> +<a name="IDX810"></a> </dt> <dt><u>Constant:</u> int <b>GBP_ZWJ</b> -<a name="IDX793"></a> +<a name="IDX811"></a> </dt> <dt><u>Constant:</u> int <b>GBP_EB</b> -<a name="IDX794"></a> +<a name="IDX812"></a> </dt> <dt><u>Constant:</u> int <b>GBP_EM</b> -<a name="IDX795"></a> +<a name="IDX813"></a> </dt> <dt><u>Constant:</u> int <b>GBP_GAZ</b> -<a name="IDX796"></a> +<a name="IDX814"></a> </dt> <dt><u>Constant:</u> int <b>GBP_EBG</b> -<a name="IDX797"></a> +<a name="IDX815"></a> </dt> </dl> @@ -251,7 +253,7 @@ character. </p> <dl> <dt><u>Function:</u> int <b>uc_graphemeclusterbreak_property</b><i> (ucs4_t <var>uc</var>)</i> -<a name="IDX798"></a> +<a name="IDX816"></a> </dt> <dd><p>Returns the Grapheme_Cluster_Break property of a Unicode character. 
</p></dd></dl> @@ -262,7 +264,7 @@ the higher-level functions in the previous section are directly based. </p> <dl> <dt><u>Function:</u> bool <b>uc_is_grapheme_break</b><i> (ucs4_t <var>a</var>, ucs4_t <var>b</var>)</i> -<a name="IDX799"></a> +<a name="IDX817"></a> </dt> <dd><p>Returns true if there is an grapheme cluster boundary between Unicode characters <var>a</var> and <var>b</var>. @@ -276,13 +278,14 @@ described in the Unicode standard, because the standard says that they are preferred. </p> <p>Note that this function does not handle the case when three or more -consecutive characters are needed to determine the boundary. Use -<code>uc_grapheme_breaks</code> for such cases. +consecutive characters are needed to determine the boundary. +This is the case in particular with syllables in Indic scripts or emojis. +Use <code>uc_grapheme_breaks</code> for such cases. </p></dd></dl> <hr size="6"> <table cellpadding="1" cellspacing="1" border="0"> -<tr><td valign="middle" align="left">[<a href="#SEC54" title="Beginning of this chapter or previous chapter"> << </a>]</td> -<td valign="middle" align="left">[<a href="libunistring_11.html#SEC57" title="Next chapter"> >> </a>]</td> +<tr><td valign="middle" align="left">[<a href="#SEC56" title="Beginning of this chapter or previous chapter"> << </a>]</td> +<td valign="middle" align="left">[<a href="libunistring_11.html#SEC59" title="Next chapter"> >> </a>]</td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> <td valign="middle" align="left"> </td> @@ -290,12 +293,12 @@ consecutive characters are needed to determine the boundary. 
Use <td valign="middle" align="left"> </td> <td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Top" title="Cover (top) of document">Top</a>]</td> <td valign="middle" align="left">[<a href="libunistring_toc.html#SEC_Contents" title="Table of contents">Contents</a>]</td> -<td valign="middle" align="left">[<a href="libunistring_21.html#SEC92" title="Index">Index</a>]</td> +<td valign="middle" align="left">[<a href="libunistring_21.html#SEC94" title="Index">Index</a>]</td> <td valign="middle" align="left">[<a href="libunistring_abt.html#SEC_About" title="About (help)"> ? </a>]</td> </tr></table> <p> <font size="-1"> - This document was generated by <em>Bruno Haible</em> on <em>October, 16 2022</em> using <a href="https://www.nongnu.org/texi2html/"><em>texi2html 1.78a</em></a>. + This document was generated by <em>Bruno Haible</em> on <em>February, 24 2024</em> using <a href="https://www.nongnu.org/texi2html/"><em>texi2html 1.78a</em></a>. </font> <br> |
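The notes added in this diff say that a check which sees only two adjacent characters cannot decide boundaries that depend on three or more consecutive characters, with emojis as one example. The following is a minimal sketch of why that is so, using pairs of Regional Indicator symbols (flag emojis) as the test case. It does not call libunistring: every function here (`toy_is_ri`, `toy_pairwise_break`, `toy_breaks`, and the two counters) is a hypothetical stand-in written for illustration, it models only the Regional Indicator pairing rule from UAX #29, and it treats every other position as a boundary. The flag array filled by `toy_breaks` is assumed to follow the same convention as the `_grapheme_breaks` functions, where a nonzero `p[i]` marks the start of a cluster at `s[i]`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not libunistring): Regional Indicator symbols
   occupy U+1F1E6..U+1F1FF. */
static bool toy_is_ri(uint32_t uc)
{
    return uc >= 0x1F1E6 && uc <= 0x1F1FF;
}

/* Toy pairwise rule: it sees only a and b, so the best it can do is
   never break between two Regional Indicators. */
static bool toy_pairwise_break(uint32_t a, uint32_t b)
{
    return !(toy_is_ri(a) && toy_is_ri(b));
}

/* Toy whole-string rule: p[i] = 1 if a cluster starts at s[i].
   Regional Indicators are grouped in pairs, which requires knowing how
   many of them immediately precede the current position. */
static void toy_breaks(const uint32_t *s, size_t n, char *p)
{
    size_t ri_run = 0; /* consecutive Regional Indicators ending at s[i-1] */

    assert(n == 0 || (s != NULL && p != NULL));
    for (size_t i = 0; i < n; i++) {
        if (i > 0 && toy_is_ri(s[i - 1]) && toy_is_ri(s[i]))
            p[i] = (ri_run % 2 == 0); /* even run: previous pair is complete */
        else
            p[i] = 1;                 /* toy simplification: break everywhere else */
        ri_run = toy_is_ri(s[i]) ? ri_run + 1 : 0;
    }
}

/* Number of clusters found when only adjacent pairs are inspected. */
static size_t toy_count_pairwise(const uint32_t *s, size_t n)
{
    size_t count = (n > 0) ? 1 : 0;

    for (size_t i = 1; i < n; i++)
        if (toy_pairwise_break(s[i - 1], s[i]))
            count++;
    return count;
}

/* Number of clusters recorded in a flag array filled by toy_breaks. */
static size_t toy_count_flags(const char *p, size_t n)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        if (p[i])
            count++;
    return count;
}
```

For the four code points U+1F1E9 U+1F1EA U+1F1EB U+1F1F7 (two flags in a row), the pairwise counter reports a single cluster, while the flag array from `toy_breaks` marks positions 0 and 2 as cluster starts. The difference comes entirely from the `ri_run` state, which is information a two-argument function has no way to receive.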