NEWS


1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165

New in 1.2:
* The data tables and algorithms have been updated to Unicode version 15.1.0.
* New functions u8_pcpy, u16_pcpy, u32_pcpy, similar to mempcpy.
* New functions uc_indic_conjunct_break_name, uc_indic_conjunct_break_byname,
  uc_indic_conjunct_break.
* New functions
    uc_is_property_prepended_concatenation_mark,
    uc_is_property_id_compat_math_start, uc_is_property_id_compat_math_continue,
    uc_is_property_ids_unary_operator
  and new constants
    UC_PROPERTY_PREPENDED_CONCATENATION_MARK,
    UC_PROPERTY_ID_COMPAT_MATH_START, UC_PROPERTY_ID_COMPAT_MATH_CONTINUE,
    UC_PROPERTY_IDS_UNARY_OPERATOR.
* New constant _libunistring_unicode_version.
* The UTF-8 decoder functions, especially u8_mbtouc, are now more Unicode
  Standard compliant.
* The *printf functions no longer support the %n directive, for security
  reasons.
* Fixed a bug in the *printf functions: In the %U, %lU, %llU directives, a
  negative width given as an argument did not trigger left-justification.
* The functions u16_strstr and u32_strstr now operate in worst-case linear
  time.

New in 1.1:
* The data tables and algorithms have been updated to Unicode version 15.0.0.

New in 1.0:
* The license has changed from "LGPLv3+ or GPLv2" to "LGPLv3+ or GPLv2+".
* The data tables and algorithms have been updated to Unicode version 14.0.0.
* The functions u8_uctomb, u16_uctomb, u32_uctomb now support strings larger
  than 2 GiB by taking an 'n' argument of type ptrdiff_t (instead of int).
* The functions u*_possible_linebreaks and u*_width_linebreaks now make it
  easier to work with strings that contain CR-LF sequences: In this case,
  in the returned array, it will return UC_BREAK_CR_BEFORE_LF followed by
  UC_BREAK_MANDATORY (instead of twice UC_BREAK_MANDATORY).
* There are new properties for recognizing pictographic symbols and
  regional indicators:
    - UC_PROPERTY_EMOJI                  uc_is_property_emoji
    - UC_PROPERTY_EMOJI_PRESENTATION     uc_is_property_emoji_presentation
    - UC_PROPERTY_EMOJI_MODIFIER         uc_is_property_emoji_modifier
    - UC_PROPERTY_EMOJI_MODIFIER_BASE    uc_is_property_emoji_modifier_base
    - UC_PROPERTY_EMOJI_COMPONENT        uc_is_property_emoji_component
    - UC_PROPERTY_EXTENDED_PICTOGRAPHIC  uc_is_property_extended_pictographic
    - UC_PROPERTY_REGIONAL_INDICATOR     uc_is_property_regional_indicator
* Fixed multithread-safety bugs on Cygwin, native Windows, and Haiku.

New in 0.9.10:
* The functions
    u8_casing_prefix_context, u8_casing_prefixes_context,
    u8_casing_suffix_context, u8_casing_suffixes_context,
    u16_casing_prefix_context, u16_casing_prefixes_context,
    u16_casing_suffix_context, u16_casing_suffixes_context,
    u32_casing_prefix_context, u32_casing_prefixes_context,
    u32_casing_suffix_context, u32_casing_suffixes_context,
  that are documented since version 0.9.1, are now actually implemented.

New in 0.9.9:
* Fixed a multithread-safety bug.

New in 0.9.8:
* The data tables and line breaking algorithm have been updated to Unicode
  version 9.0.0.
* In the include file unigbrk.h, the function uc_grapheme_breaks has
  been added to accommodate the new UAX#29 rules involving 3 or more
  consecutive characters.

New in 0.9.7:
* The license has changed from LGPLv3+ to "LGPLv3+ or GPLv2"

New in 0.9.6:
* The data tables and line breaking algorithm have been updated to Unicode
  version 8.0.0.

New in 0.9.5:
* The data tables and line breaking algorithm have been updated to Unicode
  version 7.0.0.
* In the include file uniname.h, the function unicode_name_character
  has been extended to look for name aliases.

New in 0.9.4:
* The data tables and line breaking algorithm have been updated to Unicode
  version 6.0.0.
* A new include file unigbrk.h is provided. It declares functions for
  grapheme cluster breaking, that is, determining the boundaries between
  graphemes. See the documentation chapter "Grapheme cluster breaks in strings"
  for details.
* In the include file unictype.h, constants are defined for the group of
  general categories LC ("Cased Letter").
* In the include file unictype.h, functions for associating canonical
  combining classes with names have been added:
    uc_combining_class_name
    uc_combining_class_long_name
    uc_combining_class_byname
* In the include file unictype.h, functions for the Arabic joining type and
  the Arabic joining group have been added:
    uc_joining_type_name
    uc_joining_type_long_name
    uc_joining_type_byname
    uc_joining_type
    uc_joining_group_name
    uc_joining_group_byname
    uc_joining_group
* In the include file unictype.h, functions for new predefined properties
  have been added:
    uc_is_property_cased
    uc_is_property_case_ignorable
    uc_is_property_changes_when_lowercased
    uc_is_property_changes_when_uppercased
    uc_is_property_changes_when_titlecased
    uc_is_property_changes_when_casefolded
    uc_is_property_changes_when_casemapped
  But it's recommended to use the case mapping functions from unicase.h
  instead.
* In the include file unictype.h, the functions for bidi class, formerly known
  as bidirectional category, have been renamed:
    uc_bidi_category_name   -> uc_bidi_class_name
    uc_bidi_category_byname -> uc_bidi_class_byname
    uc_bidi_category        -> uc_bidi_class
    uc_is_bidi_category     -> uc_is_bidi_class
  The old function names still exist, but are obsolete.
* In the include file unictype.h, functions for returning long names of
  property values have been added:
    uc_general_category_long_name
    uc_bidi_class_long_name
  The functions
    uc_general_category_byname
    uc_bidi_class_byname
  have been extended to recognize long names as well as short names.
* It is now easier to detect the subminor version: The value of the variable
  _libunistring_version and of the macro _LIBUNISTRING_VERSION now includes
  also the subminor version.
* The functions u8_mbtouc and u8_mbtouc_unsafe now handle ill-formed UTF-8
  input in a better way, that is more compliant with W3C recommendations.
* The functions u8_strcoll, u16_strcoll, u32_strcoll now produce results that
  are less dependent on the iconv implementation in use.
* The functions u8_strstr, u16_strstr, u32_strstr now perform in O(n) time
  worst-case, where n is the sum of the lengths of the argument strings.

New in 0.9.3:
* Bug fixes in unistr.h functions:
  - The functions u16_to_u32, u16_to_u8, u8_to_u32, u8_to_u16 now fail when
    the argument is not valid. Previously, they returned a converted string
    where invalid parts were each replaced with U+FFFD.
  - The function u8_mbsnlen now counts an incomplete character at the end
    of the argument string as 1 character. Previously, it could count as 2
    or 3 characters.
  - The return value of the u8_stpncpy, u16_stpncpy, u32_stpncpy functions
    was incorrect.
  - The u8_strcoll, u16_strcoll, u32_strcoll now try harder to give a non-zero
    return value.
* Portability to MacOS X 10.6 and Cygwin 1.7.

New in 0.9.2:
* The function uc_locale_language now uses the locale of the current thread,
  if a thread-specific locale has been set.

New in 0.9.1:
* In the include file unicase.h, functions for case mapping of substrings have
  been added.

New in 0.9:
* The new include file unicase.h implements case folding.
* The new include file uninorm.h implements normalization.
* The new include file uniwbrk.h implements word break determination.
* The library source code is now part of gnulib.