Unicode Properties (from Unicode Version: 8.0.0)

 15: ASCII_Hex_Digit
 16: Ahom
 17: Alphabetic
 18: Anatolian_Hieroglyphs
 19: Any
 20: Arabic
 21: Armenian
 22: Assigned
 23: Avestan
 24: Balinese
 25: Bamum
 26: Bassa_Vah
 27: Batak
 28: Bengali
 29: Bidi_Control
 30: Bopomofo
 31: Brahmi
 32: Braille
 33: Buginese
 34: Buhid
 35: C
 36: Canadian_Aboriginal
 37: Carian
 38: Case_Ignorable
 39: Cased
 40: Caucasian_Albanian
 41: Cc
 42: Cf
 43: Chakma
 44: Cham
 45: Changes_When_Casefolded
 46: Changes_When_Casemapped
 47: Changes_When_Lowercased
 48: Changes_When_Titlecased
 49: Changes_When_Uppercased
 50: Cherokee
 51: Cn
 52: Co
 53: Common
 54: Coptic
 55: Cs
 56: Cuneiform
 57: Cypriot
 58: Cyrillic
 59: Dash
 60: Default_Ignorable_Code_Point
 61: Deprecated
 62: Deseret
 63: Devanagari
 64: Diacritic
 65: Duployan
 66: Egyptian_Hieroglyphs
 67: Elbasan
 68: Ethiopic
 69: Extender
 70: Georgian
 71: Glagolitic
 72: Gothic
 73: Grantha
 74: Grapheme_Base
 75: Grapheme_Extend
 76: Grapheme_Link
 77: Greek
 78: Gujarati
 79: Gurmukhi
 80: Han
 81: Hangul
 82: Hanunoo
 83: Hatran
 84: Hebrew
 85: Hex_Digit
 86: Hiragana
 87: Hyphen
 88: IDS_Binary_Operator
 89: IDS_Trinary_Operator
 90: ID_Continue
 91: ID_Start
 92: Ideographic
 93: Imperial_Aramaic
 94: Inherited
 95: Inscriptional_Pahlavi
 96: Inscriptional_Parthian
 97: Javanese
 98: Join_Control
 99: Kaithi
100: Kannada
101: Katakana
102: Kayah_Li
103: Kharoshthi
104: Khmer
105: Khojki
106: Khudawadi
107: L
108: LC
109: Lao
110: Latin
111: Lepcha
112: Limbu
113: Linear_A
114: Linear_B
115: Lisu
116: Ll
117: Lm
118: Lo
119: Logical_Order_Exception
120: Lowercase
121: Lt
122: Lu
123: Lycian
124: Lydian
125: M
126: Mahajani
127: Malayalam
128: Mandaic
129: Manichaean
130: Math
131: Mc
132: Me
133: Meetei_Mayek
134: Mende_Kikakui
135: Meroitic_Cursive
136: Meroitic_Hieroglyphs
137: Miao
138: Mn
139: Modi
140: Mongolian
141: Mro
142: Multani
143: Myanmar
144: N
145: Nabataean
146: Nd
147: New_Tai_Lue
148: Nko
149: Nl
150: No
151: Noncharacter_Code_Point
152: Ogham
153: Ol_Chiki
154: Old_Hungarian
155: Old_Italic
156: Old_North_Arabian
157: Old_Permic
158: Old_Persian
159: Old_South_Arabian
160: Old_Turkic
161: Oriya
162: Osmanya
163: Other_Alphabetic
164: Other_Default_Ignorable_Code_Point
165: Other_Grapheme_Extend
166: Other_ID_Continue
167: Other_ID_Start
168: Other_Lowercase
169: Other_Math
170: Other_Uppercase
171: P
172: Pahawh_Hmong
173: Palmyrene
174: Pattern_Syntax
175: Pattern_White_Space
176: Pau_Cin_Hau
177: Pc
178: Pd
179: Pe
180: Pf
181: Phags_Pa
182: Phoenician
183: Pi
184: Po
185: Ps
186: Psalter_Pahlavi
187: Quotation_Mark
188: Radical
189: Rejang
190: Runic
191: S
192: STerm
193: Samaritan
194: Saurashtra
195: Sc
196: Sharada
197: Shavian
198: Siddham
199: SignWriting
200: Sinhala
201: Sk
202: Sm
203: So
204: Soft_Dotted
205: Sora_Sompeng
206: Sundanese
207: Syloti_Nagri
208: Syriac
209: Tagalog
210: Tagbanwa
211: Tai_Le
212: Tai_Tham
213: Tai_Viet
214: Takri
215: Tamil
216: Telugu
217: Terminal_Punctuation
218: Thaana
219: Thai
220: Tibetan
221: Tifinagh
222: Tirhuta
223: Ugaritic
224: Unified_Ideograph
225: Unknown
226: Uppercase
227: Vai
228: Variation_Selector
229: Warang_Citi
230: White_Space
231: XID_Continue
232: XID_Start
233: Yi
234: Z
235: Zl
236: Zp
237: Zs
 40: Aghb
 15: AHex
 20: Arab
 93: Armi
 21: Armn
 23: Avst
 24: Bali
 25: Bamu
 26: Bass
 27: Batk
 28: Beng
 29: Bidi_C
 30: Bopo
 31: Brah
 32: Brai
 33: Bugi
 34: Buhd
 43: Cakm
 36: Cans
 37: Cari
108: Cased_Letter
 50: Cher
 38: CI
179: Close_Punctuation
125: Combining_Mark
177: Connector_Punctuation
 41: Control
 54: Copt
 57: Cprt
195: Currency_Symbol
 45: CWCF
 46: CWCM
 47: CWL
 48: CWT
 49: CWU
 58: Cyrl
178: Dash_Punctuation
146: Decimal_Number
 61: Dep
 63: Deva
 60: DI
 64: Dia
 62: Dsrt
 65: Dupl
 66: Egyp
 67: Elba
132: Enclosing_Mark
 68: Ethi
 69: Ext
180: Final_Punctuation
 42: Format
 70: Geor
 71: Glag
 72: Goth
 73: Gran
 74: Gr_Base
 77: Grek
 75: Gr_Ext
 76: Gr_Link
 78: Gujr
 79: Guru
 81: Hang
 80: Hani
 82: Hano
 83: Hatr
 84: Hebr
 85: Hex
 86: Hira
 18: Hluw
172: Hmng
154: Hung
 90: IDC
 92: Ideo
 91: IDS
 88: IDSB
 89: IDST
183: Initial_Punctuation
155: Ital
 97: Java
 98: Join_C
102: Kali
101: Kana
103: Khar
104: Khmr
105: Khoj
100: Knda
 99: Kthi
212: Lana
109: Laoo
110: Latn
111: Lepc
107: Letter
149: Letter_Number
112: Limb
113: Lina
114: Linb
235: Line_Separator
119: LOE
116: Lowercase_Letter
123: Lyci
124: Lydi
126: Mahj
128: Mand
129: Mani
125: Mark
202: Math_Symbol
134: Mend
135: Merc
136: Mero
127: Mlym
117: Modifier_Letter
201: Modifier_Symbol
140: Mong
141: Mroo
133: Mtei
142: Mult
143: Mymr
156: Narb
145: Nbat
151: NChar
148: Nkoo
138: Nonspacing_Mark
144: Number
163: OAlpha
164: ODI
152: Ogam
165: OGr_Ext
166: OIDC
167: OIDS
153: Olck
168: OLower
169: OMath
185: Open_Punctuation
160: Orkh
161: Orya
162: Osma
 35: Other
118: Other_Letter
150: Other_Number
184: Other_Punctuation
203: Other_Symbol
170: OUpper
173: Palm
236: Paragraph_Separator
174: Pat_Syn
175: Pat_WS
176: Pauc
157: Perm
181: Phag
 95: Phli
186: Phlp
182: Phnx
137: Plrd
 52: Private_Use
 96: Prti
171: Punctuation
 54: Qaac
 94: Qaai
187: QMark
189: Rjng
190: Runr
193: Samr
159: Sarb
194: Saur
204: SD
234: Separator
199: Sgnw
197: Shaw
196: Shrd
198: Sidd
106: Sind
200: Sinh
205: Sora
237: Space_Separator
131: Spacing_Mark
206: Sund
 55: Surrogate
207: Sylo
191: Symbol
208: Syrc
210: Tagb
214: Takr
211: Tale
147: Talu
215: Taml
213: Tavt
216: Telu
217: Term
221: Tfng
209: Tglg
218: Thaa
220: Tibt
222: Tirh
121: Titlecase_Letter
223: Ugar
224: UIdeo
 51: Unassigned
122: Uppercase_Letter
227: Vaii
228: VS
229: Wara
230: WSpace
231: XIDC
232: XIDS
158: Xpeo
 56: Xsux
233: Yiii
 94: Zinh
 53: Zyyy
225: Zzzz
238: In_Basic_Latin
239: In_Latin_1_Supplement
240: In_Latin_Extended_A
241: In_Latin_Extended_B
242: In_IPA_Extensions
243: In_Spacing_Modifier_Letters
244: In_Combining_Diacritical_Marks
245: In_Greek_and_Coptic
246: In_Cyrillic
247: In_Cyrillic_Supplement
248: In_Armenian
249: In_Hebrew
250: In_Arabic
251: In_Syriac
252: In_Arabic_Supplement
253: In_Thaana
254: In_NKo
255: In_Samaritan
256: In_Mandaic
257: In_Arabic_Extended_A
258: In_Devanagari
259: In_Bengali
260: In_Gurmukhi
261: In_Gujarati
262: In_Oriya
263: In_Tamil
264: In_Telugu
265: In_Kannada
266: In_Malayalam
267: In_Sinhala
268: In_Thai
269: In_Lao
270: In_Tibetan
271: In_Myanmar
272: In_Georgian
273: In_Hangul_Jamo
274: In_Ethiopic
275: In_Ethiopic_Supplement
276: In_Cherokee
277: In_Unified_Canadian_Aboriginal_Syllabics
278: In_Ogham
279: In_Runic
280: In_Tagalog
281: In_Hanunoo
282: In_Buhid
283: In_Tagbanwa
284: In_Khmer
285: In_Mongolian
286: In_Unified_Canadian_Aboriginal_Syllabics_Extended
287: In_Limbu
288: In_Tai_Le
289: In_New_Tai_Lue
290: In_Khmer_Symbols
291: In_Buginese
292: In_Tai_Tham
293: In_Combining_Diacritical_Marks_Extended
294: In_Balinese
295: In_Sundanese
296: In_Batak
297: In_Lepcha
298: In_Ol_Chiki
299: In_Sundanese_Supplement
300: In_Vedic_Extensions
301: In_Phonetic_Extensions
302: In_Phonetic_Extensions_Supplement
303: In_Combining_Diacritical_Marks_Supplement
304: In_Latin_Extended_Additional
305: In_Greek_Extended
306: In_General_Punctuation
307: In_Superscripts_and_Subscripts
308: In_Currency_Symbols
309: In_Combining_Diacritical_Marks_for_Symbols
310: In_Letterlike_Symbols
311: In_Number_Forms
312: In_Arrows
313: In_Mathematical_Operators
314: In_Miscellaneous_Technical
315: In_Control_Pictures
316: In_Optical_Character_Recognition
317: In_Enclosed_Alphanumerics
318: In_Box_Drawing
319: In_Block_Elements
320: In_Geometric_Shapes
321: In_Miscellaneous_Symbols
322: In_Dingbats
323: In_Miscellaneous_Mathematical_Symbols_A
324: In_Supplemental_Arrows_A
325: In_Braille_Patterns
326: In_Supplemental_Arrows_B
327: In_Miscellaneous_Mathematical_Symbols_B
328: In_Supplemental_Mathematical_Operators
329: In_Miscellaneous_Symbols_and_Arrows
330: In_Glagolitic
331: In_Latin_Extended_C
332: In_Coptic
333: In_Georgian_Supplement
334: In_Tifinagh
335: In_Ethiopic_Extended
336: In_Cyrillic_Extended_A
337: In_Supplemental_Punctuation
338: In_CJK_Radicals_Supplement
339: In_Kangxi_Radicals
340: In_Ideographic_Description_Characters
341: In_CJK_Symbols_and_Punctuation
342: In_Hiragana
343: In_Katakana
344: In_Bopomofo
345: In_Hangul_Compatibility_Jamo
346: In_Kanbun
347: In_Bopomofo_Extended
348: In_CJK_Strokes
349: In_Katakana_Phonetic_Extensions
350: In_Enclosed_CJK_Letters_and_Months
351: In_CJK_Compatibility
352: In_CJK_Unified_Ideographs_Extension_A
353: In_Yijing_Hexagram_Symbols
354: In_CJK_Unified_Ideographs
355: In_Yi_Syllables
356: In_Yi_Radicals
357: In_Lisu
358: In_Vai
359: In_Cyrillic_Extended_B
360: In_Bamum
361: In_Modifier_Tone_Letters
362: In_Latin_Extended_D
363: In_Syloti_Nagri
364: In_Common_Indic_Number_Forms
365: In_Phags_pa
366: In_Saurashtra
367: In_Devanagari_Extended
368: In_Kayah_Li
369: In_Rejang
370: In_Hangul_Jamo_Extended_A
371: In_Javanese
372: In_Myanmar_Extended_B
373: In_Cham
374: In_Myanmar_Extended_A
375: In_Tai_Viet
376: In_Meetei_Mayek_Extensions
377: In_Ethiopic_Extended_A
378: In_Latin_Extended_E
379: In_Cherokee_Supplement
380: In_Meetei_Mayek
381: In_Hangul_Syllables
382: In_Hangul_Jamo_Extended_B
383: In_High_Surrogates
384: In_High_Private_Use_Surrogates
385: In_Low_Surrogates
386: In_Private_Use_Area
387: In_CJK_Compatibility_Ideographs
388: In_Alphabetic_Presentation_Forms
389: In_Arabic_Presentation_Forms_A
390: In_Variation_Selectors
391: In_Vertical_Forms
392: In_Combining_Half_Marks
393: In_CJK_Compatibility_Forms
394: In_Small_Form_Variants
395: In_Arabic_Presentation_Forms_B
396: In_Halfwidth_and_Fullwidth_Forms
397: In_Specials
398: In_Linear_B_Syllabary
399: In_Linear_B_Ideograms
400: In_Aegean_Numbers
401: In_Ancient_Greek_Numbers
402: In_Ancient_Symbols
403: In_Phaistos_Disc
404: In_Lycian
405: In_Carian
406: In_Coptic_Epact_Numbers
407: In_Old_Italic
408: In_Gothic
409: In_Old_Permic
410: In_Ugaritic
411: In_Old_Persian
412: In_Deseret
413: In_Shavian
414: In_Osmanya
415: In_Elbasan
416: In_Caucasian_Albanian
417: In_Linear_A
418: In_Cypriot_Syllabary
419: In_Imperial_Aramaic
420: In_Palmyrene
421: In_Nabataean
422: In_Hatran
423: In_Phoenician
424: In_Lydian
425: In_Meroitic_Hieroglyphs
426: In_Meroitic_Cursive
427: In_Kharoshthi
428: In_Old_South_Arabian
429: In_Old_North_Arabian
430: In_Manichaean
431: In_Avestan
432: In_Inscriptional_Parthian
433: In_Inscriptional_Pahlavi
434: In_Psalter_Pahlavi
435: In_Old_Turkic
436: In_Old_Hungarian
437: In_Rumi_Numeral_Symbols
438: In_Brahmi
439: In_Kaithi
440: In_Sora_Sompeng
441: In_Chakma
442: In_Mahajani
443: In_Sharada
444: In_Sinhala_Archaic_Numbers
445: In_Khojki
446: In_Multani
447: In_Khudawadi
448: In_Grantha
449: In_Tirhuta
450: In_Siddham
451: In_Modi
452: In_Takri
453: In_Ahom
454: In_Warang_Citi
455: In_Pau_Cin_Hau
456: In_Cuneiform
457: In_Cuneiform_Numbers_and_Punctuation
458: In_Early_Dynastic_Cuneiform
459: In_Egyptian_Hieroglyphs
460: In_Anatolian_Hieroglyphs
461: In_Bamum_Supplement
462: In_Mro
463: In_Bassa_Vah
464: In_Pahawh_Hmong
465: In_Miao
466: In_Kana_Supplement
467: In_Duployan
468: In_Shorthand_Format_Controls
469: In_Byzantine_Musical_Symbols
470: In_Musical_Symbols
471: In_Ancient_Greek_Musical_Notation
472: In_Tai_Xuan_Jing_Symbols
473: In_Counting_Rod_Numerals
474: In_Mathematical_Alphanumeric_Symbols
475: In_Sutton_SignWriting
476: In_Mende_Kikakui
477: In_Arabic_Mathematical_Alphabetic_Symbols
478: In_Mahjong_Tiles
479: In_Domino_Tiles
480: In_Playing_Cards
481: In_Enclosed_Alphanumeric_Supplement
482: In_Enclosed_Ideographic_Supplement
483: In_Miscellaneous_Symbols_and_Pictographs
484: In_Emoticons
485: In_Ornamental_Dingbats
486: In_Transport_and_Map_Symbols
487: In_Alchemical_Symbols
488: In_Geometric_Shapes_Extended
489: In_Supplemental_Arrows_C
490: In_Supplemental_Symbols_and_Pictographs
491: In_CJK_Unified_Ideographs_Extension_B
492: In_CJK_Unified_Ideographs_Extension_C
493: In_CJK_Unified_Ideographs_Extension_D
494: In_CJK_Unified_Ideographs_Extension_E
495: In_CJK_Compatibility_Ideographs_Supplement
496: In_Tags
497: In_Variation_Selectors_Supplement
498: In_Supplementary_Private_Use_Area_A
499: In_Supplementary_Private_Use_Area_B
500: In_No_Block
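
Each entry above has the form "<index>: <name>", and a short alias repeats the
index of the name it stands for (for example, 15 is listed as both
ASCII_Hex_Digit and AHex). The following is a minimal sketch of how the listing
could be grouped by index. The helper names (parse_properties, is_block) and the
SAMPLE excerpt are illustrative assumptions, not part of any library, and the
sketch makes no claim about what the index means to the code that consumes this
file.

```python
import re
from collections import defaultdict

# A few lines copied from the listing above; the full file has the same shape.
SAMPLE = """\
Unicode Properties (from Unicode Version: 8.0.0)

 15: ASCII_Hex_Digit
 20: Arabic
108: LC
 15: AHex
 20: Arab
108: Cased_Letter
238: In_Basic_Latin
"""

# Optional leading spaces, a decimal index, a colon, then one name token.
LINE_RE = re.compile(r"^\s*(\d+):\s*(\S+)\s*$")


def parse_properties(text):
    """Group names by their numeric index, keeping the order of the file."""
    groups = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # the title line and blank lines do not match and are skipped
            groups[int(match.group(1))].append(match.group(2))
    return dict(groups)


def is_block(name):
    """Block entries in this listing are the ones prefixed with 'In_'."""
    return name.startswith("In_")


props = parse_properties(SAMPLE)
print(props[15])   # ['ASCII_Hex_Digit', 'AHex']
print(props[108])  # ['LC', 'Cased_Letter']
```

Grouping by index rather than by name keeps every spelling of a property
together, which makes it easy to check that an alias and its long form resolve
to the same entry.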