Unicode Properties (from Unicode Version: 8.0.0)

  1: Any
  2: Assigned
  3: C
  4: Cc
  5: Cf
  6: Cn
  7: Co
  8: Cs
  9: L
 10: LC
 11: Ll
 12: Lm
 13: Lo
 14: Lt
 15: Lu
 16: M
 17: Mc
 18: Me
 19: Mn
 20: N
 21: Nd
 22: Nl
 23: No
 24: P
 25: Pc
 26: Pd
 27: Pe
 28: Pf
 29: Pi
 30: Po
 31: Ps
 32: S
 33: Sc
 34: Sk
 35: Sm
 36: So
 37: Z
 38: Zl
 39: Zp
 40: Zs
 41: Math
 42: Alphabetic
 43: Lowercase
 44: Uppercase
 45: Cased
 46: Case_Ignorable
 47: Changes_When_Lowercased
 48: Changes_When_Uppercased
 49: Changes_When_Titlecased
 50: Changes_When_Casefolded
 51: Changes_When_Casemapped
 52: ID_Start
 53: ID_Continue
 54: XID_Start
 55: XID_Continue
 56: Default_Ignorable_Code_Point
 57: Grapheme_Extend
 58: Grapheme_Base
 59: Grapheme_Link
 60: Common
 61: Latin
 62: Greek
 63: Cyrillic
 64: Armenian
 65: Hebrew
 66: Arabic
 67: Syriac
 68: Thaana
 69: Devanagari
 70: Bengali
 71: Gurmukhi
 72: Gujarati
 73: Oriya
 74: Tamil
 75: Telugu
 76: Kannada
 77: Malayalam
 78: Sinhala
 79: Thai
 80: Lao
 81: Tibetan
 82: Myanmar
 83: Georgian
 84: Hangul
 85: Ethiopic
 86: Cherokee
 87: Canadian_Aboriginal
 88: Ogham
 89: Runic
 90: Khmer
 91: Mongolian
 92: Hiragana
 93: Katakana
 94: Bopomofo
 95: Han
 96: Yi
 97: Old_Italic
 98: Gothic
 99: Deseret
100: Inherited
101: Tagalog
102: Hanunoo
103: Buhid
104: Tagbanwa
105: Limbu
106: Tai_Le
107: Linear_B
108: Ugaritic
109: Shavian
110: Osmanya
111: Cypriot
112: Braille
113: Buginese
114: Coptic
115: New_Tai_Lue
116: Glagolitic
117: Tifinagh
118: Syloti_Nagri
119: Old_Persian
120: Kharoshthi
121: Balinese
122: Cuneiform
123: Phoenician
124: Phags_Pa
125: Nko
126: Sundanese
127: Lepcha
128: Ol_Chiki
129: Vai
130: Saurashtra
131: Kayah_Li
132: Rejang
133: Lycian
134: Carian
135: Lydian
136: Cham
137: Tai_Tham
138: Tai_Viet
139: Avestan
140: Egyptian_Hieroglyphs
141: Samaritan
142: Lisu
143: Bamum
144: Javanese
145: Meetei_Mayek
146: Imperial_Aramaic
147: Old_South_Arabian
148: Inscriptional_Parthian
149: Inscriptional_Pahlavi
150: Old_Turkic
151: Kaithi
152: Batak
153: Brahmi
154: Mandaic
155: Chakma
156: Meroitic_Cursive
157: Meroitic_Hieroglyphs
158: Miao
159: Sharada
160: Sora_Sompeng
161: Takri
162: Caucasian_Albanian
163: Bassa_Vah
164: Duployan
165: Elbasan
166: Grantha
167: Pahawh_Hmong
168: Khojki
169: Linear_A
170: Mahajani
171: Manichaean
172: Mende_Kikakui
173: Modi
174: Mro
175: Old_North_Arabian
176: Nabataean
177: Palmyrene
178: Pau_Cin_Hau
179: Old_Permic
180: Psalter_Pahlavi
181: Siddham
182: Khudawadi
183: Tirhuta
184: Warang_Citi
185: Ahom
186: Anatolian_Hieroglyphs
187: Hatran
188: Multani
189: Old_Hungarian
190: SignWriting
191: White_Space
192: Bidi_Control
193: Join_Control
194: Dash
195: Hyphen
196: Quotation_Mark
197: Terminal_Punctuation
198: Other_Math
199: Hex_Digit
200: ASCII_Hex_Digit
201: Other_Alphabetic
202: Ideographic
203: Diacritic
204: Extender
205: Other_Lowercase
206: Other_Uppercase
207: Noncharacter_Code_Point
208: Other_Grapheme_Extend
209: IDS_Binary_Operator
210: IDS_Trinary_Operator
211: Radical
212: Unified_Ideograph
213: Other_Default_Ignorable_Code_Point
214: Deprecated
215: Soft_Dotted
216: Logical_Order_Exception
217: Other_ID_Start
218: Other_ID_Continue
219: STerm
220: Variation_Selector
221: Pattern_White_Space
222: Pattern_Syntax
223: Unknown
224: Aghb
225: AHex
226: Arab
227: Armi
228: Armn
229: Avst
230: Bali
231: Bamu
232: Bass
233: Batk
234: Beng
235: Bidi_C
236: Bopo
237: Brah
238: Brai
239: Bugi
240: Buhd
241: Cakm
242: Cans
243: Cari
244: Cased_Letter
245: Cher
246: CI
247: Close_Punctuation
248: Combining_Mark
249: Connector_Punctuation
250: Control
251: Copt
252: Cprt
253: Currency_Symbol
254: CWCF
255: CWCM
256: CWL
257: CWT
258: CWU
259: Cyrl
260: Dash_Punctuation
261: Decimal_Number
262: Dep
263: Deva
264: DI
265: Dia
266: Dsrt
267: Dupl
268: Egyp
269: Elba
270: Enclosing_Mark
271: Ethi
272: Ext
273: Final_Punctuation
274: Format
275: Geor
276: Glag
277: Goth
278: Gran
279: Gr_Base
280: Grek
281: Gr_Ext
282: Gr_Link
283: Gujr
284: Guru
285: Hang
286: Hani
287: Hano
288: Hatr
289: Hebr
290: Hex
291: Hira
292: Hluw
293: Hmng
294: Hung
295: IDC
296: Ideo
297: IDS
298: IDSB
299: IDST
300: Initial_Punctuation
301: Ital
302: Java
303: Join_C
304: Kali
305: Kana
306: Khar
307: Khmr
308: Khoj
309: Knda
310: Kthi
311: Lana
312: Laoo
313: Latn
314: Lepc
315: Letter
316: Letter_Number
317: Limb
318: Lina
319: Linb
320: Line_Separator
321: LOE
322: Lowercase_Letter
323: Lyci
324: Lydi
325: Mahj
326: Mand
327: Mani
328: Mark
329: Math_Symbol
330: Mend
331: Merc
332: Mero
333: Mlym
334: Modifier_Letter
335: Modifier_Symbol
336: Mong
337: Mroo
338: Mtei
339: Mult
340: Mymr
341: Narb
342: Nbat
343: NChar
344: Nkoo
345: Nonspacing_Mark
346: Number
347: OAlpha
348: ODI
349: Ogam
350: OGr_Ext
351: OIDC
352: OIDS
353: Olck
354: OLower
355: OMath
356: Open_Punctuation
357: Orkh
358: Orya
359: Osma
360: Other
361: Other_Letter
362: Other_Number
363: Other_Punctuation
364: Other_Symbol
365: OUpper
366: Palm
367: Paragraph_Separator
368: Pat_Syn
369: Pat_WS
370: Pauc
371: Perm
372: Phag
373: Phli
374: Phlp
375: Phnx
376: Plrd
377: Private_Use
378: Prti
379: Punctuation
380: Qaac
381: Qaai
382: QMark
383: Rjng
384: Runr
385: Samr
386: Sarb
387: Saur
388: SD
389: Separator
390: Sgnw
391: Shaw
392: Shrd
393: Sidd
394: Sind
395: Sinh
396: Sora
397: Space_Separator
398: Spacing_Mark
399: Sund
400: Surrogate
401: Sylo
402: Symbol
403: Syrc
404: Tagb
405: Takr
406: Tale
407: Talu
408: Taml
409: Tavt
410: Telu
411: Term
412: Tfng
413: Tglg
414: Thaa
415: Tibt
416: Tirh
417: Titlecase_Letter
418: Ugar
419: UIdeo
420: Unassigned
421: Uppercase_Letter
422: Vaii
423: VS
424: Wara
425: WSpace
426: XIDC
427: XIDS
428: Xpeo
429: Xsux
430: Yiii
431: Zinh
432: Zyyy
433: Zzzz
434: In_Basic_Latin
435: In_Latin_1_Supplement
436: In_Latin_Extended_A
437: In_Latin_Extended_B
438: In_IPA_Extensions
439: In_Spacing_Modifier_Letters
440: In_Combining_Diacritical_Marks
441: In_Greek_and_Coptic
442: In_Cyrillic
443: In_Cyrillic_Supplement
444: In_Armenian
445: In_Hebrew
446: In_Arabic
447: In_Syriac
448: In_Arabic_Supplement
449: In_Thaana
450: In_NKo
451: In_Samaritan
452: In_Mandaic
453: In_Arabic_Extended_A
454: In_Devanagari
455: In_Bengali
456: In_Gurmukhi
457: In_Gujarati
458: In_Oriya
459: In_Tamil
460: In_Telugu
461: In_Kannada
462: In_Malayalam
463: In_Sinhala
464: In_Thai
465: In_Lao
466: In_Tibetan
467: In_Myanmar
468: In_Georgian
469: In_Hangul_Jamo
470: In_Ethiopic
471: In_Ethiopic_Supplement
472: In_Cherokee
473: In_Unified_Canadian_Aboriginal_Syllabics
474: In_Ogham
475: In_Runic
476: In_Tagalog
477: In_Hanunoo
478: In_Buhid
479: In_Tagbanwa
480: In_Khmer
481: In_Mongolian
482: In_Unified_Canadian_Aboriginal_Syllabics_Extended
483: In_Limbu
484: In_Tai_Le
485: In_New_Tai_Lue
486: In_Khmer_Symbols
487: In_Buginese
488: In_Tai_Tham
489: In_Combining_Diacritical_Marks_Extended
490: In_Balinese
491: In_Sundanese
492: In_Batak
493: In_Lepcha
494: In_Ol_Chiki
495: In_Sundanese_Supplement
496: In_Vedic_Extensions
497: In_Phonetic_Extensions
498: In_Phonetic_Extensions_Supplement
499: In_Combining_Diacritical_Marks_Supplement
500: In_Latin_Extended_Additional
501: In_Greek_Extended
502: In_General_Punctuation
503: In_Superscripts_and_Subscripts
504: In_Currency_Symbols
505: In_Combining_Diacritical_Marks_for_Symbols
506: In_Letterlike_Symbols
507: In_Number_Forms
508: In_Arrows
509: In_Mathematical_Operators
510: In_Miscellaneous_Technical
511: In_Control_Pictures
512: In_Optical_Character_Recognition
513: In_Enclosed_Alphanumerics
514: In_Box_Drawing
515: In_Block_Elements
516: In_Geometric_Shapes
517: In_Miscellaneous_Symbols
518: In_Dingbats
519: In_Miscellaneous_Mathematical_Symbols_A
520: In_Supplemental_Arrows_A
521: In_Braille_Patterns
522: In_Supplemental_Arrows_B
523: In_Miscellaneous_Mathematical_Symbols_B
524: In_Supplemental_Mathematical_Operators
525: In_Miscellaneous_Symbols_and_Arrows
526: In_Glagolitic
527: In_Latin_Extended_C
528: In_Coptic
529: In_Georgian_Supplement
530: In_Tifinagh
531: In_Ethiopic_Extended
532: In_Cyrillic_Extended_A
533: In_Supplemental_Punctuation
534: In_CJK_Radicals_Supplement
535: In_Kangxi_Radicals
536: In_Ideographic_Description_Characters
537: In_CJK_Symbols_and_Punctuation
538: In_Hiragana
539: In_Katakana
540: In_Bopomofo
541: In_Hangul_Compatibility_Jamo
542: In_Kanbun
543: In_Bopomofo_Extended
544: In_CJK_Strokes
545: In_Katakana_Phonetic_Extensions
546: In_Enclosed_CJK_Letters_and_Months
547: In_CJK_Compatibility
548: In_CJK_Unified_Ideographs_Extension_A
549: In_Yijing_Hexagram_Symbols
550: In_CJK_Unified_Ideographs
551: In_Yi_Syllables
552: In_Yi_Radicals
553: In_Lisu
554: In_Vai
555: In_Cyrillic_Extended_B
556: In_Bamum
557: In_Modifier_Tone_Letters
558: In_Latin_Extended_D
559: In_Syloti_Nagri
560: In_Common_Indic_Number_Forms
561: In_Phags_pa
562: In_Saurashtra
563: In_Devanagari_Extended
564: In_Kayah_Li
565: In_Rejang
566: In_Hangul_Jamo_Extended_A
567: In_Javanese
568: In_Myanmar_Extended_B
569: In_Cham
570: In_Myanmar_Extended_A
571: In_Tai_Viet
572: In_Meetei_Mayek_Extensions
573: In_Ethiopic_Extended_A
574: In_Latin_Extended_E
575: In_Cherokee_Supplement
576: In_Meetei_Mayek
577: In_Hangul_Syllables
578: In_Hangul_Jamo_Extended_B
579: In_High_Surrogates
580: In_High_Private_Use_Surrogates
581: In_Low_Surrogates
582: In_Private_Use_Area
583: In_CJK_Compatibility_Ideographs
584: In_Alphabetic_Presentation_Forms
585: In_Arabic_Presentation_Forms_A
586: In_Variation_Selectors
587: In_Vertical_Forms
588: In_Combining_Half_Marks
589: In_CJK_Compatibility_Forms
590: In_Small_Form_Variants
591: In_Arabic_Presentation_Forms_B
592: In_Halfwidth_and_Fullwidth_Forms
593: In_Specials
594: In_Linear_B_Syllabary
595: In_Linear_B_Ideograms
596: In_Aegean_Numbers
597: In_Ancient_Greek_Numbers
598: In_Ancient_Symbols
599: In_Phaistos_Disc
600: In_Lycian
601: In_Carian
602: In_Coptic_Epact_Numbers
603: In_Old_Italic
604: In_Gothic
605: In_Old_Permic
606: In_Ugaritic
607: In_Old_Persian
608: In_Deseret
609: In_Shavian
610: In_Osmanya
611: In_Elbasan
612: In_Caucasian_Albanian
613: In_Linear_A
614: In_Cypriot_Syllabary
615: In_Imperial_Aramaic
616: In_Palmyrene
617: In_Nabataean
618: In_Hatran
619: In_Phoenician
620: In_Lydian
621: In_Meroitic_Hieroglyphs
622: In_Meroitic_Cursive
623: In_Kharoshthi
624: In_Old_South_Arabian
625: In_Old_North_Arabian
626: In_Manichaean
627: In_Avestan
628: In_Inscriptional_Parthian
629: In_Inscriptional_Pahlavi
630: In_Psalter_Pahlavi
631: In_Old_Turkic
632: In_Old_Hungarian
633: In_Rumi_Numeral_Symbols
634: In_Brahmi
635: In_Kaithi
636: In_Sora_Sompeng
637: In_Chakma
638: In_Mahajani
639: In_Sharada
640: In_Sinhala_Archaic_Numbers
641: In_Khojki
642: In_Multani
643: In_Khudawadi
644: In_Grantha
645: In_Tirhuta
646: In_Siddham
647: In_Modi
648: In_Takri
649: In_Ahom
650: In_Warang_Citi
651: In_Pau_Cin_Hau
652: In_Cuneiform
653: In_Cuneiform_Numbers_and_Punctuation
654: In_Early_Dynastic_Cuneiform
655: In_Egyptian_Hieroglyphs
656: In_Anatolian_Hieroglyphs
657: In_Bamum_Supplement
658: In_Mro
659: In_Bassa_Vah
660: In_Pahawh_Hmong
661: In_Miao
662: In_Kana_Supplement
663: In_Duployan
664: In_Shorthand_Format_Controls
665: In_Byzantine_Musical_Symbols
666: In_Musical_Symbols
667: In_Ancient_Greek_Musical_Notation
668: In_Tai_Xuan_Jing_Symbols
669: In_Counting_Rod_Numerals
670: In_Mathematical_Alphanumeric_Symbols
671: In_Sutton_SignWriting
672: In_Mende_Kikakui
673: In_Arabic_Mathematical_Alphabetic_Symbols
674: In_Mahjong_Tiles
675: In_Domino_Tiles
676: In_Playing_Cards
677: In_Enclosed_Alphanumeric_Supplement
678: In_Enclosed_Ideographic_Supplement
679: In_Miscellaneous_Symbols_and_Pictographs
680: In_Emoticons
681: In_Ornamental_Dingbats
682: In_Transport_and_Map_Symbols
683: In_Alchemical_Symbols
684: In_Geometric_Shapes_Extended
685: In_Supplemental_Arrows_C
686: In_Supplemental_Symbols_and_Pictographs
687: In_CJK_Unified_Ideographs_Extension_B
688: In_CJK_Unified_Ideographs_Extension_C
689: In_CJK_Unified_Ideographs_Extension_D
690: In_CJK_Unified_Ideographs_Extension_E
691: In_CJK_Compatibility_Ideographs_Supplement
692: In_Tags
693: In_Variation_Selectors_Supplement
694: In_Supplementary_Private_Use_Area_A
695: In_Supplementary_Private_Use_Area_B
696: In_No_Block