blob: 8eb7c25502665827043b32c2c5f64dfcbf1fac99 (
plain)
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
|
This subdirectory contains a general character-set conversion
library, used in Timber, and available for use in other software if
it should happen to be useful.
I intend to use this same library in other programs at some future
date. (A cut-down version of it is already in use in some ports of
PuTTY.) It is therefore a _strong_ design goal that this library
should remain perfectly general, and not tied to particulars of
Timber. It must not reference any code outside its own subdirectory;
it should not have Timber-specific helper routines added to it
unless they can be documented in a general manner which might make
them useful in other circumstances as well.
There are some multibyte character encodings which this library does
not currently support. Those that I know of are:
- Johab. There is no reason why we _shouldn't_ support this, but it
wasn't immediately necessary at the time I did the initial
coding. If anyone needs it, it shouldn't be too hard. The Unicode
mapping table for the encoding is available at
http://www.unicode.org/Public/MAPPINGS/OBSOLETE/EASTASIA/KSC/JOHAB.TXT
- ISO-2022-JP-1 (RFC 2237), and ISO-2022-JP-2 (RFC 1554). These
should be even easier if required - we already have the ISO 2022
machinery in place, and support all the underlying character
sets.
- ISO-2022-CN and ISO-2022-CN-EXT (RFC 1922). These are a little tricky
as they allow use of both GB2312 (simplified Chinese) and CNS 11643
(traditional Chinese), so we may need some way to specify which to
prefer.
- The Hong Kong (HKSCS) extension to Big5. Again, mapping tables
are available in the Unihan database.
- Other Big Five extensions, which I don't have mapping tables for
at all.
|