comparison etc/unicode/unicode-consortium/BIG5.TXT @ 3803:e51807f9eedd
[xemacs-hg @ 2007-01-27 18:28:57 by stephent]
Fix up copying situation in etc/unicode/unicode-consortium. <87mz4471zg.fsf@uwakimon.sk.tsukuba.ac.jp>
author | stephent |
---|---|
date | Sat, 27 Jan 2007 18:29:06 +0000 |
parents | 943eaba38521 |
children | 49c847ce8aa6 |
3802:d6f975442bd3 | 3803:e51807f9eedd |
---|---|
2 # Name: BIG5 to Unicode table (complete) | 2 # Name: BIG5 to Unicode table (complete) |
3 # Unicode version: 1.1 | 3 # Unicode version: 1.1 |
4 # Table version: 0.0d3 | 4 # Table version: 0.0d3 |
5 # Table format: Format A | 5 # Table format: Format A |
6 # Date: 11 February 1994 | 6 # Date: 11 February 1994 |
7 # Authors: Glenn Adams <glenn@metis.com> | |
8 # John H. Jenkins <John_Jenkins@taligent.com> | |
9 # | 7 # |
10 # Copyright (c) 1991-1994 Unicode, Inc. All Rights reserved. | 8 # Copyright (c) 1991-1994 Unicode, Inc. All Rights reserved. |
11 # | 9 # |
12 # This file is provided as-is by Unicode, Inc. (The Unicode Consortium). | 10 # This file is provided as-is by Unicode, Inc. (The Unicode Consortium). |
13 # No claims are made as to fitness for any particular purpose. No | 11 # No claims are made as to fitness for any particular purpose. No |
23 # specifically excludes the right to re-distribute this file directly | 21 # specifically excludes the right to re-distribute this file directly |
24 # to third parties or other organizations whether for profit or not. | 22 # to third parties or other organizations whether for profit or not. |
25 # | 23 # |
26 # General notes: | 24 # General notes: |
27 # | 25 # |
28 # This table contains the data Metis and Taligent currently have on how | 26 # |
29 # BIG5 characters map into Unicode. | 27 # This table contains one set of mappings from BIG5 into Unicode. |
| | 28 # Note that these data are *possible* mappings only and may not be the |
| | 29 # same as those used by actual products, nor may they be the best suited |
| | 30 # for all uses. For more information on the mappings between various code |
| | 31 # pages incorporating the repertoire of BIG5 and Unicode, consult the |
| | 32 # VENDORS mapping data. Normative information on the mapping between |
| | 33 # BIG5 and Unicode may be found in the Unihan.txt file in the |
| | 34 # latest Unicode Character Database. |
| | 35 # |
| | 36 # If you have carefully considered the fact that the mappings in |
| | 37 # this table are only one possible set of mappings between BIG5 and |
| | 38 # Unicode and have no normative status, but still feel that you |
| | 39 # have located an error in the table that requires fixing, you may |
| | 40 # report any such error to errata@unicode.org. |
30 # | 41 # |
31 # WARNING! It is currently impossible to provide round-trip compatibility | 42 # WARNING! It is currently impossible to provide round-trip compatibility |
32 # between BIG5 and Unicode. | 43 # between BIG5 and Unicode. |
33 # | 44 # |
34 # A number of characters are not currently mapped because | 45 # A number of characters are not currently mapped because |
50 # | 61 # |
51 # Notes: | 62 # Notes: |
52 # | 63 # |
53 # 1. In addition to the above, there is some uncertainty about the | 64 # 1. In addition to the above, there is some uncertainty about the |
54 # mappings in the range C6A1 - C8FE, and F9DD - F9FE. The ETEN | 65 # mappings in the range C6A1 - C8FE, and F9DD - F9FE. The ETEN |
55 # version of BIG5 organizes the former range differently, and adds | 66 # version of BIG5 organizes the former range differently, and adds |
56 # additional characters in the latter range. The correct mappings | 67 # additional characters in the latter range. The correct mappings |
57 # these ranges need to be determined. | 68 # these ranges need to be determined. |
58 # | 69 # |
59 # 2. There is an uncertainty in the mapping of the Big Five character | 70 # 2. There is an uncertainty in the mapping of the Big Five character |
60 # 0xA3BC. This character occurs within the Big Five block of tone marks | 71 # 0xA3BC. This character occurs within the Big Five block of tone marks |
61 # for bopomofo and is intended to be the tone mark for the first tone in | 72 # for bopomofo and is intended to be the tone mark for the first tone in |
62 # Mandarin Chinese. We have selected the mapping U+02C9 MODIFIER LETTER | 73 # Mandarin Chinese. We have selected the mapping U+02C9 MODIFIER LETTER |
63 # MACRON (Mandarin Chinese first tone) to reflect this semantic. | 74 # MACRON (Mandarin Chinese first tone) to reflect this semantic. |
64 # However, because bopomofo uses the absense of a tone mark to indicate | 75 # However, because bopomofo uses the absense of a tone mark to indicate |
65 # the first Mandarin tone, most implementations of Big Five represent | 76 # the first Mandarin tone, most implementations of Big Five represent |
66 # this character with a blank space, and so a mapping such as U+2003 EM SPACE | 77 # this character with a blank space, and so a mapping such as U+2003 EM |
67 # might be preferred. | 78 # SPACE might be preferred. |
68 # | |
69 # | |
70 # | 79 # |
71 # Format: Three tab-separated columns | 80 # Format: Three tab-separated columns |
72 # Column #1 is the BIG5 code (in hex as 0xXXXX) | 81 # Column #1 is the BIG5 code (in hex as 0xXXXX) |
73 # Column #2 is the Unicode (in hex as 0xXXXX) | 82 # Column #2 is the Unicode (in hex as 0xXXXX) |
74 # Column #3 is the Unicode name (follows a comment sign, '#') | 83 # Column #3 is the Unicode name (follows a comment sign, '#') |
75 # The official names for Unicode characters U+4E00 | 84 # The official names for Unicode characters U+4E00 |
76 # to U+9FA5, inclusive, is "CJK UNIFIED IDEOGRAPH-XXXX", | 85 # to U+9FA5, inclusive, is "CJK UNIFIED IDEOGRAPH-XXXX", |
77 # where XXXX is the code point. Including all these | 86 # where XXXX is the code point. Including all these |
78 # names in this file increases its size substantially | 87 # names in this file increases its size substantially |
79 # and needlessly. The token "<CJK>" is used for the | 88 # and needlessly. The token "<CJK>" is used for the |
80 # name of these characters. If necessary, it can be | 89 # name of these characters. If necessary, it can be |
81 # expanded algorithmically by a parser or editor. | 90 # expanded algorithmically by a parser or editor. |
82 # | 91 # |
83 # The entries are in BIG5 order | 92 # The entries are in BIG5 order |
84 # | |
85 # Any comments or problems, contact <John_Jenkins@taligent.com> | |
86 # | 93 # |
87 # | 94 # |
88 0xA140 0x3000 # IDEOGRAPHIC SPACE | 95 0xA140 0x3000 # IDEOGRAPHIC SPACE |
89 0xA141 0xFF0C # FULLWIDTH COMMA | 96 0xA141 0xFF0C # FULLWIDTH COMMA |
90 0xA142 0x3001 # IDEOGRAPHIC COMMA | 97 0xA142 0x3001 # IDEOGRAPHIC COMMA |
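
The "Format" notes in the table header describe three tab-separated columns and say that the "<CJK>" token can be expanded algorithmically. A minimal Python sketch of a reader for such lines might look like the following. The function name `parse_line` is hypothetical, the sample lines in the usage note are illustrative input rather than quotations from the file, and splitting on any whitespace (not only tabs) is an assumption made so that copies with converted tabs still parse.

```python
def parse_line(line):
    """Parse one line of a BIG5.TXT-style table.

    Returns (big5_code, unicode_code, name), or None for blank
    lines and comment lines that start with '#'.
    """
    line = line.strip()
    if not line or line.startswith("#"):
        return None

    # Columns are documented as tab-separated; splitting on any
    # whitespace is an assumption that also tolerates spaces.
    fields = line.split(None, 2)
    if len(fields) < 2:
        raise ValueError("expected at least two columns: %r" % line)

    big5 = int(fields[0], 16)   # int() accepts the leading "0x" in base 16
    ucs = int(fields[1], 16)

    # Column 3 follows a comment sign; drop the '#' and padding.
    name = fields[2].lstrip("#").strip() if len(fields) > 2 else ""

    # Expand the "<CJK>" placeholder as the header notes describe.
    if name == "<CJK>":
        name = "CJK UNIFIED IDEOGRAPH-%04X" % ucs

    return big5, ucs, name
```

For example, `parse_line("0xA140\t0x3000\t# IDEOGRAPHIC SPACE")` yields the two code values as integers plus the name, and a line whose third column is `# <CJK>` gets its name rebuilt from the Unicode value. Given the header's warning that round-trip compatibility is not possible, a reverse (Unicode to BIG5) lookup built from these tuples should not be assumed to be one-to-one.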