Docstoc

GUJARATI

Document Sample
GUJARATI Powered By Docstoc
					##Adobe File Version: 1.000
#=======================================================================
#   FTP file name: GUJARATI.TXT
#
#   Contents:       Map (external version) from Mac OS Gujarati
#                   encoding to Unicode 2.1
#
#   Copyright:      (c) 1997-1999 by Apple Computer, Inc., all rights
#                   reserved.
#
#   Contact:        charsets@apple.com
#
#   Changes:
#
#       b02 1999-Sep-22     Update contact e-mail address. Matches
#                           internal utom<b1>, ufrm<b1>, and Text
#                           Encoding Converter version 1.5.
#       n02 1998-Feb-05     First version; matches internal utom<n4>,
#                           ufrm<n5>.
#
# Standard header:
# ----------------
#
#   Apple, the Apple logo, and Macintosh are trademarks of Apple
#   Computer, Inc., registered in the United States and other countries.
#   Unicode is a trademark of Unicode Inc. For the sake of brevity,
#   throughout this document, "Macintosh" can be used to refer to
#   Macintosh computers and "Unicode" can be used to refer to the
#   Unicode standard.
#
#   Apple makes no warranty or representation, either express or
#   implied, with respect to these tables, their quality, accuracy, or
#   fitness for a particular purpose. In no event will Apple be liable
#   for direct, indirect, special, incidental, or consequential damages
#   resulting from any defect or inaccuracy in this document or the
#   accompanying tables.
#
#   These mapping tables and character lists are subject to change.
#   The latest tables should be available from the following:
#
#   <ftp://ftp.unicode.org/Public/MAPPINGS/VENDORS/APPLE/>
#
<ftp://dev.apple.com/devworld/Technical_Documentation/Misc._Standards/>
#
#   For general information about Mac OS encodings and these mapping
#   tables, see the file "README.TXT".
#
# Format:
# -------
#
#   Three tab-separated columns;
#   '#' begins a comment which continues to the end of the line.
#     Column #1 is the Mac OS Gujarati code or code sequence
#       (in hex as 0xNN or 0xNN+0xNN)
#     Column #2 is the corresponding Unicode or Unicode sequence
#       (in hex as 0xNNNN or 0xNNNN+0xNNNN).
#     Column #3 is a comment containing the Unicode name or sequence
#       of names. In some cases an additional comment follows the
#       Unicode name(s).
#
#   The entries are in two sections. The first section is for pairs of
#   Mac OS Gujarati code points that must be mapped in a special way.
#   The second section maps individual code points.
#
#   Within each section, the entries are in Mac OS Gujarati code order.
#
#   Control character mappings are not shown in this table, following
#   the conventions of the standard UTC mapping tables. However, the
#   Mac OS Gujarati character set uses the standard control characters
#   at 0x00-0x1F and 0x7F.
#
# Notes on Mac OS Gujarati:
# -------------------------
#
#   Mac OS Gujarati is based on IS 13194:1991 (ISCII-91), with the
#   addition of several punctuation and symbol characters. However,
#   Mac OS Gujarati does not support the ATR (attribute) mechanism of
#   ISCII-91.
#
# 1. ISCII-91 features in Mac OS Gujarati include:
#
# a) Overloading of nukta
#
#     In addition to using the nukta (0xE9) like a combining dot below,
#     nukta is overloaded to function as a general character modifier.
#     In this role, certain code points followed by 0xE9 are treated as
#     a two-byte code point representing a character which may be
#     rather different than the characters represented by either of
#     the code points alone. For example, the character GUJARATI OM
#     (U+0AD0) is represented in ISCII-91 as candrabindu + nukta.
#
# b) Explicit halant and soft halant
#
#     A double halant (0xE8 + 0xE8) constitutes an "explicit halant",
#     which will always appear as a halant instead of causing formation
#     of a ligature or half-form consonant.
#
#     Halant followed by nukta (0xE8 + 0xE9) constitutes a "soft
#     halant", which prevents formation of a ligature and instead
#     retains the half-form of the first consonant.
#
# c) Invisible consonant
#
#     The byte 0xD9 (called INV in ISCII-91) is an invisible consonant:
#     It behaves like a consonant but has no visible appearance. It is
#     intended to be used (often in combination with halant) to display
#     dependent forms in isolation, such as the RA forms or consonant
#     half-forms.
#
#   d) Extensions for Vedic, etc.
#
#      The byte 0xF0 (called EXT in ISCII-91) followed by any byte in
#      the range 0xA1-0xEE constitutes a two-byte code point which can
#      be used to represent additional characters for Vedic (or other
#      extensions); 0xF0 followed by any other byte value constitutes
#      malformed text. Mac OS Gujarati supports this mechanism, but
#      does not currently map any of these two-byte code points to
#      anything.
#
#   2. Mac OS Gujarati additions
#
#    Mac OS Gujarati adds characters using the code points
#    0x80-0x8A and 0x90.
#
#   3. Unused code points
#
#    The following code points are currently unused, and are not shown
#    here: 0x8B-0x8F, 0x91-0xA0, 0xAB, 0xAF, 0xC7, 0xCE, 0xD0, 0xD3,
#    0xE0, 0xE4, 0xEB-0xEF, 0xFB-0xFF. In addition, 0xF0 is not shown
#    here, but it has a special function as described above.
#
#   Unicode mapping issues and notes:
#   ---------------------------------
#
#   1. Mapping the byte pairs
#
#    If one of the following byte values is encountered when mapping
#    Mac OS Gujarati text - xA1, xAA, xDF, or 0xE8 - then the next
#    byte (if there is one) should be examined. If the next byte is
#    0xE9 - or also 0xE8, if the first byte was 0xE8 - then the byte
#    pair should be mapped using the first section of the mapping
#    table below. Otherwise, each byte should be mapped using the
#    second section of the mapping table below.
#
#     - The Unicode Standard, Version 2.0, specifies how explicit
#       halant and soft halant should be represented in Unicode;
#       these mappings are used below.
#
#    If the byte value 0xF0 is encountered when mapping Mac OS
#    Gujarati text, then the next byte should be examined. If there
#    is no next byte (e.g. 0xF0 at end of buffer), the mapping
#    process should indicate incomplete character. If there is a next
#    byte but it is not in the range 0xA1-0xEE, the mapping process
#    should indicate malformed text. Otherwise, the mapping process
#    should treat the byte pair as a valid two-byte code point with no
#    mapping (e.g. map it to QUESTION MARK, REPLACEMENT CHARACTER,
#    etc.).
#
#   2. Mapping the invisible consonant
#
#    It has been suggested that INV in ISCII-91 should map to ZERO
#    WIDTH NON-JOINER in Unicode. However, this causes problems with
#   roundtrip fidelity: The ISCII-91 sequences 0xE8+0xE8 and 0xE8+0xD9
#   would map to the same sequence of Unicode characters. We have
#   instead mapped INV to LEFT-TO-RIGHT MARK, which avoids these
#   problems.
#
# Details of mapping changes in each version:
# -------------------------------------------
#
##################

# Section 1: Map the following byte pairs as indicated:
# (ZWNJ means ZERO WIDTH NON-JOINER, ZWJ means ZERO WIDTH JOINER)
# (Also see note about 0xF0 in comments above)

0xA1+0xE9   0x0AD0     # GUJARATI OM
0xAA+0xE9   0x0AE0     # GUJARATI LETTER VOCALIC RR
0xDF+0xE9   0x0AC4     # GUJARATI VOWEL SIGN VOCALIC RR
0xE8+0xE8   0x0ACD+0x200C    # GUJARATI SIGN VIRAMA + ZWNJ      # explicit
halant
0xE8+0xE9   0x0ACD+0x200D     # GUJARATI SIGN VIRAMA + ZWJ # soft halant

# Section 2: Map the remaining bytes as follows:

0x20   0x0020    #   SPACE
0x21   0x0021    #   EXCLAMATION MARK
0x22   0x0022    #   QUOTATION MARK
0x23   0x0023    #   NUMBER SIGN
0x24   0x0024    #   DOLLAR SIGN
0x25   0x0025    #   PERCENT SIGN
0x26   0x0026    #   AMPERSAND
0x27   0x0027    #   APOSTROPHE
0x28   0x0028    #   LEFT PARENTHESIS
0x29   0x0029    #   RIGHT PARENTHESIS
0x2A   0x002A    #   ASTERISK
0x2B   0x002B    #   PLUS SIGN
0x2C   0x002C    #   COMMA
0x2D   0x002D    #   HYPHEN-MINUS
0x2E   0x002E    #   FULL STOP
0x2F   0x002F    #   SOLIDUS
0x30   0x0030    #   DIGIT ZERO
0x31   0x0031    #   DIGIT ONE
0x32   0x0032    #   DIGIT TWO
0x33   0x0033    #   DIGIT THREE
0x34   0x0034    #   DIGIT FOUR
0x35   0x0035    #   DIGIT FIVE
0x36   0x0036    #   DIGIT SIX
0x37   0x0037    #   DIGIT SEVEN
0x38   0x0038    #   DIGIT EIGHT
0x39   0x0039    #   DIGIT NINE
0x3A   0x003A    #   COLON
0x3B   0x003B    #   SEMICOLON
0x3C   0x003C    #   LESS-THAN SIGN
0x3D   0x003D    #   EQUALS SIGN
0x3E   0x003E    #   GREATER-THAN SIGN
0x3F   0x003F   #   QUESTION MARK
0x40   0x0040   #   COMMERCIAL AT
0x41   0x0041   #   LATIN CAPITAL LETTER   A
0x42   0x0042   #   LATIN CAPITAL LETTER   B
0x43   0x0043   #   LATIN CAPITAL LETTER   C
0x44   0x0044   #   LATIN CAPITAL LETTER   D
0x45   0x0045   #   LATIN CAPITAL LETTER   E
0x46   0x0046   #   LATIN CAPITAL LETTER   F
0x47   0x0047   #   LATIN CAPITAL LETTER   G
0x48   0x0048   #   LATIN CAPITAL LETTER   H
0x49   0x0049   #   LATIN CAPITAL LETTER   I
0x4A   0x004A   #   LATIN CAPITAL LETTER   J
0x4B   0x004B   #   LATIN CAPITAL LETTER   K
0x4C   0x004C   #   LATIN CAPITAL LETTER   L
0x4D   0x004D   #   LATIN CAPITAL LETTER   M
0x4E   0x004E   #   LATIN CAPITAL LETTER   N
0x4F   0x004F   #   LATIN CAPITAL LETTER   O
0x50   0x0050   #   LATIN CAPITAL LETTER   P
0x51   0x0051   #   LATIN CAPITAL LETTER   Q
0x52   0x0052   #   LATIN CAPITAL LETTER   R
0x53   0x0053   #   LATIN CAPITAL LETTER   S
0x54   0x0054   #   LATIN CAPITAL LETTER   T
0x55   0x0055   #   LATIN CAPITAL LETTER   U
0x56   0x0056   #   LATIN CAPITAL LETTER   V
0x57   0x0057   #   LATIN CAPITAL LETTER   W
0x58   0x0058   #   LATIN CAPITAL LETTER   X
0x59   0x0059   #   LATIN CAPITAL LETTER   Y
0x5A   0x005A   #   LATIN CAPITAL LETTER   Z
0x5B   0x005B   #   LEFT SQUARE BRACKET
0x5C   0x005C   #   REVERSE SOLIDUS
0x5D   0x005D   #   RIGHT SQUARE BRACKET
0x5E   0x005E   #   CIRCUMFLEX ACCENT
0x5F   0x005F   #   LOW LINE
0x60   0x0060   #   GRAVE ACCENT
0x61   0x0061   #   LATIN SMALL LETTER A
0x62   0x0062   #   LATIN SMALL LETTER B
0x63   0x0063   #   LATIN SMALL LETTER C
0x64   0x0064   #   LATIN SMALL LETTER D
0x65   0x0065   #   LATIN SMALL LETTER E
0x66   0x0066   #   LATIN SMALL LETTER F
0x67   0x0067   #   LATIN SMALL LETTER G
0x68   0x0068   #   LATIN SMALL LETTER H
0x69   0x0069   #   LATIN SMALL LETTER I
0x6A   0x006A   #   LATIN SMALL LETTER J
0x6B   0x006B   #   LATIN SMALL LETTER K
0x6C   0x006C   #   LATIN SMALL LETTER L
0x6D   0x006D   #   LATIN SMALL LETTER M
0x6E   0x006E   #   LATIN SMALL LETTER N
0x6F   0x006F   #   LATIN SMALL LETTER O
0x70   0x0070   #   LATIN SMALL LETTER P
0x71   0x0071   #   LATIN SMALL LETTER Q
0x72   0x0072   #   LATIN SMALL LETTER R
0x73   0x0073   #   LATIN SMALL LETTER S
0x74   0x0074   #   LATIN SMALL LETTER T
0x75   0x0075   #   LATIN SMALL LETTER U
0x76   0x0076   #   LATIN SMALL LETTER V
0x77   0x0077   #   LATIN SMALL LETTER W
0x78   0x0078   #   LATIN SMALL LETTER X
0x79   0x0079   #   LATIN SMALL LETTER Y
0x7A   0x007A   #   LATIN SMALL LETTER Z
0x7B   0x007B   #   LEFT CURLY BRACKET
0x7C   0x007C   #   VERTICAL LINE
0x7D   0x007D   #   RIGHT CURLY BRACKET
0x7E   0x007E   #   TILDE
#
0x80   0x00D7   #   MULTIPLICATION SIGN
0x81   0x2212   #   MINUS SIGN
0x82   0x2013   #   EN DASH
0x83   0x2014   #   EM DASH
0x84   0x2018   #   LEFT SINGLE QUOTATION MARK
0x85   0x2019   #   RIGHT SINGLE QUOTATION MARK
0x86   0x2026   #   HORIZONTAL ELLIPSIS
0x87   0x2022   #   BULLET
0x88   0x00A9   #   COPYRIGHT SIGN
0x89   0x00AE   #   REGISTERED SIGN
0x8A   0x2122   #   TRADE MARK SIGN
#
0x90   0x0965   # DEVANAGARI DOUBLE DANDA
#
0xA1   0x0A81   #   GUJARATI   SIGN CANDRABINDU
0xA2   0x0A82   #   GUJARATI   SIGN ANUSVARA
0xA3   0x0A83   #   GUJARATI   SIGN VISARGA
0xA4   0x0A85   #   GUJARATI   LETTER A
0xA5   0x0A86   #   GUJARATI   LETTER AA
0xA6   0x0A87   #   GUJARATI   LETTER I
0xA7   0x0A88   #   GUJARATI   LETTER II
0xA8   0x0A89   #   GUJARATI   LETTER U
0xA9   0x0A8A   #   GUJARATI   LETTER UU
0xAA   0x0A8B   #   GUJARATI   LETTER VOCALIC R
#
0xAC   0x0A8F   # GUJARATI LETTER E
0xAD   0x0A90   # GUJARATI LETTER AI
0xAE   0x0A8D   # GUJARATI VOWEL CANDRA E
#
0xB0   0x0A93   #   GUJARATI   LETTER O
0xB1   0x0A94   #   GUJARATI   LETTER AU
0xB2   0x0A91   #   GUJARATI   VOWEL CANDRA O
0xB3   0x0A95   #   GUJARATI   LETTER KA
0xB4   0x0A96   #   GUJARATI   LETTER KHA
0xB5   0x0A97   #   GUJARATI   LETTER GA
0xB6   0x0A98   #   GUJARATI   LETTER GHA
0xB7   0x0A99   #   GUJARATI   LETTER NGA
0xB8   0x0A9A   #   GUJARATI   LETTER CA
0xB9   0x0A9B   #   GUJARATI   LETTER CHA
0xBA   0x0A9C   #   GUJARATI   LETTER JA
0xBB   0x0A9D   #   GUJARATI   LETTER JHA
0xBC   0x0A9E   #   GUJARATI   LETTER NYA
0xBD   0x0A9F   #   GUJARATI   LETTER TTA
0xBE   0x0AA0   #   GUJARATI   LETTER   TTHA
0xBF   0x0AA1   #   GUJARATI   LETTER   DDA
0xC0   0x0AA2   #   GUJARATI   LETTER   DDHA
0xC1   0x0AA3   #   GUJARATI   LETTER   NNA
0xC2   0x0AA4   #   GUJARATI   LETTER   TA
0xC3   0x0AA5   #   GUJARATI   LETTER   THA
0xC4   0x0AA6   #   GUJARATI   LETTER   DA
0xC5   0x0AA7   #   GUJARATI   LETTER   DHA
0xC6   0x0AA8   #   GUJARATI   LETTER   NA
#
0xC8   0x0AAA   #   GUJARATI   LETTER   PA
0xC9   0x0AAB   #   GUJARATI   LETTER   PHA
0xCA   0x0AAC   #   GUJARATI   LETTER   BA
0xCB   0x0AAD   #   GUJARATI   LETTER   BHA
0xCC   0x0AAE   #   GUJARATI   LETTER   MA
0xCD   0x0AAF   #   GUJARATI   LETTER   YA
#
0xCF   0x0AB0   # GUJARATI LETTER RA
#
0xD1   0x0AB2   # GUJARATI LETTER LA
0xD2   0x0AB3   # GUJARATI LETTER LLA
#
0xD4   0x0AB5   #   GUJARATI LETTER VA
0xD5   0x0AB6   #   GUJARATI LETTER SHA
0xD6   0x0AB7   #   GUJARATI LETTER SSA
0xD7   0x0AB8   #   GUJARATI LETTER SA
0xD8   0x0AB9   #   GUJARATI LETTER HA
0xD9   0x200E   #   LEFT-TO-RIGHT MARK         # invisible consonant
0xDA   0x0ABE   #   GUJARATI VOWEL SIGN       AA
0xDB   0x0ABF   #   GUJARATI VOWEL SIGN       I
0xDC   0x0AC0   #   GUJARATI VOWEL SIGN       II
0xDD   0x0AC1   #   GUJARATI VOWEL SIGN       U
0xDE   0x0AC2   #   GUJARATI VOWEL SIGN       UU
0xDF   0x0AC3   #   GUJARATI VOWEL SIGN       VOCALIC R
#
0xE1   0x0AC7   # GUJARATI VOWEL SIGN E
0xE2   0x0AC8   # GUJARATI VOWEL SIGN AI
0xE3   0x0AC5   # GUJARATI VOWEL SIGN CANDRA E
#
0xE5   0x0ACB   #   GUJARATI VOWEL SIGN O
0xE6   0x0ACC   #   GUJARATI VOWEL SIGN AU
0xE7   0x0AC9   #   GUJARATI VOWEL SIGN CANDRA O
0xE8   0x0ACD   #   GUJARATI SIGN VIRAMA # halant
0xE9   0x0ABC   #   GUJARATI SIGN NUKTA
0xEA   0x0964   #   DEVANAGARI DANDA
#
0xF1   0x0AE6   #   GUJARATI   DIGIT   ZERO
0xF2   0x0AE7   #   GUJARATI   DIGIT   ONE
0xF3   0x0AE8   #   GUJARATI   DIGIT   TWO
0xF4   0x0AE9   #   GUJARATI   DIGIT   THREE
0xF5   0x0AEA   #   GUJARATI   DIGIT   FOUR
0xF6   0x0AEB   #   GUJARATI   DIGIT   FIVE
0xF7   0x0AEC   #   GUJARATI   DIGIT   SIX
0xF8   0x0AED   #   GUJARATI   DIGIT   SEVEN
0xF9   0x0AEE   # GUJARATI DIGIT EIGHT
0xFA   0x0AEF   # GUJARATI DIGIT NINE

				
DOCUMENT INFO
Shared By:
Categories:
Tags:
Stats:
views:0
posted:12/8/2012
language:Latin
pages:8
fdsjlkhjlq fdisfsji fdsjlkhjlq fdisfsji success.... http://
About