Urdu Characters

Hindi – Urdu Transliteration issues

Rahmat Yousufzai and Amba Kulkarni
‫‪Urdu Alphabet‬‬

Characteristics of Urdu alphabet
 Most Urdu characters join with the
 following character to form one ligature.
Example: تکلیف तकलीफ
This Urdu word is a combination of 5 characters
5 4 3 2 1
ت ک ل ی ف
 Urdu characters typically have different
 shapes in different positions: beginning,
 middle and end.
      Characteristics contd ...
 Some characters do not join with the following
 character and are written in full form even if they
 come in a middle position.

 ‫ا آ د ذ ڈ ر ز ژڑ و‬
‫مالک آم بدلہ تہذیب بڈھا برما بزنس ویژن کڑیل بولنا‬

Please note that there is no space in between.
         Hindi Alphabet

अ आ इ ई उ ऊ ऋ
ए ऐ ओ औ अं अः

क   ख ग घ ङ
च   छ ज झ ञ
ट   ठ ड ढ ण
त   थ द ध न
प   फ ब भ म
य   र ल व श ष स ह
     Consonants missing in Hindi

‫ث ، ح ،خ ، ز ، ذ ، ص ، ض ، ط ، ظ ، ع ، غ ، ف ، ق‬

  These characters do not exist in Hindi.
  They are borrowed from Arabic and are
  used only for words borrowed from
  Arabic/Persian.
ث ص   Close to س in Urdu, with a minute difference in pronunciation. In Hindi, स is used for these characters as well as for س.
ظ ض ذ ز   In Hindi, ज and ज़ are used for all these characters. However, most of the time ज is written without the dot.
ح   Same as ہ in Urdu, with a minute difference in pronunciation. In Hindi, ह is used for both.
ق   This character is represented by क़.
خ   This character is represented by ख़.
But normally the dot is omitted.
ع : This is an Arabic consonant which is transliterated into Hindi as one of the vowels. It is close to ا (Alif), with some difference in pronunciation.
عمل    अमल
عالم    आलम
علاج    इलाज
عید    ईद
عمر    उम्र
عود    ऊद
If ع comes in the middle or at the end of a word,
most Urdu speakers pronounce it as Alif, but in
poetry special care is taken to pronounce it.
ط and ت have similar pronunciation, and in Hindi त is used to represent both.
ف This character is the same as F in English. This sound is represented by फ or फ़, but Hindi writers often do not use the dot.
‫ غ‬This sound also does not exist in Hindi.
 To represent this character, ग or ग़ is
 used. However normally Hindi writers do
 not use the dot.
‫ ژ‬This is a Persian character and does not
  exist in Arabic. In Hindi this is represented
  by ज and ज़.
Example: Television ٹےلی ویژن टेलीविज़न
ھ This is Do-chashmi He. It gives its sound only
  when joined with certain characters.

  An important point to be noted:
  Arabic has a character ھ (do-chashmi he). This
  character retains its shape only in the initial
  and middle positions. In the final position it
  changes to ہ (gola he).
Suggestion: For Urdu we should use ھ with
  Unicode code point U+06BE.
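The suggestion above can be checked mechanically. The following is a minimal sketch, assuming Python's standard `unicodedata` module; the dictionary labels and the helper name `describe` are illustrative and not part of the original slides. It prints the code point and Unicode name of the three "he" characters that are easily confused.

```python
import unicodedata

# Three visually related "he" characters (labels are illustrative).
HE_VARIANTS = {
    "do-chashmi he": "\u06BE",
    "gola he": "\u06C1",
    "Arabic he": "\u0647",
}

def describe(ch: str) -> str:
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

for label, ch in HE_VARIANTS.items():
    print(f"{label}: {describe(ch)}")
```

A check like this may help when auditing a corpus for text that was typed with the Arabic he instead of U+06BE.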
‫ں‬   Noon without dot (Noon Gunna)

   When this character occurs in the middle of a
      word, the dot is marked. This creates
      ambiguity, as it can then be read as ن
‫چھینٹا‬   ‫کانچ‬
Urdu characters borrowed from Hindi

ٹ ، ڈ ، ڑ        ट, ड, ड़

These characters do not exist in Arabic or
Persian. They have been borrowed from Hindi.
      Representation of certain Hindi
       characters in Urdu

کھ ، گھ ، چھ ، جھ ، ٹھ ، ڈھ ، ڑھ ، تھ ، دھ ، پھ ، بھ
ख, घ, छ, झ, ठ, ढ, ढ़, थ, ध, फ, भ
These are Hindi characters and are
represented in Urdu by adding ھ (Do-Chashmi He)
to the base character.
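The digraph convention described here can be sketched as a lookup table. This is only an illustration under stated assumptions: the table `ASPIRATES` and the function `find_aspirates` are hypothetical names, and the sketch only detects aspirate pairs, it does not transliterate whole words.

```python
DO_CHASHMI_HE = "\u06BE"

# Base Urdu letter (by code point) -> Hindi aspirate formed with do-chashmi he.
ASPIRATES = {
    "\u06A9": "ख",  # kaf + he
    "\u06AF": "घ",  # gaf + he
    "\u0686": "छ",  # che + he
    "\u062C": "झ",  # jim + he
    "\u0679": "ठ",  # Te (retroflex) + he
    "\u0688": "ढ",  # Dal (retroflex) + he
    "\u0691": "ढ़",  # Re (retroflex) + he
    "\u062A": "थ",  # te + he
    "\u062F": "ध",  # dal + he
    "\u067E": "फ",  # pe + he
    "\u0628": "भ",  # be + he
}

def find_aspirates(word: str) -> list[str]:
    """Return the Hindi aspirates for every base letter followed by do-chashmi he."""
    found = []
    for base, following in zip(word, word[1:]):
        if following == DO_CHASHMI_HE and base in ASPIRATES:
            found.append(ASPIRATES[base])
    return found
```

For instance, the word for "stain" (dal, do-chashmi he, be, alif) would yield a single aspirate, ध.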

‫، نھ، وھ، ےھ‬   ‫رھ ، لھ ، مھ ، نھ‬

There are no specific characters for these in Hindi.
 However, the sounds are represented as

     न्ह, म्ह, ल्ह, र्ह, व्ह, ँह, ंह, ेह
  تےرھواں ، کولھو ، تمھارا ، ننھا ، اونھ ، وھےل

   तेरहवाँ, कोल्हू, तुम्हारा, नन्हा, ऊँह, व्हेल

   However, in Urdu these words are also written
  as تےرہواں ، کولہو ، تمہارا
  But ننھا is written without change.
     Ambiguous Characters
 1. aliph (ا) & Ena (ع)
 These two characters are pronounced
 differently, but Urdu speakers do not pay
 attention to the difference.

Example: عام (आम, common), آم (आम, mango)
2. sa: se (ث), sIna (س), svAda (ص)
  se (ث) and svAda (ص) are purely Arabic and
  Persian characters and are used only in Arabic
  or Persian words, whereas sIna (س) is used
  in Hindi, Urdu, Persian and Arabic.
  All the above characters, including س, are
  written as स in Hindi.
3. Ta: Te (ت), Toya (ط)
  Toya (ط) is a purely Arabic and Persian
  character and is used only in Persian and
  Arabic words.

 Both characters are written as त in Hindi.
4. he: badI he (ح) & gola he (ہ)
  badI he (ح) is an Arabic/Persian character.
  gola he (ہ) is common to Arabic, Persian,
  Urdu and Hindi.

Hindi equivalent for both: ह
5. ja: jAla (ذ), je (ز), jvAda (ض), joya (ظ),
    PArasI je (ژ)
  The Z sound is not available in Hindi or in
  almost any Indian language. Instead, j is used.
  The sounds of jAla (ذ), je (ز), jvAda (ض) and
  joya (ظ) are almost the same, and all are
  Arabic/Persian characters.
  PArasI je (ژ) is a purely Persian character and
  is not available even in Arabic.
  For all the above characters, ज or ज़ is used.
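The ambiguity described in items 2 to 5 is a many-to-one mapping: several Urdu letters collapse to one Hindi letter, so the reverse direction has several candidates. A small sketch of this follows, assuming the nukta form ज़ is used for the Z group; the table `URDU_TO_HINDI` covers only the letters named above, and all names in the code are illustrative.

```python
from collections import defaultdict

ZA = "\u091C\u093C"  # ja + nukta, written explicitly to avoid ambiguity

# Urdu letter (by code point) -> Hindi letter, for the groups listed above.
URDU_TO_HINDI = {
    "\u062B": "स",  # se
    "\u0633": "स",  # sIna
    "\u0635": "स",  # svAda
    "\u062A": "त",  # Te
    "\u0637": "त",  # Toya
    "\u062D": "ह",  # badI he
    "\u06C1": "ह",  # gola he
    "\u0630": ZA,   # jAla
    "\u0632": ZA,   # je
    "\u0636": ZA,   # jvAda
    "\u0638": ZA,   # joya
    "\u0698": ZA,   # PArasI je
}

def reverse_candidates(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group Urdu letters by the Hindi letter they are written as."""
    reverse = defaultdict(list)
    for urdu, hindi in mapping.items():
        reverse[hindi].append(urdu)
    return dict(reverse)

CANDIDATES = reverse_candidates(URDU_TO_HINDI)
for hindi, urdu_letters in CANDIDATES.items():
    print(hindi, "<-", " ".join(f"U+{ord(u):04X}" for u in urdu_letters))
```

Going from Hindi back to Urdu therefore needs extra information, such as a word list, to choose among the candidates.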
It may be noted that if the character following अं
is प, फ, ब, भ or म, then it is pronounced as "अम";
otherwise it is pronounced as "अन".

1. अंबाला, अंभोज, अंबर
انبالہ ، انبھوج ، عنبر

 2. चंपा, कंफू, बंब, गंभीर, संमान

 3. अंक, अंग, अंतर, अंधा, अंडा
Also, when अं comes as the last character of a word,
it gives the sound of "अम".
अहं, स्वयं, बालकं
اہم      سویم           بالکم
    Interestingly, the same rule of प, फ, ब, भ, म is
applied in Urdu also, but the character م is mostly
used in proper nouns and English words.
Example: امپائر              امپھل       امبانی

           अंपायर, अंफल, अंबानी
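The pronunciation rule for the anusvara (the dot in अं) can be written as a short function. This is a sketch under the assumptions stated in the text: the sound is "m" before the labial consonants प, फ, ब, भ, म or at the end of a word, and "n" otherwise. The function names are illustrative.

```python
ANUSVARA = "\u0902"
LABIALS = set("पफबभम")

def anusvara_sound(word: str, index: int) -> str:
    """Return 'm' or 'n' for the anusvara at word[index]."""
    if word[index] != ANUSVARA:
        raise ValueError("no anusvara at this index")
    is_last = index == len(word) - 1
    if is_last or word[index + 1] in LABIALS:
        return "m"
    return "n"

def anusvara_sounds(word: str) -> list[str]:
    """Apply the rule to every anusvara in the word, left to right."""
    return [anusvara_sound(word, i) for i, ch in enumerate(word) if ch == ANUSVARA]
```

A transliterator could use the result to choose between م and ن on the Urdu side, although the text notes that Urdu spelling does not always follow the rule.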
  अः: This is also a combination of अ and ः.
  It does not come as the first character of a
  word; however, it comes in the middle and at the end.
  Example:
  अंतःप्रवेश, अंतःकरण, प्रायः, अतः
انتہ پروےش   انتہ کرن     پرایہ   اتہ
ङ, ञ, ण
  These characters do not exist in Urdu.
  Instead ن is used.
  वाङ्मय, पिङ्गला   وانگ مے ، پنگلا
  चञ्चल (चञ् चल), गुञ्जन (गुञ् जन)   چنچل ، گنجن
  कारण, कणक   کارن ، کنک
ष, श
  These two characters have almost the same
  sound, with a minute difference.
  In Urdu there is only one character ش to express
  this sound.

   षष्ठी, आकर्षण, पुरुष
  ششٹھی ، آکرشن ، پرش

   शरीर, मुश्किल, किशमिश
شریر   مشکل      کشمش
    Diacritic marks in Urdu

In Urdu, there are no Matras as in Hindi.
Urdu has some diacritic marks but uses
them only in elementary books.
Zabar ( َ ): This is placed above the character
to indicate a consonant with अ.
  Ex- بَ ब

Zer ( ِ ): This is placed below the character
to indicate the vowel इ or ए.
  Ex- بِ बि, बे
 Pesh ( ُ ): This is placed above the character
and creates the sound of उ along with the sound
of the character on which it is applied.
  Ex- بُ बु

Jazama ( ْ ): Equivalent of Halant in Hindi.
  Ex- بْ ब् ; شبد शब्द
 Tashdeed ( ّ ): This is used for
reduplication, as in
   ہلّا گلّا ، دھبّا   हल्लागुल्ला, धब्बा
Do zabar ( ً ): This is placed above the
last character Alif (ا) and gives the sound
of n. It may be noted that the character
just before Alif should be with Zabar.
Example: فوراً फ़ौरन (the character ر is with
Zabar, but normally it is not written)
 Do zer ( ٍ ): This is placed below the last
character and gives the sound of n, but normally
it is not written.
Example: نسلاً بعد نسلٍ   नस्लन बाद नस्लिन
(The character ل is with Zer)

Ulta pesh ( ٗ ): This is placed above the
character and gives the sound of oo (ऊ).
Example: مالہٗ मालहू, بعدہٗ बादहू
Khada zabar ( ٰ ): This adds the sound of
 Alif to the character on which it is applied.
 Mostly it is put on Choti ye and Badi ye
 (ی ، ے), and its effect, i.e. the Aa sound, is
 transferred to the character preceding Choti
 ye or Badi ye. It also comes in the middle
 of a word, where the Aa sound is added to the
 character on which it is applied.
 Example:       اعلیٰ ، مصطفےٰ ، ہٰذا
            आला, मुस्तफ़ा, हाज़ा

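For reference, the diacritic marks discussed in this section each have their own Unicode code point. The sketch below assumes Python's standard `unicodedata` module; the table `DIACRITICS`, keyed by the Urdu names used in this section, and the helper `with_mark` are illustrative.

```python
import unicodedata

# Urdu name of the mark -> Unicode combining character.
DIACRITICS = {
    "zabar": "\u064E",
    "zer": "\u0650",
    "pesh": "\u064F",
    "jazm": "\u0652",
    "tashdeed": "\u0651",
    "do zabar": "\u064B",
    "do zer": "\u064D",
    "ulta pesh": "\u0657",
    "khada zabar": "\u0670",
}

def with_mark(base: str, mark_name: str) -> str:
    """Attach the named diacritic to a base letter (mark follows the letter)."""
    return base + DIACRITICS[mark_name]

for name, mark in DIACRITICS.items():
    print(f"{name:12} U+{ord(mark):04X} {unicodedata.name(mark)}")
```

In stored text the mark always follows the letter it modifies, regardless of whether it is drawn above or below.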
Other Diacritic marks are pesh and khada
For most words derived from Hindi/Indian
languages, the gender does not change between
Urdu and Hindi. But for some words borrowed
from Arabic or Persian, the gender changes.
vyavastha (feminine) انتظام (masculine)
aakarshan (masculine) کشش (feminine)
prakash (masculine) روشنی (feminine)
        Compound Words
 In Urdu two words can be joined together
 (as in English, where an apostrophe is used
to join words and gives the sense of "of").
In Urdu, the words are joined by "Izaafat".
There are three types of Izaafat.
1. Zer is added after the first word and then the
  other word is written.
  Example: دردِ دل

  Most importantly, there has to be a space after
  the zer; otherwise the words may join together
  and become hard to read.
  Example: شاخِ گل

  If the space is not given then the word will appear
  like this: شاخگل
2. If the last character of the first word is
  "he" or "choti ye", then Hamza is added
  after the first word.
  Example: نغمۂ آب ، گرمیٔ عشق
3. If the last character of the first word is
alif or wav, then "Hamza Badi ye" is added
after the first word.
Example: اداۓ خاص ، بوۓ گل
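The three Izaafat rules can be sketched as a function that inspects the last character of the first word. This is an illustration only: the function name `join_with_izaafat` is hypothetical, and the choice of U+0654 (hamza above) and U+06D3 (badi ye with hamza) as the encoded forms is an assumption about how the marks would be typed.

```python
ZER = "\u0650"
HAMZA_ABOVE = "\u0654"
BADI_YE_WITH_HAMZA = "\u06D3"

GOLA_HE, CHOTI_YE = "\u06C1", "\u06CC"
ALIF, WAV = "\u0627", "\u0648"

def join_with_izaafat(first: str, second: str) -> str:
    """Join two words with the Izaafat form selected by the last letter of `first`."""
    last = first[-1]
    if last in (GOLA_HE, CHOTI_YE):
        marked = first + HAMZA_ABOVE          # rule 2
    elif last in (ALIF, WAV):
        marked = first + BADI_YE_WITH_HAMZA   # rule 3
    else:
        marked = first + ZER                  # rule 1
    # The space keeps the two words from joining visually.
    return marked + " " + second
```

The explicit space in the return value reflects the point made under rule 1 that the words must stay separated.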
Compound words without Izaafat
Normally in Hindi some words are written together,
for example isaka, usaka, Taajmahal etc., but
in Urdu they are written separately.
 اس کا ، اس کا ، تاج محل
It may be noted that if Tajmahal is written
together then it will not be readable: تاجمحل
      Typographical errors
1. In Urdu, when choti ye ی or badi ye ے comes as
the last character of a word, it retains its
original shape. But in the middle of a word the
two are difficult to tell apart, because they
look the same in handwritten text and in word
processing packages like Inpage. Unicode badi ye
does not join with the following character.
Example: کےلا کیلا
In all Urdu packages this will be written as
کیلا, which creates ambiguity, and
transliteration through machine becomes difficult.
2. There are certain characters in Urdu
which do not join with the following
character. These characters are
ا، آ، د، ذ، ڈ، ر، ز، ژ، ڑ، و. Data entry operators
often do not take care to put a space between
two words, because after these characters it is
difficult to notice that the words have been
joined. As a result, the machine takes the two
words as one and fails to process the word.
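One way to approach the missing-space problem is to list the positions where a space could be missing without any visible change, that is, directly after a non-joining character. The sketch below is an assumption about how such a check might look; `candidate_splits` is a hypothetical helper, and a real system would still need a dictionary to pick the right position.

```python
# The ten non-joining characters listed above, by code point.
NON_JOINERS = set("\u0627\u0622\u062F\u0630\u0688\u0631\u0632\u0698\u0691\u0648")

def candidate_splits(token: str) -> list[int]:
    """Return indices i such that token[:i] ends in a non-joining character.

    A space omitted at such a position does not change how the text looks,
    so these are the places where two words may have been run together.
    """
    return [i for i in range(1, len(token)) if token[i - 1] in NON_JOINERS]
```

For a token with no non-joining characters before its last letter, the list is empty, which matches the observation that a missing space there would be visible to the typist.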
3. The diacritic marks are ignored in Urdu
and hence apparently there is no
difference between
इस ‫اس‬     and उस ‫اس‬
हवा   ‫ہوا‬   (noun)

हुआ   ‫ہوا‬   (verb)

 In Urdu both words are written in the same
 way but the meanings are different. The
 spellings in Hindi are also different.
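The loss of information can be demonstrated by removing the diacritic marks from fully marked spellings. This is a minimal sketch, assuming the marks are encoded as Unicode nonspacing marks (category `Mn`); the marked spellings of the two pronouns are supplied here for illustration and the names `strip_marks`, `IS` and `US` are hypothetical.

```python
import unicodedata

def strip_marks(text: str) -> str:
    """Drop all nonspacing marks (zabar, zer, pesh, ...) from the text."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")

IS = "\u0627\u0650\u0633"  # alif + zer + sin, read "is"
US = "\u0627\u064F\u0633"  # alif + pesh + sin, read "us"

print(strip_marks(IS) == strip_marks(US))
```

Once the marks are gone the two words are identical strings, so a transliterator has to rely on context to choose between इस and उस.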
