Automatic Parsing For Arabic Sentences
(IJCSIS) International Journal of Computer Science and Information Security,
Vol. 9, No. 3, March 2011
Automatic Parsing for Arabic Sentences

Zainab Ali Khalaf*
School of Computer Science
Universiti Sains Malaysia (USM)
Penang, Malaysia
E-mail: zak10_com026@student.usm.my
*(Ass. Prof. in Computer Science Dept., Basra University, Iraq)

Dr. Tan Tien Ping
School of Computer Science
Universiti Sains Malaysia (USM)
Penang, Malaysia
E-mail: tienping@cs.usm.my
Abstract: The designed system is a parser for Arabic sentences that uses the syntactic and semantic relations between deep and surface structures. The system is based on an implementation of Fillmore's Case theory.

The parsing algorithm starts by analyzing the input sentence to check its syntax, semantics and spelling, using the Arabic transformation rules proposed by Al-Khouly to gain semantic strength. The proposed system depends on the effective element of the sentence, which is represented by its verb. This element is used to control the parsing operation.

The proposed system accepts different surface structures of Arabic sentences as input and automatically produces the parsing of each input sentence.

Keywords: Artificial intelligence; natural language processing; transformation rules; deep structure and surface structure; parsing Arabic sentences.

I. INTRODUCTION

Arabic is a parsing language, where parsing means the relation among the words in the sentence. The most important component is the verb, which acts as the basic unit that controls the rules for choosing the other elements. Although Arabic sentences have different structures, Arabic is recognized as a (verb, subject, object) language. The subject or the object may precede the verb in an Arabic sentence according to pragmatic necessity [1,3,4].

Arabic syntax allows the deep structure and the surface structure of a sentence to be connected strongly and flexibly. This property makes the Arabic language amenable to automatic processing [4,5].

The proposed system aims to use these properties to parse Arabic sentences depending on the position of the words in the sentence and on their functional meaning.

II. SYSTEM COMPONENTS

The syntactical properties of any natural language are formally described by what Chomsky calls production systems. A formal system generally depends on three types of data [2,3,6]:

A. Data of vocabulary lexicon

The lexicon plays an important role in any NLP system. It is a large database of entries that describe the meaning of words in a contextual fashion through synonymy (and antonymy) [3,6]. The implemented lexicon consists of entries saved as a rule ( Entrance [ Word , Features ] ).

• The Entrance is one of the following indicators: Verb, Noun, Preposition, Determinate, Assistant and Negation.

• The Word is a string index for the lexicon entry.

• The Features is a list of structured integers coded to hold the syntactical and semantic information of the word. Each coded integer, written as [Fp], consists of two parts, F and p. The [F] part is the feature code. The [p] part is either 1 or 0, depending on whether the feature [F] exists or not.
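The lexicon entry format ( Entrance [ Word , Features ] ) with coded integers [Fp] can be illustrated with a short Python sketch. This is only one possible reading of the description, not the authors' implementation: the feature codes (ANIMATE, HUMAN, TRANSITIVE), the decimal packing F * 10 + p, and the names `Entry`, `encode` and `decode` are all assumptions made for illustration, since the paper does not list its actual codes.

```python
from dataclasses import dataclass

# Hypothetical feature codes F; the paper does not list its actual codes.
ANIMATE = 1
HUMAN = 2
TRANSITIVE = 3


def encode(feature, present):
    """Pack a feature code F and a presence bit p into one integer [Fp].

    Assumption: p is stored as the last decimal digit.
    """
    return feature * 10 + (1 if present else 0)


def decode(coded):
    """Split a coded integer [Fp] back into the pair (F, p)."""
    return divmod(coded, 10)


@dataclass
class Entry:
    entrance: str   # Verb, Noun, Preposition, Determinate, Assistant or Negation
    word: str       # string index of the lexicon entry
    features: list  # coded integers [Fp]

    def has(self, feature):
        """True if the feature is listed with presence bit p = 1."""
        return encode(feature, True) in self.features


# Two toy entries: a noun meaning "boy" and a verb meaning "wrote".
boy = Entry("Noun", "ولد", [encode(ANIMATE, True), encode(HUMAN, True)])
wrote = Entry("Verb", "كتب", [encode(TRANSITIVE, True), encode(ANIMATE, False)])
```

Storing the presence bit explicitly lets an entry state that a feature is absent (p = 0), which is different from the feature simply not being listed.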
58 http://sites.google.com/site/ijcsis/
ISSN 1947-5500
B. Data of syntactical rules

These rules are formalized to describe the language by relating each deep structure to the many corresponding surface structures that have the same meaning. The rules are inductive and sequential. Some are obligatory and others are optional. From the optional rules, one can obtain various surface structures that act as contextual variants. The transformations are mainly addition, deletion, moving forward and moving backward, together with some secondary operations. These operations are, in general, not performed at random, but are governed and selected according to a set of conditions and rules of structure description. Together, these operations generate all surface structures that emerge from one deep structure.

C. Data of syntactic structure

These data are rules described in BNF for the Arabic language. They act as constraints and controls on forming the sentences of the Arabic language. The most important component, as Fillmore and Schank recognized, is the verb element, which acts as the basic unit that controls the rules for choosing the other elements. The dependent phrase structure rules used are the following:

<Sentence> ::= <Modality> + <Auxiliary> + <Proposition>
<Sentence> ::= <Auxiliary> + <Proposition>
<Modality> ::= <External Condition> + <External Adverb>
<Proposition> ::= <Verb> + <Theme> + <Indirect Object> + <Place> + <Tool> + <Agent>
<Theme> ::= <Noun Phrase>
<Agent> ::= <Noun Phrase>
<Tool> ::= <Noun Phrase>
<Place> ::= <Noun Phrase>
<Indirect Object> ::= <Noun Phrase>
<Noun Phrase> ::= <Proposition> + <Determinate> + <Noun>
<Noun Phrase> ::= <Proposition> + <Noun>
<External Condition> ::= semi statements used to combine two sentences, such as ( in spite of ) or ( moreover ), etc.
<External Adverb> ::= <Time Adverb> + <Interrogative Words> + <Negation Words>
<Auxiliary> ::= lexical words such as ( ) or ( ), etc.
<Verb> ::= a dictionary verb such as ( write ), etc.
<Noun> ::= a dictionary noun such as ( boy ), etc.

The presence of the verb is necessary and obligatory, whereas the presence of the other elements is optional and depends on the verb rules [1,4].

III. DESIGNED SYSTEM STRUCTURE

The designed system has several stages. Figure (1) shows a flowchart of these stages, which are described below:

A. Input sentence stage

The function of this stage is to read an Arabic sentence from the keyboard into the computer. The sentence is terminated by a dot, a semicolon or a space character.

B. Segmentation stage

The function of this stage is to segment the input sentence into words, using the space character as separator (any number of space characters is allowed).

C. Lexicon search stage

The function of this stage is to search for every word of the sentence in the lexicon. If a word is not found in the lexicon, the program gives a spelling error message and stops.

D. Syntactical analysis stage

The function of this stage is to check the syntactical correctness of the input sentence. If the processing finds errors, the program gives a syntactical error message.

E. Semantic analysis stage

The function of this stage is to check the correctness of the input sentence with respect to its harmony, its vocabulary and its meaning. If the meaning of the sentence is not correct, the program gives a semantic error message.
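The segmentation and lexicon search stages described in Section III can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the authors' program: the toy lexicon, the function names `segment` and `lexicon_search`, and the `SpellingError` exception are all hypothetical.

```python
# Toy lexicon (hypothetical entries): word -> lexical category.
LEXICON = {
    "كتب": "Verb",     # "wrote"
    "الولد": "Noun",   # "the boy"
    "الدرس": "Noun",   # "the lesson"
}


class SpellingError(Exception):
    """Raised when a word of the sentence is not found in the lexicon."""


def segment(sentence):
    """Split the input sentence into words on runs of space characters.

    A terminating dot or semicolon is dropped before splitting.
    """
    return sentence.strip().rstrip(".;").split()


def lexicon_search(words):
    """Look up every word; stop with a spelling error on the first miss."""
    tagged = []
    for word in words:
        if word not in LEXICON:
            raise SpellingError(f"spelling error: {word!r} is not in the lexicon")
        tagged.append((word, LEXICON[word]))
    return tagged


# Extra spaces between words and the final dot are both tolerated.
words = segment("كتب  الولد   الدرس .")
tagged = lexicon_search(words)
```

Raising an exception mirrors the described behaviour of reporting a spelling error and stopping; the later syntactic and semantic stages would consume the tagged word list.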
F. Generative deep structure stage

In this stage the transformational operations are carried out. Addition, deletion, replacement and other operations are applied to the sentence structure to obtain the structure that acts as the deep structure.

G. Parsing stage

The function of this stage is to parse the sentence, depending on its effective element and on the position of that element in the phrase structure. This stage contains many Arabic language rules that control the parsing operations.

The following are examples of sentences that the system can parse (sentences 1 to 17; the Arabic text is missing from the source).

IV. EXAMPLES

Suppose we want to know the parsing of the following sentences. Figure (2) depicts this mechanism.

A. Example 1
The system prints the following parsing: (Arabic text missing from the source)

B. Example 2
The system prints the following parsing: (Arabic text missing from the source)

C. Example 3
The system prints the following parsing: (Arabic text missing from the source)

D. Example 4
The system prints the following parsing: (Arabic text missing from the source)

E. Example 5
The system prints the following parsing: (Arabic text missing from the source)
Conclusions

The present research ends with the following conclusions:

1. The verb is the main component, and it controls all other components that appear with it. For this reason, we consider every deep structure to contain a verb.

2. The word meaning depends on the essential effective element (the deep element).

3. The lexicon is the essential element that provides any system with vocabulary and its features. Through these features, we can control the different processing levels of syntax and semantics.

4. The absence of vowelization might bring some ambiguities to sentence understanding. However, the transformation rules are used to remedy these ambiguities in an explicit and easy way, as in the following sentences, in all of which the man is the subject and the lion is the object. (The Arabic sentences are missing from the source.)

Acknowledgment

I would like to express my sincere appreciation to the TWAS organization and Universiti Sains Malaysia (USM) for their encouragement and continuous financial support through the provision of a PhD fellowship. In addition, we would like to thank the School of Computer Science for its encouragement and motivation of international students in the faculty.

References

[1] Abo-Arafah, A., "A grammar for the Arabic language suitable for machine parsing and automatic text generation", Ph.D. thesis, Illinois Institute of Technology, Chicago, USA, 1995.

[2] Ali, N., "Arabic language and Computer", Al-Tareeb Publishing House, Cairo, Egypt, 1988.

[3] Al-Khouly, M., "Transformation rules for Arabic language", Al-Riyadh, 1981.

[4] Al-Shalabi, R., Evens, M., "A Computational Morphology System For Arabic", Dept. of Computer Science and Applied Mathematics, Illinois Institute of Technology, Chicago, USA, W.D.

[5] Gheith, M., Mashour, M., "A Computer Based System For understanding Arabic language", Computer Science Department, Inst. of Statistical Study & Research, Cairo University, Egypt, W.D.

[6] Khalaf, Z., "Computerized Implementation For Processing Arabic Sentences By Interpretation Synonymy Relationships", M.Sc. thesis, Basra University, Iraq, 2001.
Figure (1): Flowchart of the parsing operations.
[Flowchart box labels: User Interface; Input Stage; Segmentation Stage; Lexicon Stage (Lexicon, Lexical Rules; Spelling Error); Initial Descriptive Structure; Syntactical Stage (Transformational Rules; Syntactical Errors); Semantic Stage (Transformational Rules; Semantic Error); Deep Structure; Parsing Stage; User Interface.]
Figure (2): The mechanism used to parse an Arabic sentence.
Surface structure: (Arabic sentence missing from the source)
Transformation rules: An agent ( ) used a tool ( ) to perform the verb ( ) to get the object ( )
Deep structure: Verb ( ), Subject ( ), Object ( ), Tool ( )
Sentence structure, then Parsing stage: (Arabic parsing output missing from the source)
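
The deep structure in Figure (2), with its Verb, Subject, Object and Tool slots, can be pictured as a small case frame. The Python sketch below is an illustration under assumptions, not the system's actual data structure: the class name `DeepStructure`, its field names, the "word: role" output format and the English glosses used as slot values are hypothetical. Only the sentence template is taken from the figure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DeepStructure:
    verb: str                      # obligatory effective element
    subject: Optional[str] = None  # the agent
    obj: Optional[str] = None      # the object
    tool: Optional[str] = None     # the instrument

    def describe(self):
        """Fill the transformation-rule template shown in Figure (2)."""
        return (f"An agent ({self.subject}) used a tool ({self.tool}) "
                f"to perform the verb ({self.verb}) to get the object ({self.obj})")

    def parse_lines(self):
        """List each filled slot as 'word: role', skipping empty slots."""
        roles = [("Verb", self.verb), ("Subject", self.subject),
                 ("Object", self.obj), ("Tool", self.tool)]
        return [f"{word}: {role}" for role, word in roles if word is not None]


# English glosses stand in for the Arabic words, which the source does not preserve.
frame = DeepStructure(verb="cut", subject="the man", obj="the rope", tool="the knife")
```

Because only the verb is obligatory, the optional slots default to empty, so different surface orders of the same words can map to the same frame.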