Arabic Corpus by Bl0k9i


Having an Arabic corpus: problems and challenges

Arabic corpora: design, construction and annotation

1.   Availability
2.   Forms and providers
3.   Ability to be target tailored
4.   Most famous providers
     (Linguistic Data Consortium, Arabic Treebank,
     Latifa Al-Sulaiti, European Language
     Resources Association)

The use of corpora in Arabic language research

Areas of research are:
1-Lexis
2-Lexicography
3-Syntax
4-Collocation
5-NLP systems
6-Analysis tools
7-Stylistics, and
8-Discourse analysis

Nafs Corpus (under construction)
• 1-Selection of texts
• 2-Putting them in the right format for processing
• 3-Cleaning of the texts
• 4-Transliteration and its problems
• 5-MADA
• 6-Noun lists – dictionary
• 7-Proposed algorithm based on Mitkov’s
  knowledge-poor approach
• 8-Problems due to the nature of the language itself
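
The outline does not say what "cleaning of the texts" involves for the Nafs Corpus. As a rough illustration only, the sketch below assumes a common preprocessing choice for Arabic text: removing short-vowel diacritics and the tatweel (elongation) character, then collapsing whitespace. The function name clean_arabic and the exact set of characters removed are assumptions made for this example, not the procedure used for the corpus.

```python
import re

# Assumption: "cleaning" here means dropping diacritics and tatweel.
# U+064B..U+0652 covers fathatan through sukun; U+0670 is superscript alef.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
TATWEEL = "\u0640"  # elongation character, carries no lexical content


def clean_arabic(text: str) -> str:
    """Return text without diacritics or tatweel, with whitespace collapsed."""
    text = DIACRITICS.sub("", text)      # remove short-vowel marks
    text = text.replace(TATWEEL, "")     # remove elongation characters
    text = re.sub(r"\s+", " ", text)     # collapse runs of whitespace
    return text.strip()


if __name__ == "__main__":
    # A fully vowelled word (kaf, kasra, ta, fatha, alef, ba, dammatan).
    vowelled = "\u0643\u0650\u062A\u064E\u0627\u0628\u064C"
    print(clean_arabic(vowelled))
```

Whether to strip diacritics at all depends on the task: they disambiguate word forms, so a corpus meant for morphological analysis may need to keep them and clean only layout characters.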

								