CUI Lahore Repository

An Empirical Study of Urdu Noun and Verb Phrase Chunking

dc.contributor.author Khurshid, Maryam
dc.date.accessioned 2021-01-19T09:42:35Z
dc.date.available 2021-01-19T09:42:35Z
dc.date.issued 2021-01-19
dc.identifier.uri http://repository.cuilahore.edu.pk/xmlui/handle/123456789/2042
dc.description.abstract Urdu is a morphologically rich, low-resource language. Its distinguishing features, such as free word order, context-sensitive orthography, flexible grammar rules and complex morphology, make the representation of Urdu a difficult problem. In handwritten Urdu, words are written without spaces between them. In a text file, however, a computer needs a separator wherever a word ends with a non-joiner character. Without these separators, adjacent words join together and the text becomes unintelligible to native speakers. Chunking is a basic technique used for entity detection that segments and labels multi-token sequences, and it supports many natural language processing applications. Chunking is a mature field for languages such as Hindi, English, Chinese and Turkish, but it still requires the attention of researchers for Urdu, which has more than 70 million native speakers. This study addresses noun and verb phrase chunking in Urdu and aims to explore the accuracy of noun and verb phrase chunking on an Urdu corpus. Chunking is an NLP (natural language processing) task that splits a text into syntactically linked word groups that are non-overlapping and non-exhaustive, i.e. a word can belong to only one chunk, but not every word belongs to a chunk. Several experiments are conducted using tag sets with different input and output schemes under the same methodology. First, a corpus is selected and preprocessed. Next, part-of-speech tags and IOB tags are assigned to the corpus. Noun and verb phrases are then detected in the corpus using neural networks and machine learning techniques. Finally, the results are evaluated using precision, recall and F-measure. en_US
dc.language.iso en en_US
dc.subject NLP (natural language processing) en_US
dc.subject Urdu Noun and Verb phrase Chunking en_US
dc.title An Empirical Study of Urdu Noun and Verb Phrase Chunking en_US
dc.type Thesis en_US
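
The IOB tagging and the precision, recall and F-measure evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the tag names B-NP, I-NP, B-VP, I-VP and O, the function names iob_to_chunks and precision_recall_f1, and the exact-match scoring of chunks are all assumptions made here for illustration only.

```python
from typing import List, Tuple

# A chunk is (phrase type, start index, end index exclusive).
Chunk = Tuple[str, int, int]


def iob_to_chunks(tags: List[str]) -> List[Chunk]:
    """Convert a sequence of IOB tags into a list of chunks.

    Assumed tag scheme (hypothetical, not taken from the thesis):
    B-X opens a chunk of type X, I-X continues it, O is outside any chunk.
    """
    chunks: List[Chunk] = []
    start, kind = None, None
    # The trailing "O" is a sentinel that closes a chunk ending the sentence.
    for i, tag in enumerate(tags + ["O"]):
        inside = tag.startswith("I-") and kind == tag[2:]
        # Close the open chunk on O, on a new B- tag, or on a type change.
        if kind is not None and not inside:
            chunks.append((kind, start, i))
            start, kind = None, None
        # Open a chunk on B-; a stray I- is treated leniently as a start.
        if tag.startswith("B-") or (tag.startswith("I-") and kind is None):
            start, kind = i, tag[2:]
    return chunks


def precision_recall_f1(gold: List[Chunk], predicted: List[Chunk]):
    """Score predicted chunks against gold chunks by exact match."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Illustrative tag sequences only; no real corpus data is used here.
gold_tags = ["B-NP", "I-NP", "O", "B-VP", "I-VP"]
pred_tags = ["B-NP", "I-NP", "O", "B-VP", "O"]
gold_chunks = iob_to_chunks(gold_tags)
pred_chunks = iob_to_chunks(pred_tags)
print(gold_chunks)
print(precision_recall_f1(gold_chunks, pred_chunks))
```

Because chunks are non-overlapping and non-exhaustive, each token index falls in at most one chunk and tokens tagged O fall in none. Under the exact-match assumption, a predicted chunk counts as correct only when both its type and its boundaries agree with the gold chunk.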


This item appears in the following Collection(s)

  • Thesis - MS / PhD
    This collection contains the MS/PhD theses of the students of the Department of Computer Science
