Developing an Urdu Lemmatizer

Iqbal, Muntaha

dc.contributor.author	Iqbal, Muntaha
dc.date.accessioned	2021-06-11T10:04:13Z
dc.date.available	2021-06-11T10:04:13Z
dc.date.issued	2021-06-11
dc.identifier.uri	http://repository.cuilahore.edu.pk/xmlui/handle/123456789/2669
dc.description.abstract	Lemmatization is the process of obtaining the root form (lemma) of a given word. A lemmatizer is an important part of a Natural Language Processing (NLP) toolkit and is essential for many NLP systems, e.g. Information Retrieval (IR), plagiarism and text reuse detection, Information Extraction (IE), Machine Translation (MT), and Word Sense Disambiguation (WSD). Urdu is a widely spoken language, but very little work has been done on developing basic NLP tools for it, including a lemmatizer. Because Urdu is a morphologically rich language whose words take many inflectional and derivational forms, developing an efficient lemmatizer is a challenging task, and such a tool will be useful for many Urdu NLP applications.	en_US
dc.publisher	Department of Computer Science, COMSATS University Lahore	en_US
dc.relation.ispartofseries	;6821
dc.title	Developing an Urdu Lemmatizer	en_US
dc.type	Thesis	en_US