Abstract:
Text summarization has gained considerable popularity in natural language processing
because of the large amount of text available on the internet, especially for the English
language.
Nowadays, the most widely used technique is abstractive text summarization, in which the
generated summaries closely resemble human-written summaries. In this research, the
abstractive text summarization technique is used to create summaries for the Urdu
language. Specifically, an attention-based sequence-to-sequence encoder-decoder model is
used to generate the summaries. To train the model for the Urdu language,
two datasets, the BBC Urdu Dataset and Urdu News 1M, are used. To evaluate the model,
ROUGE metrics are used: the model-generated summary is compared with the human-written
summary to measure the performance of the model. After training the model on both
datasets, there is quite a difference
between the results on the two datasets, which is due to the difference in dataset size.
The model achieved a ROUGE-1 score of 42.85 on the BBC Urdu Dataset and 66.67 on the
Urdu News 1M dataset. Our model shows promising results on both datasets, and it performs
better as the size of the dataset increases. We also discuss the problems faced during
this research and the results of the model.
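
As a rough illustration of the ROUGE-1 comparison between a generated summary and a human-written summary, the following minimal sketch counts clipped unigram overlap. It is not the evaluation code used in this research: the function name `rouge_1`, the whitespace tokenization, and the toy strings are assumptions made for illustration, and the abstract does not state which variant (precision, recall, or F1) the reported scores correspond to.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Compute ROUGE-1 precision, recall and F1 from unigram overlap.

    Assumption: tokens are obtained by splitting on whitespace. A real
    Urdu pipeline would likely need a language-specific tokenizer.
    """
    cand_tokens = candidate.split()
    ref_tokens = reference.split()

    # Clipped overlap: each unigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())

    precision = overlap / len(cand_tokens) if cand_tokens else 0.0
    recall = overlap / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall > 0:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Toy strings (hypothetical, not taken from either dataset).
scores = rouge_1("the cat sat", "the cat sat down")
print(scores)
```

In this toy case three of the four reference unigrams appear in the generated text, so recall is 0.75 while precision is 1.0. Scores are often reported multiplied by 100.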