Abstract:
The augmentation of data in low-resource languages has gained significant importance recently, primarily because of the scarcity of datasets or the presence of highly unbalanced
datasets. In the case of the Bengali language, the detection of fake news has emerged
as a relevant problem, particularly in light of the surge in false information related to
the COVID-19 pandemic [1]. However, there has been a lack of adequately balanced
datasets specifically designed for training Machine Learning (ML) and Deep Learning
(DL) models in the detection of fake news in Bengali. Furthermore, previous attempts
at augmenting fake news texts have yielded satisfactory results in lexical analysis but
unsatisfactory results in terms of semantic relevance. To address these challenges, we
propose a framework that applies text augmentation techniques using
the Bangla Text-to-Text Transfer Transformer (T5) model. This framework aims to balance an unbalanced Bengali fake news dataset while ensuring that the
augmented text retains semantic similarity and structural accuracy. By employing this
approach, we seek to strengthen the effectiveness and reliability of fake news detection
models in the Bengali language.
Description:
Supervised by
Dr. Hasan Mahmud,
Associate Professor,
Ms. Nafisa Sadaf,
Lecturer,
Dr. Md. Kamrul Hasan,
Professor,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur-1704, Bangladesh.