Misogyny Detection in Social Media for Under-Resourced Bangla Language

dc.contributor.author Kader, Md. Wasif
dc.contributor.author Jamil, Chowdhury Farhan
dc.contributor.author Abir, Md. Tanvir Hasan
dc.date.accessioned 2024-08-29T05:38:56Z
dc.date.available 2024-08-29T05:38:56Z
dc.date.issued 2023-05-30
dc.identifier.uri http://hdl.handle.net/123456789/2140
dc.description Supervised by Dr. Hasan Mahmud, Associate Professor, Md. Mohsinul Kabir, Assistant Professor, Dr. Md. Kamrul Hasan, Professor, Department of Computer Science and Engineering(CSE), Islamic University of Technology(IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.description.abstract This study presents a new strategy based on Natural Language Processing (NLP) techniques for detecting and mitigating misogyny on social media. In this study a dataset was constructed of 3.8 million instances of hate speech from various social media networks that were collected meticulously. Advances in this research are substantially hampered by the lack of a sizable Bengali dataset for the detection of hate speech and sexism in Bengali language texts, making it difficult to effectively identify and address these problems. To improve the representation of hate speech in the dataset, an embedding model based on informal FastText is presented, which captures the complex semantics of hate speech more accurately than other methods. This improved word embedding model is incorporated into a Bidirectional Long Short-Term Memory (BiLSTM) architecture in order to identify contextual dependencies and sequential patterns within hate speech comments. The model’s layers are trained to encode and comprehend sequential information while taking both preceding and subsequent context into account, enabling it to better comprehend remarks and their context. The proposed methodology is evaluated exhaustively on a meticulously annotated dataset, allowing for a thorough analysis of its performance. Measurements of precision, recall, and F1-score are used to evaluate the accuracy and effectiveness of hate speech detection. The results demonstrate the framework’s superior performance and discrimination capabilities, validating its capacity to accurately identify and categorize instances of hate speech. In addition, this research contributes the largest dataset of hate speech in the field and introduces a word embedding model that transcends existing techniques.
These findings substantially improve the understanding and detection of hate speech on social media platforms, laying the groundwork for more effective mechanisms to combat hate speech and promote safer online communities. en_US
dc.language.iso en en_US
dc.publisher Department of Computer Science and Engineering(CSE), Islamic University of Technology(IUT), Board Bazar, Gazipur-1704, Bangladesh en_US
dc.subject Hate Speech; Misogyny; FastText; Word Embedding; Semantics; Sequential Pattern; Contextual Dependency; Bi-Directional Processing; Bi-LSTM; Dataset en_US
dc.title Misogyny Detection in Social Media for Under-Resourced Bangla Language en_US
dc.type Thesis en_US

