Abstract:
This study integrates clinical data, Copy Number Alteration (CNA) data, and gene expression data to present a methodology for classifying PAM-50 breast cancer subtypes. Because breast cancer is a heterogeneous disease, identifying its subtypes precisely is essential to developing treatment plans tailored to individual patients. The variety of molecular traits that contribute to this heterogeneity makes subtype classification difficult, and this work addresses the problem by using multi-omics data to improve it. We retain informative features by applying Boruta feature selection to each single-omics dataset. Graph Convolutional Networks (GCN) then integrate these data modalities, which helps capture complex relationships and dependencies within the multi-omics dataset. Beyond its methodology, this work contributes to precision medicine and cancer research more broadly. By increasing the precision of PAM-50 subtype classification, the proposed method may help physicians make better-informed choices about treatment plans. This work may also advance the integration of multi-omics data for a more thorough understanding of breast cancer, and it emphasizes the importance of considering clinical, genomic, and expression data simultaneously when characterizing subtypes.
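To illustrate the graph convolution step that the abstract refers to, the following is a minimal sketch of a single GCN layer in plain Python. It is not the thesis implementation: the patient graph, the feature values, the weight matrix, and the function names `normalize_adjacency`, `matmul`, and `gcn_layer` are all hypothetical, and the abstract does not state how the graph is constructed or how many layers are used. The sketch assumes the commonly used propagation rule ReLU(D^{-1/2} (A + I) D^{-1/2} X W).

```python
import math

def normalize_adjacency(adj):
    """Add self-loops, then apply symmetric normalization D^{-1/2} (A + I) D^{-1/2}."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def gcn_layer(a_norm, features, weights):
    """One graph convolution: aggregate neighbor features, project, apply ReLU."""
    z = matmul(matmul(a_norm, features), weights)
    return [[max(v, 0.0) for v in row] for row in z]

# Hypothetical toy graph: 4 patients connected in a chain (assumption, not thesis data).
adj = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]
# Hypothetical features: 3 selected features per patient (e.g. after feature selection).
features = [
    [0.5, -1.2, 0.3],
    [1.1, 0.4, -0.7],
    [-0.3, 0.8, 0.9],
    [0.2, -0.5, 1.4],
]
# Hypothetical weight matrix mapping 3 input features to 2 hidden units.
weights = [
    [0.6, -0.4],
    [-0.2, 0.9],
    [0.3, 0.5],
]

a_norm = normalize_adjacency(adj)
hidden = gcn_layer(a_norm, features, weights)
```

Each row of `hidden` is a patient representation that mixes the patient's own features with those of its graph neighbors. In a full classifier, further layers and a softmax output over the subtype labels would typically follow.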
Description:
Supervised by
Mr. Tareque Mohmud Chowdhury,
Assistant Professor,
Department of Computer Science and Engineering (CSE)
Islamic University of Technology (IUT)
Board Bazar, Gazipur, Bangladesh
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2024.