Bioinformatics Analysis of Differentially Expressed Gene's in Breast Cancer Using DESeq2

Malick, Sow Bocar Amadou; Conteh, Fatoumatta; Sawo, Muhammed

dc.contributor.author	Malick, Sow Bocar Amadou
dc.contributor.author	Conteh, Fatoumatta
dc.contributor.author	Sawo, Muhammed
dc.date.accessioned	2023-04-28T06:50:17Z
dc.date.available	2023-04-28T06:50:17Z
dc.date.issued	2022-05-30
dc.identifier.citation	[1] Sonali Arora. “Raw TCGA data using Bioconductor’s ExperimentHub”. In: Raw TCGA data using Bioconductor’s ExperimentHub (2021). url: https://www.bioconductor.org/packages/release/data/experiment/vignettes/GSE62944/inst/doc/GSE62944.html. [2] Sandrine Dudoit et al. “Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments”. In: Statistica sinica (2002), pp. 111–139. [3] Vanessa M Kvam, Peng Liu, and Yaqing Si. “A comparison of statistical methods for detecting differentially expressed genes from RNA-seq data”. In: American journal of botany 99.2 (2012), pp. 248–256. [4] Cosmin Lazar et al. “A Survey on Filter Techniques for Feature Selection in Gene Expression Microarray Analysis”. In: IEEE/ACM Transactions on Computational Biology and Bioinformatics 9.4 (2012), pp. 1106–1119. doi: 10.1109/TCBB.2012.33. [5] Wentian Li. “Volcano plots in analyzing differential expressions with mRNA microarrays”. In: Journal of bioinformatics and computational biology 10.06 (2012), p. 1231003. [6] Wentian Li et al. “Using volcano plots and regularized-chi statistics in genetic association studies”. In: Computational biology and chemistry 48 (2014), pp. 77–83. [7] Shenghui Liu et al. “Feature selection of gene expression data for cancer classification using double RBF-kernels”. In: BMC bioinformatics 19.1 (2018), pp. 1–14. [8] Michael I Love, Simon Anders, and Wolfgang Huber. “Analyzing RNA-seq data with DESeq2”. In: R package reference manual (2017). [9] Yinglian Pan et al. “A novel signature of two long non-coding RNAs in BRCA mutant ovarian cancer to predict prognosis and efficiency of chemotherapy”. In: Journal of Ovarian Research 13.1 (2020), pp. 1–10. [10] Andrea Rau, Guillemette Marot, and Florence Jaffrézic. “Differential meta-analysis of RNA-seq data from multiple studies”. In: BMC bioinformatics 15.1 (2014), pp. 1–10. [11] Robert M Samstein et al. “Mutations in BRCA1 and BRCA2 differentially affect the tumor microenvironment and response to checkpoint blockade immunotherapy”. In: Nature cancer 1.12 (2020), pp. 1188–1203. [12] Terry Speed. Statistical analysis of gene expression microarray data. Chapman and Hall/CRC, 2003. [13] Zong Hong Zhang et al. “A comparative study of techniques for differential expression analysis on RNA-Seq data”. In: PloS one 9.8 (2014), e103207.	en_US
dc.identifier.uri	http://hdl.handle.net/123456789/1866
dc.description	Supervised by Mr. Tareque Mohmud Chowdhury, Asst. Professor, Department of Computer Science and Engineering(CSE), Islamic University of Technology (IUT) Board Bazar, Gazipur-1704, Bangladesh. This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022.	en_US
dc.description.abstract	Differential gene expression analysis is a powerful tool for determining whether genes in two or more sample groups are expressed at significantly different levels. To estimate gene counts and identify differentially expressed genes (DEGs), we use the DESeq2 software. When determining whether genes are differentially expressed, we must account for variation in the data: the purpose is to see whether, for each gene, the differences between groups are substantial relative to the variation between biological replicates. Using Normalized Read Count Data (NRCD) and statistical analysis, DEG analysis finds quantitative differences in expression levels between experimental groups. For example, statistical testing is used to decide whether, for a given gene, an observed difference in read counts is significant, i.e., whether it is greater than what would be expected from natural random variation alone. The analysis requires gene expression values to be compared between sample group types, with the goal of determining which genes are expressed at different levels between conditions. Such analysis has become widely used because it allows effective genome-wide quantification of relative gene expression, and it is the method of choice for identifying differentially expressed genes between two or more biological conditions of interest. The primary challenges surrounding such differential expression (DE) analysis have been highlighted from the start, and several methodologies and tools have been proposed in the relevant literature. One of the most difficult aspects of this study, as with any other statistical research, has been determining the probabilistic model that best fits the data, as well as the model’s optimal parameter estimates. Another significant challenge was the need for data normalization in order to compare two biological conditions appropriately by identifying and removing any potential technical and/or biological biases. Last but not least, several studies have emphasized the practical need to determine the ideal number of biological replicates per condition and the optimal library size. In this work, we go over the use of the DESeq2 method and its tools for DE analysis. The resulting genes can offer biological insights into the processes affected by the conditions.	en_US
dc.language.iso	en	en_US
dc.publisher	Department of Computer Science and Engineering(CSE), Islamic University of Technology(IUT), Board Bazar, Gazipur, Bangladesh	en_US
dc.subject	Bioinformatics, Differential Expressed Genes, DESeq2, Breast Cancer	en_US
dc.title	Bioinformatics Analysis of Differentially Expressed Gene's in Breast Cancer Using DESeq2	en_US
dc.type	Thesis	en_US
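
The abstract's normalization step, which removes technical biases such as differing library sizes before conditions are compared, can be illustrated with a short sketch. It assumes the median-of-ratios size-factor scheme associated with DESeq2; the record itself does not say which normalization the thesis used. This is plain Python, not the DESeq2 R package and not the authors' pipeline, and the function names (`size_factors`, `normalize`, `log2_fold_change`) and the toy counts are hypothetical. The negative binomial significance testing that DESeq2 performs is not reproduced here.

```python
import math
from statistics import median


def size_factors(counts):
    """Estimate one size factor per sample by the median-of-ratios method.

    counts: list of genes, each a list of raw read counts (one per sample).
    Genes with a zero count in any sample are skipped (assumption: this
    mirrors the usual handling, since their geometric mean is zero).
    """
    n_samples = len(counts[0])
    log_ratios = [[] for _ in range(n_samples)]
    for gene in counts:
        if any(c <= 0 for c in gene):
            continue
        # Log of the gene's geometric mean across all samples.
        log_geo_mean = sum(math.log(c) for c in gene) / n_samples
        for j, c in enumerate(gene):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    if not log_ratios[0]:
        raise ValueError("no gene has non-zero counts in every sample")
    # The median ratio per sample is robust to a few strongly changed genes.
    return [math.exp(median(r)) for r in log_ratios]


def normalize(counts):
    """Divide each raw count by its sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(gene, sf)] for gene in counts]


def log2_fold_change(norm_gene, group_a, group_b, pseudocount=1.0):
    """Log2 ratio of mean normalized expression, group_b over group_a.

    The pseudocount (a hypothetical choice) avoids division by zero.
    """
    mean_a = sum(norm_gene[i] for i in group_a) / len(group_a)
    mean_b = sum(norm_gene[i] for i in group_b) / len(group_b)
    return math.log2((mean_b + pseudocount) / (mean_a + pseudocount))


# Toy data: sample 1 was sequenced twice as deeply as sample 0,
# but no gene is truly differentially expressed.
toy_counts = [[10, 20], [100, 200], [5, 10]]
toy_normalized = normalize(toy_counts)
toy_lfc = [log2_fold_change(g, [0], [1]) for g in toy_normalized]
```

In the toy data the second sample has exactly twice the counts of the first, so the size factors absorb the depth difference and the fold changes of the normalized counts are close to zero. A real analysis would additionally model count dispersion and test each gene for significance.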