Abstract:
Differential gene expression (DE) analysis is a powerful tool for determining whether genes in two
or more sample groups are expressed at significantly different levels. To analyze gene counts and
identify differentially expressed genes, we use the DESeq2 software. When determining whether
genes are differentially expressed, we must account for variation in the data. The aim is to
determine, for each gene, whether the differences between groups are significant given the
variation between biological replicates. Using normalized read count data (NRCD) and
statistical analysis, DE analysis finds quantitative differences in expression levels
between experimental groups. For example, statistical testing is used to decide whether, for
a given gene, an observed difference in read counts is significant, i.e., whether it is greater
than what would be expected from natural random variation alone. The analysis requires
gene expression values to be compared between sample groups. The goal is to determine
which genes are expressed at different levels between conditions. RNA sequencing (RNA-seq) has
become a widely used technology that allows effective genome-wide quantification of relative gene
expression, and it is the method of choice for identifying differentially expressed genes between
two or more biological conditions of interest. The primary challenges surrounding such DE analysis have
been highlighted from the start, and several methodologies and tools have been proposed in the
relevant literature. One of the most difficult aspects of this study, as with any other statistical
research, has been determining the probabilistic model that best fits the data, as well as the
model's optimal parameter estimates. Another significant challenge is the need for
data normalization, so that two biological conditions can be compared appropriately by identifying
and removing potential technical and/or biological biases. Last but not least, several
studies have emphasized the practical need to determine the ideal number of biological
replicates per condition and the optimal library size. In this thesis, we describe the use of
the DESeq2 method as our methodology and tool for DE analysis.
The resulting lists of differentially expressed genes can offer biological insights into the
processes affected by the conditions.
Description:
Supervised by
Mr. Tareque Mohmud Chowdhury,
Asst. Professor,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT)
Board Bazar, Gazipur-1704, Bangladesh.
This thesis is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2022.