Abstract:
Clustering a metagenome means grouping the genes of a metagenomic data set that have similar expression patterns into clusters, in the hope that these clusters correspond to groups of functionally related genes. It gives access to uncultivated microbial populations that may play important roles in natural and engineered ecosystems. Proper clustering of metagenome sequences is an essential step in recovering genomes and understanding microbial functions. We derive a distance matrix from the expression matrix of a metagenomic sequence and use the Expectation Maximization (EM) algorithm to cluster the metagenome. After clustering, we label each cluster: we match the nucleotide sequences of the cluster against the reference bacterial genomes in HMPDAC and name the cluster after the bacterium title given in the database. Finally, for a healthy or patient sample, we report the percentage of each bacterium and infer that a bacterium present at an elevated percentage might be causing the problem.
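The pipeline described above (expression matrix, distance matrix, EM clustering, per-cluster percentages) can be illustrated with a minimal sketch. It is not the authors' implementation: the Euclidean distance, the use of distance-matrix rows as feature vectors, the spherical Gaussian mixture, the farthest-point initialisation, and the function names `distance_matrix`, `em_cluster`, and `cluster_percentages` are all assumptions made for illustration, and the toy matrix is invented. Matching clusters against the HMPDAC reference genomes is not covered.

```python
import math
from collections import Counter


def distance_matrix(expr):
    """Pairwise Euclidean distances between the rows of an expression matrix."""
    return [[math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b))) for b in expr]
            for a in expr]


def _sq_dist(p, q):
    return sum((x - y) ** 2 for x, y in zip(p, q))


def em_cluster(points, k, iters=50):
    """EM for a mixture of k spherical Gaussians (an assumed model).

    Returns one hard cluster label per point.
    """
    n, d = len(points), len(points[0])
    # Assumed initialisation: first point, then repeatedly the farthest point.
    means = [list(points[0])]
    while len(means) < k:
        far = max(points, key=lambda p: min(_sq_dist(p, m) for m in means))
        means.append(list(far))
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: responsibilities, computed in log space for stability.
        resp = []
        for p in points:
            logs = [math.log(weights[j])
                    - 0.5 * d * math.log(2 * math.pi * variances[j])
                    - _sq_dist(p, means[j]) / (2 * variances[j])
                    for j in range(k)]
            top = max(logs)
            exps = [math.exp(v - top) for v in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weight, mean, and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = max(nj / n, 1e-12)
            means[j] = [sum(r[j] * p[c] for r, p in zip(resp, points)) / nj
                        for c in range(d)]
            spread = sum(r[j] * _sq_dist(p, means[j])
                         for r, p in zip(resp, points))
            variances[j] = max(spread / (nj * d), 1e-6)
    return [max(range(k), key=lambda j: r[j]) for r in resp]


def cluster_percentages(labels):
    """Percentage of items assigned to each cluster label."""
    counts = Counter(labels)
    return {label: 100.0 * c / len(labels) for label, c in counts.items()}


# Toy expression matrix (invented): two groups of three genes each.
expression = [
    [1.0, 1.0, 1.0], [1.1, 0.9, 1.0], [0.9, 1.0, 1.1],
    [5.0, 5.0, 5.0], [5.1, 4.9, 5.0], [4.9, 5.0, 5.1],
]
labels = em_cluster(distance_matrix(expression), k=2)
percentages = cluster_percentages(labels)
```

In practice the number of clusters `k` would have to be chosen or estimated, and a library implementation of Gaussian mixture models would normally replace the hand-written loop.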
Description:
Supervised by
Prof. Dr. M.A. Mottalib,
Head,
Department of Computer Science and Engineering,
Islamic University of Technology (IUT),
Co-Supervisor:
M. Arifur Rahman,
Lecturer,
Department of Computer Science and Engineering