Abstract:
Over the course of evolution, the genetic composition of species has changed considerably, and the resulting mutations have given rise to new diseases and traits. To identify the genes or coding sections of a DNA sequence, it is first necessary to locate the promoter regions or conserved regions of the DNA code. However, the databases that hold this information are poorly organized and need further study. The main challenges in finding motifs in a DNA sequence are designing a good and fast algorithm, accounting for mutations in the motifs, representing variable-length motifs, and remaining applicable across species. In this thesis we attempt to formulate a new algorithm that is fast, accurate, and effective. Instead of general string matching on DNA sequences, we map the sequences to integers and match on those integers, which is comparatively fast and accurate. In addition, a complete DNA motif-finding algorithm needs a generalized and rational fitness function for evaluating potential motifs. We therefore formulate a fitness function that allows us to compare potential motifs relative to one another and, finally, to predict a motif for the input DNA sequences.
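To illustrate the idea of integer mapping and matching, the following is a minimal sketch and not the algorithm developed in the thesis. It assumes a 2-bit code per nucleotide (A=0, C=1, G=2, T=3) and a fixed motif length k, neither of which is stated in the abstract. The function names kmer_codes, shared_kmers, and decode are hypothetical. The sketch finds only exact k-mers shared by all input sequences, so it does not handle mutated or variable-length motifs and includes no fitness function.

```python
from collections import defaultdict

# Assumed 2-bit code per nucleotide; the abstract does not specify the mapping.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def kmer_codes(seq, k):
    """Yield (start position, integer code) for every k-mer in seq.

    The code is updated in O(1) per base by shifting in two bits,
    so no substring is ever compared character by character.
    """
    mask = (1 << (2 * k)) - 1
    code = 0
    valid = 0  # consecutive recognised bases seen so far
    for i, base in enumerate(seq.upper()):
        if base not in BASE_CODE:
            # Ambiguous symbol (e.g. N): restart the window.
            code = 0
            valid = 0
            continue
        code = ((code << 2) | BASE_CODE[base]) & mask
        valid += 1
        if valid >= k:
            yield i - k + 1, code


def shared_kmers(sequences, k):
    """Return {code: set of sequence indices} for k-mers found in every sequence."""
    seen = defaultdict(set)
    for idx, seq in enumerate(sequences):
        for _, code in kmer_codes(seq, k):
            seen[code].add(idx)
    return {c: s for c, s in seen.items() if len(s) == len(sequences)}


def decode(code, k):
    """Convert an integer code back to its k-letter DNA string."""
    letters = "ACGT"
    return "".join(letters[(code >> (2 * (k - 1 - j))) & 3] for j in range(k))


if __name__ == "__main__":
    seqs = ["TTACGTT", "GGACGAA", "ACGCC"]
    for code in shared_kmers(seqs, 3):
        print(decode(code, 3))
```

Because each k-mer becomes a single integer, set membership and equality checks replace repeated string comparisons, which is the kind of speed-up the integer mapping aims for.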
Description:
Supervised by
Prof. Dr. M.A Mottalib,
Head of Department,
Department of Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur-1704, Bangladesh.