Abstract:
Electronic Mail is the “killer network application”. It is ubiquitous and pervasive. In a
relatively short timeframe, the Internet has become irrevocably and deeply
entrenched in our modern society primarily due to the power of its communication
substrate linking people and organizations around the globe. Much work on email
technology has focused on making email easy to use, permitting a wide variety of
information and information types to be conveniently and reliably sent throughout the
Internet. However, the analysis of the vast storehouse of email content accumulated
or produced by individual users has received relatively little attention other than for
specific tasks such as spam and virus filtering. Email users continuously receive
spam, which wastes their time, and harmful messages can also damage their
computers.
This thesis presents an implemented framework for data mining behavior models
from email data. The EMT is a data mining toolkit designed to analyze email corpora,
including the entire set of email sent and received by an individual user, revealing
much information about individual users as well as the behavior of groups of users in
an organization. A number of machine learning and anomaly detection algorithms
are embedded in the system to model the user’s email behavior in order to classify
email for a variety of tasks. Several methods exist for detecting spam in email.
The main goal is to develop a method that outperforms the existing methods in
detecting spam and ham and in reducing wrongly classified spam, that is, to achieve
higher accuracy than the existing methods. The second goal is to implement the
proposed algorithm so that processing time is reduced. In short, this thesis
addresses both accuracy and processing time through prioritization in detecting
spam email messages.
The proposed method prioritizes its processing criteria, a feature unavailable in
earlier methods. It also applies post-filtering, which further improves the
accuracy of the proposed method. The proposed method, which we name MAN,
performs spam detection and outperforms
the existing methods. It also makes the spam detection process convenient for the
user. By combining post-filtering, process prioritization, and several detection
criteria, the method aims to achieve the best possible spam detection accuracy.
Description:
Supervised by
Professor Dr. Md. Abdul Mottalib,
Computer Science and Engineering (CSE),
Islamic University of Technology (IUT),
Board Bazar, Gazipur-1704. Bangladesh.