This paper discusses an intrusion detection algorithm for analyzing university web server log files and describes how hierarchical clustering can be integrated with other algorithms in an intrusion detection system. It proposes using hierarchical clustering as the backbone of the system and incorporating other techniques, such as statistical methods and support vector machines (SVM), as needed.
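To make the idea of clustering web server log data concrete, the following is a minimal sketch and not the paper's actual algorithm. It assumes each log record has already been reduced to a (client IP, HTTP status) pair, uses two illustrative features per client (request count and error ratio), and applies single-linkage agglomerative clustering. The sample records, the function names, the distance threshold of 0.3, and the rule that singleton clusters are suspicious are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from math import dist

# Hypothetical, simplified log records: (client_ip, http_status).
SAMPLE_LOG = (
    [("10.0.0.1", 200)] * 5
    + [("10.0.0.2", 200)] * 5 + [("10.0.0.2", 404)]
    + [("10.0.0.3", 200)] * 4
    + [("10.0.0.4", 200)] * 4 + [("10.0.0.4", 404)]
    + [("10.0.0.9", 200)] * 4 + [("10.0.0.9", 404)] * 36
)


def client_features(records):
    """Build one feature vector per client: (request count, error ratio)."""
    counts = defaultdict(int)
    errors = defaultdict(int)
    for ip, status in records:
        counts[ip] += 1
        if status >= 400:
            errors[ip] += 1
    return {ip: (float(counts[ip]), errors[ip] / counts[ip]) for ip in counts}


def normalize(features):
    """Rescale every feature dimension to the range [0, 1]."""
    columns = list(zip(*features.values()))
    lows = [min(col) for col in columns]
    spans = [(max(col) - min(col)) or 1.0 for col in columns]  # avoid dividing by zero
    return {
        ip: tuple((v - lo) / span for v, lo, span in zip(vec, lows, spans))
        for ip, vec in features.items()
    }


def agglomerate(points, threshold):
    """Single-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    clusters = [[name] for name in points]

    def linkage(a, b):
        return min(dist(points[x], points[y]) for x in a for y in b)

    while len(clusters) > 1:
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


def suspicious_clients(records, threshold=0.3, max_cluster_size=1):
    """Flag clients that end up in very small clusters (assumed heuristic)."""
    points = normalize(client_features(records))
    clusters = agglomerate(points, threshold)
    return sorted(ip for c in clusters if len(c) <= max_cluster_size for ip in c)


if __name__ == "__main__":
    print(suspicious_clients(SAMPLE_LOG))
```

On the sample records, the four low-volume clients merge into one cluster and the high-volume, mostly-404 client remains alone, so it is the one flagged. In a fuller system along the lines the paper describes, flagged clusters could then be passed to a statistical check or an SVM for confirmation; that step is not shown here.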
From the Paper:
"The initial plan was to use the user signatures method by Seth Freeman or the Traffic Classification technique but the first method seems more suited to an OS than for web server log files and the second seems a lot more complicated and also requires a destination IP, which is not readily available from our log files. I started out by writing a statistics based algorithm but then added hierarchical clustering based on instructor feedback. Eventually I settled on this paper based on hierarchical clustering with other methods as backup although I still like the statistics approach."
Sample of Sources Used:
Tarek Abbes et al., High Performance Intrusion Detection using Traffic Classification, 11/15/2004. Research Paper. Page 1.
Wenke Lee and Salvatore J. Stolfo, Data Mining Approaches for Intrusion Detection. Referenced 4/4/2007. http://www1.cs.columbia.edu/~sal/hpapers/USENIX/usenix.html#Fayyad_1996b
Jian Pei et al., Data Mining for Intrusion Detection - Techniques, Applications and Systems. PowerPoint presentation referenced 4/15/2007. Pages 10, 64. http://www.cse.uconn.edu/icde04/tutorials/Pei.pdf
Mamoun Awad, Data Mining & Intrusion Detection Systems. PowerPoint presentation referenced 4/15/2007. http://www.utdallas.edu/~bxt043000/Lecture20.ppt#18
1998 DARPA data set from MIT Lincoln Laboratory. http://kdd.ics.uci.edu/databases/kddcup99/kddcup99.html
Misuse Intrusion Detection (2012, January 15). Retrieved February 13, 2012, from http://www.academon.com/Research-Paper-Misuse-Intrusion-Detection/97601