dangnh0611/kdd99_ids
Analysis of the KDD99 dataset and some experiments to reproduce its data collection process.
Quote from the KDD99 homepage:
This is the data set used for The Third International Knowledge Discovery and Data Mining Tools Competition, which was held in conjunction with KDD-99 The Fifth International Conference on Knowledge Discovery and Data Mining. The competition task was to build a network intrusion detector, a predictive model capable of distinguishing between "bad" connections, called intrusions or attacks, and "good" normal connections. This database contains a standard set of data to be audited, which includes a wide variety of intrusions simulated in a military network environment.
More:
Some basic analysis of the KDD99 dataset can be found at kdd99.ipynb
This kdd99_feature_extractor repository contains a lot of helpful information about reproducing the KDD99 data. The tool is used in this project for feature extraction.
The data collection process:
- Step 1: Set up a simulated network environment with multiple hosts, links, switches, etc.
- Step 2: Collect raw tcpdump data for both normal and attack traffic inside the network by peppering it with multiple attacks.
- Step 3: Extract features from the raw tcpdump data.
Simulate an SDN (Software-Defined Network) using Mininet, with ONOS as the SDN controller.
Follow the official instructions for each of them to install and set up the required environment.
The following steps assume that you have installed Mininet and ONOS successfully, and that the ONOS service is running.
`mytopo.py` contains the Python source code that creates the Mininet topology. For a better understanding, consider reading Introduction to Mininet. In this project, we create a simple network topology. It contains 4 switches, 8 hosts and 1 router, divided into 2 main partitions: a DMZ network and an internal network. Both connect to the router `r0`, which is basically a Linux host acting as a simple router.
- DMZ network: contains 2 hosts, `h1` and `h2`, connected to switch `s1`. On `h1` and `h2`, we start some services such as FTP, HTTP, SSH, etc.
- Internal network: contains hosts connected to each other in a tree topology.
In the ONOS Web UI, go to Settings and enable OpenFlow Provider Suite (activates OpenFlow) and Reactive Forwarding (automatic routing). Create an SDN with the ONOS controller as defined above with:
$ sudo python mytopo.py
(recommended method) or use it as a Mininet extension. In this case, the services running on `h1` and `h2` must be started manually; see the `mytopo.py` source code for more details.
$ sudo mn --controller=remote,ip=127.0.0.1,port=6653 --custom mytopo.py --topo tp,3,3
In the ONOS Web UI, press the `H` key to show hosts. You should see something similar to this:
The simulated network is now ready. In the Mininet CLI, you can type commands to interact with the hosts, such as `h1 ping h2` to ping from `h1` to `h2`, or `xterm h1 h2 h3`, which starts 3 xterm windows that can be used to execute commands.
The most widely used tool is Wireshark, along with other libpcap-based tools. Use Wireshark to capture traffic on a suitable network interface. For each experiment (an attack, etc.), capture on the network interface that the packets pass through. For example, if an attack is performed from h3 to h1, use Wireshark to listen on `s2-eth3`, or on `s4-eth1` of the `s4` switch.
The reason we don't listen on the `any` interface is a limitation of the extraction tool kdd99_feature_extractor that we will use. It only works with IPv4 and one of the ICMP, UDP or TCP protocols, and it doesn't work with the virtual network interface "any". Specifically, "any" is just a virtual interface that libpcap creates. Because the underlying interfaces may use different data link layer protocols, packets captured on "any" are recorded in "cooked mode", as shown in the `Encapsulation Type` field.
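Before running the extractor, it can help to confirm that a capture was not taken in cooked mode. The following is a minimal sketch and not part of this repository: it reads the global header of a classic `.pcap` file and reports the link-layer type. The function names are hypothetical; the link-type values 1 (Ethernet) and 113 (Linux cooked capture) come from the tcpdump/libpcap link-type registry. Wireshark may save captures as pcapng by default, which this sketch deliberately rejects rather than parses.

```python
import struct

# Link-layer type values from the tcpdump/libpcap registry.
LINKTYPE_ETHERNET = 1
LINKTYPE_LINUX_SLL = 113  # "cooked mode", produced by the "any" interface

# Magic numbers of the classic pcap format (microsecond and nanosecond variants).
_MAGICS = {0xA1B2C3D4, 0xA1B23C4D}


def pcap_link_type(stream):
    """Return the link-layer type stored in a classic pcap global header."""
    header = stream.read(24)  # the global header is 24 bytes long
    if len(header) < 24:
        raise ValueError("file too short to be a pcap capture")
    for endian in ("<", ">"):  # the writer's byte order is not known in advance
        magic, = struct.unpack(endian + "I", header[:4])
        if magic in _MAGICS:
            network, = struct.unpack(endian + "I", header[20:24])
            return network & 0xFFFF  # the low 16 bits hold the link type
    raise ValueError("not a classic pcap file (pcapng is not handled here)")


def is_cooked_capture(stream):
    """True if the capture uses Linux cooked mode, as from the "any" interface."""
    return pcap_link_type(stream) == LINKTYPE_LINUX_SLL
```

For example, opening a capture with `open("capture.pcap", "rb")` (a placeholder file name) and getting `True` from `is_cooked_capture` suggests re-capturing on a concrete interface such as `s4-eth1`.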
We use hosts inside our simulated network to attack each other. Some simple attacks and examples are:
TCP SYN flood
$ hping3 -V -d 100 -S -p 80 --flood 100.10.0.2
$ hping3 -V -d 120 -S -p 80 --flood --rand-source 100.10.0.2
$ hping3 -V -d 100 -S -s 10000 -p 80 -k --flood -a 100.10.0.2 100.10.0.2
..
TCP RST Flood
$ hping3 --flood --rand-source -R -p 22 100.10.0.2
$ hping3 --rand-source -F -q -d 80 -p 80 --flood 100.10.0.2
..
ICMP flood
$ hping3 --flood --rand-source -1 -p 80 100.10.0.2
..
HTTP flood
Using the HULK tool, e.g.
$ python hulk.py http://100.10.0.2/
HTTP slow
Using the Slowloris tool, e.g.
$ sudo pip3 install slowloris
$ slowloris -v -s 1000 --sleeptime 2 100.10.0.2
..
Probe attack using Nmap:
- Host discovery, port scanning, version detection, etc.
- More details: https://nmap.org/
- Examples:
$ nmap -sn 192.168.2.0/24 --disable-arp-ping
$ nmap -sT 100.10.0.2
$ nmap -sT -F 100.10.0.2
..
Some attacks, such as MAC flooding and ARP poisoning, do not operate at the IP layer or above. They are not useful here because of the limitation of the extraction tool we will use. Some other attacks are simply not related to the problem we focus on.
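To make this restriction concrete, here is a minimal sketch of a check for whether a captured frame falls within what the extractor is described as supporting (IPv4 carrying ICMP, TCP or UDP). It is an illustration only and not how kdd99_feature_extractor is implemented. The function name `is_supported_frame` is hypothetical, and the code assumes untagged Ethernet II frames (no VLAN tags).

```python
import struct

ETHERTYPE_IPV4 = 0x0800
# IP protocol numbers: ICMP = 1, TCP = 6, UDP = 17.
SUPPORTED_IP_PROTOCOLS = {1, 6, 17}


def is_supported_frame(frame: bytes) -> bool:
    """Return True if an Ethernet frame carries IPv4 with ICMP, TCP or UDP."""
    if len(frame) < 14 + 20:  # Ethernet header + minimal IPv4 header
        return False
    ethertype, = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:  # e.g. ARP (0x0806) is rejected here
        return False
    version = frame[14] >> 4  # high nibble of the first IPv4 header byte
    protocol = frame[14 + 9]  # the protocol field is byte 9 of the IPv4 header
    return version == 4 and protocol in SUPPORTED_IP_PROTOCOLS
```

Frames produced by an ARP poisoning attack carry EtherType 0x0806, so they fail the first check and would contribute nothing to the extracted features.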
Normal data can be collected in the same way as attack data. It can be produced by using services such as HTTP, FTP, SSH, VLC media streaming, etc.
In this step, we use tools to turn the raw tcpdump data into feature data. We use kdd99_feature_extractor; for more details, see mydata.ipynb.
Some related tools are:
- kdd99_feature_extractor: a tool using libpcap.
- tcpdump2gureKDDCup99: this tool uses bro-ids (zeek-ids) to extract logs from pcap files.
- Tshark, to extract some basic information from raw tcpdump pcap files.
For example:
$ tshark -r /home/dangnh/ids/test.pcap -T json
$ tshark -r INPUT.PCAP -T fields -e frame.protocols -e frame.len -e frame.time_delta -e udp.dstport -e udp.srcport -Y "ip.dst == 192.168.1.1" -E separator="," > OUTPUT.CSV
- Network-intrusion-dataset-creator
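The CSV written by the second Tshark command above has no header row, so the column names have to be supplied when loading it. The sketch below is one way to do that with the standard library; it is not part of this repository. The function name `read_tshark_csv` is hypothetical, and the column order is assumed to match the `-e` options of that command.

```python
import csv

# Column order mirrors the -e options of the tshark command above;
# tshark writes no header row unless -E header=y is given.
FIELDS = ["frame.protocols", "frame.len", "frame.time_delta",
          "udp.dstport", "udp.srcport"]


def read_tshark_csv(lines):
    """Yield one dict per packet from tshark's comma-separated field output."""
    for row in csv.reader(lines):
        if not row:
            continue  # skip blank lines
        record = dict(zip(FIELDS, row))
        record["frame.len"] = int(record["frame.len"])
        record["frame.time_delta"] = float(record["frame.time_delta"])
        # The port fields are empty for packets that are not UDP.
        for key in ("udp.dstport", "udp.srcport"):
            record[key] = int(record[key]) if record[key] else None
        yield record
```

A typical use is to open `OUTPUT.CSV` with `open("OUTPUT.CSV", newline="")` and iterate over `read_tshark_csv(f)`.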
For more details, see this report (in Vietnamese).