DFOH Runner
This is the repository that contains the code to run DFOH.
Prerequisites
Clone this repository and go into the main directory.
git clone https://forge.icube.unistra.fr/tholterbach/dfoh_runner.git
cd dfoh_runner
cd main
Install Docker using this installation guide.
Then, install the Python libraries using the requirements.txt file.
pip install -r requirements.txt
Download the database
DFOH needs a database that contains various information such as the AS topology, IRR information, PeeringDB data, etc. You can download the database from the DFOH website (50TB). You then need to uncompress it.
wget https://dfoh.uclouvain.be/static/db.tar.bz2
tar -xvf db.tar.bz2
Running DFOH
After installing the required software and dependencies and downloading the database, you can run DFOH using the run_daily.py script. Use the --help option to list the available options.
DFOH can only be executed at a daily granularity (day by day).
For instance, the following command runs DFOH on the 24th of December 2023:
python3 run_daily.py --date 2023-12-24 --date_end 2023-12-25 --db_dir /mnt/db2/
⚠️ Running DFOH takes time. There are multiple steps that are automatically executed:
- Building the database for the given day;
- Running the sampling for the given day;
- Computing feature values for the given day;
- Finding new edges for the given day;
- Making inferences for the given day;
- Parsing the results for the given day.
Most of these steps execute in a Docker container. Thus, the first time you run DFOH, Docker will automatically download several images.
DFOH writes its logs to the logs directory.
❗ Running DFOH on a day d requires a complete database for all the days prior to d. Thus, if you download the database on day d and want to run DFOH on day d+10, you first need to run DFOH on days d+1...d+9, and only then can you run it on day d+10. ℹ️ Recall that we update the database on the website only once per week, on Sunday.
🕜 Be aware that running DFOH takes a long time: several hours for a single day on our server with 24 cores and 64GB of RAM. Among other reasons, this is because DFOH relies on many different datasets that need to be downloaded, processed, etc.
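Because each day depends on the days before it, catching up on several days means invoking the script once per day, in chronological order. The sketch below only builds the corresponding command lines. It reuses the flags from the example command above and assumes that --date_end is always the day after --date, as in that example. The helper name daily_commands is hypothetical and not part of this repository, and whether run_daily.py accepts a multi-day range in a single call is not covered here.

```python
from datetime import date, timedelta


def daily_commands(first_day: date, last_day: date, db_dir: str) -> list:
    """Build one run_daily.py command per day, oldest day first.

    Hypothetical helper: it assumes --date_end is the day after --date,
    mirroring the single-day example command in this README.
    """
    commands = []
    day = first_day
    while day <= last_day:
        next_day = day + timedelta(days=1)
        commands.append([
            "python3", "run_daily.py",
            "--date", day.isoformat(),
            "--date_end", next_day.isoformat(),
            "--db_dir", db_dir,
        ])
        day = next_day
    return commands


# Example: catch up on three consecutive days.
cmds = daily_commands(date(2023, 12, 24), date(2023, 12, 26), "/mnt/db2/")
for cmd in cmds:
    print(" ".join(cmd))
```

Each printed line can be run by hand, or each list can be passed to subprocess.run(cmd, check=True) so that a failure on one day stops the sequence before later days are attempted.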
DFOH's output
The output of DFOH is in the database directory, within the new_edge and cases directories.
New edges file
Files in the new_edge directory contain the new edges found by DFOH along with some attributes.
We show below an excerpt of a new-edge file and describe its attributes.
Cx refers to the x-th column.
C1,C2,C3-C4-C5,C6
5089 210511,328474 328333 30844 3356 5089 210511,1652396649-195.60.172.0/24-102.67.56.1-328474,True
5089 210511,12779 5089 210511,1652396656-195.60.172.0/24-80.249.209.17-12779,True
5089 210511,328474 328333 22355 37662 5089 210511,1652396720-195.60.172.0/24-102.67.56.1-328474,True
6775 147028,44393 60326 147028 6775 60728,1652359471-2a04:1d40::/29-49.12.70.222-44393,False
44570 44406,264479 20764 56630 44570 44406,1652342327-2a06:a005:780::/44-2001:12f8::222:156-264479,True
44570 44406,35619 58057 174 56630 44570 44406,1652347611-2a06:a005:670::/44-2a09:4c0:100:2d88::8805-35619,True
44570 44406,51088 2914 56630 44570 44406,1652363606-2a06:a005:780::/44-2001:7f8:1::a505:1088:1-51088,True
56630 262476,44393 60326 147028 212895 56630 262476,1652342389-2804:4e4:4000::/34-49.12.70.222-44393,True
22356 57811,13786 7195 22356 57811 201029,1652320823-2a00:8dc0:ff04::/48-2001:12f8::217:161-13786,False
22356 57811,263152 271253 13786 7195 22356 57811 201029,1652320823-2a00:8dc0:ff04::/48-2001:12f8::222:107-263152,False
5606 211313,31554 8708 5606 211313,1652380317-80.96.13.0/24-86.104.125.81-31554,False
5606 211313,6720 5606 211313,1652380326-80.96.13.0/24-193.203.0.63-6720,False
5606 211313,264479 5606 211313,1652380307-80.96.13.0/24-45.6.53.167-264479,False
5606 211313,267613 5606 211313,1652380308-80.96.13.0/24-45.6.52.62-267613,False
Description of the columns:
C1: the new edge between two ASes;
C2: the AS path observed by the VP that observed the new edge;
C3: timestamp at which the new edge was observed;
C4: prefix of the BGP announcement with the new edge;
C5: peer IP of the vantage point that observed the new edge;
C6: boolean indicating whether it is a recurrent new edge, i.e., whether this new edge appears recurrently and is inferred as suspicious by DFOH.
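The column layout above can be read with a few string splits. The following is a minimal sketch, not code from this repository: the names NewEdge and parse_new_edge_line are hypothetical, the timestamp is assumed to be a Unix timestamp in seconds, and the input is assumed to contain data lines only (no C1,C2,... header). Note that the excerpt shows a fourth hyphen-separated value after the peer IP, which is not listed in the column description. In the excerpt it matches the first AS of the path, so the sketch stores it as peer_asn, which is an assumption.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class NewEdge:
    edge: tuple          # C1: the two ASes forming the new edge
    as_path: list        # C2: AS path seen by the vantage point
    timestamp: datetime  # C3: observation time (assumed Unix seconds, UTC)
    prefix: str          # C4: prefix of the BGP announcement
    peer_ip: str         # C5: peer IP of the vantage point
    peer_asn: int        # trailing value in the excerpt (assumed peer AS)
    recurrent: bool      # C6: recurrent new edge or not


def parse_new_edge_line(line: str) -> NewEdge:
    """Parse one data line of a new-edge file (hypothetical helper)."""
    c1, c2, c345, c6 = line.strip().split(",")
    as1, as2 = (int(asn) for asn in c1.split())
    # Prefixes and IPs (v4 or v6) contain no hyphen, so "-" is a safe separator.
    ts, prefix, peer_ip, peer_asn = c345.split("-")
    return NewEdge(
        edge=(as1, as2),
        as_path=[int(asn) for asn in c2.split()],
        timestamp=datetime.fromtimestamp(int(ts), tz=timezone.utc),
        prefix=prefix,
        peer_ip=peer_ip,
        peer_asn=int(peer_asn),
        recurrent=(c6 == "True"),
    )


example = "5089 210511,12779 5089 210511,1652396656-195.60.172.0/24-80.249.209.17-12779,True"
record = parse_new_edge_line(example)
print(record.edge, record.prefix, record.timestamp.date(), record.recurrent)
```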
Inference files
Files in the cases directory show the results of DFOH's inferences for all the new-edge cases found and listed in the new-edge files.
We show below an excerpt of an inference file and describe its attributes.
Cx refers to the x-th column.
C1 C2 C3 C4 C5 C6 C7
!sus 131958 264479 9 1 1 attackers:264479;victims:131958;type:1;valid_origin:True;recurrent:False;local:True
!leg 133968 134875 10 0 3 attackers:133968;victims:134875;type:1;valid_origin:True;recurrent:False;local:False
Description of the columns:
C1: `!sus` means the new edge is inferred as suspicious by DFOH, `!leg` means it is inferred as legitimate;
C2 and C3: the two ASes forming the new edge;
C4: The number of inferences that classify this new edge as legitimate;
C5: The number of inferences that classify this new edge as suspicious;
C6: The number of AS paths that were used for the inferences;
C7: Additional attributes that we describe below:
attackers: the supposed attackers;
victims: the supposed victims;
type: type of the hijack (type 1, 2, etc.);
valid_origin: True if the origin of the route with the new edge is valid, False otherwise;
recurrent: True if it is a recurrent case;
local: True if it is a local case, i.e., observed by a VP in the attacker's AS (thus likely not a hijack).
⚠️ Observe that we purposely omit some lines for clarity. These lines include details about the inferences for every sensitivity parameter used.
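The summary lines shown in the excerpt can be split on whitespace, with C7 further split on ";" and ":". This is a minimal sketch under those assumptions. The function name parse_inference_line and the dictionary keys are hypothetical, the omitted per-parameter detail lines are not handled, and attackers and victims are kept as raw strings because the excerpt does not show how multiple ASes would be encoded.

```python
def parse_inference_line(line: str) -> dict:
    """Parse one summary line ("!sus ..." or "!leg ...") of an inference file.

    Hypothetical helper based only on the excerpt in this README.
    """
    fields = line.split()
    if len(fields) != 7 or not fields[0].startswith("!"):
        raise ValueError(f"not a summary line: {line!r}")
    label, as1, as2, n_leg, n_sus, n_paths, raw_attrs = fields

    attributes = {}
    for item in raw_attrs.split(";"):
        key, _, value = item.partition(":")
        # Convert the boolean attributes; leave everything else as text.
        if value in ("True", "False"):
            attributes[key] = (value == "True")
        else:
            attributes[key] = value

    return {
        "label": label.lstrip("!"),      # C1: "sus" or "leg"
        "edge": (int(as1), int(as2)),    # C2 and C3
        "legitimate_votes": int(n_leg),  # C4
        "suspicious_votes": int(n_sus),  # C5
        "as_paths": int(n_paths),        # C6
        "attributes": attributes,        # C7
    }


example = ("!sus 131958 264479 9 1 1 attackers:264479;victims:131958;"
           "type:1;valid_origin:True;recurrent:False;local:True")
case = parse_inference_line(example)
print(case["label"], case["edge"], case["attributes"]["local"])
```

With the parsed dictionary, filtering is straightforward; for example, keeping only cases where label is "sus" and attributes["local"] is False discards the cases that the description above flags as likely not hijacks.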