This repository contains the public version of the code for our work Flowrest, published in the Proceedings of IEEE INFOCOM 2023, 17-20 May 2023, New York Area, USA.
An extended version of the paper will be available soon, and we have updated this repository with a version of our code that includes all the functionalities described in the paper.
Flowrest is a practical framework that can run Random Forest (RF) models at flow level in real-world programmable switches. It enables the embedding of large RF models into production-grade programmable hardware, for challenging inference tasks on individual traffic flows at line rate. Flowrest is implemented as open-source software using the P4 language.
For full details, please consult our paper.
There are two folders:
- P4 : the P4 code for Tofino
- Python : the Jupyter notebooks for training the machine learning models, the Python scripts for generating the match-action (M/A) table entries from the saved trained models, and the control plane code.
Each folder has two sub-folders: one for the conference version of Flowrest and the other for the full version with all the functionalities fully implemented.
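
To give a rough idea of what generating M/A table entries from a trained model involves, here is a minimal sketch. It is not the code in the Python folder and does not necessarily match the encoding Flowrest uses. It assumes a toy decision tree written as nested tuples, integer thresholds, and hypothetical feature names (`pkt_len`, `iat_us`) with assumed value ranges. Each leaf is turned into one range-match entry that maps per-feature intervals to a class label.

```python
# Hypothetical toy tree: internal nodes are (feature, threshold, left, right),
# leaves are class labels. The left branch is taken when feature <= threshold.
TREE = ("pkt_len", 100,
        ("iat_us", 500, "class_a", "class_b"),
        "class_c")

# Assumed inclusive upper bounds of each feature (lower bound is 0).
FEATURE_MAX = {"pkt_len": 65535, "iat_us": 1_000_000}


def tree_to_range_entries(node, bounds=None):
    """Return one entry per leaf: ({feature: (lo, hi)}, label), bounds inclusive."""
    if bounds is None:
        bounds = {name: (0, hi) for name, hi in FEATURE_MAX.items()}
    if not isinstance(node, tuple):
        # Leaf: emit the ranges accumulated along the path from the root.
        return [(dict(bounds), node)]
    feature, threshold, left, right = node
    lo, hi = bounds[feature]
    entries = []
    # Left subtree covers feature <= threshold.
    if lo <= min(hi, threshold):
        left_bounds = {**bounds, feature: (lo, min(hi, threshold))}
        entries += tree_to_range_entries(left, left_bounds)
    # Right subtree covers feature > threshold.
    if max(lo, threshold + 1) <= hi:
        right_bounds = {**bounds, feature: (max(lo, threshold + 1), hi)}
        entries += tree_to_range_entries(right, right_bounds)
    return entries


def classify(entries, sample):
    """Software stand-in for a range-match lookup: first matching entry wins."""
    for ranges, label in entries:
        if all(lo <= sample[f] <= hi for f, (lo, hi) in ranges.items()):
            return label
    return None


ENTRIES = tree_to_range_entries(TREE)
```

In this sketch, `classify(ENTRIES, {"pkt_len": 80, "iat_us": 400})` walks the generated entries the way a range-match table lookup would. A forest would repeat the conversion per tree and combine the per-tree labels by voting.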
The use cases considered in the paper are:
- IoT device identification task based on the publicly available UNSW-IOT Traces. The challenge is to classify traffic into one of 16 or 26 classes.
- Protocol classification with 8 protocol classes, based on the UNIBS 2009 Internet Traces.
- Intrusion detection system separating malware from benign traffic. It is based on the CICIDS 2017 Friday dataset containing DDoS attacks and normal traffic.
- Bot classification with 10 attack classes and 4 benign classes. It is based on the IoT-23 public traces. The classification task is to distinguish 14 traffic classes.
- IoT attack classification with 10 classes, based on the ToN-IoT network data.
For the conference version of the code, we provide the Python and P4 code for the UNSW-IoT device identification use case with 16 classes.
The same approach for feature/model selection and encoding to P4 applies to all the use cases.
For the full version of the code, we provide the code for the UNIBS 2009 Internet Traces use case with 8 classes.
You can access the train/test files for the examples above from this Box folder.
To reproduce any of the benchmark solutions, please refer to the respective repositories below. Any variations to the original proposals are described in the paper.
- Mousika: https://github.com/xgr19/Mousika
- Soter: https://github.com/xgr19/Soter
- NetBeacon: https://github.com/IDP-code/NetBeacon
- Planter (from Henna): https://github.com/nds-group/Henna
If you make use of this code, kindly cite our paper:
@inproceedings{flowrest-2023,
author = {Akem, Aristide Tanyi-Jong and Gucciardo, Michele and Fiore, Marco},
title = {Flowrest: Practical Flow-Level Inference in Programmable Switches with Random Forests},
year = {2023},
publisher = {},
address = {},
doi = {10.1109/INFOCOM53939.2023.10229100},
booktitle = {INFOCOM 2023 - IEEE Conference on Computer Communications},
numpages = {10},
location = {New York, USA}
}
If you need any additional information, send us an email at aristide.akem at imdea.org or beyza.butun at imdea.org.