This repository contains the source code for our work, Jewel, which will appear in the Proceedings of IEEE INFOCOM 2024, 20-23 May 2024, Vancouver, Canada.
Jewel is an in-switch inference framework that can run Random Forest (RF) models at both the packet level and the flow level in real-world programmable switches. It embeds RF models into production-grade programmable hardware for challenging inference tasks, while ensuring that all packets are classified. Jewel is implemented as open-source software using the P4 language.
For full details, please consult our paper.
There are three folders:
- P4: The P4 code for Tofino and the match-action (M/A) table entries.
- Python: The Jupyter notebook for training the machine learning models (unsw_model_analysis_26_classes.ipynb), and the Python scripts for generating the M/A table entries from the saved trained models (convert_RF_to_table_entries.py) and for generating the train/test data of the joint solution, with the features generated for the first N packets (clean_and_label_n_pkts_hybrid.py). It also contains a bash script that extracts the pcap capture and calls clean_and_label_n_pkts_hybrid.py to generate the train/test data for a specific value of N, where N is the rank of the packet at which flow-level inference is triggered.
- Controller: The Python script that implements the control plane functionality: it stores statistics, releases the flow-tracking registers occupied by a flow, and updates the target traffic table upon receiving a digest from the switch.
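To make the idea of turning a trained tree into M/A table entries more concrete, here is a minimal sketch. It is an illustration only and does not reproduce the logic of convert_RF_to_table_entries.py: the toy tree, the feature names (`pkt_len`, `ttl`), and the function `tree_to_entries` are all hypothetical. It assumes each internal node tests `feature <= threshold` and emits one range-style entry per leaf.

```python
import math

# Hypothetical toy tree (not taken from the repository). Internal nodes test
# `feature <= threshold`; leaves carry a class label.
TREE = {
    "feature": "pkt_len", "threshold": 100,
    "left": {"label": "A"},
    "right": {
        "feature": "ttl", "threshold": 64,
        "left": {"label": "B"},
        "right": {"label": "C"},
    },
}

def tree_to_entries(node, bounds=None):
    """Return one entry per leaf as ({feature: (low, high)}, label).

    Each (low, high) pair means low < value <= high.
    """
    bounds = bounds or {}
    if "label" in node:
        return [(dict(bounds), node["label"])]
    feat, thr = node["feature"], node["threshold"]
    low, high = bounds.get(feat, (-math.inf, math.inf))
    entries = []
    # Left branch: value <= threshold, so tighten the upper bound.
    entries += tree_to_entries(node["left"], {**bounds, feat: (low, min(high, thr))})
    # Right branch: value > threshold, so tighten the lower bound.
    entries += tree_to_entries(node["right"], {**bounds, feat: (max(low, thr), high)})
    return entries

entries = tree_to_entries(TREE)
for match, label in entries:
    print(match, "->", label)
```

Each resulting entry pairs a set of per-feature ranges with a class label, which is the general shape that a range-match table entry takes. A real forest has many trees and a hardware-specific encoding, both of which this sketch leaves out.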
The use cases considered in the paper are:
- IoT device identification task based on the publicly available UNSW-IOT Traces. The challenge is to classify traffic into 26 classes.
- Protocol classification with 8 protocol classes, based on the UNIBS 2009 Internet Traces.
- Malicious traffic detection task with 10 malware and 4 benign traffic classes generated from Internet of Things (IoT) devices, based on the publicly available Aposemat IoT-23 dataset.
- Cyberattack identification task with benign traffic and 9 types of cyberattacks. It is based on the ToN_IoT dataset.
We provide the Python, P4, and controller code for the UNSW-IoT device identification use case with 26 classes.
The same approach for generating the train/test data for the joint solution, feature/model selection, encoding to P4, and getting digest from the data plane applies to all the use cases.
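As a rough picture of what generating flow-level data for the first N packets can look like, the sketch below groups packets by 5-tuple and computes a few features over the first N packets of each flow. It is not the code of clean_and_label_n_pkts_hybrid.py. The packet dictionary layout, the feature names, and the choice to skip flows shorter than N are assumptions made for illustration, and packets are assumed to arrive in timestamp order.

```python
from collections import defaultdict

def first_n_flow_features(packets, n):
    """Compute toy flow-level features over the first n packets of each flow.

    Assumption: flows with fewer than n packets produce no flow-level row.
    """
    flows = defaultdict(list)
    for pkt in packets:
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        # Keep only the first n packets of each flow; later ones are ignored.
        if len(flows[key]) < n:
            flows[key].append(pkt)

    features = {}
    for key, pkts in flows.items():
        if len(pkts) < n:
            continue
        sizes = [p["size"] for p in pkts]
        features[key] = {
            "pkt_count": n,
            "total_size": sum(sizes),
            "min_size": min(sizes),
            "max_size": max(sizes),
            "duration": pkts[-1]["ts"] - pkts[0]["ts"],
        }
    return features

# Hypothetical input: one flow with four packets, one flow with a single packet.
packets = [
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": 6, "size": 60, "ts": 0},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": 6, "size": 1500, "ts": 2},
    {"src": "10.0.0.3", "dst": "10.0.0.2", "sport": 5555, "dport": 53, "proto": 17, "size": 80, "ts": 3},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": 6, "size": 40, "ts": 5},
    {"src": "10.0.0.1", "dst": "10.0.0.2", "sport": 1234, "dport": 80, "proto": 6, "size": 900, "ts": 9},
]
features = first_n_flow_features(packets, n=3)
```

With N=3, only the four-packet flow yields a feature row, and its fourth packet does not contribute to the features.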
You can access the train/test files and packet count file for the test data for this use case from this Box folder.
If you make use of this code, kindly cite our paper:
@inproceedings{jewel-2024,
author = {Akem, Aristide Tanyi-Jong and Bütün, Beyza and Gucciardo, Michele and Fiore, Marco},
title = {Jewel: Resource-Efficient Joint Packet and Flow Level Inference in Programmable Switches},
year = {2024},
publisher = {},
address = {},
url = {},
doi = {},
booktitle = {Proceedings of the 2024 IEEE International Conference on Computer Communications},
numpages = {10},
location = {Vancouver, Canada},
series = {INFOCOM 2024}
}
If you need any additional information, send us an email at beyza.butun at imdea.org or aristide.akem at imdea.org.