The popularity of Deep Learning (DL), coupled with network traffic visibility reduction due to the increased adoption of HTTPS, QUIC, and DNS-SEC, re-ignited interest towards Traffic Classification (TC). However, to tame the dependency from task-specific large labeled datasets, we need to find better ways to learn representations that are valid across tasks. In this work we investigate this problem comparing transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models (16 methods total). Using two publicly available datasets, namely MIRAGE19 (40 classes) and AppClassNet (500 classes), we show that ($i$) by using DL methods on large datasets we can obtain more general representations with (i i) contrastive learning methods yielding the best performance and (iii) meta-learning the worst one. While (iv) tree-based models can be impractical for large tasks but fit well small tasks, (v) DL methods that reuse better learned representations are closing their performance gap against trees also for small tasks.
Guarino, I., Wang, C., Finamore, A., Pescape, A., Rossi, D. (2023). Many or Few Samples?: Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification. Institute of Electrical and Electronics Engineers Inc. [10.23919/TMA58422.2023.10198965].
Many or Few Samples?: Comparing Transfer, Contrastive and Meta-Learning in Encrypted Traffic Classification
Guarino I.Primo
;
2023
Abstract
The popularity of Deep Learning (DL), coupled with network traffic visibility reduction due to the increased adoption of HTTPS, QUIC, and DNS-SEC, re-ignited interest towards Traffic Classification (TC). However, to tame the dependency from task-specific large labeled datasets, we need to find better ways to learn representations that are valid across tasks. In this work we investigate this problem comparing transfer learning, meta-learning and contrastive learning against reference Machine Learning (ML) tree-based and monolithic DL models (16 methods total). Using two publicly available datasets, namely MIRAGE19 (40 classes) and AppClassNet (500 classes), we show that ($i$) by using DL methods on large datasets we can obtain more general representations with (i i) contrastive learning methods yielding the best performance and (iii) meta-learning the worst one. While (iv) tree-based models can be impractical for large tasks but fit well small tasks, (v) DL methods that reuse better learned representations are closing their performance gap against trees also for small tasks.I documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


