Diogo Barradas (Instituto Superior Técnico, Universidade de Lisboa)

The advent of programmable switches has sparked a general interest in devising new security solutions for high-speed networks. Recently, we introduced FlowLens, a system that leverages programmable switches to efficiently support multi-purpose security network applications based on machine learning algorithms. With FlowLens, network operators are able to program their switches to automatically scan and classify flows with high accuracy for a wide range of scenarios, such as multimedia covert channel detection, website fingerprinting, or botnet traffic identification. To make this possible, FlowLens introduces a new system design that solves a fundamental tension between the need for comprehensive flow information required by machine learning algorithms and the scarcity of hardware resources available in modern programmable switches.

To tackle this tension, we faced several major challenges at the implementation and evaluation levels that have raised the bar in proving the feasibility and effectiveness of our design. First, we identified a substantial gap between the programming environment (based on the P4 programming language) targeting a software-emulated switch and a real-world proprietary switch (e.g., the Barefoot Tofino). This gap forced us to deeply restructure our code and revisit our assumptions underpinning our original flow compression technique. Second, we realized that different machine learning security tasks proposed in the literature had been fine-tuned for their specific application domains. This means that not only do they employ different classification algorithms but even the datasets used and the training processes are different from one another. As such, we had to adopt several strategies to repurpose the classification machinery of previously existing applications to ensure their compatibility with FlowLens. Lastly, the comparison between our compression technique and other related compression techniques was hampered by the lack of accessibility to the latter’s implementation. This forced us to re-implement several of such approaches and to resort to analytical comparisons of their compute, storage, and communication costs.

In this presentation, we discuss in detail how we addressed the above challenges and provide a set of guidelines that may prove useful for future practitioners in the realm of the intersection between network security and machine learning.

Speaker's biography

Diogo Barradas is a Ph.D. candidate in Information Systems and Computer Engineering at Instituto Superior Técnico, Universidade de Lisboa. He received his BSc. (2014) and MSc. (2016) from the same institution. His main research interests include network security and privacy, with particular emphasis on statistical traffic analysis and Internet censorship circumvention. He conducts his research at the Distributed Systems Group at INESC-ID Lisboa.

View More Papers

Understanding Worldwide Private Information Collection on Android

Yun Shen (NortonLifeLock Research Group), Pierre-Antoine Vervier (NortonLifeLock Research Group), Gianluca Stringhini (Boston University)

Read More

XDA: Accurate, Robust Disassembly with Transfer Learning

Kexin Pei (Columbia University), Jonas Guan (University of Toronto), David Williams-King (Columbia University), Junfeng Yang (Columbia University), Suman Jana (Columbia University)

Read More

FARE: Enabling Fine-grained Attack Categorization under Low-quality Labeled Data

Junjie Liang (The Pennsylvania State University), Wenbo Guo (The Pennsylvania State University), Tongbo Luo (Robinhood), Vasant Honavar (The Pennsylvania State University), Gang Wang (University of Illinois at Urbana-Champaign), Xinyu Xing (The Pennsylvania State University)

Read More

WeepingCAN: A Stealthy CAN Bus-off Attack

Gedare Bloom (University of Colorado Colorado Springs) Best Paper Award Winner ($300 cash prize)!

Read More