He Shuang (University of Toronto), Lianying Zhao (Carleton University and University of Toronto), David Lie (University of Toronto)

Web tracking harms user privacy. As a result, the
use of tracker detection and blocking tools is a common practice
among Internet users. However, no such tool can be perfect,
and thus there is a trade-off between avoiding breakage (caused
by unintentionally blocking some required functionality) and ne-
glecting to block some trackers. State-of-the-art tools usually rely
on user reports and developer effort to detect breakages, which
can be broadly categorized into two causes: 1) misidentifying
non-trackers as trackers, and 2) blocking mixed trackers which
blend tracking with functional components.

We propose incorporating a machine learning-based break-
age detector into the tracker detection pipeline to automatically
avoid misidentification of functional resources. For both tracker
detection and breakage detection, we propose using differential
features that can more clearly elucidate the differences caused by
blocking a request. We designed and implemented a prototype of
our proposed approach, Duumviri, for non-mixed trackers. We
then adopt it to automatically identify mixed trackers, drawing
differential features at partial-request granularity.

In the case of non-mixed trackers, evaluating Duumviri on 15K
pages shows its ability to replicate the labels of human-generated
filter lists, EasyPrivacy, with an accuracy of 97.44%. Through a
manual analysis, we find that Duumviri can identify previously
unreported trackers and its breakage detector can identify overly
strict EasyPrivacy rules that cause breakage. In the case of mixed
trackers, Duumviri is the first automated mixed tracker detector,
and achieves a lower bound accuracy of 74.19%. Duumviri has
enabled us to detect and confirm 22 previously unreported unique
trackers and 26 unique mixed trackers.

View More Papers

YuraScanner: Leveraging LLMs for Task-driven Web App Scanning

Aleksei Stafeev (CISPA Helmholtz Center for Information Security), Tim Recktenwald (CISPA Helmholtz Center for Information Security), Gianluca De Stefano (CISPA Helmholtz Center for Information Security), Soheil Khodayari (CISPA Helmholtz Center for Information Security), Giancarlo Pellegrino (CISPA Helmholtz Center for Information Security)

Read More

You Can Rand but You Can't Hide: A Holistic...

Inon Kaplan (Independent researcher), Ron even (Independent researcher), Amit Klein (The Hebrew University of Jerusalem, Israel)

Read More

mmProcess: Phase-Based Speech Reconstruction from mmWave Radar

Hyeongjun Choi, Young Eun Kwon, Ji Won Yoon (Korea University)

Read More

Can a Cybersecurity Question Answering Assistant Help Change User...

Lea Duesterwald (Carnegie Mellon University), Ian Yang (Carnegie Mellon University), Norman Sadeh (Carnegie Mellon University)

Read More