He Shuang (University of Toronto), Lianying Zhao (Carleton University and University of Toronto), David Lie (University of Toronto)

Web tracking harms user privacy. As a result, the
use of tracker detection and blocking tools is common practice
among Internet users. However, no such tool can be perfect,
and thus there is a trade-off between avoiding breakage (caused
by unintentionally blocking some required functionality) and
neglecting to block some trackers. State-of-the-art tools usually rely
on user reports and developer effort to detect breakage, which
broadly has two causes: 1) misidentifying
non-trackers as trackers, and 2) blocking mixed trackers, which
blend tracking with functional components.

We propose incorporating a machine learning-based breakage
detector into the tracker detection pipeline to automatically
avoid misidentification of functional resources. For both tracker
detection and breakage detection, we propose using differential
features that more clearly expose the differences caused by
blocking a request. We designed and implemented a prototype of
our proposed approach, Duumviri, for non-mixed trackers. We
then adapt it to automatically identify mixed trackers, drawing
differential features at partial-request granularity.
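
As a rough illustration of what a differential feature could look like, the sketch below compares two loads of the same page, one unmodified and one with a single request blocked, and records the per-feature change. The `PageSnapshot` fields and the `differential_features` helper are hypothetical names chosen for this example; they are not Duumviri's actual feature set or code.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class PageSnapshot:
    """Hypothetical summary of one page load (illustrative fields only)."""
    dom_nodes: int
    visible_text_chars: int
    requests_sent: int
    cookies_set: int


def differential_features(baseline: PageSnapshot, blocked: PageSnapshot) -> dict[str, int]:
    """Return the per-feature change (blocked minus baseline)."""
    return {
        f.name: getattr(blocked, f.name) - getattr(baseline, f.name)
        for f in fields(PageSnapshot)
    }


# Made-up numbers: blocking the request leaves the visible page unchanged
# but reduces the number of follow-up requests and cookies.
baseline = PageSnapshot(dom_nodes=1200, visible_text_chars=8000, requests_sent=90, cookies_set=14)
blocked = PageSnapshot(dom_nodes=1200, visible_text_chars=8000, requests_sent=75, cookies_set=6)
features = differential_features(baseline, blocked)
```

In a pipeline of the kind described above, vectors like `features` would be the input to the tracker and breakage classifiers; a change confined to network and cookie activity would suggest tracking, while a change in page content would suggest breakage. That interpretation is an assumption of this sketch, not a result stated here.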

In the case of non-mixed trackers, evaluating Duumviri on 15K
pages shows that it can replicate the labels of a human-generated
filter list, EasyPrivacy, with an accuracy of 97.44%. Through a
manual analysis, we find that Duumviri can identify previously
unreported trackers and that its breakage detector can identify overly
strict EasyPrivacy rules that cause breakage. In the case of mixed
trackers, Duumviri is the first automated mixed tracker detector,
and achieves a lower-bound accuracy of 74.19%. Duumviri has
enabled us to detect and confirm 22 previously unreported unique
trackers and 26 unique mixed trackers.
