Chenxu Wang (Southern University of Science and Technology (SUSTech) and The Hong Kong Polytechnic University), Fengwei Zhang (Southern University of Science and Technology (SUSTech)), Yunjie Deng (Southern University of Science and Technology (SUSTech)), Kevin Leach (Vanderbilt University), Jiannong Cao (The Hong Kong Polytechnic University), Zhenyu Ning (Hunan University), Shoumeng Yan (Ant Group), Zhengyu He (Ant Group)

Confidential computing is an emerging technique that provides users and third-party developers with an isolated and transparent execution environment. To support this technique, Arm introduced the Confidential Computing Architecture (CCA), which creates multiple isolated address spaces, known as realms, to ensure data confidentiality and integrity in security-sensitive tasks. Arm recently proposed the concept of confidential computing on GPU hardware, which is widely used in general-purpose, high-performance, and artificial intelligence computing scenarios. However, hardware and firmware supporting confidential GPU workloads remain unavailable. Existing studies leverage Trusted Execution Environments (TEEs) to secure GPU computing on Arm- or Intel-based platforms, but they are not suitable for CCA's realm-style architecture, such as using incompatible hardware or introducing a large trusted computing base (TCB). Therefore, there is a need to complement existing Arm CCA capabilities with GPU acceleration.

To address this challenge, we present CAGE to support confidential GPU computing for Arm CCA. By leveraging the existing security features in Arm CCA, CAGE ensures data security during confidential computing on unified-memory GPUs, the mainstream accelerators in Arm devices. To adapt the GPU workflow to CCA's realm-style architecture, CAGE proposes a novel shadow task mechanism to manage confidential GPU applications flexibly. Additionally, CAGE leverages the memory isolation mechanism in Arm CCA to protect data confidentiality and integrity from the strong adversary. Based on this, CAGE also optimizes security operations in memory isolation to mitigate performance overhead. Without hardware changes, our approach uses the generic hardware security primitives in Arm CCA to defend against a privileged adversary. We present two prototypes to verify CAGE's functionality and evaluate performance, respectively. Results show that CAGE effectively provides GPU support for Arm CCA with an average of 2.45% performance overhead.

View More Papers

FirmDiff: Improving the Configuration of Linux Kernels Geared Towards...

Ioannis Angelakopoulos (Boston University), Gianluca Stringhini (Boston University), Manuel Egele (Boston University)

Read More

ORL-AUDITOR: Dataset Auditing in Offline Deep Reinforcement Learning

Linkang Du (Zhejiang University), Min Chen (CISPA Helmholtz Center for Information Security), Mingyang Sun (Zhejiang University), Shouling Ji (Zhejiang University), Peng Cheng (Zhejiang University), Jiming Chen (Zhejiang University), Zhikun Zhang (CISPA Helmholtz Center for Information Security and Stanford University)

Read More

GhostType: The Limits of Using Contactless Electromagnetic Interference to...

Qinhong Jiang (Zhejiang University), Yanze Ren (Zhejiang University), Yan Long (University of Michigan), Chen Yan (Zhejiang University), Yumai Sun (University of Michigan), Xiaoyu Ji (Zhejiang University), Kevin Fu (Northeastern University), Wenyuan Xu (Zhejiang University)

Read More

TinyML meets IoBT against Sensor Hacking

Raushan Kumar Singh (IIT Ropar), Sudeepta Mishra (IIT Ropar)

Read More