Jinghua Liu (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Yi Yang (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Kai Chen (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China), Miaoqian Lin (Institute of Information Engineering, Chinese Academy of Sciences, Beijing, China; School of Cyber Security, University of Chinese Academy of Sciences, China)

When utilizing library APIs, developers should follow the API security rules to mitigate the risk of API misuse. API Parameter Security Rule (APSR) is a common type of security rule that specifies how API parameters should be safely used and places constraints on their values. Failure to comply with the APSRs can lead to severe security issues, including null pointer dereference and memory corruption. Manually analyzing numerous APIs and their parameters to construct APSRs is labor-intensive and needs to be automated. Existing studies generate APSRs from documentation and code, but the missing information and limited analysis heuristics result in missing APSRs. Due to the superior Large Language Model’s (LLM) capability in code analysis and text generation without predefined heuristics, we attempt to utilize it to address the challenge encountered in API misuse detection. However, directly utilizing LLMs leads to incorrect APSRs which may lead to false bugs in detection, and overly general APSRs that could not generate applicable detection code resulting in many security bugs undiscovered.

In this paper, we present a new framework, named GPTAid, for automatic APSRs generation by analyzing API source code with LLM and detecting API misuse caused by incorrect parameter use. To validate the correctness of the LLM-generated APSRs, we propose an execution feedback-checking approach based on the observation that security-critical API misuse is often caused by APSRs violations, and most of them result in runtime errors. Specifically, GPTAid first uses LLM to generate raw APSRs and the Right calling code, and then generates Violation code for each raw APSR by modifying the Right calling code using LLM. Subsequently, GPTAid performs dynamic execution on each piece of Violation code and further filters out the incorrect APSRs based on runtime errors. To further generate concrete APSRs, GPTAid employs a code differential analysis to refine the filtered ones. Particularly, as the programming language is more precise than natural language, GPTAid identifies the key operations within Violation code by differential analysis, and then generates the corresponding concrete APSR based on the aforementioned operations. These concrete APSRs could be precisely interpreted into applicable detection code, which proven to be effective in API misuse detection. Implementing on the dataset containing 200 randomly selected APIs from eight popular libraries, GPTAid achieves a precision of 92.3%. Moreover, it generates 6 times more APSRs than state-of-the-art detectors on a comparison dataset of previously reported bugs and APSRs. We further evaluated GPTAid on 47 applications, 210 unknown security bugs were found potentially resulting in severe security issues (e.g., system crashes), 150 of which have been confirmed by developers after our reports.

View More Papers

DUMPLING: Fine-grained Differential JavaScript Engine Fuzzing

Liam Wachter (EPFL), Julian Gremminger (EPFL), Christian Wressnegger (Karlsruhe Institute of Technology (KIT)), Mathias Payer (EPFL), Flavio Toffalini (EPFL)

Read More

Revisiting Physical-World Adversarial Attack on Traffic Sign Recognition: A...

Ningfei Wang (University of California, Irvine), Shaoyuan Xie (University of California, Irvine), Takami Sato (University of California, Irvine), Yunpeng Luo (University of California, Irvine), Kaidi Xu (Drexel University), Qi Alfred Chen (University of California, Irvine)

Read More

Cross-Origin Web Attacks via HTTP/2 Server Push and Signed...

Pinji Chen (Tsinghua University), Jianjun Chen (Tsinghua University & Zhongguancun Laboratory), Mingming Zhang (Zhongguancun Laboratory), Qi Wang (Tsinghua University), Yiming Zhang (Tsinghua University), Mingwei Xu (Tsinghua University), Haixin Duan (Tsinghua University)

Read More

Distributed Function Secret Sharing and Applications

Pengzhi Xing (University of Electronic Science and Technology of China), Hongwei Li (University of Electronic Science and Technology of China), Meng Hao (Singapore Management University), Hanxiao Chen (University of Electronic Science and Technology of China), Jia Hu (University of Electronic Science and Technology of China), Dongxiao Liu (University of Electronic Science and Technology of China)

Read More