SemFuzz: Semantics-based Automatic Generation of Proof-of-Concept Exploits


Patches and related information about software vulnerabilities are often made available to the public, aiming to facilitate timely fixes. Unfortunately, the slow pace of system updates (30 days on average) often gives attackers enough time to recover hidden bugs and attack unpatched systems. Making things worse is the potential to automatically generate exploits for input-validation flaws by reverse-engineering patches, even though such vulnerabilities are relatively rare (e.g., 5% of all Linux kernel vulnerabilities in the last few years). Less understood, however, are the implications of other bug-related information (e.g., bug descriptions in CVE), particularly whether the use of such information can facilitate exploit generation, even for other vulnerability types that have never been automatically attacked.

In this paper, we seek to use such information to generate proof-of-concept (PoC) exploits for vulnerability types that have never been automatically attacked. Unlike an input-validation flaw, which is often patched by adding missing sanitization checks, fixing other vulnerability types is more complicated and usually involves replacing a whole chunk of code. Without an understanding of the changed code, automatic exploit generation becomes less likely. To address this challenge, we present SemFuzz, a novel technique that leverages vulnerability-related text (e.g., CVE reports and Linux git logs) to guide the automatic generation of PoC exploits. Such an end-to-end approach is made possible by information extraction based on natural-language processing (NLP) and a semantics-based fuzzing process guided by the extracted information. Running over 112 Linux kernel flaws reported in the past five years, SemFuzz successfully triggered 18 of them, and further discovered one zero-day and one undisclosed vulnerability. These flaws include use-after-free, memory corruption, information leak, etc., indicating that more complicated flaws can also be automatically attacked. This finding calls into question the way vulnerability-related information is shared today.

Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security