Mass Discovery of Android Traffic Imprints through Instantiated Partial Execution


Monitoring network behaviors of mobile applications, controlling their resource access and detecting potentially harmful apps are becoming increasingly important for security protection within today’s organizations, ISPs and carriers. For this purpose, apps need to be identified from their communication, based upon their individual traffic signatures (called imprints in our research). Creating imprints for a large number of apps is nontrivial, due to the challenges in comprehensively analyzing their network activities at a large scale, for millions of apps on today’s rapidly growing app marketplaces. Prior research relies on automatic exploration of an app’s user interfaces (UIs) to trigger its network activities, which is unlikely to scale given the cost of the operation (at least 5 minutes per app) and its effectiveness (limited coverage of an app’s behaviors).

In this paper, we present Tiger (Traffic Imprint Generator), a novel technique that makes comprehensive app imprint generation possible at a massive scale. At the center of Tiger is a unique instantiated slicing technique, which aggressively prunes the program slice extracted from the app’s network-related code by evaluating each variable’s impact on possible network invariants, and removing those unlikely to contribute by assigning them concrete values. In this way, Tiger avoids exploring a large number of program paths unrelated to the app’s identifiable traffic, thereby reducing the cost of the code analysis by more than one order of magnitude in comparison with the conventional slicing and execution approach. Our experiments show that Tiger is capable of recovering an app’s full network activities within 18 seconds, achieving over 98% coverage of its identifiable packets and a 0.742% false detection rate on app identification.
Further running the technique on over 200,000 real-world Android apps (including 78.23% potentially harmful apps) leads to the discovery of surprising new types of traffic invariants, including fake device information, hardcoded time values, session IDs and credentials, as well as complicated trigger conditions for an app’s network activities, such as human involvement, Intent trigger and server-side instructions. Our findings demonstrate that many network activities cannot easily be invoked through automatic UI exploration, and code-analysis-based approaches present a promising alternative.

Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security