Google Summer of Code 2018 - Phase 1

June 14, 2018

Here’s my progress during the first phase of Google Summer of Code 2018.

Goals

There were two primary goals to be achieved:

Create a tool to generate whitelists for projects using sip (e.g. PyQt).
Maximise test coverage for Vulture.

Vulture-Whitelist

The idea was to create a tool for people running Vulture on projects that use sip to create Python bindings for C/C++ code.

Problem

Consider a scenario where a user imports a class in Python that is defined in a C++ module. If the user overrides a virtual function of that class, the override is never called from Python code directly; it is called from the C++ side. There is therefore no way to determine statically whether the method is ever used, so static analysis tools like Vulture report it as unused.
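To make the problem concrete, here is a minimal sketch of what a purely name-based static check sees. It is not Vulture's actual algorithm, only a toy approximation; the class `Widget`, its base `Base` (standing in for a C++ class exposed through sip) and the helper `unreferenced_methods` are all hypothetical names made up for this illustration.

```python
import ast

# Toy source: Base stands in for a C++ class exposed through sip (assumption).
SOURCE = '''
class Widget(Base):
    def paintEvent(self, event):  # overrides a virtual function
        self.draw()

    def draw(self):
        pass
'''


def unreferenced_methods(source):
    """Return names of functions that are defined but never referenced by name."""
    tree = ast.parse(source)
    defined = {
        node.name for node in ast.walk(tree) if isinstance(node, ast.FunctionDef)
    }
    # Collect every attribute access (self.draw) and every plain name (Base, self).
    used = {node.attr for node in ast.walk(tree) if isinstance(node, ast.Attribute)}
    used |= {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(defined - used)


print(unreferenced_methods(SOURCE))
```

Nothing in the Python source ever refers to `paintEvent`, so the toy check flags it even though the C++ side would call it at runtime.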

Proposed Solution

The most obvious solution is to parse the virtual functions declared in the sip files and write them to a file. That file can then be included in the list of files analysed by Vulture, which in turn treats these methods as used and no longer reports them as unused.

There is one caveat, though: the results may contain a false negative when an overridden method is never actually run, because Vulture considers it “used” anyway. Nonetheless, given that projects like PyQt (the main consumer of sip) have thousands of virtual functions, accepting a false negative is still better than a false positive.

Fortunately, sip already provides a way to export data in XML format. In no time, @The-Compiler merged a patch so that the XML includes a virtual="(1|0)" attribute on the Function tag, and he quickly implemented a script that parses the XML, extracts all virtual functions and saves them as a whitelist.
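The parsing step described above can be sketched roughly as follows. This is not @The-Compiler's script: the sample XML is hand-written, and everything about its layout other than the Function tag and its virtual attribute (the Module and Class elements, the name attributes) is an assumption made for the example. The output format, one `Class.method` expression per line, is also just one plausible choice; it relies on the fact that an attribute access in a whitelist file counts as a use of that name.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; the real sip XML layout may differ (assumption).
SAMPLE_XML = """
<Module name="demo">
  <Class name="Widget">
    <Function name="paintEvent" virtual="1"/>
    <Function name="resize" virtual="0"/>
  </Class>
  <Class name="Model">
    <Function name="rowCount" virtual="1"/>
  </Class>
</Module>
"""


def virtual_functions(xml_text):
    """Return sorted 'Class.method' names for every Function with virtual="1"."""
    root = ET.fromstring(xml_text)
    names = set()
    for cls in root.iter("Class"):
        for func in cls.iter("Function"):
            if func.get("virtual") == "1":
                names.add(f"{cls.get('name')}.{func.get('name')}")
    return sorted(names)


def build_whitelist(xml_text):
    """Render the virtual functions as whitelist text, one expression per line."""
    return "\n".join(virtual_functions(xml_text)) + "\n"


print(build_whitelist(SAMPLE_XML), end="")
```

The resulting text could be saved to a `.py` file and passed to Vulture alongside the project sources.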

My job was to create a plugin-based Python package as a wrapper around that script, because there are multiple generators for creating bindings and we want to support more of them incrementally. vulture-whitelist can be found here.
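One way such a plugin-based layout could look is a small registry that maps a generator name to a function producing whitelist lines. This is only a sketch of the general pattern, not the actual structure of vulture-whitelist; `GENERATORS`, `register`, `sip_whitelist` and `generate` are hypothetical names.

```python
# Hypothetical registry: binding generator name -> whitelist plugin function.
GENERATORS = {}


def register(name):
    """Decorator that registers a plugin under the given generator name."""
    def decorator(func):
        GENERATORS[name] = func
        return func
    return decorator


@register("sip")
def sip_whitelist(names):
    """Turn virtual function names into annotated whitelist lines."""
    return [f"{name}  # virtual function (sip)" for name in sorted(names)]


def generate(generator, names):
    """Dispatch to the plugin registered for the requested generator."""
    try:
        plugin = GENERATORS[generator]
    except KeyError:
        raise ValueError(f"unknown generator: {generator}") from None
    return plugin(names)


print("\n".join(generate("sip", ["Widget.paintEvent", "Model.rowCount"])))
```

Supporting another binding generator would then only require registering one more function, without touching the dispatch code.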

Code Coverage

Vulture already had excellent code coverage of 95% before I started working on maximizing it. At first, Jendrik thought that achieving 100% wouldn’t be possible without major changes to how the tests were written. But after inspecting the coverage report closely, we found that a few minor tweaks and the removal of some obsolete code did the job. It also unveiled a bug in how the tests for async functions were written: they weren’t even being run because of a faulty fixture I wrote.

Now, after the successful completion of the first phase, I look forward to working on an entirely new feature for Vulture: dynamic analysis for detecting false positives. Stay tuned for more info.