Google Summer of Code 2018 - Phase 1
June 14, 2018
Here’s my progress on the first phase of Google Summer of Code 2018.
Goals
There were two primary goals to be achieved:
- Create a tool that generates whitelists for projects using sip (e.g. PyQt).
- Maximise test coverage for Vulture.
Vulture-Whitelist
The idea was to create a tool for people who run Vulture on projects that use sip to create Python bindings for C/C++ code.
Problem
Consider a scenario where a user imports a class in Python that is defined in a C++ module. If the user overrides a virtual function of that class, the override is never called by Python code directly; it is invoked from the C++ side. There is therefore no way to determine beforehand whether that method is ever used, so static analysis tools like Vulture report it as unused.
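To make the scenario concrete, here is a minimal pure-Python sketch. Widget is a hypothetical stand-in for a sip-generated binding class, and dispatch_from_cpp is a made-up helper that mimics the C++ side looking up a handler at runtime; neither comes from sip or PyQt.

```python
class Widget:
    """Hypothetical stand-in for a class that sip generated from C++."""

    def paintEvent(self, event):
        """Virtual function with a default, do-nothing implementation."""


def dispatch_from_cpp(widget, kind, event):
    """Mimic the C++ side: the handler name is only known at runtime."""
    handler = getattr(widget, kind + "Event")
    handler(event)


class MyWidget(Widget):
    def __init__(self):
        self.painted = False

    # No Python code refers to this method by name, so a static analyser
    # sees a definition without any matching use.
    def paintEvent(self, event):
        self.painted = True


widget = MyWidget()
dispatch_from_cpp(widget, "paint", event=None)
print(widget.painted)
```

The override does run, but only through a lookup that is invisible in the source, which is the kind of method a static tool is likely to flag.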
Proposed Solution
The most obvious solution is to parse the virtual functions present in sip files and write them to a file, which can then be included in the list of files analysed by Vulture. Vulture in turn treats these methods as used and no longer reports them as unused.
It has one caveat, though: the results may contain a false negative when an overridden method is never actually run, because Vulture would consider it “used” anyway. Nonetheless, given that projects like PyQt (the main consumer of sip) have thousands of virtual functions, it is still better to accept a false negative than a false positive.
Luckily, sip already provided a way to export data in XML format. In no time, @The-Compiler merged a patch so that the XML includes a virtual="(1|0)" attribute for the Function tag, and he quickly implemented a script that parses the XML, picks out all virtual functions and saves them as a whitelist.
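The parsing step can be sketched roughly as follows. This is not @The-Compiler's script: the XML layout (Function elements nested inside Class elements, each with a name attribute) and the sample data are assumptions made for illustration, and only the virtual attribute on the Function tag is taken from the description above. The output format, one Class.method expression per line, is also an assumption.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; a real sip XML export may be structured differently.
SAMPLE_XML = """<Module name="demo">
  <Class name="Widget">
    <Function name="paintEvent" virtual="1"/>
    <Function name="show" virtual="0"/>
  </Class>
  <Class name="Dialog">
    <Function name="closeEvent" virtual="1"/>
  </Class>
</Module>"""


def virtual_methods(xml_text):
    """Return sorted 'Class.method' names for functions with virtual="1"."""
    root = ET.fromstring(xml_text)
    found = set()
    for cls in root.iter("Class"):
        for func in cls.iter("Function"):
            if func.get("virtual") == "1":
                found.add(f"{cls.get('name')}.{func.get('name')}")
    return sorted(found)


def format_whitelist(entries):
    """Render one attribute-access expression per line."""
    return "".join(f"{entry}\n" for entry in entries)


whitelist = format_whitelist(virtual_methods(SAMPLE_XML))
print(whitelist, end="")
```

Each line in the resulting file mentions the method name as an attribute access, which is the sort of reference a whitelist relies on when it is passed to Vulture alongside the project's own files.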
My job was to create a plugin-based Python package as a wrapper around that script, because there are multiple generators for creating bindings and we want to support many of them incrementally. vulture-whitelist can be found here.
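One way such a plugin-based layout could look is a small registry that maps a generator name to the function producing its whitelist entries. The sketch below only illustrates the idea; the names GENERATORS, register and create_whitelist are hypothetical and do not describe the actual vulture-whitelist API.

```python
from typing import Callable, Dict, List

# Hypothetical registry: generator name -> function that builds entries.
GENERATORS: Dict[str, Callable[[str], List[str]]] = {}


def register(name: str):
    """Decorator that adds a whitelist plugin under the given name."""
    def decorator(func):
        GENERATORS[name] = func
        return func
    return decorator


@register("sip")
def sip_whitelist(source: str) -> List[str]:
    # Placeholder body: a real plugin would parse sip's XML export here.
    return [line.strip() for line in source.splitlines() if line.strip()]


def create_whitelist(generator: str, source: str) -> List[str]:
    """Look up the plugin for a binding generator and run it."""
    try:
        plugin = GENERATORS[generator]
    except KeyError:
        raise ValueError(f"unknown generator: {generator!r}") from None
    return plugin(source)


print(create_whitelist("sip", "Widget.paintEvent\n\nDialog.closeEvent\n"))
```

With this shape, supporting another binding generator would mean adding one more registered function, without touching the code that drives the plugins.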
Code Coverage
Vulture had an excellent code coverage of 95% prior to my work on maximising the coverage ratio. At first, Jendrik thought that achieving 100% wouldn’t be possible without major changes to how the tests were written. But after inspecting the coverage report closely, we found that only a few minor tweaks and the removal of some obsolete code did the job. It also unveiled a bug in how the tests for async functions were written: they weren’t even being run because of a faulty fixture I wrote.
Now, after the successful completion of the first phase, I look forward to working on an entirely new feature for Vulture: dynamic analysis for detecting false positives. Stay tuned for more info.