⚗️ trying fastcov

This commit is contained in:
Niels Lohmann 2019-03-30 09:12:32 +01:00
parent b21c04c938
commit b12287b362
No known key found for this signature in database
GPG key ID: 7F3CEA63AE251B69
6 changed files with 494 additions and 1 deletion

Makefile

@@ -76,12 +76,20 @@ check-fast:
 coverage:
 	rm -fr build_coverage
 	mkdir build_coverage
-	cd build_coverage ; CXX=$(COMPILER_DIR)/g++ cmake .. -GNinja -DJSON_Coverage=ON -DJSON_MultipleHeaders=ON
+	cd build_coverage ; CXX=g++-7 cmake .. -GNinja -DJSON_Coverage=ON -DJSON_MultipleHeaders=ON
 	cd build_coverage ; ninja
 	cd build_coverage ; ctest -E '.*_default' -j10
 	cd build_coverage ; ninja lcov_html
 	open build_coverage/test/html/index.html
 
+fast-cov:
+	rm -fr build_coverage
+	mkdir build_coverage
+	cd build_coverage ; CXX=$(COMPILER_DIR)/g++ cmake .. -GNinja -DJSON_Coverage=ON -DJSON_MultipleHeaders=ON
+	cd build_coverage ; ninja
+	cd build_coverage ; ctest -E '.*_default' -j10
+	cd build_coverage ; ninja lcov_html2
+	open build_coverage/test/html/index.html
 ##########################################################################
 # documentation tests

test/CMakeLists.txt

@@ -51,6 +51,17 @@ if(JSON_Coverage)
         COMMAND genhtml --title "JSON for Modern C++" --legend --demangle-cpp --output-directory html --show-details --branch-coverage json.info.filtered.noexcept
         COMMENT "Generating HTML report test/html/index.html"
     )
+
+    # add target to collect coverage information and generate HTML file
+    # (filter script from https://stackoverflow.com/a/43726240/266378)
+    add_custom_target(lcov_html2
+        COMMAND ${CMAKE_SOURCE_DIR}/test/thirdparty/fastcov/fastcov.py --lcov -o json.info --gcov ${GCOV_BIN}
+        COMMAND gsed -i 's%build_coverage/%%g' json.info
+        COMMAND lcov -e json.info ${SOURCE_FILES} --output-file json.info.filtered --rc lcov_branch_coverage=1
+        COMMAND ${CMAKE_SOURCE_DIR}/test/thirdparty/imapdl/filterbr.py json.info.filtered > json.info.filtered.noexcept
+        COMMAND genhtml --title "JSON for Modern C++" --legend --demangle-cpp --output-directory html --show-details --branch-coverage json.info.filtered.noexcept
+        COMMENT "Generating HTML report test/html/index.html"
+    )
 endif()
 
 #############################################################################

test/thirdparty/fastcov/LICENSE vendored Normal file (21 additions)

@@ -0,0 +1,21 @@
The MIT License

Copyright (c) 2018-2019 Bryan Gillespie

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.

test/thirdparty/fastcov/README.md vendored Normal file (46 additions)

@@ -0,0 +1,46 @@
# fastcov

A massively parallel gcov wrapper for generating intermediate coverage formats *fast*

The goal of fastcov is to generate code coverage intermediate formats *as fast as possible* (ideally < 1 second), even for large projects with hundreds of gcda objects. The intermediate formats may then be consumed by a report generator such as lcov's genhtml, or a dedicated front end such as coveralls. fastcov was originally designed to be a drop-in replacement for lcov (application coverage only, not kernel coverage).

Currently the only supported intermediate formats are the gcov JSON format and the lcov info format. Adding support for other formats should require just a few lines of Python to transform the gcov JSON into the desired shape.
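For example, here is a minimal sketch of such a transformation (a hypothetical script, not part of fastcov; it assumes the JSON output is a list of per-file records with `file` and `lines` entries, as written by `dumpToGcovJson` in `fastcov.py` below):

```python
#!/usr/bin/env python3
# Hypothetical example: summarize fastcov's gcov JSON output as a per-file CSV.
import csv
import json
import sys

with open(sys.argv[1]) as f:  # e.g. coverage.json produced by fastcov.py
    records = json.load(f)

writer = csv.writer(sys.stdout)
writer.writerow(["file", "lines_total", "lines_hit"])
for record in records:
    hit = sum(1 for line in record["lines"] if int(line["count"]) != 0)
    writer.writerow([record["file"], len(record["lines"]), hit])
```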
In order to achieve the massive speed gains, a few constraints apply (a quick version check follows this list):

1. GCC version >= 9.0.0
   - These versions of gcov support the JSON intermediate format as well as streaming report data straight to stdout
2. Object files must either be built:
   - Using absolute paths for all `-I` flags passed to the compiler, or
   - Invoking the compiler from the same root directory

If you use CMake, you are almost certainly satisfying the second constraint (unless you care about `ExternalProject` coverage).
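A quick way to check the first constraint (the exact version string varies by platform and packaging):

```bash
$ gcov --version | head -1
gcov (GCC) 9.1.0
```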
## Sample Usage:
```bash
$ cd build_dir
$ fastcov.py --zerocounters
$ <run unit tests>
$ fastcov.py --exclude /usr/include --lcov -o report.info
$ genhtml -o code_coverage report.info
```
## Legacy fastcov

It is possible to reap most of the benefits of fastcov with GCC versions >= 7.1.0 and < 9.0.0. However, there is a *potential* loss of correctness for header file coverage.

`fastcov_legacy.py` supports GCC versions from 7.1.0 up to (but not including) 9.0.0, with a few penalties due to gcov limitations: running gcov in parallel generates .gcov header reports in parallel, and these overwrite each other. This isn't a problem unless your header files contain actual logic (i.e. a header-only library) whose coverage you want to measure. Use the `-F` flag to specify which gcda files should not be run in parallel, so that accurate header file data is captured just for those; see the example below. I don't plan on supporting `fastcov_legacy.py` aside from basic bug fixes.
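A hypothetical invocation (the `.gcda` path is only an illustration):

```bash
$ cd build_dir
$ ./fastcov_legacy.py --lcov -o report.info -F src/header_heavy_tests.gcda
$ genhtml -o code_coverage report.info
```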
## Benchmarks

Anecdotal testing on my own projects indicates that fastcov is over 100x faster than lcov and over 30x faster than gcovr:

- Project size: ~250 .gcda files, ~500 .gcov reports generated by gcov
- Time to process all gcda and parse all gcov:
  - fastcov: ~700ms
  - lcov: ~90s
  - gcovr: ~30s

test/thirdparty/fastcov/fastcov.py vendored Executable file (189 additions)

@@ -0,0 +1,189 @@
#!/usr/bin/env python3
"""
Author: Bryan Gillespie

A massively parallel gcov wrapper for generating intermediate coverage formats fast

The goal of fastcov is to generate code coverage intermediate formats as fast as possible
(ideally < 1 second), even for large projects with hundreds of gcda objects. The intermediate
formats may then be consumed by a report generator such as lcov's genhtml, or a dedicated front
end such as coveralls.

Sample Usage:
    $ cd build_dir
    $ ./fastcov.py --zerocounters
    $ <run unit tests>
    $ ./fastcov.py --exclude-gcov /usr/include --lcov -o report.info
    $ genhtml -o code_coverage report.info
"""

import re
import os
import sys
import glob
import json
import argparse
import threading
import subprocess
import multiprocessing

MINIMUM_GCOV = (9,0,0)
MINIMUM_CHUNK_SIZE = 10

# Interesting metrics
GCOVS_TOTAL = []
GCOVS_SKIPPED = []
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

def getGcovVersion(gcov):
    p = subprocess.Popen([gcov, "-v"], stdout=subprocess.PIPE)
    output = p.communicate()[0].decode('UTF-8')
    p.wait()
    version_str = re.search(r'\s([\d.]+)\s', output.split("\n")[0]).group(1)
    version = tuple(map(int, version_str.split(".")))
    return version

def removeFiles(files):
    for file in files:
        os.remove(file)

def getFilteredGcdaFiles(gcda_files, exclude):
    def excludeGcda(gcda):
        for ex in exclude:
            if ex in gcda:
                return False
        return True
    return list(filter(excludeGcda, gcda_files))

def getGcdaFiles(cwd, gcda_files):
    if not gcda_files:
        gcda_files = glob.glob(os.path.join(cwd, "**/*.gcda"), recursive=True)
    return gcda_files
def gcovWorker(cwd, gcov, files, chunk, exclude):
    p = subprocess.Popen([gcov, "-it"] + chunk, cwd=cwd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    for line in iter(p.stdout.readline, b''):
        intermediate_json = json.loads(line.decode(sys.stdout.encoding))
        intermediate_json_files = processGcovs(intermediate_json["files"], exclude)
        for f in intermediate_json_files:
            files.append(f) # thread safe, there might be a better way to do this though
        GCOVS_TOTAL.append(len(intermediate_json["files"]))
        GCOVS_SKIPPED.append(len(intermediate_json["files"]) - len(intermediate_json_files))
    p.wait()

def processGcdas(cwd, gcov, jobs, gcda_files, exclude):
    chunk_size = max(MINIMUM_CHUNK_SIZE, int(len(gcda_files) / jobs) + 1)

    threads = []
    intermediate_json_files = []
    for chunk in chunks(gcda_files, chunk_size):
        t = threading.Thread(target=gcovWorker, args=(cwd, gcov, intermediate_json_files, chunk, exclude))
        threads.append(t)
        t.start()

    log("Spawned %d gcov processes each processing at most %d gcda files" % (len(threads), chunk_size))
    for t in threads:
        t.join()
    return intermediate_json_files
def processGcov(gcov, files, exclude):
    for ex in exclude:
        if ex in gcov["file"]:
            return
    files.append(gcov)

def processGcovs(gcov_files, exclude):
    files = []
    for gcov in gcov_files:
        processGcov(gcov, files, exclude)
    return files
def dumpToLcovInfo(cwd, intermediate, output):
    with open(output, "w") as f:
        for file in intermediate:
            # Convert to absolute path so it plays nice with genhtml
            sf = file["file"]
            if not os.path.isabs(file["file"]):
                sf = os.path.abspath(os.path.join(cwd, file["file"]))
            f.write("SF:%s\n" % sf)

            fn_miss = 0
            for function in file["functions"]:
                f.write("FN:%s,%s\n" % (function["start_line"], function["name"]))
                f.write("FNDA:%s,%s\n" % (function["execution_count"], function["name"]))
                fn_miss += int(function["execution_count"] == 0) # count functions that were never executed
            f.write("FNF:%s\n" % len(file["functions"]))
            f.write("FNH:%s\n" % (len(file["functions"]) - fn_miss))

            line_miss = 0
            for line in file["lines"]:
                f.write("DA:%s,%s\n" % (line["line_number"], line["count"]))
                line_miss += int(line["count"] == 0) # count lines that were never executed
            f.write("LF:%s\n" % len(file["lines"]))
            f.write("LH:%s\n" % (len(file["lines"]) - line_miss))
            f.write("end_of_record\n")

def dumpToGcovJson(intermediate, output):
    with open(output, "w") as f:
        json.dump(intermediate, f)

def log(line):
    if not args.quiet:
        print(line)
def main(args):
    # Need at least gcov 9.0.0 because that's when gcov JSON and stdout streaming was introduced
    current_gcov_version = getGcovVersion(args.gcov)
    if current_gcov_version < MINIMUM_GCOV:
        sys.stderr.write("Minimum gcov version {} required, found {}\n".format(".".join(map(str, MINIMUM_GCOV)), ".".join(map(str, current_gcov_version))))
        exit(1)

    gcda_files = getGcdaFiles(args.directory, args.gcda_files)
    log("%d .gcda files" % len(gcda_files))

    if args.excludepre:
        gcda_files = getFilteredGcdaFiles(gcda_files, args.excludepre)
        log("%d .gcda files after filtering" % len(gcda_files))

    # We "zero" the "counters" by simply deleting all gcda files
    if args.zerocounters:
        removeFiles(gcda_files)
        log("%d .gcda files removed" % len(gcda_files))
        return

    intermediate_json_files = processGcdas(args.cdirectory, args.gcov, args.jobs, gcda_files, args.excludepost)

    gcov_total = sum(GCOVS_TOTAL)
    gcov_skipped = sum(GCOVS_SKIPPED)
    log("%d .gcov files generated by gcov" % gcov_total)
    log("%d .gcov files processed by fastcov (%d skipped)" % (gcov_total - gcov_skipped, gcov_skipped))

    if args.lcov:
        dumpToLcovInfo(args.cdirectory, intermediate_json_files, args.output)
        log("Created lcov info file '%s'" % args.output)
    else:
        dumpToGcovJson(intermediate_json_files, args.output)
        log("Created gcov json file '%s'" % args.output)
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='A parallel gcov wrapper for fast coverage report generation')
    parser.add_argument('-z', '--zerocounters', dest='zerocounters', action="store_true", help='Recursively delete all gcda files')
    parser.add_argument('-f', '--gcda-files', dest='gcda_files', nargs="+", default=[], help='Specify exactly which gcda files should be processed instead of recursively searching the search directory.')
    parser.add_argument('-E', '--exclude-gcda', dest='excludepre', nargs="+", default=[], help='.gcda filter - Exclude gcda files from being processed via simple find matching (not regex)')
    parser.add_argument('-e', '--exclude-gcov', dest='excludepost', nargs="+", default=[], help='.gcov filter - Exclude gcov files from being processed via simple find matching (not regex)')
    parser.add_argument('-g', '--gcov', dest='gcov', default='gcov', help='which gcov binary to use')
    parser.add_argument('-d', '--search-directory', dest='directory', default=".", help='Base directory to recursively search for gcda files (default: .)')
    parser.add_argument('-c', '--compiler-directory', dest='cdirectory', default=".", help='Base directory compiler was invoked from (default: .)')
    parser.add_argument('-j', '--jobs', dest='jobs', type=int, default=multiprocessing.cpu_count(), help='Number of parallel gcov to spawn (default: %d).' % multiprocessing.cpu_count())
    parser.add_argument('-o', '--output', dest='output', default="coverage.json", help='Name of output file (default: coverage.json)')
    parser.add_argument('-i', '--lcov', dest='lcov', action="store_true", help='Output in lcov info format instead of gcov json')
    parser.add_argument('-q', '--quiet', dest='quiet', action="store_true", help='Suppress output to stdout')
    args = parser.parse_args()
    main(args)

test/thirdparty/fastcov/fastcov_legacy.py vendored Executable file (218 additions)

@@ -0,0 +1,218 @@
#!/usr/bin/env python3
"""
Author: Bryan Gillespie

Legacy version... supports versions 7.1.0 <= GCC < 9.0.0

A massively parallel gcov wrapper for generating intermediate coverage formats fast

The goal of fastcov is to generate code coverage intermediate formats as fast as possible
(ideally < 1 second), even for large projects with hundreds of gcda objects. The intermediate
formats may then be consumed by a report generator such as lcov's genhtml, or a dedicated front
end such as coveralls.

Sample Usage:
    $ cd build_dir
    $ ./fastcov.py --exclude-gcov /usr/include --lcov -o report.info
    $ genhtml -o code_coverage report.info
"""

import re
import os
import glob
import json
import argparse
import subprocess
import multiprocessing
from random import shuffle

MINIMUM_GCOV = (7,1,0)
MINIMUM_CHUNK_SIZE = 10
def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]

def getGcovVersion(gcov):
    p = subprocess.Popen([gcov, "-v"], stdout=subprocess.PIPE)
    output = p.communicate()[0].decode('UTF-8')
    p.wait()
    version_str = re.search(r'\s([\d.]+)\s', output.split("\n")[0]).group(1)
    version = tuple(map(int, version_str.split(".")))
    return version

def removeFiles(files):
    for file in files:
        os.remove(file)

def getFilteredGcdaFiles(gcda_files, exclude):
    def excludeGcda(gcda):
        for ex in exclude:
            if ex in gcda:
                return False
        return True
    return list(filter(excludeGcda, gcda_files))

def getGcdaFiles(cwd, gcda_files, exclude):
    if not gcda_files:
        gcda_files = glob.glob(os.path.join(cwd, "**/*.gcda"), recursive=True)
    if exclude:
        return getFilteredGcdaFiles(gcda_files, exclude)
    return gcda_files
def getGcovFiles(cwd):
    return glob.glob(os.path.join(cwd, "*.gcov"))

# Note: unused helper; the actual filtering happens in processGcov below
# (as written, it relies on the module-level 'args' from the __main__ block)
def filterGcovFiles(gcov):
    with open(gcov) as f:
        path = f.readline()[5:]
        for ex in args.exclude:
            if ex in path:
                return False
        return True
def processGcdasPre9(cwd, gcov, jobs, gcda_files):
    chunk_size = max(MINIMUM_CHUNK_SIZE, int(len(gcda_files) / jobs) + 1) # max, so each gcov gets at least MINIMUM_CHUNK_SIZE files (as in fastcov.py)
    processes = []
    # shuffle(gcda_files) # improves performance by preventing any one gcov from bottlenecking on a list of sequential, expensive gcdas (?)
    for chunk in chunks(gcda_files, chunk_size):
        processes.append(subprocess.Popen([gcov, "-i"] + chunk, cwd=cwd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL))
    for p in processes:
        p.wait()

def processGcdasPre9Accurate(cwd, gcov, gcda_files, exclude):
    intermediate_json_files = []
    for gcda in gcda_files:
        subprocess.Popen([gcov, "-i", gcda], cwd=cwd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL).wait()
        gcov_files = getGcovFiles(cwd)
        intermediate_json_files += processGcovs(gcov_files, exclude)
        removeFiles(gcov_files)
    return intermediate_json_files
def processGcovLine(file, line):
    line_type, data = line.split(":", 1)
    if line_type == "lcount":
        num, count = data.split(",")
        hit = (int(count) != 0) # count is parsed from text, so compare numerically
        file["lines_hit"] += int(hit)
        file["lines"].append({
            "branches": [],
            "line_number": num,
            "count": count,
            "unexecuted_block": not hit
        })
    elif line_type == "function":
        num, count, name = data.split(",")
        hit = (int(count) != 0)
        file["functions_hit"] += int(hit)
        file["functions"].append({
            "name": name,
            "execution_count": count,
            "start_line": num,
            "end_line": None,
            "blocks": None,
            "blocks_executed": None,
            "demangled_name": None
        })
def processGcov(files, gcov, exclude):
    with open(gcov) as f:
        path = f.readline()[5:].rstrip()
        for ex in exclude:
            if ex in path:
                return False
        file = {
            "file": path,
            "functions": [],
            "functions_hit": 0,
            "lines": [],
            "lines_hit": 0
        }
        for line in f:
            processGcovLine(file, line.rstrip())
    files.append(file)
    return True

def processGcovs(gcov_files, exclude):
    files = []
    filtered = 0
    for gcov in gcov_files:
        filtered += int(not processGcov(files, gcov, exclude))
    print("Skipped %d .gcov files" % filtered)
    return files
def dumpToLcovInfo(intermediate, output):
    with open(output, "w") as f:
        for file in intermediate:
            f.write("SF:%s\n" % file["file"])
            for function in file["functions"]:
                f.write("FN:%s,%s\n" % (function["start_line"], function["name"]))
                f.write("FNDA:%s,%s\n" % (function["execution_count"], function["name"]))
            f.write("FNF:%s\n" % len(file["functions"]))
            f.write("FNH:%s\n" % file["functions_hit"])
            for line in file["lines"]:
                f.write("DA:%s,%s\n" % (line["line_number"], line["count"]))
            f.write("LF:%s\n" % len(file["lines"]))
            f.write("LH:%s\n" % file["lines_hit"])
            f.write("end_of_record\n")

def dumpToGcovJson(intermediate, output):
    with open(output, "w") as f:
        json.dump(intermediate, f)
def main(args):
    # Need at least gcov 7.1.0 because of bug not allowing -i in conjunction with multiple files
    # See: https://github.com/gcc-mirror/gcc/commit/41da7513d5aaaff3a5651b40edeccc1e32ea785a
    current_gcov_version = getGcovVersion(args.gcov)
    if current_gcov_version < MINIMUM_GCOV:
        print("Minimum gcov version {} required, found {}".format(".".join(map(str, MINIMUM_GCOV)), ".".join(map(str, current_gcov_version))))
        exit(1)

    gcda_files = getGcdaFiles(args.directory, args.gcda_files, args.excludepre)
    print("Found %d .gcda files" % len(gcda_files))

    # We "zero" the "counters" by simply deleting all gcda files
    if args.zerocounters:
        removeFiles(gcda_files)
        print("Removed %d .gcda files" % len(gcda_files))
        return

    # If we are less than gcov 9.0.0, convert .gcov files to GCOV 9 JSON format
    processGcdasPre9(args.cdirectory, args.gcov, args.jobs, gcda_files)
    gcov_files = getGcovFiles(args.cdirectory)
    print("Found %d .gcov files" % len(gcov_files))
    intermediate_json_files = processGcovs(gcov_files, args.excludepost)
    removeFiles(gcov_files)
    intermediate_json_files += processGcdasPre9Accurate(args.cdirectory, args.gcov, args.gcda_files_accurate, args.excludepost)

    if args.lcov:
        dumpToLcovInfo(intermediate_json_files, args.output)
    else:
        dumpToGcovJson(intermediate_json_files, args.output)
if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='A parallel gcov wrapper for fast coverage report generation')
    parser.add_argument('-z', '--zerocounters', dest='zerocounters', action="store_true", help='Recursively delete all gcda files')
    parser.add_argument('-f', '--gcda-files', dest='gcda_files', nargs="+", default=[], help='Specify exactly which gcda files should be processed instead of recursively searching the search directory.')
    parser.add_argument('-F', '--gcda-files-accurate', dest='gcda_files_accurate', nargs="+", default=[], help='(< gcov 9.0.0) Get accurate header coverage information for just these. These files cannot be processed in parallel')
    parser.add_argument('-E', '--exclude-gcda', dest='excludepre', nargs="+", default=[], help='.gcda filter - Exclude gcda files from being processed via simple find matching (not regex)')
    parser.add_argument('-e', '--exclude-gcov', dest='excludepost', nargs="+", default=[], help='.gcov filter - Exclude gcov files from being processed via simple find matching (not regex)')
    parser.add_argument('-g', '--gcov', dest='gcov', default='gcov', help='which gcov binary to use')
    parser.add_argument('-d', '--search-directory', dest='directory', default=".", help='Base directory to recursively search for gcda files (default: .)')
    parser.add_argument('-c', '--compiler-directory', dest='cdirectory', default=".", help='Base directory compiler was invoked from (default: .)')
    parser.add_argument('-j', '--jobs', dest='jobs', type=int, default=multiprocessing.cpu_count(), help='Number of parallel gcov to spawn (default: %d).' % multiprocessing.cpu_count())
    parser.add_argument('-o', '--output', dest='output', default="coverage.json", help='Name of output file (default: coverage.json)')
    parser.add_argument('-i', '--lcov', dest='lcov', action="store_true", help='Output in lcov info format instead of gcov json')
    args = parser.parse_args()
    main(args)