⚗️ trying fastcov

2019-03-30 09:12:32 +01:00 · 2019-03-30 09:12:32 +01:00 · b12287b362
commit b12287b362
parent b21c04c938
6 changed files with 494 additions and 1 deletions
--- a/10
+++ b/10
@@ -76,12 +76,20 @@ check-fast:
 coverage:
 	rm -fr build_coverage
 	mkdir build_coverage
-	cd build_coverage ; CXX=$(COMPILER_DIR)/g++ cmake .. -GNinja -DJSON_Coverage=ON -DJSON_MultipleHeaders=ON
+	cd build_coverage ; CXX=g++-7 cmake .. -GNinja -DJSON_Coverage=ON -DJSON_MultipleHeaders=ON
 	cd build_coverage ; ninja
 	cd build_coverage ; ctest -E '.*_default' -j10
 	cd build_coverage ; ninja lcov_html
 	open build_coverage/test/html/index.html
 fast-cov:
 	rm -fr build_coverage
 	mkdir build_coverage
 	cd build_coverage ; CXX=$(COMPILER_DIR)/g++ cmake .. -GNinja -DJSON_Coverage=ON -DJSON_MultipleHeaders=ON
 	cd build_coverage ; ninja
 	cd build_coverage ; ctest -E '.*_default' -j10
 	cd build_coverage ; ninja lcov_html2
 	open build_coverage/test/html/index.html
 ##########################################################################
 # documentation tests
--- a/test/CMakeLists.txt
+++ b/test/CMakeLists.txt
@@ -51,6 +51,17 @@ if(JSON_Coverage)
        COMMAND genhtml --title "JSON for Modern C++" --legend --demangle-cpp --output-directory html --show-details --branch-coverage json.info.filtered.noexcept
        COMMENT "Generating HTML report test/html/index.html"
    )
    # add target to collect coverage information and generate HTML file
    # (filter script from https://stackoverflow.com/a/43726240/266378)
    add_custom_target(lcov_html2
        COMMAND ${CMAKE_SOURCE_DIR}/test/thirdparty/fastcov/fastcov.py --lcov -o json.info --gcov ${GCOV_BIN}
        COMMAND gsed -i 's%build_coverage/%%g' json.info
        COMMAND lcov -e json.info ${SOURCE_FILES} --output-file json.info.filtered --rc lcov_branch_coverage=1
        COMMAND ${CMAKE_SOURCE_DIR}/test/thirdparty/imapdl/filterbr.py json.info.filtered > json.info.filtered.noexcept
        COMMAND genhtml --title "JSON for Modern C++" --legend --demangle-cpp --output-directory html --show-details --branch-coverage json.info.filtered.noexcept
        COMMENT "Generating HTML report test/html/index.html"
    )
 endif()
 #############################################################################
--- a/test/thirdparty/fastcov/LICENSE
+++ b/test/thirdparty/fastcov/LICENSE
@@ -0,0 +1,21 @@
 The MIT License
 Copyright (c) 2018-2019 Bryan Gillespie
 Permission is hereby granted, free of charge, to any person obtaining a copy
 of this software and associated documentation files (the "Software"), to deal
 in the Software without restriction, including without limitation the rights
 to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 copies of the Software, and to permit persons to whom the Software is
 furnished to do so, subject to the following conditions:
 The above copyright notice and this permission notice shall be included in
 all copies or substantial portions of the Software.
 THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 THE SOFTWARE.
--- a/test/thirdparty/fastcov/README.md
+++ b/test/thirdparty/fastcov/README.md
@@ -0,0 +1,46 @@
 # fastcov
 A massively parallel gcov wrapper for generating intermediate coverage formats *fast*
 The goal of fastcov is to generate code coverage intermediate formats *as fast as possible* (ideally < 1 second), even for large projects with hundreds of gcda objects. The intermediate formats may then be consumed by a report generator such as lcov's genhtml, or a dedicated front end such as coveralls. fastcov was originally designed to be a drop-in replacement for lcov (application coverage only, not kernel coverage).
 Currently the only intermediate formats supported are gcov json format and lcov info format. Adding support for other formats should require just a few lines of python to transform gcov json format to the desired shape.
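
As a rough illustration of the "few lines of python" claim, the sketch below turns one gcov JSON file record into lcov info lines. The field names (`file`, `functions`, `lines`, `start_line`, `execution_count`, `line_number`, `count`) are the ones `fastcov.py` reads; the function name `gcov_record_to_lcov` and the sample record are hypothetical and exist only for this example.

```python
def gcov_record_to_lcov(record):
    """Convert one gcov JSON file record into a list of lcov .info lines."""
    out = ["SF:%s" % record["file"]]

    functions = record.get("functions", [])
    for fn in functions:
        out.append("FN:%s,%s" % (fn["start_line"], fn["name"]))
        out.append("FNDA:%s,%s" % (fn["execution_count"], fn["name"]))
    out.append("FNF:%d" % len(functions))
    # A function counts as hit when it was executed at least once
    out.append("FNH:%d" % sum(1 for fn in functions if fn["execution_count"] > 0))

    lines = record.get("lines", [])
    for line in lines:
        out.append("DA:%s,%s" % (line["line_number"], line["count"]))
    out.append("LF:%d" % len(lines))
    out.append("LH:%d" % sum(1 for line in lines if line["count"] > 0))

    out.append("end_of_record")
    return out


# Hypothetical record, shaped like the entries fastcov.py iterates over
sample = {
    "file": "/src/a.cpp",
    "functions": [
        {"name": "used", "start_line": 3, "execution_count": 2},
        {"name": "unused", "start_line": 9, "execution_count": 0},
    ],
    "lines": [
        {"line_number": 3, "count": 2},
        {"line_number": 4, "count": 2},
        {"line_number": 9, "count": 0},
    ],
}

lcov_lines = gcov_record_to_lcov(sample)
print("\n".join(lcov_lines))
```

Another target format would presumably follow the same pattern: walk the record once and emit whatever shape the consumer expects.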
 In order to achieve the massive speed gains, a few constraints apply:
 1. GCC version >= 9.0.0
 These versions of GCOV have support for JSON intermediate format as well as streaming report data straight to stdout
 2. Object files must either be built:
 - Using absolute paths for all `-I` flags passed to the compiler
 - Invoking the compiler from the same root directory
 If you use CMake, you are almost certainly satisfying the second constraint (unless you care about `ExternalProject` coverage).
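
The first constraint can be checked up front. The sketch below is one possible way to do it, assuming the first line of `gcov -v` contains a dotted `major.minor.patch` version; the helper names and the sample banner strings are hypothetical, and real banners vary between distributions.

```python
import re

MINIMUM_GCOV = (9, 0, 0)


def parse_gcov_version(banner_line):
    """Extract the last dotted x.y.z version found on a `gcov -v` banner line."""
    matches = re.findall(r'(\d+)\.(\d+)\.(\d+)', banner_line)
    if not matches:
        raise ValueError("no version found in %r" % banner_line)
    # Distribution banners may repeat a version inside parentheses; take the last one
    return tuple(int(part) for part in matches[-1])


def supports_json_streaming(version):
    """Tuples compare element by element, so (9, 1, 0) >= (9, 0, 0) holds."""
    return version >= MINIMUM_GCOV


# Hypothetical banner lines, for illustration only
for banner in ("gcov (GCC) 9.1.0", "gcov (Homebrew GCC 8.3.0) 8.3.0"):
    version = parse_gcov_version(banner)
    print(banner, "->", version, supports_json_streaming(version))
```

Comparing integer tuples avoids the pitfalls of comparing version strings, where `"10.0.0" < "9.0.0"` lexicographically.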
 ## Sample Usage:
 ```bash
 $ cd build_dir
 $ fastcov.py --zerocounters
 $ <run unit tests>
 $ fastcov.py --exclude-gcov /usr/include --lcov -o report.info
 $ genhtml -o code_coverage report.info
 ```
 ## Legacy fastcov
 It is possible to reap most of the benefits of fastcov for GCC version < 9.0.0 and >= 7.1.0. However, coverage data for header files may *potentially* be incorrect.
 `fastcov_legacy.py` can be used with pre GCC-9 down to GCC 7.1.0 but with a few penalties due to gcov limitations. This is because running gcov in parallel generates .gcov header reports in parallel which overwrite each other. This isn't a problem unless your header files have actual logic (i.e. header only library) that you want to measure coverage for. Use the `-F` flag to specify which gcda files should not be run in parallel in order to capture accurate header file data just for those. I don't plan on supporting `fastcov_legacy.py` aside from basic bug fixes.
 ## Benchmarks
 Anecdotal testing on my own projects indicates that fastcov is over 100x faster than lcov and over 30x faster than gcovr:
 Project Size: ~250 .gcda, ~500 .gcov generated by gcov
 Time to process all gcda and parse all gcov:
 - fastcov: ~700ms
 - lcov:    ~90s
 - gcovr:   ~30s
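
The speedup factors quoted above follow directly from these timings. A quick check, using the approximate figures from the list (the variable names are made up for this example):

```python
# Approximate timings from the benchmark list, in seconds
timings_s = {"fastcov": 0.7, "lcov": 90.0, "gcovr": 30.0}

# Speedup of fastcov relative to each other tool
speedups = {
    tool: seconds / timings_s["fastcov"]
    for tool, seconds in timings_s.items()
    if tool != "fastcov"
}

for tool, factor in speedups.items():
    print("fastcov vs %s: about %.0fx faster" % (tool, factor))
```

Since the inputs are rounded, anecdotal measurements, the resulting factors are only indicative.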
--- a/test/thirdparty/fastcov/fastcov.py
+++ b/test/thirdparty/fastcov/fastcov.py
@@ -0,0 +1,189 @@
 #!/usr/bin/env python3
 """
    Author: Bryan Gillespie
    A massively parallel gcov wrapper for generating intermediate coverage formats fast
    The goal of fastcov is to generate code coverage intermediate formats as fast as possible
    (ideally < 1 second), even for large projects with hundreds of gcda objects. The intermediate
    formats may then be consumed by a report generator such as lcov's genhtml, or a dedicated front
    end such as coveralls.
    Sample Usage:
        $ cd build_dir
        $ ./fastcov.py --zerocounters
        $ <run unit tests>
        $ ./fastcov.py --exclude-gcov /usr/include --lcov -o report.info
        $ genhtml -o code_coverage report.info
 """
 import re
 import os
 import sys
 import glob
 import json
 import argparse
 import threading
 import subprocess
 import multiprocessing
 MINIMUM_GCOV = (9,0,0)
 MINIMUM_CHUNK_SIZE = 10
 # Interesting metrics
 GCOVS_TOTAL = []
 GCOVS_SKIPPED = []
 def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
 def getGcovVersion(gcov):
    p = subprocess.Popen([gcov, "-v"], stdout=subprocess.PIPE)
    output = p.communicate()[0].decode('UTF-8')
    p.wait()
    # The version may be the last token on the line, so accept end-of-line after it as well
    version_str = re.search(r'\s([\d.]+)(?:\s|$)', output.split("\n")[0]).group(1)
    version = tuple(map(int, version_str.split(".")))
    return version
 def removeFiles(files):
    for file in files:
        os.remove(file)
 def getFilteredGcdaFiles(gcda_files, exclude):
    def excludeGcda(gcda):
        for ex in exclude:
            if ex in gcda:
                return False
        return True
    return list(filter(excludeGcda, gcda_files))
 def getGcdaFiles(cwd, gcda_files):
    if not gcda_files:
        gcda_files = glob.glob(os.path.join(cwd, "**/*.gcda"), recursive=True)
    return gcda_files
 def gcovWorker(cwd, gcov, files, chunk, exclude):
    p = subprocess.Popen([gcov, "-it"] + chunk, cwd=cwd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    for line in iter(p.stdout.readline, b''):
        intermediate_json = json.loads(line.decode(sys.stdout.encoding))
        intermediate_json_files = processGcovs(intermediate_json["files"], exclude)
        for f in intermediate_json_files:
            files.append(f) #thread safe, there might be a better way to do this though
        GCOVS_TOTAL.append(len(intermediate_json["files"]))
        GCOVS_SKIPPED.append(len(intermediate_json["files"])-len(intermediate_json_files))
    p.wait()
 def processGcdas(cwd, gcov, jobs, gcda_files, exclude):
    chunk_size = max(MINIMUM_CHUNK_SIZE, int(len(gcda_files) / jobs) + 1)
    threads = []
    intermediate_json_files = []
    for chunk in chunks(gcda_files, chunk_size):
        t = threading.Thread(target=gcovWorker, args=(cwd, gcov, intermediate_json_files, chunk, exclude))
        threads.append(t)
        t.start()
    log("Spawned %d gcov processes each processing at most %d gcda files" % (len(threads), chunk_size))
    for t in threads:
        t.join()
    return intermediate_json_files
 def processGcov(gcov, files, exclude):
    for ex in exclude:
        if ex in gcov["file"]:
            return
    files.append(gcov)
 def processGcovs(gcov_files, exclude):
    files = []
    for gcov in gcov_files:
        processGcov(gcov, files, exclude)
    return files
 def dumpToLcovInfo(cwd, intermediate, output):
    with open(output, "w") as f:
        for file in intermediate:
            #Convert to absolute path so it plays nice with genhtml
            sf = file["file"]
            if not os.path.isabs(file["file"]):
                sf = os.path.abspath(os.path.join(cwd, file["file"]))
            f.write("SF:%s\n" % sf)
            fn_miss = 0
            for function in file["functions"]:
                f.write("FN:%s,%s\n" % (function["start_line"], function["name"]))
                f.write("FNDA:%s,%s\n" % (function["execution_count"], function["name"]))
                fn_miss += int(function["execution_count"] == 0)
            f.write("FNF:%s\n" % len(file["functions"]))
            f.write("FNH:%s\n" % (len(file["functions"]) - fn_miss))
            line_miss = 0
            for line in file["lines"]:
                f.write("DA:%s,%s\n" % (line["line_number"], line["count"]))
                line_miss += int(line["count"] == 0)
            f.write("LF:%s\n" % len(file["lines"]))
            f.write("LH:%s\n" % (len(file["lines"]) - line_miss))
            f.write("end_of_record\n")
 def dumpToGcovJson(intermediate, output):
    with open(output, "w") as f:
        json.dump(intermediate, f)
 def log(line):
    if not args.quiet:
        print(line)
 def main(args):
    # Need at least gcov 9.0.0 because that's when gcov JSON and stdout streaming was introduced
    current_gcov_version = getGcovVersion(args.gcov)
    if current_gcov_version < MINIMUM_GCOV:
        sys.stderr.write("Minimum gcov version {} required, found {}\n".format(".".join(map(str, MINIMUM_GCOV)), ".".join(map(str, current_gcov_version))))
        sys.exit(1)
    gcda_files = getGcdaFiles(args.directory, args.gcda_files)
    log("%d .gcda files" % len(gcda_files))
    if args.excludepre:
        gcda_files = getFilteredGcdaFiles(gcda_files, args.excludepre)
        log("%d .gcda files after filtering" % len(gcda_files))
    # We "zero" the "counters" by simply deleting all gcda files
    if args.zerocounters:
        removeFiles(gcda_files)
        log("%d .gcda files removed" % len(gcda_files))
        return
    intermediate_json_files = processGcdas(args.cdirectory, args.gcov, args.jobs, gcda_files, args.excludepost)
    gcov_total = sum(GCOVS_TOTAL)
    gcov_skipped = sum(GCOVS_SKIPPED)
    log("%d .gcov files generated by gcov" % gcov_total)
    log("%d .gcov files processed by fastcov (%d skipped)" % (gcov_total - gcov_skipped, gcov_skipped))
    if args.lcov:
        dumpToLcovInfo(args.cdirectory, intermediate_json_files, args.output)
        log("Created lcov info file '%s'" % args.output)
    else:
        dumpToGcovJson(intermediate_json_files, args.output)
        log("Created gcov json file '%s'" % args.output)
 if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='A parallel gcov wrapper for fast coverage report generation')
    parser.add_argument('-z', '--zerocounters', dest='zerocounters', action="store_true", help='Recursively delete all gcda files')
    parser.add_argument('-f', '--gcda-files', dest='gcda_files', nargs="+", default=[], help='Specify exactly which gcda files should be processed instead of recursively searching the search directory.')
    parser.add_argument('-E', '--exclude-gcda', dest='excludepre', nargs="+", default=[], help='.gcda filter - Exclude gcda files from being processed via simple find matching (not regex)')
    parser.add_argument('-e', '--exclude-gcov', dest='excludepost', nargs="+", default=[], help='.gcov filter - Exclude gcov files from being processed via simple find matching (not regex)')
    parser.add_argument('-g', '--gcov', dest='gcov', default='gcov', help='which gcov binary to use')
    parser.add_argument('-d', '--search-directory', dest='directory', default=".", help='Base directory to recursively search for gcda files (default: .)')
    parser.add_argument('-c', '--compiler-directory', dest='cdirectory', default=".", help='Base directory compiler was invoked from (default: .)')
    parser.add_argument('-j', '--jobs', dest='jobs', type=int, default=multiprocessing.cpu_count(), help='Number of parallel gcov to spawn (default: %d).' % multiprocessing.cpu_count())
    parser.add_argument('-o', '--output', dest='output', default="coverage.json", help='Name of output file (default: coverage.json)')
    parser.add_argument('-i', '--lcov', dest='lcov', action="store_true", help='Output in lcov info format instead of gcov json')
    parser.add_argument('-q', '--quiet', dest='quiet', action="store_true", help='Suppress output to stdout')
    args = parser.parse_args()
    main(args)
--- a/test/thirdparty/fastcov/fastcov_legacy.py
+++ b/test/thirdparty/fastcov/fastcov_legacy.py
@@ -0,0 +1,218 @@
 #!/usr/bin/env python3
 """
    Author: Bryan Gillespie
    Legacy version... supports versions 7.1.0 <= GCC < 9.0.0
    A massively parallel gcov wrapper for generating intermediate coverage formats fast
    The goal of fastcov is to generate code coverage intermediate formats as fast as possible
    (ideally < 1 second), even for large projects with hundreds of gcda objects. The intermediate
    formats may then be consumed by a report generator such as lcov's genhtml, or a dedicated front
    end such as coveralls.
    Sample Usage:
        $ cd build_dir
        $ ./fastcov_legacy.py --exclude-gcov /usr/include --lcov -o report.info
        $ genhtml -o code_coverage report.info
 """
 import re
 import os
 import glob
 import json
 import argparse
 import subprocess
 import multiprocessing
 from random import shuffle
 MINIMUM_GCOV = (7,1,0)
 MINIMUM_CHUNK_SIZE = 10
 def chunks(l, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(l), n):
        yield l[i:i + n]
 def getGcovVersion(gcov):
    p = subprocess.Popen([gcov, "-v"], stdout=subprocess.PIPE)
    output = p.communicate()[0].decode('UTF-8')
    p.wait()
    # The version may be the last token on the line, so accept end-of-line after it as well
    version_str = re.search(r'\s([\d.]+)(?:\s|$)', output.split("\n")[0]).group(1)
    version = tuple(map(int, version_str.split(".")))
    return version
 def removeFiles(files):
    for file in files:
        os.remove(file)
 def getFilteredGcdaFiles(gcda_files, exclude):
    def excludeGcda(gcda):
        for ex in exclude:
            if ex in gcda:
                return False
        return True
    return list(filter(excludeGcda, gcda_files))
 def getGcdaFiles(cwd, gcda_files, exclude):
    if not gcda_files:
        gcda_files = glob.glob(os.path.join(cwd, "**/*.gcda"), recursive=True)
    if exclude:
        return getFilteredGcdaFiles(gcda_files, exclude)
    return gcda_files
 def getGcovFiles(cwd):
    return glob.glob(os.path.join(cwd, "*.gcov"))
 def filterGcovFiles(gcov):
    with open(gcov) as f:
        path = f.readline()[5:]
        for ex in args.excludepost:
            if ex in path:
                return False
        return True
 def processGcdasPre9(cwd, gcov, jobs, gcda_files):
    chunk_size = max(MINIMUM_CHUNK_SIZE, int(len(gcda_files) / jobs) + 1)
    processes = []
    # shuffle(gcda_files) # improves performance by preventing any one gcov from bottlenecking on a list of sequential, expensive gcdas (?)
    for chunk in chunks(gcda_files, chunk_size):
        processes.append(subprocess.Popen([gcov, "-i"] + chunk, cwd=cwd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL))
    for p in processes:
        p.wait()
 def processGcdasPre9Accurate(cwd, gcov, gcda_files, exclude):
    intermediate_json_files = []
    for gcda in gcda_files:
        subprocess.Popen([gcov, "-i", gcda], cwd=cwd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL).wait()
        gcov_files = getGcovFiles(cwd)
        intermediate_json_files += processGcovs(gcov_files, exclude)
        removeFiles(gcov_files)
    return intermediate_json_files
 def processGcovLine(file, line):
    line_type, data = line.split(":", 1)
    if line_type == "lcount":
        num, count = data.split(",")
        hit = (int(count) != 0)  # count is parsed as a string, so convert before comparing
        file["lines_hit"] += int(hit)
        file["lines"].append({
            "branches": [],
            "line_number": num,
            "count": count,
            "unexecuted_block": not hit
        })
    elif line_type == "function":
        num, count, name = data.split(",")
        hit = (int(count) != 0)  # count is parsed as a string, so convert before comparing
        file["functions_hit"] += int(hit)
        file["functions"].append({
            "name": name,
            "execution_count": count,
            "start_line": num,
            "end_line": None,
            "blocks": None,
            "blocks_executed": None,
            "demangled_name": None
        })
 def processGcov(files, gcov, exclude):
    with open(gcov) as f:
        path = f.readline()[5:].rstrip()
        for ex in exclude:
            if ex in path:
                return False
        file = {
            "file": path,
            "functions": [],
            "functions_hit": 0,
            "lines": [],
            "lines_hit": 0
        }
        for line in f:
            processGcovLine(file, line.rstrip())
    files.append(file)
    return True
 def processGcovs(gcov_files, exclude):
    files = []
    filtered = 0
    for gcov in gcov_files:
        filtered += int(not processGcov(files, gcov, exclude))
    print("Skipped %d .gcov files" % filtered)
    return files
 def dumpToLcovInfo(intermediate, output):
    with open(output, "w") as f:
        for file in intermediate:
            f.write("SF:%s\n" % file["file"])
            for function in file["functions"]:
                f.write("FN:%s,%s\n" % (function["start_line"], function["name"]))
                f.write("FNDA:%s,%s\n" % (function["execution_count"], function["name"]))
            f.write("FNF:%s\n" % len(file["functions"]))
            f.write("FNH:%s\n" % file["functions_hit"])
            for line in file["lines"]:
                f.write("DA:%s,%s\n" % (line["line_number"], line["count"]))
            f.write("LF:%s\n" % len(file["lines"]))
            f.write("LH:%s\n" % file["lines_hit"])
            f.write("end_of_record\n")
 def dumpToGcovJson(intermediate, output):
    with open(output, "w") as f:
        json.dump(intermediate, f)
 def main(args):
    # Need at least gcov 7.1.0 because of bug not allowing -i in conjunction with multiple files
    # See: https://github.com/gcc-mirror/gcc/commit/41da7513d5aaaff3a5651b40edeccc1e32ea785a
    current_gcov_version = getGcovVersion(args.gcov)
    if current_gcov_version < MINIMUM_GCOV:
        print("Minimum gcov version {} required, found {}".format(".".join(map(str, MINIMUM_GCOV)), ".".join(map(str, current_gcov_version))))
        exit(1)
    gcda_files = getGcdaFiles(args.directory, args.gcda_files, args.excludepre)
    print("Found %d .gcda files" % len(gcda_files))
    # We "zero" the "counters" by simply deleting all gcda files
    if args.zerocounters:
        removeFiles(gcda_files)
        print("Removed %d .gcda files" % len(gcda_files))
        return
    # If we are less than gcov 9.0.0, convert .gcov files to GCOV 9 JSON format
    processGcdasPre9(args.cdirectory, args.gcov, args.jobs, gcda_files)
    gcov_files = getGcovFiles(args.cdirectory)
    print("Found %d .gcov files" % len(gcov_files))
    intermediate_json_files = processGcovs(gcov_files, args.excludepost)
    removeFiles(gcov_files)
    intermediate_json_files += processGcdasPre9Accurate(args.cdirectory, args.gcov, args.gcda_files_accurate, args.excludepost)
    if args.lcov:
        dumpToLcovInfo(intermediate_json_files, args.output)
    else:
        dumpToGcovJson(intermediate_json_files, args.output)
 if __name__ == '__main__':
    parser = argparse.ArgumentParser(description='A parallel gcov wrapper for fast coverage report generation')
    parser.add_argument('-z', '--zerocounters', dest='zerocounters', action="store_true", help='Recursively delete all gcda files')
    parser.add_argument('-f', '--gcda-files', dest='gcda_files', nargs="+", default=[], help='Specify exactly which gcda files should be processed instead of recursively searching the search directory.')
    parser.add_argument('-F', '--gcda-files-accurate', dest='gcda_files_accurate', nargs="+", default=[], help='(< gcov 9.0.0) Get accurate header coverage information for just these. These files cannot be processed in parallel')
    parser.add_argument('-E', '--exclude-gcda', dest='excludepre', nargs="+", default=[], help='.gcda filter - Exclude gcda files from being processed via simple find matching (not regex)')
    parser.add_argument('-e', '--exclude-gcov', dest='excludepost', nargs="+", default=[], help='.gcov filter - Exclude gcov files from being processed via simple find matching (not regex)')
    parser.add_argument('-g', '--gcov', dest='gcov', default='gcov', help='which gcov binary to use')
    parser.add_argument('-d', '--search-directory', dest='directory', default=".", help='Base directory to recursively search for gcda files (default: .)')
    parser.add_argument('-c', '--compiler-directory', dest='cdirectory', default=".", help='Base directory compiler was invoked from (default: .)')
    parser.add_argument('-j', '--jobs', dest='jobs', type=int, default=multiprocessing.cpu_count(), help='Number of parallel gcov to spawn (default: %d).' % multiprocessing.cpu_count())
    parser.add_argument('-o', '--output', dest='output', default="coverage.json", help='Name of output file (default: coverage.json)')
    parser.add_argument('-i', '--lcov', dest='lcov', action="store_true", help='Output in lcov info format instead of gcov json')
    args = parser.parse_args()
    main(args)