Optional Output from Bazel Action? (SWIG rule for Bazel)

Question:

I’m working on a Bazel rule (using version 5.2.0) that uses SWIG (version 4.0.1) to build a Python library from C++ code, adapted from a rule in the TensorFlow repository. The problem I’ve run into is that, depending on the contents of the file at ctx.file.source.path, the swig invocation may or may not produce a necessary .h file. If it does, the rule below works great. If it doesn’t, I get:

ERROR: BUILD:31:11: output 'foo_swig_h.h' was not created
ERROR: BUILD:31:11: SWIGing foo.i. failed: not all outputs were created or valid

If the h_out stuff is removed from _py_swig_gen_impl, the rule below works great when swig doesn’t produce the .h file. But if swig does produce one, Bazel seems to ignore it and it isn’t available for native.cc_binary to compile, so gcc fails with a ‘no such file or directory’ error on an #include <foo_swig_cc.h> line in foo_swig_cc.cc.

(The presence or absence of the .h file in the output is determined by whether the .i file at ctx.file.source.path uses SWIG’s "directors" feature.)

def _include_dirs(deps):
    return depset(transitive = [dep[CcInfo].compilation_context.includes for dep in deps]).to_list()

def _headers(deps):
    return depset(transitive = [dep[CcInfo].compilation_context.headers for dep in deps]).to_list()

# Bazel rules for building swig files.
def _py_swig_gen_impl(ctx):
    module_name = ctx.attr.module_name
    cc_out = ctx.actions.declare_file(module_name + "_swig_cc.cc")
    h_out = ctx.actions.declare_file(module_name + "_swig_h.h")
    py_out = ctx.actions.declare_file(module_name + ".py")
    args = ["-c++", "-python", "-py3"]
    args += ["-module", module_name]
    args += ["-I" + x for x in _include_dirs(ctx.attr.deps)]
    args += ["-I" + x.dirname for x in ctx.files.swig_includes]
    args += ["-o", cc_out.path]
    args += ["-outdir", py_out.dirname]
    args += ["-oh", h_out.path]
    args.append(ctx.file.source.path)
    outputs = [cc_out, h_out, py_out]
    ctx.actions.run(
        executable = "swig",
        arguments = args,
        mnemonic = "Swig",
        inputs = [ctx.file.source] + _headers(ctx.attr.deps) + ctx.files.swig_includes,
        outputs = outputs,
        progress_message = "SWIGing %{input}.",
    )
    return [DefaultInfo(files = depset(direct = [cc_out, py_out]))]

_py_swig_gen = rule(
    attrs = {
        "source": attr.label(
            mandatory = True,
            allow_single_file = True,
        ),
        "swig_includes": attr.label_list(
            allow_files = [".i"],
        ),
        "deps": attr.label_list(
            allow_files = True,
            providers = [CcInfo],
        ),
        "module_name": attr.string(mandatory = True),
    },
    implementation = _py_swig_gen_impl,
)

def py_wrap_cc(name, source, module_name = None, deps = [], copts = [], **kwargs):
    if module_name == None:
        module_name = name

    python_deps = [
        "@local_config_python//:python_headers",
        "@local_config_python//:python_lib",
    ]

    # First, invoke the _py_swig_gen rule, which runs swig. This outputs:
    # `module_name.cc`, `module_name.py`, and, sometimes, `module_name.h` files.
    swig_rule_name = "swig_gen_" + name
    _py_swig_gen(
        name = swig_rule_name,
        source = source,
        swig_includes = ["//third_party/swig_rules:swig_includes"],
        deps = deps + python_deps,
        module_name = module_name,
    )

    # Next, we need to compile the `module_name.cc` and `module_name.h` files
    # from the previous rule. The `module_name.py` file already generated
    # expects there to be a `_module_name.so` file, so we name the cc_binary
    # rule this way to make sure that's the resulting file name.
    cc_lib_name = "_" + module_name + ".so"
    native.cc_binary(
        name = cc_lib_name,
        srcs = [":" + swig_rule_name],
        linkopts = ["-dynamic", "-L/usr/local/lib/"],
        linkshared = True,
        deps = deps + python_deps,
    )

    # Finally, package everything up as a python library that can be depended
    # on. Note that this rule uses the user-given `name`.
    native.py_library(
        name = name,
        srcs = [":" + swig_rule_name],
        srcs_version = "PY3",
        data = [":" + cc_lib_name],
        imports = ["./"],
    )

My question, broadly, is how I might best handle this with a single rule. I’ve tried adding a ctx.actions.write before the ctx.actions.run, thinking that I could generate a dummy ‘.h’ file that would be overwritten if needed. That gives me:

ERROR: BUILD:41:11: for foo_swig_h.h, previous action: action 'Writing file foo_swig_h.h', attempted action: action 'SWIGing foo.i.'

My next idea is to remove the h_out stuff and then try to capture the h file for the cc_binary rule with some kind of glob invocation.

Asked By: John


Answers:

I’ve seen two approaches: add an attribute to indicate whether the header is generated, or write a wrapper script that creates it unconditionally.

Adding an attribute means something like "has_h": attr.bool(), which _py_swig_gen_impl then uses to make the ctx.actions.declare_file(module_name + "_swig_h.h") call conditional, along with the -oh argument and the outputs entry that depend on it.
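
As a rough sketch of the attribute approach, the conditional part can be factored into a small helper that works only on strings. The helper name swig_outputs_and_args and its out_dir parameter are hypothetical, not part of the original rule. Inside _py_swig_gen_impl, has_h would come from ctx.attr.has_h and each returned name would be passed to ctx.actions.declare_file. The code sticks to the subset of Starlark that is also valid Python.

```python
def swig_outputs_and_args(module_name, out_dir, has_h):
    """Returns (output basenames, SWIG output flags) for one module.

    has_h mirrors the hypothetical "has_h" attribute: True when the .i file
    uses directors and SWIG is therefore expected to emit a header.
    """
    cc_name = module_name + "_swig_cc.cc"
    py_name = module_name + ".py"
    names = [cc_name, py_name]
    args = ["-o", out_dir + "/" + cc_name, "-outdir", out_dir]
    if has_h:
        # Only declare the header, and only pass -oh, when it will exist.
        h_name = module_name + "_swig_h.h"
        names.append(h_name)
        args += ["-oh", out_dir + "/" + h_name]
    return names, args
```

With has_h = False the header is never declared, so Bazel has no missing output to complain about. The cost is that every caller has to know whether its .i file uses directors.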

The wrapper script option means using something like this for the executable:

#!/bin/bash

set -e
touch the_path_of_the_header
exec swig "$@"

That will unconditionally create the output, and then swig will overwrite it if applicable. If it’s not applicable, then passing around an empty header file in the Bazel rules should be harmless.
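
If the script is generated per target rather than checked in, its text could be built by a helper along these lines. This is only a sketch: swig_wrapper_script is a hypothetical name, and the single quotes around the path assume the path itself contains no single quote. The result would be passed as the content of a ctx.actions.write call with is_executable = True. The code is valid both as Starlark and as Python.

```python
def swig_wrapper_script(header_path):
    """Returns bash script text that pre-creates header_path, then runs swig."""
    lines = [
        "#!/bin/bash",
        "",
        "set -e",
        # Create an empty header so the declared output always exists.
        # Quoted so spaces in the path survive (assumes no ' in the path).
        "touch '" + header_path + "'",
        # Replace the shell with swig, forwarding all arguments unchanged.
        'exec swig "$@"',
        "",
    ]
    return "\n".join(lines)
```

Building the text from a list of lines keeps the newlines explicit and avoids hand-escaping them inside one long string literal.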

Answered By: Brian Silverman

For posterity, this is what my _py_swig_gen_impl looks like after implementing @Brian’s suggestion above:

def _py_swig_gen_impl(ctx):
    module_name = ctx.attr.module_name
    cc_out = ctx.actions.declare_file(module_name + "_swig_cc.cc")
    h_out = ctx.actions.declare_file(module_name + "_swig_h.h")
    py_out = ctx.actions.declare_file(module_name + ".py")
    include_dirs = _include_dirs(ctx.attr.deps)
    headers = _headers(ctx.attr.deps)
    args = ["-c++", "-python", "-py3"]
    args += ["-module", module_name]
    args += ["-I" + x for x in include_dirs]
    args += ["-I" + x.dirname for x in ctx.files.swig_includes]
    args += ["-o", cc_out.path]
    args += ["-outdir", py_out.dirname]
    args += ["-oh", h_out.path]
    args.append(ctx.file.source.path)
    outputs = [cc_out, h_out, py_out]

    # Depending on the contents of `ctx.file.source`, swig may or may not
    # output a .h file needed by subsequent rules. Bazel doesn't like optional
    # outputs, so instead of invoking swig directly we're going to make a
    # lightweight executable script that first `touch`es the .h file that may
    # get generated, and then execute that. This means we may be propagating
    # an empty .h file around as a "dependency" sometimes, but that's okay.
    swig_script_file = ctx.actions.declare_file("swig_exec.sh")
    ctx.actions.write(
        output = swig_script_file,
        is_executable = True,
        content = "#!/bin/bash\n\nset -e\ntouch " + h_out.path + "\nexec swig \"$@\"",
    )
    ctx.actions.run(
        executable = swig_script_file,
        arguments = args,
        mnemonic = "Swig",
        inputs = [ctx.file.source] + headers + ctx.files.swig_includes,
        outputs = outputs,
        progress_message = "SWIGing %{input}.",
    )
    return [
        DefaultInfo(files = depset(direct = outputs)),
    ]

The ctx.actions.write generates the suggested bash script:

#!/bin/bash

set -e
touch %{h_out.path}
exec swig "$@"

This guarantees that the expected h_out is always produced by the ctx.actions.run action, whether or not swig generates it.

Answered By: John