In-Source Code Generation

Code generation is a common task supported by build systems. Generating files in the source directory and committing them is generally discouraged. However, there are times when one may need to go against this guidance:

(If you know of other reasons please post them.)

Most build systems will connect any generated code files to the clean target. This means that any time someone runs my_build_system clean, all generated files will be removed. For generated code that is committed, this places an extra burden on developers, as they'll have to rebuild before committing or check the now-deleted source files back out.

The trick to work around the build system is to not tell it about the generated code. We’ll go over an example using CMake.

Using CMake to Generate In-Source Code

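First we'll create a custom command that does two things: it generates the source file (here, simply by copying input.in to generated.c in the source directory), and it touches a stamp file in the build directory.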

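add_custom_command(
    # Only the stamp file is declared as an output; generated.c is deliberately not listed.
    OUTPUT  "${CMAKE_BINARY_DIR}/my_generator.stamp"
    # Example generator step: copy the input file into the source directory.
    COMMAND ${CMAKE_COMMAND} -E copy "${CMAKE_CURRENT_SOURCE_DIR}/input.in" "${CMAKE_CURRENT_SOURCE_DIR}/generated.c"
    # Record that generation ran by updating the stamp file.
    COMMAND ${CMAKE_COMMAND} -E touch "${CMAKE_BINARY_DIR}/my_generator.stamp"
    DEPENDS "${CMAKE_CURRENT_SOURCE_DIR}/input.in"
    COMMENT "Generating code ..."
    VERBATIM
)
# Named target that other targets can depend on to trigger the command above.
add_custom_target(generator_target DEPENDS "${CMAKE_BINARY_DIR}/my_generator.stamp")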

One will notice that the OUTPUT only lists the stamp file. The generated file is not marked as an output, so CMake doesn't know that this custom command creates it. The input file is listed under DEPENDS so that, whenever a downstream target is built, the command runs if the stamp file either doesn't exist or is older than the input file. It is important to note the flip side: if the generated source file is deleted while the stamp file still exists and is newer than the input file, the custom command will not run.

We've also created a custom target. For those unfamiliar with CMake, the two commands often go together for code generation. The custom target makes the generation step easier to reference from other targets, and provides some other benefits.
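Because the stamp file alone decides whether the command reruns, one way to recover from a deleted generated.c is to remove the stamp and build again. The following is only a sketch of that idea, not part of the original setup: the target name force_regenerate is hypothetical, and `cmake -E rm` assumes CMake 3.17 or newer.

```cmake
# Hypothetical helper target (an assumption, not from the original example).
# Deleting the stamp makes the custom command out of date, so the next
# build of generator_target (or anything depending on it) regenerates generated.c.
add_custom_target(force_regenerate
    COMMAND ${CMAKE_COMMAND} -E rm -f "${CMAKE_BINARY_DIR}/my_generator.stamp"
    COMMENT "Removing stamp so the generator runs on the next build"
    VERBATIM
)
```

With this in place, building the force_regenerate target and then running a normal build should rerun the generator. Deleting the stamp file by hand from the build directory has the same effect.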

CMake will complain if a source file is missing, unless it has the GENERATED property set. However, setting the GENERATED property causes CMake to clean the file. This means we either need to manually create an empty version of the source file to get things started, or we can use the [FILE(TOUCH)][file_touch] command at configure time. We'll show the [FILE(TOUCH)][file_touch] option, as it is a little more automated.

One issue with generating a compiled source file is that the backend build system, such as Ninja, does an initial pass to see what is out of date. Since the generated file is not declared as an output of any command, the build system will consider it up to date and will not realize that invoking the custom command modifies it. We use the OBJECT_DEPENDS property to tell CMake that the object file compiled from the generated source also depends on the stamp file, so whenever the stamp file changes, the source file is recompiled.

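# Create an empty placeholder at configure time so CMake finds the source file.
FILE(TOUCH ${CMAKE_CURRENT_SOURCE_DIR}/generated.c)
add_library(some_lib ${CMAKE_CURRENT_SOURCE_DIR}/generated.c)
# Make sure the generator runs before some_lib is built.
add_dependencies(some_lib generator_target)
# Recompile generated.c whenever the stamp file changes.
set_source_files_properties(${CMAKE_CURRENT_SOURCE_DIR}/generated.c
    PROPERTIES OBJECT_DEPENDS ${CMAKE_BINARY_DIR}/my_generator.stamp
)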
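The FILE(TOUCH) call above does more than create a missing file: it also updates the timestamp of an existing file every time CMake configures. If that is undesirable, a possible variant is to touch the file only when it is absent. This is a sketch under that assumption rather than a required change, and the variable name generated_src is introduced here purely for illustration.

```cmake
# Hypothetical variable name, used only to avoid repeating the path.
set(generated_src "${CMAKE_CURRENT_SOURCE_DIR}/generated.c")

# Create the placeholder only if the file is missing, so a committed
# generated.c keeps its existing modification time across reconfigures.
if(NOT EXISTS "${generated_src}")
    file(TOUCH "${generated_src}")
endif()
```

The rest of the setup (add_library, add_dependencies, and the OBJECT_DEPENDS property) would stay as shown above.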

Caveats