Designing Patterns

View within the Urban Jungle

Discussing nitty gritty software development details that Tony finds interesting




Non-trivial Non-Recursive Make

An earlier post discusses why non-recursive build systems are superior to recursive build systems and shows how to build a simple hello_world program non-recursively with make. The non-recursive hello_world build has several shortcomings, however. The worst is that adding a new directory to the build requires adding a line like this to the top-level makefile:


include new_directory/non_recursive.mk

This might not seem like a significant problem in such a small example, but it becomes one in larger systems. Hard coding directories in makefiles duplicates knowledge that the build already has: the mere presence of a makefile in a directory says that the directory needs to be built. Because of this duplication, adding a directory to the build takes two changes: a directory-specific makefile must be created, and a line must be added to some parent makefile. Renaming a directory similarly requires two changes. A former employer’s recursive build system required each directory’s makefile to list the subdirectories that needed to be built, and developers commonly forgot to update the parent directory’s makefile when adding or renaming directories. The build within the new or renamed directory worked perfectly well, of course, so the error went unnoticed until it broke the top-level production build.

Another problem with hard coding all build directories into the top-level makefile is that many developers will need to edit the file often to modify the build’s directory list. Depending on the sophistication of the version control software, work will be slowed either by contention over the file or by programmers constantly having to merge others’ changes into their own.

A better solution is for make itself to detect automatically the directories that need to be built, removing any need to duplicate and maintain this information in the makefiles. This is what pattmake does, and it is surprisingly simple to accomplish. Assuming that the directory-specific makefiles in the build system always have the same name (they are named contents.mk in the pattmake system, for instance), the find utility can be invoked from within make to discover the directories that need to be built (I originally got this technique from this gmake book). I’ve created a new hello_world build that demonstrates this (the source code for which can be downloaded here). This example requires at least gmake 3.81 and can be run with make -f build.mk all.

Here is build.mk:


include rules.mk

DIR_SPECIFIC_MAKEFILE_NAME := contents.mk

buildable_dirs = \
	$(shell find . -name $(DIR_SPECIFIC_MAKEFILE_NAME) | \
	sed 's/$(DIR_SPECIFIC_MAKEFILE_NAME)//')

define process_dir
  PROGRAMS :=

  include $1/contents.mk

  ALL_PROGRAMS += $$(PROGRAMS)

  clean::
	$(RM) $1/*.o
	$(RM) $1/*.a

  #
  # The last line of the function must be left blank:
  # foreach joins its expansions with a space, not a
  # newline, so without it the next directory's code
  # would run into this one's last line.
  #

endef

define process_dirs
  $(foreach DIR, $(buildable_dirs),\
	$(call process_dir,$(DIR)))
endef

ALL_PROGRAMS :=

$(eval $(process_dirs))

.PHONY: \
	all \
	clean

all: $(ALL_PROGRAMS)

clean::
	$(RM) $(ALL_PROGRAMS)

Here is bin/contents.mk:


bin/hello_world: bin/hello_world.o lib/hello.a
	$(CXX) $(LDFLAGS) -o $@ $^

PROGRAMS += bin/hello_world

Here is lib/contents.mk:


lib/hello.a: lib/hello.o

The build.mk code may seem confusing to those who have not done significant make programming, but it is doing something quite simple. The apparent complication stems from make’s unwieldy language. Before explaining the details, here is a broad overview of build.mk's flow. buildable_dirs (line #5) is a recursively expanded make variable, in effect a function with no arguments, that invokes the UNIX find utility (via the built-in shell function) to get the paths of all contents.mk files in or under make's current working directory. The results are then piped to sed to remove contents.mk from the paths, yielding a list of the directories that contain a contents.mk file.
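The discovery step can be tried on its own. The following is a minimal sketch, not part of the downloadable example: the file name discover.mk and the variable dirs_via_dir are hypothetical, and it assumes find and sed are on the PATH, as build.mk does. It prints the discovered list twice, once using the find and sed pipeline from build.mk and once using make's built-in dir function in place of sed.

```make
# discover.mk: run with  make -f discover.mk  from the top of the build tree.
# The file name and dirs_via_dir are hypothetical; the rest mirrors build.mk.
DIR_SPECIFIC_MAKEFILE_NAME := contents.mk

# Recursively expanded (=): the shell pipeline reruns on every reference.
buildable_dirs = \
	$(shell find . -name $(DIR_SPECIFIC_MAKEFILE_NAME) | \
	sed 's/$(DIR_SPECIFIC_MAKEFILE_NAME)//')

# Variation: the built-in dir function keeps everything up to and
# including the last slash of each path, so sed is not needed.
dirs_via_dir := $(dir $(shell find . -name $(DIR_SPECIFIC_MAKEFILE_NAME)))

$(info find+sed gives [$(buildable_dirs)])
$(info find+dir gives [$(dirs_via_dir)])

.PHONY: all
all: ;
```

For the hello_world tree, both lines should list ./bin/ and ./lib/ (the order depends on find). Note the trailing slash on each entry, which is why $1/contents.mk in process_dir ends up with a harmless double slash.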

The process_dirs function (line #29) loops over the directories returned by buildable_dirs with the built-in foreach function and calls the process_dir function for each directory. The process_dir function (line #9) includes the specified directory’s contents.mk file, aggregates any programs defined by contents.mk (the PROGRAMS variable) into the ALL_PROGRAMS variable, and adds an action to the clean target that will remove the files created by the build (object files and libraries) in the directory. At the end of build.mk, a .PHONY all target (line #42) is defined to build ALL_PROGRAMS, and a command is added to the clean target to remove ALL_PROGRAMS.
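For readers new to user-defined make functions, here is a small sketch of foreach and call in isolation. The names swap and add_suffix are hypothetical and do not appear in build.mk.

```make
# call_demo.mk: run with  make -f call_demo.mk
# swap and add_suffix are hypothetical functions for illustration only.

# call binds its first argument to $1 and its second to $2.
swap = $2 $1

# foreach binds F to each word of $1 in turn and joins the
# expansions with single spaces.
add_suffix = $(foreach F,$1,$(F)$2)

$(info $(call swap,first,second))
$(info $(call add_suffix,a b c,.o))

.PHONY: all
all: ;
```

This prints "second first" followed by "a.o b.o c.o". process_dirs has the same shape: a foreach over a list whose body is a call to another function.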

build.mk's main complication is the use of gmake's built-in eval function to call process_dirs (line #36). eval allows makefile code to be generated and interpreted dynamically. The function call passed to it is processed in two passes:

  1. In the first pass, gmake evaluates all unescaped variables in the function (a variable evaluation can be escaped by adding an extra $; thus $$(PROGRAMS) is an escaped evaluation and won’t be evaluated until the second pass). This pass respects neither comments nor conditionals, since gmake is not interpreting the code but instead just evaluating all unescaped variables in order to generate code.
  2. In the second pass, gmake interprets the code generated by the first pass in the normal fashion.

The foreach function normally cannot be used to generate code (trying to do so will cause make to die with an error), but it can if it is evaluated during eval's first pass, and so process_dirs must be called with eval.
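The two passes can be observed directly with the built-in info function. This is a sketch under assumed names (show_passes, LOCAL, and SEEN are hypothetical); it follows the same pattern as process_dir, including the blank last line of the template.

```make
# eval_passes.mk: run with  make -f eval_passes.mk
# show_passes, LOCAL, and SEEN are hypothetical names used only here.

define show_passes
  $(info first pass for $1 sees LOCAL as [$(LOCAL)])
  LOCAL := set by $1
  $$(info second pass for $1 sees LOCAL as [$$(LOCAL)])
  SEEN += $1

endef

SEEN :=

# foreach and call are expanded first (pass one); eval then interprets
# the generated text as makefile code (pass two).
$(eval $(foreach N,one two,$(call show_passes,$(N))))

$(info SEEN is [$(strip $(SEEN))])

.PHONY: all
all: ;
```

Both "first pass" lines should appear before either "second pass" line, and both should show LOCAL as empty, because the assignments have only been generated at that point, not interpreted. The escaped info calls run during the second pass and see "set by one" and "set by two" respectively, which is the same reason $$(PROGRAMS) is escaped in process_dir.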

process_dirs uses the built-in call function to invoke process_dir (line #31). The call function allows arguments to be passed to a user-defined make function; the first argument is bound to $1 within the function, the second to $2, and so on. So when process_dir is called with the $(DIR) argument, the value of $(DIR) is bound to $1 inside the function. Note, however, that since process_dirs is invoked through eval and the call to process_dir within process_dirs is unescaped, process_dir is effectively also invoked through eval and so is evaluated in two passes. Thus, $1 is evaluated in eval's first pass, and so it can be used safely in commands (such as $(RM) $1/*.o on line #17). Normally, variables in commands are not evaluated when make interprets the command; the evaluation is deferred until the command is actually executed, after all makefile code has been interpreted. A variable in a command therefore evaluates to the last value it had during the interpretation of the makefile, not the value it had when the command was interpreted (assuming otherwise is a very common mistake when writing makefiles).
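The deferred evaluation of variables in commands is easy to reproduce. In this sketch (DIR, show_template, and the show_* targets are hypothetical names), the first two rules fall into the trap, and the generated rules avoid it by having $1 substituted while the rule text is produced.

```make
# deferred.mk: run with  make -f deferred.mk
# DIR, show_template, and the show_* targets are hypothetical names.

.PHONY: all show_a show_b show_fixed_a show_fixed_b
all: show_a show_b show_fixed_a show_fixed_b

DIR := a
show_a:
	@echo "show_a sees DIR=$(DIR)"

DIR := b
show_b:
	@echo "show_b sees DIR=$(DIR)"

# Going through call and eval substitutes the argument while the rule
# text is being generated, so each recipe carries its own literal value.
define show_template
show_fixed_$1:
	@echo "show_fixed_$1 was generated for $1"
endef

$(foreach D,a b,$(eval $(call show_template,$(D))))
```

Both show_a and show_b report DIR=b, the last value assigned, while show_fixed_a and show_fixed_b report a and b respectively.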

The evaluation of the PROGRAMS variable within process_dir (line #14) must be escaped until eval's second pass, because PROGRAMS only will be defined when contents.mk is included, which gmake does during the second pass.

Another small complication is that the clean target has double-colon rules, which essentially allow multiple actions to be chained to a target. clean has one action for each directory built (the removal of the directory’s object and library files) and one action at the end of build.mk (the removal of the programs listed in ALL_PROGRAMS).
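A stripped-down sketch of the double-colon behavior, with echo standing in for the real removal commands (the messages are illustrative only):

```make
# double_colon.mk: run with  make -f double_colon.mk clean
# Each clean:: rule is independent and keeps its own commands.
.PHONY: clean

clean::
	@echo "removing objects in lib"

clean::
	@echo "removing objects in bin"

clean::
	@echo "removing programs"
```

All three commands run, in the order the rules appear in the makefile. With single-colon rules, gmake would instead warn that the commands for clean are being overridden and keep only the last set.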

This hello_world build has several advantages over the previous version. Firstly, as promised, it automatically discovers the directories that need to be built, eliminating any need on the part of the developer to duplicate this information inside of makefiles. Secondly, it allows arbitrary makefile code to be interpreted before and after each directory-specific makefile is included. The build leverages this to add clean actions for each directory. It also aggregates the programs defined by each contents.mk (the PROGRAMS variable) into ALL_PROGRAMS. This sandboxes the directory-specific makefiles in that they do not need to reference (and thus potentially corrupt) variables like ALL_PROGRAMS that contain information for the entire build. In the original hello_world example, a developer accidentally could have assigned to PROGRAMS rather than appending new programs to it, which would have wiped out all programs appended in prior directory-specific makefiles. This kind of error might not be noticed until production because the build for the developer’s directory and all directories built afterward would work perfectly.

This hello_world build has a robust, non-recursive implementation that eliminates the duplication of directory information in the build, makes adding directories to the build trivially easy, allows make code to be run for each of the build’s directories, and sandboxes directory-specific makefiles. It could form the basis of a production quality build system, and the pattmake build system’s core is just an elaboration of this example.

