Designing Patterns

View within the Urban Jungle

Discussing nitty gritty software development details that Tony finds interesting



View within the Urban Jungle RSS FeedSite Feed

Non-trivial Non-Recursive Make

An earlier post discusses why non-recursive build systems are superior to recursive build systems and shows how to build a simple hello_world program non-recursively with make. The non-recursive hello_world build has several shortcomings, however. The worst is that adding a new directory to the build requires adding a line like this to the top-level makefile:


include new_directory/non_recursive.mk

It might not seem like a significant problem in this small example, but it becomes so in larger systems. Hard coding directories in makefiles is a mistake that duplicates knowledge of the directories that need to be built, already contained in the simple presence of makefiles in the directories. This duplication results in having to make two changes in order to add a new directory to the build: a directory-specific makefile must be created and a line must be added to some parent makefile. Renaming a directory similarly requires two changes. A former employer’s recursive build system required that each directory’s makefile list the subdirectories that needed to be built, and it was quite common for a developer to forget to modify the parent directory makefiles when adding or renaming directories. The build within the created or renamed directory worked perfectly well, of course, allowing the error to propagate and break the top-level production build. Another problem with hard coding all build directories into the top-level makefile is that many different developers will need to edit the file often in order to modify the build’s directory list. Depending on the sophistication of the system’s version control software, this will lead to work being slowed by contention over the file or by programmers always having to merge others’ changes into their work.

A better solution is for make itself to detect the directories that need to be built automatically, removing any need to duplicate and maintain this information in the makefiles. This is what pattmake does, and it is surprisingly simple to accomplish. Assuming that the directory-specific makefiles in the build system always have the same name (such files are named contents.mk in the pattmake system, for instance), the find utility can be invoked from within make to discover the directories that need to be built (I originally got this technique from this gmake book). I’ve created a new hello_world build that demonstrates this (the source code for which can be downloaded here). This example requires at least gmake 3.81 and can be run with make -f build.mk all.

Here is build.mk:


include rules.mk

DIR_SPECIFIC_MAKEFILE_NAME := contents.mk

buildable_dirs = \
	$(shell find . -name $(DIR_SPECIFIC_MAKEFILE_NAME) | \
	sed 's/$(DIR_SPECIFIC_MAKEFILE_NAME)//')

define process_dir
  PROGRAMS :=

  include $1/contents.mk

  ALL_PROGRAMS += $$(PROGRAMS)

  clean::
	$(RM) $1/*.o
	$(RM) $1/*.a

  #
  # The last line of the function must be left blank
  # in order to avoid some quirky, broken gmake
  # behavior when expanding macros within foreach
  # loops.
  #

endef

define process_dirs
  $(foreach DIR, $(buildable_dirs),\
	$(call process_dir,$(DIR)))
endef

ALL_PROGRAMS :=

$(eval $(process_dirs))

.PHONY:
	all \
	clean

all: $(ALL_PROGRAMS)

clean::
	$(RM) $(ALL_PROGRAMS)

Here is bin/contents.mk:


bin/hello_world: bin/hello_world.o lib/hello.a
        $(CXX) $(LDFLAGS) -o $@ $^

PROGRAMS += bin/hello_world

Here is lib/contents.mk:


lib/hello.a: lib/hello.o

The build.mk code may seem very confusing to those who have not done significant make programming, but it’s doing something quite simple. The apparent complication stems from make’s very unwieldy language. Before explaining the details, I am going to provide a broad overview of build.mk's flow. buildable_dirs (line #5) is a make function that invokes the UNIX find utility (via the built-in shell function) to get the paths of all contents.mk files in or under make's current working directory. The results then are piped to sed in order to remove contents.mk from the paths, yielding a list of directories containing a contents.mk file.

The process_dirs function (line #29) loops over the directories returned by buildable_dirs with the built-in foreach function and calls the process_dir function for each directory. The process_dir function (line #9) includes the specified directory’s contents.mk file, aggregates any programs defined by contents.mk (the PROGRAMS variable) into the ALL_PROGRAMS variable, and adds an action to the clean target that will remove the files created by the build (object files and libraries) in the directory. At the end of build.mk, a .PHONY all target (line #42) is defined to build ALL_PROGRAMS, and a command is added to the clean target to remove ALL_PROGRAMS.

build.mk's main complication is the use of gmake's built-in eval function to call process_dirs (line #36). eval allows makefile code to be generated and interpreted dynamically. It is passed a function call and makes two passes over the specified function:

  1. In the first pass, gmake evaluates all unescaped variables in the function (a variable evaluation can be escaped by adding an extra $; thus $$(PROGRAMS) is an escaped evaluation and won’t be evaluated until the second pass). This pass respects neither comments nor conditionals, since gmake is not interpreting the code but instead just evaluating all unescaped variables in order to generate code.
  2. In the second pass, gmake interprets the code generated by the first pass in the normal fashion.

The foreach function normally cannot be used to generate code (trying to do so will cause make to die with an error), but it can if evaluated during eval's first pass and so process_dirs must be called with eval.

process_dirs uses the built-in call function to invoke process_dir (line #31). The call function allows arguments to be passed to a user-defined make function; the first argument is bound to $1 within the function, the second to $2, etc. So when process_dir is called with the $(DIR) argument, $(DIR) is bound to $1 inside the function. Note, however, that since process_dirs is invoked through eval and the call to process_dir within process_dirs is unescaped, process_dir effectively also is invoked through eval and so is evaluated in two passes. Thus, $1 is evaluated in eval's first pass, and so it can be used safely in commands (such as $(RM) $1/*.o on line #17). Normally, variables in commands are not evaluated when make interprets the command, but instead the evaluation is deferred until the command actually is executed after all makefile code has been interpreted. This results in the variable evaluating to the last value the variable had during the interpretation of the makefile, not the value that the variable had when the command was interpreted (assuming otherwise is a very common mistake when writing makefiles).

The evaluation of the PROGRAMS variable within process_dir (line #14) must be escaped until eval's second pass, because PROGRAMS only will be defined when contents.mk is included, which gmake does during the second pass.

Another small complication is that the clean target has double-colon rules, which essentially allow multiple actions to be chained to a target. clean has one action for each directory built (the removal of the directory’s object and library files) and one action at the end of build.mk (the removal of the programs listed in ALL_PROGRAMS).

This hello_world build has several advantages over the previous version. Firstly, as promised, it automatically discovers the directories that need to be built, eliminating any need on the part of the developer to duplicate this information inside of makefiles. Secondly, it allows arbitrary makefile code to be interpreted before and after each directory-specific makefile is included. The build leverages this to add clean actions for each directory. It also aggregates the programs defined by each contents.mk (the PROGRAMS variable) into ALL_PROGRAMS. This sandboxes the directory-specific makefiles in that they do not need to reference (and thus potentially corrupt) variables like ALL_PROGRAMS that contain information for the entire build. In the original hello_world example, a developer accidentally could have assigned to PROGRAMS rather than appending new programs to it, which would have wiped out all programs appended in prior directory-specific makefiles. This kind of error might not be noticed until production because the build for the developer’s directory and all directories built afterward would work perfectly.

This hello_world build has a robust, non-recursive implementation that eliminates the duplication of directory information in the build, makes adding directories to the build trivially easy, allows make code to be run for each of the build’s directories, and sandboxes directory-specific makefiles. It could form the basis of a production quality build system, and the pattmake build system’s core is just an elaboration of this example.


Non-recursive Make Builds always are Better than Recursive Make Builds

Pattmake is the gmake-based build system used by Designing Patterns (I discuss why I chose to use gmake here). I had three main requirements when designing pattmake:

  1. That it be non-recursive
  2. That it easily accommodates simultaneous builds for different architectures
  3. That files generated by builds (build artifacts and targets, such as object files) be put in a different directory than that of the associated source code

I will discuss the reasons for and the implementation of each requirement in my next blog entries.

The first requirement might be a bit surprising, since recursive build systems still are very common. They are fundamentally and irretrievably flawed, however. Briefly, in a recursive build system, make spawns a new make process in order to build a subdirectory. This seems like a wonderful idea because:

  • Make works best when the source files are in make's current working directory. Developers do not have to take different source directories into account when writing rules, for instance.
  • The makefile for a given directory is completely independent of those for other directories.
  • Writing a recursive makefile requires minimal knowledge of the build infrastructure, since a recursive makefile is not very different than a standalone makefile.

These are very compelling advantages, and so recursive build systems often are used despite their fundamental flaw. A bit of background knowledge about make's implementation is required in order to understand the flaw, however. Make works in two phases

  1. Make processes the makefile, finding all possible targets, associating prerequisites with those targets, and storing commands that transform prerequisites into targets. From this information, it constructs a dependency graph.
  2. Make locates the target that it is supposed to build and works backward through the dependency graph from that target, rebuilding anything that is out of date using the commands stored in the first phase.

Recursive build systems are flawed because they restrict each make instance to creating a dependency graph with only the build information contained in the directory the instance is processing, resulting in truncated graphs. I’m going to prove this with the build for an example hello_world program (you have three chances to guess the program’s functionality, and the first two don’t count), the source for which can be downloaded here . This program’s build should work with any reasonably modern make, although I’ve only tested with gmake. The directory structure is:

  non_recursive.mk
  recursive.mk
  rules.mk
  bin:
      hello_world.cc
      non_recursive.mk
      recursive.mk
  lib:
      hello.cc
      hello.h
      non_recursive.mk
      recursive.mk

The package actually contains two build systems for the hello_world program, one recursive (consisting of rules.mk and all of the recursive.mk files; build with make -f recursive.mk all) and one non-recursive (consisting of rules.mk and all of the non_recursive.mk files; build with make -f non_recursive.mk all). Here is bin/recursive.mk:


TOP_LEVEL_DIR=..

include $(TOP_LEVEL_DIR)/rules.mk

hello_world: hello_world.o ../lib/hello.a
        $(CXX) $(LDFLAGS) -o $@ $^

When make recurses and launches a new instance for the bin directory, the new instance knows almost nothing about lib/hello.a. The only thing that it knows is the library’s time of last modification, which, if more recent than hello_world's time of last modification, will trigger a relink of hello_world. It does not know, however, how lib/hello.a is itself built and thus whether or not it needs to be rebuilt before linking bin/hello_world. The new instance has incomplete dependency information and thus can produce an incorrect bin/hello_world build. If the hello_world() function declaration and definition in lib/hello.h and lib/hello.cc are changed without lib being rebuilt, for instance, a broken bin/hello_world will be built. bin/hello_world will link against a stale lib/hello.a and get the old function definition, because the make instance for bin cannot trigger a rebuild for lib/hello.a as it is built in a different subdirectory. bin/hello_world.o will be compiled with the new function declaration in lib/hello.h, however. This will result in a likely fatal mismatch between the hello_world() function that bin/hello_world.cc tries to call and the hello_world() function that actually exists in lib/hello.a. The textbook solution for this problem is to duplicate the dependency of bin/hello_world on lib/hello.a in the top-level recursive.mk like so:


TOP_LEVEL_DIR=.

include $(TOP_LEVEL_DIR)/rules.mk

.PHONY: \
        all \
        bin \
        lib \
        clean

all: bin

bin: lib
        (cd $@; $(MAKE) -f recursive.mk)

lib:
        (cd $@; $(MAKE) -f recursive.mk)

#
# This is not a good way to implement clean,
# but for this simple example...
#
clean:
        $(RM) bin/*.o bin/hello_world lib/*.o lib/*.a

Make is told that both bin and lib are .PHONY targets. .PHONY is a directive (not a real target) that informs make that the targets that it modifies are not actual file system objects (and so are not physically built by make) and always should be rebuilt by make if they appear in the dependency chain (that is, .PHONY targets’ prerequisites should be evaluated and any associated commands should be run). Since they are not really file system objects, make never will try to check .PHONY targets’ modification times. .PHONY targets can be used to alias a group of physical targets. In this case, bin and lib are aliases for all of the targets in the given subdirectory, resulting in make always recursing into the given subdirectory. The dependency of bin/hello_world on lib/hello.a is duplicated in the dependency of bin on lib, which forces make to recurse first into lib and then into bin. This textbook solution is flawed in two ways. Firstly, duplication of dependency information is used to paper over the underlying problem, and it is very easy to modify the dependency in one place without changing it in the other place (resulting in a subtly broken build). Secondly, this band-aid only works if the build is kicked off at the top-level; a programmer who accidentally builds bin/hello_world in bin will create a broken executable if lib/hello.a is stale (the error scenario described above).

The problem described in the example may seem manageable, but the negative effects of a recursive build become corrosively bad in much larger software systems that have many more executables and libraries, a much more complicated web of dependencies, and code generation. A former employer had such a situation, and broken builds due to dependencies not being duplicated properly were quite common. The band-aid described above became very fragile. Peter Miller’s seminal paper on non-recursive builds mentions that many shops in fact decide to forgo the textbook band-aid and, instead, build the same code multiple times in order to paper over the recursive make instances’ lack of full dependency knowledge. This “solution” leads to a dramatically slower build, which can become a development bottleneck very quickly for large systems.

The alternative, a non-recursive build system, has a higher initial setup cost (the build system infrastructure is significantly more complicated than that of an equivalent recursive build system) and carries with it a steeper learning curve, but, by providing make with the full dependency graph, is the superior solution. The simplest non-recursive build system consists of a single makefile that is responsible for building multiple directories. This might be fine for small to medium projects but does not scale well. A far better alternative and the one that pattmake uses is to have a single makefile per directory that handles building the directory but to have make include these directory-specific makefiles rather than recursively processing them. This allows each directory to have a makefile that is independent of those for other directories, just as in recursive build systems. The top-level non_recursive.mk in the hello_world example demonstrates this technique:


TOP_LEVEL_DIR=.

include rules.mk

include bin/non_recursive.mk
include lib/non_recursive.mk

.PHONY:
        all \
        clean

all: $(PROGRAMS)

#
# This is not a good way to implement clean,
# but for this simple example...
#
clean:
        $(RM) bin/*.o bin/hello_world lib/*.o lib/*.a

Here is bin/non_recursive.mk:


bin/hello_world: bin/hello_world.o lib/hello.a
        $(CXX) $(LDFLAGS) -o $@ $^

PROGRAMS += bin/hello_world

Here is lib/non_recursive.mk:


lib/hello.a: lib/hello.o

While the makefiles may look a bit odd because sources and targets have to have a directory specified (make builds everything with the top-level directory as its current working directory in this system, which is quite different than in the recursive build where make changes to the directory to be processed before spawning a new make instance), developers quickly will get used to this. In fact, even recursive build systems may have such complication if the targets are being put into a different directory than the source, which is quite common for multi-architecture builds. The benefits of a non-recursive build system are:

  • Make always is able to build a full dependency graph, without any duplication of dependencies on the part of the programmer. Note that the top-level non_recursive.mk in the hello_world example does not contain any dependency relationship between bin and lib.
  • Unlike recursive build systems, in which only full, top-level system builds can be correct, non-recursive build systems can build specified targets (say a particular program) correctly, since, regardless of what it’s told to build, make always creates a full dependency graph. Note that make -f non_recursive.mk bin/hello_world works perfectly well while make -f recursive.mk bin/hello_world fails horribly. It is very easy for a non-recursive build system to provide this functionality in whichever subdirectory it is invoked, which pattmake does (although the hello_world example does not).
  • Well organized, complex build systems, either recursive or non-recursive, often have large build rule files (containing, for instance, implicit rules to transform .c files into .o files) that need to be included in each makefile. This ensures that the same build rules are used throughout the build. Even the simple hello_world example has such a file, rules.mk. Each recursive makefile has to include such system-wide files (every recursive.mk in the hello_world example does this), forcing make to process such files many times (once per each directory). Non-recursive makefiles, however, are processed by being included by some top-level makefile, and so only the top-level makefile needs to include the system-wide makefiles (rules.mk only is included by the top-level non_recursive.mk in the hello_world example). Thus, make only needs to process the system-wide makefiles once per non-recursive build.

The ability of non-recursive build systems to deliver correct builds consistently, without any duplication of dependency information, should lead to their always being chosen over recursive build systems.


Welcome to the Jungle!

Welcome to Designing Patterns and welcome to View within the Urban Jungle!  I will discuss low-level, technical subject matter on this blog (View from the Acropolis posts will be higher-level).  The posts will not be restricted to a particular language, toolkit, software, or technology but rather will be drawn from whatever I am working on at the moment.  A major focus of this blog, however, will be to document the reasons that I choose a particular solution, not just the solution itself.  In addition, I also will point out any general patterns that I see in a solution or technique.

I am responsible for selecting, deploying, and maintaining the build system at Designing Patterns.  Prior to Designing Patterns, I worked in a group (about 30 programmers, spread across three continents) with a home-brewed build system based on ClearCase’s clearmake make variant.  By the time that I left, the build system was a real asset to the group; in my opinion, it was best of breed for the entire company.  As this is the system with which I have had by far the most experience, its strengths and shortcomings have molded my vision of a build system.  My former employer’s system has the following features:

  • All builds are done within ClearCase dynamic views.
  • Builds can be started from the top-level or from any subdirectory.  The build recursively builds subdirectories, although this behavior needs to be specified for a particular subdirectory in the directory’s makefile.
  • Rules for building C, C++, Java, and FORTRAN code are present.  Perl scripts automatically are syntax checked before being staged for release.  Executables and shared libraries are linked.  Code is generated and then compiled.
  • Unit testing is integrated into the build process.  The unit tests for a library are placed in a test subdirectory and are run as part of the build; failure of any of the tests stops the build.  The tests automatically are Purified and tested for adequate code coverage.
  • Parallel builds are supported fully.
  • Multiple architectures are supported; all build artifacts are placed in architecture-specific directories (allowing builds for different architectures to be running simultaneously).
  • 32-bit and 64-bit builds are supported; libraries can be dual-built.
  • Most makefiles are quite simple, due to the system’s scaffolding having been hidden inside make include files.
  • C header files and FORTRAN include files included in a source file automatically are registered as dependencies of the object file built from the source file.  This feature relies on ClearCase’s build auditing capabilities (basically, ClearCase keeps track of which files are read while building a given target, and automagically identifies them as dependencies).
  • Leveraging ClearCase’s facilities, the build system “winks in” all targets that would not be built any differently than in the production build; this substantially cuts down the build time of development systems, since, even in large projects, most of the system’s files are unchanged and so associated targets do not need to be built.

I wish that I could claim responsibility for this wonderful build system, but I did not know much about build systems while working at my former employer.  I could make the necessary changes to makefiles for my projects, but I did not really understand how make worked in any depth.

After taking responsibility for the Designing Patterns build system, I researched the available options.  To my surprise, I found that make no longer was state of the art.  There are many higher-level build tools.  Some leverage the capabilities of full-fledged programming languages to describe builds (rake, which uses Ruby, SCons, which uses Python, and ant, which uses Java, are examples).  Others, like CMake, generate platform-appropriate build instructions (makefiles on UNIX, Visual Studio project files on Windows, etc.).  Despite the availability of these tools, I decided that Designing Patterns initially will use gmake.  All of the build experience at Designing Patterns has been with make variants, and it is beneficial to deploy a system that easily will be picked up by everyone, given how many other things must be learned.  Make still is used very widely; source code for packages quite often is built using makefiles generated by automake in the open source world, for instance.  Also, make is a very general, omnipresent build tool that can be molded to handle almost any task.  Many of the higher-level build tools are very closely associated with the needs of a particular language (Ruby’s rake, for instance); this does not mean that they cannot be used in other contexts, but it does mean that their target audience is relatively homogenous, and so that these other contexts may not be that well supported.  This is a very important point for Designing Patterns, as we still are exploring the technologies that we will use for our products, and we do not want the build system to be a factor in those decisions.  I choose the gmake variant because it is a very portable make (since it runs on virtually all platforms, gmake's makefiles automatically are portable), and it has a very expressive make language.  Designing Patterns will revisit this decision in the future, however, in light of the development work on our products.  Initially using gmake gives us a baseline against which to compare the capabilities of other, higher-level systems in the future.

I am in the process of writing the Designing Pattern’s gmake build system, which we are calling pattmake, and which we will release to the public.  Later blog entries will discuss its features and implementation, both of which were informed by my former employer’s build system.