How Different VC Systems Specify Sources - BuildBot Manual

Next: Attributes of Changes, Previous: Source Tree Specifications, Up: Version Control Systems

3.1.3 How Different VC Systems Specify Sources

For CVS, the static specifications are repository and module. In addition to those, each build uses a timestamp (or omits the timestamp to mean the latest) and branch tag (which defaults to HEAD). These parameters collectively specify a set of sources from which a build may be performed.

Subversion combines the repository, module, and branch into a single Subversion URL parameter. Within that scope, source checkouts can be specified by a numeric revision number (a repository-wide monotonically-increasing marker, such that each transaction that changes the repository is indexed by a different revision number), or a revision timestamp. When branches are used, the repository and module form a static baseURL, while each build has a revision number and a branch (which defaults to a statically-specified defaultBranch). The baseURL and branch are simply concatenated together to derive the svnurl to use for the checkout.

Perforce is similar. The server is specified through a P4PORT parameter. Module and branch are specified in a single depot path, and revisions are depot-wide. When branches are used, the p4base and defaultBranch are concatenated together to produce the depot path.

Arch and Bazaar specify a repository by URL, as well as a version which is kind of like a branch name. Arch uses the word archive to represent the repository. Arch lets you push changes from one archive to another, removing the strict centralization required by CVS and SVN. It retains the distinction between repository and working directory that most other VC systems use. For complex multi-module directory structures, Arch has a built-in build config layer with which the checkout process has two steps. First, an initial bootstrap checkout is performed to retrieve a set of build-config files. Second, one of these files is used to figure out which archives/modules should be used to populate subdirectories of the initial checkout.

Builders which use Arch and Bazaar therefore have a static archive url, and a default “branch” (which is a string that specifies a complete category–branch–version triple). Each build can have its own branch (the category–branch–version string) to override the default, as well as a revision number (which is turned into a –patch-NN suffix when performing the checkout).

Bzr (which is a descendant of Arch/Bazaar, and is frequently referred to as “Bazaar”) has the same sort of repository-vs-workspace model as Arch, but the repository data can either be stored inside the working directory or kept elsewhere (either on the same machine or on an entirely different machine). For the purposes of Buildbot (which never commits changes), the repository is specified with a URL and a revision number.

The most common way to obtain read-only access to a bzr tree is via HTTP, simply by making the repository visible through a web server like Apache. Bzr can also use FTP and SFTP servers, if the buildslave process has sufficient privileges to access them. Higher performance can be obtained by running a special Bazaar-specific server. None of these matter to the buildbot: the repository URL just has to match the kind of server being used. The repoURL argument provides the location of the repository.

Branches are expressed as subdirectories of the main central repository, which means that if branches are being used, the BZR step is given a baseURL and defaultBranch instead of getting the repoURL argument.

Darcs doesn't really have the notion of a single master repository. Nor does it really have branches. In Darcs, each working directory is also a repository, and there are operations to push and pull patches from one of these repositories to another. For the Buildbot's purposes, all you need to do is specify the URL of a repository that you want to build from. The build slave will then pull the latest patches from that repository and build them. Multiple branches are implemented by using multiple repositories (possibly living on the same server).

Builders which use Darcs therefore have a static repourl which specifies the location of the repository. If branches are being used, the source Step is instead configured with a baseURL and a defaultBranch, and the two strings are simply concatenated together to obtain the repository's URL. Each build then has a specific branch which replaces defaultBranch, or just uses the default one. Instead of a revision number, each build can have a “context”, which is a string that records all the patches that are present in a given tree (this is the output of darcs changes --context, and is considerably less concise than, e.g. Subversion's revision number, but the patch-reordering flexibility of Darcs makes it impossible to provide a shorter useful specification).

Mercurial is like Darcs, in that each branch is stored in a separate repository. The repourl, baseURL, and defaultBranch arguments are all handled the same way as with Darcs. The “revision”, however, is the hash identifier returned by hg identify.

Git also follows a decentralized model, and each repository can have several branches and tags. The source Step is configured with a static repourl which specifies the location of the repository. In addition, an optional branch parameter can be specified to check out code from a specific branch instead of the default “master” branch. The “revision” is specified as a SHA1 hash as returned by e.g. git rev-parse. No attempt is made to ensure that the specified revision is actually a subset of the specified branch.