Adding LogObservers - BuildBot Manual

Next: BuildStep URLs, Previous: Reading Logfiles, Up: Writing New BuildSteps

4.11.9.4 Adding LogObservers

Most shell commands emit messages to stdout or stderr as they operate, especially if you ask them nicely with a --verbose flag of some sort. They may also write text to a log file while they run. Your BuildStep can watch this output as it arrives, to keep track of how much progress the command has made. You can get a better measure of progress by counting the number of source files compiled or test cases run than by merely tracking the number of bytes that have been written to stdout. This improves the accuracy and the smoothness of the ETA display.

To accomplish this, you will need to attach a LogObserver to one of the log channels, most commonly to the “stdio” channel but perhaps to another one which tracks a log file. This observer is given all text as it is emitted from the command, and has the opportunity to parse that output incrementally. Once the observer has decided that some event has occurred (like a source file being compiled), it can use the setProgress method to tell the BuildStep about the progress that this event represents.

There are a number of pre-built LogObserver classes that you can choose from (defined in buildbot.process.buildstep, and of course you can subclass them to add further customization. The LogLineObserver class handles the grunt work of buffering and scanning for end-of-line delimiters, allowing your parser to operate on complete stdout/stderr lines. (Lines longer than a set maximum length are dropped; the maximum defaults to 16384 bytes, but you can change it by calling setMaxLineLength() on your LogLineObserver instance. Use sys.maxint for effective infinity.)

For example, let's take a look at the TrialTestCaseCounter, which is used by the Trial step to count test cases as they are run. As Trial executes, it emits lines like the following:

     buildbot.test.test_config.ConfigTest.testDebugPassword ... [OK]
     buildbot.test.test_config.ConfigTest.testEmpty ... [OK]
     buildbot.test.test_config.ConfigTest.testIRC ... [FAIL]
     buildbot.test.test_config.ConfigTest.testLocks ... [OK]

When the tests are finished, trial emits a long line of “======” and then some lines which summarize the tests that failed. We want to avoid parsing these trailing lines, because their format is less well-defined than the “[OK]” lines.

The parser class looks like this:

     from buildbot.process.buildstep import LogLineObserver
     
     class TrialTestCaseCounter(LogLineObserver):
         _line_re = re.compile(r'^([\w\.]+) \.\.\. \[([^\]]+)\]$')
         numTests = 0
         finished = False
     
         def outLineReceived(self, line):
             if self.finished:
                 return
             if line.startswith("=" * 40):
                 self.finished = True
                 return
     
             m = self._line_re.search(line.strip())
             if m:
                 testname, result = m.groups()
                 self.numTests += 1
                 self.step.setProgress('tests', self.numTests)

This parser only pays attention to stdout, since that's where trial writes the progress lines. It has a mode flag named finished to ignore everything after the “====” marker, and a scary-looking regular expression to match each line while hopefully ignoring other messages that might get displayed as the test runs.

Each time it identifies a test has been completed, it increments its counter and delivers the new progress value to the step with self.step.setProgress. This class is specifically measuring progress along the “tests” metric, in units of test cases (as opposed to other kinds of progress like the “output” metric, which measures in units of bytes). The Progress-tracking code uses each progress metric separately to come up with an overall completion percentage and an ETA value.

To connect this parser into the Trial BuildStep, Trial.__init__ ends with the following clause:

             # this counter will feed Progress along the 'test cases' metric
             counter = TrialTestCaseCounter()
             self.addLogObserver('stdio', counter)
             self.progressMetrics += ('tests',)

This creates a TrialTestCaseCounter and tells the step that the counter wants to watch the “stdio” log. The observer is automatically given a reference to the step in its .step attribute.

A Somewhat Whimsical Example

Let's say that we've got some snazzy new unit-test framework called Framboozle. It's the hottest thing since sliced bread. It slices, it dices, it runs unit tests like there's no tomorrow. Plus if your unit tests fail, you can use its name for a Web 2.1 startup company, make millions of dollars, and hire engineers to fix the bugs for you, while you spend your afternoons lazily hang-gliding along a scenic pacific beach, blissfully unconcerned about the state of your tests.¹

To run a Framboozle-enabled test suite, you just run the 'framboozler' command from the top of your source code tree. The 'framboozler' command emits a bunch of stuff to stdout, but the most interesting bit is that it emits the line "FNURRRGH!" every time it finishes running a test case². You'd like to have a test-case counting LogObserver that watches for these lines and counts them, because counting them will help the buildbot more accurately calculate how long the build will take, and this will let you know exactly how long you can sneak out of the office for your hang-gliding lessons without anyone noticing that you're gone.

This will involve writing a new BuildStep (probably named "Framboozle") which inherits from ShellCommand. The BuildStep class definition itself will look something like this:

     # START
     from buildbot.steps.shell import ShellCommand
     from buildbot.process.buildstep import LogLineObserver
     
     class FNURRRGHCounter(LogLineObserver):
         numTests = 0
         def outLineReceived(self, line):
             if "FNURRRGH!" in line:
                 self.numTests += 1
                 self.step.setProgress('tests', self.numTests)
     
     class Framboozle(ShellCommand):
         command = ["framboozler"]
     
         def __init__(self, **kwargs):
             ShellCommand.__init__(self, **kwargs)   # always upcall!
             counter = FNURRRGHCounter())
             self.addLogObserver('stdio', counter)
             self.progressMetrics += ('tests',)
     # FINISH

So that's the code that we want to wind up using. How do we actually deploy it?

You have a couple of different options.

Option 1: The simplest technique is to simply put this text (everything from START to FINISH) in your master.cfg file, somewhere before the BuildFactory definition where you actually use it in a clause like:

     f = BuildFactory()
     f.addStep(SVN(svnurl="stuff"))
     f.addStep(Framboozle())

Remember that master.cfg is secretly just a python program with one job: populating the BuildmasterConfig dictionary. And python programs are allowed to define as many classes as they like. So you can define classes and use them in the same file, just as long as the class is defined before some other code tries to use it.

This is easy, and it keeps the point of definition very close to the point of use, and whoever replaces you after that unfortunate hang-gliding accident will appreciate being able to easily figure out what the heck this stupid "Framboozle" step is doing anyways. The downside is that every time you reload the config file, the Framboozle class will get redefined, which means that the buildmaster will think that you've reconfigured all the Builders that use it, even though nothing changed. Bleh.

Option 2: Instead, we can put this code in a separate file, and import it into the master.cfg file just like we would the normal buildsteps like ShellCommand and SVN.

Create a directory named ~/lib/python, put everything from START to FINISH in ~/lib/python/framboozle.py, and run your buildmaster using:

      PYTHONPATH=~/lib/python buildbot start MASTERDIR

or use the Makefile.buildbot to control the way buildbot start works. Or add something like this to something like your ~/.bashrc or ~/.bash_profile or ~/.cshrc:

      export PYTHONPATH=~/lib/python

Once we've done this, our master.cfg can look like:

     from framboozle import Framboozle
     f = BuildFactory()
     f.addStep(SVN(svnurl="stuff"))
     f.addStep(Framboozle())

or:

     import framboozle
     f = BuildFactory()
     f.addStep(SVN(svnurl="stuff"))
     f.addStep(framboozle.Framboozle())

(check out the python docs for details about how "import" and "from A import B" work).

What we've done here is to tell python that every time it handles an "import" statement for some named module, it should look in our ~/lib/python/ for that module before it looks anywhere else. After our directories, it will try in a bunch of standard directories too (including the one where buildbot is installed). By setting the PYTHONPATH environment variable, you can add directories to the front of this search list.

Python knows that once it "import"s a file, it doesn't need to re-import it again. This means that reconfiguring the buildmaster (with "buildbot reconfig", for example) won't make it think the Framboozle class has changed every time, so the Builders that use it will not be spuriously restarted. On the other hand, you either have to start your buildmaster in a slightly weird way, or you have to modify your environment to set the PYTHONPATH variable.

Option 3: Install this code into a standard python library directory

Find out what your python's standard include path is by asking it:

     80:warner@luther% python
     Python 2.4.4c0 (#2, Oct  2 2006, 00:57:46)
     [GCC 4.1.2 20060928 (prerelease) (Debian 4.1.1-15)] on linux2
     Type "help", "copyright", "credits" or "license" for more information.
     >>> import sys
     >>> import pprint
     >>> pprint.pprint(sys.path)
     ['',
      '/usr/lib/python24.zip',
      '/usr/lib/python2.4',
      '/usr/lib/python2.4/plat-linux2',
      '/usr/lib/python2.4/lib-tk',
      '/usr/lib/python2.4/lib-dynload',
      '/usr/local/lib/python2.4/site-packages',
      '/usr/lib/python2.4/site-packages',
      '/usr/lib/python2.4/site-packages/Numeric',
      '/var/lib/python-support/python2.4',
      '/usr/lib/site-python']

In this case, putting the code into /usr/local/lib/python2.4/site-packages/framboozle.py would work just fine. We can use the same master.cfg "import framboozle" statement as in Option 2. By putting it in a standard include directory (instead of the decidedly non-standard ~/lib/python), we don't even have to set PYTHONPATH to anything special. The downside is that you probably have to be root to write to one of those standard include directories.

Option 4: Submit the code for inclusion in the Buildbot distribution

Make a fork of buildbot on http://github.com/djmitche/buildbot or post a patch in a bug at http://buildbot.net. In either case, post a note about your patch to the mailing list, so others can provide feedback and, eventually, commit it.

     from buildbot.steps import framboozle
     f = BuildFactory()
     f.addStep(SVN(svnurl="stuff"))
     f.addStep(framboozle.Framboozle())

And then you don't even have to install framboozle.py anywhere on your system, since it will ship with Buildbot. You don't have to be root, you don't have to set PYTHONPATH. But you do have to make a good case for Framboozle being worth going into the main distribution, you'll probably have to provide docs and some unit test cases, you'll need to figure out what kind of beer the author likes, and then you'll have to wait until the next release. But in some environments, all this is easier than getting root on your buildmaster box, so the tradeoffs may actually be worth it.

Putting the code in master.cfg (1) makes it available to that buildmaster instance. Putting it in a file in a personal library directory (2) makes it available for any buildmasters you might be running. Putting it in a file in a system-wide shared library directory (3) makes it available for any buildmasters that anyone on that system might be running. Getting it into the buildbot's upstream repository (4) makes it available for any buildmasters that anyone in the world might be running. It's all a matter of how widely you want to deploy that new class.

Footnotes

[1] framboozle.com is still available. Remember, I get 10% :).

[2] Framboozle gets very excited about running unit tests.