John Cary, 12/19/2015 08:35 AM


What is Bilder?

Bilder is a cross-platform (Linux, OS X, Windows), meta-build or package management system applicable to LCFs, such as the IBM Blue Gene and the Cray series of computers. It automatically downloads packages, then configures, builds and installs them. It handles updating and building of a collection of packages and projects that have dependency relationships. When one package is updated, Bilder ensures that its dependents are updated as well. It can install in common areas, so that multiple packages can make use of the same dependency builds.

As of January 16, 2012, Bilder handles multiple builds of over 150 packages, with the multiple builds being, e.g., serial, parallel, shared, or static, as needed. The platforms include Linux, OS X, AIX, and the specialized Linuces found on the IBM Blue Gene P and the Cray XT4. It handles the compiler sets of gcc, XL, PathScale and PGI.

Bilder is not for replacing build systems. Instead, it works with the build systems that come with each package. It supports packages with build systems of autotools, CMake, qmake, Distutils, and the one-off build systems of, e.g., lapack, ATLAS, and PETSc. In essence, Bilder acts as a repository of build knowledge.

Bilder Characteristics

  • Build workflow automation, handling interpackage dependencies, with builds triggered when a dependency has been built.
  • Uses soft inter-package dependencies: Suppose component A depends on B, and B is updated but does not build (fails or is excluded). Bilder attempts to build A anyway if any other dependency is rebuilt or if A is updated, as the newer A may be consistent with an existing installation of B, or A may be able to build without B.
  • Integration with version control systems.
  • Integration with testing.
  • Support for multiple OSs: Linux, OS X, Windows
  • Support for multiple compiler sets (gcc, XL, PGI, PathScale, Visual Studio)
    • LCFs have particular preferred compilers, e.g., for which some libraries have been built
    • Need to compare performance of code generated by different compilers
    • Have to use built libraries (HDF5, Lapack) when possible for performance
  • Ability to use different underlying package configuration/build systems.
  • Support for different kinds of builds (e.g., parallel, serial, static, shared) for any package.
  • Collection of build provenance information, including logging of all steps and notification using emails and dashboards.
  • Allows disabling the builds of particular packages (e.g., so that a system version will be used).
  • Parallel (multi-threaded or multi-process) builds of independent builds or packages.
  • Out of place build and installation: with defaults and also user-specified locations.
  • Defaults for all parameters on all supported platforms that can be overridden by users.
  • Integration with the Jenkins continuous integration tool.
  • Searching for packages within the installation area.
  • Isolation of general logic from specific logic and data
    • General logic in top-level Bilder files
    • Package specific logic and data in package files (the files in the package subdirectory)
    • Machine specific logic and data in machine files (the files in the machines subdirectory)
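
The soft-dependency rule listed among the characteristics above can be read as a small decision function. The following is only a sketch of the rule as stated there, not Bilder's implementation; the function name should_attempt_build and its true/false arguments are hypothetical.

```shell
#!/bin/bash
# Sketch of the soft-dependency rule; names are hypothetical, not Bilder's API.
# Usage: should_attempt_build <pkg_updated> <other_dep_rebuilt>
#   pkg_updated        true if package A itself has new source
#   other_dep_rebuilt  true if any dependency other than the failed or
#                      excluded one was rebuilt
# Returns 0 (attempt the build) or 1 (skip it).
should_attempt_build() {
  pkg_updated=$1
  other_dep_rebuilt=$2
  if [ "$pkg_updated" = true ] || [ "$other_dep_rebuilt" = true ]; then
    return 0
  fi
  return 1
}

# B failed to build, but A has new source: still try A.
if should_attempt_build true false; then
  echo "attempting build of A"
fi
```

Note that the failure of B does not appear as an input: under a soft dependency, it does not by itself block the attempt to build A.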

What does Bilder not handle?

  • Installing compilers
  • Probably much more

Preparing your machine for Bilder

Once your machine is prepared, check out a Bilder repo and build. Below are some examples.

EXAMPLE1: Python Packages

Build ipython, scipy, tables with one command! This will build these packages and all of their dependencies, which are ipython scipy tables tornado pyzmq pyqt matplotlib hdf5 numexpr setuptools zeromq Cython qt sip numpy Python atlas clapack_cmake chrpath sqlite bzip2 lapack cmake.

svn checkout http://ice.txcorp.com/svnrepos/code/bilder/pypkgs/trunk pypkgs
cd pypkgs
./mkpypkgs.sh

EXAMPLE2: VisIt Visual Analysis Package

https://wci.llnl.gov/codes/visit/

Build the VisIt visualization tool with one command! This will build VisIt and all its dependencies, which are visit Imaging visit_vtk qt mesa hdf5 openmpi zlib cmake bzip2.

svn checkout http://ice.txcorp.com/svnrepos/code/bilder/visitall/trunk visitall
cd visitall
./mkvisitall.sh

Getting Bilder

Bilder is a set of shell scripts to configure software. All the configure scripts are available from a subversion repository. To access Bilder, enter:

svn co https://ice.txcorp.com/svnrepos/code/bilder/trunk bilder

Configuring Bilder

Required configuration information

Before running Bilder, you need to tell it where its configuration information is. This is a directory, and the environment variable BILDER_CONFDIR must be set to its path (e.g., BILDER_CONFDIR=/etc/bilder).

Inside that directory, there must be at least two files. The first, bilderrc, defines a variable, PACKAGE_REPOS_FILE, that contains the name of the file containing the repositories to be searched for tarballs for packages to be built. E.g.,

PACKAGE_REPOS_FILE=${PACKAGE_REPOS_FILE:-"$BILDER_CONFDIR/numpkgssvn.txt"}

This follows the standard Bilder style, that no variable with a value is overwritten. This allows the person executing the build instruction to override any variable value on the command line, e.g., using env.
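
The effect of this idiom can be checked in isolation. The sketch below is only an illustration of the shell parameter expansion used above; the paths /etc/bilder and /tmp/myrepos.txt are placeholders, not locations Bilder requires.

```shell
#!/bin/bash
# Illustration of the "never overwrite a set variable" idiom.
# The paths are placeholders.
BILDER_CONFDIR=/etc/bilder

# Case 1: PACKAGE_REPOS_FILE is unset, so the default is taken.
unset PACKAGE_REPOS_FILE
PACKAGE_REPOS_FILE=${PACKAGE_REPOS_FILE:-"$BILDER_CONFDIR/numpkgssvn.txt"}
default_value=$PACKAGE_REPOS_FILE

# Case 2: the caller set the variable beforehand, so it is kept.
PACKAGE_REPOS_FILE=/tmp/myrepos.txt
PACKAGE_REPOS_FILE=${PACKAGE_REPOS_FILE:-"$BILDER_CONFDIR/numpkgssvn.txt"}
override_value=$PACKAGE_REPOS_FILE

echo "default:  $default_value"
echo "override: $override_value"
```

The same override can be supplied from the command line by prefixing the build command with env and the assignment, for example env PACKAGE_REPOS_FILE=/tmp/myrepos.txt followed by the script name.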

The package repos file then contains the repos to be searched for packages, with the format:

    $ cat numpkgssvn.txt 
    ####
    #
    # File:    numpkgssvn.sh
    #
    # Purpose: List the package repos in the format,
    #          subdir,method=URL
    #          Where subdir is the desired location for the repo,
    #          method = svn to get by svn, empty to get with wget
    #          URL is the resource locator
    #
    # Version: $Id: numpkgssvn.txt 54 2012-04-08 13:52:09Z cary $
    #
    ####
    PACKAGE_REPO: numpkgs,svn=https://ice.txcorp.com/svnrepos/code/numpkgs/trunk

Each line starting with PACKAGE_REPO: defines the subdir (in this case numpkgs) into which the packages are put, the method (in this case svn) for getting the packages, and after the equals sign, the URL for the directory containing all packages.
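
To make the three fields concrete, here is a minimal sketch of how such a line could be split using only shell parameter expansion. It is not Bilder's own parser; the helper name parse_package_repo and the variables it sets (repo_subdir, repo_method, repo_url) are hypothetical.

```shell
#!/bin/bash
# Sketch of splitting a PACKAGE_REPO line into its three parts.
# parse_package_repo is a hypothetical helper, not a Bilder function.
# It sets repo_subdir, repo_method, and repo_url.
parse_package_repo() {
  spec=${1#PACKAGE_REPO:}                  # drop the leading keyword
  spec=${spec#"${spec%%[![:space:]]*}"}    # drop leading whitespace
  repo_subdir=${spec%%,*}                  # text before the first comma
  rest=${spec#*,}                          # text after the first comma
  repo_method=${rest%%=*}                  # text before the first '='
  repo_url=${rest#*=}                      # text after the first '='
}

parse_package_repo 'PACKAGE_REPO: numpkgs,svn=https://ice.txcorp.com/svnrepos/code/numpkgs/trunk'
echo "subdir=$repo_subdir method=$repo_method url=$repo_url"
```

An empty method (nothing between the comma and the equals sign) leaves repo_method empty, which corresponds to the wget case described in the file header.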

For the method (the part between the comma and the equals sign) of svn, this means that the svn repo will be checked out as empty,

svn co --depth=empty https://ice.txcorp.com/svnrepos/code/numpkgs/trunk numpkgs

and packages will be obtained by

svn up pkgname

in the numpkgs subdirectory.

Optional logic in bilderrc

It can happen that "hostname -f" does not give the fully qualified hostname for your machine. In this case, you can define FQHOSTNAME to contain that hostname.
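
A bilderrc fragment along the following lines could supply the missing domain. This is a sketch under assumptions: the helper qualify_hostname and the domain example.com are illustrative only, and a site would substitute its own domain.

```shell
#!/bin/bash
# Hypothetical bilderrc fragment: make sure FQHOSTNAME is fully qualified.
# qualify_hostname and the domain example.com are illustrative only.
qualify_hostname() {
  case $1 in
    *.*) echo "$1" ;;        # already contains a domain part
    *)   echo "$1.$2" ;;     # append the given domain
  esac
}

# Keep any value the user already set; otherwise derive one.
if [ -z "$FQHOSTNAME" ]; then
  shortname=$(hostname -f 2>/dev/null || uname -n)
  FQHOSTNAME=$(qualify_hostname "$shortname" example.com)
fi
echo "FQHOSTNAME=$FQHOSTNAME"
```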

You can also define the following three methods:

  • bilderGetAuxData defines how to get any auxiliary data needed by a package
  • bilderFinalAction defines a final action (like posting to a dashboard) to be undertaken at the end of a build run
  • signInstaller to sign any installers that you create during your build

Optional additional logic

You can provide domain-specific logic, including default installation directories and similar settings, in files named after the domain name. Examples are in bilder/runnr. E.g.,

    $ cat nersc.gov 
    ##############################################################
    ##
    ## File:    nersc.gov
    ##
    ## Purpose: Helper functions for setting variables and queues by domain
    ##
    ## Version: $Id: nersc.gov 5644 2012-04-02 13:35:02Z cary $
    ##
    ## /* vim: set filetype=sh : */
    ##
    ##############################################################
    #
    # Adjust the auxiliary names:
    #   MAILSRVR, INSTALLER_HOST, INSTALLER_ROOTDIR, FQMAILHOST, BLDRHOSTID
    #
    runnrSetNamesByDomain() {
    # Hosts for which FQMAILHOST is not obvious.  Also ensure that an
    # install host name is set for all cases.
      case $UQHOSTNAME in
        cvrsvc[0-9]*)
          FQMAILHOST=carver.nersc.gov
          ;;
        dirac[0-9]*)
          FQMAILHOST=dirac.nersc.gov
          ;;
        freedom[0-9]*)
          FQMAILHOST=freedom.nersc.gov
          RUNNRSYSTEM=XT4
          ;;
        hopper[01][0-9]*)
          FQMAILHOST=hopper.nersc.gov
          RUNNRSYSTEM=XE6
          ;;
        nid[0-9]*)
          FQMAILHOST=franklin.nersc.gov
          RUNNRSYSTEM=XT4
          ;;
      esac
    }
    runnrSetNamesByDomain
    cat >/dev/null <<EOF  ## (Block comment)
    MODULES AT NERSC

    This is an incomplete list of modules that have to be loaded on the machines that use modules.

    FRANKLIN:
    Currently Loaded Modulefiles:
        1) modules/3.1.6.5
        2) moab/5.2.5
        3) torque/2.4.1b1-snap.200905131530
        4) xtpe-barcelona
        5) xtpe-target-cnl
        6) MySQL/5.0.45
        7) xt-service/2.1.50HDB_PS13A
        8) xt-libc/2.1.50HDB_PS13A
        9) xt-os/2.1.50HDB_PS13A
       10) xt-boot/2.1.50HDB_PS13A
       11) xt-lustre-ss/2.1.50HDB_PS13A_1.6.5
       12) Base-opts/2.1.50HDB_PS13A
       13) PrgEnv-gnu/2.1.50HDB_PS13A
       14) xt-asyncpe/3.3
       15) xt-pe/2.1.50HDB_PS13A
       16) xt-mpt/3.5.0
       17) xt-libsci/10.4.0
       18) gcc/4.4.1
       19) java/jdk1.6.0_07
       20) python/2.6.2
       21) subversion/1.6.4
       22) szip/2.1
    HOPPER:
    Currently Loaded Modulefiles:
        1) modules/3.1.6             9) xt-asyncpe/3.4
        2) torque/2.4.1b1           10) PrgEnv-pgi/2.2.41
        3) moab/5.3.4               11) xtpe-target-cnl
        4) pgi/9.0.4                12) eswrap/1.0.5
        5) xt-libsci/10.4.0         13) xtpe-shanghai
        6) xt-mpt/3.5.0             14) gcc/4.3.3
        7) xt-pe/2.2.41             15) java/jdk1.6.0_15
        8) xt-sysroot/2.2.20090720  16) szip/2.1
    CARVER:
    bilder needs to find either a pgi or a gcc module in your modules list.
    EOF
    #
    # Determine RUNNR_QTARGET, RUNNR_QUEUE, RUNNR_ACCOUNT, RUNNR_PPN
    #
    runnrSetQInfoByDomain() {
      RUNNR_QTARGET=${RUNNR_QTARGET:-"headnode"}
      local fqdn
      if ! fqdn=`hostname -f 2>/dev/null`; then
        fqdn=`hostname`
      fi
      case $SCRIPT_NAME in
        mkfcall | mkfcpkgs)
          RUNNR_ACCOUNT=${RUNNR_ACCOUNT:-"m681"}    # FACETS
          ;;
        mkvpall)
          RUNNR_ACCOUNT=${RUNNR_ACCOUNT:-"m778"}    # ComPASS
          ;;
        *)
          RUNNR_ACCOUNT=${RUNNR_ACCOUNT:-"m778"}    # ComPASS
          ;;
      esac
      RUNNR_QUEUE=${RUNNR_QUEUE:-"regular"}
      RUNNR_NCPUSVAR=mppwidth
    }
    runnrSetQInfoByDomain
    #
    # Set default options.  This has to be called after option parsing.
    # Should set
    #  CONTRIB_ROOTDIR    The root directory for common installations of tarballs
    #  INSTALL_ROOTDIR    The root directory for common installations of repos
    #  USERINST_ROOTDIR   The root directory for user installations (same for
    #                     tarballs and repos)
    #  INSTALL_SUBDIR_SFX Added to subdir (software, contrib, volatile, internal)
    #                     to complete the installation dir
    #  BUILD_ROOTDIR      Where builds are to take place
    #  BILDER_ADDL_ARGS   Any additional args to pass to Bilder
    #  MACHINEFILE        The machine file to use
    #
    setBilderHostVars() {
      #
      # Preliminary variables
      #   Determine the compiler and version for machinefile and namespacing
      #
      local compkey=`modulecmd bash list -t 2>&1 | grep PrgEnv | sed -e 's/^PrgEnv-//' -e 's?/.*??'`
      # echo compkey = $compkey
      if test -z "$compkey"; then
        local comp=
        for comp in pgi gcc gnu; do
          compkey=`module list -t 2>&1 | grep ^$comp | sed -e 's?/.*$??'`
          if test -n "$compkey"; then
            break
          fi
        done
      fi
      if test -z "$compkey"; then
        echo "Cannot determine the compkey.  Quitting."
        exit 1
      fi
      # echo "compkey = $compkey."
      case $compkey in
        gnu)   compkey=gcc;;
        path*) compkey=path;;
      esac
      echo compkey = $compkey
      local compver=`modulecmd bash list -t 2>&1 | grep ^$compkey | sed -e 's?^.*/??'`
      local majorminor=`echo $compver | sed -e "s/\(^[^\.]*\.[^\.]*\).*/\1/"`
      compver=$majorminor
      echo compver = $compver
      # echo "Quitting in nersc.gov."; exit
      # Set the installation and project subdirs
      CONTRIB_ROOTDIR=/project/projectdirs/facets
      if test -z "$PROJECT_INSTSUBDIR"; then
        echo "PROJECT_INSTSUBDIR not set.  Quitting."
        exit 1
      fi
      INSTALL_ROOTDIR=/project/projectdirs/$PROJECT_INSTSUBDIR
      local machinedir=$UQMAILHOST
      if test $UQMAILHOST = freedom; then
        machinedir=franklin
      fi
      CONTRIB_ROOTDIR=$CONTRIB_ROOTDIR/$machinedir
      USERINST_ROOTDIR=$INSTALL_ROOTDIR/$USER/$machinedir
      INSTALL_ROOTDIR=$INSTALL_ROOTDIR/$machinedir
      INSTALL_SUBDIR_SFX="-$compkey-$compver"
      # Set the build directory
      if test -n "$GSCRATCH"; then
        BUILD_ROOTDIR=${BUILD_ROOTDIR:-"$GSCRATCH/builds-${UQHOSTNAME}-$compkey"}
      elif test -n "$SCRATCH"; then
        BUILD_ROOTDIR=${BUILD_ROOTDIR:-"$SCRATCH/builds-${UQHOSTNAME}-$compkey"}
      fi
      # Add to BILDER_ARGS
      BILDER_ADDL_ARGS=-P
      # Set machine file
      case $machinedir in
        hopper | franklin) MACHINEFILE=${MACHINEFILE:-"cray.$compkey"};;
        *) MACHINEFILE=${MACHINEFILE:-"nersclinux.$compkey"};;
      esac
    }

As seen above, this file may also define the method setBilderHostVars, which can set the variables that determine where builds take place, where installations go, etc.
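
For a site much simpler than NERSC, a domain file can be far shorter. The sketch below is a hypothetical minimal version: only the variable names are taken from the comments in the nersc.gov file above, while every path, the machine file name linux.gcc, the helper majorMinor, and the hard-coded compiler key and version are assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical minimal domain file (e.g., named example.com).
# All paths and the machine file name are placeholders; only the variable
# names come from the comments in the nersc.gov file.

# Reduce a compiler version such as 4.4.1 to major.minor (4.4).
majorMinor() {
  echo "$1" | sed -e 's/^\([^.]*\.[^.]*\).*/\1/'
}

setBilderHostVars() {
  # Assumed known here; the nersc.gov file detects these from modules.
  compkey=gcc
  compver=$(majorMinor 4.4.1)
  CONTRIB_ROOTDIR=/opt/contrib
  INSTALL_ROOTDIR=/opt/software
  USERINST_ROOTDIR=$INSTALL_ROOTDIR/${USER:-nobody}
  INSTALL_SUBDIR_SFX="-$compkey-$compver"
  # Follow the Bilder style: do not overwrite values already set.
  BUILD_ROOTDIR=${BUILD_ROOTDIR:-"${TMPDIR:-/tmp}/builds-$compkey"}
  MACHINEFILE=${MACHINEFILE:-"linux.$compkey"}
}

setBilderHostVars
echo "INSTALL_SUBDIR_SFX=$INSTALL_SUBDIR_SFX"
```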

Running Bilder

Running Bilder for the Novice User

First you will need to check out a "meta-project" svn repo that includes the source that you want to build along with the bilder scripts repo.

For example, Tech-X maintains the visitall repo, which can be obtained by:

svn co https://ice.txcorp.com/svnrepos/code/visitall/trunk visitall

In the Bilder'ized project, there is usually a script named "mkXYZall-default.sh", where XYZ is the project name, possibly abbreviated (e.g., for visitall the script is mkvisitall-default.sh). Running this script is the easiest way to run Bilder. The options of a top-level "default" Bilder script can be seen by running the script with the -h flag:

    $ ./mkvisitall-default.sh -h
    source /Users/cary/projects/visitall/bilder/runnr/runnrfcns.sh
    Usage: ./mkvisitall-default.sh [options]
    This script is meant to handle some of the vagaries that occur at LCFs and
    clusters in large systems (which have complicated file systems) such as those 
    that have high performance scratch systems and NFS mounted home systems. This 
    script is also meant to ease the use of non-gfortran compilers.
    OPTIONS
    -c              common installations: for non-LCFS, goes into /contrib,
                    /volatile or /internal, for LCFSs, goes into group areas
    -C              Install in separate tarball and repo install dirs
                    (internal/volatile) rather than in one area (software).
    -E <env pairs>  Comma-delimited list of environment var=value pair
    -f <file>       File that contains extra arguments to pass
                    Default: .extra_args
    -F <compiler>   Specify fortran compiler on non-LCF systems
    -g              Label the gnu builds the same way other builds occur.
    -H <host name>  use rules for this hostname (carver, surveyor, intrepid)
    -h              print this message
    -i              Software directory is labeled with "internal" if '$USER'
                    is member of internal install list
    -I              Install in $HOME instead of default location
                    (projects directory at LCFs, BUILD_ROOTDIR on non-LCFs)
    -j              Maximum allowed value of the arg of make -j
    -k              On non-LCFs: Try to find a tarball directory (/contrib)
                    On LCFs:     Install tarballs (instead of using facetspkgs)
    -m              force this machine file
    -n              invoke with a nohup and a redirect output
    -p              just print the command
    -q <timelimit>  run in queue if possible, with limit of timelimit time
    -t              Pass the -t flag to the  mk script (turn on testing)
    -v <file>       A file containing a list (without commas) of declared
                    environment variables to be passed to mk*.sh script
    -w <file>       Specify the name of a file which has a comma-delimited
                    list of packages not to build (e.g.,
                    plasma_state,nubeam,uedge) Default: .nobuild
    --              End processing of args for mkall-default.sh, all remaining
                    args are passed to the script.

For this script to work, you must have defined the location of your Bilder configuration directory in the environment variable BILDER_CONFDIR, as described in Configuring Bilder.

Running Bilder for the Advanced User ...

In the Bilder'ized project, there will be a script named "mkXYZall.sh", where XYZ is the project name, possibly abbreviated (e.g., for visitall the script is mkvisitall.sh). The options of a top-level Bilder script can be seen by running the script with the -h flag:

    $ ./mkvisitall.sh -h
    /Users/cary/projects/visitall/bilder/runnr/runnrfcns.sh sourced.
    Usage: ./mkvisitall.sh [options]
    GENERAL OPTIONS
      -A <addl_sp>        Add this to the supra search path
      -b <build_dir>      Build in <build_dir>
      -B <build_type>     CMake build type
      -c ............... Configure packages but don't build
      -C ............... Create installers
      -d ............... Create debug builds (limited package support)
      -D ............... Build/install docs
      -e <addr>          Email log to specified recipients
      -E <env pairs>.... Comma-delimited list of environment var=value pair
      -F ............... Force installation of packages that have local
                         modifications
      -g ............... Allow use of gfortran with version <4.3
      -h ............... Print this message
      -i <install_dir>   Set comma delimited list of installation directories
                         for code in subdirs, expected to be svn repos; install
                         in first directory unless command line contains -2,
                         in which case install in the last directory.
                         <install_dir> defaults to $HOME/software if not set.
      -I ............... Install even if tests fail (ignore test results)
      -j <n>             Pass arg to make with -j
      -k <tarball_dir>   Set installation directory for code in tarballs,
                         expected to be found in one of the pkg repo subdirs;
                         <tarball_dir> defaults to <install_dir> if not set.
      -l <mpi launcher>  The executable that launches an MPI job
      -L ............... Directory for logs (if different from build)
      -m <hostfile>      File to source for machine specific defs
      -M ............... Maximally thread
      -o ............... Install openmpi if not on cygwin.
      -O ............... Install optional packages = ATLAS, parallel visit, ...
      -p <path>          Specify a supra-search-path
      -P ............... Force build of python(does not apply to OS X or Windows)
      -r ............... Remove other installations of a package upon successful
                         installation of that package
      -R ............... Build RELEASE (i.e., licensed) version of executable,
                         if applicable.
      -S ............... Build static
      -t ............... Run tests
      -u ............... Do "svn up" at start
      -U ............... Do not get (direct access or svn up) tarballs
      -v ............... Verbose: print debug information from bilder
      -w <wait days>      Wait this many days before doing a new installation
      -W <disable builds> Build without these packages (comma delimited list)
                          e.g., -W nubeam,plasma_state
      -X ............... Build experimental (new) versions of packages
      -Z ............... Do not execute the final action
      -2 ............... Use the second installation directory of the comma
                         delimited list.  Causes -FI options.

Notes on Installation Directories and Path Modifications

Bilder builds all software, when possible, in "the build directory", which is specified by the -b flag (<build_dir> in the usage message above). It also unpacks tarballs into this directory before building them.

Bilder defines two installation directories, which may be the same.

Tarballs are installed in "the tarball directory" or <tarballdir>, specified by the -k flag. This is the /contrib directory at Tech-X.

Code from repositories is installed in "the repo directory" or <repodir>, the directory specified by the -i flag. At Tech-X, this is typically /volatile or /internal.

If only one of the above directories is specified, then the other directory defaults to the specified directory. If neither directory is specified, then both directories default to $HOME/software.
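
The defaulting rule just described can be sketched as a small function. This is not Bilder's code: resolve_install_dirs is a hypothetical helper, and for simplicity it treats the -i value as a single directory rather than the comma-delimited list that the option accepts.

```shell
#!/bin/bash
# Sketch of the defaulting rule for the two installation directories.
# resolve_install_dirs is a hypothetical helper; it sets repodir and
# tarballdir.
# Usage: resolve_install_dirs "<value of -i or empty>" "<value of -k or empty>"
resolve_install_dirs() {
  repodir=$1
  tarballdir=$2
  if [ -z "$repodir" ] && [ -z "$tarballdir" ]; then
    # Neither given: both fall back to $HOME/software.
    repodir=$HOME/software
    tarballdir=$HOME/software
  elif [ -z "$repodir" ]; then
    repodir=$tarballdir
  elif [ -z "$tarballdir" ]; then
    tarballdir=$repodir
  fi
}

resolve_install_dirs /volatile ""
echo "repos: $repodir  tarballs: $tarballdir"
```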

During the build process, /contrib/autotools/bin:/contrib/valgrind/bin:/contrib/mpi/bin:/contrib/hdf5/bin:/contrib/bin: is added to the front of the path so that the installed packages are used to build the packages.

Debugging Bilder Errors

Bilder is a set of bash scripts. The trunk version of the scripts (https://ice.txcorp.com/svnrepos/code/bilder/trunk/) will tell you exactly what Bilder is doing if you know bash programming.

Bilder's Build Types

The standard builds of Bilder are

  • ser: static, serial build
  • par: static, parallel (MPI) build
  • sersh: shared, serial build
  • parsh: shared, parallel (MPI) build
  • cc4py: shared build compatible with the way Python was built
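
As a compact restatement of the list above, the following sketch maps each standard build name to its linkage and parallelism. The helper describe_build is hypothetical and is not part of Bilder.

```shell
#!/bin/bash
# Sketch: map a standard build name to its linkage and parallelism.
# describe_build is a hypothetical helper, not part of Bilder.
describe_build() {
  case $1 in
    ser)   echo "static serial" ;;
    par)   echo "static parallel" ;;
    sersh) echo "shared serial" ;;
    parsh) echo "shared parallel" ;;
    cc4py) echo "shared python-compatible" ;;
    *)     echo "unknown"; return 1 ;;
  esac
}

for b in ser par sersh parsh cc4py; do
  echo "$b: $(describe_build "$b")"
done
```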

The Bilder standard is to install each build in its own directory. While libtool allows shared and static builds to be done within the same build, CMake generally does not, as discussed at http://www.cmake.org/Wiki/CMake_FAQ#Library_questions. Further, to do this on Windows, library names have to differ, as otherwise the static library and the shared interface library files would overwrite each other. So in the end, it is simply easier to install shared and static libraries in their own directories.

In all cases, the builds are to be "as complete as possible". E.g., for HDF5 on Darwin, shared libraries are not supported with fortran. So in this case, sersh has to disable the fortran libraries. However, completeness may depend on other criteria. So, e.g., for trilinos, complete builds are provided, but so are builds that are as complete as possible and compatible with licenses that allow free reuse in commercial products.

Static builds

The static builds provide the most portable builds, as they eliminate or minimize the need to be compatible with any system shared libraries. They are also the most widely supported. For Windows, this means libraries that link against the static runtime library (libcmt). Generally this means that, for Windows, one should not use a static dependency for a shared build of a project, as doing so typically leads to the dreaded runtime conflict, e.g., http://stackoverflow.com/questions/2360084/runtime-library-mis-matches-and-vc-oh-the-misery.

Shared builds

Shared builds allow one to reuse libraries among executables, but then one has the difficulty of finding those libraries at runtime. This can be particularly difficult when moving an installation from one machine to another or when installing a package. To minimize these headaches, Bilder, as much as possible, uses rpath on Linux. However, packages need to figure out how to modify any executables or libraries post-build to make an installer.

Cc4py builds

This is a special build that is just a shared build using the compiler that Python was compiled with. This is generally gcc for Unices and Visual Studio for Windows. One adds a cc4py build only when the serial compiler is not the compiler used to build Python.

Bilder Hierarchy

It is possible to specialize Bilder per machine, per project, and per person by sourcing file(s) at each level of the hierarchy:

Bilder default settings

When no specialization files are used, Bilder uses the default settings for the project.

By Machine

A set of machine files under the bilder/machines directory specifies machine-specific variables and settings. For example, to build a project on Windows with Cygwin using Visual Studio 9, the cygwin.vs9 machine file sets up the environment as needed by Visual Studio 9. The machine file can be specified with the -m option.

By Project

Please see Configuring Bilder on how to set up per-project configurations. There, one can specify the information needed for the project, such as where to obtain third-party dependency libraries, the default installation directories, and the variables defining where builds should take place and where installations should go.

By Person

Default settings using .bilddefrc

Every person building a project using Bilder can specify his/her own default settings by creating a .bilddefrc file in their home directory. This will be sourced in the mkXYZall-default.sh file to override any other default project settings.

Settings using .bilderrc

Every person building a project using Bilder can specify his/her own settings by creating a .bilderrc file in their home directory. This will be sourced in the mkXYZall.sh file to override any other project settings.

Per package per person

In cases where it is necessary to specify settings per package per person, a XYZ.conf file can be specified in the BILDER_CONFDIR/packages directory. If found, this file will be sourced in the mkXYZ.sh script to override all the other settings. If this file is modified, then Bilder will reconfigure and build the package.
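
The layering described in this section relies on files being sourced in sequence, with each later file able to override what came before. The sketch below demonstrates that mechanism only; the helper source_if_present, the demo file project.rc, and the paths are hypothetical, and the real scripts decide which files are sourced and in what order.

```shell
#!/bin/bash
# Sketch of layered settings: each later file may override earlier values.
# source_if_present and the demo files are hypothetical.
source_if_present() {
  if [ -f "$1" ]; then
    . "$1"
  fi
}

# Demo in a scratch directory standing in for project and home directories.
demo=$(mktemp -d)
echo 'BUILD_ROOTDIR=/scratch/project-default' > "$demo/project.rc"
echo 'BUILD_ROOTDIR=/scratch/my-builds'       > "$demo/.bilderrc"

BUILD_ROOTDIR=/tmp/builds                 # built-in default
source_if_present "$demo/project.rc"      # per-project settings
source_if_present "$demo/.bilderrc"       # per-person settings win
source_if_present "$demo/missing.conf"    # absent files are skipped
echo "BUILD_ROOTDIR=$BUILD_ROOTDIR"
rm -rf "$demo"
```

Because the per-person file is sourced last in this demo, its value is the one that remains.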

Running Bilder Through The Defaults Scripts

Bilder has many options, which creates the potential for mistakes. To reduce that risk, we have created defaultsfcns.sh and mkall-default.sh; the associated defaults scripts then include the latter and execute runBilderCmd:

    $ cat mkfcall-default.sh 
    #!/bin/bash
    #
    # Determine (and possibly execute) the default Bilder command
    # for Facetsall.
    #
    # $Id: mkfcall-default.sh 593 2012-03-09 15:26:46Z cary $
    #
    ###########################################################
    # 
    # Set the default variables
    mydir=`dirname $0`
    mydir=${mydir:-"."}
    mydir=`(cd $mydir; pwd -P)`
    # Where to find configuration info
    BILDER_CONFDIR=$mydir/bilderconf
    # Subdir under INSTALL_ROOTDIR where this package is installed
    PROJECT_INSTSUBDIR=facets
    source $mydir/bilder/mkall-default.sh

    # Build the package
    runBilderCmd
    res=$?
    exit $res

The options,

    $ ./mkfcall-default.sh -h
    source /Users/cary/projects/facetsall/bilder/runnr/runnrfcns.sh
    WARNING: runnrGetHostVars unable to determine the domain name.
    Usage: ./mkfcall-default.sh [options]
    This script is meant to handle some of the vagaries that occur
    at LCFs and clusters in large systems (which have complicated file
    systems) such as those that have high performance scratch systems
    and NFS mounted home systems.  This script is also meant to ease
    the use of non-gfortran compilers.
    OPTIONS
      -c              common installations: for non-LCFS, goes into /contrib,
                      /volatile or /internal, for LCFSs, goes into group areas
      -C              Install in separate tarball and repo install dirs 
                      (internal/volatile) rather than software
      -E "<options>"  quoted list of extra options to pass to the mk script
      -f <file>       File that contains extra arguments to pass
                      Default: .extra_args
      -F <compiler>   Specify fortran compiler on non-LCF systems
      -g              Label the gnu builds the same way other builds occur.
      -H <host name>  use rules for this hostname (carver, surveyor, intrepid)
      -h              print this message
      -i              Software directory is labeled with "internal" if '$USER'
                      is member of internal install list
      -I              Install in $HOME instead of default location
                      (projects directory at LCFs, BUILD_ROOTDIR on non-LCFs)
      -j              Maximum allowed value of the arg of make -j
      -k              On non-LCFs: Try to find a tarball directory (/contrib)
                      On LCFs:     Install tarballs (instead of using facetspkgs)
      -m              force this machine file
      -n              invoke with a nohup and a redirect output
      -p              just print the command
      -q <timelimit>  run in queue if possible, with limit of timelimit time
      -t              Pass the -t flag to the  mk script (turn on testing)
      -v <file>       A file containing a list (without commas) of declared
                      environment variables to be passed to mk*.sh script
      -w <file>       Specify the name of a file which has a comma-delimited
                      list of packages not to build (e.g.,
                      plasma_state,nubeam,uedge) Default: .nobuild
      --              End processing of args for mkall-default.sh, all remaining
                      args are passed to the script.

mostly deal with which directory is to be used for installation, the time limit for the build, and any extra options to be passed to the build, whether on the command line or in a file.

An example invocation looks like

mkfcall-default.sh -cin -- -oXZ -E BUILD_ATLAS=true

which will (c) install in areas common to all users, (i) using the internal rather than the volatile directory for repo installations, (n) in background via nohup, -- what follows are more args for the base script, which are (o) build openmpi if on OS X or Linux, (X) build the newer, experimental packages, (Z) do not invoke the user defined bilderFinalAction method, (E) set this comma delimited list of environment variables, in this case to build Atlas if on Linux or Windows.
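
The role of the -- separator in that invocation can be illustrated with a short sketch. This is not the parser used by the defaults scripts; split_args is a hypothetical helper that collects the arguments before and after the first -- into two space-separated strings.

```shell
#!/bin/bash
# Sketch of how "--" divides a command line between a wrapper script and
# the script it calls. split_args is hypothetical, not Bilder's parser.
# Sets wrapper_args and base_args as space-separated strings.
split_args() {
  wrapper_args=
  base_args=
  seen_sep=false
  for arg in "$@"; do
    if [ "$seen_sep" = false ] && [ "$arg" = "--" ]; then
      seen_sep=true                      # first "--": switch sides
    elif [ "$seen_sep" = true ]; then
      base_args="${base_args:+$base_args }$arg"
    else
      wrapper_args="${wrapper_args:+$wrapper_args }$arg"
    fi
  done
}

split_args -cin -- -oXZ -E BUILD_ATLAS=true
echo "wrapper: $wrapper_args"
echo "base:    $base_args"
```

Joining arguments into plain strings loses quoting for arguments that contain spaces, which is acceptable for this illustration but not for a production wrapper.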

Using Jenkins with Bilder

Setting up Jenkins for use with Bilder

This set of pages is intended to describe how to set up the Jenkins continuous integration tools for launching Bilder jobs (which then handle the builds and testing). It is not intended to describe the most general way to set up Jenkins, but instead it describes a way that relies on having a Linux master node.

Starting up a Linux Jenkins master node

Install Jenkins using the installation mechanism for your platform. E.g., see
https://wiki.jenkins-ci.org/display/JENKINS/Installing+Jenkins+on+RedHat+distributions.

IMPORTANT: Before starting Jenkins for the first time:

  • create the directory where Jenkins will do its builds (known as JENKINS_HOME, not to be confused with the Jenkins home directory in /etc/passwd, which is initially set to /var/lib/jenkins, which we will assume here)
  • set the permissions of the Jenkins build directory (e.g., /home/bilder/jenkins)
  • Add jenkins to any groups as needed (e.g., contrib, research, xxusers)
  • modify /etc/sysconfig/jenkins as needed. Our settings are
    JENKINS_HOME="/home/bilder/jenkins"
    JENKINS_PORT="8300"
    JENKINS_AJP_PORT="8309"
    JENKINS_ARGS="--argumentsRealm.passwd.jenkins=somepassword --argumentsRealm.roles.jenkins=admin"

(somepassword is not literal.)

Create an ssh key for jenkins:

    sudo -u jenkins ssh-keygen

It cannot have a passphrase.

Start the jenkins service:

    sudo service jenkins start

Set Jenkins to start on boot:

    sudo chkconfig --level 35 jenkins on

Preparing a Unix Jenkins slave node

We will have one node prepared to act as a Jenkins slave for now. For ease, we will create a Unix slave. Later we will add more slaves.

  • On the service node, create the user who will run Jenkins.
  • As that user create the directory where Jenkins will work
  • Add that user to any groups needed to give it appropriate permissions (e.g., contrib, research, xxusers)
  • For public-key authentication
    • Add the public key created above for jenkins to that user's ~/.ssh/authorized_keys
    • On the master, check that you can do passwordless login by trying: "sudo -u jenkins ssh jenkins@yourhost"
  • For password authentication
    • Configure /etc/sshd_config to allow password authentication (PasswordAuthentication yes) and restart sshd
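
The public-key step above can be sketched as a small shell function. This is a minimal sketch, not part of Bilder: the function name install_pubkey and both of its arguments are hypothetical. It appends a key to authorized_keys only once and sets the restrictive permissions that sshd normally requires.

```shell
# Sketch: append a public key to a user's authorized_keys file.
# install_pubkey and its arguments are hypothetical names for illustration.
install_pubkey() {
  pubkey_file=$1   # copy of the public key generated for jenkins on the master
  home_dir=$2      # home directory of the build user on the slave
  mkdir -p "$home_dir/.ssh" || return 1
  chmod 700 "$home_dir/.ssh"
  touch "$home_dir/.ssh/authorized_keys"
  # Append the key only if an identical line is not already present.
  if ! grep -qxF -f "$pubkey_file" "$home_dir/.ssh/authorized_keys"; then
    cat "$pubkey_file" >> "$home_dir/.ssh/authorized_keys"
  fi
  chmod 600 "$home_dir/.ssh/authorized_keys"
}
```

Run as the build user on the slave, this would be invoked as, e.g., install_pubkey /tmp/jenkins_key.pub "$HOME", where the key path is only an example.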

Configuring the Linux Jenkins master node

  • Open a browser and go to master.yourdomain:8300 and log in as admin with the password that you set in the JENKINS_ARGS variable, above.
  • Go to Manage Jenkins -> Manage Plugins -> Available and install the plugins,
    • Jenkins cross-platform shell (XShell)
    • Conditional Build-Step
    • Matrix Tie Parent
    • Jenkins build timeout (Build-timeout)
  • Go to Manage Jenkins -> Manage Users and then use Create User to create the users for your Jenkins installation. Make sure to create an administrative user (perhaps yourself).
  • Go to Manage Jenkins -> Configure system and select/set
    • Enable security
    • Jenkins's own user database
    • If you wish, allow users to sign up
    • Project-based Matrix Authorization Strategy
      • Add an administrator name with all privileges
      • Give anonymous user Overall Read (only)
    • Default user e-mail suffix: e.g., @yourdomain
    • Sender: jenkins@yourdomain
  • Go to Manage Jenkins -> Manage Nodes -> New Node
    • Fill in name
    • Dumb Slave
    • You are taken to the configure form:
      • # of executors = 1
      • Remote FS root: what you decided upon when creating the slave
      • Usage: Leave this machine for tied jobs only
      • Launch method: Launch slave agents on Unix machines via SSH
      • Advanced:
        • Host
        • Username (jenkins)

Creating your first Bilder-Jenkins project

We will create the first project to build on the master node. Later we will add more nodes.

  • Go to Jenkins -> New Job
    • Build multi-configuration project
  • Set name (here we will do visitall as our example)
  • Enable project-based security
    • For open source builds, give Anonymous Job Read and Job Workspace permission
    • Add user/group as needed
  • Source Code Management
  • Build Triggers (examples)
    • Build Periodically
      • Enter cron parameters, e.g., 0 20 * * *
    • Or have this build launched as a post-build step of another build
  • Configuration Matrix
    • Add axis -> slaves (is this available before we add nodes?)
      • Add master and the above unix slave
  • Build Environment
    • Abort the build if stuck (if desired)
      • Enter your timeout
    • Tie parent build to a node
      • Select master node
  • Build
    • Add build step -> Invoke XShell command
      • Command line: bilder/bildtrol/unibild -d mkvisitall
      • Executable is in workspace dir
  • Post-build Actions
    • Aggregate downstream test results
      • Select both
    • Archive the artifacts (select, see below for settings)
    • E-mail Notification
      • Set as desired

Creating a Windows Slave

  • Get all tools in place on the slave machine by following the instructions at https://ice.txcorp.com/trac/bilder/wiki/BilderOnWindows
  • Create the jenkins user account (if not already defined) as an Administrative account and log into the windows machine as the jenkins user
  • Make sure the slave's Windows name and its domain name are consistent.
  • Install Java (see http://www.java.com) and update the path to include C:\Windows\SYSWOW64 if on 64 bit Windows and then C:\Program Files (x86)\Java\jre6\bin
  • Create the build directory (e.g., C:\winsame\jenkins)
  • Set the owner of that directory to the jenkins user via properties->security->advanced->owner.
  • Install the Visual C++ redistributables from http://www.microsoft.com/download/en/details.aspx?id=5582
  • Follow the networking, registry, and security related instructions at https://wiki.jenkins-ci.org/display/JENKINS/Windows+slaves+fail+to+start+via+DCOM
  • (Cribbing from https://issues.jenkins-ci.org/browse/JENKINS-12820)
  • Start a web browser on the windows slave and connect to the master jenkins web page.
  • Manage Jenkins -> Manage Nodes -> New Node
  • Create a new node (Slave node)
    • Fill in name, choose it to be the same as the Windows name of the slave
    • Dumb Slave
    • In the configure form, set
    • # of executors: 1
    • Remote FS root: the directory created above (e.g., C:\winsame\jenkins)
    • Usage: Leave this machine for tied jobs only
    • Launch method: Launch slave agents via java web start
  • Launch the slave
  • Press the newly appeared button: Launch by webstart
  • A pop-up window will appear with the message "Connected"
  • In that pop up window click File-> Install as Windows Service
  • Find jenkins service in the control panel, ensure that the owner is the jenkins user
  • UNCHECKED: Set startup to Automatic
  • Return to browser, take slave node off line in jenkins
  • Set launch method to: Windows slave as a Windows Service
    • Advanced:
    • Administrative username (jenkins). You may need to type it as computername\jenkins if you get an invalid service account error.
    • Password (set as selected in slave setup)
    • Use Administrator account given above
  • Relaunch slave node

Use the Slave on the Master

You should now be able to select this slave as a build node.

Launching Bilder through Jenkins

Jenkins runs Bilder through the scripts in the bildtrol subdir using the XShell command. The XShell command, when configured to launch somescript, actually invokes somescript.bat on Windows and somescript on unix. The Bilder .bat scripts simply translate the arguments and use them in a call to somescript, which is run through Cygwin.

Building and testing: jenkinsbild

The script, jenkinsbild, launches a build from Jenkins using the default scripts. For this example, we consider building the visualization package VisIT, for which the repo is https://ice.txcorp.com/svnrepos/code/visitall/trunk. This repo uses externals to bring in the VisIT source code. In this
case, the simplest XShell command is

    bilder/jenkins/jenkinsbild mkvisitall

which leads to execution of

    ./mkvisitall-default.sh -t -Ci -b builds-internal  -- -Z -w 7

The action of this command is described in the rest of the Bilder documentation. In particular, testing is invoked (-t), packages and repos are installed in separate areas (-C), the internal directory is used for repo installation (-i), no post-build action is taken (-Z), and the build is not executed if a build less than 7 days old is found (-w 7). The arguments after -- are passed directly to the underlying Bilder script, mkvisitall.sh.

The jenkinsbild script has very few options:

    Usage: $0 [options]
    GENERAL OPTIONS
      -b ................ Use internal/volatile build directory naming
      -m ................ Use mingw on cygwin
      -n ................ Do not add tests
      -p ................ Print the command only.
      -s step ..........  Sets which build step: 1 = internal, 2 = volatile.
      -2 ................ Sets build step to 2.

At present, the internal/volatile build directory naming is in fact always true. In this case, the first step (the default) builds in the subdir, builds-internal, and the second step (selected with -2 or -s 2) builds in the subdir, builds-volatile. Correspondingly, the repo installation directory is the internal directory on step 1 and the volatile directory on step 2.

Using mingw on cygwin (-m) is useful for codes that cannot build with Visual Studio.

Not adding the tests is useful in many instances where one is counting on only a few hosts to do testing.

Build step 2 (-s 2 or -2) builds in builds-volatile and installs in the volatile directory, but it also determines several options by looking at the email subject of any step-1 build.

This is geared towards a general philosophy of having two builds: the stable (or internal) build, which is done more rarely, and a volatile build, which is done every night. What is done in step 2 therefore depends on the step 1 result, which is determined from the email subject file that step 1 leaves behind. There are four cases:

  • Step 1 did nothing as there was a sufficiently recent build. Then step 2 does a full build with tests.
  • Step 1 was fully successful, both builds and tests. Then step 2 is not executed.
  • Step 1 builds succeeded, but some tests failed (and so some packages were not installed). Then step 2 is executed without testing, as that was done in step 1, and this permits installation of the built, but untested packages.
  • Step 1 builds failed (and so corresponding tests were not attempted). Then step 2 is not executed, as it will fail as well.
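
The four cases above amount to a small decision table. The following is a sketch only: the function name step2_action and the outcome keywords are invented for illustration, whereas jenkinsbild itself derives the step 1 outcome from the email subject.

```shell
# Hypothetical mapping from the step-1 outcome to the step-2 action.
# The keywords are invented for this sketch and are not jenkinsbild's own.
step2_action() {
  case "$1" in
    skipped)      echo "build-and-test" ;;  # step 1 found a sufficiently recent build
    success)      echo "skip" ;;            # builds and tests both passed
    tests-failed) echo "build-no-test" ;;   # install the built but untested packages
    build-failed) echo "skip" ;;            # step 2 would fail as well
    *)            echo "unknown"; return 1 ;;
  esac
}
```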

The exit code returned by jenkinsbild is success (0) even if only the builds succeeded and the tests did not. This way the dashboard indicates jenkinsbild build success only. A subsequent job, jenkinstest, determines whether tests passed by examining the email subjects left behind.

For either build step, one wants to archive the artifacts,

mk*all.sh,jenkinsbild.log,builds-*/bilderenv.txt,builds-*/*-summary.txt,\
builds-*/*.log,builds-*/*-chain.txt,*/*-preconfig.sh,*/preconfig.txt,\
builds-*/*/*/*-config.sh,builds-*/*/*/*-config.txt,\
builds-*/*/*/*-build.sh,builds-*/*/*/*-build.txt,\
builds-*/*/*/*-test.sh,builds-*/*/*/*-test.txt,\
builds-*/*/*/*-submit.sh,builds-*/*/*/*-submit.txt,\
builds-*/*/*/*-install.sh,builds-*/*/*/*-install.txt,\
*tests/*-config.sh,*tests/*-config.txt,*tests/*-build.sh,\
*tests/*-build.txt,*tests/*-install.sh,*tests/*-install.txt,\
*tests/runtxtest-*.txt,*tests/*-txtest.log,\
builds-*/*/*/*-Darwin-*.dmg,builds-*/*/*/*-win_x??-*.exe,\
builds-*/*/*/*-Linux-x86*-*.tar.gz

in order to collect all results of builds and tests and any created installers.

Posting test results: jenkinstest

Bilder Architecture

Bilder has a largely Object Oriented structure, even though it is written in Bash. But like all (even OO) programs, it has a procedural aspect. Further it is task oriented (as opposed to event driven), with clear start and conclusion. We will break this architecture down into these three aspects: the task flow, the primary objects, and the procedures.

Task flow

Bilder scripts, like mkvisitall.sh, begin by setting some identifying variables, BILDER_NAME, BILDER_PACKAGE, ORBITER_NAME, and then continue by sourcing bildall.sh, which brings in the Bilder infrastructure: initializations of variables and methods used for building, testing, and installing packages.

Global method definition

The file, bildall.sh, brings in all of the global methods by first sourcing runr/runrfcns.sh, which contains the minimal methods for executing builds in job queues and reporting the results. It then obtains all of the more Bilder-specific methods by sourcing bildfcns.sh. These include generic methods for determining the build system, preconfiguring, configuring, building, testing (including running tests and collecting results), and installing. These files are the heart of Bilder, as they do all the heavy lifting.

A trivial, but important method is techo, which prints output to both stdout and to a log file. Another is decho, which does the same, but only if DEBUG=true, which is set by the option -d.
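
As an illustration of the behavior just described, the two methods could look roughly like the following. This is a sketch, not the code in bildfcns.sh, and the variable name LOGFILE is an assumption.

```shell
# Sketch of techo and decho. LOGFILE is an assumed variable name;
# the real Bilder log-file variable may be named differently.
LOGFILE=${LOGFILE:-./bilder-sketch.log}
DEBUG=${DEBUG:-false}

# Print a message to stdout and append it to the log file.
techo() {
  echo "$@" | tee -a "$LOGFILE"
}

# Same as techo, but only when DEBUG=true.
decho() {
  if [ "$DEBUG" = true ]; then
    techo "$@"
  fi
}
```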

Option parsing

Options are parsed through the sourcing of bildopts.sh, which is sourced by bildall.sh. It then sets some basic command-line-argument derived variables, such as the installation directories, which it checks for writability. This file, bildopts.sh, has been written in such a way that Bilder-derived scripts (like mkvisitall.sh) can add their own arguments.

Initialization

Initialization is carried out by sourcing two files, bildinit.sh and bildvars.sh (which are both sourced by bildall.sh). The purpose of bildinit.sh is to handle timing, clear out indicating variables (like PIDLIST and configFailures), get the Bilder version, and define any path-like environment variables that might get changed in the course of the run.

The purpose of bildvars.sh is to determine useful variables for the build. Values are taken first from a machine file, if one is present, and then set by OS (AIX, CYGWIN, Darwin, or Linux; MinGW is a work in progress). Any variables still unset are then given default values. These variables contain the compilers for serial builds (front-end nodes), back-end nodes, parallel builds, and gcc (as some packages build only with gcc, and the names of the gcc compilers can vary from one system to another). The flags for all of these compilers are set as well.
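
The order of precedence (machine file, then OS, then defaults) can be expressed with the shell's assign-if-unset expansion. The sketch below assumes hypothetical names (MACHINE_FILE, SKETCH_CC, SKETCH_CFLAGS) and illustrative compiler choices; it does not reproduce the actual contents of bildvars.sh.

```shell
# Sketch of the precedence: machine file, then OS, then defaults.
# All variable names and compiler choices here are illustrative assumptions.
determine_compilers() {
  # 1. An optional machine file may set any variable.
  if [ -n "${MACHINE_FILE:-}" ] && [ -f "$MACHINE_FILE" ]; then
    . "$MACHINE_FILE"
  fi
  # 2. OS-specific values fill in what the machine file left unset.
  case "$(uname)" in
    AIX)     : "${SKETCH_CC:=xlc}" ;;
    CYGWIN*) : "${SKETCH_CC:=cl}" ;;
    Darwin)  : "${SKETCH_CC:=clang}" ;;
    Linux)   : "${SKETCH_CC:=gcc}" ;;
  esac
  # 3. Defaults cover anything that is still unset.
  : "${SKETCH_CC:=cc}"
  : "${SKETCH_CFLAGS:=-O2}"
}
```

Because ${VAR:=value} assigns only when VAR is unset or empty, a value set earlier in the chain is never overwritten by a later stage.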

There are some packages that are so basic that Bilder defines variables for them. These include HDF5, the linear algebra libraries (lapack, blas, atlas), and Boost. These definitions allow the locations of these libraries to be defined on a per machine basis. This is needed particularly for LCFs, which have special builds of HDF5, BLAS, and LAPACK, and for CYGWIN, which must have Boost to make up for deficiencies in the Visual Studio compilers.

Finally, bildvars.sh prints out all of the determined values.

Package building

A Bilder-derived script, like mkvisitall.sh, after sourcing bildall.sh, then builds packages in groups. In the simplest case, a package is built in a straight-through sequence, like

    source $BILDER_TOPDIR/bilder/packages/facets.sh
    buildFacets
    testFacets
    installFacets

(The call to testFacets can be omitted if the package is not tested.) The methods for building, testing, and installing a package are defined in the appropriate file under the packages subdirectory.

Bilder, however, has the capability of doing threaded builds, such as in the sequence,

    source $BILDER_TOPDIR/bilder/packages/trilinos.sh
    buildTrilinos
    source $BILDER_TOPDIR/bilder/packages/txphysics.sh
    buildTxphysics
    source $BILDER_TOPDIR/bilder/packages/txbase.sh
    buildTxbase
    installTxbase
    installTxphysics
    installTrilinos

In this case, all of the builds for Trilinos, TxPhysics, and TxBase are launched and so are occurring simultaneously. Then installTxbase waits for
the TxBase build to complete, then it installs it. Then it waits on and installs TxPhysics and Trilinos.

This ability to build multiple, non-interdependent packages simultaneously is a key feature of Bilder. It leads to great savings in time, especially with packages that must be built in serial due to a lack of dependency determination.
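
The underlying shell mechanism is background execution followed by wait on each process ID. The following sketch uses a stand-in function, fake_build, in place of real builds; none of these function names come from Bilder.

```shell
# Sketch: launch independent builds in the background, then wait on each.
# fake_build stands in for a real build: $1 = package name, $2 = seconds.
fake_build() {
  sleep "$2"
  echo "built $1"
}

# Wait for one build to finish and report whether it can be installed.
wait_and_install() {   # $1 = package name, $2 = PID of its build
  if wait "$2"; then
    echo "installed $1"
  else
    echo "build of $1 failed"
  fi
}

run_threaded_sketch() {
  # All three builds are launched and run simultaneously.
  fake_build trilinos 1 &
  trilinos_pid=$!
  fake_build txphysics 1 &
  txphysics_pid=$!
  fake_build txbase 1 &
  txbase_pid=$!
  # Installation waits on each build in turn, mirroring the sequence above.
  wait_and_install txbase "$txbase_pid"
  wait_and_install txphysics "$txphysics_pid"
  wait_and_install trilinos "$trilinos_pid"
}
```

With three one-second builds, run_threaded_sketch takes about one second rather than three, which is the time saving that the text describes.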

Concluding

The last part of the task flow is to install the configuration files, to summarize the build, and to email and post log files, build files, and the summary. The configuration files, which are created by createConfigFiles and installed by installConfigFiles into the installation directory, contain the necessary additions to environment variables to pick up the installed software.

The method, finish, then does the remaining tasks. It creates the summary file and emails it to the contact specified by the option parsing. It then posts all log and build files to Orbiter.

Package files

Package files define how a package is acquired, how it is configured for building on the particular platform for all builds, how all builds are done, and how they are all installed. Here we introduce an important distinction: the tarball packages are those obtained in the tar.gz format; the repo packages are obtained from a Subversion source code repo. Generic tarball packages are found in the Tech-X maintained Subversion repo at https://ice.txcorp.com/svnrepos/code/numpkgs and are available by anonymous svn. The repo packages are typically svn externals to a Bilder project, e.g., for visitall

visitall$ svn pg svn:externals .
bilder http://ice.txcorp.com/svnrepos/code/bilder/trunk
bilderconf http://ice.txcorp.com/svnrepos/code/bilder/bilderconf/trunk
visit http://portal.nersc.gov/svn/visit/trunk/src
visitwindows/distribution http://portal.nersc.gov/svn/visit/trunk/windowsbuild/distribution
visittest/data http://portal.nersc.gov/svn/visit/trunk/data
visittest/test http://portal.nersc.gov/svn/visit/trunk/test

Though written in Bash, Bilder uses object concepts. Each package file under packages acts as an object, with instantiation, exposed (public) data members, private data members, and methods. As in OO, these package-build objects have the same data members and a common interface.

Instantiation is carried out by sourcing the package file. At this point, the data associated with that package is initialized as necessary.

Package-build data

The public data members for a package PKG_ are

    PKG_BLDRVERSION # Either the version to install or the
                    # version from the code repository
    PKG_DEPS        # The dependencies of this package
    PKG_BUILDS      # The names of the builds for this package
    PKG_UMASK       # The "umask" that determines the permissions
                    # for installation of this package

In the syntax of C++, the first underscore would be represented by '.', i.e., pkg.DEPS. Even dynamic binding can be implemented in Bash. I.e., if one has pkgname that holds the name of a package, one can obtain the name of the corresponding variable, e.g., for BLDRVERSION, via

    vervar=`echo $pkgname | tr 'a-z./-' 'A-Z___'`_BLDRVERSION

Admittedly, many of these constructs would more easily be accomplished in a language like Python that naturally supports object orientation. The trade-off is that then one does not have the nearly trivial expression of executable invocation or threading that one has in Bash.
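
To complete the dynamic-binding example, the constructed name still has to be dereferenced to obtain the value. A minimal sketch follows; the function name pkg_bldrversion and the version value are hypothetical. It uses eval, which is portable across Bourne-style shells, although in Bash the indirect expansion ${!vervar} does the same thing.

```shell
# Sketch: build the variable name as in the text, then read its value.
# pkg_bldrversion is a hypothetical helper, not a Bilder method.
pkg_bldrversion() {
  pkgname=$1
  vervar=$(echo "$pkgname" | tr 'a-z./-' 'A-Z___')_BLDRVERSION
  # Expands to, e.g., echo "${FACETS_BLDRVERSION}"
  eval "echo \"\${$vervar}\""
}

# Hypothetical version value, for illustration only.
FACETS_BLDRVERSION=1.2.3
pkg_bldrversion facets   # prints 1.2.3
```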

In addition, there are the per-build, public variables PKG_BUILD_OTHER_ARGS (e.g., FACETS_PAR_OTHER_ARGS or BABEL_STATIC_OTHER_ARGS). These are added to the command line when configuring a package. In some cases, a package has more than one build system, like HDF5, in which case one has two sets of variables, e.g., HDF5_PAR_CMAKE_OTHER_ARGS and HDF5_PAR_CONFIG_OTHER_ARGS.

Exposed package-build methods

All package files are supposed to provide three methods, e.g., buildPkg, testPkg, and installPkg, where "Pkg" is the name of the package being built. E.g., FACETS has buildFacets, testFacets, installFacets. For untested packages, the second method can simply be empty.

The method, buildPkg, is supposed to determine whether a package needs to be built. If so, it should either acquire a tarball package or preconfigure (prepare the build system for) a repo package, then configure the package, and finally launch the builds of the package. Preconfiguring in the example of an autotools package involves invoking the autoreconf and other executables for creating the various configuration scripts. In many other cases there is no associated action. If the Bilder infrastructure is used, then all builds are executed in a separate thread, and at the end of the buildPkg method all the process IDs for these builds have been stored in the variable PIDLIST, while the particular process ID for build "ser" of package "pkg" is also stored in the variable PKG_SER_PID.

The method, testPkg, is supposed to determine whether a package is being tested. If not, it simply returns. But if the package is being tested, then testPkg executes wait for each build. Upon successful completion of
all builds, the tests are launched. These are treated just like builds, so the process IDs are stored as in the case of builds.

The last method, installPkg, in the case of a tested package, waits for the tests to complete, then installs the package if the tests completed successfully, after which it sets the tests as being installed, so that tests will not be run again unless the version or dependencies of the package change. In the other case, where the package is not being tested, it waits for the builds to complete and installs any successful builds.

All three methods for any package are supposed to compensate for any errors or omissions in the build systems. Such compensation includes fixing up library dependencies on Darwin, setting permissions of the installed software, and so forth.

The object-oriented analogy is that each package-build object has an interface with three methods. The syntax translation is buildPkg -> pkg.build.
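
To make the interface concrete, here is a skeleton of a package file for a hypothetical, untested package named "demo". The method bodies are placeholders: a real package file calls the generic Bilder methods from bildfcns.sh, which are not reproduced here, and the values assigned are invented.

```shell
# Skeleton package file for a hypothetical package "demo".
# Public data members (the values are invented for illustration).
DEMO_BLDRVERSION=${DEMO_BLDRVERSION:-0.1}
DEMO_DEPS=${DEMO_DEPS:-}
DEMO_BUILDS=${DEMO_BUILDS:-ser}

buildDemo() {
  # Placeholder for the "ser" build, launched in the background.
  sleep 1 &
  DEMO_SER_PID=$!
  PIDLIST="${PIDLIST:-} $DEMO_SER_PID"
}

testDemo() {
  :   # untested package: nothing to do
}

installDemo() {
  # Wait for the build, then install only if it succeeded.
  if wait "$DEMO_SER_PID"; then
    echo "demo-ser installed"
  else
    echo "demo-ser build failed"
    return 1
  fi
}
```

A driving script would source this file and then call buildDemo, testDemo, and installDemo in that order, as in the FACETS example. Note that installDemo must run in the same shell as buildDemo, since wait works only on child processes of the current shell.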

Private package-build data

In the course of its build, any package will generate other variables with values. These are organized on a per-build basis, and so one can think of each
package-build object as containing an object for each build of that package.

Internal objects

Builds

Tests

Combined package objects

Future directions

Dependency determination.

Linear Algebra Libraries in Bilder

There are a wide variety of ways to get LAPACK and BLAS: Netlib's libraries (reference LAPACK and BLAS), CLapack (for when one does not have a Fortran compiler), ATLAS (for cpu-tuned libraries), GOTOBLAS (from TACC), and system libraries (MKL, ACML).

For numpy and all things that depend on it, Bilder uses ATLAS (if it has been built), and otherwise it uses LAPACK.

For non Python packages, the current logic is

Darwin

Always use -framework Accelerate

Linux and Windows

  • SYSTEM_LAPACK_SER_LIB and SYSTEM_BLAS_SER_LIB are used if set.
  • Otherwise, if USE_ATLAS is true, then ATLAS is used.
  • Otherwise, use Netlib LAPACK if that is found.
  • Otherwise
    • If on Windows, use CLAPACK
    • If on Linux, use any system blas/lapack

The results of the search are put into the variables, CMAKE_LINLIB_SER_ARGS, CONFIG_LINLIB_SER_ARGS, LINLIB_SER_LIBS.
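
The selection order for non-Python packages can be summarized as a chain of tests. In this sketch, the function name choose_linlib and the flag NETLIB_LAPACK_FOUND are hypothetical; the latter stands in for Bilder's actual search for an installed Netlib LAPACK, and the echoed labels are placeholders for the real link arguments.

```shell
# Sketch of the library selection order for non-Python packages.
# NETLIB_LAPACK_FOUND is a hypothetical flag, not a Bilder variable.
choose_linlib() {
  os=$1   # "Darwin", "Linux", or "Windows"
  if [ "$os" = Darwin ]; then
    echo "accelerate"          # always -framework Accelerate
  elif [ -n "${SYSTEM_LAPACK_SER_LIB:-}" ] && [ -n "${SYSTEM_BLAS_SER_LIB:-}" ]; then
    echo "system-specified"
  elif [ "${USE_ATLAS:-}" = true ]; then
    echo "atlas"
  elif [ "${NETLIB_LAPACK_FOUND:-}" = true ]; then
    echo "netlib-lapack"
  elif [ "$os" = Windows ]; then
    echo "clapack"
  else
    echo "system-default"      # any system blas/lapack on Linux
  fi
}
```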

Extending Bilder

Bilder builds packages using the general logic in bildfcns.sh, the operating-system logic in bildvars.sh,
logic for a particular package in the Bilder package file (kept in the packages subdir), logic for a
particular machine in the Bilder machine file (kept in the machines subdir), and additional settings for a particular package on a particular machine in the Bilder machine file. To extend Bilder, one adds the files that introduce the particular logic for a package or a machine.

Debugging your builds

This section describes some of the things that can go wrong and explains how to fix them.

Bilder With IDEs

This is a page to collect notes on using IDEs to develop code while at the same time using bilder to build the project.

Updated by John Cary about 9 years ago · 6 revisions