SQR-016: Stack release playbook

  • Frossie Economou

Latest Revision: 2023-07-13

Note

SQuaRE instructions for making official releases

1 The LSST Science Pipelines Release Process

The LSST stack is a large mixed Python/C++ codebase split among many repos with multiple dependencies between them built by a niche build system. The release process is therefore a bit complex and can be protracted. In an attempt not to have to co-ordinate the state of the release with 80+ developers, or worse, have them down tools while a release is being minted, this process has been optimized to impact development as little as possible. Unless a developer needs to be called upon to fix a failing integration or platform test, she can carry on as normal.

In broad terms, the conceptual steps are:

  1. A successful weekly release is identified that forms the basis for the official release.

  2. The first release candidate (rc1) is created using that weekly build as a seed. This means that it does not matter if the codebase’s master has moved on in the meantime.

  3. The rc is announced for developer testing.

  4. Additional rcs are created and announced, if and as necessary, to resolve any release-blocking issues that may arise.

  5. Documentation is updated on a branch (release notes, installation instructions, metrics report).

  6. The final release is created using the last rc verbatim (i.e., the accepted rc and final release code should be identical except for the name of the git/eups tags) and the documentation is merged to master.

1.1 Kicking-off the process

The first step is to establish whether it is a good time to make a release; e.g., if many developers are about to push significant features, it may be better to wait for them to finish. Announce the intent to make a release in the #dm channel and inform all DM T/CAMs and the DM System Engineer.

If it is generally a good time, identify the nearest weekly release; this will be the seed for the release candidate.

1.1.1 Announce start of release process

Next, start a clo post for status updates, using the clo-stubb as a template.

1.1.2 Branching the docs

At this point you should branch pipelines_lsst_io so as not to capture any changes on the master branch that may occur during the release process.

Note that the branch does not have a v prefix.

git clone https://github.com/lsst/pipelines_lsst_io.git
cd pipelines_lsst_io
git checkout -b 888.0.x
git push -u origin 888.0.x

1.1.3 Identify base weekly build

Identify the git tag of the weekly build you wish to base the release candidate upon, say w.9999.52. This should be determined by discussion with the product owner and team developer.

Example method for listing weekly git tags:

git clone https://github.com/lsst/lsst_distrib.git
cd lsst_distrib
git tag -l 'w.*'
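
Once the tags are listed, the most recent weekly can be picked out by sorting on year and week number. The following is a minimal sketch, assuming weekly tags always follow the w.YYYY.WW form shown above; the function name latest_weekly is hypothetical and not part of any LSST tooling. It only narrows down the candidates: the choice of seed is still made by discussion, as described above.

```python
import re

# Assumed form of a weekly git tag: w.<year>.<week>, e.g. w.9999.52
WEEKLY_RE = re.compile(r"w\.(\d{4})\.(\d{2})")


def latest_weekly(tags):
    """Return the most recent weekly tag in `tags`, or None if there is none."""
    weeklies = []
    for tag in tags:
        match = WEEKLY_RE.fullmatch(tag)
        if match:
            year, week = int(match.group(1)), int(match.group(2))
            weeklies.append(((year, week), tag))
    if not weeklies:
        return None
    # Tuples compare element-wise, so this orders by year and then by week.
    return max(weeklies)[1]
```

The input would typically be the lines printed by the git tag command above.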

1.2 1st Release Candidate

1.2.1 Build and Publish

Run the jenkins official-release job. The source git refs should be only the tag of the “seed” weekly release. The release tag must start with a v and end with .rc1.

See git-tags for details on the formatting of git tags.

Example:

SOURCE_GIT_REFS: w.9999.52
RELEASE_GIT_TAG: v888.0.0.rc1
O_LATEST: false
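
Before submitting the job, the parameters can be sanity-checked against the rules stated above. This is a sketch only: the function check_rc1_params and the exact tag patterns are assumptions made for illustration, and the O_LATEST check reflects the rule (given under Final Release) that the flag is only set on a final release.

```python
import re

# Assumed tag shapes, taken from the examples in this playbook.
WEEKLY_RE = re.compile(r"w\.\d{4}\.\d{2}")
RC1_RE = re.compile(r"v\d+\.\d+\.\d+\.rc1")


def check_rc1_params(source_git_refs, release_git_tag, o_latest):
    """Return a list of problems with the proposed rc1 job parameters."""
    problems = []
    refs = source_git_refs.split()
    if len(refs) != 1 or not WEEKLY_RE.fullmatch(refs[0]):
        problems.append("SOURCE_GIT_REFS must be exactly one weekly tag")
    if not RC1_RE.fullmatch(release_git_tag):
        problems.append("RELEASE_GIT_TAG must start with v and end with .rc1")
    if o_latest:
        problems.append("O_LATEST should be false for a release candidate")
    return problems
```

An empty list means that the parameters are consistent with the rules in this section.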

1.2.2 Announce rc1

Update the release status clo post to announce the availability of rc1.

1.3 2nd+ Release Candidate(s)

An .rcX, where X is > 1, is only required if a problem is found with the initial rc.

Any subsequent rc differs slightly from the initial rc1 process because it inherently is not identical to a previous git tag (if it were, there would be no reason to produce another rc). The creation of a git release branch prior to rc1 would eliminate the differences.

1.3.1 Branch, Merge

Any git repository that needs to be modified for additional rc releases should be branched and have the necessary changes merged to a release branch. For example, if changes were needed to v888.0.0.rc1, a “release branch” along the lines of v888.0.x should be created in the repos that need changes.

(TBD: merge to master and cherry-pick to release branch or merge to release branch and merge to master???)

1.3.2 Test Release Branch(es)

As a sanity check, run the jenkins stack-os-matrix job to verify that the release branch + previous rcX tag combination is buildable prior to attempting to build and publish the new rc.

Example 1:

REFS: v888.0.x v888.0.0.rc1
PRODUCTS: lsst_distrib lsst_ci

1.3.3 Build and Publish

Run the jenkins official-release job.

For input source git refs use the previous rc tag along with the release branch(es). Ensure that the release branch is specified to the left of the rcX tag in the listing of git refs.

Example 1:

SOURCE_GIT_REFS: v888.0.x v888.0.0.rc1
RELEASE_GIT_TAG: v888.0.0.rc2
O_LATEST: false

Example 2:

SOURCE_GIT_REFS: v888.0.x v888.0.0.rc5
RELEASE_GIT_TAG: v888.0.0.rc6
O_LATEST: false
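
The pattern in the two examples can be derived mechanically from the previous rc tag. A minimal sketch follows, assuming the release branch is named vM.m.x as in the examples above; next_rc_params is a hypothetical helper, not an existing script.

```python
import re

# Release candidate tags look like v<major>.<minor>.<patch>.rc<N>
RC_RE = re.compile(r"v(\d+)\.(\d+)\.(\d+)\.rc(\d+)")


def next_rc_params(previous_rc):
    """Build the official-release job parameters for the rc after `previous_rc`."""
    match = RC_RE.fullmatch(previous_rc)
    if match is None:
        raise ValueError(f"not a release candidate tag: {previous_rc!r}")
    major, minor, patch, rc = (int(group) for group in match.groups())
    # Assumption: the release branch follows the v<major>.<minor>.x naming.
    release_branch = f"v{major}.{minor}.x"
    return {
        # The release branch must be listed to the left of the rc tag.
        "SOURCE_GIT_REFS": f"{release_branch} {previous_rc}",
        "RELEASE_GIT_TAG": f"v{major}.{minor}.{patch}.rc{rc + 1}",
        "O_LATEST": "false",
    }
```

Calling next_rc_params("v888.0.0.rc5") reproduces the values of Example 2.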

1.3.4 Announce rcX

Again, the post made from the clo-stubb should be updated to announce the current rcX.

1.4 Final Release

Note that a Final Release differs from a Release Candidate in that the DM internal/first party git repositories receive a git tag that does not have an alphabetic prefix (e.g., v). This has the effect of changing the eups version strings, as lsst-build sets the eups product version based on the most recent git ref that has an integer as the first character.

A consequence of this behavior is that the final git tag must be present prior to the production of eupspkg/eups tag.
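
The selection rule described above can be illustrated in a few lines. This is not lsst-build's actual code, only a sketch of the stated rule; it assumes the refs are supplied ordered from most recent to oldest, and eups_version_from_refs is a hypothetical name.

```python
def eups_version_from_refs(refs_newest_first):
    """Return the first ref whose first character is a digit, or None."""
    for ref in refs_newest_first:
        # Slicing (rather than indexing) keeps an empty ref from raising.
        if ref[:1].isdigit():
            return ref
    return None
```

With only rc tags present, the rule falls through to an older numeric tag (or to nothing), which is why the final tag has to exist before the eups products are built.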

1.4.1 Build and Publish

Run the jenkins official-release job.

The source input must only be the latest rc tag.

The O_LATEST flag controls whether the produced science pipelines docker image has the -o_latest docker tags applied to it. This should only be set on a final release AND only if the release is the highest version release. For example, if 99.0.0 has been released and a 98.0.1 bugfix release is being made, O_LATEST should not be set.

Example 1:

SOURCE_GIT_REFS: v888.0.0.rc6
RELEASE_GIT_TAG: 888.0.0
O_LATEST: true
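
The O_LATEST rule amounts to a version comparison. A minimal sketch, assuming final release tags are purely numeric dotted strings such as 888.0.0; both function names are hypothetical.

```python
def parse_version(tag):
    """Turn a final release tag such as '888.0.0' into a tuple of ints."""
    return tuple(int(part) for part in tag.split("."))


def should_set_o_latest(new_release, existing_releases):
    """Return True only if `new_release` is higher than every existing release."""
    new = parse_version(new_release)
    return all(new > parse_version(old) for old in existing_releases)
```

For the case described above, should_set_o_latest("98.0.1", ["98.0.0", "99.0.0"]) returns False.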

1.4.2 Branch newinstall.sh repo

In this process we make use of the fact that git doesn’t care whether a ref is a tag or a branch to constrain the number of branches to repositories that need retroactive maintenance or need to be available in more than one cadence. One such example is the lsst repo since it contains newinstall.sh which sets the version of eups, and that may be different for an official release than the current bleed.

Note that the branch does not have a v prefix.

Branch the lsst repo:

git clone https://github.com/lsst/lsst.git
cd lsst
git checkout -b 888.0.x
git push -u origin 888.0.x

Now in lsst/scripts/newinstall.sh change the canonical reference for this newinstall to be one associated with the current branch:

NEWINSTALL_URL="https://raw.githubusercontent.com/lsst/lsst/888.0.x/scripts/newinstall.sh"

and commit and push.

This means that if you need to update newinstall.sh for bleed users, official-release users will not be prompted to update to the latest version, but will phone home against their official-release branch for hotfixes.

Also double-check for other things that might need to be updated, like the documentation links (though these should really be fixed on master prior to branching or cherry-picked back).

1.4.3 Documentation

Documentation to be collected for the release notes in pipelines_lsst_io is:

  • Release notes from the T/CAMs for Pipelines, SUI, and DAX

  • Characterization report from the DM or SQuaRE scientist

  • Known issues and pre-requisites from the T/CAM for SQuaRE

Before merging to master:

  • Ask the Documentation Engineer to review.

  • Update the newinstall.rst page on your release branch of pipelines_lsst_io with the new download location of the newinstall.sh script.

1.4.3.1 Documenting Deprecations and Removals

Deprecated interfaces may be removed in the major release after the one in which their deprecations first appear. These deprecations must be included in the release notes.

To identify all deprecations that have to be mentioned in a release note, we search the codebase for specific strings. The application ack is used here as a reference, since it is easy to install on Unix systems [1].

These are the strings to search:

  • python deprecations: ack -A 3 --python "@deprecated\(" stack/

  • pybind11 deprecations: ack -A 3 --python "deprecate_pybind11" stack/

  • C++ deprecations: ack -A 3 --cpp "\[deprecated\(" stack/

  • config deprecations: ack -B 3 --python "^\s+deprecated=" stack/
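
Where ack is not available, the same searches can be approximated in Python. This is a sketch under assumptions: the regexes are copied from the commands above, but the file suffixes are only a guess at what ack's --python and --cpp filters cover, and the helper names are hypothetical. Unlike ack -A 3 / -B 3, it reports only the matching line, without context.

```python
import re
from pathlib import Path

# Regexes copied from the ack commands above. The file suffixes are an
# assumption standing in for ack's --python and --cpp file-type filters.
PATTERNS = {
    "python": (re.compile(r"@deprecated\("), {".py"}),
    "pybind11": (re.compile(r"deprecate_pybind11"), {".py"}),
    "cpp": (re.compile(r"\[deprecated\("), {".cc", ".cpp", ".h", ".hpp"}),
    "config": (re.compile(r"^\s+deprecated="), {".py"}),
}
KNOWN_SUFFIXES = set().union(*(suffixes for _, suffixes in PATTERNS.values()))


def match_kinds(line, suffix):
    """Return the deprecation kinds whose pattern matches `line`."""
    return [
        kind
        for kind, (pattern, suffixes) in PATTERNS.items()
        if suffix in suffixes and pattern.search(line)
    ]


def find_deprecations(root):
    """Yield (kind, path, line_number, text) for every match under `root`."""
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in KNOWN_SUFFIXES:
            continue
        lines = path.read_text(errors="replace").splitlines()
        for number, line in enumerate(lines, start=1):
            for kind in match_kinds(line, path.suffix):
                yield kind, path, number, line.strip()
```

Iterating over find_deprecations("stack/") gives one tuple per hit, which can then be checked by hand against the release notes.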

Deprecated code should be removed in the major release following the one in which it was deprecated (though removal work is occasionally not scheduled in time for this to occur). Removals should be documented in another section of the release notes. The set of deprecations in the release notes for the previous release is probably the most complete list of code that might have been removed in a release, but each of these must be checked against Jira or the code itself to determine whether removal actually occurred. Deprecations that are not followed by removals should be left in the release notes as deprecations.

1.4.4 Announce official release

Announce the final release on clo.

1.5 Other OS checking

While we only officially support the software on certain platforms (RHEL/CentOS 7 is the reference, and we CI MacOS and RHEL 6), we also check a number of other popular platforms (e.g., Ubuntu, newer versions of CentOS, etc.) by spinning up machines on Digital Ocean (typically) and following the user install instructions. This also allows us to check the user from-scratch installation instructions, including the pre-requisites.

2 c.l.o stubb

Here is where we currently are in the release process. Current step in bold.

Release Precursor Steps
---------------------------------

1. Identify any pre-release blockers ("must-have features") :tools:
Contributors check if there are outstanding issues that have to be included in the next release and relate them as blocker to the above issue DM-XXXXX.
1. Wait until all blocking issues are resolved.
1. Create Jira issues for each release activity.
1. Check that the weekly build is scientifically suitable to be used as starting point for the release

Release Jira issue: https://jira.lsstcorp.org/browse/DM-XXXXX
Tentative weekly to use as starting point for the release is w_20YY_WW
Tentative target date to close the release is YYYY-MM-DD.

Release Engineering Steps
-------------------------------

1. Create first release candidate vM.m.p.rc1
1. Release candidate vM.m.p.rc1 available:
 - Build: bxxxxx
 - Weekly: w_20YY_WW
1. Build the release candidate on supported platforms. Report bugs in Jira if any.
1. Invite developers, contributors and downstream users to verify the release candidate and report bugs in Jira if any.
1. Wait for bugs and additional issues to be identified, fixed and ported to the release branch.
1. Create new release candidates if bugs / new issues have been fixed in the release branch
1. Create official release M.m.p

Documentation Steps
-------------------------

In parallel with the engineering steps after rc1 is available.
[Integration on b.M.x branch of pipelines_lsst_io](https://github.com/lsst/pipelines_lsst_io/pull/TBD)

1. Update Prereqs/Install
1. Gather Release notes
1. Gather Metrics report
1. Update Known Issues
1. Release availability community post

3 Github teams

There are three “special” teams in the LSST Github org:

  • Data Management

  • DM Externals

  • DM Auxilliaries

These are used in the release process in the following way:

  • Data Management repos are a dependency of lsst_distrib and should be tagged with the bare release version, eg. 888.0.0, unless the repo is also a member of the DM Externals team. All repos tagged as part of a release should be members of the Data Management team to ensure that DM developers are able to modify all components of a release.

  • DM Externals also indicates a dependency of lsst_distrib, but one that is tagged with a v prefix in front of the release version, e.g., v888.0.0. This is required because lsst-build derives the eups product version string from git tags that begin with a number. DM developers prefer that eups display the external package's own version string rather than that of a DM composite release. Thus the v prefix causes the git tag to be ignored by lsst-build. “External” repos must not also be members of DM Auxilliaries.

  • DM Auxilliaries are repos that we want to snapshot as part of a release but are not an eups dependency of lsst_distrib. “Aux” repos must not also be members of DM Externals.

4 Format of “tags”

4.1 git tags

  • DM produced code that is part of an “official” release must have a git tag that starts with a number

  • “official” release git tags on external/third-party software that DM has repackaged must be prefixed with a v but are otherwise identical to that on DM produced code. Eg., 888.0.0 -> v888.0.0

  • Non-“official” releases, release candidates, weekly builds, etc. must start with a letter

  • shall only use [a-z], [0-9], and .

    • lowercase latin alphabet characters shall be used; uppercase characters are forbidden

    • These common characters must not be used: -, _, /

Examples of valid (good) git tags

# unofficial builds
d.9999.01.02
w.9999.52

# release candidate
v888.0.0.rc99

# official release of DM produced code
888.0.0

# official release of external/third-party product
v888.0.0

Examples of invalid (bad) git tags

d_9999_01_02
w_9999_52
v888-0-0-rc99
888_0_0
v888_0_0
foo/bar
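
The character rule above is simple to check mechanically. A minimal sketch; the function names are hypothetical, and the "starts with a number" test only distinguishes official DM-produced release tags from everything else, as described in the list above.

```python
import re

# Only lowercase letters, digits, and "." are allowed in a git tag.
GIT_TAG_RE = re.compile(r"[a-z0-9.]+")


def is_valid_git_tag(tag):
    """Return True if `tag` uses only the permitted characters."""
    return GIT_TAG_RE.fullmatch(tag) is not None


def is_official_dm_tag(tag):
    """Return True for a valid tag that starts with a number."""
    return is_valid_git_tag(tag) and tag[0].isdigit()
```

All of the valid examples above pass is_valid_git_tag, and all of the invalid ones fail it.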

4.1.1 eups tags

  • must not start with a numeric value

  • shall only use [a-z], [0-9], and _

    • lowercase latin alphabet characters shall be used; uppercase characters are forbidden

    • EUPS reportedly has or has had problems with . and -

  • official releases and release candidates must be prefixed with v

Examples of valid (good) eups tags

# unofficial builds
d_9999_01_02
w_9999_52

# release candidate
v888_0_0_rc99

# official release of DM produced code AND external/third-party product
v888_0_0

Examples of invalid (bad) eups tags

123
d.9999.01.02
w.9999.52
v888_0-rc99
888.0.0
v888.0.0
foo/bar
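
The eups tag rules can be expressed as a single pattern. A minimal sketch, with is_valid_eups_tag as a hypothetical name; it checks the character set and the leading character only, not whether a release tag carries the required v prefix.

```python
import re

# Lowercase letters, digits, and "_" only; the first character must not be a digit.
EUPS_TAG_RE = re.compile(r"[a-z_][a-z0-9_]*")


def is_valid_eups_tag(tag):
    """Return True if `tag` satisfies the eups tag character rules."""
    return EUPS_TAG_RE.fullmatch(tag) is not None
```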

4.1.2 git <-> eups tag conversion

The “tags” along each row in the following table should be considered equivalent conversions.

internal git     external git     eups tag
d.9999.01.02     d.9999.01.02     d_9999_01_02
w.9999.52        w.9999.52        w_9999_52
v888.0.0.rc99    v888.0.0.rc99    v888_0_0_rc99
888.0.0          v888.0.0         v888_0_0
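
The conversions in the table follow two rules: "." and "_" are swapped, and an official release tag on DM produced code gains or loses the v prefix. A sketch of both directions, assuming non-empty, well-formed tags; git_to_eups and eups_to_git are hypothetical helper names.

```python
import re

# An official release eups tag: "v" followed by numeric fields only.
OFFICIAL_EUPS_RE = re.compile(r"v\d+(_\d+)*")


def git_to_eups(git_tag):
    """Convert an internal or external git tag to the equivalent eups tag."""
    eups_tag = git_tag.replace(".", "_")
    # eups tags must not start with a digit, so bare versions gain a "v".
    if eups_tag[0].isdigit():
        eups_tag = "v" + eups_tag
    return eups_tag


def eups_to_git(eups_tag, external=False):
    """Convert an eups tag to the git tag for an internal or external repo."""
    git_tag = eups_tag.replace("_", ".")
    # Only official releases of DM produced code drop the "v" prefix.
    if not external and OFFICIAL_EUPS_RE.fullmatch(eups_tag):
        git_tag = git_tag[1:]
    return git_tag
```

Note that the git-to-eups direction is not one-to-one for official releases: both 888.0.0 and v888.0.0 map to v888_0_0, so the reverse direction needs to know whether the repo is external.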

5 Conda Environment/Packages Update

There are conflicting pressures: updating the conda package list frequently minimizes the amount of [likely] breakage at any one time, but changes should be resisted because the git sha1 of the conda environment files is used to define the ABI of the eups tarball packages.

5.1 Adding a new Conda package

  1. The name of the package needs to be added to the “bleed” or un-versioned environment files in the lsst/scipipe_conda_env repo, which are:

    After the implementation of DM-17457, the conda environments have been migrated to yaml format. This permits adding pip packages to the environment definition.

    The bleed env files should be kept in sync with the exception of the nomkl package, which is required on linux. Also note that the env files should be kept sorted to allow for clean diffs.

  2. The regular conda env files need to be updated by running a fresh install with deploy -b (bleed install) and then manually exporting the env to a file. A side effect of this is that other package versions will almost certainly change, and this is an ABI breaking event. The existing env files are:

    conda list -e should be run on linux and osx installs and the results committed for both platforms as a single commit so that the abbrev sha1 of the latest commit for both files will be the same.

  3. As an abbreviated sha1 of the lsst/lsstsw repo is used to select which [version of] conda env files are used and to define the eups binary tarball “ABI”, jenkins needs to know this value to ensure that newinstall.sh is explicitly using the correct ref and to construct the paths of the tarball EUPS_PKGROOTs. The value of splenv_ref / LSST_SPLENV_REF needs to be updated at:

    Once a commit is present in the lsst/scipipe_conda_env repo (i.e., on an un-merged branch), the conda env may be tested by triggering the https://ci.lsst.codes/blue/organizations/jenkins/stack-os-matrix/activity job with the SPLENV_REF parameter set to the abbreviated sha1 of the candidate conda env.

  4. The ~last major release should be rebuilt in the new “ABI” EUPS_PKGROOT so that newinstall.sh from master will still be able to do a binary install of the current major release. This may be done by triggering a Jenkins release/tarball-matrix build.