Incremental Caching

Availability

Please contact support@solanolabs.com regarding availability of this feature.

For information on Solano CI’s dependency caching, please see Caching Dependencies.

Usage Scenarios

The primary use case for this feature is changing how Solano CI caches are saved and restored. Rather than creating a new cache from scratch when a change to specific files is detected, a previously built cache can be provided as a starting point. Starting from an existing cache can speed up build setup/preparation, and a new cache will be saved at the end of the build.

To benefit from this feature, the build must use dependency installation tasks (such as bundle install or npm install) and/or asset generation tasks that can reuse the results of previous runs to speed up build setup/preparation.

Usage

With incremental caching enabled, the setup hook tasks that prepare the build must be able to properly handle three distinct scenarios:

  1. When no cache is supplied.
  2. When a restored cache is complete and does not need to be updated.
  3. When a restored cache is incomplete and/or its contents need to be updated.
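
The three scenarios above can be told apart before any installation work starts. The following is a minimal sketch, not part of Solano CI: it assumes the setup hook writes a stamp file (the hypothetical .lock_checksum) into the cached directory recording the checksum of the lockfile the cache was built from. The function name cache_state and the stamp file name are illustrative only.

```shell
# Hypothetical helper: classify the restored cache before installing anything.
# Prints "none" (scenario 1), "complete" (scenario 2), or "stale" (scenario 3).
# Assumes a stamp file inside the cache directory holds the MD5 of the
# lockfile that the cache was originally built from.
cache_state() {
  local cache_dir="$1" lockfile="$2"
  local stamp="$cache_dir/.lock_checksum"   # illustrative file name

  # Scenario 1: nothing was restored (or the cache was never stamped).
  if [ ! -d "$cache_dir" ] || [ ! -f "$stamp" ]; then
    echo "none"
    return
  fi

  local current
  current="$(md5sum < "$lockfile" | cut -c1-32)"

  if [ "$(cat "$stamp")" = "$current" ]; then
    echo "complete"   # Scenario 2: cache matches the current lockfile.
  else
    echo "stale"      # Scenario 3: cache exists but must be updated.
  fi
}
```

A setup hook could branch on the result, for example skipping installation on "complete" and running the install and cleanup commands on "none" or "stale", then rewriting the stamp file once the update succeeds.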

While there can be a minor build time cost associated with allowing incremental caches to continue to grow, a more important reason to keep them minimal is to ensure the build/test environment accurately matches the development, staging, and/or production environment. Unused software packages (Ruby gems, Node modules, etc.) provided by an incremental cache should be removed (bundle clean, npm prune, etc.) as part of the build setup process. Unused assets provided by the cache can often be identified and removed by comparing their file paths with a manifest generated during the asset preparation/compilation process.
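
As an illustration of the manifest comparison described above, here is a minimal sketch. It assumes the manifest is a plain text file with one path per line, relative to the asset directory; real manifests (for example JSON manifests written by an asset pipeline) would first need to be converted to that form. The function name prune_unlisted is hypothetical.

```shell
# Hypothetical helper: delete every file under an asset directory that is
# not listed in a manifest. Assumes the manifest holds one relative path
# per line and is stored OUTSIDE the asset directory (otherwise it would
# need to list itself to survive).
prune_unlisted() {
  local dir="$1" manifest="$2"

  # List files relative to the asset directory, stripping the leading "./".
  ( cd "$dir" && find . -type f | sed 's|^\./||' ) |
  while IFS= read -r rel; do
    # -F: fixed string, -x: whole-line match, -q: quiet.
    if ! grep -Fxq -- "$rel" "$manifest"; then
      rm -f -- "$dir/$rel"
    fi
  done
}
```

Calling prune_unlisted public/assets tmp/asset_manifest.txt (both paths are placeholders) after asset compilation would leave only the files the current build actually produced.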

Dynamic Cache Keys

In addition to changing the cache lifecycle, this feature adds the ability to have cache objects controlled by scripts run at the beginning of a build.

Instead of relying solely on changes to files specified as key_paths in a solano.yml configuration file to determine when a cache should be invalidated (and a new one saved), a new update_scripts setting allows dynamically determining when a new cache should be saved.

Each update_scripts list item is keyed under a user-provided name. (The name has no special meaning; it is used only for logging and for system-generated messages.) Each item must include a key_script setting and a paths list containing one or more paths.

To inform Solano CI that a new cache should be saved at the end of the build, each key_script should provide a unique output when appropriate. This output typically will be the combined MD5/SHA-1 sum of a set of files, or the hash of a manifest file.
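
For the manifest-file variant mentioned above, a key script can be very small. This is a sketch under the assumption that the manifest is a single file committed to the repository; the function name manifest_key is hypothetical, and SHA-1 is used only because the text names it as an option.

```shell
# Hypothetical key script body: print the SHA-1 of one manifest file.
# Reading from stdin keeps the file name out of the sha1sum output,
# so only the 40-character digest is printed.
manifest_key() {
  sha1sum < "$1" | cut -c1-40
}
```

Identical manifest contents always produce the same key, so a new cache would be saved only when the manifest itself changes.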

The paths list should include files and directory paths that will be stored at the end of the build when a new cache is saved.

Examples

With a cache configuration like the following, Solano CI will save a new cache when the scripts/calculate_key_hash.sh script results in a different output, due to a change in a file in the app/assets directory. Please note that changes to key_paths files will cause the whole cache to be updated.

cache:
  # Changes to key path files will update the full cache
  key_paths:
    - Gemfile
    - Gemfile.lock
  save_paths:
    - HOME/bundle
    - HOME/.gem
  update_scripts:
    assets:  # user-provided name, can be used for logging/messaging
      key_script: ./scripts/calculate_key_hash.sh app/assets
      paths:
        - REPO/public/assets
        - REPO/tmp/cache/assets

The following cache configuration will trigger a new cache to be saved when there is a change to any of the files or directories passed as parameters to the scripts/calculate_key_hash.sh key script.

cache:
  update_scripts:
    ruby: # user-provided name, can be used for logging/messaging
      key_script: ./scripts/calculate_key_hash.sh Gemfile Gemfile.lock
      paths:
        - HOME/bundle
        - HOME/.gem
    node:
      key_script: ./scripts/calculate_key_hash.sh package.json bower.json
      paths:
        - REPO/node_modules
        - REPO/bower_components
    assets:
      key_script: ./scripts/calculate_key_hash.sh app/assets
      paths:
        - REPO/public/assets
        - REPO/tmp/cache/assets
  # Empty arrays are provided to override Solano CI defaults
  key_paths: []
  save_paths: []

Example scripts/calculate_key_hash.sh key script:

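#!/bin/bash -e
# Calculate a hash from the files and directories supplied as arguments.
# Sorting by path keeps the key stable even if find returns files in a different order.
find "$@" -type f -exec md5sum {} \; | LC_ALL=C sort -k 2 | md5sum | cut -c1-32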

Notes

key_script is evaluated after the repository has been checked out but before any cache objects have been downloaded and before any packages have been installed. Therefore, if the command or script has any non-default dependencies, it must install them itself. We recommend using a shell script where possible to avoid dependencies.

The format of paths is a list of path specifiers:

  • Paths can be directories or filenames
  • By default, paths are relative to the repo root
  • To make the path relative to the home directory, prefix it with HOME/. You can also explicitly use prefix REPO/ to indicate a repo-relative path.
  • Paths cannot contain .. to reference parent directories
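
The path rules above can be illustrated with a small resolver. This is only a sketch of the documented rules, not Solano CI's actual implementation; the function name resolve_cache_path and its arguments are hypothetical.

```shell
# Hypothetical illustration of the path specifier rules:
#   HOME/...  -> relative to the home directory
#   REPO/...  -> relative to the repo root (explicit)
#   anything else -> relative to the repo root (default)
#   any ".." path component -> rejected
resolve_cache_path() {
  local spec="$1" repo_root="$2" home_dir="$3"

  # Wrap in slashes so ".." is detected at the start, middle, or end,
  # while names that merely contain dots (e.g. "a..b") are allowed.
  case "/$spec/" in
    */../*)
      echo "error: '..' is not allowed in cache paths: $spec" >&2
      return 1
      ;;
  esac

  case "$spec" in
    HOME/*) echo "$home_dir/${spec#HOME/}" ;;
    REPO/*) echo "$repo_root/${spec#REPO/}" ;;
    *)      echo "$repo_root/$spec" ;;
  esac
}
```

For example, with a repo root of /repo and a home directory of /home/u, HOME/bundle would resolve to /home/u/bundle, while both REPO/public/assets and public/assets would resolve to /repo/public/assets.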

With incremental caching enabled, caches are restored in the following order of priority:

  1. When the output values of all of the key_scripts match, the matching cache will be restored.
  2. When the outputs differ, the most recently saved cache for the branch will be restored.
  3. When no branch-specific cache can be restored, the most recently saved cache from a separate branch of the repo will be restored.
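
The restore priority above can be modeled as a simple selection over the saved caches. The sketch below is an illustration only and does not reflect how Solano CI stores caches: it assumes each saved cache is described by one line of the form "<saved_epoch> <branch> <combined_key>", and it assumes an exact key match is accepted regardless of branch, which the text above does not specify. The function name pick_cache is hypothetical.

```shell
# Hypothetical model of the restore priority.
# stdin: one cache per line, "<saved_epoch> <branch> <combined_key>"
# args:  current branch, combined key_script output for this build
pick_cache() {
  local branch="$1" key="$2" list hit

  list="$(sort -rn)"   # newest first, by the leading timestamp

  # 1. Exact key match (assumption: branch is not considered here).
  hit="$(printf '%s\n' "$list" | awk -v k="$key" '$3 == k { print; exit }')"

  # 2. Otherwise, the most recently saved cache for this branch.
  [ -n "$hit" ] || hit="$(printf '%s\n' "$list" | awk -v b="$branch" '$2 == b { print; exit }')"

  # 3. Otherwise, the most recently saved cache from any other branch.
  [ -n "$hit" ] || hit="$(printf '%s\n' "$list" | awk 'NF { print; exit }')"

  # Non-zero exit status when there is no cache to restore at all.
  [ -n "$hit" ] && printf '%s\n' "$hit"
}
```

When no line is selected, the function prints nothing and returns a non-zero status, which corresponds to the "no cache is supplied" scenario the setup hooks must handle.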

Incremental caching can be enabled on a Solano account/organization level only.