Skip to content

Testing presents special challenges for packages that wrap an API. Here we tackle one of those problems: how to deal with auth in a non-interactive setting on a remote machine. This affects gargle itself and will affect any client package that relies on gargle for auth.

This article documents the token management approach taken in gargle. We wanted it to be relatively easy to have a secret, such as an auth token, that we can:

  • Use locally
  • Use with continuous integration (CI) services, such as GitHub Actions
  • Use with R-hub

all while keeping the secret secure.

The approach uses symmetric encryption, where the shared key is stored in an environment variable. Why? This works well with existing conventions for local R usage. Most CI services offer support for secure environment variables. And R-hub accepts environment variables via the env_vars argument of rhub::check().

This is based on an approach originally worked out in bigrquery.

Accessing the secret_*() functions

gargle’s approach to managing test tokens is implemented through several functions that all start with the secret_ prefix. These functions are not (currently?) exported. This may seem odd, since others might want to use these functions. But note they are only needed during setup or at test time. This sort of usage is compatible with others calling internal gargle functions and possibly inlining a version of a couple test helpers.

One way to make the secret_*() functions available for local experimentation is to call devtools::load_all(), which exposes all internal objects in a package:

devtools::load_all("path/to/source/of/gargle/")

The approach I’ll take in this article is to call these functions via :::.

Overview of the approach

  1. Add the sodium package to Suggests in your DESCRIPTION, via usethis::use_package("sodium", "Suggests") if you like.
  2. Generate a random PASSWORD and give it a self-documenting name, e.g. GARGLE_PASSWORD. Store as an environment variable.
  3. Identify a secret file of interest, such as the JSON representing a service account token. This is presumably stored outside your package.
  4. Use the PASSWORD to apply a method for symmetric encryption to the target file. Store the resulting encrypted file in a designated location within your package.
  5. Store or pass the PASSWORD as an environment variable everywhere you’ll need to decrypt the secret.
    • Check that the platform has support for keeping the PASSWORD concealed.
    • Make sure you don’t do anything in your own code that would dump it to, e.g., a log file.
  6. Rig your tests to determine if the key is available and, therefore, whether decryption is going to be possible.
    • If “no”, carry on gracefully with any tests that don’t require auth.
    • If “yes”, decrypt the secret and put the associated token into force globally for the test run or on an “as needed” basis in individual tests.

Annotated code-through

Generate a name for the PASSWORD

secret_pw_name() creates a name of the form “PACKAGE_PASSWORD”, a convention baked into the secret_*() family of functions.

(pw_name <- gargle:::secret_pw_name("gargle"))
#> [1] "GARGLE_PASSWORD"

Generate a random PASSWORD

In real life, you should keep the output of secret_pw_gen() to yourself! We reveal it here as part of the exposition.

(pw <- gargle:::secret_pw_gen())
#> [1] "cLldBLiCfJepovI0nPXuE6BKmU0N6TYntYwjROMa5f230gwuQE"

Define environment variable in .Renviron

Combine the name and value to form a line like this in your user-level .Renviron file:

GARGLE_PASSWORD=cLldBLiCfJepovI0nPXuE6BKmU0N6TYntYwjROMa5f230gwuQE

usethis::edit_r_environ() can help create or open this file. We strongly recommend using the user-level .Renviron, as opposed to project-level, because this makes it less likely you will share sensitive information by mistake. If you don’t take our advice and choose to store the PASSWORD in a file inside a Git repo, you must make sure that file is listed in .gitignore. This still would not prevent leaking your secret if, for example, that project is in a directory that syncs to DropBox.

Make sure .Renviron ends in a newline; the lack of this is a notorious cause of silent failure. Remember you’ll need to restart R or call readRenviron("~/.Renviron") for the newly defined environment variable to take effect.

Provide environment variable to other services

GitHub Actions:

Define the environment variable as an encrypted secret in your repo:

https://help.github.com/en/actions/configuring-and-managing-workflows/creating-and-storing-encrypted-secrets

Use the secrets context to expose a secret as an environment variable in your workflows. That will look like like so, in some appropriate place in your workflow file:

env:
  PACKAGE_PASSWORD: ${{ secrets.PACKAGE_PASSWORD }}

Remember that the secret, and therefore the associated environment variable, is not available when workflows are triggered via an external pull request. This is another reason to carry on gracefully when token decryption is not possible (see below).

Travis-CI

Define the environment variable in your repo settings via the browser UI:

https://docs.travis-ci.com/user/environment-variables/#defining-variables-in-repository-settings

Alternatively, you can use the Travis command line interface to configure the environment variable or even define an encrypted environment variable in .travis.yml.

Regardless of how you define it, remember that private environment variables are not available to external pull requests, which is another reason to carry on gracefully when token decryption is not possible (see below).

You may also need something like this in .travis.yml so that the sodium R package can be installed:

addons:
  apt:
    sources:
    - sourceline: 'ppa:chris-lea/libsodium'
    packages:
    - libsodium-dev

AppVeyor

Define the environment variable in the Environment page of your repo’s Settings. Make sure to request variable encryption and to click “Save” at the bottom. In the General page, you probably want to check “Enable secure variables in Pull Requests from the same repository only” and, again, explicitly “Save”.

As with Travis, it is also possible to encrypt the password using your AppVeyor account’s public key and inline the value in appveyor.yml. There is a helpful web UI for that does the encryption and generates the lines to add to your config:

https://ci.appveyor.com/tools/encrypt

This can also be found via Settings > Encrypt YAML.

R-hub

Send the environment variable in your calls to rhub::check() and friends:

rhub::check(env_vars = Sys.getenv(gargle:::secret_pw_name("gargle"), names = TRUE))

Encrypt the secret file

secret_write() takes 3 arguments:

  • package name. Processed through secret_pw_name() in order to retrieve the PASSWORD from an appropriately named environment variable.
  • name of the encrypted file to write. The location is below inst/secret in the source of package.
  • data, either a file path to the unencrypted secret file or the data to be encrypted as a raw vector. In the case of a secret file, we strongly recommend that its primary home on your local computer is outside your package and, generally, outside of any folder that syncs regularly to a remote, e.g. GitHub or DropBox. This decreases the chance of accidental leakage.

Example of a call to secret_write(), where gargle-testing.json is a JSON file downloaded for a service account managed via the Google API / Cloud Platform console:

gargle:::secret_write(
  package = "gargle",
  name = "gargle-testing.json",
  input = "a/very/private/local/folder/gargle-testing.json"
)

This writes an encrypted version of gargle-testing.json to inst/secret/gargle-testing.json relative to the current working directory, which is presumably the top-level directory of gargle’s source. This encrypted file should be committed and pushed.

Test setup

Now you need to rig your tests or their setup around this encrypted token. You need to plan for two scenarios:

  • Decryption is going to work. This is where you actually get to test package functionality against the target API, with auth.
  • Decryption is not going to work. Either because the Suggested sodium package is not available or (much more likely) because the environment variable that represents the key is not available.
    • This will be the case on CRAN, by definition, because there is no way to share an encrypted secret.
    • This will be the case for external contributors, on their personal machines and when their GitHub pull requests are checked via CI services, such as GitHub Actions, Travis-CI, or AppVeyor.

CI configuration

We recommend that you actively check your package under the “no decryption, no token” scenario, so that you discover problems before CRAN or your contributors do. In fact, this should probably be the default situation for your CI jobs and you only supply the secret to a single, flagship job – probably the check with the current R release and your favorite operating system.

Here’s the simplified build matrix from the R CMD check GitHub Actions workflow file used by gargle (note: we redacted a very long rspm URL that’s irrelevant here):

strategy:
  matrix:
    config:
      - {os: macOS-latest,   r: 'devel', gargle_auth: GARGLE_NOAUTH}
      - {os: macOS-latest,   r: '4.0',   gargle_auth: GARGLE_PASSWORD}
      - {os: windows-latest, r: '4.0',   gargle_auth: GARGLE_NOAUTH}
      - {os: ubuntu-16.04,   r: '4.0',   gargle_auth: GARGLE_NOAUTH, rspm: "..."}
      - {os: ubuntu-16.04,   r: '3.6',   gargle_auth: GARGLE_NOAUTH, rspm: "..."}
      - {os: ubuntu-16.04,   r: '3.5',   gargle_auth: GARGLE_NOAUTH, rspm: "..."}
      - {os: ubuntu-16.04,   r: '3.4',   gargle_auth: GARGLE_NOAUTH, rspm: "..."}
      - {os: ubuntu-16.04,   r: '3.3',   gargle_auth: GARGLE_NOAUTH, rspm: "..."}

env:
  GARGLE_PASSWORD: ${{ secrets[matrix.config.gargle_auth] }}

Notice how GARGLE_PASSWORD will only be available when checking against the released version of R on macOS-latest.

bigrquery implements the same idea with a different approach: it does not provide BIGRQUERY_PASSWORD to the main R CMD check GitHub Actions workflow at all. Instead there is a separate “live api” workflow that only has one job and that accesses the secret.

The tidyverse / r-lib team is transitioning from Travis/AppVeyor to GitHub Actions. But in case it is helpful, here’s a simplified excerpt from a .travis.yml file used by gargle in the past. The main r: release build accesses GARGLE_PASSWORD implicitly as an encrypted environment variable, but R CMD check runs for the other builds with GARGLE_PASSWORD explicitly unset:

matrix:
  include:
    - r: release
      # <stuff about code coverage, pkgdown build & deploy, etc.>
    - r: release
      env: GARGLE_PASSWORD=''
    - r: devel
      env: GARGLE_PASSWORD=''
    - r: oldrel
      env: GARGLE_PASSWORD=''

Regardless of your CI platform, your absolute best bet for writing configuration files is to look at what other R package developers are doing in their public source repos. All of the above is static, simplified, and (probably) stale and will never reflect the current state-of-the-art.

Testthat configuration

In a wrapper package, you could determine decrypt-ability at the start of the test run. Here’s representative code from googledrive’s tests/testthat/helper.R file, but something similar can be seen in bigrquery and googlesheets4:

if (gargle:::secret_can_decrypt("googledrive")) {
  json <- gargle:::secret_read("googledrive", "googledrive-testing.json")
  drive_auth(path = rawToChar(json))
}

Versions of secret_can_decrypt() and secret_read() are defined here in gargle. drive_auth() is a function specific to googledrive that loads a token for use downstream (in multiple tests, in this case). Note that it can clearly accept a JSON string, as an alternative to a filepath, and that’s very favorable for our workflow. We’ll come back to this below.

But what if secret_can_decrypt() returns FALSE and no token is loaded? That’s where you rely on a custom test skipper. Here is the test skipper from googledrive:

# googledrive
skip_if_no_token <- function() {
  testthat::skip_if_not(drive_has_token(), "No Drive token")
}

googledrive::drive_has_token() returns TRUE if a token is available and FALSE otherwise. By calling the skipper at the start of tests that require auth, you arrange for your package to cope gracefully when the token cannot be decrypted, e.g., on CRAN and in pull requests. It is typical to define such a skipper in tests/testthat/helper.R or similar.

gargle’s usage of the testing token is a bit different, still evolving, and less relevant to the maintainers of wrapper packages. Therefore it’s not featured here.

Known sources of friction

Once you dig into the secret_*() family, you will notice there are two recurring sources of friction:

  • File or object? You almost certainly store your secrets in files. But the sodium functions for data encrypt and decrypt work with R objects. So, for example, it is convenient if token ingest can accept an R object as opposed to only a file path.
  • Raw vectors. You might think of the PASSWORD or even the secret file itself (e.g., JSON) in terms of plain text. But the sodium functions for data encrypt and decrypt work with raw vectors, not character vectors. Be prepared to see related conversions in the secret_*() functions.

Functions useful for these conversions:

Resources

bigrquery and googledrive, which both use this approach.

“Managing secrets” vignette of httr:

Vignettes of the sodium package, especially the parts relating to symmetric encryption:

The cyphr package, which smooths over frictions like those identified above relating to “file vs. object?” and “character vs. raw?”: