Containerisation - Docker and Singularity
1 Background information
In the previous tutorial, we discussed why it is important that your codebase is under version control and backed up to help ensure that your analyses can be evaluated and replicated by you and others, both now and into the future. However, having access to the code (and data) does not always guarantee full reproducibility - this can also be affected by the exact software environment in which the code is run.
In the context of statistical analyses performed in R, for example, R as well as the various packages that you have elected to use in support of your analyses can (and do) evolve over time. Some functions get modified and some even get deprecated. Hence, over time, code that once worked perfectly (or at least adequately) can become broken.
Early solutions to this facet of reproducibility focused on virtual machines. A virtual machine (VM) builds an entire software environment on top of a software layer that mimics a physical computer, such that each VM running on a host computer is a completely separate, self-contained entity. Whilst VMs do permit great flexibility (as virtually any operating system can be installed on a VM), they are considerably slower and less efficient than physical machines. Moreover, it is typically necessary to allocate a fixed amount of computer resources (particularly CPU) to the VM in advance.
More modern solutions focus instead on containers. In contrast to VMs, containers do not mimic a physical computer; rather, they only virtualise layers on top of the host operating system. Indeed, containers share (read only) the host OS kernel and binaries/libraries, and thus containers and the applications contained therein can be very “light” and are typically almost as performant as applications run natively on the host.
Time for some container terminology:
Container image is a static (unchangeable) file (or collection of files) that bundles code and all its dependencies (such as the necessary system libraries, code, runtime and system tools). Essentially, the image has all the information required to reproduce a software environment on any compatible machine. However, an image is just a snapshot which serves as a template from which to build a container. In other words, a container is a running image and cannot exist without an image, whereas an image can exist without a container.
Container is a standard (Linux) process whose software environment is defined by the contents of a container image and that runs on top of the host’s OS.
2 Preparations
If you intend to follow along with this tutorial, you may like to:
- create a new folder (hereafter referred to as the sandpit folder) in which to create some files. On my local machine, I have a folder (`tmp`) in my home folder into which I will place a folder (called `docker_tests`) for this very purpose. On Linux and MacOSX that would be achieved via the following:

  ```
  mkdir ~/tmp/docker_tests
  ```

- install Docker
- install apptainer/singularity (if you intend to follow along with containers on an HPC).
3 Docker
The most popular container engine in use today is Docker. Docker is easy to install on most operating systems and comes with tools to build, manage, run and distribute container images (the latter of which is supported via the DockerHub container ecosystem).
3.1 Simple overview
Create a Docker definition file.

The `Dockerfile` contains a set of instructions that Docker uses to build your container with the correct specifications. For now you do not need to know all the bits and pieces here (though please see this link for a more in-depth understanding of what the `Dockerfile` is capable of).

Let's start with a very simple `Dockerfile` (which should be a plain text file located in the root of a project directory). This first example will be very minimal, and much of the rest of the Docker section of this tutorial will then progressively build on it to introduce more and more complexity.

The image we will build will start with a very minimal Debian Linux base called minideb sourced from Dockerhub. This provides a fully functioning Linux operating system complete with the typical terminal power tools. We will then extend the image by updating the package lists (locations of repositories) before adding (installing) a small fun terminal application (cowsay) that generates ASCII art of a cow (or other animals) along with a speech bubble.
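Such a `Dockerfile` might look like the following (a minimal sketch consistent with the description below; the maintainer details are placeholders):

```
FROM bitnami/minideb:stretch
LABEL maintainer="Author"
LABEL email="author_email@email.com"

## Install the os packages
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  cowsay \
  && rm -rf /var/lib/apt/lists/*
```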
In the above `Dockerfile`:

- the first three rows contain information on the base image on which to construct your docker container image (in this case the bitnami/minideb:stretch container freely provided by the bitnami team), as well as information about yourself. `minideb` is a minimal Debian operating system.
- the FROM command points to a parent image. Typically, this will point to a specific image within a registry on Docker hub. This generates the base layer of the container image.
- the LABEL command is used to add metadata to an image. In this case, there are two entries to specify information about the maintainer and their contact details. Note, entries must be key-value pairs and values must be enclosed in double quotes.
- the RUN command runs shell commands in a new layer on top of the current image and commits the result. Each RUN generates a new layer. The above example first updates the package lists and then installs an additional package (`cowsay`).
Build the docker image
In a terminal in the same location as the `Dockerfile` (i.e. your sandpit folder), enter something like the following:

```
docker build . --tag minideb
```

where:

- `.` indicates the path to use as the build context (in this case the current working directory, since `.` means current location). Any files within this path can be copied into the image or used in the build context (for example, a `Dockerfile`).
- `--tag minideb` provides a name (and optionally, a tag) for the image (in this case `minideb`). The name (and tag) can be anything, yet should be descriptive enough to help you distinguish this container image from other container images that you might construct on your system.

This will build a series of docker container images (each of the layers built upon the layer before) in a local registry.
More details
Usage:

```
docker build [OPTIONS] PATH | URL | -
```

Common options:

| Name | Description | Default |
|------|-------------|---------|
| `--file, -f` | Path and name of Dockerfile | `Dockerfile` |
| `--no-cache` | Do not use cache when building image | |
| `--tag, -t` | Name (and optional tag) in `name:tag` format | |

Typical files in context:

- `Dockerfile`: a build recipe file
- `.dockerignore`: similar to a `.gitignore`, this file lists files to be ignored in collating the files that form the build context.

As an alternative to providing build instructions in the form of a `Dockerfile`, build can accept a `URL` to a (remote) docker repository.

More info: https://docs.docker.com/engine/reference/commandline/build/

Check that the image(s) have been created
A list of images in your registry is obtained by:
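```
docker images
```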
Note, the above simply lists all named images in your local registry. To get a more complete list of all images:

```
docker images -a
```

The `-a` switch indicates all images (including unnamed and dangling images).

This list appears chronologically from bottom to top. Hence, the Dockerhub image (`bitnami/minideb`) appears at the bottom of this list, and above it there is a succession of intermediate images that correspond to each of the layers defined in the `Dockerfile`. Note, each successive image is progressively larger in size as each layer incorporates the layer below. At the top of the list is the full container image (with the `latest` tag).

Importantly, while we have built a container image, we do not yet have any running containers. We can demonstrate this by listing all existing containers:
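```
docker ps -a
```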
The output is empty (assuming you have not previously generated any containers), indicating that there are currently no running containers.
Test the docker image (fire up an ephemeral container)
We will now test the image by generating and running a container from our container image. Once the container has started, the `cowsay` terminal application will display an ASCII cow saying “Moo” before quietly terminating. The container will be automatically stopped and removed.

```
docker run --entrypoint ./usr/games/cowsay --rm minideb Moo
```

where:

- `--entrypoint ./usr/games/cowsay` defines the base command that will be run once the container has started. In this case, it specifies the full path to the `cowsay` executable file.
- `--rm` indicates that the container should be removed after it has finished running.
- `minideb` is the name of our docker container image.
- `Moo` is the string passed on to `cowsay` to display in the speech bubble. Feel free to experiment with other strings here.
To further appreciate the way arguments are passed on to applications within a container, let's alter the `cowsay` animal.

```
docker run --entrypoint ./usr/games/cowsay --rm minideb -f /usr/share/cowsay/cows/koala.cow Grunt
```

In the above example, we passed `-f /usr/share/cowsay/cows/koala.cow Grunt` on to `cowsay`. In this context, `-f` points to where the alternative animal definitions are located.

Test the docker image interactively
Rather than firing up a container, running some command and then immediately terminating, it is possible to run a container in interactive mode. In this mode, after the container starts up, you will be placed in a terminal where you can issue any available command you like. Once you have finished the interactive session, simply enter `exit` and the container will then terminate.

Try the following:

```
docker run --rm -it minideb
```

Once the prompt appears, try entering the following (within the container):

- list the files and folders in the current working directory: `ls -la`
- run the cowsay application: `./usr/games/cowsay Moo`
- exit the container: `exit`
Running in interactive mode is very useful when developing/debugging code on a container.
3.2 Some Dockerfile goodness
Let's now step this up a bit and add some more information to the build recipe. Rather than alter the previous `Dockerfile`, we will instead make a different file (`Dockerfile2`) and inform `docker` to build with this alternative `Dockerfile`.
```
FROM bitnami/minideb:bookworm
LABEL maintainer="Author"
LABEL email="author_email@email.com"

## Install the os packages
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  cowsay \
  && rm -rf /var/lib/apt/lists/*

## Default command to run
ENTRYPOINT ["/usr/games/cowsay","-f","/usr/share/cowsay/cows/koala.cow"]
## Default extra parameters passed to the command
CMD ["Grunt"]
```
In the above `Dockerfile`:

- ENTRYPOINT provides a default command to run within the container. This specification is in JSON format.
- CMD provides default extra parameters that are passed on to the command (also in JSON format). These can be overridden by passing an alternative when running the `docker run` command (see below).
If we now build our container image using this `Dockerfile2`:

```
docker build . --tag minideb -f Dockerfile2
```
If we again review the list of images, we see that there are now two additional intermediate images and the `latest` image has been updated.

```
docker images -a
```
We can now run the container as:
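```
docker run --rm minideb
```

This uses the baked-in ENTRYPOINT and CMD, so the koala appears saying “Grunt”.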
To override the default arguments (CMD) we baked into the docker image, we can issue an alternative as a command line argument.
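For example, to have the koala say “Moo” instead:

```
docker run --rm minideb Moo
```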
So far, we have used the docker container to display an ASCII art cow (or koala) in the terminal and then exit. Whilst this might have some utility as a simple example of interacting with containers, it hardly represents typical work.

In the context of reproducible research, containers are useful for providing a consistent environment in which to run code. Thus, in order to be useful, a container should have:

- access to (or a copy of) the code to run within the container
- the ability to store the results on the host where they can be viewed and disseminated.
To illustrate these, we will add the R Statistical and Graphical Environment to our container image and use this in two further examples.
Copy
For the first example, we will add instructions to the `Dockerfile` to copy a small R script (let's call it `analysis.R`) into the container so that the code can be run within the container environment. Let's create two files (the R script is shown after the description of the `Dockerfile` below):

- a `Dockerfile` called `Dockerfile3`
```
FROM bitnami/minideb:bookworm
LABEL maintainer="Author"
LABEL email="author_email@email.com"

## Install the os packages
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  r-base \
  && rm -rf /var/lib/apt/lists/*

## Copy the R script into the home folder (note, ~ is not expanded
## within a Dockerfile, so an explicit path is used)
COPY analysis.R /root/
WORKDIR /root/

## Default command to run
ENTRYPOINT ["Rscript"]
## Default command parameters
CMD ["analysis.R"]
```
This `Dockerfile` includes instructions to:

- install R (`r-base`)
- copy our R script from the current working directory to the home folder in the container (`COPY analysis.R /root/`; note that `~` is not expanded within a `Dockerfile`, hence the explicit path)
- set the working directory within the container to be the home folder (`WORKDIR /root/`)
- specify that once the container has started, the `Rscript` command should be run (`ENTRYPOINT ["Rscript"]`)
- specify that, by default, `Rscript` should be run on the `analysis.R` script (`CMD ["analysis.R"]`)
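- an R script called `analysis.R`. The exact script is not reproduced here; a minimal sketch consistent with the output shown later (random values displayed and exported to `dat.csv`) would be:

  ```
  ## Generate a small data frame of random values and display it
  dat <- data.frame(y = rnorm(10))
  print(dat)

  ## Export the data to a csv file
  write.csv(dat, file = "dat.csv")
  ```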
Great. This time when we build the container image, we will provide both a name and a tag for the image (via `--tag r:1`). This will result in an image called `r` with a tag of `1`.

```
docker build . --tag r:1 -f Dockerfile3
```
When we run this new image, we see that a data frame of 10 values is returned to the terminal.
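```
docker run --rm r:1
```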
Mount points
R did indeed run the `analysis.R` script inside the container. However, what happened to the file containing the exported data (`dat.csv`)? Although this file was created inside the container, it was completely lost when the container terminated. Obviously that is not very useful.
For the second example, rather than copy the R script to the container, we will instead mount a local folder to a point within the container. That way we can access select host files and folders within the container, thereby enabling us to both read the R script directly and write out any output files.
To support this, we will create another `Dockerfile` (`Dockerfile4`).
```
FROM bitnami/minideb:bookworm
LABEL maintainer="Author"
LABEL email="author_email@email.com"

## Install the os packages
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  r-base \
  && rm -rf /var/lib/apt/lists/*

WORKDIR /home/Project

## Default command to run
ENTRYPOINT ["Rscript"]
## Default command parameters
CMD ["analysis.R"]
```
The changes from the previous `Dockerfile`:

- remove the `COPY` statement, as we will not need to work on a copy of the R code; we can work on it directly
- change the container working directory to `/home/Project`. Note, this path does not yet exist in the image and will be created.
We will now build the container image with the name:tag of `r:2`:

```
docker build . --tag r:2 -f Dockerfile4
```
This time, when we run the container image, we will indicate a volume to mount (and a folder to mount this volume to). This will define which host folder to mount (map) and the path to mount this to within the container.
```
cd ~/tmp/docker_tests/
ls -lat
docker run --rm -v $(pwd):/home/Project r:2
```

```
total 28
drwxr-xr-x 2 runner docker 4096 Feb 17 04:04 .
-rw-r--r-- 1 runner docker  361 Feb 17 04:04 Dockerfile4
-rw-r--r-- 1 runner docker  369 Feb 17 04:04 Dockerfile3
-rw-r--r-- 1 runner docker   89 Feb 17 04:04 analysis.R
-rw-r--r-- 1 runner docker  403 Feb 17 04:04 Dockerfile2
-rw-r--r-- 1 runner docker  238 Feb 17 04:04 Dockerfile
drwxr-xr-x 3 runner docker 4096 Feb 17 04:04 ..
            y
1  -0.5948902
2  -1.0014230
3   0.6347289
4  -2.0479466
5   2.0936146
6   0.5159487
7  -0.3759240
8  -0.3827792
9  -0.7834866
10  0.1396498
```
where:

- `-v $(pwd):/home/Project` mounts the current working directory on the host machine to the `/home/Project` folder inside the container (which the `Dockerfile` defined as the working directory)
If we list the contents of our local folder, we see that the output file (`dat.csv`) has been created on the host filesystem.
```
cd ~/tmp/docker_tests/
ls -la
```

```
total 32
drwxr-xr-x 2 runner docker 4096 Feb 17 04:04 .
drwxr-xr-x 3 runner docker 4096 Feb 17 04:04 ..
-rw-r--r-- 1 runner docker  238 Feb 17 04:04 Dockerfile
-rw-r--r-- 1 runner docker  403 Feb 17 04:04 Dockerfile2
-rw-r--r-- 1 runner docker  369 Feb 17 04:04 Dockerfile3
-rw-r--r-- 1 runner docker  361 Feb 17 04:04 Dockerfile4
-rw-r--r-- 1 runner docker   89 Feb 17 04:04 analysis.R
-rw-r--r-- 1 root   root    185 Feb 17 04:04 dat.csv
```
Of course, it is possible to combine the above two approaches - one in which a copy of the codebase is packaged up into the container image (to ensure that a specific codebase is always applied), yet a mount point is also specified at run time to enable the output(s) to be obtained by the host.
The `.dockerignore` file is similar in structure to a `.gitignore` file in that they both define files (or patterns) to ignore. The purpose of a `.dockerignore` is to indicate which files and folders should be excluded from the docker build context. Excluding certain files and directories can:

- reduce image size by preventing unnecessary files from being copied (e.g., logs, temporary files)
- speed up builds by avoiding sending large or irrelevant files to the Docker daemon
- improve security by excluding sensitive files like `.env` or credentials
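For example, a `.dockerignore` along the following lines (the entries here are hypothetical) would keep version control metadata, logs and secrets out of the build context:

```
.git
*.log
tmp/
.env
```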
3.3 Building a reproducible R environment
Just as it might be important to be able to recreate the state of an operating system and software from a previous time (to maximise the potential for reproducibility), it might be equally important to ensure that the entire R environment reflects that same point in time. In the case of R, this means that all included packages should be the same versions that were available at that previous time.
Posit (the developers of Rstudio) provide daily snapshots of CRAN (the Posit Package Manager). We can therefore nominate a date when providing package installation instructions in a `Dockerfile`.
On UNIX based systems, such as Linux and MacOSX, many R packages need to be compiled from source during their installation. As such, they sometimes have additional external system dependencies. Normally, it is necessary to install these external dependencies prior to attempting to install the R packages. This is done via `apt-get install` (debian) instructions in the Dockerfile. Unfortunately, this can turn into a very iterative process of attempting to install an R package, examining the progress, looking out for any errors about missing dependencies, and then going back to the Dockerfile to add the appropriate dependencies before trying again.
The R package called `pak` is designed to install R packages and, if necessary, handle the installation of any additional external dependencies as well. This makes `pak` a very useful addition in Dockerfiles.
Some R packages have system dependencies that usually must be installed before attempting to install R packages.
For this example, we will start from an image that already has a version of R (4.2.2) built in (`rocker/r-ver:4.2.2`). Although this image is substantially larger than the mini debian we used earlier, it does come with all the build tools and R, each of which would otherwise require additional downloads and compilation. The net result is that the `rocker/r-ver:4.2.2` image requires less overall download traffic than the individual parts.
Note, this image will take substantially longer to make as it not only has to pull down a larger base, it then has to compile the entire tidyverse from source.
The changes from the previous `Dockerfile`:

- switch the base image to `rocker/r-ver:4.2.2`
- add numerous `-dev` developer package dependencies
- install the `tidyverse` collection of packages from a dated snapshot
```
FROM rocker/r-ver:4.2.2
LABEL maintainer="Author"
LABEL email="author_email@email.com"

RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  libxml2-dev \
  libcurl4-openssl-dev \
  libssl-dev \
  zlib1g-dev \
  && rm -rf /var/lib/apt/lists/*

## Install R package versions from Posit package manager (based on a date - YYYY-MM-DD)
RUN R -e "options(repos = \
  list(CRAN = \"https://packagemanager.posit.co/cran/2024-01-10/\")); \
  install.packages(\"pak\"); \
  "
RUN R -e "options(repos = \
  list(CRAN = \"https://packagemanager.posit.co/cran/2024-01-10/\")); \
  pak::pkg_install(c(\"tidyverse\")); \
  "

WORKDIR /home/Project

## Default command to run
ENTRYPOINT ["Rscript"]
CMD ["analysis.R"]
```
In the above `Dockerfile`:

- the first three rows contain information on the base image on which to construct your docker container image (in this case the rocker/r-ver:4.2.2 container freely provided by the rocker team), as well as information about yourself.
- the FROM command points to a parent image. Typically, this will point to a specific image within a registry on Docker hub. This generates the base layer of the container image.
- the LABEL command is used to add metadata to an image. In this case, there are two entries to specify information about the maintainer and their contact details. Note, entries must be key-value pairs and values must be enclosed in double quotes.
- the RUN command runs shell commands in a new layer on top of the current image and commits the result. Each RUN generates a new layer. In the above example, there are three RUN commands:
  - the first RUN command updates the package lists and installs some necessary system dependencies
  - the second RUN installs the `pak` package from a dated Posit Package Manager snapshot repository (based on the 10th of January 2024). This repository stores daily snapshots of all the packages on CRAN and thus allows us to obtain the set of packages in the state they existed on a nominated day.
  - the third RUN uses `pak` to install the `tidyverse` (which is technically a large collection of packages) from the same snapshot.
- the WORKDIR command sets the working directory for software within the container. In this case, we are creating a dedicated directory (`/home/Project`).
- the ENTRYPOINT command defines the default command/application to run within the container if the user does not provide a command. This is in JSON format. In this case, we are indicating that by default the container should run the `Rscript` application. The `Rscript` application runs a non-interactive `R` session on a nominated R script file. That is, it will run the nominated R script and then terminate on completion.
- the CMD command defines the default arguments to provide to the ENTRYPOINT command. In this case, we have indicated which R script to run `Rscript` on.
Note, many of the large collection of R packages targeted for install as part of the `tidyverse` ecosystem (or its dependencies) require full compilation. Whilst this does help ensure that the underlying package routines are optimised for your system, the entire install process may take up to an hour. This installation can be sped up substantially by instead installing a pre-bundled version of the `tidyverse` packages (and dependencies) directly from the Ubuntu repositories. The associated alternative `Dockerfile` is provided in the following expandable section.
Alternative Dockerfile
As an alternative, we could instead install tidyverse from the Ubuntu r-cran repository. This install will be far faster, yet likely not as up-to-date, and we would have a little less control over exactly which version we were installing…
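A sketch of what this alternative might look like (a hypothetical variant that takes both R and the pre-bundled `r-cran-tidyverse` package from the same distribution repositories, so that the apt-installed packages are visible to the apt-installed R):

```
FROM bitnami/minideb:bookworm
LABEL maintainer="Author"
LABEL email="author_email@email.com"

## Install R and a pre-built tidyverse bundle from the os repositories
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
  r-base \
  r-cran-tidyverse \
  && rm -rf /var/lib/apt/lists/*

WORKDIR /home/Project

## Default command to run
ENTRYPOINT ["Rscript"]
## Default command parameters
CMD ["analysis.R"]
```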
We will now build the container image with the name:tag of `r-tidyverse:1`. Note, this will take some time to complete.

```
docker build . --tag r-tidyverse:1 -f Dockerfile5
```
If we now run `docker` with this image, the resulting container will automatically run a non-interactive R session using the `analysis.R` script.

```
cd ~/tmp/docker_tests/
docker run --rm -v $(pwd):/home/Project r-tidyverse:1
```
where:

- `-v $(pwd):/home/Project` mounts the current working directory on the host machine to the `/home/Project` folder inside the container (which the `Dockerfile` defined as the working directory)
Conveniently, we can override the CMD command when we run `docker run`. In the current context, perhaps we would like to run the R session on a different R script. Let's try this by creating a new R script (`analysis5.R`) - this time, one that makes use of the `tidyverse` ecosystem.
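Since the actual `analysis5.R` is not reproduced here, a minimal sketch consistent with the output shown below might be:

```
library(tidyverse)

## Generate a small data frame of random values and display it
dat <- data.frame(y = rnorm(10))
print(dat)

## Summarise the data using dplyr
dat |>
  summarise(Mean = mean(y), Median = median(y)) |>
  print()
```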
Now we can nominate this alternative R script as the argument to the `Rscript` command.

```
cd ~/tmp/docker_tests/
docker run --rm -v $(pwd):/home/Project r-tidyverse:1 analysis5.R
```
```
── Attaching core tidyverse packages ──────────────────────── tidyverse 2.0.0 ──
✔ dplyr     1.1.4     ✔ readr     2.1.4
✔ forcats   1.0.0     ✔ stringr   1.5.1
✔ ggplot2   3.4.4     ✔ tibble    3.2.1
✔ lubridate 1.9.3     ✔ tidyr     1.3.0
✔ purrr     1.0.2
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()
ℹ Use the conflicted package (<http://conflicted.r-lib.org/>) to force all conflicts to become errors
            y
1  -0.5198849
2  -0.5561143
3   0.4415712
4  -0.2906648
5   1.8838446
6  -0.3950799
7   1.0785308
8   1.4462143
9  -0.4401129
10 -0.5984643
      Mean     Median
1 0.204984 -0.3428723
```
3.4 Managing Docker images
To list the most recently created images:

```
docker images
```

The output will include all created images along with the base images from which they are derived (those sourced from Dockerhub, for example).
The image entries whose `REPOSITORY` is `<none>` are dangling images. That is, they are intermediate images that were previously used in the building of an image and are no longer used (because the `Dockerfile` layer that they were associated with is no longer used in the latest version of that `Dockerfile`).
We can exclude them from the output by defining a filter that excludes dangling images.

```
docker images -f "dangling=false"
```
If instead of `dangling=false` you indicate `dangling=true`, only dangling images are displayed. This can be useful for identifying redundant images.
Another useful filter is to predicate on time relative to an image. For example, to display all images that were created since `minideb`:

```
docker images -f "since=minideb"
```

There is also a `before` version (e.g. `docker images -f "before=minideb"`).
Note, the above examples will exclude the intermediate images that are associated with build layers. If we want to see all images (including the intermediate images), use the `-a` switch.
To clean up the system (removing dangling images as well as stopped containers, unused networks and the build cache):

```
docker system prune
```

To also remove all unused (not just dangling) images:

```
docker system prune -a
```
To remove a specific image:

```
docker rmi <ID>
```

where `<ID>` is the `IMAGE ID` from `docker images -a`.

To remove all images whose names match a pattern:

```
docker images -a | grep <"pattern"> | awk '{print $3}' | xargs docker rmi
```

where `<"pattern">` is a regular expression for isolating the name of the image(s) to remove.

To remove all dangling images:

```
docker rmi -f $(docker images -f "dangling=true" -q)
```
4 Generating docker images on github
Up to now, the docker images we have created have been housed in a registry locally on the machine on which we built them. In the spirit of reproducibility, we ideally want these images to be available to others (as well as our future selves). There are numerous options for making images available:

- each user (or your future self) can build the image from the `Dockerfile`. In many cases, this will be a lengthy (yet automated) process. For this to be a viable option, each party will need the ability and resources to build docker images from a `Dockerfile`.
- the docker image could be hosted on a remote repository (such as dockerhub). This option requires that the original author of the docker image has a dockerhub account so that they can push the image that they created locally up to the remote registry.
- build and host the docker image on github. This option requires the original author/authors to have a github account.
Building Docker images on GitHub via GitHub Actions is useful because it enables automated, consistent, and reproducible builds directly from your remote repository. As such, it is possible to trigger a fresh image build every time there is a change to the repo itself. This reduces manual effort and minimizes deployment errors. With GitHub Actions, you can integrate CI/CD workflows, automatically test images, and push them to container registries (e.g., GitHub Container Registry or Docker Hub). It also enhances collaboration, as team members can track build logs, detect failures early, and enforce security best practices through automated vulnerability scanning. Finally, it also means that the one platform can be used to host both the code and the environment in which to run the code.
To have Github Actions build and publish a docker image, we start by creating a github actions workflow file. This file is in yaml format and should be placed in a directory called `.github/workflows` within your git repository.
```
name: Create and publish the Docker image

on:
  workflow_dispatch:
  push:
    branches: [ "main" ]

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-image:
    runs-on: ubuntu-latest
    if: "!contains(github.event.head_commit.message, '[ci skip]')"
    permissions:
      contents: read
      packages: write
    name: ${{ matrix.config.r }}
    strategy:
      fail-fast: false
      matrix:
        config:
          #- { r: devel }
          #- { r: next }
          - { r: 4.4.1 }
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@9ec57ed1fcdbf14dcef7dfbe97b2010124a938b7
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
      - name: Build and push Docker image
        uses: docker/build-push-action@v5
        with:
          context: .
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
```
In the above workflow file:

- name: gives a name to the workflow and is how it will be referred to within the Actions user interface
- on: workflow_dispatch: specifies that this workflow can be run when manually triggered from the GitHub Actions UI.
- on: push: branches: [ "main" ]: specifies that this workflow will be triggered whenever there is a pushed change to the "main" branch
- env: REGISTRY: sets the container registry to GitHub Container Registry (ghcr.io).
- env: IMAGE_NAME: sets the image name (in this case to the name of the github repository).
- jobs: build-image: runs-on: defines a job (build-image) that runs on the latest version of Ubuntu
- if: provides a way of defining exclusion rules. In this case it prevents execution if the latest commit message contains "[ci skip]". This is useful for avoiding unnecessary builds.
- permissions: contents: read: allows the workflow to read the repository contents.
- permissions: packages: write: grants permission to push Docker images to the Github Container Registry.
- strategy: fail-fast: false: ensures that if one matrix configuration fails, the remaining configurations are still run.
- strategy: matrix: config: defines a build matrix for testing against multiple R versions (only 4.4.1 is active). Commented-out lines suggest that devel and next versions were once considered.
- steps: the actions to perform:
  - uses: actions/checkout@v4 clones the repository so that the workflow can access files like the Dockerfile and any other code in the repository. It is using version 4 of this action.
  - uses: docker/login-action@v1 uses GitHub's built-in token (`GITHUB_TOKEN`) to authenticate with the Github Container Registry. This ensures that pushing images to the Github Container Registry is authorized.
  - uses: docker/metadata-action@… generates metadata (tags, labels) based on the repository and commit details. The specific commit hash (in this case 9ec57ed1…) pins a fixed version of docker/metadata-action.
  - uses: docker/build-push-action@v5 builds the Docker image from the repository and pushes the image to the Github Container Registry with the metadata-generated tags and labels.

Now any time there is a change to the main branch, the image will be rebuilt and published to the Github Container Registry.
To access the built image:

- navigate to the Code panel of the repository's Github page
- click on the item under the heading Packages

The displayed panel will provide information on the versions of the package available as well as instructions on how to obtain (pull) the image to a local registry from where it can be run.
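For example, pulling such an image looks something like the following (the owner, repository and tag shown here are placeholders):

```
docker pull ghcr.io/<owner>/<repository>:<tag>
```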