Docker Bake — Deep Dive
Overview
This reference covers Docker Bake from internals to production patterns. It assumes you have read [[docker-bake-beginner-guide|Docker Bake Beginner Guide]] and are comfortable with basic docker-bake.hcl syntax.
What you will find here:
- HCL language features: variables, computed values, functions, conditionals
- Target inheritance and DRY patterns
- Matrix builds for publishing multiple versions in one pass
- Advanced cache strategies (registry, GitHub Actions, S3)
- Remote native builders — the key to fast amd64 builds without QEMU
- Rocky Linux 8.10 HPC deployment patterns
- Singularity/Apptainer integration for cluster environments
- CI/CD integration (GitHub Actions, GitLab CI)
- Debugging bake configurations
Prerequisites
- Docker Desktop ≥ 4.20, or Docker Engine ≥ 24 with the Buildx plugin
- A working multi-platform builder — see [[docker-bake-beginner-guide|Beginner Guide Step 1]]
- Access to a container registry (Docker Hub, GHCR, or private)
- Familiarity with Docker multi-stage builds
- Optional: A Rocky Linux 8.10 HPC cluster or access to one via SSH
Key Concepts
How Bake Resolves Configuration
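When you run docker bake, Bake (a Buildx feature, not BuildKit itself) locates its configuration as follows:
- Files named explicitly with -f. When -f is given, the default files below are not picked up automatically.
- Otherwise, each default file that exists in the current directory is loaded and merged: Compose files such as docker-compose.yml first, then docker-bake.json, then docker-bake.hcl.
- docker-bake.override.json and docker-bake.override.hcl are merged last, on top of everything else.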
Multiple files can be merged with repeated -f flags:
docker bake -f base.hcl -f overrides.hcl --push
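As a rough illustration of what merging does, assume the two files below. Their contents are hypothetical (the file names come from the command above; the image name and values are made up). For the same target, an attribute set in the later file wins:

```hcl
# base.hcl (hypothetical contents)
variable "TAG" {
  default = "latest"
}

target "app" {
  context   = "."
  platforms = ["linux/amd64"]
  tags      = ["ghcr.io/myorg/app:${TAG}"]
}

# overrides.hcl (hypothetical contents, passed as the second -f)
target "app" {
  # Replaces the platforms list from base.hcl; context and tags are kept
  platforms = ["linux/amd64", "linux/arm64"]
}
```

Adding --print to the command is a cheap way to confirm the merged result before building.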
HCL vs JSON vs Compose
| Format | Pros | Cons |
|---|---|---|
| HCL | Variables, functions, expressions | HCL-specific syntax to learn |
| JSON | Any language can generate it | No native variables or logic |
| Compose | Already in your project | Limited to Compose schema subset |
Prefer HCL for anything beyond a single static target.
The Build Graph
When you have multiple targets in a group, Bake builds them in parallel by default. Targets that share a common base layer benefit from BuildKit's shared layer cache — the base is pulled once and reused across targets running concurrently.
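When one target must be built before another, Bake can express that dependency through a named context that points at another target. The sketch below assumes a hypothetical Dockerfile.base and two service directories; each service Dockerfile is assumed to start with FROM base:

```hcl
group "default" {
  targets = ["api", "worker"]
}

# Hypothetical shared base image; built once and reused by both services
target "base" {
  context    = "."
  dockerfile = "Dockerfile.base"
}

target "api" {
  context = "./services/api"
  # Makes the result of target "base" available as the named context "base"
  contexts = {
    base = "target:base"
  }
}

target "worker" {
  context = "./services/worker"
  contexts = {
    base = "target:base"
  }
}
```

With this layout the two services still build in parallel, but only after the base target has finished.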
Step-by-Step Instructions
Step 1 — HCL Language Features
Variables
Variables accept a default and are overridden by environment variables of the same name:
variable "REGISTRY" {
default = "ghcr.io/myorg"
}
variable "GIT_SHA" {
default = "" # empty means: must be supplied via env in production
}
REGISTRY=myregistry.io GIT_SHA=abc1234 docker bake --push
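If an empty GIT_SHA should stop the build instead of silently producing a bad tag, a validation block is one option. This is a sketch that assumes a recent Buildx release with variable validation support; check your version before relying on it:

```hcl
variable "GIT_SHA" {
  default = ""

  # Assumption: requires a Buildx version that supports validation blocks
  validation {
    condition     = GIT_SHA != ""
    error_message = "GIT_SHA must be set via the environment."
  }
}

target "app" {
  tags = ["ghcr.io/myorg/app:${GIT_SHA}"]
}
```

The trade-off is that local builds now also need GIT_SHA set, so this suits release pipelines better than day-to-day development.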
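Computed values
Bake has no locals block (that is a Terraform construct), and a local.name reference will not resolve. For a computed value, declare another variable whose default is an expression over earlier variables. Such a variable can still be overridden from the environment, so choose a name that is unlikely to collide:
variable "TAG" { default = "latest" }
variable "IS_RELEASE" { default = TAG != "latest" }
variable "CACHE_REPO" { default = "${REGISTRY}/cache" }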
Functions
Bake exposes a set of built-in functions from the go-cty standard library (split, join, replace, upper, lower, and others):
variable "VERSION" { default = "1.2.3" }
# Split "1.2.3" into ["1", "2", "3"] and take the major component
variable "MAJOR" { default = split(".", VERSION)[0] }
target "app" {
  # All version tags to push
  tags = [
    "${REGISTRY}/myapp:${VERSION}",
    "${REGISTRY}/myapp:${MAJOR}",
    "${REGISTRY}/myapp:latest",
  ]
}
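Besides the built-ins, Bake lets you declare your own functions with a function block. The helper below, image_ref, is a hypothetical name used only for illustration; it shows how a repeated tag pattern can be factored out:

```hcl
variable "REGISTRY" {
  default = "ghcr.io/myorg"
}

# Hypothetical helper: builds a full image reference from a name and a tag
function "image_ref" {
  params = [name, tag]
  result = "${REGISTRY}/${name}:${tag}"
}

target "app" {
  tags = [
    image_ref("myapp", "1.2.3"),
    image_ref("myapp", "latest"),
  ]
}
```

With the default REGISTRY, the two tags resolve to ghcr.io/myorg/myapp:1.2.3 and ghcr.io/myorg/myapp:latest.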
Conditionals
variable "CI" { default = "" }
target "app" {
# Only use registry cache in CI; avoid polluting cache from developer laptops
cache-from = CI != "" ? ["type=registry,ref=${REGISTRY}/cache:app"] : []
cache-to = CI != "" ? ["type=registry,ref=${REGISTRY}/cache:app,mode=max"] : []
}
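The same ternary form works for any attribute, not only cache settings. As a sketch (the image name is a placeholder), the target below builds a single platform and loads it locally outside CI, and builds two platforms and pushes them when CI is set:

```hcl
variable "CI" {
  default = ""
}

target "app" {
  tags = ["ghcr.io/myorg/app:dev"] # placeholder image name

  # CI publishes both architectures; local builds stay on one platform
  platforms = CI != "" ? ["linux/amd64", "linux/arm64"] : ["linux/amd64"]

  # Push from CI, load into the local image store otherwise
  output = CI != "" ? ["type=registry"] : ["type=docker"]
}
```

Both branches of a conditional should have the same type (here, a list of strings).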
Step 2 — Target Inheritance
Use inherits to share common configuration across targets. This keeps your bake file DRY:
# Shared base configuration — not a real build target
target "_common" {
dockerfile = "Dockerfile"
platforms = ["linux/amd64"]
cache-from = ["type=registry,ref=${REGISTRY}/cache:common"]
cache-to = ["type=registry,ref=${REGISTRY}/cache:common,mode=max"]
}
target "api" {
inherits = ["_common"]
context = "./services/api"
tags = ["${REGISTRY}/api:${TAG}"]
}
target "worker" {
inherits = ["_common"]
context = "./services/worker"
tags = ["${REGISTRY}/worker:${TAG}"]
# Override a field from _common
platforms = ["linux/amd64", "linux/arm64"]
}
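A value set in the child replaces the inherited one for scalars (dockerfile, context) and for lists such as tags and platforms: the inherited platforms list above is completely replaced, not extended. Map attributes such as args and labels are merged key by key, so a child can override one build argument and keep the rest. Use docker bake --print to confirm how a particular attribute resolved.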
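To see how a list and a map behave differently under inherits, consider this small sketch. The argument names are made up for the example:

```hcl
target "_common" {
  platforms = ["linux/amd64"]
  args = {
    PYTHON_VERSION = "3.11"
    BUILD_ENV      = "prod" # hypothetical build argument
  }
}

target "worker" {
  inherits = ["_common"]

  # Replaces the inherited platforms list entirely
  platforms = ["linux/arm64"]

  args = {
    # Overrides one key; PYTHON_VERSION is still inherited from _common
    BUILD_ENV = "staging"
  }
}
```

Running docker bake --print worker should show a single platform (linux/arm64) and both build arguments, with BUILD_ENV set to staging.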
Step 3 — Matrix Builds
Matrix builds let you produce multiple variants of a target from a single target definition. Docker Bake expands the matrix into N individual build jobs and runs them in parallel.
Versioned builds
target "app" {
  # Target names may only contain letters, digits, "_" and "-", so replace the dot
  name = "app-${replace(version, ".", "-")}"
  matrix = {
    version = ["3.9", "3.10", "3.11"]
  }
  args = {
    PYTHON_VERSION = version
  }
  tags = ["${REGISTRY}/myapp:py${version}-${TAG}"]
}
docker bake --push
# Builds: app-3-9, app-3-10, app-3-11 in parallel
Cross-product matrix
target "app" {
  # Dots are not valid in target names, so "3.10" becomes "3-10"
  name = "app-${replace(python_version, ".", "-")}-${distro}"
  matrix = {
    python_version = ["3.10", "3.11"]
    distro = ["rocky8", "ubuntu22"]
  }
  dockerfile = "Dockerfile.${distro}"
  args = {
    PYTHON_VERSION = python_version
  }
  tags = ["${REGISTRY}/myapp:${python_version}-${distro}-${TAG}"]
}
This generates four targets: app-3-10-rocky8, app-3-10-ubuntu22, app-3-11-rocky8, app-3-11-ubuntu22.
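A full cross product is not always wanted. One way to build only selected combinations is a single matrix axis whose values are objects. The sketch below assumes the same Dockerfile.rocky8 and Dockerfile.ubuntu22 files and skips the 3.10 on ubuntu22 combination:

```hcl
target "app" {
  # replace() keeps dots out of the generated target names
  name = "app-${replace(item.python, ".", "-")}-${item.distro}"
  matrix = {
    item = [
      { python = "3.10", distro = "rocky8" },
      { python = "3.11", distro = "rocky8" },
      { python = "3.11", distro = "ubuntu22" },
    ]
  }
  dockerfile = "Dockerfile.${item.distro}"
  args = {
    PYTHON_VERSION = item.python
  }
}
```

This expands to three targets: app-3-10-rocky8, app-3-11-rocky8, and app-3-11-ubuntu22.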
Step 4 — Advanced Cache Strategies
Proper caching is the single biggest lever for build performance.
Registry cache (recommended)
Registry cache stores the BuildKit cache as a separate artifact in your registry, under a ref distinct from the image itself. It survives across machines and ephemeral CI environments:
target "app" {
cache-from = ["type=registry,ref=${REGISTRY}/cache:app-amd64"]
cache-to = ["type=registry,ref=${REGISTRY}/cache:app-amd64,mode=max"]
}
mode=max exports cache for the layers of every stage, not just the layers that end up in the final image. It is slower to write, but later builds can reuse intermediate stages even when the final stage changes.
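cache-from accepts several sources, which allows a fallback chain. In the sketch below, BRANCH is a hypothetical variable that CI would set to a tag-safe branch name; a feature branch reads its own cache first and falls back to the cache written by main:

```hcl
variable "REGISTRY" {
  default = "ghcr.io/myorg"
}

# Hypothetical: set by CI; must be valid as part of an image tag (no "/")
variable "BRANCH" {
  default = "main"
}

target "app" {
  cache-from = [
    "type=registry,ref=${REGISTRY}/cache:app-${BRANCH}",
    "type=registry,ref=${REGISTRY}/cache:app-main",
  ]
  # Each branch writes only to its own cache ref
  cache-to = ["type=registry,ref=${REGISTRY}/cache:app-${BRANCH},mode=max"]
}
```

Keeping writes per branch avoids feature branches overwriting the cache that main depends on.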
GitHub Actions cache
If you run builds in GitHub Actions, use the gha cache backend to persist layer cache in Actions' cache storage:
target "app" {
cache-from = ["type=gha,scope=app-amd64"]
cache-to = ["type=gha,scope=app-amd64,mode=max"]
}
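This requires no extra registry and counts against the repository's GitHub Actions cache quota. The gha backend needs the Actions runtime URL and token in the environment; a plain run: step does not receive them by default, so expose them first (for example with crazy-max/ghaction-github-runtime) or run Bake through docker/bake-action.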
S3 / Azure blob cache
For self-hosted CI with access to object storage (the example uses S3; the Azure backend is type=azblob with its own parameters):
target "app" {
cache-from = ["type=s3,bucket=my-build-cache,region=us-east-1,prefix=app/"]
cache-to = ["type=s3,bucket=my-build-cache,region=us-east-1,prefix=app/,mode=max"]
}
Inline cache (simpler, less powerful)
Inline cache embeds the cache manifest directly into the pushed image. It is easier to set up but only stores the final layer cache (equivalent to mode=min):
target "app" {
cache-from = ["type=registry,ref=${REGISTRY}/myapp:${TAG}"]
args = {
BUILDKIT_INLINE_CACHE = "1"
}
}
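An alternative to the build argument is to request the inline exporter explicitly through cache-to. This is a sketch with placeholder defaults, not a statement of which form your Buildx version prefers:

```hcl
variable "REGISTRY" {
  default = "ghcr.io/myorg"
}

variable "TAG" {
  default = "latest"
}

target "app" {
  tags = ["${REGISTRY}/myapp:${TAG}"]

  # Read cache metadata embedded in the previously pushed image
  cache-from = ["type=registry,ref=${REGISTRY}/myapp:${TAG}"]

  # Embed cache metadata into the image being pushed
  cache-to = ["type=inline"]
}
```

Either way, the cache only becomes available to other machines once the image itself has been pushed.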
Step 5 — Remote Native Builders (Key for HPC Workflows)
QEMU emulation is the main bottleneck when building linux/amd64 on Apple Silicon. The solution for large projects is to add a native amd64 builder — a remote machine running Rocky Linux or another x86_64 host — to your BuildKit builder pool.
Adding a remote builder node
If you have SSH access to an amd64 Linux machine (or an EC2 instance):
# Create a multi-node builder: local arm64 + remote amd64
docker buildx create \
--name hybrid \
--driver docker-container \
--bootstrap \
--use
docker buildx create \
--name hybrid \
--append \
ssh://user@amd64-host.example.com \
--platform linux/amd64
With this hybrid builder, Bake sends the linux/amd64 build to the remote native host and the linux/arm64 build to your local machine, so neither architecture needs QEMU.
Verify both nodes appear:
docker buildx inspect hybrid
Expected output (abbreviated):
Name: hybrid
Driver: docker-container
Nodes:
Name: hybrid0
Endpoint: unix:///var/run/docker.sock
Platforms: linux/arm64, linux/arm/v7
Name: hybrid1
Endpoint: ssh://user@amd64-host.example.com
Platforms: linux/amd64, linux/386
Your docker-bake.hcl is unchanged — Bake automatically routes each platform to the correct builder node.
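For reference, a target that exercises both nodes needs nothing builder-specific. The sketch below uses a placeholder image name and assumes the hybrid builder is the selected one (which the --use flag above arranges):

```hcl
target "app" {
  context = "."

  # One build per platform; each is scheduled on a node that supports it natively
  platforms = ["linux/amd64", "linux/arm64"]

  tags = ["ghcr.io/myorg/app:latest"] # placeholder image name
}
```

Because the result is a multi-platform image, push it to a registry rather than loading it into the classic local image store.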
Using a native amd64 CI runner
For CI/CD pipelines that need fast amd64 builds but do not have a persistent builder, a hosted x86_64 runner already is a native amd64 builder:
# .github/workflows/build.yml (abbreviated)
jobs:
  build:
    runs-on: ubuntu-latest  # x86_64 GitHub runner, native amd64
    steps:
      - uses: actions/checkout@v4  # Bake needs the bake file and the build context
      - uses: docker/setup-buildx-action@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - run: |
          TAG=${{ github.sha }} docker bake --push
On a native amd64 GitHub runner, there is no QEMU overhead — the linux/amd64 target builds at full hardware speed.
Step 6 — Rocky Linux 8.10 HPC-Specific Patterns
Matching the cluster environment
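Rocky Linux 8.10 ships with glibc 2.28. A container carries its own glibc, so the mismatch matters when binaries cross that boundary: if a binary linked against a newer glibc (e.g., Ubuntu 24.04 ships glibc 2.39) is copied out of the image or otherwise run against the cluster's own libraries, it fails with an error of this form (myapp is a placeholder and the symbol version varies):
./myapp: /lib64/libc.so.6: version `GLIBC_2.34' not found (required by ./myapp)
Always use rockylinux:8.10 (or rockylinux:8) as your runtime base when targeting Rocky 8.10 clusters.
# ✓ Correct: runtime glibc matches cluster
FROM rockylinux:8.10
# ✗ Risky: glibc 2.39 vs cluster glibc 2.28
FROM ubuntu:24.04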
Python on Rocky Linux 8
Rocky Linux 8 ships Python 3.6 as the system default. AppStream provides Python 3.8 and 3.9 as module streams (python38, python39), while Python 3.11 and 3.12 are regular non-modular packages (python3.11, python3.12):
FROM rockylinux:8.10
# Python 3.11 is a non-modular package, so no "dnf module enable" step is needed
RUN dnf install -y python3.11 python3.11-pip python3.11-devel && \
    dnf clean all
# Make python3 point to 3.11
RUN alternatives --set python3 /usr/bin/python3.11
EPEL and additional packages
For scientific software, EPEL (Extra Packages for Enterprise Linux) is often needed:
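FROM rockylinux:8.10
# EPEL packages can depend on PowerTools, which is disabled by default
RUN dnf install -y epel-release dnf-plugins-core && \
    dnf config-manager --set-enabled powertools && \
    dnf install -y \
      hdf5 \
      hdf5-devel \
      openmpi \
      openmpi-devel && \
    dnf clean all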
Running Docker images under Singularity/Apptainer
Many HPC clusters run Singularity (now Apptainer) rather than Docker for security reasons. Your Docker images work directly:
# Pull a Docker image into a Singularity Image File (.sif)
singularity pull myapp.sif docker://yourusername/myapp:latest
# Run it
singularity run myapp.sif
# Run with bind mounts (equivalent to Docker -v)
singularity run --bind /data:/data myapp.sif
# Run with GPU (equivalent to Docker --gpus all)
singularity run --nv myapp.sif
The bake pipeline is unchanged — you build and push a standard OCI image, and singularity pull handles the conversion.
See [[isaaclab-metagrasp-apptainer-hpc-beginner-guide|IsaacLab + Apptainer HPC Guide]] for a full example of this workflow with GPU workloads.
Practical Examples
Example 1 — Production-Ready Bake File for HPC
A complete docker-bake.hcl for a data science application deployed to a Rocky Linux 8.10 HPC cluster, built from an Apple Silicon Mac:
# docker-bake.hcl
variable "REGISTRY" { default = "ghcr.io/myorg" }
variable "IMAGE_NAME" { default = "hpc-app" }
variable "TAG" { default = "latest" }
variable "CI" { default = "" }
# Computed values (Bake has no locals block, so these are ordinary variables)
variable "FULL_IMAGE" { default = "${REGISTRY}/${IMAGE_NAME}" }
variable "IS_CI" { default = CI != "" }
group "default" {
  targets = ["app"]
}
# "app-matrix" resolves to the group of targets generated by the matrix below
group "all-versions" {
  targets = ["app-matrix"]
}
# Standard single-tag build
target "app" {
  context = "."
  dockerfile = "Dockerfile"
  platforms = ["linux/amd64"]
  tags = [
    "${FULL_IMAGE}:${TAG}",
  ]
  cache-from = IS_CI ? ["type=gha,scope=${IMAGE_NAME}"] : []
  cache-to = IS_CI ? ["type=gha,scope=${IMAGE_NAME},mode=max"] : []
}
# Matrix build across Python versions
target "app-matrix" {
  # Dots are not valid in target names, so "3.9" becomes "3-9"
  name = "${IMAGE_NAME}-py${replace(python_ver, ".", "-")}"
  matrix = {
    python_ver = ["3.9", "3.11"]
  }
  context = "."
  dockerfile = "Dockerfile"
  platforms = ["linux/amd64"]
  args = {
    PYTHON_VERSION = python_ver
  }
  tags = ["${FULL_IMAGE}:py${python_ver}-${TAG}"]
  cache-from = IS_CI ? ["type=gha,scope=${IMAGE_NAME}-${python_ver}"] : []
  cache-to = IS_CI ? ["type=gha,scope=${IMAGE_NAME}-${python_ver},mode=max"] : []
}
# Dockerfile
# syntax=docker/dockerfile:1
ARG PYTHON_VERSION=3.11
# Build stage: runs for the target platform (linux/amd64), so installed wheels match the runtime architecture
FROM python:${PYTHON_VERSION}-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir --prefix=/install -r requirements.txt
# Runtime: Rocky Linux 8.10, matching the HPC cluster
FROM rockylinux:8.10
ARG PYTHON_VERSION=3.11
# 3.9 is the python39 module stream; 3.11 and 3.12 are non-modular python3.X packages
RUN if [ "${PYTHON_VERSION}" = "3.9" ]; then \
      dnf module enable -y python39 && \
      dnf install -y python39 python39-pip; \
    else \
      dnf install -y "python${PYTHON_VERSION}" "python${PYTHON_VERSION}-pip"; \
    fi && \
    dnf clean all
# Copy the installed packages out of the builder stage
COPY --from=builder /install /usr/local
COPY . /app
WORKDIR /app
ENTRYPOINT ["python3", "main.py"]
Example 2 — Debugging Bake Configuration
Print the resolved build configuration without actually building:
docker bake --print
Sample output:
{
"group": {
"default": {
"targets": ["app"]
}
},
"target": {
"app": {
"context": ".",
"dockerfile": "Dockerfile",
"platforms": ["linux/amd64"],
"tags": ["ghcr.io/myorg/hpc-app:latest"],
"cache-from": [],
"cache-to": []
}
}
}
This is invaluable when debugging variable resolution — the JSON shows exactly what Bake will pass to BuildKit.
Example 3 — GitHub Actions CI/CD Pipeline
# .github/workflows/docker-bake.yml
name: Build and Push
on:
  push:
    branches: [main]
  release:
    types: [published]
jobs:
  bake:
    runs-on: ubuntu-latest  # native amd64, no QEMU overhead
    permissions:
      contents: read
      packages: write
    steps:
      - uses: actions/checkout@v4
      - uses: docker/setup-buildx-action@v3
      # Exposes the Actions runtime URL and token that the gha cache backend needs in a plain run: step
      - uses: crazy-max/ghaction-github-runtime@v3
      - uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Set image tag
        id: meta
        run: |
          if [ "${{ github.event_name }}" = "release" ]; then
            echo "tag=${{ github.event.release.tag_name }}" >> "$GITHUB_OUTPUT"
          else
            echo "tag=${{ github.sha }}" >> "$GITHUB_OUTPUT"
          fi
      - name: Build and push
        env:
          REGISTRY: ghcr.io/${{ github.repository_owner }}
          TAG: ${{ steps.meta.outputs.tag }}
          CI: "true"
        run: docker bake --push
Because the GitHub runner is already linux/amd64, the build is native — no QEMU emulation, full speed.
Example 4 — Local Development Override File
Create docker-bake.override.hcl (gitignored) for personal development settings:
# docker-bake.override.hcl — personal overrides, not committed
variable "REGISTRY" {
# Use your personal namespace instead of the org registry during development
default = "docker.io/myusername"
}
target "app" {
# Always load locally instead of pushing during development
output = ["type=docker"]
# Skip the slow amd64 QEMU build; test with native arm64 locally
platforms = ["linux/arm64"]
}
Bake automatically merges docker-bake.override.hcl on top of docker-bake.hcl when it exists. Your teammates never see your local tweaks.
Hands-On Exercises
Exercise 1 — Conditional Cache
- Write a bake file where cache-from and cache-to are empty lists when the CI env var is empty, and use registry cache when CI=true.
- Verify with docker bake --print and CI=true docker bake --print.
Exercise 2 — Matrix with Three Python Versions
- Extend the matrix example to build for Python 3.9, 3.10, 3.11.
- Add a group all-python that contains all three targets.
- Build the group: docker bake --load all-python.
- Verify all three images exist: docker images | grep myapp.
Exercise 3 — Remote Builder
- If you have SSH access to any Linux x86_64 machine (even a cheap VPS), configure it as a remote builder node using docker buildx create --append.
- Run docker buildx inspect hybrid to confirm both nodes are present.
- Build your app with platforms = ["linux/amd64"] and observe the build happening on the remote host in the build output.
Exercise 4 — Singularity Round-trip
- Push a small linux/amd64 image to Docker Hub or GHCR.
- On a Linux machine (or using an amd64 container locally), install Singularity/Apptainer.
- Pull the image: singularity pull myapp.sif docker://yourusername/myapp:latest
- Run singularity run myapp.sif and verify the app runs correctly under Singularity.
Troubleshooting
Build is ignoring my override file
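Bake only auto-merges files named exactly docker-bake.override.hcl (or docker-bake.override.json) in the current directory, and only when no -f flag is given. For any other name, or when you already pass -f, list every file explicitly: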
docker bake -f docker-bake.hcl -f my-overrides.hcl --push
Matrix target names collide
If two matrix expansions resolve to the same name, Bake will error. Make sure your name template includes all dimensions of the matrix, and replace characters such as "." that are not valid in target names:
# ✗ Bad: covers only one of two matrix dimensions, so names collide
name = "app-${replace(python_ver, ".", "-")}"
# ✓ Good: encodes all dimensions
name = "app-py${replace(python_ver, ".", "-")}-${distro}"
Computed value resolves to the wrong thing
Use docker bake --print to inspect the resolved configuration. If an expression looks wrong, check that:
- String interpolation uses "${expr}" syntax
- You reference variables by bare name (TAG, not var.TAG or local.TAG); Bake has no locals block and no local. namespace
- Function names are spelled exactly as the built-in function list defines them (split, join, upper, lower, etc.)
Rocky Linux 8.10 image: module enable fails
dnf module enable -y python311 fails because Python 3.11 is not a module stream on Rocky 8; it is a regular AppStream package named python3.11, so install it directly. If dnf still cannot find the package, the appstream repo may be disabled; dnf config-manager (from dnf-plugins-core) can re-enable it:
RUN dnf install -y dnf-plugins-core && \
    dnf config-manager --set-enabled appstream && \
    dnf install -y python3.11 && \
    dnf clean all
Build context too large — slow uploads to remote builder
Add a .dockerignore file to exclude non-essential files from the build context sent to BuildKit (local or remote):
# .dockerignore
.git
**/__pycache__
**/*.pyc
.env
.venv
dist/
build/
*.egg-info/
Debugging a failed amd64 build from Apple Silicon
Launch an interactive amd64 container to debug build failures:
# Open a bash shell inside an emulated amd64 Rocky Linux container
docker run --rm -it --platform linux/amd64 rockylinux:8.10 bash
# Now you are inside an amd64 Rocky environment on your Mac
uname -m # prints x86_64
cat /etc/os-release # shows Rocky Linux 8.10
References
- Docker Bake Reference (complete HCL schema)
- Docker Bake Matrices
- BuildKit Cache Documentation
- BuildKit GitHub Actions Cache
- Rocky Linux 8 Application Streams (Python)
- Apptainer (Singularity) Docker compatibility
- Vault: [[docker-multiplatform-apple-silicon|Building Multi-Platform Docker Images on Apple Silicon]]
Related Tutorials
- [[docker-bake-beginner-guide|Docker Bake Beginner Guide]] — start here if you are new to Bake
- [[docker-test-container-beginner-guide|Docker Test Container Beginner Guide]] — Docker fundamentals
- [[docker-test-container-deep-dive|Docker Test Container Deep Dive]] — multi-stage builds and advanced Dockerfile patterns
- [[docker-multiplatform-apple-silicon|Building Multi-Platform Docker Images on Apple Silicon]] — buildx fundamentals and QEMU internals
- [[kubernetes-beginner-guide|Kubernetes Beginner Guide]] — orchestrating containers after building them with Bake
- [[kubernetes-deep-dive|Kubernetes Deep Dive]] — production Kubernetes patterns
- [[isaaclab-metagrasp-apptainer-hpc-beginner-guide|IsaacLab + Apptainer HPC Guide]] — GPU workloads on HPC clusters using Singularity images built from Docker
- [[isaaclab-metagrasp-apptainer-hpc-deep-dive|IsaacLab + Apptainer HPC Deep Dive]] — advanced HPC container workflows
- [[hyperqueue-basics|HyperQueue Basics]] — HPC job scheduling for workloads packaged in Docker/Singularity images
- [[hyperqueue-deep-dive|HyperQueue Deep Dive]] — production HPC task dispatch
- [[parsl-beginner-guide|Parsl Beginner Guide]] — Python parallel computing on HPC clusters
- [[pixi-beginner-guide|Pixi Beginner Guide]] — fast conda-compatible environment management for reproducible builds
Summary
Key takeaways:
- HCL gives you variables, computed values, functions, and conditionals. Use them to keep your bake file DRY and make it work across dev/CI/production environments without duplication.
- inherits is the core DRY tool. Define a _common target with shared settings and inherit from it in all real targets. Override only what differs.
- Matrix builds parallelize version variants. Instead of maintaining N nearly-identical targets, declare one target with a matrix block and let Bake expand it.
- Remote native builders eliminate QEMU overhead. Add an amd64 host to your builder pool with docker buildx create --append. For CI, use a native ubuntu-latest GitHub Actions runner.
- Match your base image to your cluster OS. Use rockylinux:8.10 for Rocky 8.10 HPC clusters. glibc mismatches cause silent or cryptic failures at runtime.
- Docker images run under Singularity unchanged. Build and push a standard OCI image; singularity pull docker://... handles the conversion for HPC environments that require Apptainer.
- docker bake --print is your debugging tool. Always inspect the resolved configuration before running long builds, especially when working with complex variable expressions.
Next steps:
- Add Docker Bake to your project's CI pipeline and enjoy deterministic, cached, multi-platform builds
- Set up a remote amd64 builder node (or use GitHub's native amd64 runners) for production-speed cross-platform builds
- Explore [[hyperqueue-basics|HyperQueue]] and [[parsl-beginner-guide|Parsl]] to schedule jobs on the HPC cluster that consume your freshly built images