johanneskueber.com

Signed OCI Artifacts for Flux with Cosign and Gitea

Flux will happily pull a Git repository, run kustomize build over a folder, and apply the result. I ran it that way for a long time and it works. What it does not give you is any statement about what is being applied or where it came from. The source of truth is a mutable branch - anyone who can write to that path, or anything that can, changes what lands in the cluster on the next reconcile. There is no integrity check sitting between “a commit exists” and “this is what is running in production.”

Moving the manifests into OCI artifacts and signing them closes that gap. Instead of pulling a branch, Flux pulls a content-addressed artifact from a registry, verifies a cosign signature against a key I trust, and only then applies it. This post is the setup I settled on for my Talos clusters: build one artifact per app in a Gitea Action, sign it with cosign, and verify on the Flux side so an unsigned or tampered artifact never reconciles.

Why an OCI artifact beats a Git checkout

A few things fall out of the change, and they are the reason I bothered:

  • Content addressing. An artifact is identified by its digest. A tag is a pointer; sha256:… is the bytes. You can pin a cluster to an exact digest and know that what reconciles tomorrow is byte-for-byte what reconciled today.
  • A signature you can actually verify. cosign signs the digest, and Flux refuses the source if the signature does not check out against my public key. The check is fail-closed: a bad signature means the OCIRepository goes NotReady and nothing downstream advances.
  • Build and deploy come apart. CI renders and publishes; the cluster consumes. The rendering happens once, in a place I control, instead of on every reconcile.
  • Per-app artifacts. Each app is its own repository in the registry, versioned and rolled independently. A change to one does not churn the others.
  • Promotion by digest. Because the artifact is immutable, “build once, promote the same bytes” is the natural model rather than something you have to engineer. More on that below.

None of this is exotic. It is the same supply-chain hygiene we already expect from container images, applied to the configuration that decides what those images do.

The shape of it

The pipeline is small. For each app under k8s/apps/talos/*:

  • kustomize build the folder. If it errors, stop - a broken render never gets published.
  • Compare the rendered output against what is already in the registry. If nothing changed, do nothing.
  • Push the artifact, stamped with the source repository and commit it came from.
  • Sign the digest with cosign.

On the cluster side, an OCIRepository pulls and verifies, and a Kustomization applies. That is the whole loop.

One decision worth making early: render in CI and push the output, not the raw folder. flux push artifact --path=./overlay tars up exactly that directory - so a kustomization.yaml that reaches out to ../../base or a shared component pushes a reference to a path the artifact does not contain, and the in-cluster build fails. kustomize build resolves every base and component into one self-contained set first. It also makes change detection honest: the comparison is against what actually deploys, not against incidental edits that leave the output untouched.
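
The compare-then-skip step could be sketched with flux diff artifact, which compares a local path against a remote artifact and exits non-zero when they differ. This is an assumption about how the comparison might be done, not necessarily the check this pipeline uses. The helper name needs_push and the choice of :edge as the comparison target are illustrative. A missing remote artifact (or an unreachable registry) also fails the diff, so a first push counts as "changed".

```shell
# Hypothetical helper: exits 0 when the app should be pushed, 1 when it can be skipped.
# Assumes out/ holds the freshly rendered manifests and that :edge points at the
# last published artifact for this app.
needs_push() {
  app="$1"
  if flux diff artifact "oci://gitea.example.com/org/oci-talos-app-${app}:edge" \
       --path=out >/dev/null 2>&1; then
    return 1   # diff succeeded: contents are identical, nothing to do
  fi
  return 0     # contents differ, or no artifact could be fetched
}

# Usage inside the per-app loop:
#   needs_push "$app" || continue
```

Treating every diff failure as "changed" errs on the side of pushing, which is harmless here because an identical push lands on the same digest.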

Building and signing in CI

The key pair is generated once. The private half is encrypted with a passphrase; the public half is what the cluster verifies against:

cosign generate-key-pair
# -> cosign.key (encrypted, protected by COSIGN_PASSWORD)
#    cosign.pub (the public half Flux trusts)

The job runs inside a small prebuilt image that already carries kustomize, flux, cosign, jq and git - so there are no tool installs and no JavaScript-based uses: actions in the loop, just shell. Per app it is four steps: render, push, sign, tag (the change check from the list above is not shown). The cosign key and its passphrase arrive as Actions secrets:

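# ${app} is the loop variable over k8s/apps/talos/*; ${short} is assumed here to be
# the short commit SHA, e.g. short=$(git rev-parse --short HEAD)
mkdir -p out

# render: one self-contained set of manifests, all bases and components inlined
kustomize build "k8s/apps/talos/${app}" > out/manifests.yaml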

# push to an immutable, content-traceable tag and capture the digest
digest=$(flux push artifact "oci://gitea.example.com/org/oci-talos-app-${app}:sha-${short}" \
  --path=out \
  --source="$(git config --get remote.origin.url)" \
  --revision="${GITHUB_REF_NAME}@sha1:${GITHUB_SHA}" \
  --output json | jq -r '.repository + "@" + .digest')

# sign the digest - the key PEM is read straight from a secret, never written to disk
cosign sign --yes --key env://COSIGN_KEY "$digest"

# move the environment-facing tag onto the same digest
flux tag artifact "oci://gitea.example.com/org/oci-talos-app-${app}:sha-${short}" --tag edge

The --source and --revision flags are not decoration - they are stored as annotations on the artifact and surface in-cluster under status.artifact.metadata, so the cluster can tell you which commit produced what it is running. Signing the digest rather than a tag matters too: the signature binds to exact bytes, and since every tag (:sha-…, :edge, a release version) is only a pointer to that digest, one signature covers all of them.
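
To read those annotations back from the cluster, a query along these lines should work. The helper name is illustrative, and the flux-system namespace and the object name in the usage line are the ones used in the manifests further down. flux push artifact records the two flags under the standard org.opencontainers.image.source and org.opencontainers.image.revision keys.

```shell
# Hypothetical helper: print the metadata Flux recorded for an OCIRepository's
# current artifact (source URL, revision, creation time).
artifact_origin() {
  name="$1"
  kubectl -n flux-system get ocirepository "$name" \
    -o jsonpath='{.status.artifact.metadata}'
}

# Usage:
#   artifact_origin oci-talos-app-foo
```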

One practical note, learned the annoying way: I hand cosign the key with --key env://COSIGN_KEY, reading the PEM out of a secret rather than a file - it keeps the private key off the filesystem, and it sidesteps the mangled-newline problem that makes a key load fine yet sign nothing.

Versions and a rolling tag

Every push lands first on an immutable, content-traceable tag: :sha-<shortsha>. It never moves, and it ties the artifact straight back to the commit that produced it - so it is what I pin to and roll back to. What goes on top of that digest depends on how the build was triggered:

  • A push to main is an edge build. Only the apps whose rendered output actually changed are rebuilt, and each gets the rolling :edge tag moved onto its new digest. That is what the homelab clusters follow - always the newest.
  • A pushed git tag vX.Y.Z is a release build. Every app is snapshotted and gets the immutable semver tag :1.4.2 whether it changed or not, so a release is a complete, coherent set rather than a sparse one.
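
The trigger-to-tag mapping can be sketched as a small shell function. It assumes the runner exposes the triggering ref in GITHUB_REF in the usual refs/heads/… and refs/tags/… form; the function name tag_for_ref is made up for illustration, and the glob is a loose match rather than a strict semver check.

```shell
# Hypothetical helper: map the triggering git ref to the tag that goes on top
# of :sha-<shortsha>. Prints "edge" for main, the bare version for a vX.Y.Z tag,
# and fails for any other ref.
tag_for_ref() {
  case "$1" in
    refs/heads/main)
      echo "edge" ;;
    refs/tags/v[0-9]*.[0-9]*.[0-9]*)
      echo "${1#refs/tags/v}" ;;   # strip the prefix: refs/tags/v1.4.2 -> 1.4.2
    *)
      return 1 ;;
  esac
}

# Usage:
#   tag=$(tag_for_ref "$GITHUB_REF") &&
#     flux tag artifact "oci://gitea.example.com/org/oci-talos-app-${app}:sha-${short}" --tag "$tag"
```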

Three kinds of tag, each with a job: :sha-… for traceability and rollback, :edge as the rolling pointer, :1.4.2 for releases - and because cosign signed the digest, one signature already covers all of them. Which one a cluster follows is a property of the OCIRepository, not of the artifact. Chase the rolling tag:

  ref:
    tag: edge

take the newest immutable version in a range:

  ref:
    semver: ">=1.0.0"

or pin a digest, where nothing changes until I change it:

  ref:
    digest: sha256:…

I keep production on a semver range or a pinned digest and let the throwaway clusters ride :edge. I use :edge rather than :latest deliberately, so the tag announces what it is: a moving, bleeding-edge pointer with no promise of stability, not a default that quietly became load-bearing.

Key, KMS, or keyless?

This is the part where the right answer depends on your situation, so it is worth being explicit about the three options rather than reaching for the fashionable one.

Keyless signing is the Sigstore model. CI mints a short-lived OIDC token, Fulcio issues a short-lived certificate bound to that workflow identity, the signature lands in a public transparency log, and there is no long-lived key for anyone to steal. The signer is an identity - “this repository’s CI” - not a secret you hold. It is genuinely nice, and on GitHub Actions or GitLab, where the OIDC issuer is one Fulcio already trusts, I would reach for it first.

The catch on self-hosted Gitea is concrete: Gitea Actions does not mint an OIDC token that Fulcio will accept. Its runtime token is an internal credential without the issuer and subject claims a verifier expects, so the public Sigstore path is simply not open to you. That leaves two options that are, and both are fine:

  • A static cosign key with a passphrase, stored as an Actions secret. Self-contained, stays entirely inside your network, no transparency log, no external dependency. The cost is that you now own a long-lived key, and its safety reduces to your secret store’s access control. Rotate it, scope it, and keep a separate key per environment so a leak is contained.
  • A KMS-backed key - Vault transit if you already run Vault, or a cloud KMS. The private key never leaves the KMS; CI asks it to sign. You get central rotation and an audit trail of every signing call, at the price of a dependency to authenticate against.

So: SaaS CI with a trusted OIDC issuer and a wish for public, transparency-logged provenance points at keyless. Self-hosted, air-gapped, or simply “I want everything to stay in my network” points at a key, and a KMS-backed one once key custody starts to nag. For a homelab Gitea, a passphrase-protected key is the honest default, and there is no shame in it.
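
For the KMS-backed variant, a minimal sketch with Vault transit might look like the following. cosign addresses Vault keys with a hashivault:// URI and reads VAULT_ADDR and VAULT_TOKEN from the environment. The key name flux-artifacts and the helper name are invented for illustration, and the transit key is assumed to exist already.

```shell
# Hypothetical helper: sign a digest with a key held in Vault's transit engine.
# Assumes VAULT_ADDR and VAULT_TOKEN are set in the job and that a transit key
# with the made-up name "flux-artifacts" already exists.
sign_with_vault() {
  digest="$1"
  cosign sign --yes --key "hashivault://flux-artifacts" "$digest"
}

# The public half for the cluster-side secret can be exported once:
#   cosign public-key --key hashivault://flux-artifacts > cosign.pub
```

Nothing changes on the Flux side: the cluster still verifies against a plain public key, so switching from a static key to a KMS key only touches the signing step.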

Why not just sign the commits?

A fair question, since signed commits are the more familiar control - and on this blog I sign plenty of them. But a commit signature attests to a source state in the repo, not to the bytes that reach the cluster. Between the signed commit and the running manifest sit kustomize build, any substitutions, and the packaging into an artifact - the same signed commit rendered with a different kustomize version produces different output. A signed commit proves who authored the source; a signed artifact proves these exact deployed bytes are the ones I trust. They answer different questions, and for “what is actually running” the artifact is the right thing to sign. The two compose nicely - signed commits at the source, signed artifacts at the boundary - but one does not replace the other.

There is also a practical limit worth knowing: Flux’s own commit-signature verification on a GitRepository currently supports GPG keys only, not SSH-signed commits. So if your team signs commits with SSH keys - increasingly the default - Flux cannot verify them at the source even if you wanted it to, which makes the signed artifact the dependable place to enforce trust.

Staged rollouts

This is where the approach really pays off. Because the artifact is content-addressed and immutable, build once and promote the same digest stops being an aspiration and becomes the obvious way to run staged deployments. CI publishes and signs a single artifact; a cheap gate runs against the rendered output before it is trusted - kubeconform for schema, a conftest or Kyverno CLI pass for policy, a kubectl apply --dry-run=server against a throwaway cluster for admission. Then promotion to staging and to production is repointing a Kustomization at the digest that passed, as a reviewed change in the GitOps repo. The bytes that cleared the test gate are the bytes that reach production - there is no rebuild in between to second-guess, and because the signature travels with the digest, what runs in prod is verifiably the same artifact that passed the gate.
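
A cheap gate of that kind might be sketched as below. It covers only the schema check; policy and admission checks would slot in the same way. The function name gate is hypothetical, manifests.yaml is the file name the CI render step writes, and the reference is expected in repository@sha256:… form.

```shell
# Hypothetical gate: pull one artifact by digest and schema-check what it contains.
# Exit status 0 means the digest passed and is a candidate for promotion.
gate() {
  ref="$1"   # e.g. gitea.example.com/org/oci-talos-app-foo@sha256:<digest>
  workdir=$(mktemp -d) || return 1
  status=0
  # Fetch exactly the bytes that would be promoted, not a fresh render.
  flux pull artifact "oci://${ref}" --output "$workdir" >/dev/null || status=$?
  if [ "$status" -eq 0 ]; then
    kubeconform -strict -summary "$workdir/manifests.yaml" || status=$?
  fi
  rm -rf "$workdir"
  return "$status"
}

# Usage:
#   gate "$digest" && echo "promote ${digest}"
```

Pulling by digest rather than by tag is the point: the gate tests the same immutable artifact that the Kustomization will later be pointed at.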

Deploying and verifying

The public key goes into the cluster as a secret:

kubectl -n flux-system create secret generic cosign-public-keys --from-file=cosign.pub

Then the OCIRepository pulls the artifact and verifies it, and a Kustomization applies what it produces. The verify block is the whole enforcement - without a passing signature, no artifact is exposed and the Kustomization has nothing to reconcile:

apiVersion: source.toolkit.fluxcd.io/v1
kind: OCIRepository
metadata:
  name: oci-talos-app-foo
  namespace: flux-system
spec:
  interval: 5m
  url: oci://gitea.example.com/org/oci-talos-app-foo
  ref:
    tag: edge                 # homelab tracks the rolling tag; prod pins a digest or semver
  secretRef:
    name: gitea-auth          # registry pull credentials
  verify:
    provider: cosign
    secretRef:
      name: cosign-public-keys
---
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: oci-talos-app-foo
  namespace: flux-system
spec:
  interval: 10m
  sourceRef:
    kind: OCIRepository
    name: oci-talos-app-foo
  path: ./
  prune: true

The verification is offline - Flux fetches the signature from the same registry as the artifact and checks it against the public key it holds, with no call out to a transparency log or any other external service. You can watch it work:

flux get sources oci -n flux-system
# NAME               READY   MESSAGE
# oci-talos-app-foo  True    verified signature of revision edge@sha256:1a2b3c…

READY=True with a verified signature of revision message is the whole point made visible. To prove the gate is real, push an artifact signed with the wrong key, or none at all: the OCIRepository flips to NotReady with a verification error, and the Kustomization holds at the last good revision rather than rolling forward. It will not tear down what is already running - it simply refuses to advance to anything it cannot verify.
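
For a scripted check of the same thing, the verification result is also exposed as a SourceVerified condition on the OCIRepository. A minimal sketch, with an illustrative helper name and the flux-system namespace assumed:

```shell
# Hypothetical helper: print the status of the SourceVerified condition
# ("True" or "False") for an OCIRepository in flux-system.
source_verified() {
  kubectl -n flux-system get ocirepository "$1" \
    -o jsonpath='{.status.conditions[?(@.type=="SourceVerified")].status}'
}

# Usage:
#   [ "$(source_verified oci-talos-app-foo)" = "True" ] || echo "signature not verified"
```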

Where this leaves things

The result is a chain I can reason about. An artifact is identified by its bytes, carries the commit that produced it, is signed by a key I control, and is refused by the cluster unless that signature holds. Plain GitOps gave me “a commit exists”; this gives me “these exact, signed bytes are what is running, and here is the commit they came from.” For a self-hosted setup that is most of the supply-chain security I can realistically own, and the moving parts are a kustomize build, a flux push, and a cosign sign.

A note on cosign 3 and OCI 1.1

I sign with cosign 3, which stores the signature as an OCI 1.1 referrer artifact rather than the older sha256-<digest>.sig tag. On a current Flux and a current Gitea this is handled transparently - the signing command is unchanged and verification just works. I only flag it because the on-registry layout looks different from older write-ups, and a much older source-controller or registry would still expect the tag-based form. If you want to see where the signature landed, cosign tree <digest> shows it.